Rock, Paper, Scissors Robot

View project on GitHub: https://tinyurl.com/2p96n7zk


Overview

In this solo project, I set out to create a rock, paper, scissors robotic partner that a user could play against in much the same way they would play another human. Meeting this requirement meant that some creative computer vision techniques were necessary to facilitate gameplay using natural gestures. The project made use of the Wonik Allegro Robotic Hand, the OpenCV computer vision library, and the MediaPipe machine learning library to create the final result.

System Breakdown

To complete this project, the system was divided into two nodes:
1. Perception - Detect the user's hand and recognize which rock, paper, scissors gesture it is showing
2. Motion Control - Command the Allegro hand to form the gesture it chooses to play

Perception

The primary mechanism of playing rock, paper, scissors is proper recognition of the game's gestures. To accomplish this, I utilized the MediaPipe machine learning library and its built-in hand recognition API. MediaPipe takes in video frames, identifies a human hand, and attaches landmark coordinates to pre-defined parts of the hand. I then used geometric relationships between these landmarks to define the rock, paper, and scissors gestures.
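The core of this detection loop looks roughly like the sketch below, using MediaPipe's standard hands API; the camera index and confidence value here are illustrative rather than the project's exact settings.

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)  # default webcam
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            landmarks = results.multi_hand_landmarks[0].landmark
            # e.g., normalized (x, y) of the index fingertip
            tip = landmarks[mp_hands.HandLandmark.INDEX_FINGER_TIP]
            print(tip.x, tip.y)
cap.release()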

For example, the "scissors" gesture was defined by three conditions: the pinky and ring fingertips lying within a small distance threshold of their respective knuckle landmarks (curled), the index and middle fingertips lying beyond a larger distance threshold of theirs (extended), and the index and middle fingers forming an acute angle. Of course, these thresholds required tuning to behave consistently across different hands and distances from the camera.
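As an illustration, a scissors check following these rules might look like the following; the landmark indices are MediaPipe's standard ones, but the threshold values are placeholders rather than the tuned values.

import math

CURL_THRESH = 0.08    # fingertip close to its knuckle -> finger curled
EXTEND_THRESH = 0.15  # fingertip far from its knuckle -> finger extended

def dist(a, b):
    return math.hypot(a.x - b.x, a.y - b.y)

def angle_between(v1, v2):
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def is_scissors(lm):
    # MediaPipe indices: 5/8 index MCP/tip, 9/12 middle, 13/16 ring, 17/20 pinky
    ring_curled = dist(lm[16], lm[13]) < CURL_THRESH
    pinky_curled = dist(lm[20], lm[17]) < CURL_THRESH
    index_extended = dist(lm[8], lm[5]) > EXTEND_THRESH
    middle_extended = dist(lm[12], lm[9]) > EXTEND_THRESH
    # vectors from knuckle to fingertip for the two extended fingers
    v_index = (lm[8].x - lm[5].x, lm[8].y - lm[5].y)
    v_middle = (lm[12].x - lm[9].x, lm[12].y - lm[9].y)
    acute = angle_between(v_index, v_middle) < math.pi / 2
    return (ring_curled and pinky_curled and
            index_extended and middle_extended and acute)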

After the gestures were defined, starting criteria for the game were needed. A typical game of rock, paper, scissors begins with the players pumping their fists up and down in a countdown to "shoot!" In a first iteration, this countdown was detected by simply incrementing a counter every time the landmark at the base of the palm crossed a specified row of the frame, as sketched below. This approach was neither robust nor intuitive to use, so a better method was needed.
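A sketch of that first pass, assuming MediaPipe's normalized coordinates where y grows downward; the counting row is an arbitrary example value.

COUNT_ROW = 0.5  # row of the frame used for counting (normalized)
count = 0
prev_y = None

def update_count(landmarks):
    global count, prev_y
    y = landmarks[0].y  # landmark 0 is the wrist, at the base of the palm
    if prev_y is not None and prev_y < COUNT_ROW <= y:
        count += 1  # hand just moved down across the counting row
    prev_y = y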

To improve the countdown, the hand's velocity was calculated to track the direction and magnitude of its motion in the frame. When the velocity alternates between positive and negative values along the y-axis in the pattern of a typical countdown, the counter is incremented just as in the previous iteration.
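A sketch of this velocity-based tracker, with illustrative thresholds; it registers one countdown "pump" each time the y-velocity flips from downward to upward, and reports the game started after three pumps.

VEL_THRESH = 0.01   # normalized units per frame; illustrative value
PUMPS_TO_START = 3

class CountdownTracker:
    def __init__(self):
        self.prev_y = None
        self.prev_vel = 0.0
        self.pumps = 0

    def update(self, wrist_y):
        """Feed one wrist y-coordinate per frame; True once countdown done."""
        if self.prev_y is None:
            self.prev_y = wrist_y
            return False
        vel = wrist_y - self.prev_y  # +y is downward in image coordinates
        # a sign flip from downward to upward motion marks one pump
        if self.prev_vel > VEL_THRESH and vel < -VEL_THRESH:
            self.pumps += 1
        self.prev_y = wrist_y
        self.prev_vel = vel
        return self.pumps >= PUMPS_TO_START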

Once the counter has been incremented enough times and the game has started, the perception node publishes the next gesture it detects from the user. This published gesture is what tells the robot that the game has begun and that it should pick a gesture of its own.
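In a ROS setup, the output side of the perception node can be as simple as the following sketch; the topic name and message type here are illustrative assumptions, not necessarily the project's exact interface.

import rospy
from std_msgs.msg import String

rospy.init_node('perception')
gesture_pub = rospy.Publisher('/gesture', String, queue_size=1)

# ... after the countdown completes and the next gesture is classified:
gesture_pub.publish(String(data='scissors'))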

Motion Control

Once the motion control node receives a human gesture from the perception node, it knows that the game has officially started and that it must pick a gesture. The robot hand's gestures for rock, paper, and scissors are defined as joint state vectors and sent to the Allegro hand through the BHand library. This library takes in a desired joint state for each motor on the hand and runs a control loop to maintain them.
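A sketch of how the gesture poses and the random selection might be organized; the joint angles below are placeholders rather than the tuned hardware values, and the BHand interface is represented abstractly by a callback since the real library runs its own control loop.

import random

# Placeholder joint-state vectors for the Allegro hand's 16 joints
# (four per finger); the tuned hardware values differ.
GESTURES = {
    'rock':     [1.4] * 16,             # every finger curled into a fist
    'paper':    [0.0] * 16,             # every finger extended flat
    'scissors': [0.0] * 8 + [1.4] * 8,  # two fingers out, the rest curled
}

def choose_and_play(send_joint_targets):
    """Pick a random gesture and hand its joint vector to the controller.

    send_joint_targets stands in for the BHand interface, which accepts
    one desired position per joint and maintains it under closed-loop
    control.
    """
    name = random.choice(list(GESTURES))
    send_joint_targets(GESTURES[name])
    return name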

At the moment, the hand's gesture is chosen at random, which produces a rock, paper, scissors experience very similar to playing against another human. In future iterations, prediction techniques could be implemented to guess the human's gesture immediately before the robot picks its own pose, letting it "cheat" and always choose the winning gesture.