Rock, Paper, Scissors Robot
View project on GitHub: https://tinyurl.com/2p96n7zk
Overview
In this solo project, I set out to create a rock, paper, scissors robotic partner that would allow a user to play in much the same way they would with another human. Meeting this requirement meant that some creative computer vision techniques were necessary to facilitate gameplay using natural gestures. The project made use of the Wonik Allegro Robotic Hand, the OpenCV computer vision library, and the MediaPipe machine learning library to create the final result.
System Breakdown
To complete this project, the system was divided into two nodes:
1. Perception - Recognize the user's hand gestures and the countdown motion from the camera feed
2. Motion Control - Command the Allegro hand to form the robot's chosen gesture
Perception
The primary mechanism of playing rock, paper, scissors is proper recognition of the game's
associated gestures. To accomplish this, I utilized the MediaPipe machine learning library
and its built-in hand recognition API. MediaPipe takes input video frames, identifies a human hand, and attaches landmark coordinates to pre-defined parts of the hand. I then used
relationships between these landmarks to define gestures that would eventually become rock, paper,
and scissors.
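A minimal sketch of this landmark-extraction step is shown below, assuming the standard MediaPipe Hands API and an OpenCV webcam capture; the helper name get_landmarks and the confidence setting are illustrative rather than the project's actual code.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def get_landmarks(frame, hands):
    """Return the 21 normalized hand landmarks for the first detected hand, or None."""
    # MediaPipe expects RGB input, while OpenCV captures frames in BGR.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(rgb)
    if not results.multi_hand_landmarks:
        return None
    return results.multi_hand_landmarks[0].landmark

with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    if ok:
        landmarks = get_landmarks(frame, hands)
        print("hand detected" if landmarks else "no hand detected")
```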
For example, the "scissors" gesture was defined as the pinky and ring fingertips being within a certain
small distance threshold of their respective knuckle landmarks, the index and middle fingertips being within a
certain larger distance threshold of their respective knuckle landmarks, and the index and middle fingers
forming an acute angle. Of course, these thresholds required some tuning to produce consistent results across different hands and distances from the camera.
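As an illustration, a scissors check along these lines could look like the sketch below. It uses the standard MediaPipe landmark indices (fingertips 8/12/16/20, knuckle MCP joints 5/9/13/17); the threshold values, and the reading that the index and middle tips should sit well away from their knuckles, are my own assumptions rather than the project's tuned parameters.

```python
import numpy as np

CURL_THRESH = 0.08    # ring/pinky tip must be this close to its knuckle (normalized units, illustrative)
EXTEND_THRESH = 0.12  # index/middle tip must be at least this far from its knuckle (illustrative)

def dist(a, b):
    return np.hypot(a.x - b.x, a.y - b.y)

def is_scissors(lm):
    """Classify a scissors gesture from a list of 21 MediaPipe hand landmarks."""
    ring_curled = dist(lm[16], lm[13]) < CURL_THRESH
    pinky_curled = dist(lm[20], lm[17]) < CURL_THRESH
    index_extended = dist(lm[8], lm[5]) > EXTEND_THRESH
    middle_extended = dist(lm[12], lm[9]) > EXTEND_THRESH

    # Angle between the index and middle finger directions (knuckle -> tip).
    v1 = np.array([lm[8].x - lm[5].x, lm[8].y - lm[5].y])
    v2 = np.array([lm[12].x - lm[9].x, lm[12].y - lm[9].y])
    cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    acute = cos_angle > 0.0  # cos > 0 means the angle is under 90 degrees

    return ring_curled and pinky_curled and index_extended and middle_extended and acute
```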
After the gestures were defined, starting criteria for the game were needed. In a typical game of
rock, paper, scissors, the game is usually started by the players moving their fists up and down in
a countdown to "shoot!" In a first iteration, this countdown was detected by simply incrementing a counter
every time the landmark at the base of the palm crossed a specified row in the frame. This approach was not
very robust or intuitive to use, so a better method was needed.
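A sketch of that first-iteration counter is below, assuming the MediaPipe wrist landmark (index 0) serves as the palm base; the trigger row and class name are illustrative.

```python
COUNT_ROW = 0.5  # normalized y-coordinate of the trigger row (illustrative)

class RowCrossingCounter:
    def __init__(self):
        self.prev_y = None
        self.count = 0

    def update(self, palm_base_y):
        """Feed one frame's palm-base y-coordinate; return the running crossing count."""
        if self.prev_y is not None:
            # Count a crossing whenever the landmark passes the row in either direction.
            if (self.prev_y - COUNT_ROW) * (palm_base_y - COUNT_ROW) < 0:
                self.count += 1
        self.prev_y = palm_base_y
        return self.count
```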
To improve the countdown, the velocity of the hand was calculated to track the direction and magnitude of
its motion in the frame. When the hand's y-axis velocity alternates between positive and negative in a way
that matches a typical countdown motion, the counter is incremented in a similar fashion to the previous
iteration.
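The sketch below shows one way such a velocity-based countdown detector could be structured; the velocity threshold, the number of required pumps, and the class name are assumptions for illustration.

```python
VEL_THRESH = 0.01  # minimum |dy| per frame (normalized units) to count as deliberate motion

class CountdownDetector:
    def __init__(self, pumps_needed=3):
        self.prev_y = None
        self.last_direction = 0   # +1 = moving down, -1 = moving up (image y grows downward)
        self.pumps = 0
        self.pumps_needed = pumps_needed

    def update(self, palm_base_y):
        """Feed one frame's palm-base y-coordinate; return True once the countdown completes."""
        if self.prev_y is not None:
            vy = palm_base_y - self.prev_y
            if abs(vy) > VEL_THRESH:
                direction = 1 if vy > 0 else -1
                # One "pump" is a downward stroke followed by an upward stroke.
                if direction == -1 and self.last_direction == 1:
                    self.pumps += 1
                self.last_direction = direction
        self.prev_y = palm_base_y
        return self.pumps >= self.pumps_needed
```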
Once the counter has been incremented enough and the game has started, the perception node publishes the
next gesture it detects from the user. This published gesture is what lets the robot know that the game
has started and that it should pick a gesture of its own.
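Assuming a ROS 1 / rospy setup for the two nodes (the write-up does not spell out the middleware), publishing the detected gesture might look like the following; the topic name human_gesture is a placeholder rather than the project's actual topic.

```python
import rospy
from std_msgs.msg import String

rospy.init_node("perception")
gesture_pub = rospy.Publisher("human_gesture", String, queue_size=10)

def publish_gesture(gesture_name):
    """Publish the first gesture detected after the countdown completes."""
    gesture_pub.publish(String(data=gesture_name))
```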
Motion Control
Once the motion control node receives a human gesture from the perception node, it knows that the game
has officially started and that it must pick a gesture. The robot hand gestures for rock, paper, and scissors
are defined as joint state vectors and published to the Allegro hand through the use of the BHand library.
This library takes in a list of joint states for each motor on the hand and runs a control system to maintain
them.
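As a rough sketch of this last step, assuming the hand is commanded over a ROS joint-state topic with a 16-element joint vector per gesture; the topic name, joint values, and random gesture selection below are placeholders for illustration, not the project's tuned poses or actual interface.

```python
import random

import rospy
from sensor_msgs.msg import JointState

# Joint vectors for each robot gesture; the values here are purely illustrative.
ROBOT_GESTURES = {
    "rock":     [0.3] * 16,              # all fingers curled
    "paper":    [0.0] * 16,              # hand fully open
    "scissors": [0.0] * 8 + [0.3] * 8,   # two fingers extended, the rest curled
}

rospy.init_node("motion_control")
joint_pub = rospy.Publisher("allegroHand/joint_cmd", JointState, queue_size=10)

def play_round():
    """Pick a gesture at random and send its joint-state vector to the hand controller."""
    robot_choice = random.choice(list(ROBOT_GESTURES))
    msg = JointState()
    msg.position = ROBOT_GESTURES[robot_choice]
    joint_pub.publish(msg)
    return robot_choice
```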