Python OpenCV Gesture Controlling Game Project
https://techvidvan.com/tutorials/python-opencv-gesture-controlling-game/
Gesture Controlling Game Using OpenCV is a project that lets you control a game with hand movements. It uses OpenCV, an open-source computer vision library, to capture frames from a camera, detects hand gestures in those frames, and translates them into game commands. For example, raising your index finger might move the game character forward. This makes gaming more interactive and natural. The same approach has other potential uses besides gaming, such as controlling robots or helping doctors with medical procedures.

MediaPipe

MediaPipe is a computer vision framework created by Google to help developers build machine learning-based applications. It comes with pre-built models that can detect things like faces and hands. Using the hand detection model, we can build a game controller that needs no keyboard, gamepad, or other input device. The model detects the position of the user’s hand and outputs a set of landmark coordinates that can be used to track its movement and gestures. This kind of tracking allows a more intuitive and immersive way of interacting with digital devices.

PyAutoGUI

PyAutoGUI is a Python library that automates keyboard and mouse input, which is useful for repetitive tasks. It is easy to use and works on different platforms. Because it sends real input events to the computer, it should be used responsibly.

Gesture Recognition

Gesture recognition enables computers to detect and respond to body or hand movements, which provides an intuitive and enjoyable way to control a game. OpenCV, an open-source library, captures images of the user’s hand, and the landmarks found in those images are analyzed to identify specific gestures, such as swipes, circles, or the number of raised fingers. These gestures are then converted into game commands.
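To make the idea of landmark coordinates concrete: MediaPipe reports each hand landmark with x and y values normalized to the range 0 to 1, so they have to be scaled by the frame width and height before they can be used as pixel positions. The following is a minimal sketch of that scaling. The Landmark tuple and the to_pixels helper are hypothetical names used only for illustration, and the sample points are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark, which exposes normalized x and y.
Landmark = namedtuple("Landmark", ["x", "y"])

def to_pixels(landmarks, frame_width, frame_height):
    """Scale normalized (0..1) landmark coordinates to integer pixel positions."""
    return [[int(lm.x * frame_width), int(lm.y * frame_height)] for lm in landmarks]

# Three made-up points on a 640x480 frame.
sample = [Landmark(0.5, 0.5), Landmark(0.25, 0.75), Landmark(0.0, 1.0)]
pixels = to_pixels(sample, frame_width=640, frame_height=480)
print(pixels)  # prints [[320, 240], [160, 360], [0, 480]]
```

Note that x is scaled by the width and y by the height; mixing the two up still gives numbers, but they no longer line up with the image.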
Prerequisites for Gesture Controlling Game Using Python OpenCV

You should have a solid understanding of the Python programming language and the OpenCV library. You also need the following on your system:

● Python 3.7 (64-bit) or above
● Any Python editor (VS Code, PyCharm)

Download Python OpenCV Gesture Controlling Game Project

Please download the source code of the Python OpenCV Gesture Controlling Game Project: Python OpenCV Gesture Controlling Game Project Code

Installation

Open Windows cmd as administrator.
1. To install the opencv library, run this command from cmd:

    pip install opencv-python

2. To install the mediapipe library, run this command from cmd:

    pip install mediapipe

3. To install the pyautogui library, run this command from cmd:

    pip install pyautogui

Let’s Implement

To implement the project, follow the steps below.
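Before starting on the steps, it can help to confirm that the three packages installed correctly. The sketch below is one possible check using only the standard library; the missing_modules helper is a hypothetical name, and the list uses the import names (cv2, mediapipe, pyautogui) rather than the pip package names.

```python
import importlib.util

# Import names (not pip names) of the three packages installed above.
REQUIRED_MODULES = ["cv2", "mediapipe", "pyautogui"]

def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```

If anything is reported as missing, rerun the matching pip install command in the same Python environment that your editor uses.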
1. Import the libraries used in the implementation.

    import cv2
    import mediapipe as mp
    import pyautogui

2. Set up MediaPipe’s hand detector to track a single hand. It reports a hand only when the detection confidence is 70% or higher. MediaPipe’s drawing module is used later to draw the hand landmarks on the image.

    mpHands = mp.solutions.hands
    hands = mpHands.Hands(max_num_hands=1, min_detection_confidence=0.7)
    mpDraw = mp.solutions.drawing_utils

3. Open the integrated camera (device index 0).

    cap = cv2.VideoCapture(0)
4. Start the while loop.

    while True:

5. The function cap.read() reads one frame and returns two values: ret (a boolean indicating whether the frame was read successfully) and frame (an array of pixel values for the captured frame). These values can be used to process or display the image.

        ret, frame = cap.read()

6. Get the dimensions of the frame and store them in the variables x, y, and c. Note that x holds the height of the frame, y holds the width, and c holds the number of color channels.

        x, y, c = frame.shape
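The tutorial code does not check ret. When no frame is available, cap.read() returns ret as False and frame as None, so reading frame.shape in step 6 would fail. One possible guard, sketched below, is to stop the loop as soon as a read fails. The frames generator and the FakeCapture class are hypothetical names for illustration; FakeCapture only imitates the read() method so that the sketch runs without a camera.

```python
def frames(capture):
    """Yield frames from a capture-like object until a read fails."""
    while True:
        ret, frame = capture.read()
        if not ret:
            break  # no frame available: stop instead of processing None
        yield frame

class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture that serves a fixed list of frames."""

    def __init__(self, items):
        self._items = list(items)

    def read(self):
        if self._items:
            return True, self._items.pop(0)
        return False, None

collected = list(frames(FakeCapture(["frame-1", "frame-2"])))
print(collected)  # prints ['frame-1', 'frame-2']
```

In the tutorial’s own loop, the same effect can be had by adding a check of ret right after cap.read() and breaking out of the while loop when it is False.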
7. Flip the frame horizontally so that it behaves like a mirror, then convert it from BGR to RGB. The RGB version is stored in a variable called framergb, which is passed to MediaPipe in the next step.

        frame = cv2.flip(frame, 1)
        framergb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

8. Run MediaPipe’s hand detection model on the RGB frame and store the output in a variable called “result”. The second line initializes a variable called “num_fingers” to 0, which is used to count the number of extended fingers.

        result = hands.process(framergb)
        num_fingers = 0

9. The if statement checks whether the “result” variable contains any detected hand landmarks. If it does, the code creates an empty list called “landmarks” to store the coordinates of these landmarks.

        if result.multi_hand_landmarks:
            landmarks = []

10. Loop over each hand that MediaPipe detected in the frame.

            for handslms in result.multi_hand_landmarks:

11. MediaPipe reports each landmark position as a fraction of the frame size. Loop over the landmarks of the hand, multiply the relative x position by the frame width (y) and the relative y position by the frame height (x) to get pixel coordinates, and store the result in the “landmarks” list. The int() function converts the floating-point values to whole numbers.

                for lm in handslms.landmark:
                    lmx = int(lm.x * y)
                    lmy = int(lm.y * x)
                    landmarks.append([lmx, lmy])

12. Use the mpDraw module to draw the hand landmarks and connections on the frame. It takes the landmarks detected by the MediaPipe hand detection model and connects them with lines to create the shape of the hand. The resulting image shows the detected hand with dots for the landmarks and lines connecting them.

                mpDraw.draw_landmarks(frame, handslms, mpHands.HAND_CONNECTIONS)

13. Check that landmarks were collected, then compare the positions of specific landmarks to count the extended fingers. The thumb is counted when its tip (landmark 4) lies to the left of the joint below it (landmark 3). Each of the other four fingers is counted when its tip (landmarks 8, 12, 16, and 20) lies above its middle joint (landmarks 6, 10, 14, and 18). Image y coordinates grow downward, so “above” means a smaller y value.

                if len(landmarks) > 0:
                    if landmarks[4][0] < landmarks[3][0]:
                        num_fingers += 1
                    if landmarks[8][1] < landmarks[6][1]:
                        num_fingers += 1
                    if landmarks[12][1] < landmarks[10][1]:
                        num_fingers += 1
                    if landmarks[16][1] < landmarks[14][1]:
                        num_fingers += 1
                    if landmarks[20][1] < landmarks[18][1]:
                        num_fingers += 1
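The five comparisons in step 13 can also be written as a small function, which makes the rule easier to test without a camera. This is a minimal sketch, not the tutorial’s own code: the count_fingers name and the constant names are assumptions for illustration, and the sample hand is made up.

```python
# MediaPipe hand landmark indices used by the comparisons in step 13.
THUMB_TIP, THUMB_IP = 4, 3
FINGER_TIPS = [8, 12, 16, 20]   # index, middle, ring, and pinky tips
FINGER_PIPS = [6, 10, 14, 18]   # the middle joint of each of those fingers

def count_fingers(landmarks):
    """Count extended fingers from a list of 21 [x, y] pixel positions."""
    if len(landmarks) < 21:
        return 0
    count = 0
    # Thumb: counted when the tip is left of the joint below it.
    if landmarks[THUMB_TIP][0] < landmarks[THUMB_IP][0]:
        count += 1
    # Other fingers: counted when the tip is above (smaller y than) the middle joint.
    for tip, pip in zip(FINGER_TIPS, FINGER_PIPS):
        if landmarks[tip][1] < landmarks[pip][1]:
            count += 1
    return count

# Made-up hand: every point at (100, 100), then raise the index and middle fingers.
hand = [[100, 100] for _ in range(21)]
hand[8] = [100, 40]
hand[12] = [100, 30]
print(count_fingers(hand))  # prints 2
```

The thumb test compares x positions, so its result depends on which hand faces the camera; with the other hand the comparison would need to be reversed.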
14. This block checks whether the user is holding up one finger, which means “Accelerate”. It displays the text “Accelerate” on the frame and uses the pyautogui library to hold down the “up” arrow key while releasing the “down”, “right”, and “left” arrow keys.

                if num_fingers == 1:
                    cv2.putText(frame, "Accelerate", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("down")
                    pyautogui.keyUp("right")
                    pyautogui.keyDown("up")
                    pyautogui.keyUp("left")

15. This block corresponds to the brake action in a driving game. When two fingers are extended, it displays the text “Brake” on the frame, holds down the “down” arrow key, and releases the “up”, “right”, and “left” arrow keys using the PyAutoGUI library.

                elif num_fingers == 2:
                    cv2.putText(frame, "Brake", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("up")
                    pyautogui.keyDown("down")
                    pyautogui.keyUp("right")
                    pyautogui.keyUp("left")
16. If three fingers are extended, the gesture is “Right”, meaning the user wants to steer right. The code displays “Right” on the frame, holds down the “right” arrow key, and releases the “up”, “down”, and “left” arrow keys.

                elif num_fingers == 3:
                    cv2.putText(frame, "Right", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("up")
                    pyautogui.keyUp("down")
                    pyautogui.keyDown("right")
                    pyautogui.keyUp("left")

17. This block executes if the number of extended fingers is 4. It displays the text “Left” on the frame, holds down the “left” arrow key, and releases the “up”, “down”, and “right” arrow keys using the PyAutoGUI library. This corresponds to a left turn in a driving game.

                elif num_fingers == 4:
                    cv2.putText(frame, "Left", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("up")
                    pyautogui.keyUp("down")
                    pyautogui.keyUp("right")
                    pyautogui.keyDown("left")

18. This block handles every remaining case: five extended fingers, or any other count such as 0. It displays the text “Nothing” on the frame and releases all the arrow keys using the PyAutoGUI library, so that no action is taken.

                elif num_fingers == 5:
                    cv2.putText(frame, "Nothing", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("up")
                    pyautogui.keyUp("down")
                    pyautogui.keyUp("right")
                    pyautogui.keyUp("left")
                else:
                    cv2.putText(frame, "Nothing", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0,0,255), 2, cv2.LINE_AA)
                    pyautogui.keyUp("up")
                    pyautogui.keyUp("down")
                    pyautogui.keyUp("right")
                    pyautogui.keyUp("left")

Note: Steps 11-18 must be written inside the for loop from step 10.

19. Display the output frame, with the landmarks and text drawn on it, using the imshow method of the cv2 library. The waitKey method with a parameter of 1 waits 1 millisecond for a key press; if the user presses the ‘q’ key, the loop exits.
        cv2.imshow("TechVidvan", frame)
        if cv2.waitKey(1) == ord('q'):
            break

Note: Steps 5-19 must be written inside the while loop.

20. cap.release() frees the resources used to capture the camera stream, allowing them to be used by other applications. cv2.destroyAllWindows() closes all OpenCV windows, freeing system resources and ensuring the program ends cleanly.

    cap.release()
    cv2.destroyAllWindows()

Conclusion

The gesture control game using OpenCV is a project that uses computer vision to detect hand landmarks and interpret simple hand gestures to control a game. With the help of PyAutoGUI, the program simulates keyboard inputs based on the detected gestures, allowing the user to control the game with hand movements. It is a good example of how computer vision and machine learning can create interactive gaming experiences.
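As a follow-up, the if/elif chain in steps 14 to 18 repeats the same four key calls in every branch. One possible alternative is a lookup table that maps each finger count to a label and the one arrow key to hold. The sketch below is an assumption-laden illustration, not the tutorial’s code: GESTURES and apply_gesture are hypothetical names, and the key functions are passed in as parameters so the example can record calls instead of sending real key events.

```python
ARROW_KEYS = ("up", "down", "left", "right")

# Finger count -> (label shown on screen, arrow key to hold).
GESTURES = {
    1: ("Accelerate", "up"),
    2: ("Brake", "down"),
    3: ("Right", "right"),
    4: ("Left", "left"),
}

def apply_gesture(num_fingers, key_down, key_up):
    """Release every arrow key except the mapped one, then hold the mapped key."""
    # Any count not in the table (0, 5, ...) means "Nothing": release everything.
    label, held = GESTURES.get(num_fingers, ("Nothing", None))
    for key in ARROW_KEYS:
        if key != held:
            key_up(key)
    if held is not None:
        key_down(held)
    return label

# Record calls instead of sending real key events, so the sketch runs anywhere.
events = []
label = apply_gesture(
    3,
    key_down=lambda key: events.append(("down", key)),
    key_up=lambda key: events.append(("up", key)),
)
print(label, events)
```

Inside the tutorial’s loop, the call would look like apply_gesture(num_fingers, pyautogui.keyDown, pyautogui.keyUp), with the returned label passed to cv2.putText. Adding a new gesture then means adding one entry to the table rather than another branch.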