An end-to-end robotic control system built from scratch — integrating computer vision, deep learning, and motor control for real-time gesture-driven navigation.
This project implements a gesture-controlled mobile robot using computer vision (OpenCV + MediaPipe) and a deep-learning gesture classifier (TensorFlow/Keras). The robot interprets human hand gestures in real time to perform navigation commands such as moving forward, reversing, turning left/right, and stopping, using a Raspberry Pi for onboard computation and an H-Bridge driver for motor control.
🛠️ Built entirely from scratch — from mechanical assembly and hardware wiring to computer vision pipeline and control logic
| Feature | Description |
|---|---|
| Real-time Gesture Recognition | Detects hand landmarks using MediaPipe and classifies gestures using a deep CNN trained on custom datasets. |
| Robotic Motor Control | Commands dual DC motors using PWM signals through an H-Bridge interface on a Raspberry Pi. |
| Vision-Based Command Mapping | Gesture → Action mapping enables autonomous motion based purely on visual inputs. |
| End-to-End Integration | Combines computer vision, deep learning, embedded control, and robotic hardware. |
| Built-from-Scratch Robot | Complete mechanical build, wiring, and code integration designed by the team. |
```mermaid
graph TD
    A["Camera Input (OpenCV)"] --> B["Hand Landmark Detection (MediaPipe)"]
    B --> C["Gesture Classification (CNN - TensorFlow)"]
    C --> D["Gesture-to-Action Mapping Layer"]
    D --> E["Motor Control Unit (PWM + GPIO)"]
    E --> F["Robot Movement (H-Bridge + DC Motors)"]
```
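The loop below is a minimal sketch of how these stages chain together; `classify_gesture()` and `execute_action()` are hypothetical stand-ins for the CNN inference and motor-command layers described in the pipeline sections that follow.

```python
# Minimal control-loop sketch: camera -> MediaPipe -> classifier -> motors.
# classify_gesture() and execute_action() are hypothetical placeholders.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def run_pipeline():
    cap = cv2.VideoCapture(0)  # camera input (OpenCV)
    with mp_hands.Hands(max_num_hands=1,
                        min_detection_confidence=0.7) as hands:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB; OpenCV delivers BGR
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_hand_landmarks:
                gesture = classify_gesture(results.multi_hand_landmarks[0])
                execute_action(gesture)  # translate gesture to motion
    cap.release()
```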
| Domain | Technologies |
|---|---|
| Computer Vision | OpenCV, MediaPipe, Contour Analysis, Background Subtraction |
| Deep Learning | TensorFlow, Keras (CNN Gesture Classifier) |
| Embedded Systems | Raspberry Pi 4, MDD10A H-Bridge Motor Driver |
| Programming Languages | Python |
| Hardware Control | RPi.GPIO for PWM and direction control |
| Dataset | Custom hand gesture dataset (ok, stop, thumbs up, peace, etc.) collected for training |
1️⃣ Data Acquisition
- Collected gesture videos under varying lighting and background conditions.
- Labeled gesture frames for supervised model training (capture sketch below).
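A minimal sketch of this capture step, assuming a `dataset/<label>/` folder layout (the project's actual layout may differ):

```python
# Hedged sketch: save labeled webcam frames for supervised training.
# The dataset/<label>/ layout is an assumption, not the project's exact one.
import cv2
import os
import time

def record_gesture(label, num_frames=200, out_dir="dataset"):
    os.makedirs(os.path.join(out_dir, label), exist_ok=True)
    cap = cv2.VideoCapture(0)
    for i in range(num_frames):
        ok, frame = cap.read()
        if not ok:
            break
        # one image file per frame, named by label and index
        cv2.imwrite(os.path.join(out_dir, label, f"{label}_{i:04d}.jpg"), frame)
        time.sleep(0.05)  # sample ~20 fps so poses vary between frames
    cap.release()

record_gesture("thumbs_up")
```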
2️⃣ Model Training
- Built and trained a Convolutional Neural Network (CNN) in TensorFlow for gesture classification.
- Applied transfer learning to improve generalization on the limited custom dataset, as sketched below.
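The README does not name the pretrained backbone, so the Keras sketch below assumes MobileNetV2 purely for illustration:

```python
# Illustrative transfer-learning setup; MobileNetV2 is an assumed backbone.
import tensorflow as tf

NUM_CLASSES = 5  # stop, thumbs up, thumbs down, rock, peace

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze pretrained features for the small dataset

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the pretrained base and training only the small classification head is what lets a custom dataset of this size generalize.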
3️⃣ Computer Vision Processing
- Used MediaPipe Hands for 21-point landmark extraction from the live video feed (see the sketch after this list).
- Integrated OpenCV for preprocessing, contour detection, and segmentation to improve robustness.
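One common way to feed the 21 landmarks to a classifier is to flatten them into a fixed-length vector; this is a sketch of that idea, not necessarily the project's exact preprocessing:

```python
# Flatten MediaPipe's 21 hand landmarks (x, y, z each) into a 63-value vector.
import numpy as np

def landmarks_to_vector(hand_landmarks):
    return np.array(
        [[lm.x, lm.y, lm.z] for lm in hand_landmarks.landmark],
        dtype=np.float32,
    ).flatten()
```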
4️⃣ Gesture → Action Mapping
| Gesture | Action |
|---|---|
| ✋ Stop | Halt motors |
| 👍 Thumbs Up | Move Forward |
| 👎 Thumbs Down | Move Backward |
| 🤟 Rock | Turn Right |
| ✌️ Peace | Turn Left |
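The table above can be expressed as a simple dispatch dictionary; `halt`, `forward`, `reverse`, `turn_left`, and `turn_right` are hypothetical motor helpers, not the repo's actual function names:

```python
# Gesture -> action dispatch; the motor helpers are hypothetical placeholders.
ACTIONS = {
    "stop":        halt,
    "thumbs_up":   forward,
    "thumbs_down": reverse,
    "rock":        turn_right,
    "peace":       turn_left,
}

def execute_action(gesture):
    ACTIONS.get(gesture, halt)()  # unknown gesture -> fail safe to halt
```

Defaulting to `halt` for unrecognized gestures keeps the robot fail-safe when the classifier is uncertain.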
5️⃣ Motor Control
- Drove dual DC motors with PWM signals through an MDD10A H-Bridge driver connected to the Raspberry Pi’s GPIO pins.
- Implemented smooth acceleration/deceleration ramps and safe-shutdown routines to protect the hardware during operation (sketch below).
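A sketch of one MDD10A channel driven through RPi.GPIO; the BCM pin numbers and ramp parameters are placeholders, not the project's actual wiring:

```python
# One MDD10A channel: the PWM pin sets speed, the DIR pin sets direction.
# Pin numbers are example placeholders, not the project's actual wiring.
import time
import RPi.GPIO as GPIO

PWM_PIN, DIR_PIN = 18, 23  # example BCM pins

GPIO.setmode(GPIO.BCM)
GPIO.setup([PWM_PIN, DIR_PIN], GPIO.OUT)
pwm = GPIO.PWM(PWM_PIN, 1000)  # 1 kHz PWM carrier
pwm.start(0)

current_duty = 0

def ramp_to(target, step=5, delay=0.05):
    """Smooth acceleration/deceleration by stepping the duty cycle."""
    global current_duty
    direction = step if target > current_duty else -step
    for duty in range(current_duty, target, direction):
        pwm.ChangeDutyCycle(duty)
        time.sleep(delay)
    pwm.ChangeDutyCycle(target)
    current_duty = target

def safe_shutdown():
    ramp_to(0)      # decelerate before cutting power
    pwm.stop()
    GPIO.cleanup()  # release pins so the driver isn't left energized
```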
🧩 Fully built and tested — below are snapshots from our final build and live testing sessions.
🎥 You can watch the video here
- ✅ Controlled entirely via visual input — no manual remote or wired control.
| Gesture | Model Accuracy | Latency (ms) | Action |
|---|---|---|---|
| 👍 Thumbs Up | 98.2% | 43 | Move Forward |
| ✋ Stop | 99.1% | 39 | Halt |
| ✌️ Peace | 97.3% | 41 | Turn Left |
| 🤟 Rock | 96.7% | 45 | Turn Right |
Overall system throughput: ~20 FPS on a Raspberry Pi 4
Inference latency: <50 ms (real-time gesture recognition)
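For context, figures like these can be gathered with a simple timing wrapper; `process_frame()` below is a hypothetical function running one full capture-to-action cycle:

```python
# Rough latency/FPS measurement; process_frame() is a hypothetical wrapper
# around one full capture -> landmarks -> CNN -> action cycle.
import time

latencies = []
for _ in range(200):
    start = time.perf_counter()
    process_frame()
    latencies.append((time.perf_counter() - start) * 1000)  # ms

print(f"mean latency: {sum(latencies) / len(latencies):.1f} ms")
print(f"approx FPS:   {1000 * len(latencies) / sum(latencies):.1f}")
```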
| Component | Purpose |
|---|---|
| Raspberry Pi 4 (4GB) | Primary compute unit for vision and control |
| Pi Camera Module | Real-time image capture |
| MDD10A Dual Motor Driver | PWM-based control of DC motors |
| DC Motors x2 | Left and right wheel drive |
| Custom Chassis + Battery Pack | Robot structure and power supply |

