New here? This part combines everything built in Parts 13–19 into one robot: autonomous driving, obstacle avoidance, camera-based steering, voice control, room mapping, and a web dashboard — all running together. If any individual piece feels unfamiliar, the part that introduced it is linked throughout. Otherwise, let’s go.
A Confession Before We Start
The instinct going into this part was to literally merge every piece of code from Parts 13 through 19 — drop them all into one file, wire them together, ship it.
That instinct was wrong.
Stitching together six separately-developed scripts — each with their own globals, their own threads, their own assumptions about what’s running and when — turns into something genuinely hard to debug. Every fix risks breaking something three functions away. The mental model needed to hold it all in your head stops fitting.
The better approach: forget the old files exist. Treat this as one program, designed from scratch, that happens to reuse ideas we already understand deeply. Same functions, same logic — but organized as a single coherent system instead of six systems duct-taped together.
Recap: What This Robot Can Do
Learn to Move → Perception → Localization → Planning → Control → [repeat from Perception]
- HC-SR04 detects obstacles, the robot reacts (Part 14)
- Pi Camera analyzes which direction is clearer (Part 15–16)
- The robot starts and stops on voice command (Part 17)
- A web dashboard provides remote control from any browser (Part 18)
- A simple occupancy grid map records where obstacles have been seen (Part 19)
Today, all of it runs from one file, structured cleanly.
Hardware
No changes. If you’ve wired the C101 + L298N + Raspberry Pi + HC-SR04 + Pi Camera following Parts 13 through 19, the physical setup is already done. This part is a software upgrade only.
Environment Setup — From Scratch
If you’re setting this up on a fresh Pi, or just want the complete list in one place:
# Enable VNC for remote desktop access
sudo raspi-config
# → Interface Options → VNC → Yes → Finish
# Update the system
sudo apt update && sudo apt upgrade -y
# Python environment with system package access
sudo apt install -y python3-pip python3-venv
python3 -m venv my_project_env --system-site-packages
source my_project_env/bin/activate
# Hardware-dependent libraries (install via apt, not pip)
sudo apt install -y python3-picamera2 python3-opencv libopenblas-dev
sudo apt install -y python3-matplotlib
# If OpenCV is outdated:
pip install --upgrade opencv-python
# Python libraries
pip install numpy opencv-python-headless pyserial flask
# Microphone check
arecord -l
# If "Device or resource busy": sudo fuser /dev/snd/* then sudo kill -9 [PID]
# Audio libraries for voice recognition
sudo apt-get install -y python3-pyaudio portaudio19-dev flac
pip3 install SpeechRecognition
Important: in Thonny, confirm the interpreter points to the virtual environment, not system Python:
Tools → Options → Interpreter → Python executable → /home/pi3/my_project_env/bin/python3
How a Clean Python Program Is Organized
Before the code itself, it’s worth naming the structure, because this is the template every serious Python program follows — and it’s the structure we’ll use here:
# 1. IMPORTS
import math
# 2. FUNCTIONS & CLASSES
def calculate_total(a, b):
return a + b
# 3. MAIN PROGRAM LOGIC
def main():
result = calculate_total(5, 10)
print(f"Total: {result}")
# 4. ENTRY POINT
if __name__ == "__main__":
main()
Our robot program follows the same shape, just with more sections:
- Imports
- GPIO hardware setup
- Grid map configuration
- Camera setup
- Global state variables
- Function definitions (motors, sensors, vision, mapping, voice)
- Main loop, guarded by
if __name__ == '__main__':
bossfight.py — The Robot’s Brain
import RPi.GPIO as GPIO
import time
import cv2
import numpy as np
from picamera2 import Picamera2
import math
import matplotlib.pyplot as plt
import speech_recognition as sr
import threading
# ============================================================
# GPIO setup
# ============================================================
GPIO.setmode(GPIO.BCM)
GPIO.setwarnings(False)
ENA = 18; IN1 = 27; IN2 = 22
IN3 = 23; IN4 = 24; ENB = 13
TRIG = 17; ECHO = 25
for pin in [ENA, ENB, IN1, IN2, IN3, IN4, TRIG]:
GPIO.setup(pin, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)
pwm_A = GPIO.PWM(ENA, 50)
pwm_B = GPIO.PWM(ENB, 50)
pwm_A.start(0)
pwm_B.start(0)
# ============================================================
# Grid Map configuration
# ============================================================
MAP_SIZE_CM = 400
GRID_SIZE_CM = 10
GRID_ROWS = int(MAP_SIZE_CM / GRID_SIZE_CM)
GRID_COLS = int(MAP_SIZE_CM / GRID_SIZE_CM)
occupancy_grid = np.zeros((GRID_ROWS, GRID_COLS))
robot_x = GRID_ROWS / 2
robot_y = GRID_COLS / 2
robot_angle = 0.0
# ============================================================
# Camera setup
# ============================================================
picam = Picamera2()
picam.preview_configuration.main.format = 'RGB888'
picam.configure("preview")
picam.start()
time.sleep(2)
# ============================================================
# Global state — shared across threads
# ============================================================
is_running = False # True = robot drives autonomously
mapping_mode = False # True = also record obstacles to the grid map
speed = 50
# ============================================================
# Motor / sensor functions
# ============================================================
def get_distance():
GPIO.output(TRIG, True)
time.sleep(0.00001)
GPIO.output(TRIG, False)
start_time = stop_time = time.time()
while GPIO.input(ECHO) == 0:
start_time = time.time()
while GPIO.input(ECHO) == 1:
stop_time = time.time()
return (stop_time - start_time) * 34300 / 2
def set_motor(in1, in2, pwm, speed):
GPIO.output(in1, GPIO.HIGH if speed >= 0 else GPIO.LOW)
GPIO.output(in2, GPIO.LOW if speed >= 0 else GPIO.HIGH)
pwm.ChangeDutyCycle(abs(speed))
def forward(s): set_motor(IN1, IN2, pwm_A, s); set_motor(IN3, IN4, pwm_B, s)
def backward(s): set_motor(IN1, IN2, pwm_A, -s); set_motor(IN3, IN4, pwm_B, -s)
def turn_left(s): set_motor(IN1, IN2, pwm_A, -s); set_motor(IN3, IN4, pwm_B, s)
def turn_right(s): set_motor(IN1, IN2, pwm_A, s); set_motor(IN3, IN4, pwm_B, -s)
def stop_robot(): set_motor(IN1, IN2, pwm_A, 0); set_motor(IN3, IN4, pwm_B, 0)
def decide_direction():
print("Obstacle detected! Analyzing with camera...")
frame = picam.capture_array()
frame = cv2.flip(frame, -1) # Remove if camera is not mounted upside down
gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
edges = cv2.Canny(gray, 50, 150)
h, w = edges.shape
left_density = np.sum(edges[:, :w//2]) / 255
right_density = np.sum(edges[:, w//2:]) / 255
print(f"Edge density — Left: {left_density:.0f}, Right: {right_density:.0f}")
if left_density < right_density:
return "left"
else:
return "right"
# ============================================================
# Mapping functions
# ============================================================
def update_map(distance, rx, ry, rangle):
if 2 < distance < 400:
angle_rad = math.radians(rangle)
obstacle_x = rx + (distance * math.cos(angle_rad)) / GRID_SIZE_CM
obstacle_y = ry + (distance * math.sin(angle_rad)) / GRID_SIZE_CM
if 0 <= obstacle_x < GRID_ROWS and 0 <= obstacle_y < GRID_COLS:
occupancy_grid[int(obstacle_x)][int(obstacle_y)] = 1
print(f"Obstacle at: X={obstacle_x*GRID_SIZE_CM:.0f}cm, Y={obstacle_y*GRID_SIZE_CM:.0f}cm")
def draw_map():
plt.clf()
plt.imshow(occupancy_grid, cmap='gray_r', origin='lower')
plt.title("Simple Occupancy Grid Map")
plt.xlabel("Y axis (grid cells)")
plt.ylabel("X axis (grid cells)")
plt.pause(0.1)
# ============================================================
# Voice command listener — runs in its own background thread
# ============================================================
def listen_to_commands():
global is_running, mapping_mode
recognizer = sr.Recognizer()
recognizer.energy_threshold = 500
recognizer.dynamic_energy_threshold = False
recognizer.pause_threshold = 0.5
while True:
try:
with sr.Microphone() as source:
print("Waiting for command...")
audio = recognizer.listen(source, timeout=8, phrase_time_limit=4)
command = recognizer.recognize_google(audio, language="en-US").lower()
print(f"Heard: '{command}'")
if "go" in command:
is_running = True
print("-> Go!")
elif "stop" in command:
is_running = False
stop_robot()
print("-> Stop!")
elif "map" in command:
mapping_mode = not mapping_mode
print(f"-> Mapping mode: {mapping_mode}")
except sr.WaitTimeoutError:
pass
except sr.UnknownValueError:
pass
except Exception as e:
print(f"Voice error: {type(e).__name__}: {e}")
# ============================================================
# Cleanup
# ============================================================
def cleanup():
stop_robot()
pwm_A.stop()
pwm_B.stop()
GPIO.cleanup()
picam.stop()
# ============================================================
# Main loop
# ============================================================
if __name__ == '__main__':
try:
print("Voice control active. Say 'go' to start, 'stop' to halt, 'map' to toggle mapping.")
voice_thread = threading.Thread(target=listen_to_commands, daemon=True)
voice_thread.start()
if mapping_mode:
plt.ion()
fig = plt.figure()
while True:
if is_running:
dist = get_distance()
print(f"Distance: {dist:.1f} cm")
if mapping_mode:
update_map(dist, robot_x, robot_y, robot_angle)
draw_map()
if dist > 20:
forward(speed)
if mapping_mode:
angle_rad = math.radians(robot_angle)
robot_x += (speed * 0.02 * math.cos(angle_rad)) / GRID_SIZE_CM
robot_y += (speed * 0.02 * math.sin(angle_rad)) / GRID_SIZE_CM
else:
stop_robot()
backward(50)
time.sleep(0.8)
stop_robot()
choice = decide_direction()
if choice == "left":
turn_left(speed)
if mapping_mode:
robot_angle += 45
else:
turn_right(speed)
if mapping_mode:
robot_angle -= 45
time.sleep(0.8)
stop_robot()
time.sleep(0.2)
else:
stop_robot()
time.sleep(0.1)
except KeyboardInterrupt:
print("Stopped by user.")
finally:
cleanup()
What changed from the individual parts: mapping is no longer its own separate loop bolted awkwardly into the middle of the driving logic (an early draft tried exactly that, and the indentation alone made it unrunnable). Instead, mapping_mode is a toggle — say "map" and the robot starts recording obstacles to the grid while it drives; say it again to turn that off. One flat main loop, no nested while loops fighting each other for control.
Run this file directly (F5 in Thonny) to test driving, obstacle avoidance, camera steering, and voice control together before adding the web layer.
Adding the Web Dashboard
Two more files, same structure as Part 18 — just pointed at the new bossfight.py.
app.py:
from flask import Flask, render_template, request, jsonify
import threading
import time
import speech_recognition as sr
import bossfight
from bossfight import forward, backward, turn_left, turn_right, stop_robot, get_distance, decide_direction, cleanup
app = Flask(__name__)
speed = 50
auto_mode = False
voice_thread_active = False
def voice_recognition_thread():
recognizer = sr.Recognizer()
microphone = sr.Microphone()
global auto_mode, speed, voice_thread_active
with microphone as source:
recognizer.adjust_for_ambient_noise(source)
print("Voice recognition started...")
while voice_thread_active:
try:
with microphone as source:
audio = recognizer.listen(source, timeout=2, phrase_time_limit=3)
command = recognizer.recognize_google(audio, language="en-US").lower()
print(f"Recognized: {command}")
if "forward" in command: forward(speed)
elif "backward" in command: backward(speed)
elif "left" in command: turn_left(speed)
elif "right" in command: turn_right(speed)
elif "stop" in command: stop_robot()
elif "auto" in command: auto_mode = True
elif "manual" in command: auto_mode = False
except sr.WaitTimeoutError:
pass
except Exception:
pass
def auto_pilot_thread():
global auto_mode
while True:
if auto_mode:
distance = get_distance()
print(f"Distance: {distance} cm")
if distance < 30:
stop_robot()
time.sleep(0.5)
decide_direction()
time.sleep(0.8)
else:
forward(speed)
time.sleep(0.2)
@app.route('/')
def index():
return render_template('index.html')
@app.route('/api/control', methods=['POST'])
def control():
data = request.json
action = data.get('action')
global auto_mode, speed
if action == 'forward': forward(speed)
elif action == 'backward': backward(speed)
elif action == 'left': turn_left(speed)
elif action == 'right': turn_right(speed)
elif action == 'stop': stop_robot()
elif action == 'speed_up': speed = min(100, speed + 10)
elif action == 'speed_down': speed = max(10, speed - 10)
elif action == 'mapping':
bossfight.mapping_mode = not bossfight.mapping_mode
return jsonify({"status": "success", "mapping_mode": bossfight.mapping_mode})
return jsonify({"status": "success", "speed": speed})
@app.route('/api/mode', methods=['POST'])
def mode():
data = request.json
global auto_mode
auto_mode = data.get('auto_mode', False)
return jsonify({"status": "success", "auto_mode": auto_mode})
@app.route('/api/voice', methods=['POST'])
def voice_control():
data = request.json
global voice_thread_active
if data.get('start_voice'):
if not voice_thread_active:
voice_thread_active = True
threading.Thread(target=voice_recognition_thread, daemon=True).start()
else:
voice_thread_active = False
return jsonify({"status": "success", "voice_active": voice_thread_active})
if __name__ == '__main__':
threading.Thread(target=auto_pilot_thread, daemon=True).start()
try:
app.run(host='0.0.0.0', port=5000, debug=False)
finally:
cleanup()
A note on the mapping toggle: it imports the whole bossfight module (import bossfight) rather than importing mapping_mode directly. Pulling in a single variable by name (from bossfight import mapping_mode) copies its value at import time — later changes to the real variable inside bossfight.py wouldn’t be reflected. Referencing bossfight.mapping_mode always reads the current, live value.
templates/index.html — identical to Part 18, with one new button added to the movement section:
<button onclick="sendAction('mapping')">Toggle Map</button>
Folder structure:
my_project_env/
├── bossfight.py
├── app.py
└── templates/
└── index.html
Running the Full System
cd /home/pi3/my_project_env
source bin/activate
python3 app.py
Find the Pi’s IP: hostname -I. Open a browser on any device on the same network: http://192.168.x.x:5000.
From there: tap Forward/Left/Right/Backward/Stop for manual control, toggle Mode: Auto for autonomous driving, toggle Enable Voice for voice commands, tap Toggle Map to start recording the room layout while driving.
Say "go," watch it drive. Say "map," watch the obstacle grid start filling in. Say "stop," it freezes. All from a robot that started this blog as two motors and a keyboard.
What This Part Is Really About
Not the code itself — that’s mostly a recombination of things already built. The real lesson is how to combine systems without losing control of them: one shared state (the global variables), one flat main loop, clearly separated functions, and a web layer that reads and writes that same shared state rather than duplicating logic.
That pattern — shared state, single source of truth, clean separation between sensing and acting — is the same pattern real robotics software uses, just at a much larger scale.
Next up: Part 21 — Phase 2 Recap. What we built, what we learned, and where Phase 3 begins.
