New here? This is the recap of Phase 2 — the journey from a Raspberry Pi with no purpose to a fully autonomous, voice-controlled, web-dashboard-equipped robot that maps its surroundings. If you want the play-by-play, Parts 12 through 20 have it all. This is the “what we actually built and why it matters” version.
Where Phase 2 Started
Phase 1 ended with a robot that could see, hear, and move — but only with a laptop permanently tethered to it, either by USB cable or by someone sitting nearby pressing keys. Functional, educational, but not exactly something you’d be proud to show off at a dinner party.
Phase 2 had one job: cut that last cord. Give the robot its own brain, its own senses, and the ability to act on what it senses — without a human in the loop unless we choose to be.
Nine parts later, here’s what exists.
The Five-Step Framework, Revisited
Learn to Move → Perception → Localization → Planning → Control → [repeat from Perception]
This framework has been the backbone of every part since Part 13. It's worth seeing it filled in completely:
Learn to Move (Part 13) — The C101 4WD kit connects to the Raspberry Pi through the L298N motor driver. Python running directly on the Pi controls four motors via GPIO. No laptop, no USB cable, no Arduino middleman. The first time those wheels spun without a human pressing a key was, genuinely, a small milestone worth a coffee.
Perception — touch (Part 14) — The HC-SR04 ultrasonic sensor gives the robot a sense of “something is X centimeters ahead.” Simple physics, no AI, and it works in complete darkness. The robot stops, reverses, and picks a direction before continuing.
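For anyone who skipped Part 14, the "simple physics" is just time-of-flight. Here is a minimal sketch of the conversion, not the code from that part: the function names, the 20 cm stop threshold, and the rounded speed of sound are all assumptions for illustration, and the GPIO timing that produces the pulse width is left out.

```python
# Approximate speed of sound in air at room temperature (assumed constant).
SPEED_OF_SOUND_CM_PER_S = 34300


def echo_to_cm(pulse_seconds):
    """Convert the width of the echo pulse into a one-way distance in cm."""
    # The pulse covers the trip to the obstacle and back, so halve it.
    return pulse_seconds * SPEED_OF_SOUND_CM_PER_S / 2


def should_stop(distance_cm, threshold_cm=20):
    """True when the obstacle is closer than the (hypothetical) threshold."""
    return distance_cm < threshold_cm


distance = echo_to_cm(0.001)  # a 1 ms echo pulse
print(round(distance, 2), should_stop(distance))
```

On the robot, `pulse_seconds` would come from timing how long the sensor's echo pin stays HIGH after a trigger pulse.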
Perception — vision (Part 15–16) — A Pi Camera (the original 5MP module, because the v2 met an unfortunate end against a desk) adds actual sight. Haar Cascade detects faces. Edge density comparison measures which side of the frame has more open space. Combined with the HC-SR04, the robot’s turn decisions stopped being random guesses and became — loosely, approximately — informed choices.
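The edge density idea can be sketched without any camera attached. This version assumes an edge map has already been produced (for example by an OpenCV edge detector) and is passed in as rows of 0s and 1s. It also assumes that fewer edges means more open space. The function names and the tie-breaking rule are hypothetical, not taken from Parts 15 and 16.

```python
def edge_density(edge_map, col_start, col_end):
    """Fraction of edge pixels in columns [col_start, col_end)."""
    total = 0
    edges = 0
    for row in edge_map:
        for value in row[col_start:col_end]:
            total += 1
            if value:
                edges += 1
    return edges / total if total else 0.0


def pick_turn(edge_map):
    """Return 'left' or 'right', whichever half of the frame has fewer edges."""
    width = len(edge_map[0])
    mid = width // 2
    left = edge_density(edge_map, 0, mid)
    right = edge_density(edge_map, mid, width)
    # Fewer edges is treated as more open space; ties go left (arbitrary).
    return "left" if left <= right else "right"


# Tiny 3x4 "frame": the left half is cluttered, the right half is mostly clear.
frame = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
print(pick_turn(frame))  # right
```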
Human-Robot Interaction (Part 17) — A $10 USB microphone and Python’s SpeechRecognition library let the robot respond to spoken commands: “go” and “stop.” No keyboard, no remote, no app. Getting there required a full debug marathon — busy audio devices, misconfigured default sound cards, an energy threshold that auto-detected at 16,731 for reasons nobody fully explained — but it works.
Remote Access (Part 18) — A Flask web server turns any phone or laptop on the same WiFi into a control panel. Buttons for manual driving, a toggle for autonomous mode, a toggle for voice control. No app store, no pairing, just a URL.
Localization, sort of (Part 19) — A simple occupancy grid map, built from HC-SR04 readings and some honest, openly-acknowledged guesswork about position and heading. Not real SLAM — no encoders, no IMU — but the same underlying idea: sense, estimate position, record, repeat.
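The "sense, estimate position, record, repeat" loop is easier to picture with a toy version. The sketch below is an illustration under stated assumptions, not the Part 19 code: the grid size, cell size, and function names are made up, and the driven distance is a guess passed in by hand, which is exactly the guesswork the robot relies on without encoders.

```python
import math

GRID_SIZE = 20  # cells per side (hypothetical)
CELL_CM = 10    # each cell covers 10 cm x 10 cm (hypothetical)


def make_grid():
    """Empty map: 0 means nothing seen there yet."""
    return [[0] * GRID_SIZE for _ in range(GRID_SIZE)]


def advance(x_cm, y_cm, heading_deg, distance_cm):
    """Dead reckoning: project a point forward along the heading."""
    rad = math.radians(heading_deg)
    return (x_cm + distance_cm * math.cos(rad),
            y_cm + distance_cm * math.sin(rad))


def mark_obstacle(grid, x_cm, y_cm, heading_deg, range_cm):
    """Mark the cell where the ultrasonic reading places an obstacle."""
    ox, oy = advance(x_cm, y_cm, heading_deg, range_cm)
    col, row = int(ox // CELL_CM), int(oy // CELL_CM)
    if 0 <= row < GRID_SIZE and 0 <= col < GRID_SIZE:
        grid[row][col] = 1
        return row, col
    return None  # the reading falls outside the map


grid = make_grid()
x, y, heading = 100.0, 100.0, 0.0             # start mid-map, facing +x
x, y = advance(x, y, heading, 30)             # guessed: drove about 30 cm
hit = mark_obstacle(grid, x, y, heading, 25)  # sensor reports 25 cm ahead
print(hit)  # (10, 15)
```

Every error in the guessed distance or heading shifts where obstacles land on the grid, which is why the result is a rough record rather than a real map.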
Everything together (Part 20) — One file, one coherent structure, all of the above working in concert. The lesson that mattered most here wasn’t the code itself — it was learning that combining six separately-built systems works far better when you rebuild them as one system, rather than literally stitching six files together and hoping for the best.
What Actually Broke (The Honest Ledger)
This blog has never pretended things worked on the first try. In the spirit of that, here’s the running list of real problems hit during Phase 2:
- `cv2.imwrite()` failing silently because the destination folder didn’t exist
- A virtual environment created without `--system-site-packages`, breaking every import of `picamera2`
- PyTorch and TFLite, both dead ends on 32-bit ARM, no matter how much swap memory got thrown at the problem
- A camera mounted upside down due to chassis space constraints, fixed with one line: `cv2.flip(frame, -1)`
- A USB microphone that worked one day and inexplicably stopped working the next, fixed with a permanent `/etc/asound.conf`
- A `stop_robot()` function that didn’t actually stop the robot, because it reset PWM duty cycle but left the direction pins HIGH
- Flask refusing to start because an imported file’s main loop ran on import, solved with `if __name__ == '__main__':`
- A mapping loop nested so badly inside another loop that Python wouldn’t even parse the file
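The `stop_robot()` bug is the easiest of these to show in miniature. The sketch below swaps the real GPIO library for a small stand-in class so it runs anywhere. The pin numbers, the `FakePins` class, and the `forward()` helper are all hypothetical, and the actual fix in the project may look different.

```python
class FakePins:
    """Stand-in for the GPIO layer so the logic runs off the Pi (hypothetical)."""

    def __init__(self):
        self.levels = {}  # pin number -> 0 (LOW) or 1 (HIGH)
        self.duty = 0     # PWM duty cycle, 0-100

    def output(self, pin, level):
        self.levels[pin] = level

    def set_duty(self, value):
        self.duty = value


# Hypothetical wiring for the motor driver's IN1-IN4 direction inputs.
DIRECTION_PINS = (17, 18, 22, 23)


def forward(pins, speed=60):
    """Set both motor channels to forward and apply a duty cycle."""
    for pin, level in zip(DIRECTION_PINS, (1, 0, 1, 0)):
        pins.output(pin, level)
    pins.set_duty(speed)


def stop_robot(pins):
    """Stop by zeroing the duty cycle AND pulling every direction pin LOW."""
    pins.set_duty(0)
    # This loop is the part the buggy version was missing.
    for pin in DIRECTION_PINS:
        pins.output(pin, 0)


pins = FakePins()
forward(pins)
stop_robot(pins)
print(pins.duty, pins.levels)
```

Testing motor logic against a fake like this, before it goes on the robot, is one way to catch a "stop that doesn't stop" without chasing the chassis across the floor.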
None of these were exotic. All of them were the kind of thing that happens constantly in real embedded systems work — which is exactly why documenting them has value. A tutorial that never breaks teaches you what working code looks like. A project log that breaks repeatedly teaches you how to fix things, which is the actual skill.
The Honest Cost
Hardware for Phase 2, building on what Phase 1 already had:
| Component | Cost |
|---|---|
| Raspberry Pi 3B | ~$50 |
| C101 4WD kit | ~$20 |
| Pi Camera Module (v1, after the v2 incident) | ~$10–15 |
| HC-SR04 | ~$4 |
| USB Microphone | ~$10 |
| 7.4V Li-ion battery pack | ~$12 |
| Power bank for the Pi | ~$15 |
Call it $120–130 total for Phase 2, on top of whatever was already on hand from Phase 1. Less than the cost of a mid-range robot vacuum, and arguably more educational than owning ten of them.
What This Robot Actually Is Now
A C101 chassis that:
- Drives on its own power, no remote, no cable
- Stops before hitting things
- Picks a turn direction based on a real (if imperfect) read of its surroundings
- Responds to voice commands
- Can be controlled from a phone browser
- Keeps a rough record of where it’s seen obstacles
Is it a Roomba? No — a Roomba has actual encoders, a proper IMU, and engineers who get paid to make it work reliably. Is it Bumblebee? Also no, but Bumblebee isn’t real, so the bar there was always going to be a stretch.
What it is: a robot built from roughly $250 total across both phases, by someone who started this whole project not entirely sure what an Arduino pin actually does, debugging in real time, documenting every wrong turn along the way.
What’s Next
Phase 3 doesn’t have a fixed shape yet — and that’s intentional. Some candidate directions: a warehouse-style navigation robot that follows a path and reports status, a security camera system that detects people and sends alerts, an automatic plant-watering setup, a face-recognition door lock built on the work from Part 8B.
Which of these actually gets built will depend on what’s interesting once Phase 2 has had time to breathe — and on what hardware happens to be sitting in the garage.
For now: the robot moves, senses, decides, listens, and remembers. That’s not nothing. That’s actually quite a lot.
Phase 2 complete. Phase 3 — to be decided together.

