VN4000: Adding Eyes to the Robot with Pi Camera and OpenCV

New here? This part adds a Pi Camera to the C101 robot and runs live object detection and obstacle awareness using OpenCV — no heavy AI model required. The setup builds on the Raspberry Pi environment from Part 12.

If remote connection via SSH or VNC feels unfamiliar, Part 12 has the full walkthrough. Otherwise, let's go.

Where We Are in the Loop

A quick reminder of the five-step framework we keep coming back to:

Learn to Move → Perception → Localization → Planning → Control → [repeat from Perception]

Part 13: ✅ Learn to Move. The robot drives on its own.
Part 14: ✅ Perception (touch). The HC-SR04 detects obstacles and the robot reacts.
Part 15: ✅ Perception (vision). The robot gets eyes.

The HC-SR04 did a solid job in Part 14. But let's be honest about what it can and can't do.

Imagine walking down a sidewalk with your eyes closed. You extend both arms to feel for obstacles ahead: if nothing's there, you keep walking; if something is, you step around it. Reasonable enough, until the moment you step off the curb and a silent electric car coming the wrong way is suddenly three feet away. Your hands told you the path was clear. They just couldn't see the drop, or the car.

That's exactly the situation our C101 is in. The HC-SR04 says "something is 25 cm ahead", which is useful. But it has no idea whether that something is a wall, a chair leg, or the edge of a table. And it has absolutely no opinion about what's to the left or right.

The robot needs eyes. That means a camera.

A Quick Note on Hardware Choices

This project was supposed to use a Pi Camera Module v2. Then it met the edge of a desk at speed and that was the end of that.

So we're running on a Pi Camera Module 1 — the original, 5MP, discontinued, "why do you still have this" version. Combined with a Raspberry Pi 3B that the rest of the world has largely moved on from, the whole setup feels a bit like installing twenty apps on an iPhone 6 while everyone around you is talking about the iPhone 18.

It works. That's the point.

Connecting the Pi Camera

Power off the Pi first — always, before touching the camera connector. The CSI port doesn't forgive hot-plugging.

The camera ribbon cable connects to the CSI port on the Pi 3B — the small slot between the HDMI port and the 3.5mm audio jack.

Pull the two black locking tabs straight up gently — about 2–3mm
Insert the ribbon cable with the silver contacts facing the HDMI port and the blue side facing the Ethernet and USB ports
Press the locking tabs back down firmly until they click

Power the Pi back on and verify the camera is detected:

rpicam-still -o test.jpg

If you see "Still capture image received", the camera is working. If the command isn't found, you're on an older OS version; try libcamera-still -o test.jpg, or raspistill -o test.jpg on the legacy camera stack.
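If you'd rather not guess which command your OS ships, here is a minimal sketch that checks the PATH for each name in turn. The helper name pick_capture_command is made up for this post, and libcamera-still is included on the assumption that your OS sits between the raspistill and rpicam-still eras.

```python
import shutil

# Capture command names, newest camera stack first.
# rpicam-still and raspistill come from the text above; libcamera-still
# is the name used by OS releases in between (assumption for your setup).
CANDIDATES = ["rpicam-still", "libcamera-still", "raspistill"]


def pick_capture_command(candidates=CANDIDATES, which=shutil.which):
    """Return the first capture command found on PATH, or None.

    `which` is injectable so the function can be tried without a Pi.
    """
    for name in candidates:
        if which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    cmd = pick_capture_command()
    if cmd is None:
        print("No camera capture command found on PATH.")
    else:
        print(f"Try: {cmd} -o test.jpg")
```

This only tells you which tool is installed. Whether the camera itself is detected still depends on the ribbon cable and the capture test above.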

Remote Connection Recap

The Pi is a tiny computer. To work with it, we borrow the laptop's keyboard and screen — but wirelessly, so no cables trail behind the robot.

SSH gives you a command line on the Pi from your laptop terminal:

# Windows: use PuTTY → Host Name = Pi's IP → SSH → port 22
# macOS/Linux:
ssh pi3@192.168.1.42

VNC gives you the full Pi desktop — mouse, icons, Thonny IDE — streamed to your laptop screen. Before using VNC Viewer on the laptop, enable VNC on the Pi first:

sudo raspi-config

→ Interface Options → VNC → Yes → Finish

Then open RealVNC Viewer on the laptop, enter the Pi's IP address, log in. The Pi desktop appears on screen as if you were sitting in front of a monitor plugged into it.

Full details in Part 12 if needed.

Why Not YOLO? Why Not TFLite?

In Phase 1, we used YOLOv8 on the laptop without a second thought. The laptop has the CPU headroom for it.

The Pi 3B does not.

The original plan for this part used TFLite (TensorFlow Lite) — Google's lightweight AI framework designed for embedded devices. One install attempt later:

ERROR: No matching distribution found for tflite-runtime

That error means pip found no tflite-runtime wheel for this setup: a 32-bit ARM OS on the Pi 3B. Dead end.

PyTorch — same story. No 32-bit ARM build. Install attempt downloads 426MB, crashes halfway through. Every time.

So we do what this blog has always done: find what's already there and use it well. Haar Cascade is built into OpenCV, requires no model download, no GPU, no 500MB dependency chain. It runs at a perfectly respectable speed on the Pi 3B. And combined with edge density detection, it gives the robot everything it needs for Part 16.

Setting Up the Environment

Install picamera2 — the official camera library for modern Raspberry Pi OS:

sudo apt install -y python3-picamera2

Then recreate the virtual environment with system-site-packages access (this is the flag that prevents the "No module named libcamera" error):

deactivate
cd ~
mv my_project_env my_project_env_backup
python3 -m venv my_project_env --system-site-packages
source my_project_env/bin/activate
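To confirm the new environment really has system-site-packages switched on, you can read the pyvenv.cfg file that venv writes into the environment folder. A minimal sketch, assuming it is run with the environment's own python3; the function name system_site_enabled is made up for this post.

```python
import sys
from pathlib import Path


def system_site_enabled(cfg_text):
    """Return True if pyvenv.cfg text sets include-system-site-packages = true."""
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    # Inside a venv, sys.prefix points at the environment folder.
    cfg = Path(sys.prefix) / "pyvenv.cfg"
    if cfg.is_file():
        print("system site packages:", system_site_enabled(cfg.read_text()))
    else:
        print("No pyvenv.cfg found; this interpreter is probably not in a venv.")
```

If it prints False, the environment was created without the flag and picamera2 from apt will not be visible inside it.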

Verify everything is in order:

python3 -c "from picamera2 import Picamera2; import cv2; import RPi.GPIO as GPIO; print('All OK')"

If you see All OK — ready to write code.
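If the one-liner fails, it stops at the first bad import and leaves you guessing about the rest. Here is a small sketch that lists every module Python cannot locate. The helper name missing_modules is made up for this post, and it only checks that a module can be found, not that it imports cleanly.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that Python cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. RPi for RPi.GPIO) is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(["picamera2", "cv2", "RPi.GPIO"])
    print("All OK" if not missing else f"Missing: {', '.join(missing)}")
```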

The Code

Open Thonny via VNC Viewer (Menu → Programming → Thonny Python IDE). Create a new file, save it as PicameraDetectV1.py, and paste this in:

import cv2
import numpy as np
from picamera2 import Picamera2

# Load Haar Cascade detectors
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
# body_cascade is loaded here but not used in the loop below
body_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_upperbody.xml')

# Init camera
picam2 = Picamera2()
picam2.preview_configuration.main.format = 'RGB888'
picam2.configure("preview")
picam2.start()

print("Object detection running. Press q in the video window or Ctrl+C to stop.")

try:
    while True:
        frame = picam2.capture_array()
        # If your camera image appears upside down, uncomment the line below:
        # frame = cv2.flip(frame, -1)

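        # picamera2's 'RGB888' format is already in OpenCV's BGR channel order,
        # so no colour conversion is needed before drawing or display.
        frame_bgr = frame.copy()
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)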

        # Detect faces using Haar Cascade (each face is labelled "Person")
        faces = face_cascade.detectMultiScale(gray, 1.1, 5, minSize=(30, 30))
        for (x, y, w, h) in faces:
            cv2.rectangle(frame_bgr, (x, y), (x+w, y+h), (0, 255, 0), 2)
            cv2.putText(frame_bgr, "Person", (x, y-10),
                       cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
            print(f"Detected: Person at x={x}, y={y}")

        # Measure edge density on left vs right half of frame
        edges = cv2.Canny(gray, 50, 150)
        h, w = edges.shape
        left_density = np.sum(edges[:, :w//2]) / 255
        right_density = np.sum(edges[:, w//2:]) / 255

        # Draw center dividing line
        cv2.line(frame_bgr, (w//2, 0), (w//2, h), (255, 0, 0), 1)

        # Determine which side is clearer
        if left_density < right_density:
            direction = "LEFT is clearer"
            color = (0, 255, 255)
        elif right_density < left_density:
            direction = "RIGHT is clearer"
            color = (0, 255, 255)
        else:
            direction = "Both sides equal"
            color = (128, 128, 128)

        cv2.putText(frame_bgr, direction, (10, 30),
                   cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
        cv2.putText(frame_bgr, f"L:{left_density:.0f} R:{right_density:.0f}",
                   (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)

        cv2.imshow('Robot Vision', frame_bgr)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

except KeyboardInterrupt:
    print("Stopped.")
finally:
    picam2.stop()
    cv2.destroyAllWindows()

Press F5 to run.
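To put a number on "perfectly respectable speed" for your own Pi, here is a minimal rolling frame-rate counter. It is a sketch, not part of the original script: the FpsMeter class and its window size of 30 frames are made up for this post.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30, clock=time.perf_counter):
        self._stamps = deque(maxlen=window)
        self._clock = clock

    def tick(self):
        """Call once per frame; returns the current estimate (0.0 at first)."""
        self._stamps.append(self._clock())
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        if elapsed <= 0:
            return 0.0
        # N timestamps span N-1 frame intervals.
        return (len(self._stamps) - 1) / elapsed
```

To use it, create meter = FpsMeter() before the while loop, call fps = meter.tick() once per frame, and print the value or draw it with cv2.putText next to the L/R numbers.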

Two Things That Might Go Wrong (And How to Fix Them)

Problem 1: module 'cv2' has no attribute 'data'

The installed OpenCV build is too old, or it is a distro package that does not include the cv2.data module. Fix:

pip install --upgrade opencv-python
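If upgrading isn't an option, the cascade XML files may still be sitting on disk somewhere else. Here is a hedged sketch that looks for them. The helper name find_cascade is made up for this post, and the two fallback directories are assumptions about where distro packages tend to put the files, so check your own system.

```python
import os

# Directories where distro packages commonly place the cascade XML files.
# These paths are assumptions; adjust them for your system.
FALLBACK_DIRS = [
    "/usr/share/opencv4/haarcascades",
    "/usr/share/opencv/haarcascades",
]


def find_cascade(filename, search_dirs=FALLBACK_DIRS):
    """Return the full path to a cascade file, or None if it is not found.

    Tries cv2.data.haarcascades first when that attribute exists.
    """
    dirs = list(search_dirs)
    try:
        import cv2
        dirs.insert(0, cv2.data.haarcascades)
    except (ImportError, AttributeError):
        pass  # cv2 is missing, or this build has no cv2.data
    for directory in dirs:
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None
```

The returned path can be passed straight to cv2.CascadeClassifier in place of the cv2.data.haarcascades + '...' expression.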

Problem 2: The camera image is upside down

This one is actually funny. Due to the tight packing of components on the C101 chassis, the camera ribbon exits the CSI port and naturally hangs facing downward at the front of the car — which means the image is flipped 180°. The fix is exactly one line, added right after capture_array():

frame = picam2.capture_array()
frame = cv2.flip(frame, -1)  # Flip 180° if camera is mounted upside down

-1 flips both horizontally and vertically. If only one axis is wrong: 0 for vertical flip only, 1 for horizontal only.
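To see what each flip code does without pointing the camera at anything, here is a tiny NumPy stand-in you can run on a 2x2 array. flip_like_cv2 is a made-up helper for illustration, not an OpenCV function; it mirrors the three cases described above using array slicing.

```python
import numpy as np


def flip_like_cv2(frame, flip_code):
    """NumPy slicing equivalent of the three cv2.flip modes (illustrative)."""
    if flip_code == 0:
        return frame[::-1, :]    # vertical flip: rows reversed
    if flip_code > 0:
        return frame[:, ::-1]    # horizontal flip: columns reversed
    return frame[::-1, ::-1]     # both axes: a 180 degree rotation


tiny = np.array([[1, 2],
                 [3, 4]])
print(flip_like_cv2(tiny, -1).tolist())  # [[4, 3], [2, 1]]
print(flip_like_cv2(tiny, 0).tolist())   # [[3, 4], [1, 2]]
print(flip_like_cv2(tiny, 1).tolist())   # [[2, 1], [4, 3]]
```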

What the Code Actually Does

Haar Cascade face detection: The same algorithm from Part 8A, now running on the Pi. When a face appears in frame, a green bounding box appears with the label "Person", and the coordinates are printed to the terminal.

Edge density comparison: cv2.Canny detects edges in the grayscale image, that is, the outlines of objects, walls, and furniture. The frame is split down the middle, and the number of edge pixels on each side is compared. More edges = more stuff = less clear path. The result, "LEFT is clearer" or "RIGHT is clearer", is displayed on screen. This is a heuristic, not a distance measurement: a blank wall has few edges and will look clear.

A blue vertical line divides the frame in half so you can see exactly what the algorithm is comparing. The numbers L:xxx R:xxx show the raw density values in real time.
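The left/right comparison is easy to try without a camera. The sketch below runs the same split-and-count logic on a small synthetic edge map standing in for cv2.Canny output. The function name clearer_side is made up for this post, and it uses np.count_nonzero, which gives the same counts as np.sum(...) / 255 when the map holds only 0 and 255.

```python
import numpy as np


def clearer_side(edges):
    """Compare edge-pixel counts in the left and right halves of an edge map.

    `edges` is a 2-D array shaped like cv2.Canny output: 255 on edges,
    0 elsewhere. Returns (left_count, right_count, label).
    """
    mid = edges.shape[1] // 2
    left = int(np.count_nonzero(edges[:, :mid]))
    right = int(np.count_nonzero(edges[:, mid:]))
    if left < right:
        label = "LEFT is clearer"
    elif right < left:
        label = "RIGHT is clearer"
    else:
        label = "Both sides equal"
    return left, right, label


# Synthetic 4x6 edge map: one edge pixel on the left half, three on the right.
demo = np.zeros((4, 6), dtype=np.uint8)
demo[0, 1] = 255
demo[1:4, 4] = 255
print(clearer_side(demo))  # (1, 3, 'LEFT is clearer')
```

When the frame width is odd, the extra column lands in the right half, which is the same behaviour as the w//2 slicing in the main script.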

This direction signal is exactly what Part 16 will use to replace the random.choice(['left', 'right']) from Part 14. Instead of guessing, the robot will look, measure, and decide.
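Part 16's code isn't written here, so treat the following as a guess at the shape of that swap rather than the real thing. The function name choose_turn and the margin parameter are made up for this sketch: when the two densities are too close to call, it falls back to the old random pick.

```python
import random


def choose_turn(left_density, right_density, margin=0.0, rng=random):
    """Pick 'left' or 'right' based on which half has fewer edge pixels.

    `margin` is a hypothetical tuning knob: the minimum difference needed
    to trust the camera. Within it, fall back to a random choice.
    """
    if abs(left_density - right_density) <= margin:
        return rng.choice(['left', 'right'])
    return 'left' if left_density < right_density else 'right'


print(choose_turn(100, 900))  # left
print(choose_turn(900, 100))  # right
```

A non-zero margin keeps the robot from flip-flopping when both halves of the frame look about equally busy.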

What We Just Built

The robot now has two senses working independently:

✅ HC-SR04: tells it something is in the way
✅ Pi Camera + OpenCV: tells it which direction has more room

In Part 16, these two come together. The HC-SR04 triggers the stop. The camera picks the turn. The robot navigates.

Next up: Part 16 — Everything combined. The robot sees, senses, decides, and moves.

Monday, June 8, 2026

Adding Eyes to the Robot with Pi Camera and OpenCV — Part 15: The Robot Learns to Look