Before we get into the robot vision stuff, let's talk about one of the most legendary tools in all of programming.
The Print Statement: Debugging's Best Friend
Every language has one:
- Python: print()
- C: printf()
- PHP: echo
- Java: System.out.println()
Why "legendary"? Two reasons.
First, it's the best debugging tool ever invented. Imagine a truck leaving Los Angeles headed for New York. It departs on time, but never arrives. Where did it go wrong? You'd check the cameras along the route — Utah, Colorado, Nebraska, Iowa, Illinois... If the truck shows up in Nebraska but disappears by Iowa, you know something happened somewhere between those two states.
Code works the same way. Scatter a few print() statements through hundreds of lines of code. The bug is hiding somewhere between the last message that shows up on screen and the first one that doesn't.
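Here is a small sketch of the checkpoint idea. The function parse_readings and its input format are made up for this illustration; they are not part of the robot code.

```python
def parse_readings(raw):
    """Average a comma-separated string of numbers (hypothetical example)."""
    print("checkpoint 1: raw input =", raw)
    parts = raw.split(",")
    print("checkpoint 2: split into", len(parts), "parts")
    values = [float(p) for p in parts]  # raises ValueError if a part is not a number
    print("checkpoint 3: converted to floats")
    return sum(values) / len(values)


if __name__ == "__main__":
    parse_readings("1.0,2.0,3.0")  # all three checkpoints print
    try:
        parse_readings("1.0,oops,3.0")  # checkpoints 1 and 2 print, 3 never does
    except ValueError as err:
        print("crashed after checkpoint 2:", err)
```

In the second call the truck made it to checkpoint 2 and never reached checkpoint 3, so the problem sits on the line between them.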
Second, it's a placeholder. Imagine you're writing code for our robot and you know it should eventually stop the motors when it sees an obstacle — but you haven't written that part yet. No problem:
print("Obstacle detected — stopping!") # Placeholder: replace with motor stop command later
The print statement holds the spot. Like booking a hotel room before you decide what you're doing that day. The logic is there; the implementation comes later.
We'll use this trick a lot in this part.
What We're Actually Building
In this project, we use the laptop webcam to build a real-time obstacle avoidance system — using YOLOv8 to detect objects and OpenCV to process the video stream.
One honest clarification: our robot won't actually avoid anything yet. It will announce that it should. The motors aren't connected to this code. But the logic — the perception → decision loop — is real. And that loop is the foundation of every autonomous system on the planet.
How It Works
1. Detection
OpenCV grabs each frame from the webcam, and YOLOv8 identifies what's in it: person, chair, bottle, whatever.
2. Distance Estimation (The Clever Workaround)
Regular cameras can't directly measure distance. But there's a neat trick based on a simple principle:
The closer an object gets to the camera, the larger it appears in the frame.
The formal version involves similar triangles and looks like this:
D = (W × f) / P
Where:
- D = distance from camera to object
- W = real-world width of the object (e.g. 0.5 meters)
- f = focal length of the camera (in pixels)
- P = width of the object as it appears in the frame (in pixels)
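To get a feel for the numbers, here is a minimal sketch of that formula. The function name estimate_distance and the focal length of 600 pixels are assumptions chosen for illustration; a real focal length has to come from calibrating your own camera.

```python
def estimate_distance(real_width_m, focal_length_px, pixel_width_px):
    """Distance D = (W * f) / P, in the same unit as real_width_m."""
    if pixel_width_px <= 0:
        raise ValueError("pixel width must be positive")
    return (real_width_m * focal_length_px) / pixel_width_px


# Assumed values: a 0.5 m wide object and a focal length of 600 px.
print(estimate_distance(0.5, 600, 200))  # object spans 200 px -> 1.5
print(estimate_distance(0.5, 600, 100))  # object spans 100 px -> 3.0
```

Halving the pixel width doubles the estimated distance, which is the "closer means bigger" principle expressed as arithmetic.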
We're not going to use that formula. It requires calibrating the camera, knowing the real size of every possible object, and other headaches we don't need right now.
Instead, we use the shortcut: if the bounding box width (in pixels) exceeds a threshold, the object is too close.
3. Decision
- Object too close (box width > 200px) → WARNING: STOP
- Object at safe distance → SAFE TO MOVE
That's the full loop. Simple, but real.
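Stripped of the camera and the model, the decision step can be sketched as one small function. The names decide and STOP_WIDTH_PX are hypothetical; only the 200 pixel threshold and the two messages come from the description above.

```python
STOP_WIDTH_PX = 200  # threshold from the text; tune it for your camera


def decide(box_width_px, threshold_px=STOP_WIDTH_PX):
    """Return the message for one detected object based on its box width."""
    if box_width_px > threshold_px:
        return "WARNING: STOP"
    return "SAFE TO MOVE"


print(decide(320))  # WARNING: STOP
print(decide(120))  # SAFE TO MOVE
```

Keeping the rule in its own function makes it easy to try different thresholds without touching the video code.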
The Code
Install libraries if you haven't already (open PyCharm Terminal):
pip install opencv-python ultralytics
Create a new Python file — call it avoidance.py — and paste this in:
import cv2
from ultralytics import YOLO

# 1. Load YOLOv8 nano model
model = YOLO('yolov8n.pt')

# 2. Connect to webcam (0 = default laptop webcam)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break

    # 3. Run YOLOv8 on the current frame
    results = model(frame, stream=True)

    for r in results:
        boxes = r.boxes
        for box in boxes:
            # Get bounding box coordinates
            x1, y1, x2, y2 = box.xyxy[0].int().tolist()

            # Calculate width of the bounding box in pixels
            w = x2 - x1

            # If the object appears large enough, it's too close
            if w > 200:
                cv2.putText(frame, "WARNING: STOP!", (50, 50),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
                # Placeholder: replace with actual motor stop command later
                # print("Obstacle detected, stopping!")
            else:
                cv2.putText(frame, "SAFE TO MOVE", (50, 50),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    # Display the video feed
    cv2.imshow("Obstacle Avoidance", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Hit Run. Walk toward the webcam. Watch "SAFE TO MOVE" flip to "WARNING: STOP!" as you get close.
Press q to quit.
Code Walkthrough
Initializing the camera
model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)
YOLO('yolov8n.pt') loads the nano model, the smallest and fastest of the YOLOv8 family. VideoCapture(0) opens the default webcam. If your laptop has more than one camera and the wrong one opens, try 1 instead of 0.
The main loop
while cap.isOpened():
    success, frame = cap.read()
Reads the video stream one frame at a time. cap.read() returns two things: whether the read succeeded, and the actual image data.
Detection
results = model(frame, stream=True)
Passes the frame into YOLOv8. stream=True returns a generator instead of a list — more memory-efficient when processing video.
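If the list-versus-generator distinction is new, this plain Python analogy may help. It does not call YOLOv8 at all; detect_all and detect_stream are made-up stand-ins that only show the difference in behavior.

```python
def detect_all(frames):
    """Stand-in for list-style results: everything is computed up front."""
    return [f"result for {frame}" for frame in frames]


def detect_stream(frames):
    """Stand-in for generator-style results: one result per request."""
    for frame in frames:
        yield f"result for {frame}"


frames = ["frame 0", "frame 1", "frame 2"]
as_list = detect_all(frames)
as_stream = detect_stream(frames)

print(len(as_list))     # 3, all results already sit in memory
print(next(as_stream))  # result for frame 0, the rest is not computed yet
```

A for loop works the same way over either one, which is why the detection loop in our code doesn't need to care which kind it gets.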
Reading the bounding boxes
x1, y1, x2, y2 = box.xyxy[0].int().tolist()
w = x2 - x1
Each detected object comes with a bounding box: x1, y1 is the top-left corner, x2, y2 is the bottom-right. w is the width of that box in pixels.
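The corner arithmetic can be tried without a camera. In this sketch the helper box_size and the sample coordinates are invented for illustration; the code in this part only needs the width.

```python
def box_size(x1, y1, x2, y2):
    """Width and height in pixels of a box given by its top-left and bottom-right corners."""
    return x2 - x1, y2 - y1


w, h = box_size(100, 80, 340, 400)
print(w, h)  # 240 320
```

With these sample corners the box is 240 pixels wide, so it would cross the 200 pixel threshold.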
The decision logic
if w > 200:
    ...  # Too close: show the warning
else:
    ...  # Safe distance: show the all-clear
This is the core of the whole system. 200 pixels is a starting threshold — adjust it based on your webcam and how sensitive you want the system to be. The closer the object, the bigger w gets. Simple, no calibration required.
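One thing worth noticing: the check runs once per detected box, so with several objects in view both messages can end up drawn on top of each other at the same spot. A possible refinement, sketched here with a hypothetical helper called frame_status, is to collect the widths first and make a single decision per frame. Treating an empty frame as safe is an assumption of this sketch, not something the original code does.

```python
def frame_status(box_widths, threshold_px=200):
    """One decision per frame: stop if any detected box is wider than the threshold."""
    if any(w > threshold_px for w in box_widths):
        return "WARNING: STOP!"
    return "SAFE TO MOVE"


print(frame_status([80, 240, 150]))  # WARNING: STOP!  (one box is too wide)
print(frame_status([]))              # SAFE TO MOVE    (nothing detected)
```

In the main loop you would append each w to a list inside the box loop, then draw the returned message once per frame.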
Cleanup
cap.release()
cv2.destroyAllWindows()
Always release the camera and close the windows when you're done. Otherwise the webcam can stay "occupied", and you may have to restart PyCharm before you can use it again.
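If the loop crashes halfway, those two cleanup lines never run. One common way to guard against that is try/finally. The sketch below uses a made-up FakeCamera class in place of cv2.VideoCapture so it runs without a webcam; in the real script, the finally block would hold cap.release() and cv2.destroyAllWindows().

```python
class FakeCamera:
    """Stand-in for cv2.VideoCapture so this sketch runs without a webcam."""

    def __init__(self):
        self.released = False

    def read(self):
        raise RuntimeError("simulated crash while reading a frame")

    def release(self):
        self.released = True


def run(cap):
    try:
        cap.read()     # the main loop would go here
    finally:
        cap.release()  # runs even if the code above raises


cam = FakeCamera()
try:
    run(cam)
except RuntimeError:
    pass
print(cam.released)  # True
```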
Bonus: Using Your Phone Camera Instead of the Webcam
If you want to use IP Webcam (from Part 5) instead of the built-in webcam, just change one line:
# Replace this:
cap = cv2.VideoCapture(0)
# With this (use your actual IP and port):
cap = cv2.VideoCapture('http://192.168.1.XXX:8080/video')
Everything else stays exactly the same. Now your phone becomes the robot's eyes — which makes a lot more sense than mounting a laptop on a car chassis.
What Did We Actually Build?
Technically: a program that watches a video feed and draws warnings on screen.
In reality: the perception → decision loop — the most fundamental cycle in all of autonomous robotics. Every self-driving car, every warehouse robot, every drone doing collision avoidance runs on a version of this exact loop. Ours just prints text instead of turning a steering wheel.
The placeholder comment in the code marks where the real robot commands will go. In a future part, we'll replace that commented-out print() statement with actual motor control signals. The logic is already there. The hardware connection is next.
Next up: we skip the keyboard entirely. In Part 7, we talk to our robot.