Let's get started. This project uses face detection with the Haar Cascade method, so we first need to download the cascade classifier for face detection from GitHub. You can download the file from this link and place it in the same folder as your script.
Let us first understand what a Haar Cascade is.
Haar Cascade is a machine learning object detection algorithm that identifies objects in an image or video based on simple rectangular features, known as Haar-like features.
If you want to read a more detailed version of it, check out this link.
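To make the idea of a "feature" concrete, here is a minimal sketch of one Haar-like feature: the pixel sum of the left half of a window minus the pixel sum of the right half, computed with an integral image. The function names and the tiny test patch are made up for illustration; this is not how OpenCV exposes its internals, only the underlying idea.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with an extra zero row and column at the top-left."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle at (x, y) with size w x h."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, x, y, w, h):
    """Left-half sum minus right-half sum (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Hypothetical 2 x 4 patch: bright on the left, dark on the right
patch = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0]], dtype=np.uint8)
ii = integral_image(patch)
feature_value = two_rect_feature(ii, 0, 0, 4, 2)
```

A large positive or negative value means the two halves differ strongly in brightness, which is the kind of cue a cascade combines many times over to decide whether a window contains a face.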
Now we start by importing the libraries and defining a variable that captures video from the webcam.
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
Next, let us load the classifier file that we downloaded from GitHub.
face = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
Write a while loop to capture the image frames. We also mirror each frame so that the preview behaves like a mirror.
while True:
    ret, frame = cap.read()     # Read an image frame
    frame = cv2.flip(frame, 1)  # Mirror the frame horizontally
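A positive flip code makes cv2.flip() reverse the order of the columns, which is the same as a NumPy slice with a negative step. The small array below stands in for an image and is made up purely for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 "image"; each number stands for one pixel
frame = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Reverse the column order: the left edge becomes the right edge
mirrored = frame[:, ::-1]
```

Each row keeps its vertical position; only the left-to-right order changes.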
All we have to do now is convert the image to grayscale and use the detectMultiScale() method to detect faces in the image frame.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    detect_face = face.detectMultiScale(gray, 1.2, 1)
detectMultiScale() is called here with three arguments. The first is the source image. The second is the scale factor (scaleFactor), which specifies how much the image size is reduced at each image scale. The third is the minimum number of neighbors (minNeighbors), which specifies how many neighboring detections each candidate rectangle needs in order to be kept. Only the image is mandatory; the other two have default values. The method returns one rectangle per detected face as (x, y, width, height).
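To get a feel for the scale factor, the sketch below lists the image sizes a detector would scan if it shrank the frame by that factor at every step until the frame became smaller than the detection window. The function name and the 24-pixel window size are assumptions for illustration, and the loop only approximates what OpenCV does internally.

```python
def pyramid_sizes(width, height, scale_factor=1.2, min_size=24):
    """List the (width, height) of each level of a simple image pyramid."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    scale = 1.0
    while True:
        w = int(round(width / scale))
        h = int(round(height / scale))
        # Stop once the detection window (assumed square) no longer fits
        if w < min_size or h < min_size:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes

levels_fine = pyramid_sizes(640, 480, scale_factor=1.2)
levels_coarse = pyramid_sizes(640, 480, scale_factor=1.5)
```

A scale factor closer to 1 produces more levels, so the detector is less likely to miss a face of an in-between size but has more work to do per frame.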
Remember, our goal is to find the distance of the face from the camera. For that, we need the area of the enclosing rectangle, which the following code computes.
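    font = cv2.FONT_HERSHEY_SIMPLEX  # Font for the on-screen text
    for (x, y, z, h) in detect_face:  # z is the rectangle width, h its height
        cv2.rectangle(frame, (x, y), (x+z, y+h), (0, 255, 0), 2)
        ROI = gray[y:y+h, x:x+z]  # NumPy indexes rows (y) first, then columns (x)
        length = ROI.shape[0]
        breadth = ROI.shape[1]
        Area = length * breadth
        display = 'Area = ' + str(Area)
        if Area > 0:
            cv2.putText(frame, display, (5, 50), font, 2, (255, 255, 0), 2, cv2.LINE_AA)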
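Because the area is derived from a NumPy slice, the index order matters: image arrays are indexed as [row, column], that is [y, x]. The sketch below uses a blank 640 x 480 frame and a made-up rectangle to show that slicing rows first gives the full rectangle, while swapping the axes can silently clip it at the image border.

```python
import numpy as np

# Blank grayscale frame: 480 rows (height) by 640 columns (width)
gray = np.zeros((480, 640), dtype=np.uint8)

# Hypothetical detection: top-left corner (x, y), width w, height h
x, y, w, h = 400, 100, 150, 120

roi = gray[y:y + h, x:x + w]          # rows first, then columns
area_from_roi = roi.shape[0] * roi.shape[1]
area_direct = w * h                   # same value without slicing

swapped = gray[x:x + w, y:y + h]      # axes swapped: rows 400..550 get clipped at 480
area_swapped = swapped.shape[0] * swapped.shape[1]
```

Since the rectangle returned by the detector already contains the width and height, multiplying them directly gives the same area as the correctly ordered slice.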
We display the area on the original image, which helps with calibration.
You can go through the video for the calibration process.
Measure the distance from the camera to your face with a tailor's measuring tape and note the corresponding area. Repeat this for several distances and store the data in an Excel file. Then plot the data, fit a trendline, and insert the trendline's equation into the code. The coefficients in the code below come from one such calibration, so replace them with the ones from your own trendline.
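If you prefer to stay in Python rather than a spreadsheet, the trendline can be fitted with numpy.polyfit. The sketch below assumes a second-degree polynomial, matching the shape of the equation used in this article's final code. The sample points are not real measurements: they are generated from that same equation purely to show that the fit recovers its coefficients. With your own data, you would replace areas and distances with your measured values.

```python
import numpy as np

def article_trendline(area):
    """The quadratic trendline used in this article's final code."""
    return 3e-9 * area ** 2 - 0.001 * area + 108.6

# Synthetic stand-ins for calibration data (not real measurements)
areas = np.linspace(10_000, 150_000, 15)
distances = article_trendline(areas)

# Fit distance = a * area^2 + b * area + c
coeffs = np.polyfit(areas, distances, deg=2)
a, b, c = coeffs
```

np.polyfit returns the coefficients from the highest power down, so a, b and c map directly onto the terms of the trendline equation.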
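    font = cv2.FONT_HERSHEY_SIMPLEX  # Font for the on-screen text
    for (x, y, z, h) in detect_face:  # z is the rectangle width, h its height
        cv2.rectangle(frame, (x, y), (x+z, y+h), (0, 255, 0), 2)
        ROI = gray[y:y+h, x:x+z]  # NumPy indexes rows (y) first, then columns (x)
        length = ROI.shape[0]
        breadth = ROI.shape[1]
        Area = length * breadth
        # Trendline from the calibration step; replace the coefficients with your own
        Distance = 3 * (10 ** (-9)) * (Area ** 2) - 0.001 * Area + 108.6
        display = 'Distance = ' + str(Distance)
        if Area > 0:
            cv2.putText(frame, display, (5, 50), font, 2, (255, 255, 0), 2, cv2.LINE_AA)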
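It can be handy to keep the trendline in a small function so the coefficients live in one place and the on-screen text is rounded. This is only a sketch: the function names are made up, and the default coefficients are the ones from this article's calibration, not values that will suit every camera.

```python
def estimate_distance(area, a=3e-9, b=-0.001, c=108.6):
    """Evaluate the calibrated trendline distance = a*area^2 + b*area + c."""
    return a * area ** 2 + b * area + c

def distance_label(area):
    """Text for cv2.putText, rounded to one decimal place."""
    return f"Distance = {estimate_distance(area):.1f}"

label = distance_label(100_000)
```

Note that a quadratic trendline is only trustworthy inside the range you calibrated. With these coefficients the curve reaches its minimum near an area of about 166,667 pixels and rises again beyond it, so very close faces would be reported as moving away.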
That is it. Our code is ready. Don't forget to release the camera and close all the windows.
cap.release()            # Release the camera
cv2.destroyAllWindows()  # Close all the windows
For the full code, click here.
For the classifier files, click here.
Happy Coding...!!!
Hi! Love your solution for finding the distance using object detection! I'm pretty new to Python at this stage, and I find your videos good for tutorial purposes, as your projects have a link to the code as well. It may be far-fetched, but would you also release this code? Love your tutorial, easy and understandable!