Finally, we arrive at the wrap-up of the complete IoT device. Our product comprises voice recognition and face recognition. These functionalities were developed by different teammates and then integrated together. Before coding each functionality, there are a few things to consider:

  • The voice recognition program is embedded directly on the Raspberry Pi.
  • Face recognition requires a few heavy dependencies, so it is not loaded directly on Raspbian; instead, the program is uploaded to a server, which returns the result to the Raspberry Pi upon request.

Voice Recognition:


If you follow the Manual-PiCar-B-V1.0 for installing the dependencies of the Adeept car, speech_recognition will be downloaded automatically.

File name: Voice_Recognition

import speech_recognition as sr 
import sys
import time
from csvInteractor import csvInteractor

class voice:
    def __init__(self):
        self.recognizedWords = ""
        self.attendanceDict = {}
        self.nameList = []
        self.csvOutputBuilder = []

    def nameRecognition(self, r, mic):
        recognizedString = ""
        with mic as source:
            print("Say Something!")
            audio = r.listen(source, timeout = 5, phrase_time_limit=5)
        print("Audio Clip acquired!")
        try:
            print("entered try block")
            recognizedString += r.recognize_sphinx(audio, keyword_entries=self.nameList)  # You can add your own command here
            print("Recognized word was:" + recognizedString)
        except sr.UnknownValueError:
            print("say again")
        except sr.RequestError as e:
            print("Errored out.")
        return recognizedString
    def main(self, filename):
        # obtain audio from the microphone
        r = sr.Recognizer()
        mic = sr.Microphone()

        csvName = filename
        csv = csvInteractor()
        #print(csv.attendance_list) #debugging
        for tuplet in csv.attendance_list:
            name = (tuplet[0].lower(), 1.0)
            self.nameList.append(name)

        for tuplet in self.nameList:
            self.attendanceDict[tuplet[0]] = "Absent"


        for iterator in self.nameList:
            self.recognizedWords += self.nameRecognition(r, mic)

        hereList = self.recognizedWords.split(" ")

        for item in hereList:
            if item in self.attendanceDict:
                self.attendanceDict[item] = "Present"

        for key,value in self.attendanceDict.items():
            temp = (key,value)
            self.csvOutputBuilder.append(temp)

        #print(attendanceDict) #debugging
        output = csvInteractor()
        output.attendance_list = self.csvOutputBuilder
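The attendance-marking step at the heart of `main` can be exercised on its own. Below is a standalone sketch of that logic using made-up names ("alice", "bob") rather than entries from the real CSV:

```python
# Standalone sketch of the attendance-marking logic above, with
# hypothetical names instead of the real CSV contents.
attendanceDict = {"alice": "Absent", "bob": "Absent"}

# Pretend the recognizer heard "alice" twice but never "bob".
recognizedWords = "alice alice"

for item in recognizedWords.split(" "):
    if item in attendanceDict:
        attendanceDict[item] = "Present"

print(attendanceDict)  # {'alice': 'Present', 'bob': 'Absent'}
```

Anyone whose name never appears in the recognized text simply stays "Absent", which is why the dictionary is initialized that way before listening starts.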

Face Recognition

Since we have planned to transfer the captured image from the Raspberry Pi to hopper, and from hopper to a virtual machine, we need to establish a connection between the Raspberry Pi and hopper. This connection is established as soon as an image is captured in the “def recognize” function inside the file. Hopper is a Saint Louis University system that requires a password before login. We resolve this by generating an RSA key and then transferring the public key into the authorized keys file on the receiving end.

$ ssh-keygen -t rsa

This will ask for a file location and a passphrase; leave both blank. Now create a .ssh directory on the receiving end, then append the public key (id_rsa.pub) to authorized_keys in that .ssh directory. Set the permissions of .ssh to 700 and of authorized_keys to 600. This process needs to be repeated both for sending an image for processing and for receiving the names of recognized faces back from the virtual machine.

$ ssh <user>@<host> mkdir -p .ssh
$ cat .ssh/id_rsa.pub | ssh <user>@<host> 'cat >> .ssh/authorized_keys'
$ ssh <user>@<host> "chmod 700 .ssh; chmod 600 .ssh/authorized_keys"

The connection is established by the following line of code in the Raspberry Pi file:

import os
os.system("scp /server/testingImg.jpg")

After the login, hopper needs to connect to the virtual machine to transfer the file. We reuse the connection-establishing code on hopper and send the image to the virtual machine. A similar script can be placed in a shell file that cron calls every minute. To set up the cron job, type:

$ crontab -e
* * * * * bash /home/iotsadiya/


if [ -f /home/iotsadiya/Names.txt ]; then
    scp /home/iotsadiya/Names.txt pi@
fi

To run the face recognition automatically, write a shell script and schedule it with cron.

File Name:

echo "$(date +"%y-%m-%d_%H:%M") Face Recognition Starting---" >> /home/sadiyaahmad/log.txt

if [ -f /home/sadiyaahmad/examples/testimg.jpg ]; then
    python3 --encodings encodings.pickle --image examples/testimg.jpg
    if [ -f /home/sadiyaahmad/output/names.txt ]; then
        rm -f /home/sadiyaahmad/examples/testimg.jpg
        echo "$(date +"%y-%m-%d_%H:%M") Face Recognition ran" >> /home/sadiyaahmad/log.txt
    fi
fi
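The guard-and-cleanup pattern in this script (only run when the input image exists, delete it once the names file appears) can be tried in isolation with throwaway files, so you can verify the conditionals before wiring them to cron. The paths below are a temporary directory, not the real project paths:

```shell
# Throwaway demo of the same file-guard pattern, using a temp directory
# instead of the real /home/sadiyaahmad paths.
dir=$(mktemp -d)
touch "$dir/testimg.jpg"
touch "$dir/names.txt"   # pretend recognition already produced its output

if [ -f "$dir/testimg.jpg" ]; then
    if [ -f "$dir/names.txt" ]; then
        rm -f "$dir/testimg.jpg"   # input was processed, so remove it
        echo "cleaned up"
    fi
fi
```

Removing the input image is what stops cron from re-processing the same picture on its next run.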


The above process is repeated on all three machines, with different file locations on each.

To code the face recognition, we need certain dependencies installed on the virtual machine for Python 3. Our dependencies are imutils, dlib, and face_recognition.

$ pip3 install imutils
$ pip3 install dlib
$ pip3 install face_recognition

After installing the libraries we need a set of images on which to train. These are stored in a dataset folder, with a subfolder for each individual holding their pictures. Our program picks the images from these folders, trains on them, and creates encodings.pickle.
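The folder layout is what gives the training script its labels: the parent directory of each image becomes the person's name. A minimal standard-library sketch of that extraction (the paths here are invented examples, and in the real script they come from `imutils.paths.list_images`):

```python
import os

# Hypothetical dataset paths; the real project gets these from
# imutils.paths.list_images(args["dataset"]).
imagePaths = [
    os.path.join("dataset", "alice", "img01.jpg"),
    os.path.join("dataset", "bob", "img01.jpg"),
]

# Same trick as in the training script below: the second-to-last
# path component is the person's name.
names = [p.split(os.path.sep)[-2] for p in imagePaths]
print(names)  # ['alice', 'bob']
```

This is why the subfolders must be named after the people in them; the script never reads labels from anywhere else.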

File Name:

# python --dataset dataset --encodings encodings.pickle

# import the necessary packages
from imutils import paths
import face_recognition
import argparse
import pickle
import cv2
import os

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--dataset", required=True,
	help="path to input directory of faces + images")
ap.add_argument("-e", "--encodings", required=True,
	help="path to serialized db of facial encodings")
ap.add_argument("-d", "--detection-method", type=str, default="cnn",
	help="face detection model to use: either `hog` or `cnn`")
args = vars(ap.parse_args())

# grab the paths to the input images in our dataset
print("[INFO] quantifying faces...")
imagePaths = list(paths.list_images(args["dataset"]))

# initialize the list of known encodings and known names
knownEncodings = []
knownNames = []

# loop over the image paths
for (i, imagePath) in enumerate(imagePaths):
	# extract the person name from the image path
	print("[INFO] processing image {}/{}".format(i + 1,
		len(imagePaths)))
	name = imagePath.split(os.path.sep)[-2]

	# load the input image and convert it from RGB (OpenCV ordering)
	# to dlib ordering (RGB)
	image = cv2.imread(imagePath)
	rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

	# detect the (x, y)-coordinates of the bounding boxes
	# corresponding to each face in the input image
	boxes = face_recognition.face_locations(rgb,
		model=args["detection_method"])

	# compute the facial embedding for the face
	encodings = face_recognition.face_encodings(rgb, boxes)

	# loop over the encodings
	for encoding in encodings:
		# add each encoding + name to our set of known names and
		# encodings
		knownEncodings.append(encoding)
		knownNames.append(name)
# dump the facial encodings + names to disk
print("[INFO] serializing encodings...")
data = {"encodings": knownEncodings, "names": knownNames}
f = open(args["encodings"], "wb")
f.write(pickle.dumps(data))
f.close()
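The pickle the script writes is just a dict with parallel "encodings" and "names" lists, which is how the recognition script later loads it back. A round-trip sketch of that format, with fake two-element vectors standing in for the 128-d embeddings dlib actually produces:

```python
import os
import pickle
import tempfile

# Fake stand-in data in the same shape as the real encodings.pickle:
# a dict holding parallel lists of embeddings and names.
data = {"encodings": [[0.1, 0.2], [0.3, 0.4]], "names": ["alice", "bob"]}

path = os.path.join(tempfile.mkdtemp(), "encodings.pickle")
with open(path, "wb") as f:
    f.write(pickle.dumps(data))

# This mirrors what the recognition script does at load time.
with open(path, "rb") as f:
    loaded = pickle.loads(f.read())

print(loaded["names"])  # ['alice', 'bob']
```

Keeping the two lists parallel matters: the recognition script uses the index of a matched encoding to look up the matching name.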

Our second task is to create an examples folder to hold the example image to be recognized. The images from hopper are transferred into this folder. The next program picks the files from the examples folder, recognizes the faces, and sends the results back to the Raspberry Pi.

File Name:

# python --encodings encodings.pickle --image examples/testimg.jpg 

# import the necessary packages
import face_recognition
import argparse
import pickle
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-e", "--encodings", required=True,
	help="path to serialized db of facial encodings")
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-d", "--detection-method", type=str, default="cnn",
	help="face detection model to use: either `hog` or `cnn`")
args = vars(ap.parse_args())

# load the known faces and embeddings
print("[INFO] loading encodings...")
data = pickle.loads(open(args["encodings"], "rb").read())

# load the input image and convert it from BGR to RGB
image = cv2.imread(args["image"])
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# detect the (x, y)-coordinates of the bounding boxes corresponding
# to each face in the input image, then compute the facial embeddings
# for each face
print("[INFO] recognizing faces...")
boxes = face_recognition.face_locations(rgb,
	model=args["detection_method"])
encodings = face_recognition.face_encodings(rgb, boxes)

# initialize the list of names for each face detected
names = []

# loop over the facial embeddings
for encoding in encodings:
	# attempt to match each face in the input image to our known
	# encodings
	matches = face_recognition.compare_faces(data["encodings"],
		encoding)
	name = "Unknown"

	# check to see if we have found a match
	if True in matches:
		# find the indexes of all matched faces then initialize a
		# dictionary to count the total number of times each face
		# was matched
		matchedIdxs = [i for (i, b) in enumerate(matches) if b]
		counts = {}

		# loop over the matched indexes and maintain a count for
		# each recognized face face
		for i in matchedIdxs:
			name = data["names"][i]
			counts[name] = counts.get(name, 0) + 1

		# determine the recognized face with the largest number of
		# votes (note: in the event of an unlikely tie Python will
		# select first entry in the dictionary)
		name = max(counts, key=counts.get)
	# update the list of names
	names.append(name)

# write the recognized names to a text file for transfer back to the Pi
with open("output/names.txt", 'w') as filehandle:
    for Peoplename in names:
        filehandle.write('%s\n' % Peoplename)

# Remove comment from following if you want to show names on the image.
# loop over the recognized faces #for ((top, right, bottom, left), name) in zip(boxes, names): # draw the predicted face name on the image #cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 2) #y = top - 15 if top - 15 > 15 else top + 15 #cv2.putText(image, name, (left, y), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2) # To show the output image remove comment from this section else print names from list #cv2.imshow("Image", image) #cv2.waitKey(0)
The other option for sending the image back and forth between the Pi and the virtual machine, and vice versa, is to use a RESTful API and transfer the image over HTTP.
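We did not implement that alternative, but a minimal sketch of the idea fits in the standard library: the virtual machine exposes an upload endpoint, the Pi POSTs the image bytes, and the name list comes back in the response instead of being copied over with scp. Everything here (the /upload path, the canned names, the fake image bytes) is invented for illustration:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class UploadHandler(BaseHTTPRequestHandler):
    """Accepts a raw image POST on any path and stores the bytes."""
    received = b""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        UploadHandler.received = self.rfile.read(length)
        # In the real pipeline, face recognition would run here; we
        # just echo back a canned name list, one name per line.
        body = b"alice\nbob\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

# Server side (the virtual machine); port 0 picks any free port.
server = HTTPServer(("127.0.0.1", 0), UploadHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: what the Pi could do instead of scp.
url = "http://127.0.0.1:%d/upload" % server.server_port
req = urllib.request.Request(url, data=b"fake-jpeg-bytes")
names = urllib.request.urlopen(req).read().decode().split()
print(names)  # ['alice', 'bob']
server.shutdown()
```

The appeal of this route is that it removes hopper and the cron polling from the loop: the Pi gets the names in the same round trip that delivers the image.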

Now the virtual machine sends the result of processing (i.e., the names of people) back to hopper, and then back to the Pi. As mentioned earlier, the “connection establishing code” is used to transfer the files back and forth.

Author: Sadiya Ahmad
