Build a $60 face recognition system with the NVIDIA Jetson Nano 2GB and Python


By Adam Geitgey
Source: Medium

The new NVIDIA Jetson Nano 2GB developer kit (announced today!) is a single-board computer that costs $59 and runs AI software with GPU acceleration.

In 2020, you can get amazing performance from a $59 single-board computer. Let's use one to build a simple version of a doorbell camera that tracks everyone who comes to your front door. With face recognition, it can instantly tell whether the person at your door has ever visited you before, even if they are dressed differently.

What is the NVIDIA Jetson Nano 2GB?

The Jetson Nano 2GB is a single-board computer with a quad-core 1.4GHz ARM CPU and a built-in NVIDIA Maxwell GPU. It's the cheapest NVIDIA Jetson model, aimed at the same hobbyists who buy Raspberry Pis.

If you're already familiar with the Raspberry Pi product line, this is much the same idea, except that the Jetson Nano includes an NVIDIA GPU. It can run GPU-accelerated applications, like deep learning models, much faster than boards like the Raspberry Pi, which most deep learning frameworks don't support.

There are lots of AI development boards and accelerator modules out there, but NVIDIA has one big advantage: it's directly compatible with desktop AI libraries, and you don't need to convert your deep learning models into any special format to run them.

It uses the same CUDA libraries that almost every Python-based deep learning framework already uses for GPU acceleration. That means you can take an existing Python-based deep learning program and run it on the Jetson Nano 2GB with little modification and still get decent performance, as long as your application fits in 2GB of RAM.

The ability to deploy Python code written for powerful servers on a $59 standalone device is excellent.

The new Jetson Nano 2GB board is also more polished than NVIDIA's previous hardware releases.

The first Jetson Nano inexplicably lacked WiFi, but this one comes with a plug-in WiFi module, so you don't have to string a messy Ethernet cable across your house. They also upgraded the power input to a more modern USB-C port, and on the software side some of the rough edges have been sanded off; for example, you no longer need to perform basic setup steps like enabling a swap file yourself.

NVIDIA is aggressively pushing a simple, easy-to-use hardware device with a real GPU for under $60. It looks like they're aiming at the Raspberry Pi and trying to capture the education/enthusiast market. It will be interesting to see how the market reacts.

Let’s assemble the system

As with any hardware project, the first step is to gather all the parts we'll need:

1. NVIDIA Jetson Nano 2GB board ($59)

The boards are currently available for pre-order (as of October 5, 2020) and are expected to ship at the end of October.

I don't know what initial availability will be like after release, but the previous Jetson Nano models were in short supply for months after launch.

Full disclosure: I received a free Jetson Nano 2GB development board from NVIDIA as an evaluation unit, but I have no financial or editorial relationship with NVIDIA. That's how I was able to write this guide ahead of the release.

2. USB-C power adapter (you may already have one?)

The new Jetson Nano 2GB is powered via USB-C. A power adapter is not included, but you may already have one.

3. Camera: a USB webcam (you may already have one?) or a Raspberry Pi Camera Module v2.x (about $30)

The Raspberry Pi Camera Module v2.x is a good choice if you want to mount a small camera inside a case (note: the v1.x camera module will not work). You can get one from Amazon or from various resellers.

Some USB webcams (like Logitech's C270 or C920) also work with the Jetson Nano 2GB, so if you already have a USB camera, you can use it. Here's an incomplete list of compatible cameras.

Before you buy anything new, don't be afraid to just try plugging in whatever USB devices you have lying around. Not everything has Linux driver support, but some things will. I plugged in a generic $20 HDMI-to-USB adapter I bought on Amazon, and it worked great, so I could use my high-end digital camera as a video source over HDMI without any extra configuration.

You'll need a few other things too, but you may already have them:

A microSD card with at least 32GB of space. This is where we'll install Linux. You can reuse any existing microSD card.

A microSD card reader, so you can install the Jetson software.

A wired USB keyboard and a wired USB mouse, to control the Jetson Nano.

A monitor or TV that accepts HDMI directly (not through an HDMI-to-DVI converter), so you can see what you're doing. Even if you won't use a monitor when running the Jetson Nano later, you need one for the initial setup.

Loading the Jetson Nano 2GB software

Before you start plugging things into the Jetson Nano, you need to download the Jetson Nano software image.

NVIDIA's default software image includes Ubuntu Linux 18.04 with Python 3.6 and OpenCV pre-installed.

Here's how to get the Jetson Nano software onto your SD card:

  1. Download the Jetson Nano Developer Kit SD card image from NVIDIA.
  2. Download Etcher, the program that writes the Jetson software image to your SD card.
  3. Run Etcher and use it to write the Jetson Nano Developer Kit SD card image you downloaded to the SD card. This takes about 20 minutes.

Now it's time to unpack the rest of the hardware!

Plugging in all the parts

First, take out your Jetson Nano 2GB:

The first step is to insert the microSD card. The microSD card slot is well hidden, but you can find it on the back side, under the bottom of the heat sink:

You should also go ahead and plug the included USB WiFi adapter into one of the USB ports:

Next, you need to plug in your camera.

If you're using a Raspberry Pi v2.x camera module, it connects with a ribbon cable. Find the ribbon cable slot on the Jetson, pop up the connector, insert the cable, and pop the connector back down to close it. Make sure the metal contacts on the ribbon cable face inward, toward the heat sink:

If you're using a USB webcam, just plug it into one of the USB ports and ignore the ribbon cable port.

Now plug in everything else:

Plug the mouse and keyboard into the USB port.

Plug the HDMI cable into the monitor.

Finally, plug in the USB-C power cable to boot it up.

If you’re using the raspberry pie camera module, you’ll end up with something like this:

Or, if you are using a USB video input device, it will look like this:

After the power cable is plugged in, the Jetson Nano will boot automatically. After a few seconds, you should see the Linux setup screen appear on the monitor. Follow the steps to create your account and connect to WiFi. It's simple.

Installing the Linux and Python libraries needed for face recognition

Once the initial Linux setup is complete, we need to install the libraries we'll use in our face recognition system.

On the Jetson Nano desktop, open an LXTerminal window and run the following commands. Whenever you're asked for a password, enter the same one you chose when you created your user account:

sudo apt-get update
sudo apt-get install python3-pip cmake libopenblas-dev liblapack-dev libjpeg-dev

First, we update apt, the standard Linux software installation tool, which we'll use to install our other system libraries.

Then, we install some Linux libraries that our software needs but that aren't pre-installed.

Finally, we need to install the face_recognition Python library and its dependencies, including the machine learning library dlib. You can do this automatically with a single command:

sudo pip3 -v install Cython face_recognition

Because there are no pre-built copies of dlib and numpy for the Jetson platform, this command compiles those libraries from source code. Take the opportunity to grab lunch, because it may take an hour!

When it's finished, your Jetson Nano 2GB is ready to do face recognition with full CUDA GPU acceleration. On to the fun part!

Running the face recognition doorbell camera demo app

The face_recognition library is a Python library I wrote that makes it super simple to do face recognition using dlib. It lets you detect faces, convert each detected face into a unique face encoding, and then compare those face encodings to see if they might be the same person, all with just a few lines of code.

Using that library, I built a doorbell camera application that recognizes people who come to your front door and tracks each time they come back. Here's what it looks like when it runs:

First, download the code. I've posted the complete code with comments here.

But here's an easier way to download it from the command line onto your Jetson Nano:

wget -O

At the top of the program, you need to edit one line of code to tell it whether you're using a USB camera or the Raspberry Pi camera module. Open the file in an editor like gedit:


Follow the instructions in the comments, then save the file, exit gedit, and run the code:


You'll see a video window appear on the desktop. Whenever a new person steps in front of the camera, it registers their face and starts tracking how long they've been at your door. If the same person leaves and comes back more than five minutes later, it registers a new visit and tracks them again. You can quit at any time by pressing "q" on the keyboard.

The app automatically saves data about everyone it sees to a file called known_faces.dat. When you run the program again, it uses that data to remember previous visitors. If you want to clear the list of known faces, just quit the program and delete that file.

Turning it into a standalone hardware device

So far, we have a development board running a face recognition model, but it's still tied to a desk for power and display. Let's look at how to run it without being plugged into the wall.

A cool thing about modern single-board computers is that they almost all support the same hardware standards, like USB. That means you can buy lots of cheap add-ons on Amazon, such as touchscreen monitors and battery packs, giving you plenty of input, output, and power options. Here's what I ordered (but anything similar is fine):

A 7-inch USB-powered HDMI touchscreen display:

And a generic USB-C battery pack for power:

Let's hook them up and see what it looks like running as a standalone device. Just plug in the USB battery instead of the wall charger, and plug the HDMI monitor into both the HDMI port and a USB port so it can act as the screen and as mouse input.

It works great. The touchscreen behaves like a normal USB mouse without any extra configuration. The only downside is that if the Jetson Nano 2GB draws more power than the USB battery pack can supply, it will throttle the GPU speed. But it still works well.

With a little creativity, you could package all of this into a project case and use it as a prototype hardware device to test out your ideas. And if you ever wanted to mass-produce something, you could buy the production versions of the Jetson boards and use them to build a real hardware product.

Python code walkthrough of the doorbell camera

Want to know how the code works? Let's step through it.

The code starts by importing the libraries we're going to use. The most important ones are OpenCV (called cv2 in Python) and face_recognition. We'll use OpenCV to read images from the camera and face_recognition to detect and compare faces.

import face_recognition
import cv2
from datetime import datetime, timedelta
import numpy as np
import platform
import pickle

Then we need to know how to access the camera, since grabbing images from a Raspberry Pi camera module works differently from a USB camera. So just change this variable to True or False depending on your hardware:

# Set this depending on your camera type:
# - True = Raspberry Pi 2.x camera module
# - False = USB webcam or other USB video input (like an HDMI capture device)
USING_RPI_CAMERA_MODULE = False

Next, we create the variables that will store data about the people who walk in front of the camera. These variables act as a simple database of known visitors.

known_face_encodings = []
known_face_metadata = []

This application is just a demo, so we store our known faces in a normal Python list. In a real-world application with more faces, you'd probably want to use a real database, but I wanted to keep this demo simple.

Next, we have functions to save and load the known face data. Here's the save function:

def save_known_faces():
    with open("known_faces.dat", "wb") as face_data_file:
        face_data = [known_face_encodings, known_face_metadata]
        pickle.dump(face_data, face_data_file)
        print("Known faces backed up to disk.")

This writes the known faces to disk using Python's built-in pickle functionality. The data is loaded back the same way, but I haven't shown that here.

Whenever the program detects a new face, we call a function to add it to our known face database:

def register_new_face(face_encoding, face_image):
    # Add the face encoding to the list of known faces
    known_face_encodings.append(face_encoding)
    # Add a matching dictionary entry to our metadata list
    known_face_metadata.append({
        "first_seen": datetime.now(),
        "first_seen_this_interaction": datetime.now(),
        "last_seen": datetime.now(),
        "seen_count": 1,
        "seen_frames": 1,
        "face_image": face_image,
    })

First, we store the face encoding that represents the face in one list. Then, we store a matching dictionary of data about the face in a second list. We'll use that to track when we first saw the person, how long they've recently been hanging around the camera, how many times they've visited our house, and an image of their face.

We also need a helper function that checks whether an unknown face is already in our face database:

def lookup_known_face(face_encoding):
    metadata = None

    # If our known face list is empty, we can't possibly have seen this face
    if len(known_face_encodings) == 0:
        return metadata

    # Calculate the face distance between the unknown face and every known face
    face_distances = face_recognition.face_distance(
        known_face_encodings, face_encoding
    )

    # Find the known face that is most similar (i.e. has the lowest distance)
    best_match_index = np.argmin(face_distances)

    # If the distance is below our threshold, treat it as a match
    if face_distances[best_match_index] < 0.65:
        metadata = known_face_metadata[best_match_index]
        metadata["last_seen"] = datetime.now()
        metadata["seen_frames"] += 1

        # If we haven't seen this person in the last five minutes,
        # count this as a new visit to the house
        if datetime.now() - metadata["first_seen_this_interaction"] > timedelta(minutes=5):
            metadata["first_seen_this_interaction"] = datetime.now()
            metadata["seen_count"] += 1

    return metadata

We're doing a few important things here:

  1. Using the face_recognition library, we check how similar the unknown face is to all previous visitors. The face_distance() function gives us a numerical measure of the similarity between the unknown face and all known faces: the smaller the number, the more similar the faces.

  2. If the face is very similar to one of our known visitors, we assume they're a repeat visitor. In that case, we update their "last seen" time and increment the number of video frames we've seen them in.

  3. Finally, if this person has been seen in front of the camera within the last five minutes, we assume they're still here as part of the same visit. Otherwise, we assume this is a new visit to our house, so we reset the timestamp tracking their most recent visit.
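To make the matching step concrete: face_distance() boils down to the Euclidean distance between face encodings. Here's a toy stand-in (the helper name and the 4-number encodings are made up for illustration; the real encodings have 128 numbers):

```python
import numpy as np

def toy_face_distance(known_encodings, unknown_encoding):
    # Euclidean distance between the unknown encoding and each known one;
    # smaller numbers mean more similar faces.
    return np.linalg.norm(np.asarray(known_encodings) - unknown_encoding, axis=1)

known = [np.array([0.0, 0.0, 0.0, 0.0]),   # visitor A
         np.array([1.0, 1.0, 1.0, 1.0])]   # visitor B
unknown = np.array([0.1, 0.0, 0.0, 0.0])   # face seen in the current frame

distances = toy_face_distance(known, unknown)
best_match_index = int(np.argmin(distances))    # closest known face
is_match = distances[best_match_index] < 0.65   # same 0.65 cutoff as the app
```

The real face_recognition.face_distance() works the same way on the 128-dimensional encodings produced by dlib; the 0.65 cutoff is just the threshold this app uses to decide "same person."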

The rest of the program is the main loop: an infinite loop where we fetch a frame of video, look for faces in the image, and process each face we see. This is the core of the program. Let's take a look:

def main_loop():
    if USING_RPI_CAMERA_MODULE:
        # The Pi camera is read through a GStreamer pipeline built by a
        # helper function in the full program
        video_capture = cv2.VideoCapture(get_jetson_gstreamer_source(), cv2.CAP_GSTREAMER)
    else:
        video_capture = cv2.VideoCapture(0)

The first step is to get access to the camera using whichever method is appropriate for our computer hardware.

Now let's start grabbing frames of video:

while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()

    # Resize frame of video to 1/4 size for faster processing
    small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)

    # Convert the image from BGR color (which OpenCV uses)
    # to RGB color (which face_recognition uses)
    rgb_small_frame = small_frame[:, :, ::-1]

Each time we grab a frame of video, we shrink it to a quarter of its original size. This makes the face recognition process run faster, at the cost of only detecting larger faces in the image. But since we're building a doorbell camera that only needs to recognize people near the camera, that's not a problem.

We also have to deal with the fact that OpenCV reads images from the camera with each pixel stored as blue-green-red values rather than the standard red-green-blue order. Before we can run face recognition, we need to convert the image format.
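The one-line `[:, :, ::-1]` conversion works because reversing the last numpy axis flips the channel order in place. A tiny illustration with a single made-up pixel:

```python
import numpy as np

# One "image" containing a single pixel stored BGR-style:
# blue=200, green=100, red=50
bgr_image = np.array([[[200, 100, 50]]], dtype=np.uint8)

# Reversing the last axis flips BGR -> RGB in one step,
# without copying channel by channel
rgb_image = bgr_image[:, :, ::-1]
```

After the flip, the same pixel reads as red=50, green=100, blue=200, which is the order face_recognition expects.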

Now we can detect all the faces in the image and convert each one into a face encoding. It takes just two lines of code:

face_locations = face_recognition.face_locations(rgb_small_frame)
face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)

Next, we loop through each detected face and decide whether it's someone we've seen in the past or a brand-new visitor:

face_labels = []
for face_location, face_encoding in zip(face_locations, face_encodings):
    # See if this face is in our list of known faces
    metadata = lookup_known_face(face_encoding)

    # If we found the face, label it with how long they've been at the door
    if metadata is not None:
        time_at_door = datetime.now() - metadata["first_seen_this_interaction"]
        face_label = f"At door {int(time_at_door.total_seconds())}s"

    # If this is a brand new face, add it to our list of known faces
    else:
        face_label = "New visitor!"

        # Grab the image of the face from the current frame of video
        top, right, bottom, left = face_location
        face_image = small_frame[top:bottom, left:right]
        face_image = cv2.resize(face_image, (150, 150))

        # Add the new face to our known face data
        register_new_face(face_encoding, face_image)

    face_labels.append(face_label)

If we've seen the person before, we retrieve the metadata we stored about their previous visits.

If not, we add them to our face database, grabbing an image of their face from the video frame to store alongside it.

Now that we've found all the people and figured out who they are, we can loop over the detected faces again, drawing a box around each face and adding a label:

for (top, right, bottom, left), face_label in zip(face_locations, face_labels):
    # Scale face locations back up, since the frame we detected in
    # was scaled to 1/4 size
    top *= 4
    right *= 4
    bottom *= 4
    left *= 4

    # Draw a box around the face
    cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)

    # Draw a label with a description below the face
    cv2.rectangle(
        frame, (left, bottom - 35), (right, bottom),
        (0, 0, 255), cv2.FILLED
    )
    cv2.putText(
        frame, face_label,
        (left + 6, bottom - 6),
        cv2.FONT_HERSHEY_DUPLEX, 0.8,
        (255, 255, 255), 1
    )

I also wanted to draw a running list of recent visitors across the top of the screen, with the number of times they've visited your house:

To draw that, we loop over all known faces and check which ones have been in front of the camera recently. For each recent visitor, we draw their face image on the screen along with their visit count:

number_of_recent_visitors = 0
for metadata in known_face_metadata:
    # If we have seen this person in the last ten seconds, draw their image
    if datetime.now() - metadata["last_seen"] < timedelta(seconds=10):
        # Draw the known face image
        x_position = number_of_recent_visitors * 150
        frame[30:180, x_position:x_position + 150] = metadata["face_image"]
        number_of_recent_visitors += 1

        # Label the image with how many times they have visited
        visits = metadata["seen_count"]
        visit_label = f"{visits} visits"
        if visits == 1:
            visit_label = "First visit"
        cv2.putText(
            frame, visit_label,
            (x_position + 10, 170),
            cv2.FONT_HERSHEY_DUPLEX, 0.6,
            (255, 255, 255), 1
        )

Finally, we display the current frame of video on the screen with all of our annotations drawn on top of it:

cv2.imshow('Video', frame)

To make sure we don't lose data if the program crashes, we save our list of known faces to disk every 100 frames:

if len(face_locations) > 0 and number_of_faces_since_save > 100:
    save_known_faces()
    number_of_faces_since_save = 0
else:
    number_of_faces_since_save += 1

When the program exits, there's just a line or two of cleanup code to close the camera.

The start-up code for the program is at the very bottom:

if __name__ == "__main__":
    load_known_faces()
    main_loop()

All it does is load the known faces (if any exist) and then start the main loop, which reads from the camera forever and shows the results on the screen.

The whole program is only about 200 lines, but it detects visitors, identifies them, and tracks them every time they come back to your door.

Fun fact: this kind of face tracking code runs inside the advertising displays on many streets and at bus stops to track who is looking at the ads and for how long. That used to sound far-fetched, but now you can buy the same capability for $60!


This program is an example of how you can use a small amount of Python 3 code running on a cheap Jetson Nano 2GB board to build a powerful system.

If you wanted to turn this into a real doorbell camera system, you could add a feature that sends you a text message using Twilio whenever the system detects a new person at the door, instead of just showing it on your monitor. Or you could try replacing the simple in-memory face database with a real database.

You could also try morphing this program into something entirely different. The pattern of reading a frame of video, looking for something in the image, and then taking an action is the basis of all kinds of computer vision systems. Try changing the code and see what you can come up with! How about having it play your own custom theme music whenever you come home and walk up to your own door? You can check out some of the other face_recognition examples to see how to do something like that.

Learn more about the NVIDIA Jetson platform

If you want to learn more about building things with the NVIDIA Jetson hardware platform, NVIDIA offers free Jetson training courses. Check out their website for more information.

They also have great community resources, like JetsonHacks.

If you want to learn more about building ML and AI systems with Python in general, check out my other articles and books on my website.
