Using Rust and WebAssembly in Node.js: real-time face detection

Time: 2021-01-23

This article shows you how to write a Node.js AI-as-a-service application.

Today, the mainstream programming language for AI is Python. However, the programming language for the web is JavaScript. In order to provide AI functions as web services, we need to package AI algorithms in JavaScript, and in particular in Node.js.

However, neither Python nor JavaScript by itself is suitable for compute-intensive AI applications. They are high-level (that is, slow) languages with heavy runtimes; their ease of use comes at the cost of performance. Python solves this problem by wrapping AI computation in native C/C++ modules. Node.js could do the same thing, but we have a better way: WebAssembly.

WebAssembly VMs provide tight integration with Node.js and other JavaScript runtimes. They are high-performance, memory-safe, secure by default, and portable across operating systems. Our approach, however, combines the best features of WebAssembly and native code.

How it works
Our Node.js-based AI-as-a-service application consists of three parts.

• The Node.js application provides the web service and calls a WebAssembly function to perform computationally intensive tasks such as AI inference.
• Data preparation, post-processing, and integration with other systems are done in the WebAssembly function. Initially, we support Rust. The application developer must write this function.
• The actual execution of the AI model is done in native code to maximize performance. This part of the code is very short and has been reviewed for safety and security. The application developer simply calls the native program from the WebAssembly function, the same way native functions are used from Python or Node.js today.

Next, let’s look at the sample program.

Face detection example
The face detection web service allows a user to upload a photo, and it displays the image with all detected faces marked by green boxes.

The Rust source code for the MTCNN face detection model is based on Cetra's tutorial, Face Detection with TensorFlow Rust. We made some changes to make the TensorFlow library work in WebAssembly.

The Node.js application handles the file upload and the response.

app.post('/infer', function (req, res) {
  let image_file = req.files.image_file;
  var result_filename = uuidv4() + ".png";

  // Call the infer() function from WebAssembly (SSVM)
  var result = infer(req.body.detection_threshold, image_file.data);

  fs.writeFileSync("public/" + result_filename, result);
  // Respond with an <img> tag pointing at the annotated result image
  res.send('<img src="' + result_filename + '" />');
});
As you can see, the JavaScript application simply passes the image data and a detection_threshold parameter, which specifies the smallest face to detect, to the infer() function, and then saves the return value to an image file on the server. The infer() function is written in Rust and compiled to WebAssembly so that it can be called from JavaScript.

The infer() function flattens the input image data into an array. It sets up a TensorFlow model and uses the flattened image data as the model's input. Executing the model returns a set of numbers indicating the corner coordinates of each face box. The infer() function then draws a green box around each face and encodes the modified image as PNG data, which is returned and saved to a file on the web server.

#[wasm_bindgen]
pub fn infer(detection_threshold: &str, image_data: &[u8]) -> Vec<u8> {
    let mut dt = detection_threshold;
    ... ...
    let mut img = image::load_from_memory(image_data).unwrap();

    // Run the tensorflow model using the face_detection_mtcnn native wrapper
    let mut cmd = Command::new("face_detection_mtcnn");
    // Pass in some arguments
    cmd.arg(img.width().to_string())
        .arg(img.height().to_string())
        .arg(dt);
    // The image bytes data is passed in via STDIN, in BGR order
    for (_x, _y, rgb) in img.pixels() {
        cmd.stdin_u8(rgb[2] as u8)
            .stdin_u8(rgb[1] as u8)
            .stdin_u8(rgb[0] as u8);
    }
    let out = cmd.output();

    // Draw boxes from the result JSON array
    let line = Pixel::from_slice(&[0, 255, 0, 0]);
    let stdout_json: Value = from_str(str::from_utf8(&out.stdout).expect("[]")).unwrap();
    let stdout_vec = stdout_json.as_array().unwrap();
    for i in 0..stdout_vec.len() {
        let xy = stdout_vec[i].as_array().unwrap();
        let x1: i32 = xy[0].as_f64().unwrap() as i32;
        let y1: i32 = xy[1].as_f64().unwrap() as i32;
        let x2: i32 = xy[2].as_f64().unwrap() as i32;
        let y2: i32 = xy[3].as_f64().unwrap() as i32;
        let rect = Rect::at(x1, y1).of_size((x2 - x1) as u32, (y2 - y1) as u32);
        draw_hollow_rect_mut(&mut img, rect, *line);
    }
    let mut buf = Vec::new();
    // Encode the annotated image as PNG into the return buffer
    img.write_to(&mut buf, image::ImageOutputFormat::Png).expect("Unable to write");
    return buf;
}
The face_detection_mtcnn command runs the MTCNN TensorFlow model in native code. It takes three command-line arguments: image width, image height, and detection threshold. The actual image data, as pixel values in BGR order, is passed in from the WebAssembly infer() function via STDIN. The result of the model is encoded in JSON and returned via STDOUT.
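
For illustration only, the stdout for a photo with two detected faces might look something like this (the coordinates are made up); each inner array is [x1, y1, x2, y2] for one face:

[[54.0,103.0,203.0,298.0],[310.0,96.0,470.0,290.0]]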

Notice how we feed the detection_threshold model parameter into the tensor named min_size, and use the input tensor to pass in the input image data. The box tensor is used to retrieve the results from the model.

// Crate imports needed by the code below (assumes the `tensorflow` and `serde_json` crates)
use serde_json::json;
use std::env;
use std::error::Error;
use std::io::{self, Read};
use tensorflow::{Graph, ImportGraphDefOptions, Session, SessionOptions, SessionRunArgs, Tensor};

fn main() -> Result<(), Box<dyn Error>> {
    // Get the arguments passed in from WebAssembly
    let args: Vec<String> = env::args().collect();
    let img_width: u64 = args[1].parse::<u64>().unwrap();
    let img_height: u64 = args[2].parse::<u64>().unwrap();
    let detection_threshold: f32 = args[3].parse::<f32>().unwrap();
    let mut buffer: Vec<u8> = Vec::new();
    let mut flattened: Vec<f32> = Vec::new();

    // The image bytes are read from STDIN
    io::stdin().read_to_end(&mut buffer)?;
    for num in buffer {
        flattened.push(num.into());
    }

    // Load up the graph as a byte array and create a tensorflow graph.
    let model = include_bytes!("mtcnn.pb");
    let mut graph = Graph::new();
    graph.import_graph_def(&*model, &ImportGraphDefOptions::new())?;

    let mut args = SessionRunArgs::new();
    // The `input` tensor expects BGR pixel data from the input image
    let input = Tensor::new(&[img_height, img_width, 3]).with_values(&flattened)?;
    args.add_feed(&graph.operation_by_name_required("input")?, 0, &input);

    // The `min_size` tensor takes the detection_threshold argument
    let min_size = Tensor::new(&[]).with_values(&[detection_threshold])?;
    args.add_feed(&graph.operation_by_name_required("min_size")?, 0, &min_size);

    // Default input params for the model
    let thresholds = Tensor::new(&[3]).with_values(&[0.6f32, 0.7f32, 0.7f32])?;
    args.add_feed(&graph.operation_by_name_required("thresholds")?, 0, &thresholds);
    let factor = Tensor::new(&[]).with_values(&[0.709f32])?;
    args.add_feed(&graph.operation_by_name_required("factor")?, 0, &factor);

    // Request the following outputs after the session runs.
    let bbox = args.request_fetch(&graph.operation_by_name_required("box")?, 0);

    let session = Session::new(&SessionOptions::new(), &graph)?;
    session.run(&mut args)?;

    // Get the bounding boxes
    let bbox_res: Tensor<f32> = args.fetch(bbox)?;
    let mut iter = 0;
    let mut json_vec: Vec<[f32; 4]> = Vec::new();
    while (iter * 4) < bbox_res.len() {
        json_vec.push([
            bbox_res[4 * iter + 1], // x1
            bbox_res[4 * iter],     // y1
            bbox_res[4 * iter + 3], // x2
            bbox_res[4 * iter + 2], // y2
        ]);
        iter += 1;
    }
    let json_obj = json!(json_vec);
    // Return result JSON in STDOUT
    println!("{}", json_obj.to_string());
    Ok(())
}
Our goal is to create native execution wrappers for common AI models so that developers can use them as libraries.

Deploying the face detection example
As prerequisites, you will need to install Rust, Node.js, the Second State WebAssembly VM (SSVM), and the ssvmup tool. Check out the installation steps or just use our Docker image. You also need the TensorFlow library.

$ wget https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-gpu-linux-x86_64-1.15.0.tar.gz
$ sudo tar -C /usr/ -xzf libtensorflow-gpu-linux-x86_64-1.15.0.tar.gz
To deploy the face detection example, we start with the native TensorFlow model driver. You can compile it from the Rust source code in the project.

# in the native_model_zoo/face_detection_mtcnn directory

$ cargo install --path .
Next, go to the web application project. Run the ssvmup command to build the WebAssembly function from Rust. Recall that this WebAssembly function performs the data preparation logic for the web application.

# in the nodejs/face_detection_service directory

$ ssvmup build
With the WebAssembly function built, you can now launch the Node.js application.

$ npm i express express-fileupload uuid

$ cd node
$ node server.js
The web service is now available on port 8080 of your computer. Try it with your own selfies or family and group photos, for example by POSTing a photo as image_file along with a detection_threshold value to the /infer endpoint.

TensorFlow model zoo
The native Rust crate face_detection_mtcnn is a simple wrapper around the TensorFlow library. It loads the trained TensorFlow model (known as a frozen model), sets up the model's inputs, executes the model, and retrieves the output values from it.

In fact, our wrapper only retrieves the detected box coordinates around each face. The model actually also provides a confidence score for each detected face, as well as the positions of the eyes, mouth, and nose. By changing the names of the tensors fetched from the model, the wrapper could obtain this information and return it to the Wasm function, as sketched below.
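
Here is a minimal sketch of what that could look like inside the same main() function shown above, assuming the frozen graph exposes additional output operations, for example ones named prob (per-face confidence) and landmarks (facial key points); these names are assumptions, so check your mtcnn.pb for the actual operation names.

// Sketch: fetch confidence scores and facial landmarks in addition to the boxes.
// The operation names "prob" and "landmarks" are assumptions about the frozen
// graph; verify them against your mtcnn.pb before relying on them.
let bbox = args.request_fetch(&graph.operation_by_name_required("box")?, 0);
let prob = args.request_fetch(&graph.operation_by_name_required("prob")?, 0);
let landmarks = args.request_fetch(&graph.operation_by_name_required("landmarks")?, 0);

let session = Session::new(&SessionOptions::new(), &graph)?;
session.run(&mut args)?;

let bbox_res: Tensor<f32> = args.fetch(bbox)?;
let prob_res: Tensor<f32> = args.fetch(prob)?;           // one confidence value per detected face
let landmarks_res: Tensor<f32> = args.fetch(landmarks)?; // key point coordinates per detected face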

If you want to use other models, it should be fairly easy to follow this example and create wrappers for your own models. You only need to know the tensor names of the model's inputs and outputs and their data types. The skeleton after this paragraph illustrates the pattern.
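
As a rough, hypothetical skeleton (the model file, tensor names, and shapes below are placeholders, not taken from any real model), a wrapper for another model would follow the same four steps:

// Hypothetical skeleton for wrapping a different frozen TensorFlow model.
// "my_model.pb", "my_input", "my_output", and the tensor shape are placeholders;
// replace them with the names and shapes your model actually defines.
use serde_json::json;
use std::env;
use std::error::Error;
use std::io::{self, Read};
use tensorflow::{Graph, ImportGraphDefOptions, Session, SessionOptions, SessionRunArgs, Tensor};

fn main() -> Result<(), Box<dyn Error>> {
    // 1. Read any scalar arguments passed in by the WebAssembly caller
    let args: Vec<String> = env::args().collect();
    let _some_param = args.get(1); // scalar parameters from the caller, if any

    // 2. Read the raw input bytes from STDIN and convert them to f32
    let mut buffer: Vec<u8> = Vec::new();
    io::stdin().read_to_end(&mut buffer)?;
    let flattened: Vec<f32> = buffer.iter().map(|&b| b as f32).collect();

    // 3. Load the frozen graph and feed the named input tensor
    let model = include_bytes!("my_model.pb");
    let mut graph = Graph::new();
    graph.import_graph_def(&*model, &ImportGraphDefOptions::new())?;
    let mut run_args = SessionRunArgs::new();
    let input = Tensor::new(&[flattened.len() as u64]).with_values(&flattened)?;
    run_args.add_feed(&graph.operation_by_name_required("my_input")?, 0, &input);

    // 4. Run the session, fetch the named output tensor, and print it as JSON on STDOUT
    let output = run_args.request_fetch(&graph.operation_by_name_required("my_output")?, 0);
    let session = Session::new(&SessionOptions::new(), &graph)?;
    session.run(&mut run_args)?;
    let result: Tensor<f32> = run_args.fetch(output)?;
    let result_vec: Vec<f32> = result.iter().cloned().collect();
    println!("{}", json!(result_vec));
    Ok(())
}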

To achieve this goal, we created a project called the "native model zoo" to develop ready-made Rust wrappers for as many TensorFlow models as possible.

Summary
In this article, we demonstrated how to implement a real AI-as-a-service use case in Node.js using Rust and WebAssembly. Our approach provides a framework for the community to contribute to the "model zoo," which can serve as an AI library for more application developers.

PS: This article is a translation of the original post.
