MicroAI

Introduction

The MicroAI Library provides APIs to interact with trained Machine Learning models, in particular to run inferences on them.

Usage

The MicroAI Library is provided as a Foundation Library.

To use the MicroAI Library, add the following line to the project build file:

implementation("ej.api:microai:2.0.0")

Building or running an Application that uses the MicroAI Library requires an SDK6 VEE Port that provides the MicroAI Pack.

Machine Learning Model Format

MicroAI API is designed to be framework-agnostic, meaning it does not rely on a specific Machine Learning framework like TensorFlow or ONNX.

The Machine Learning framework is integrated at the VEE Port level, as a C/C++ library.

The Application is responsible for loading the model file. Therefore, before developing an Application with MicroAI, check which model file formats are supported by your target VEE Port.

MicroEJ Simulator

If you need to use the MicroEJ Simulator, you must use a model in TensorFlow Lite for Microcontrollers (TFLM) format. Other model formats will not be compatible with the MicroEJ Simulator and cannot be executed within it.

TensorFlow Lite for Microcontrollers supports a limited subset of TensorFlow operations, which restricts the model architectures that can be run. The list of supported operators corresponds to the contents of the all_ops_resolver.cc file.

APIs

MLInferenceEngine

The first action when working with MicroAI is to load the trained Machine Learning model using the MLInferenceEngine class.

There are 2 ways to load a model: either from an Application resource, by passing its path to the MLInferenceEngine(String) constructor (as in the snippets below), or from an InputStream, with the MLInferenceEngine(InputStream is) constructor.

The MLInferenceEngine constructor will:

  1. Map the model into a native data structure.

  2. Build an interpreter to run the model with.

  3. Allocate memory for the model’s tensors.

When using MLInferenceEngine(InputStream is), the model is loaded into the MicroAI heap. The size of the MicroAI heap is defined by the MicroAI Configuration (see the Configuration section below).

Note that the call to MLInferenceEngine(InputStream is) blocks until the model is completely retrieved and loaded.
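
For example, the following sketch loads a model through the InputStream constructor. The Main class name is a hypothetical placeholder, and "/model.tflite" is the resource name used in the snippets below:

try (InputStream is = Main.class.getResourceAsStream("/model.tflite"); // Open the model resource as a stream (hypothetical class and resource name).
     MLInferenceEngine mlInferenceEngine = new MLInferenceEngine(is)) { // Copies the model into the MicroAI heap; blocks until the model is fully loaded.
    // The engine is then used exactly as with the path-based constructor.
}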

Once initialized, the MLInferenceEngine lets you retrieve the input and output tensors of the model and run inferences on the trained model.

For example, the following snippet loads a trained model from the application resources and runs an inference on it:

try(MLInferenceEngine mlInferenceEngine = new MLInferenceEngine("/model.tflite")) { // Initialize the inference engine.
    InputTensor inputTensor = mlInferenceEngine.getInputTensor(0); // Get input tensor of the trained model.
    /*
     * Fill the input tensor
     */
    mlInferenceEngine.run(); // Run inference on the trained model.
    OutputTensor outputTensor = mlInferenceEngine.getOutputTensor(0); // Get output tensor of the trained model.
    /*
     * Process output data
     */
}

Tensor

Tensor parameters can be retrieved from the Tensor class.

It provides useful information such as the data type, the number of dimensions, the number of elements, the size in bytes, and the quantization parameters; a short sketch after the list below shows how to read some of them.

There are 2 kinds of tensors:

  • InputTensor: Offers services to load input data inside MicroAI input tensors before running an inference. Tensor input data must be one of the types supported by MicroAI (see Tensor.DataType).

  • OutputTensor: Offers services to retrieve output data from MicroAI output tensors after running an inference. Tensor output data must be one of the types supported by MicroAI (see Tensor.DataType).
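
As an illustration, the following sketch prints a few parameters of a model's input tensor. It only uses accessors that appear in the example at the end of this section; the "/model.tflite" resource name is the same assumption as in the other snippets:

try (MLInferenceEngine mlInferenceEngine = new MLInferenceEngine("/model.tflite")) { // Initialize the inference engine.
    InputTensor inputTensor = mlInferenceEngine.getInputTensor(0); // Tensors are obtained from the engine, not created directly.
    int elementCount = inputTensor.getNumberElements(); // Number of elements expected by the model input.
    Tensor.QuantizationParameters qp = inputTensor.getQuantizationParams(); // Quantization parameters of a quantized tensor.
    System.out.println("Input tensor: " + elementCount + " elements, scale=" + qp.getScale() + ", zeroPoint=" + qp.getZeroPoint());
}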

Classes Summary

Main classes:

  • MLInferenceEngine: Loads a model, gets its tensors, and runs inferences on it.

  • Tensor: Retrieves tensor information.

  • InputTensor: Loads input data before running an inference.

  • OutputTensor: Retrieves output data after running an inference.

Stateless and immutable classes:

  • Tensor.DataType: Enumerates the tensor data types supported by MicroAI.

  • Tensor.QuantizationParameters: Holds the quantization parameters (scale and zero point) of a quantized tensor.

Configuration

The MicroAI Pack can be configured by defining the following Application Options:

  • microai.heap.size: defines the size of the MicroAI heap, in which models loaded from an InputStream are allocated.
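
For example, a sketch of such an option definition, assuming the Application Options are declared as key/value properties (the value below is arbitrary):

microai.heap.size=131072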

Example

For example, the following snippet runs an inference on a model that takes one quantized element as input and outputs one float value:

try(MLInferenceEngine mlInferenceEngine = new MLInferenceEngine("/model.tflite")) { // Initialize the inference engine.
    InputTensor inputTensor = mlInferenceEngine.getInputTensor(0); // Get input tensor of the trained model.
    byte[] inputData = new byte[inputTensor.getNumberElements()]; // Create an array that fits size of input tensor.

    // Fill inputData with quantized value.
    float realValue = 10f;
    Tensor.QuantizationParameters quantizationParameters = inputTensor.getQuantizationParams(); // Get quantization parameters.
    inputData[0] = (byte) (realValue / quantizationParameters.getScale() + quantizationParameters.getZeroPoint()); // Quantize the input value.
    inputTensor.setInputData(inputData); // Load input data inside MicroAI input tensor.

    mlInferenceEngine.run(); // Run inference on the trained model.

    OutputTensor outputTensor = mlInferenceEngine.getOutputTensor(0); // Get output tensor of the trained model.
    float[] outputData = new float[outputTensor.getNumberElements()]; // Create an array that fits size of output tensor.

    // Retrieve and print inference result.
    outputTensor.getOutputData(outputData); // Retrieve output data from MicroAI output tensor.
    System.out.println("Inference result with " + realValue + " input is " + outputData[0]);
}