# Overview

The Native Emotions Library is a portable C++ library for real-time facial emotion tracking and analysis.

The SDK provides wrappers in the following languages:

  • C++ (native)
  • C
  • Python
  • C# / .NET
  • Java (Android)

# Getting Started

# Hardware requirements

The SDK does not have any special hardware requirements:

  • CPU: No special requirement, any modern 64-bit CPU (x86-64 with AVX, ARMv8) is supported
  • GPU: No special requirement
  • RAM: 2 GB of available RAM required
  • Camera: No special requirement, minimum resolution: 640x480

# Software requirements

The SDK is regularly tested on the following Operating Systems:

  • Windows 10+
  • Ubuntu 24.04+
  • macOS 15+
  • iOS 18+
  • Android (API level 23+)

# 3rd Party Licenses

While the SDK is released under a proprietary license, it makes use of the following open-source projects under their respective licenses:

# Installation

# C++

Extract the SDK contents, include the headers from the include folder and link libNativeEmotionsLibrary to your C++ project.
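
A minimal command-line sketch, assuming a GCC toolchain and that the SDK was extracted to a sdk/ directory containing include/ and lib/ subfolders (adjust paths to your layout):

$ g++ main.cpp -Isdk/include -Lsdk/lib -lNativeEmotionsLibrary -o my_app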

# C

Extract the SDK contents, include tracker_c.h from the include folder and link libNativeEmotionsLibrary to your C project.
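
When linking from plain C with a GCC toolchain, the C++ runtime used by the native library may also need to be linked explicitly, depending on how the library was built. A minimal sketch under the same sdk/ layout assumption as above:

$ gcc main.c -Isdk/include -Lsdk/lib -lNativeEmotionsLibrary -lstdc++ -o my_c_app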

# Python

The Python version of the SDK can be installed with pip:

$ pip install realeyes.emotion-detection

# C# / .NET

The .NET version of the SDK can be installed via NuGet:

$ dotnet add package Realeyes.EmotionTracking

# Java

For Android projects, add the library to your build.gradle dependencies.
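
A minimal build.gradle sketch; the artifact coordinates and version below are illustrative placeholders rather than confirmed values, so use the coordinates shipped with your SDK distribution:

dependencies {
    implementation 'com.realeyesit.nel:emotion-tracking:<version>'
}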

# Usage

# C++

The main entry point of this library is the nel::Tracker class.

After a tracker object is constructed, the user can call the nel::Tracker::track() function to process a frame from a video or other frame source.

The nel::Tracker::track() function has two overloads, both non-blocking asynchronous calls: one returns a std::future<ResultType>, the other accepts a callback that is invoked on completion (a sketch of the callback overload follows the example below). Subsequent calls can be made without waiting for the result of a previous one.

For the frame data, the user must construct a nel::ImageHeader object. The nel::ImageHeader is a non-owning view: the frame data only needs to remain valid for the duration of the nel::Tracker::track() call, during which the library copies it internally.

The following example shows the basic usage of the library using OpenCV for loading images and feeding them to the tracker:

#include "tracker.h"

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/videoio.hpp>

#include <chrono>
#include <cstdint>
#include <iostream>

int main()
{
    nel::Tracker tracker("model/model.realZ");

    cv::VideoCapture video("video.mp4");
    cv::Mat frame;

    while (video.read(frame)) {
        nel::ImageHeader header{
            frame.ptr(),
            frame.cols,
            frame.rows,
            static_cast<int>(frame.step1()),
            nel::ImageFormat::BGR
        };
        int64_t timestamp_in_ms = static_cast<int64_t>(video.get(cv::CAP_PROP_POS_MSEC));

        // Track asynchronously using std::future
        auto future = tracker.track(header, std::chrono::milliseconds(timestamp_in_ms));
        auto result = future.get();

        // Process results
        std::cout << "Face tracking: " << (result.landmarks.isGood ? "good" : "failed") << std::endl;
        for (const auto& emotion : result.emotions) {
            std::cout << "  Probability: " << emotion.probability
                      << " Active: " << emotion.isActive << std::endl;
        }
    }
    return 0;
}
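
The example above uses the std::future overload. The following is a minimal sketch of the callback overload; the exact signature of the callable is an assumption, so consult tracker.h for the precise parameter types:

// Callback overload sketch (assumed callable signature): the lambda is invoked
// asynchronously when tracking of this frame completes.
tracker.track(header, std::chrono::milliseconds(timestamp_in_ms),
              [](const nel::ResultType& result) {
                  std::cout << "Face tracking: "
                            << (result.landmarks.isGood ? "good" : "failed") << std::endl;
              });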

# C

The main entry point is the NELTracker opaque pointer type with associated functions.

After creating a tracker with nel_tracker_new(), you can track frames by calling nel_tracker_track() with a callback function. The callback will be called asynchronously when tracking completes.

The following example shows basic usage:

#include "tracker_c.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void track_callback(void* user_data, NELResultType* result, const char* error_msg) {
    if (error_msg != NULL) {
        printf("Error: %s\n", error_msg);
        return;
    }

    printf("Face tracking: %s\n", result->landmarks->isGood ? "good" : "failed");
    for (int i = 0; i < result->emotions->count; i++) {
        printf("  Emotion %d - Probability: %f, Active: %d\n",
               result->emotions->emotions[i].emotionID,
               result->emotions->emotions[i].probability,
               result->emotions->emotions[i].isActive);
    }
}

int main() {
    char* error_msg = NULL;
    NELTracker* tracker = nel_tracker_new("model/model.realZ", 0, &error_msg);
    if (tracker == NULL) {
        printf("Failed to load model: %s\n", error_msg);
        free(error_msg);
        return 1;
    }

    // Prepare image data (example with dummy data)
    uint8_t image_data[640 * 480 * 3];  // RGB image
    NELImageHeader header = {
        .data = image_data,
        .width = 640,
        .height = 480,
        .stride = 640 * 3,
        .format = NELImageFormatRGB
    };

    nel_tracker_track(tracker, &header, 0, track_callback, NULL);

    // Clean up - in a real application, wait for the callback to complete before freeing the tracker
    nel_tracker_free(tracker);
    return 0;
}

# Python

The main entry point of this library is the realeyes.emotion_detection.Tracker class.

After a tracker object is constructed, the user can call the realeyes.emotion_detection.Tracker.track() function to process frames from a video or other frame source.

The following example shows the basic usage of the library using OpenCV for loading images:

import realeyes.emotion_detection as nel
import cv2

# Initialize the tracker
tracker = nel.Tracker('model/model.realZ')

# Open video
video = cv2.VideoCapture('video.mp4')

while True:
    ret, frame = video.read()
    if not ret:
        break

    # Convert BGR to RGB (OpenCV uses BGR)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Track emotions (timestamp in milliseconds)
    timestamp_ms = int(video.get(cv2.CAP_PROP_POS_MSEC))
    result = tracker.track(frame_rgb, timestamp_ms)

    # Process results
    print(f"Face tracking: {'good' if result.landmarks.is_good else 'failed'}")
    for emotion in result.emotions:
        print(f"  Emotion ID {emotion.emotion_id}: "
              f"Probability={emotion.probability:.3f}, "
              f"Active={emotion.is_active}")

video.release()

# C# / .NET

The main entry point is the EmotionTracker class.

After a tracker object is constructed, you can call the TrackAsync() method to track faces in a frame. The method returns a Task<TrackingResult>, allowing for asynchronous, non-blocking operation.

Both the constructor and the tracking method support concurrent execution: you can start multiple operations in parallel without waiting for results (see the sketch after the example below).

The following example demonstrates processing a video frame:

using Realeyes.EmotionTracking;
using System;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Create tracker with model file
        using var tracker = new EmotionTracker("model/model.realZ");

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        var imageHeader = new ImageHeader
        {
            Data = imageData,
            Width = 640,
            Height = 480,
            Stride = 640 * 3,
            Format = ImageFormat.RGB
        };

        // Track emotions asynchronously
        var result = await tracker.TrackAsync(imageHeader, TimeSpan.Zero);

        // Process results
        Console.WriteLine($"Face tracking: {(result.LandmarkData?.IsGood ?? false ? "good" : "failed")}");

        if (result.Emotions.Happy is { } happy)
            Console.WriteLine($"Happy: {happy.Probability:P2}, Active: {happy.IsActive}");

        if (result.Emotions.Confusion is { } confusion)
            Console.WriteLine($"Confusion: {confusion.Probability:P2}, Active: {confusion.IsActive}");
    }
}
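
Because TrackAsync() supports concurrent execution, consecutive frames can be submitted before earlier results arrive. A minimal sketch, assuming two already prepared ImageHeader instances (header1 and header2) for consecutive frames:

// Submit two frames without awaiting in between, then gather both results.
var task1 = tracker.TrackAsync(header1, TimeSpan.FromMilliseconds(0));
var task2 = tracker.TrackAsync(header2, TimeSpan.FromMilliseconds(40));
var results = await Task.WhenAll(task1, task2);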

# Java

The main entry point is the Tracker interface.

After creating a tracker object, you can call the track() method to process frames. The method returns a TrackerResultFuture for asynchronous result retrieval.

The following example shows basic usage:

import com.realeyesit.nel.*;

public class Example {
    public static void main(String[] args) {
        // Create tracker with model file
        Tracker tracker = Emotion.createTracker("model/model.realZ", 0);

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        ImageHeader header = new ImageHeader();
        header.setData(imageData);
        header.setWidth(640);
        header.setHeight(480);
        header.setStride(640 * 3);
        header.setFormat(ImageFormat.RGB);

        // Track emotions asynchronously
        TrackerResultFuture future = tracker.track(header, 0);
        ResultType result = future.get();

        // Process results
        System.out.println("Face tracking: " +
            (result.getLandmarks().getIsGood() ? "good" : "failed"));

        for (EmotionData emotion : result.getEmotions()) {
            System.out.println("  Emotion: " + emotion.getEmotionID() +
                " Probability: " + emotion.getProbability() +
                " Active: " + emotion.getIsActive());
        }
    }
}

# Results

The result of the tracking contains a nel::LandmarkData structure and a nel::EmotionResults vector.

  • The nel::LandmarkData consists of the following members:

    • scale, the size of the face (larger means the user is closer to the camera)
    • roll, pitch, yaw, the 3 Euler angles of the face pose
    • translate, the position of the head center on the frame
    • the landmarks2d vector with either 0 or 49 points,
    • the landmarks3d vector with either 0 or 49 points,
    • and the isGood boolean value.

    The isGood value indicates whether the tracking is deemed good enough.

    landmarks2d and landmarks3d contain 0 points if the tracker failed to find a face in the image; otherwise they always contain 49 points in the following structure:

    [Figure: layout of the 49 facial landmark points]

    landmarks3d contains the 3D coordinates of the frontal face with zero translation and unit scale.

  • The nel::EmotionResults contains multiple nel::EmotionData elements with the following members:

    • probability, the probability of the emotion
    • isActive, whether the probability is higher than an internal threshold
    • isDetectionSuccessful, whether the tracking quality was good enough to reliably detect this emotion

    The order of the nel::EmotionData elements is the same as the order of the emotions in nel::Tracker::get_emotion_IDs() and nel::Tracker::get_emotion_names(), as sketched below.
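
The following sketch shows how per-emotion results can be paired with their names, assuming nel::Tracker::get_emotion_names() returns a list parallel to the nel::EmotionResults vector:

// Minimal sketch: pairing emotion names with results; the names are assumed to
// be in the same order as result.emotions, as described above.
const auto names = tracker.get_emotion_names();
for (std::size_t i = 0; i < result.emotions.size(); ++i) {
    const auto& emotion = result.emotions[i];
    if (emotion.isDetectionSuccessful) {
        std::cout << names[i] << ": " << emotion.probability
                  << (emotion.isActive ? " (active)" : "") << std::endl;
    }
}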

# Interpretation of the classifier output

The probability output of the Realeyes classifier (from the nel::EmotionData structure) has the following properties:

  • It is a continuous value from the [0,1] range
  • It changes depending on the type and number of activated facial features
  • It typically indicates facial activity in regions of the face that correspond to a given facial expression
  • Strong facial wrinkles or shadows can amplify the classifier's sensitivity to the corresponding facial regions
  • It is purposefully sensitive as the classifier is trained to capture slight expressions
  • It should not be interpreted as intensity of a given facial expression
  • It is not possible to prescribe which facial features correspond to what output levels, due to the nature of the underlying ML models

We recommend the following interpretation of the probability output, with a code sketch after the list:

  • values close to 0
    • no or very little activity on the face with respect to a given facial expression
  • values between 0 and the binary threshold
    • some facial activity was perceived, though in the view of the classifier it does not amount to a basic facial expression
  • values just below the binary threshold
    • high facial activity was perceived, which under some circumstances may be interpreted as a true basic facial expression and under others not (e.g. watching ads vs. playing games)
  • values above the binary threshold
    • high facial activity was perceived, which in the view of the classifier amounts to a basic facial expression
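
These buckets can be approximated in code from probability and isActive, since the binary threshold itself is internal to the SDK. A minimal sketch; the 0.05 cutoff for "very little activity" is an illustrative assumption, not an SDK constant, and "just below the threshold" cannot be distinguished without knowing the threshold value:

// Minimal sketch: mapping one emotion result onto the recommended interpretation.
// The 0.05 cutoff is illustrative only; the binary threshold is internal and is
// exposed solely through isActive.
const char* interpret(const nel::EmotionData& emotion)
{
    if (emotion.probability < 0.05) return "no or very little facial activity";
    if (!emotion.isActive)          return "some facial activity, below the binary threshold";
    return "high facial activity, amounting to a basic facial expression";
}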