# Overview

The Native Emotions Library is a portable C++ library for real-time facial emotion tracking and analysis.

The SDK provides wrappers in the following languages:

  • C++ (native)
  • C
  • Python
  • C# / .NET
  • Java (Android)

# Getting Started

# Hardware requirements

The SDK does not have any special hardware requirements:

  • CPU: No special requirement, any modern 64-bit CPU (x86-64 with AVX, ARMv8) is supported
  • GPU: No special requirement
  • RAM: 2 GB of available RAM required
  • Camera: No special requirement, minimum resolution: 640x480

# Software requirements

The SDK is regularly tested on the following Operating Systems:

  • Windows 10+
  • Ubuntu 24.04+
  • macOS 15+
  • iOS 18+
  • Android (API level 23+)

# 3rd Party Licenses

While the SDK is released under a proprietary license, it makes use of the following open-source projects under their respective licenses:

# Installation

# C++

Extract the SDK contents, include the headers from the include folder and link libNativeEmotionsLibrary to your C++ project.
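
A minimal command-line sketch, assuming a GCC toolchain and that the SDK was extracted to a sdk/ directory containing include/ and lib/ subfolders (adjust paths to your layout):

$ g++ main.cpp -Isdk/include -Lsdk/lib -lNativeEmotionsLibrary -o my_app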

# C

Extract the SDK contents, include tracker_c.h from the include folder and link libNativeEmotionsLibrary to your C project.
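
When linking from plain C with a GCC toolchain, the C++ runtime used by the native library may also need to be linked explicitly, depending on how the library was built. A minimal sketch under the same sdk/ layout assumption as above:

$ gcc main.c -Isdk/include -Lsdk/lib -lNativeEmotionsLibrary -lstdc++ -o my_c_app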

# Python

The Python version of the SDK can be installed with pip:

$ pip install realeyes.emotion-detection

# C# / .NET

The .NET version of the SDK can be installed via NuGet:

$ dotnet add package Realeyes.EmotionTracking

# Java

For Android projects, add the library to your build.gradle dependencies.
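
A minimal build.gradle sketch; the artifact coordinates and version below are illustrative placeholders rather than confirmed values, so use the coordinates shipped with your SDK distribution:

dependencies {
    implementation 'com.realeyesit.nel:emotion-tracking:<version>'
}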

# Usage

# C++

The main entry point of this library is the nel::Tracker class.

After a tracker object is constructed, the user can call the nel::Tracker::track() function to process a frame from a video or other frame source.

The nel::Tracker::track() function has two overloads, both non-blocking asynchronous calls: one returns a std::future<ResultType>, the other accepts a callback that is invoked on completion (a sketch of the callback overload follows the example below). Subsequent calls can be made without waiting for the result of a previous one.

For the frame data, the user must construct a nel::ImageHeader object. The nel::ImageHeader is a non-owning view: the frame data only needs to remain valid for the duration of the nel::Tracker::track() call, during which the library copies it internally.

The following example shows the basic usage of the library using OpenCV for loading images and feeding them to the tracker:

#include "tracker.h"

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/videoio.hpp>

#include <chrono>
#include <cstdint>
#include <iostream>

int main()
{
    nel::Tracker tracker("model/model.realZ");

    cv::VideoCapture video("video.mp4");
    cv::Mat frame;

    while (video.read(frame)) {
        nel::ImageHeader header{
            frame.ptr(),
            frame.cols,
            frame.rows,
            static_cast<int>(frame.step1()),
            nel::ImageFormat::BGR
        };
        int64_t timestamp_in_ms = static_cast<int64_t>(video.get(cv::CAP_PROP_POS_MSEC));

        // Track asynchronously using std::future
        auto future = tracker.track(header, std::chrono::milliseconds(timestamp_in_ms));
        auto result = future.get();

        // Process results
        std::cout << "Face tracking: " << (result.landmarks.isGood ? "good" : "failed") << std::endl;
        for (const auto& emotion : result.emotions) {
            std::cout << "  Probability: " << emotion.probability
                      << " Active: " << emotion.isActive << std::endl;
        }
    }
    return 0;
}
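
The example above uses the std::future overload. The following is a minimal sketch of the callback overload; the exact signature of the callable is an assumption, so consult tracker.h for the precise parameter types:

// Callback overload sketch (assumed callable signature): the lambda is invoked
// asynchronously when tracking of this frame completes.
tracker.track(header, std::chrono::milliseconds(timestamp_in_ms),
              [](const nel::ResultType& result) {
                  std::cout << "Face tracking: "
                            << (result.landmarks.isGood ? "good" : "failed") << std::endl;
              });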

# C

The main entry point is the NELTracker opaque pointer type with associated functions.

After creating a tracker with nel_tracker_new(), you can track frames by calling nel_tracker_track() with a callback function. The callback will be called asynchronously when tracking completes.

The following example shows basic usage:

#include "tracker_c.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void track_callback(void* user_data, NELResultType* result, const char* error_msg) {
    if (error_msg != NULL) {
        printf("Error: %s\n", error_msg);
        return;
    }

    printf("Face tracking: %s\n", result->landmarks->isGood ? "good" : "failed");
    for (int i = 0; i < result->emotions->count; i++) {
        printf("  Emotion %d - Probability: %f, Active: %d\n",
               result->emotions->emotions[i].emotionID,
               result->emotions->emotions[i].probability,
               result->emotions->emotions[i].isActive);
    }
}

int main() {
    char* error_msg = NULL;
    NELTracker* tracker = nel_tracker_new("model/model.realZ", 0, &error_msg);
    if (tracker == NULL) {
        printf("Failed to load model: %s\n", error_msg);
        free(error_msg);
        return 1;
    }

    // Prepare image data (example with dummy data)
    uint8_t image_data[640 * 480 * 3];  // RGB image
    NELImageHeader header = {
        .data = image_data,
        .width = 640,
        .height = 480,
        .stride = 640 * 3,
        .format = NELImageFormatRGB
    };

    nel_tracker_track(tracker, &header, 0, track_callback, NULL);

    // Clean up - in a real application, wait for the callback to complete before freeing the tracker
    nel_tracker_free(tracker);
    return 0;
}

# Python

The main entry point of this library is the realeyes.emotion_detection.Tracker class.

After a tracker object is constructed, the user can call the realeyes.emotion_detection.Tracker.track() function to process frames from a video or other frame source.

The following example shows the basic usage of the library using OpenCV for loading images:

import realeyes.emotion_detection as nel
import cv2

# Initialize the tracker
tracker = nel.Tracker('model/model.realZ')

# Open video
video = cv2.VideoCapture('video.mp4')

while True:
    ret, frame = video.read()
    if not ret:
        break

    # Convert BGR to RGB (OpenCV uses BGR)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Track emotions (timestamp in milliseconds)
    timestamp_ms = int(video.get(cv2.CAP_PROP_POS_MSEC))
    result = tracker.track(frame_rgb, timestamp_ms)

    # Process results
    print(f"Face tracking: {'good' if result.landmarks.is_good else 'failed'}")
    for emotion in result.emotions:
        print(f"  Emotion ID {emotion.emotion_id}: "
              f"Probability={emotion.probability:.3f}, "
              f"Active={emotion.is_active}")

video.release()

# C# / .NET

The main entry point is the EmotionTracker class.

After a tracker object is constructed, you can call the TrackAsync() method to track faces in a frame. The method returns a Task<TrackingResult>, allowing for asynchronous, non-blocking operation.

Both the constructor and the tracking method support concurrent execution: you can start multiple operations in parallel without waiting for results (see the sketch after the example below).

The following example demonstrates processing a video frame:

using Realeyes.EmotionTracking;
using System;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Create tracker with model file
        using var tracker = new EmotionTracker("model/model.realZ");

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        var imageHeader = new ImageHeader
        {
            Data = imageData,
            Width = 640,
            Height = 480,
            Stride = 640 * 3,
            Format = ImageFormat.RGB
        };

        // Track emotions asynchronously
        var result = await tracker.TrackAsync(imageHeader, TimeSpan.Zero);

        // Process results
        Console.WriteLine($"Face tracking: {(result.LandmarkData?.IsGood ?? false ? "good" : "failed")}");

        if (result.Emotions.Happy is { } happy)
            Console.WriteLine($"Happy: {happy.Probability:P2}, Active: {happy.IsActive}");

        if (result.Emotions.Confusion is { } confusion)
            Console.WriteLine($"Confusion: {confusion.Probability:P2}, Active: {confusion.IsActive}");
    }
}
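
Because TrackAsync() supports concurrent execution, consecutive frames can be submitted before earlier results arrive. A minimal sketch, assuming two already prepared ImageHeader instances (header1 and header2) for consecutive frames:

// Submit two frames without awaiting in between, then gather both results.
var task1 = tracker.TrackAsync(header1, TimeSpan.FromMilliseconds(0));
var task2 = tracker.TrackAsync(header2, TimeSpan.FromMilliseconds(40));
var results = await Task.WhenAll(task1, task2);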

# Java

The main entry point is the Tracker interface.

After creating a tracker object, you can call the track() method to process frames. The method returns a TrackerResultFuture for asynchronous result retrieval.

The following example shows basic usage:

import com.realeyesit.nel.*;

public class Example {
    public static void main(String[] args) {
        // Create tracker with model file
        Tracker tracker = Emotion.createTracker("model/model.realZ", 0);

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        ImageHeader header = new ImageHeader();
        header.setData(imageData);
        header.setWidth(640);
        header.setHeight(480);
        header.setStride(640 * 3);
        header.setFormat(ImageFormat.RGB);

        // Track emotions asynchronously
        TrackerResultFuture future = tracker.track(header, 0);
        ResultType result = future.get();

        // Process results
        System.out.println("Face tracking: " +
            (result.getLandmarks().getIsGood() ? "good" : "failed"));

        for (EmotionData emotion : result.getEmotions()) {
            System.out.println("  Emotion: " + emotion.getEmotionID() +
                " Probability: " + emotion.getProbability() +
                " Active: " + emotion.getIsActive());
        }
    }
}

# Results

The result of the tracking contains a nel::LandmarkData structure and a nel::EmotionResults vector.

  • The nel::LandmarkData consists of the following members:

    • scale, the size of the face (larger means the user is closer to the camera)
    • roll, pitch, yaw, the 3 Euler angles of the face pose
    • translate, the position of the head center on the frame
    • the landmarks2d vector with either 0 or 49 points,
    • the landmarks3d vector with either 0 or 49 points,
    • and the isGood boolean value.

    The isGood value indicates whether the tracking is deemed good enough.

    landmarks2d and landmarks3d contain 0 points if the tracker failed to find a face in the image; otherwise they always contain 49 points in the following structure:

    [Figure: layout of the 49 facial landmark points]

    landmarks3d contains the 3D coordinates of the frontal face with zero translation and unit scale.

  • The nel::EmotionResults contains multiple nel::EmotionData elements with the following members:

    • probability, the probability of the emotion
    • isActive, whether the probability is higher than an internal threshold
    • isDetectionSuccessful, whether the tracking quality was good enough to reliably detect this emotion

    The order of the nel::EmotionData elements is the same as the order of the emotions in nel::Tracker::get_emotion_IDs() and nel::Tracker::get_emotion_names(), as sketched below.
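
The following sketch shows how per-emotion results can be paired with their names, assuming nel::Tracker::get_emotion_names() returns a list parallel to the nel::EmotionResults vector:

// Minimal sketch: pairing emotion names with results; the names are assumed to
// be in the same order as result.emotions, as described above.
const auto names = tracker.get_emotion_names();
for (std::size_t i = 0; i < result.emotions.size(); ++i) {
    const auto& emotion = result.emotions[i];
    if (emotion.isDetectionSuccessful) {
        std::cout << names[i] << ": " << emotion.probability
                  << (emotion.isActive ? " (active)" : "") << std::endl;
    }
}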

# Interpretation of the classifier output

The probability output of the Realeyes classifier (from the nel::EmotionData structure) has the following properties:

  • It is a continuous value from the [0,1] range
  • It changes depending on the type and number of activated facial features
  • It typically indicates facial activity in regions of the face that correspond to a given facial expression
  • Strong facial wrinkles or shadows can amplify the classifier's sensitivity to the corresponding facial regions
  • It is purposefully sensitive as the classifier is trained to capture slight expressions
  • It should not be interpreted as intensity of a given facial expression
  • It is not possible to prescribe which facial features correspond to what output levels, due to the nature of the underlying ML models

We recommend the following interpretation of the probability output, with a code sketch after the list:

  • values close to 0
    • no or very little activity on the face with respect to a given facial expression
  • values between 0 and the binary threshold
    • some facial activity was perceived, though in the view of the classifier it does not amount to a basic facial expression
  • values just below the binary threshold
    • high facial activity was perceived, which under some circumstances may be interpreted as a true basic facial expression and under others not (e.g. watching ads vs. playing games)
  • values above the binary threshold
    • high facial activity was perceived, which in the view of the classifier amounts to a basic facial expression
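
These buckets can be approximated in code from probability and isActive, since the binary threshold itself is internal to the SDK. A minimal sketch; the 0.05 cutoff for "very little activity" is an illustrative assumption, not an SDK constant, and "just below the threshold" cannot be distinguished without knowing the threshold value:

// Minimal sketch: mapping one emotion result onto the recommended interpretation.
// The 0.05 cutoff is illustrative only; the binary threshold is internal and is
// exposed solely through isActive.
const char* interpret(const nel::EmotionData& emotion)
{
    if (emotion.probability < 0.05) return "no or very little facial activity";
    if (!emotion.isActive)          return "some facial activity, below the binary threshold";
    return "high facial activity, amounting to a basic facial expression";
}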