# Overview
The Native Emotions Library is a portable C++ library for real-time facial emotion tracking and analysis.
The SDK provides wrappers in the following languages:
- C++ (native)
- C
- Python
- C# / .NET
- Java (Android)
# Getting Started
## Hardware requirements
The SDK does not have any special hardware requirements:
- CPU: no special requirement, any modern 64-bit capable CPU (x86-64 with AVX, ARMv8) is supported
- GPU: no special requirement
- RAM: at least 2 GB of available RAM
- Camera: no special requirement, minimum resolution: 640x480
## Software requirements
The SDK is regularly tested on the following Operating Systems:
- Windows 10+
- Ubuntu 24.04+
- macOS 15+
- iOS 18+
- Android API level 23+
# 3rd Party Licenses
While the SDK is released under a proprietary license, it incorporates the following open-source projects under their respective licenses:
- OpenCV - 3-clause BSD
- TensorFlow - Apache License 2.0
- Protobuf - 3-clause BSD
- zlib - zlib license
- minizip-ng - zlib license
- stlab - Boost Software License 1.0
- pybind11 - 3-clause BSD
- fmtlib - MIT License
# Installation
## C++
Extract the SDK contents, include the headers from the include folder and link libNativeEmotionsLibrary to your C++ project.
## C
Extract the SDK contents, include tracker_c.h from the include folder and link libNativeEmotionsLibrary to your C project.
## Python
The Python version of the SDK can be installed with pip:
```
$ pip install realeyes.emotion-detection
```
## C# / .NET
The .NET version of the SDK can be installed via NuGet:
```
$ dotnet add package Realeyes.EmotionTracking
```
## Java
For Android projects, add the library to your build.gradle dependencies.
# Usage
## C++
The main entry point of this library is the nel::Tracker class.
After a tracker object is constructed, the user can call the nel::Tracker::track() function to process
a frame from a video or other frame source.
The nel::Tracker::track() function has two overloads, and both are non-blocking asynchronous calls: one returns
std::future<ResultType>, the other accepts a callback that is invoked on completion. Subsequent calls can be
issued without waiting for the result of a previous one (see the pipelined sketch after the example below).
For the frame data, the user must construct a nel::ImageHeader object. The header is a non-owning view, so the
frame data must stay valid while the header is used; however, it only needs to remain valid for the duration of
the nel::Tracker::track() call, because the library copies the frame data internally.
The following example shows the basic usage of the library using OpenCV for loading images and feeding them to the tracker:
#include "tracker.h"
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/videoio.hpp>
#include <iostream>
int main()
{
nel::Tracker tracker("model/model.realZ");
cv::VideoCapture video("video.mp4");
cv::Mat frame;
while (video.read(frame)) {
nel::ImageHeader header{
frame.ptr(),
frame.cols,
frame.rows,
static_cast<int>(frame.step1()),
nel::ImageFormat::BGR
};
int64_t timestamp_in_ms = video.get(cv2::CAP_PROP_POS_MSEC);
// Track asynchronously using std::future
auto future = tracker.track(header, std::chrono::milliseconds(timestamp_in_ms));
auto result = future.get();
// Process results
std::cout << "Face tracking: " << (result.landmarks.isGood ? "good" : "failed") << std::endl;
for (const auto& emotion : result.emotions) {
std::cout << " Probability: " << emotion.probability
<< " Active: " << emotion.isActive << std::endl;
}
}
return 0;
}
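Because track() is non-blocking and the frame data is copied internally, calls can be pipelined: a new frame can be submitted before the previous result is collected. The following is a minimal sketch of that pattern; the nel::ResultType qualification of the future's value type and the queue depth of 4 are assumptions made for this sketch, not part of the documented API.
```cpp
#include "tracker.h"

#include <opencv2/core.hpp>
#include <opencv2/videoio.hpp>

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <future>
#include <iostream>

int main()
{
    nel::Tracker tracker("model/model.realZ");
    cv::VideoCapture video("video.mp4");
    cv::Mat frame;
    std::deque<std::future<nel::ResultType>> pending; // value type assumed for this sketch
    constexpr std::size_t max_in_flight = 4;          // illustrative queue depth

    while (video.read(frame)) {
        nel::ImageHeader header{
            frame.ptr(),
            frame.cols,
            frame.rows,
            static_cast<int>(frame.step1()),
            nel::ImageFormat::BGR
        };
        const auto ts = static_cast<int64_t>(video.get(cv::CAP_PROP_POS_MSEC));

        // Submit the frame without waiting for earlier results; the library
        // copies the frame data, so 'frame' can be reused immediately.
        pending.push_back(tracker.track(header, std::chrono::milliseconds(ts)));

        // Drain the oldest result once enough calls are in flight.
        if (pending.size() >= max_in_flight) {
            auto result = pending.front().get();
            pending.pop_front();
            std::cout << "Tracking " << (result.landmarks.isGood ? "good" : "failed") << '\n';
        }
    }

    // Collect the remaining results.
    while (!pending.empty()) {
        auto result = pending.front().get();
        pending.pop_front();
        std::cout << "Tracking " << (result.landmarks.isGood ? "good" : "failed") << '\n';
    }
    return 0;
}
```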
## C
The main entry point is the NELTracker opaque pointer type with associated functions.
After creating a tracker with nel_tracker_new(), you can track frames by calling nel_tracker_track()
with a callback function. The callback will be called asynchronously when tracking completes.
The following example shows basic usage:
#include "tracker_c.h"
#include <stdio.h>
#include <stdlib.h>
void track_callback(void* user_data, NELResultType* result, const char* error_msg) {
if (error_msg != NULL) {
printf("Error: %s\n", error_msg);
return;
}
printf("Face tracking: %s\n", result->landmarks->isGood ? "good" : "failed");
for (int i = 0; i < result->emotions->count; i++) {
printf(" Emotion %d - Probability: %f, Active: %d\n",
result->emotions->emotions[i].emotionID,
result->emotions->emotions[i].probability,
result->emotions->emotions[i].isActive);
}
}
int main() {
char* error_msg = NULL;
NELTracker* tracker = nel_tracker_new("model/model.realZ", 0, &error_msg);
if (tracker == NULL) {
printf("Failed to load model: %s\n", error_msg);
free(error_msg);
return 1;
}
// Prepare image data (example with dummy data)
uint8_t image_data[640 * 480 * 3]; // RGB image
NELImageHeader header = {
.data = image_data,
.width = 640,
.height = 480,
.stride = 640 * 3,
.format = NELImageFormatRGB
};
nel_tracker_track(tracker, &header, 0, track_callback, NULL);
// Clean up
nel_tracker_free(tracker);
return 0;
}
## Python
The main entry point of this library is the realeyes.emotion_detection.Tracker class.
After a tracker object is constructed, the user can call the realeyes.emotion_detection.Tracker.track()
function to process frames from a video or other frame source.
The following example shows the basic usage of the library using OpenCV for loading images:
```python
import realeyes.emotion_detection as nel
import cv2

# Initialize the tracker
tracker = nel.Tracker('model/model.realZ')

# Open video
video = cv2.VideoCapture('video.mp4')

while True:
    ret, frame = video.read()
    if not ret:
        break

    # Convert BGR to RGB (OpenCV uses BGR)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Track emotions (timestamp in milliseconds)
    timestamp_ms = int(video.get(cv2.CAP_PROP_POS_MSEC))
    result = tracker.track(frame_rgb, timestamp_ms)

    # Process results
    print(f"Face tracking: {'good' if result.landmarks.is_good else 'failed'}")
    for emotion in result.emotions:
        print(f"  Emotion ID {emotion.emotion_id}: "
              f"Probability={emotion.probability:.3f}, "
              f"Active={emotion.is_active}")

video.release()
```
## C# / .NET
The main entry point is the EmotionTracker class.
After a tracker object is constructed, you can call the TrackAsync() method to track faces
in a frame. The method returns a Task<TrackingResult>, allowing for asynchronous, non-blocking operation.
Both the constructor and the tracking method support concurrent execution: you can start multiple operations in parallel without waiting for results.
The following example demonstrates processing a video frame:
```csharp
using Realeyes.EmotionTracking;
using System;
using System.Threading.Tasks;

class Program
{
    static async Task Main(string[] args)
    {
        // Create tracker with model file
        using var tracker = new EmotionTracker("model/model.realZ");

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        var imageHeader = new ImageHeader
        {
            Data = imageData,
            Width = 640,
            Height = 480,
            Stride = 640 * 3,
            Format = ImageFormat.RGB
        };

        // Track emotions asynchronously
        var result = await tracker.TrackAsync(imageHeader, TimeSpan.Zero);

        // Process results
        Console.WriteLine($"Face tracking: {(result.LandmarkData?.IsGood ?? false ? "good" : "failed")}");
        if (result.Emotions.Happy is { } happy)
            Console.WriteLine($"Happy: {happy.Probability:P2}, Active: {happy.IsActive}");
        if (result.Emotions.Confusion is { } confusion)
            Console.WriteLine($"Confusion: {confusion.Probability:P2}, Active: {confusion.IsActive}");
    }
}
```
## Java
The main entry point is the Tracker interface.
After creating a tracker object, you can call the track() method to process frames.
The method returns a TrackerResultFuture for asynchronous result retrieval.
The following example shows basic usage:
```java
import com.realeyesit.nel.*;

public class Example {
    public static void main(String[] args) throws Exception {
        // Create tracker with model file
        Tracker tracker = Emotion.createTracker("model/model.realZ", 0);

        // Prepare image data (example with dummy RGB data)
        byte[] imageData = new byte[640 * 480 * 3];
        ImageHeader header = new ImageHeader();
        header.setData(imageData);
        header.setWidth(640);
        header.setHeight(480);
        header.setStride(640 * 3);
        header.setFormat(ImageFormat.RGB);

        // Track emotions asynchronously
        TrackerResultFuture future = tracker.track(header, 0);
        ResultType result = future.get();

        // Process results
        System.out.println("Face tracking: " +
                (result.getLandmarks().getIsGood() ? "good" : "failed"));
        for (EmotionData emotion : result.getEmotions()) {
            System.out.println("  Emotion: " + emotion.getEmotionID() +
                    " Probability: " + emotion.getProbability() +
                    " Active: " + emotion.getIsActive());
        }
    }
}
```
# Results
The result of the tracking contains a nel::LandmarkData structure and a nel::EmotionResults vector.
nel::LandmarkData consists of the following members:
- scale, the size of the face (a larger value means the user is closer to the camera)
- roll, pitch, yaw, the three Euler angles of the face pose
- translate, the position of the head center on the frame
- the landmarks2d vector with either 0 or 49 points,
- the landmarks3d vector with either 0 or 49 points,
- and the isGood boolean value.
isGood indicates whether the tracking is deemed good enough.
landmarks2d and landmarks3d contain 0 points if the tracker failed to find a face on the image; otherwise they always contain 49 points in the following structure:

[Figure: layout of the 49 facial landmark points]

landmarks3d contains the 3D coordinates of the frontal face, with 0 translation and 1 scale.
nel::EmotionResults contains multiple nel::EmotionData elements with the following members:
- probability, the probability of the emotion
- isActive, whether the probability is higher than an internal threshold
- isDetectionSuccessful, whether the tracking quality was good enough to reliably detect this emotion
The order of the nel::EmotionData elements is the same as the order of the emotions in nel::Tracker::get_emotion_IDs() and nel::Tracker::get_emotion_names().
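As an illustration of how the result structures fit together, the sketch below pairs each nel::EmotionData entry with the emotion name at the same index from nel::Tracker::get_emotion_names(). The exact return and container types are not specified in this document, so index-based access, auto, and the const-qualification of get_emotion_names() are assumptions of this sketch.
```cpp
#include "tracker.h"

#include <cstddef>
#include <iostream>

// Print each emotion of a tracking result next to its name.
// Relies on the documented guarantee that the emotion order matches
// nel::Tracker::get_emotion_names(); container types are assumed.
void print_emotions(const nel::Tracker& tracker, const nel::EmotionResults& emotions)
{
    const auto names = tracker.get_emotion_names();
    for (std::size_t i = 0; i < emotions.size(); ++i) {
        const auto& emotion = emotions[i];
        if (!emotion.isDetectionSuccessful) {
            continue; // tracking quality was too low to detect this emotion reliably
        }
        std::cout << names[i]
                  << ": probability=" << emotion.probability
                  << " active=" << (emotion.isActive ? "yes" : "no") << '\n';
    }
}
```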
# Interpretation of the classifier output
The probability output of the Realeyes classifier (from the nel::EmotionData structure) has the following properties:
- It is a continuous value in the [0,1] range
- It changes depending on the type and number of facial features activated
- It typically indicates facial activity in the regions of the face that correspond to a given facial expression
- Strong facial wrinkles or shadows can amplify the classifier's sensitivity to the corresponding facial regions
- It is purposefully sensitive, as the classifier is trained to capture slight expressions
- It should not be interpreted as the intensity of a given facial expression
- It is not possible to prescribe which facial features correspond to which output levels, due to the nature of the ML models used
We recommend the following interpretation of the probability output:
- values close to 0: no or very little activity on the face with respect to a given facial expression
- values between 0 and the binary threshold: some facial activity was perceived, though in the view of the classifier it does not amount to a basic facial expression
- values just below the binary threshold: high facial activity was perceived, which under some circumstances may be interpreted as a true basic facial expression, while under others not (e.g. watching ads vs. playing games)
- values above the binary threshold: high facial activity was perceived, which in the view of the classifier amounts to a basic facial expression
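To make the recommendation above concrete, the sketch below maps a probability to one of the four bands. The binary threshold is internal to the classifier and not exposed in this document, so it is passed in as a parameter here, and the near-zero and "just below" cut-offs are illustrative values chosen for this sketch rather than library constants.
```cpp
#include <string>

// Map a classifier probability to the recommended interpretation bands.
// 'threshold' stands in for the classifier's internal binary threshold;
// 'near_zero' and 'margin' are illustrative cut-offs for this sketch.
std::string interpret_probability(double probability, double threshold,
                                  double near_zero = 0.05, double margin = 0.1)
{
    if (probability < near_zero)
        return "no or very little facial activity";
    if (probability >= threshold)
        return "high activity; a basic facial expression according to the classifier";
    if (probability >= threshold - margin)
        return "high activity; may or may not be a true basic expression depending on context";
    return "some facial activity, but not a basic facial expression";
}
```
In practice, the isActive flag of nel::EmotionData already encodes the "above binary threshold" case, so applications that only need the binary decision can rely on it directly.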