AI Development on the ALPON
End-to-end developer reference for running edge AI workloads on the ALPON, an industrial edge AI computer built on the Raspberry Pi CM5. This page covers the DEEPX DX-M1 AI accelerator, the DEEPX toolchain, the pre-compiled Model Zoo, the path from an ONNX file to a running container, and the throughput you can expect from the device in production deployments.
Overview
The ALPON AI development workflow is a two-stage pipeline. You train or obtain a model in PyTorch, TensorFlow, Keras, XGBoost, or MXNet, export it to ONNX, and compile it to the .dxnn format with the DEEPX dx-com compiler on a host PC. You then deploy the .dxnn artifact in a Docker container on the ALPON, where the DEEPX Runtime executes it on the DX-M1 NPU through the dx_engine (Python) or dxrt_api.h (C++) API.
The ALPON is an edge AI computer designed for image-based AI inference at the edge. Hardware acceleration is provided by the DEEPX DX-M1, a 25 TOPS NPU with 4 GB of dedicated on-chip memory that is independent of the Raspberry Pi CM5 system RAM. All of the developer tooling described on this page targets that accelerator.
Two-stage pipeline
| Stage | Where | Tooling / artifact |
|---|---|---|
| Source | Training framework | PyTorch · TF · Keras |
| Interchange | ONNX graph | `.onnx` |
| Compile | Host PC | `dx-com` (DEEPX compiler) |
| Deploy | ALPON | `dx-rt` + `dx_engine`, `.dxnn` on the DX-M1 |
Host vs. container architecture
ALPON OS ships with the DEEPX kernel driver preinstalled. Your inference code runs inside Docker containers. This decouples host maintenance from application deployment: OS and driver updates arrive via OTA, while inference logic updates independently through container images.
Host (ALPON OS)
- DEEPX kernel driver
- PCIe enumeration of `/dev/dxrt0`
- OTA-managed by ALPON Cloud
- `sixfab-dx` runtime and CLI tools

Container (your image)
- DEEPX Runtime libraries (`libdxrt`)
- `dx_engine` / `dxrt_api.h`
- Compiled model files (`.dxnn`)
- Your inference application
Compilation runs on a host PC once per model. The .dxnn artifact is then copied to the ALPON and loaded at runtime. You do not need a DX-M1 attached to your development machine to compile.
What lives where
The toolchain is split cleanly between the host PC (your development laptop) and the target device (the ALPON). Keep this split in mind when wiring up build systems and CI pipelines.
| Component | Runs on | Role |
|---|---|---|
| `dx-com` | Host PC (x86 Linux) | Compiles ONNX graphs into `.dxnn` binaries optimized for the DX-M1 NPU. |
| `dx-sim` | Host PC (x86 Linux) | Bit-accurate simulator for validating compiled models before shipping them to the device. |
| `dx-tron` | Host PC (x86 Linux/Windows) | Graph viewer for `.dxnn` files, based on Netron. Used for inspection and debugging. |
| DEEPX kernel driver | ALPON (host OS) | Exposes the NPU as `/dev/dxrt0`. Preinstalled on ALPON OS. |
| `sixfab-dx` | ALPON (host OS) | Metapackage that installs and maintains the DEEPX Runtime, CLI tools, and Python bindings. |
| `libdxrt` (DEEPX Runtime) | ALPON (host or container) | Loads and executes `.dxnn` models on the NPU. Provides C++ and Python APIs. |
| `dx_engine` | ALPON (Python) | Python binding bundled inside the runtime virtual environment. |
| `dxrt_api.h` | ALPON (C++) | Native C++ header. Used for low-latency inference and embedded pipelines. |
Languages and APIs
The DEEPX Runtime ships first-class bindings for Python (dx_engine) and C++ (dxrt_api.h). Use Python for rapid prototyping, video analytics with OpenCV or GStreamer, and glue code. Use C++ when you need deterministic latency, embedded integration, or direct runtime control.
- Python (`dx_engine`): Bundled inside the runtime virtual environment. Best fit for video analytics, prototyping, and glue code around OpenCV or GStreamer.
- C++ (`dxrt_api.h`): Low-overhead native header. Use when you need deterministic latency, integration with existing C++ pipelines, or embedded applications linking the runtime directly.
AI Accelerator (DEEPX DX-M1)
The DEEPX DX-M1 is a dedicated ai edge accelerator (NPU) that delivers 25 TOPS of INT8 inference with 4 GB of on-chip memory. It is an M.2 2280 module connected to the Raspberry Pi CM5 host over PCIe Gen3, and it runs image-based AI models (CNNs, YOLO-family detectors, classification, segmentation) compiled from ONNX.
| Parameter | Value |
|---|---|
| Module | DEEPX DX-M1 (M.2 2280) |
| Inference throughput | 25 TOPS (INT8) |
| Dedicated memory | 4 GB on-chip. The NPU does not share system RAM with the CM5 host. |
| Host interface | PCIe Gen3 via the ASM2806I packet switch, shared with the NVMe SSD lane. |
| Power draw | 2 W minimum, 5 W maximum under supported AI workloads. |
| Model format | ONNX compiled to .dxnn with dx-com. |
| APIs | Python: dx_engine · C++: dxrt_api.h |
| Device node | /dev/dxrt0 |
| Monitoring tool | dxtop |
| Supported architectures | Image-based CNNs. Transformer-based models are not supported on the current runtime. |
| Power-mode control | None. The NPU runs at a fixed performance profile; there is no software API for low-power or performance modes. |
Supported model formats
Models must be exported to ONNX and compiled with the DEEPX compiler (dx-com) before they execute on the NPU. Source frameworks for ONNX export include PyTorch, TensorFlow, Keras, XGBoost, and MXNet. Any framework that emits a valid ONNX graph of supported operators works.
Supported workloads
The DX-M1 is optimized for image-based AI workloads. The runtime executes convolutional architectures reliably. Transformer-based models are not supported on the current runtime.
- Object detection: YOLO family (v5, v7, v8, YOLOX), SSD variants
- Classification: ResNet, MobileNet, EfficientNet
- Segmentation: U-Net and semantic segmentation backbones
- OCR: PaddleOCR-based detection, classification, and recognition heads
- Face detection and recognition: CNN-based pipelines
- Pose estimation: keypoint-regression CNNs
Known limitations
Plan model architecture and memory footprint around the following constraints before deployment:
- NPU memory ceiling: 4 GB on-chip. Models whose compiled footprint exceeds 4 GB will not load.
- Image-based CNNs only. Transformer-based models are not supported on the current runtime.
- No software power-mode control. The NPU runs at a fixed performance profile.
DEEPX Runtime
The DEEPX Runtime is preinstalled on ALPON OS. No action is required out of the box. To reinstall or update the runtime and its CLI tools, run a single command: sudo apt update && sudo apt install sixfab-dx. The sixfab-dx metapackage provides the kernel driver, runtime libraries, dxrt-cli, dxtop, run_model, and the bundled Python environment with dx_engine.
Install or reinstall the runtime
The runtime ships with every ALPON out of the box, so first-time users can skip ahead to the Deploy Your First Model section. Use the command below only when you need to reinstall, recover from a broken state, or update after a kernel upgrade.
```shell
sudo apt update && sudo apt install sixfab-dx
```
If ALPON OS applies a kernel update, the DEEPX kernel module needs to be rebuilt against the new kernel. Rerun sudo apt install sixfab-dx to trigger the rebuild. This is the same recovery path used after any kernel change.
Verify the runtime
Confirm the NPU is enumerated and the driver is healthy with dxrt-cli -s. This is the first command to run when diagnosing any issue.
```shell
dxrt-cli -s
```

```
DXRT v3.2.0
* Device 0: M1, Accelerator type
  * RT Driver version : v2.1.0
  * FW version : v2.5.0
  * Memory : LPDDR5x 6000 Mbps, 3.92 GiB
  * PCIe : Gen3 X1 [01:00:00]
  NPU 0: voltage 750 mV, clock 1000 MHz, temperature 46°C
  NPU 1: voltage 750 mV, clock 1000 MHz, temperature 46°C
  NPU 2: voltage 750 mV, clock 1000 MHz, temperature 46°C
```
Installed CLI tools
The sixfab-dx package installs a small set of command-line tools for diagnostics, monitoring, and headless inference.
- `dxrt-cli` — device status and diagnostics (`dxrt-cli -s`).
- `dxtop` — real-time utilization monitor, an `htop` for the NPU.
- `run_model` — runs a compiled `.dxnn` file without writing application code.

Using the Python API
The dx_engine Python library ships inside the runtime virtual environment. Activate it before importing.
```shell
source /usr/lib/libdxrt/dxrt-venv/bin/activate
python -c "import dx_engine; print(dx_engine.__version__)"
```
There is no pip install dx_engine command. The library is bundled inside sixfab-dx and loaded from the runtime venv. Do not create a parallel venv; use the one at /usr/lib/libdxrt/dxrt-venv.
Concurrent models
The DEEPX Runtime schedules NPU time across multiple concurrently loaded models automatically. A typical ALPON deployment pairs a detector (for example YOLO) with a classifier or OCR head on the same device. You do not need to implement your own scheduler.
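A minimal two-model sketch, assuming Model Zoo files under `/opt/models` mounted at `/models` (the filenames and the placeholder box are hypothetical, and `crop_box` is illustrative glue, not part of the runtime):

```python
import numpy as np

def crop_box(frame, box):
    """Crop a detection (x1, y1, x2, y2) out of a frame, clamped to bounds."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    x1, y1 = max(0, int(x1)), max(0, int(y1))
    x2, y2 = min(w, int(x2)), min(h, int(y2))
    return frame[y1:y2, x1:x2]

if __name__ == "__main__":
    import cv2
    from dx_engine import InferenceEngine  # available on the ALPON

    # The runtime schedules NPU time across both engines automatically.
    detector = InferenceEngine("/models/yolov5n_qlite.dxnn")  # hypothetical paths
    classifier = InferenceEngine("/models/classifier.dxnn")

    frame = cv2.imread("/models/sample.jpg")
    det_out = detector.run([cv2.resize(frame, (640, 640))])
    # ... decode det_out into (x1, y1, x2, y2) boxes for your model, then:
    for box in [(100, 100, 300, 300)]:  # placeholder box
        crop = crop_box(frame, box)
        cls_out = classifier.run([cv2.resize(crop, (224, 224))])
```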
Model Zoo
The DEEPX Model Zoo is a catalog of pre-compiled .dxnn models that run on the DX-M1 NPU without any conversion work. It covers the common edge AI tasks: object detection, image classification, semantic segmentation, face detection and recognition, pose estimation, and OCR. Models are available in two quantization flavors, Q-Lite (fast INT8 default) and Q-Pro (fine-tuned, higher accuracy). Browse and download at developer.deepx.ai/modelzoo.
Categories and representative models
Object Detection
Bounding-box detectors for real-time video analytics, safety monitoring, and inventory tracking.
Image Classification
Multi-class image classifiers for product sorting, defect detection, and tagging pipelines.
Semantic Segmentation
Per-pixel masks for road-scene understanding, medical imaging, and surface inspection.
Face Detection & Recognition
Detection, alignment, and embedding models for access control and attendance systems.
Pose Estimation
2D keypoint regression for workplace safety analytics and human activity monitoring.
OCR
Text detection, classification, and recognition pipelines for document processing and industrial labels.
The Model Zoo catalog reflects what the DEEPX Runtime supports: image-based CNNs. Transformer-based models (ViT, DETR, LLMs) are not supported on the current runtime. Plan your architecture around the categories listed above.
Quantization modes: Q-Lite vs Q-Pro
Models in the zoo ship in two quantization flavors. Pick one based on your accuracy headroom and deployment timeline.
| Mode | Use case | Trade-off |
|---|---|---|
| Q-Lite | Default choice. Standard INT8 quantization optimized for fast inference and short compile times. | Lower accuracy floor than Q-Pro on sensitive models; typically negligible for well-trained detectors. |
| Q-Pro | Accuracy-sensitive workloads. High-precision quantization with fine-tuning to recover accuracy close to FP32. | Longer compile time. Use for production models where every mAP point matters. |
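To decide between the two flavors empirically, run both on the same inputs and compare. A minimal sketch: the `top1_agreement` helper is hypothetical, the `.dxnn` paths are placeholders, and the device-side part assumes the `dx_engine` API shown elsewhere on this page.

```python
import numpy as np

def top1_agreement(logits_a, logits_b):
    """Fraction of samples where two output batches agree on the top-1 class."""
    a = np.asarray(logits_a).argmax(axis=-1)
    b = np.asarray(logits_b).argmax(axis=-1)
    return float((a == b).mean())

if __name__ == "__main__":
    import cv2
    from dx_engine import InferenceEngine  # available on the ALPON

    lite = InferenceEngine("/models/resnet50_qlite.dxnn")  # hypothetical paths
    pro = InferenceEngine("/models/resnet50_qpro.dxnn")

    img = cv2.resize(cv2.imread("/models/sample.jpg"), (224, 224))
    print("top-1 agreement:", top1_agreement(lite.run([img])[0], pro.run([img])[0]))
```

Run this over a representative sample of your own data; if agreement stays high, the faster Q-Lite build is usually sufficient.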
How do I download a model from the Model Zoo?
Models are distributed as .dxnn files from the DEEPX developer portal. On the ALPON, fetch the file directly with wget or curl and mount it into your container.
```shell
# Example layout: keep models under /opt/models on the host
sudo mkdir -p /opt/models
cd /opt/models

# Download a Q-Lite YOLOv5 nano model (example path)
sudo curl -L -O https://developer.deepx.ai/modelzoo/download/yolov5n_qlite.dxnn
```
Review the per-model license on the DEEPX developer portal before you redistribute or ship a commercial product derived from a zoo model.
Deploy Your First Model
Deploying your first model on the ALPON takes five steps: (1) export to ONNX, (2) compile to .dxnn on a host PC with dx-com, (3) copy the artifact to the ALPON, (4) run inference in a privileged Docker container against /dev/dxrt0, and (5) verify with dxtop. If you start from a Model Zoo download, skip steps 1 and 2 and go straight to deployment.
Prerequisites
- An ALPON powered on and reachable over SSH, with ALPON OS up to date.
- DEEPX Runtime healthy on the device. Verify with `dxrt-cli -s`; the runtime ships preinstalled.
- Docker available on the device (included with ALPON OS).
- A `.dxnn` model, either downloaded from the Model Zoo or compiled with `dx-com` on your host PC.
Step-by-step walkthrough
On your host PC, export the trained model to ONNX with opset 13 or later. This example uses Ultralytics YOLOv8.
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(format="onnx", opset=13, simplify=True)  # produces yolov8n.onnx
```
Compile with `dx-com`. Run the DEEPX compiler on your host PC to produce a .dxnn artifact. Enable PPU support when available to offload post-processing to the NPU.
```shell
dx-com compile --model yolov8n.onnx --config yolov8n.cfg --output yolov8n.dxnn --ppu
```
Transfer the .dxnn artifact to a stable path on the device, typically /opt/models.
```shell
scp yolov8n.dxnn alpon@<device-ip>:/opt/models/
```
On the ALPON, write a minimal Python script that loads the model and runs one inference pass. Save it as infer.py.
```python
import cv2
import numpy as np
from dx_engine import InferenceEngine

# 1) Load the compiled model onto the DX-M1 NPU
engine = InferenceEngine("/models/yolov8n.dxnn")

# 2) Prepare an input frame (BGR, 640x640 for YOLOv8n)
frame = cv2.imread("/models/sample.jpg")
input_tensor = cv2.resize(frame, (640, 640))

# 3) Run inference on the NPU
outputs = engine.run([input_tensor])

# 4) outputs is a list of numpy arrays; shape depends on the model
print("Output tensors:", [o.shape for o in outputs])
```
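A plain `cv2.resize` distorts aspect ratio; YOLO models are usually fed a letterboxed frame instead. A numpy-only sketch of the padding arithmetic (the `letterbox` helper is illustrative; the 114-gray fill follows the common YOLO convention and should match how your model was trained and compiled):

```python
import numpy as np

def letterbox(frame, size=640, fill=114):
    """Resize preserving aspect ratio, padding the remainder with gray."""
    h, w = frame.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize with pure numpy indexing (cv2.resize also works)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = frame[ys[:, None], xs]
    # Center the resized image on a gray canvas
    out = np.full((size, size, frame.shape[2]), fill, dtype=frame.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out
```

Remember to undo the same scale and offsets when mapping output boxes back onto the original frame.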
Write a minimal Dockerfile that installs the runtime and copies your script. Since the kernel driver runs on the host, the container only needs user-space libraries.
```dockerfile
FROM debian:trixie-slim
RUN apt-get update && apt-get install -y sixfab-dx python3-opencv
COPY infer.py /app/infer.py
WORKDIR /app
# Use the bundled DEEPX venv so dx_engine is importable
CMD ["/usr/lib/libdxrt/dxrt-venv/bin/python", "/app/infer.py"]
```
Build and run it on the device, mounting /opt/models and exposing the NPU device node.
```shell
docker build -t alpon-infer:latest .
docker run --rm --privileged --device /dev/dxrt0 -v /opt/models:/models alpon-infer:latest
```
In another SSH session, run dxtop to confirm the NPU is active and memory is allocated. Utilization should spike while your container is running.
```shell
dxtop   # watch NPU utilization, memory, and temperature in real time
```
Common pitfalls
| Symptom | Likely cause and fix |
|---|---|
| `ImportError: No module named dx_engine` | The script is running under the system Python instead of the bundled venv. Invoke `/usr/lib/libdxrt/dxrt-venv/bin/python` or source the venv first. |
| Cannot open `/dev/dxrt0` inside container | Missing `--privileged` or `--device /dev/dxrt0`. Add both to `docker run` or the compose file. |
| Model fails to load, error references memory | Compiled `.dxnn` footprint exceeds the 4 GB NPU memory. Recompile with a smaller input resolution or switch to a lighter model variant. |
| Extremely low FPS on YOLO | Model compiled without PPU support; post-processing runs on the CPU. Recompile with `--ppu`. |
| Compile fails with unsupported operator | The ONNX graph contains an attention block or transformer operator. Switch to a CNN-based architecture; transformers are not supported on the current runtime. |
| `dxrt-cli` reports no device | Reboot and re-check. If still failing, rerun `sudo apt install sixfab-dx` (rebuilds the kernel module) and inspect `dmesg \| grep -i dx`. |
Docker Access
Run the container in privileged mode and expose the NPU device node with --privileged --device /dev/dxrt0. No additional drivers need to be installed inside the image because the kernel driver runs on the host.
Production ALPON deployments run inference code inside Docker containers. The kernel driver lives on the host, so the container only needs access to the device node and the runtime libraries.
```shell
# Minimal run command: privileged mode + NPU device node
docker run --privileged --device /dev/dxrt0 -v $(pwd)/models:/models -it your-image:tag
```
The equivalent in docker-compose.yml:
```yaml
services:
  inference:
    image: your-image:tag
    privileged: true
    devices:
      - "/dev/dxrt0:/dev/dxrt0"
    volumes:
      - ./models:/models
    restart: unless-stopped
```
The --privileged flag grants the container access to all host devices. The DEEPX Runtime uses PCIe ioctls to communicate with the DX-M1 that are not exposed in unprivileged containers, so --privileged is currently required. Limit it to containers that need NPU access, build from trusted base images, and do not expose privileged containers to untrusted networks without additional isolation (seccomp, AppArmor).
Performance & Benchmarks
With PPU (Post-Processing Unit) support enabled, a YOLO-nano class detector reaches approximately 50 FPS at 1280 x 720 and 20 to 25 FPS at 1920 x 1080 on the DEEPX DX-M1. Larger YOLO variants scale down proportionally. Actual throughput depends on model variant, input resolution, pre-processing path, and whether PPU is compiled into the graph.
YOLO throughput reference
| Input resolution | Approximate FPS (PPU-compiled YOLO nano) | Notes |
|---|---|---|
| 1280 x 720 (HD) | ~50 FPS | Real-time processing for single-stream HD video analytics. |
| 1920 x 1080 (Full HD) | ~20 to 25 FPS | Headroom for additional CPU-side pre- and post-processing. |
PPU-compiled models handle bounding-box decoding and NMS on the NPU, which reduces CPU overhead significantly. Without PPU, post-processing runs on the CM5 CPU and can become the bottleneck on Full HD streams. Most YOLO variants support PPU compilation.
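To make the trade-off concrete, this is the kind of per-frame CPU work a non-PPU pipeline must do after every inference: a minimal greedy NMS in numpy (illustrative only, not the DEEPX implementation, and box decoding is omitted).

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression. boxes: (N, 4) as x1, y1, x2, y2."""
    boxes = np.asarray(boxes, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # IoU of box i against all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]  # drop overlapping boxes
    return keep
```

On a Full HD stream with thousands of candidate boxes per frame, loops like this compete with decode and rendering for CM5 CPU time, which is why PPU compilation is usually the largest single performance lever.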
How to benchmark your own model
Use a simple loop around engine.run() and measure wall-clock time over a warm window. Skip the first ~10 iterations to avoid warm-up noise.
```python
import time
import numpy as np
from dx_engine import InferenceEngine

engine = InferenceEngine("/models/yolov8n.dxnn")
dummy = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Warm-up
for _ in range(10):
    engine.run([dummy])

# Timed window
N = 200
t0 = time.perf_counter()
for _ in range(N):
    engine.run([dummy])
dt = time.perf_counter() - t0
print(f"FPS: {N/dt:.1f} (n={N})")
```
Variables that move the numbers
- Model variant. Nano tier is the fastest; small, medium, and large variants drop FPS roughly proportional to compute.
- Input resolution. Doubling resolution roughly quarters FPS on detection models.
- PPU compilation. On or off is usually the largest single factor for YOLO.
- Quantization mode. Q-Lite runs slightly faster than Q-Pro; choose Q-Pro only if you measure an accuracy regression.
- Concurrent models. Running a detector and a classifier simultaneously shares NPU time; each will run slower than in isolation.
- Pre- and post-processing path. OpenCV decode plus color conversion on the CPU can bottleneck high-resolution pipelines. Consider GStreamer with hardware decode for sustained Full HD.
Optional: enable PCIe Gen3
The ALPON is tuned to run the DX-M1 on a Gen3 PCIe link. If you have modified the device tree or carrier configuration, verify the link speed reports Gen3 X1 after reboot.
```shell
dxrt-cli -s   # look for: PCIe : Gen3 X1
```
Hard limits
| Limit | Value |
|---|---|
| NPU memory ceiling | 4 GB on-chip. Models whose compiled footprint exceeds 4 GB will not load. |
| NPU power envelope | 2 W minimum, 5 W maximum under supported AI workloads. |
| Interface bandwidth | PCIe Gen3 x1 shared with the NVMe SSD via the ASM2806I packet switch. |
| Supported architectures | Image-based CNNs. Transformer-based models are not supported on the current runtime. |
| Power-mode control | None. The NPU runs at a fixed performance profile; there is no software API for low-power or performance modes. |
Published FPS numbers are single-stream, dummy-input references. Real-world pipelines add frame capture, decode, color conversion, and result rendering, all of which consume CPU cycles on the CM5. Always measure end-to-end throughput under your exact workload before committing to a deployment budget.
Monitoring & Power
Use dxtop on the ALPON host. It is a command-line monitor that reports NPU utilization, memory, temperature, voltage, and clock in real time, in the spirit of htop for CPUs or nvidia-smi for GPUs.
Live monitor: dxtop
```shell
# On the host or inside a privileged container with /dev/dxrt0 mounted
dxtop
```
For headless monitoring from a remote system, run dxtop over the ALPON Cloud remote terminal or an SSH session. Pipe the output into your logging stack if you want long-running metrics rather than an interactive view.
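As a sketch of such a logging hook, the script below polls `dxrt-cli -s` and extracts the per-NPU temperatures; `parse_npu_temps` is a hypothetical helper whose regex is matched against the status output format shown earlier on this page.

```python
import re
import subprocess
import time

def parse_npu_temps(status_text):
    """Extract per-NPU temperatures (°C) from dxrt-cli -s output."""
    return [int(t) for t in re.findall(r"temperature\s+(\d+)", status_text)]

if __name__ == "__main__":
    # Poll every 10 s; redirect stdout into your logging stack
    while True:
        out = subprocess.run(["dxrt-cli", "-s"],
                             capture_output=True, text=True).stdout
        print(time.strftime("%H:%M:%S"), parse_npu_temps(out), flush=True)
        time.sleep(10)
```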
Quick reference commands
```shell
# Full hardware status
dxrt-cli -s

# Real-time NPU utilization (q to quit)
dxtop

# Test a compiled model headlessly
run_model --model_path yolov8n.dxnn

# Reinstall or update the runtime
sudo apt update && sudo apt install sixfab-dx
```
Can I control DEEPX NPU power modes from software?
No. The DX-M1 on the ALPON runs at a fixed performance profile. There is currently no user-facing API to switch between "maximum performance" and "low power" modes. In practice the NPU draws 2 W idle to 5 W under full load on supported AI workloads, which is a narrow enough envelope that dynamic throttling has little operational value at the edge.
If your deployment is power-constrained, control the total inference budget at the application layer: reduce the input frame rate, skip alternate frames, or stop inference between triggers rather than throttling the silicon.
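A sketch of that application-layer budgeting: run inference only on every Nth frame and reuse the last result in between. The `FrameGate` class is illustrative; the camera and engine setup in the main block follows the walkthrough above.

```python
class FrameGate:
    """Run inference on every `stride`-th frame; reuse results otherwise."""
    def __init__(self, stride=3):
        self.stride = stride
        self.count = 0

    def should_infer(self):
        run = self.count % self.stride == 0
        self.count += 1
        return run

if __name__ == "__main__":
    import cv2
    from dx_engine import InferenceEngine  # available on the ALPON

    engine = InferenceEngine("/models/yolov8n.dxnn")
    gate = FrameGate(stride=3)  # roughly 1/3 of the full-rate NPU duty cycle
    cap = cv2.VideoCapture(0)
    last = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if gate.should_infer():
            last = engine.run([cv2.resize(frame, (640, 640))])
        # ... render `last` onto the frame, act on triggers, etc.
```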