AI Development

ALPON X5 AI · AI Development · Updated 2026-05-07

Run vision AI on the DEEPX DX-M1 NPU.

End-to-end developer reference for the ALPON X5 AI: the Sixfab Model Zoo for instant inference, the DXNN SDK for compiling your own ONNX graphs, and the dx_engine Python and dxrt_api.h C++ APIs that drive the NPU from a Docker container. Intelligented by DEEPX. Built on Raspberry Pi.

25 TOPS · DEEPX DX-M1 · 4 GB NPU memory · ONNX → DXNN compiler · Docker-native · Python & C++
What is the ALPON X5 AI development workflow?

The ALPON X5 AI supports two paths. Path 1 — Sixfab Model Zoo ships pre-compiled .dxnn models inside the sixfab-dx APT package and exposes a run_hello_world demo for instant inference. Path 2 — DXNN SDK is the custom-model workflow: train in PyTorch, TensorFlow, or Keras, export to ONNX, compile to .dxnn with dx-com on a host PC, and deploy through dx_engine Python or dxrt_api.h C++ on the DEEPX DX-M1 NPU.

01 Overview

The ALPON X5 AI is an industrial edge AI computer built for image-based inference at the network edge. AI acceleration is provided by the DEEPX DX-M1: a 25 TOPS NPU with 4 GB of dedicated on-chip memory, fully independent of the Raspberry Pi CM5 system RAM. Every developer tool on this page targets that accelerator.

Custom-model pipeline

Path 2 (DXNN SDK) follows a four-stage pipeline. Path 1 (Sixfab Model Zoo) skips the first three stages because the .dxnn file ships pre-compiled inside sixfab-dx.

  • 01 Source: training framework (PyTorch · TF · Keras)
  • 02 Interchange: ONNX graph (.onnx · opset ≥ 13)
  • 03 Compile (Host PC): dx-com (DEEPX compiler)
  • 04 Deploy (ALPON X5 AI): sixfab-dx + dx_engine (.dxnn on DX-M1 NPU)

What lives where

The toolchain is split cleanly between the host PC (your development laptop) and the target device (the ALPON X5 AI). Keep this split in mind when wiring up build systems and CI pipelines.

Host PC

x86_64 Linux
  • dx-com: compiles ONNX graphs into .dxnn binaries optimized for DEEPX DX-M1.
  • dx-tron: Netron-based viewer for .dxnn files. Used for inspection and debugging.
  • Your training pipeline and ONNX export script.

ALPON X5 AI (target)

arm64 · ALPON X5 AI OS
  • DEEPX kernel driver: exposes the NPU as /dev/dxrt0. Preinstalled.
  • sixfab-dx: runtime, Python wheels, pre-built venv, and CLI tools.
  • dx_engine Python and dxrt_api.h C++ APIs.
  • Compiled .dxnn models and your inference application.
Compile once, deploy everywhere

Compilation happens on a host PC once per model. The .dxnn artifact is then copied to any number of ALPON X5 AI devices and loaded at runtime. You do not need a DEEPX DX-M1 attached to your development machine to compile.

Languages and APIs

The DEEPX Runtime ships first-class bindings for Python (dx_engine) and C++ (dxrt_api.h). Use Python for rapid prototyping, video analytics with OpenCV or GStreamer, and glue code. Use C++ when you need deterministic latency, embedded integration, or direct runtime control.

Two paths to inference

The ALPON X5 AI supports two complementary workflows. Pick the one that matches what you need today; you can mix them in the same project.

Supported workloads

The DEEPX DX-M1 is optimized for image-based AI workloads. The runtime executes convolutional architectures reliably. Transformer-based models are not supported on the current runtime.

  • Object detection: YOLO family (v5, v7, v8, v11, YOLO26, YOLOX), SSD variants.
  • Classification: ResNet, MobileNet, EfficientNet.
  • Segmentation: U-Net, DeepLabV3, and other semantic segmentation backbones.
  • OCR: PaddleOCR-based detection, classification, and recognition heads.
  • Face detection and recognition: RetinaFace, SCRFD, ArcFace pipelines.
  • Pose estimation: YOLOv8-Pose, HRNet keypoint regressors.

02 DEEPX Runtime

What is the DEEPX Runtime on the ALPON X5 AI?

The runtime is delivered through the sixfab-dx APT package. It bundles the DEEPX Runtime libraries, the dx_engine Python wheel, a pre-built virtual environment at /opt/sixfab-dx/venv, and CLI tools. The kernel driver is preinstalled on ALPON X5 AI OS and exposes the DEEPX DX-M1 at /dev/dxrt0.

  • APT Package: sixfab-dx (sixfab repo)
  • Python API: dx_engine (bundled venv)
  • C++ API: dxrt_api.h (native header)
  • Device Node: /dev/dxrt0 (PCIe Gen 3)
  • Health Check: dxrt-cli -s (device status)
  • Live Monitor: dxtop (htop-like NPU view)
  • Model Format: .dxnn (from ONNX via dx-com)

How do I install the DEEPX Runtime on the ALPON X5 AI?

The DEEPX kernel driver and the Sixfab APT repository are preconfigured on ALPON X5 AI OS. Install sixfab-dx with a single apt command, or pull it into your Docker image during the build step.

1. Install sixfab-dx

One command brings in the shared libraries, CLI tools, and the pre-built Python virtual environment with dx_engine already inside.

terminal bash
sudo apt update && sudo apt install -y sixfab-dx

2. Verify the NPU

Two CLI tools ship with the package. Use dxrt-cli -s for a quick status check and dxtop for a live, htop-like view of NPU utilization.

terminal bash
# 1. One-shot status check
dxrt-cli -s

# 2. Live monitoring while a workload runs
dxtop

3. Activate the bundled Python environment

The dx_engine wheel is pre-installed inside the bundled venv. Activate it before importing.

terminal bash
source /opt/sixfab-dx/venv/bin/activate
python -c "import dx_engine; print(dx_engine.__version__)"
No pip install needed

The dx_engine wheel ships inside the sixfab-dx package and is already installed in /opt/sixfab-dx/venv. Use that venv directly. Avoid creating a parallel venv and copying files around.
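
The three installation steps can also be checked from Python before you launch a workload. The sketch below is a minimal preflight, assuming only the device node path and module name given on this page; the `check_runtime` helper is a hypothetical name, not part of sixfab-dx.

```python
import importlib.util
import os


def check_runtime(device_node="/dev/dxrt0", module="dx_engine"):
    """Report whether the NPU device node and the Python binding are visible."""
    return {
        # The kernel driver exposes the NPU as a device node
        "device_node_present": os.path.exists(device_node),
        # find_spec returns None when the module is not importable from this interpreter
        "module_importable": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_runtime().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

Run it with /opt/sixfab-dx/venv/bin/python so the import check reflects the bundled venv rather than the system Python.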

Running the runtime inside a Docker container

Production ALPON X5 AI deployments run inference code inside Docker containers. The kernel driver lives on the host, so the container only needs access to the device node and the runtime libraries.

terminal bash
# Minimal run command: privileged mode + NPU device node
docker run --privileged \
  --device /dev/dxrt0 \
  -v $(pwd)/models:/models \
  -it your-image:tag

The equivalent in docker-compose.yml:

docker-compose.yml yaml
services:
  inference:
    image: your-image:tag
    privileged: true
    devices:
      - "/dev/dxrt0:/dev/dxrt0"
    volumes:
      - ./models:/models
    restart: unless-stopped
Why privileged mode

To communicate with the DEEPX DX-M1, the DEEPX Runtime uses PCIe ioctls that are not exposed by default in unprivileged containers, so --privileged is currently required. If your threat model demands tighter isolation, restrict the container with a narrow seccomp profile or AppArmor policy.

Concurrent models

The DEEPX Runtime schedules NPU time across multiple concurrently loaded models automatically. A typical ALPON X5 AI deployment pairs a detector (for example YOLO) with a classifier or OCR head on the same device. You do not need to implement your own scheduler.
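
At the application level, pairing two models can be as simple as feeding crops from the first into the second. The following is a rough sketch, not Sixfab's reference pipeline: it assumes both engines expose `run(list_of_arrays)` as in the infer.py example on this page, and `decode_boxes` is a hypothetical callback you would write for your detector's output format.

```python
import numpy as np


def run_cascade(detector, classifier, frame, decode_boxes, crop_size=(224, 224)):
    """Run a detector, then classify each detected region.

    detector / classifier: any object exposing run(list_of_arrays) -> list_of_arrays.
    decode_boxes: hypothetical callback turning detector outputs into integer
    (x1, y1, x2, y2) pixel boxes; the real decoding depends on the model.
    crop_size: (width, height) expected by the classifier (an assumed value).
    """
    results = []
    for x1, y1, x2, y2 in decode_boxes(detector.run([frame])):
        crop = frame[y1:y2, x1:x2]
        if crop.size == 0:
            continue  # skip degenerate boxes
        # Nearest-neighbour resize via index arrays keeps the sketch dependency-free
        rows = np.linspace(0, crop.shape[0] - 1, crop_size[1]).astype(int)
        cols = np.linspace(0, crop.shape[1] - 1, crop_size[0]).astype(int)
        resized = crop[rows][:, cols]
        results.append(((x1, y1, x2, y2), classifier.run([resized])))
    return results
```

Both engines stay loaded for the life of the process; the sketch only sequences the calls and leaves NPU scheduling to the runtime.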


03 Sixfab Model Zoo

What is the Sixfab Model Zoo?

The Sixfab Model Zoo is a curated set of pre-compiled .dxnn models bundled with the sixfab-dx APT package. It includes a ready-to-run run_hello_world demo and validated builds for object detection, face detection, pose estimation, and instance segmentation. There is no compiler, no training pipeline, and no extra download: install sixfab-dx and you have working inference.

The Sixfab Model Zoo is the fastest way to validate your hardware and understand the inference pipeline before building your own application. The same runtime powers the demos and your production code, so any zoo demo can ship as-is or serve as a starting point.

Independent of the DXNN SDK

Sixfab Model Zoo demos use the same sixfab-dx runtime you already installed. There is no separate dependency to fetch or compiler to set up. Production deployments can use a zoo model directly without any compilation step.

Quick demo: run_hello_world

The sixfab-dx package includes a ready-to-run YOLOv8 object detection demo. It draws bounding boxes around cars, people, and other objects in real time using a bundled sample video, so no camera is needed to get started.

terminal bash
# Activate the bundled venv (if not already active)
source /opt/sixfab-dx/venv/bin/activate

# Launch the YOLOv8 demo
run_hello_world

Expected performance summary on the DEEPX DX-M1 (25 TOPS):

Performance summary OK
         PERFORMANCE SUMMARY
================================================
Pipeline Step   Avg Latency   Throughput
------------------------------------------------
Read              21.45 ms      46.6 FPS
Preprocess        14.11 ms      70.9 FPS
Inference        399.69 ms      16.0 FPS*
Postprocess        2.23 ms     449.0 FPS
Display           32.47 ms      30.8 FPS
------------------------------------------------
* Actual throughput via async inference
Overall FPS   :  16.0 FPS

To watch the NPU in real time, open a second SSH session and run dxtop. Core utilization should sit at 80 to 90 percent during active inference.
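
The asterisk in the performance summary above is easier to read with a little arithmetic. This sketch uses only the latencies printed in that summary; applying Little's law to them is an interpretation added here, not something the demo reports.

```python
# Stage latencies in milliseconds, copied from the performance summary
latency_ms = {
    "Read": 21.45,
    "Preprocess": 14.11,
    "Inference": 399.69,
    "Postprocess": 2.23,
    "Display": 32.47,
}

# A stage that handles one frame at a time sustains 1000 / latency frames per second
sequential_fps = {stage: 1000.0 / ms for stage, ms in latency_ms.items()}

# Inference reports 16.0 FPS despite ~400 ms latency, so requests must overlap.
# Little's law: requests in flight = throughput (1/s) * latency (s)
reported_inference_fps = 16.0
in_flight = reported_inference_fps * latency_ms["Inference"] / 1000.0

for stage, fps in sequential_fps.items():
    print(f"{stage:<12} {fps:6.1f} FPS if run one frame at a time")
print(f"Inference requests in flight: ~{in_flight:.1f}")
```

Run one frame at a time, inference alone would cap the pipeline near 2.5 FPS. The reported 16.0 FPS implies that roughly six requests are queued on the NPU at once, which is what the async note refers to.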

Available models

Pre-compiled .dxnn files included with the package. Performance figures are for the 25 TOPS DEEPX DX-M1 at default settings.

Model file | Task | Performance
YoloV8N.dxnn | Object detection (nano + PPU) | ~35 FPS
YoloV8S.dxnn | Object detection (small) | FPS pending
YoloV8M.dxnn | Object detection (medium) | FPS pending
YoloV9S.dxnn | Object detection | FPS pending
YoloV9C.dxnn | Object detection (compact) | FPS pending
SCRFD500M.dxnn | Face detection | FPS pending
YoloV5Pose.dxnn | Pose estimation | FPS pending
YoloV26S-Seg.dxnn | Instance segmentation | FPS pending
Always prefer PPU-compiled variants

PPU (Post-Processing Unit) models execute non-maximum suppression and confidence filtering on the NPU itself, removing a major CPU bottleneck. The YOLOv8n + PPU variant in the zoo reaches ~35 FPS for this reason.
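
To make concrete what a PPU-compiled model takes off the CPU, here is a minimal NumPy sketch of confidence filtering followed by greedy non-maximum suppression. It is a generic textbook version, not the DEEPX implementation; the default thresholds mirror the example .cfg values used elsewhere on this page, and the function name is illustrative.

```python
import numpy as np


def filter_and_nms(boxes, scores, conf_threshold=0.25, iou_threshold=0.45):
    """Return indices of boxes kept after confidence filtering and greedy NMS.

    boxes: (N, 4) array of x1, y1, x2, y2; scores: (N,) array.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    candidates = np.flatnonzero(scores >= conf_threshold)
    # Process candidates from highest to lowest score
    order = candidates[np.argsort(-scores[candidates])]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter + 1e-9)
        # Drop boxes that overlap the best box too much
        order = rest[iou <= iou_threshold]
    return keep
```

On a model compiled without PPU, a loop like this runs on the CM5 CPU for every frame, over thousands of candidate boxes.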

Supported camera inputs

The camera connects to the Raspberry Pi CM5 host; the NPU only receives preprocessed frames. Any of these sources work with zoo demos and with custom applications.

Source | Invocation | Notes
USB Webcam | --camera_index 0 | Any UVC-compatible USB camera. Change the index for multi-camera setups.
Video file | -v video.mp4 | MP4, AVI, MKV, and any other format OpenCV decodes.
RTSP stream | -v rtsp://<ip>/stream | IP cameras over RTSP. Multi-stream pipelines are supported in the GitHub examples.
RPi Camera Module | libcamera / picamera2 | Capture frames in C++ with libcamera or in Python with picamera2 and feed them into the inference pipeline.
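
If you want the same source selection in your own application, the first three rows map naturally onto a small argument parser. This is a sketch under assumptions: it reuses the --camera_index and -v flag names from the table, the `parse_source` helper is hypothetical, and defaulting to camera 0 is a choice made here, not documented demo behavior.

```python
import argparse


def parse_source(argv=None):
    """Map demo-style flags to a value that cv2.VideoCapture accepts."""
    parser = argparse.ArgumentParser(description="Select a frame source")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--camera_index", type=int, help="UVC camera index")
    group.add_argument("-v", dest="video", help="video file path or rtsp:// URL")
    args = parser.parse_args(argv)
    if args.video is not None:
        # File paths and RTSP URLs are both passed to OpenCV as strings
        return args.video
    # Fall back to camera 0 when neither flag is given (assumption of this sketch)
    return args.camera_index if args.camera_index is not None else 0
```

The return value can be handed straight to OpenCV, for example `cap = cv2.VideoCapture(parse_source())`.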

04 DEEPX Model Zoo

What is the DEEPX Model Zoo?

The DEEPX Model Zoo is the upstream catalog of pre-compiled .dxnn models maintained by DEEPX. It is broader than the curated Sixfab Model Zoo and covers object detection, image classification, semantic segmentation, face detection and recognition, pose estimation, and OCR. Models ship in Q-Lite (fast INT8 default) and Q-Pro (fine-tuned, higher accuracy) quantization modes. Browse and download at developer.deepx.ai/modelzoo.

Categories and representative models

  • Object Detection: Bounding-box detectors for real-time video analytics, safety monitoring, and inventory tracking. Models: YOLOv5, YOLOv7, YOLOv8, YOLOv11, YOLO26, YOLOX, SSD.
  • Image Classification: Multi-class image classifiers for product sorting, defect detection, and tagging pipelines. Models: ResNet-18/50, MobileNet v2/v3, EfficientNet-B0/B3.
  • Semantic Segmentation: Per-pixel masks for road-scene understanding, medical imaging, and surface inspection. Models: U-Net, DeepLabV3.
  • Face Detection & Recognition: Detection, alignment, and embedding models for access control and attendance systems. Models: RetinaFace, SCRFD, ArcFace.
  • Pose Estimation: 2D keypoint regression for workplace safety analytics and human activity monitoring. Models: YOLOv8-Pose, HRNet.
  • OCR: Text detection, classification, and recognition pipelines for document processing and industrial labels. Models: PaddleOCR det, PaddleOCR cls, PaddleOCR rec.

Architecture support

The Model Zoo catalog reflects what the DEEPX Runtime supports: image-based CNNs. Transformer-based models (ViT, DETR, LLMs) are not supported on the current runtime. Plan your architecture around the categories listed above.

Quantization modes: Q-Lite vs Q-Pro

Models in the zoo ship in two quantization flavors. Pick one based on your accuracy headroom and deployment timeline.

Mode | Use case | Trade-off
Q-Lite | Default choice. Standard INT8 quantization optimized for fast inference and short compile times. | Lower accuracy floor than Q-Pro on sensitive models; typically negligible for well-trained detectors.
Q-Pro | Accuracy-sensitive workloads. High-precision quantization with fine-tuning to recover accuracy close to FP32. | Longer compile time. Use for production models where every mAP point matters.
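
For intuition about where INT8 accuracy loss comes from, the sketch below applies plain symmetric per-tensor quantization to a random weight vector and measures the round-trip error. It is a generic illustration only: the actual Q-Lite and Q-Pro algorithms are not described on this page, and the helper names and weight distribution are assumptions.

```python
import numpy as np


def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: returns (int8 values, scale)."""
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor
    scale = float(np.max(np.abs(x))) / 127.0 or 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map INT8 values back to float32."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=1000).astype(np.float32)
q, scale = quantize_int8(weights)
max_error = float(np.max(np.abs(dequantize(q, scale) - weights)))
print(f"scale={scale:.6f}  max round-trip error={max_error:.6f}")
```

The round-trip error is bounded by half a quantization step (scale / 2). Per-layer errors like this accumulate through the network, and fine-tuning approaches such as Q-Pro aim to recover the accuracy they cost.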

How do I download a model from the Model Zoo?

Models are distributed as .dxnn files from the DEEPX developer portal. On the ALPON X5 AI, fetch the file directly with wget or curl and mount it into your container.

terminal bash
# 1. Keep models under /opt/models on the host
sudo mkdir -p /opt/models
cd /opt/models

# 2. Download a Q-Lite YOLOv8 nano model (example path)
sudo curl -L -O https://developer.deepx.ai/modelzoo/download/yolov8n_qlite.dxnn

Review the per-model license on the DEEPX developer portal before redistributing or shipping a commercial product derived from a zoo model.


05 Deploy a custom model

How do I deploy a custom AI model on the ALPON X5 AI?

Deploying a custom model takes six steps: (1) export to ONNX on your development machine, (2) compile to .dxnn on an Ubuntu x86_64 host with dx-com, (3) copy the artifact to the ALPON X5 AI, (4) write a Python inference script using dx_engine, (5) build and run it inside a privileged Docker container against /dev/dxrt0, and (6) verify with dxrt-cli -s or dxtop. To skip compilation entirely, use a Sixfab Model Zoo or DEEPX Model Zoo build instead.

Compiler host requirements

The dx-com compiler runs on a separate Ubuntu x86_64 machine. ARM and aarch64 hosts are not supported for compilation. The ALPON X5 AI itself is the deployment target, not the build host. Compile once per model on Ubuntu; the resulting .dxnn file then runs offline on any number of ALPON X5 AI devices.

  • CPU Architecture: amd64 (x86_64); arm64 not supported
  • RAM: ≥ 16 GB (required)
  • Disk: ≥ 8 GB free (toolchain + models)
  • Operating System: Ubuntu 20/22/24 (64-bit only)
  • glibc: ≥ 2.28 (check with ldd --version)
  • GPU: Not required (CPU compile)
Plan for ~2 hour compilation times

Compiling a typical vision model with dx-com takes approximately two hours. Run compilations overnight or on a CI worker. Once compiled, the .dxnn file loads instantly on the ALPON X5 AI and can be redeployed indefinitely without recompiling.
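
Because a failed two-hour compile is expensive, it can help to check the host against the requirements above before starting. The following is a minimal sketch using only the Python standard library on Linux; the helper names are hypothetical, and the RAM threshold is set to 15 GiB because a nominal 16 GB machine reports slightly less to the OS.

```python
import os
import platform
import shutil


def version_at_least(found, required):
    """Compare dotted version strings numerically, e.g. '2.31' >= '2.28'."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split(".") if part.isdigit())
    return as_tuple(found) >= as_tuple(required)


def check_compile_host(workdir="."):
    """Check the current machine against the dx-com host requirements."""
    libc_name, libc_version = platform.libc_ver()
    ram_gib = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    free_gib = shutil.disk_usage(workdir).free / 1024**3
    return {
        "arch_x86_64": platform.machine() in ("x86_64", "AMD64"),
        "glibc_2_28": libc_name == "glibc" and version_at_least(libc_version, "2.28"),
        # Nominal 16 GB machines report a little under 16 GiB
        "ram_16_gb": ram_gib >= 15,
        "disk_8_gb_free": free_gib >= 8,
    }


if __name__ == "__main__":
    for name, ok in check_compile_host().items():
        print(f"{name}: {'ok' if ok else 'FAIL'}")
```

It does not check the Ubuntu release; read /etc/os-release if you need that as well.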

Prerequisites

  • An ALPON X5 AI powered on and reachable over SSH, with ALPON X5 AI OS up to date.
  • sixfab-dx installed on the device per the DEEPX Runtime section.
  • Docker available on the device (included with ALPON X5 AI OS).
  • An Ubuntu x86_64 host meeting the requirements above for the compile step.
  • A .dxnn model, either downloaded from a Model Zoo or compiled with dx-com on your host PC.

Step-by-step walkthrough

1. Export your model to ONNX

On your host PC, export the trained model to ONNX with opset 13 or later. This example uses Ultralytics YOLOv8.

export.py python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(format="onnx", opset=13, simplify=True)
# produces yolov8n.onnx

2. Compile the ONNX graph with dx-com

Run the DEEPX compiler on your host PC to produce a .dxnn artifact. PPU support, which offloads bounding-box decoding and NMS to the NPU, is configured inside the model .cfg file rather than as a CLI flag.

bash shell
dx-com compile \
  --model yolov8n.onnx \
  --config yolov8n.cfg \
  --output yolov8n.dxnn

Enable PPU inside yolov8n.cfg:

yolov8n.cfg yaml
ppu:
  enabled: true
  type: yolo
  num_classes: 80
  conf_threshold: 0.25
  iou_threshold: 0.45

3. Copy the model to the ALPON X5 AI

Transfer the .dxnn artifact to a stable path on the device, typically /opt/models.

bash shell
scp yolov8n.dxnn alpon@<device-ip>:/opt/models/

4. Write an inference script

On the ALPON X5 AI, write a minimal Python script that loads the model and runs one inference pass. Save it as infer.py.

infer.py python
import cv2
from dx_engine import InferenceEngine

# 1) Load the compiled model onto the DX-M1 NPU
engine = InferenceEngine("/models/yolov8n.dxnn")

# 2) Prepare an input frame (BGR, 640x640 for YOLOv8n)
frame = cv2.imread("/models/sample.jpg")
if frame is None:
    # cv2.imread returns None instead of raising when the file is missing or unreadable
    raise FileNotFoundError("/models/sample.jpg could not be read")
input_tensor = cv2.resize(frame, (640, 640))

# 3) Run inference on the NPU
outputs = engine.run([input_tensor])

# 4) outputs is a list of numpy arrays; shape depends on the model
print("Output tensors:", [o.shape for o in outputs])

5. Build and run the container

Write a minimal Dockerfile that installs sixfab-dx and copies your script. Inside a fresh Debian base image, the Sixfab APT repository must be registered manually (the host ALPON X5 AI OS has it preconfigured, but a clean container does not).

Dockerfile docker
FROM debian:trixie-slim

RUN apt-get update && apt-get install -y \
      wget gnupg ca-certificates python3-opencv \
 && wget -qO - https://sixfab.github.io/sixfab_dx/public.gpg \
      | gpg --dearmor -o /usr/share/keyrings/sixfab-dx.gpg \
 && echo "deb [signed-by=/usr/share/keyrings/sixfab-dx.gpg] https://sixfab.github.io/sixfab_dx trixie main" \
      > /etc/apt/sources.list.d/sixfab-dx.list \
 && apt-get update && apt-get install -y sixfab-dx

COPY infer.py /app/infer.py
WORKDIR /app

# Use the pre-built Sixfab venv so dx_engine is importable
CMD ["/opt/sixfab-dx/venv/bin/python", "/app/infer.py"]

Build and run it on the device, mounting /opt/models and exposing the NPU device node.

terminal bash
docker build -t alpon-infer:latest .

docker run --rm --privileged \
  --device /dev/dxrt0 \
  -v /opt/models:/models \
  alpon-infer:latest

6. Monitor the NPU

In another SSH session, run dxrt-cli -s for a one-shot status check, or dxtop for a live, htop-like view. Utilization should spike while your container is running.

terminal bash
# One-shot status, firmware version, PCIe link state
dxrt-cli -s

# Live monitoring; expect 80 to 90% core utilization
dxtop

Common pitfalls

Symptom | Likely cause and fix
ImportError: No module named dx_engine | The script is running under the system Python instead of the bundled venv. Invoke /opt/sixfab-dx/venv/bin/python directly, or source /opt/sixfab-dx/venv/bin/activate first.
Cannot open /dev/dxrt0 inside container | Missing --privileged or --device /dev/dxrt0. Add both to docker run or the compose file.
Model fails to load, error references memory | Compiled .dxnn footprint exceeds 4 GB NPU memory. Recompile with a smaller input resolution or switch to a lighter model variant.
Extremely low FPS on YOLO | Model compiled without PPU support. Add a ppu block to the .cfg file and recompile.
Detections look right but accuracy is degraded | Color-channel mismatch. OpenCV loads frames as BGR; most models train on RGB. Convert explicitly with cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) before inference.
Compile fails with unsupported operator | The ONNX graph contains an attention block or transformer operator. Switch to a CNN-based architecture; transformers are not supported on the current runtime.
Runtime reports a version mismatch on model load | An SDK update introduced breaking changes. Recompile the .dxnn file with the matching dx-com version.
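
The color-channel pitfall is easy to reproduce and fix without OpenCV. For a 3-channel frame, reversing the last axis gives the same result as cv2.cvtColor(frame, cv2.COLOR_BGR2RGB). This is a small sketch with a hypothetical helper name; whether your model expects RGB depends on how it was trained and compiled.

```python
import numpy as np


def bgr_to_rgb(frame):
    """Reverse the channel axis of an HxWx3 BGR frame and return a contiguous RGB copy."""
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError(f"expected HxWx3 frame, got shape {frame.shape}")
    # The slice alone is a strided view; copy it so downstream code sees contiguous memory
    return np.ascontiguousarray(frame[:, :, ::-1])


# A single pure-blue pixel in BGR order becomes (0, 0, 255) in RGB order
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
print(rgb.tolist())
```

Apply the conversion once, right after capture, so that every later stage agrees on the channel order.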

06 Performance & benchmarks

What FPS can I expect on the ALPON X5 AI DX-M1?

With PPU support enabled, a YOLO-nano class detector reaches approximately 50 FPS at 1280 × 720 and 20 to 25 FPS at 1920 × 1080 on the DEEPX DX-M1. Larger YOLO variants scale down proportionally. Actual throughput depends on the model variant, input resolution, pre-processing path, and whether PPU is compiled into the graph.

YOLO throughput reference

Input resolution | Approximate FPS (PPU-compiled YOLO nano) | Notes
1280 × 720 (HD) | ~50 FPS | Real-time processing for single-stream HD video analytics.
1920 × 1080 (Full HD) | ~20 to 25 FPS | Headroom for additional CPU-side pre- and post-processing.
Compile with PPU whenever you can

PPU-compiled models handle bounding-box decoding and NMS on the NPU, which significantly reduces CPU overhead. Without PPU, post-processing runs on the CM5 CPU and can become the bottleneck on Full HD streams. Most YOLO variants support PPU compilation.

How to benchmark your own model

Use a simple loop around engine.run() and measure wall-clock time over a warm window. Skip the first ~10 iterations to avoid warm-up noise.

bench.py python
import time

import numpy as np
from dx_engine import InferenceEngine

engine = InferenceEngine("/models/yolov8n.dxnn")
# randint's upper bound is exclusive, so 256 covers the full uint8 range
dummy = np.random.randint(0, 256, (640, 640, 3), dtype=np.uint8)

# Warm-up
for _ in range(10):
    engine.run([dummy])

# Timed window
N = 200
t0 = time.perf_counter()
for _ in range(N):
    engine.run([dummy])
dt = time.perf_counter() - t0
print(f"FPS: {N/dt:.1f} (n={N})")
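
Mean FPS hides tail latency, which matters for real-time pipelines. As a possible extension of the loop above, this sketch times any zero-argument callable and reports percentiles as well. The `benchmark` helper is hypothetical and uses only the standard library, so you can try it without an NPU attached.

```python
import statistics
import time


def benchmark(run_once, iterations=200, warmup=10):
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    for _ in range(warmup):
        run_once()  # discard warm-up iterations
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    mean_ms = statistics.fmean(samples)
    return {
        "mean_ms": mean_ms,
        "p50_ms": samples[len(samples) // 2],
        "p99_ms": samples[min(len(samples) - 1, int(len(samples) * 0.99))],
        "fps": 1000.0 / mean_ms,
    }
```

On the device you would call it as `benchmark(lambda: engine.run([dummy]))`, reusing the engine and dummy input from bench.py.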

Variables that move the numbers

  • Model variant. Nano tier is the fastest; small, medium, and large variants drop FPS roughly proportional to compute.
  • Input resolution. Doubling both width and height quadruples the pixel count and roughly quarters FPS on detection models.
  • PPU compilation. On or off is usually the largest single factor for YOLO.
  • Quantization mode. Q-Lite runs slightly faster than Q-Pro; choose Q-Pro only if you measure an accuracy regression.
  • Async inference. Use the runtime's async API to overlap NPU compute with CPU pre- and post-processing. Submit the next frame while the previous one is still on the NPU.
  • Concurrent models. Running a detector and a classifier simultaneously shares NPU time; each runs slower than in isolation.
  • Pre- and post-processing path. OpenCV decode plus color conversion on the CPU can bottleneck high-resolution pipelines. Consider GStreamer with hardware decode for sustained Full HD.
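
The resolution factor in the list above can be turned into a quick estimate. This sketch assumes throughput is inversely proportional to pixel count, a first-order approximation rather than a guarantee, and starts from the ~50 FPS HD reference figure on this page; the function name is illustrative.

```python
def scale_fps_by_pixels(reference_fps, reference_res, target_res):
    """Estimate FPS at target_res, assuming throughput scales inversely with pixel count."""
    ref_pixels = reference_res[0] * reference_res[1]
    target_pixels = target_res[0] * target_res[1]
    return reference_fps * ref_pixels / target_pixels


estimate = scale_fps_by_pixels(50.0, (1280, 720), (1920, 1080))
print(f"Estimated Full HD throughput: {estimate:.1f} FPS")
```

Full HD has 2.25 times the pixels of HD, so the estimate is about 22 FPS, inside the 20 to 25 FPS range quoted in the throughput reference. Treat it as a planning number and confirm with a measurement.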

Hard limits

  • NPU memory ceiling: 4 GB on-chip. Models whose compiled footprint exceeds 4 GB will not load.
  • NPU power envelope: 2 W minimum, 5 W maximum under supported AI workloads.
  • Interface bandwidth: PCIe Gen3 x2, shared with the NVMe SSD via the on-board ASM2806I packet switch.
  • Supported architectures: Image-based CNNs. Transformer-based models are not supported on the current runtime.
  • Power-mode control: None. The NPU runs at a fixed performance profile; there is no software API for low-power or performance modes.
Benchmarks are workload-dependent

Published FPS numbers are single-stream, dummy-input references. Real-world pipelines add frame capture, decode, color conversion, and result rendering, all of which consume CPU cycles on the CM5. Always measure end-to-end throughput under your exact workload before committing to a deployment budget.


07 Examples & references

Working code is the fastest way to evaluate the ALPON X5 AI for a new use case. The Sixfab DX examples repository contains 54 ready-to-run demos (28 Python, 26 C++) covering object detection across all supported YOLO variants, instance and semantic segmentation, pose estimation, OCR, face detection, PPU-accelerated and async pipelines, and analytics applications such as zone intrusion, people tracking, traffic counting, and queue analysis.

sixfab/sixfab-dx-examples

28 Python and 26 C++ demos with a TUI launcher. Object detection, segmentation, pose estimation, OCR, and analytics pipelines.

Quick demo run

After sixfab-dx is installed, clone the examples repository and run the auto-installer to fetch models and build the C++ demos.

terminal bash
git clone https://github.com/sixfab/sixfab-dx-examples
cd sixfab-dx-examples
./auto-install.sh

# Activate the bundled venv before running Python demos
source /opt/sixfab-dx/venv/bin/activate

# Launch the interactive Python demo menu
bash python_examples/start.sh

# Or the C++ menu
bash cpp_examples/start.sh

External references