AI Runtimes

SPOT-Tuned AI Runtime

SPOT®-tuned hardware-software synergy for ultra-low-power AI on Apollo. Runtime tools include heliaRT for drop-in, backward-compatible speedups, heliaAOT for compile-time optimization and the smallest, most power-efficient builds, and more. heliaRT and heliaAOT are both co-designed with Apollo silicon for kernel optimization, memory planning, and dataflow, so models run efficiently on real devices, not just in benchmarks.

AI Runtimes Highlights

SPOT-Tuned Synergy

Co-designed with Apollo silicon. Kernel optimizations, memory planning, and dataflow tuned to SPOT so models run efficiently on real devices—not just in benchmarks.

Drop-In or Dialed-In

Choose heliaRT for TFLM-compatible, drop-in speedups—or heliaAOT for compile-time control with dials for size, speed, and power.

Smaller Footprint, Faster Starts

heliaAOT emits only what your model needs; heliaRT trims overhead vs baseline TFLM. Result: lean binaries, quick init, more room for features.

Performance per Microamp

SPOT-tuned runtimes squeeze latency and memory while minimizing energy draw, so real-time AI runs comfortably on battery—perfect for wearables, wellness, and smart audio.

AI Runtime Comparisons

| Aspect | TFLM | heliaRT (Optimized TFLM) | heliaAOT (Ahead-of-Time) |
| --- | --- | --- | --- |
| Primary Fit | Portable baseline micro-inference | Drop-in, Apollo-tuned upgrade | Max efficiency & control for production |
| Performance on Apollo | Good baseline | Faster via SPOT-tuned kernels & planning | Fastest via compile-time specialization/fusion |
| Memory Footprint | Moderate interpreter + ops | Leaner than TFLM | Smallest; emits only what the model needs |
| Deployment & Updates | Load .tflite at runtime; very easy swaps | Same .tflite flow; backward-compatible with TFLM | Compiled artifact (C/obj/bin); update requires rebuild |
| Maturity / “Battle-Tested” | Highest (widely used) | High (production-hardened on Apollo) | Newer (rapidly maturing) |
| Op & dtype coverage | Broad ops; int8/int16/int32 | Broad ops; int8/int16/int32 (TFLM-compatible) | Focused ops; int8/int16 (no int32) |
| Optimization Control | Limited runtime knobs | Apollo-aware planners & tuned kernels | Most control: size/speed/power, layout/schedule |
| Best For | Rapid prototyping & portability | Products with frequent model refreshes | Locked-down SKUs with tight latency, memory, energy targets |
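
The comparison above can be turned into a rough first-pass check. The sketch below is illustrative only: `Project`, `unsupported_dtypes`, and `suggest_runtime` are hypothetical names, not part of heliaRT, heliaAOT, or TFLM, and the decision order is an assumption drawn from the "Op & dtype coverage" and "Best For" rows. It does not check operator coverage, which still has to be verified against each runtime's supported operator list.

```python
from dataclasses import dataclass

# Data-type coverage copied from the "Op & dtype coverage" row of the table.
SUPPORTED_DTYPES = {
    "TFLM": {"int8", "int16", "int32"},
    "heliaRT": {"int8", "int16", "int32"},
    "heliaAOT": {"int8", "int16"},
}


@dataclass
class Project:
    """Hypothetical project description; not part of any Ambiq tool."""
    model_dtypes: set               # dtypes the model uses, e.g. {"int8"}
    frequent_model_updates: bool    # True if the model is refreshed often
    needs_portability: bool = False  # True if the model must run beyond Apollo


def unsupported_dtypes(runtime: str, model_dtypes: set) -> set:
    """Return the model dtypes that the table does not list for `runtime`."""
    return set(model_dtypes) - SUPPORTED_DTYPES[runtime]


def suggest_runtime(project: Project) -> str:
    """Suggest a starting point using the table's 'Best For' guidance.

    Assumed decision order (a simplification, not vendor guidance):
    portability first, then update frequency, then dtype coverage.
    """
    if project.needs_portability:
        return "TFLM"      # portable baseline
    if project.frequent_model_updates:
        return "heliaRT"   # same .tflite flow, easy model swaps
    if unsupported_dtypes("heliaAOT", project.model_dtypes):
        return "heliaRT"   # model uses a dtype outside int8/int16
    return "heliaAOT"      # locked-down model, tightest targets
```

For example, `suggest_runtime(Project({"int8"}, frequent_model_updates=False))` returns `"heliaAOT"`, while adding `"int32"` to the dtype set makes the same call return `"heliaRT"`.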

Frequently Asked Questions (FAQ)

What’s the difference between heliaRT and heliaAOT?

heliaRT™ is a drop-in, TFLM-compatible runtime: you load the same .tflite file and get SPOT®-tuned kernel and memory-planning speedups with no code changes. heliaAOT™ is an ahead-of-time compiler: it compiles your model into a specialized binary before deployment, trading easy updates (it needs a rebuild when the model changes) for the smallest footprint and highest performance.

Which runtime should I choose for my project?

Choose heliaRT™ if your product ships with frequent model refreshes and you want backward-compatible, drop-in speed. Choose heliaAOT™ if you’re shipping a locked-down SKU with tight latency, memory, or energy targets and can accept that updates require a rebuild.

Is heliaRT compatible with existing TensorFlow Lite Micro (TFLM) models?

Yes. heliaRT™ uses the same .tflite loading flow as TFLM and is fully backward-compatible, so existing TFLM models and workflows carry over with no changes required.

Does switching to heliaAOT require recompiling my model every time it changes?

Yes. heliaAOT™ compiles your model ahead of time into a fixed artifact (C/object/binary), so any change to the model requires a rebuild. It is best suited to a model that’s already finalized rather than one still under active iteration.

How much smaller or faster is heliaAOT compared to standard TFLM?

heliaAOT™ emits only the code a specific model needs, giving it the smallest memory footprint of the three options, and its compile-time specialization makes it the fastest on Apollo silicon. Exact figures depend on the model.

Does heliaRT or heliaAOT work with any Apollo SoC, or only specific generations?

Both are SPOT®-tuned specifically for Apollo silicon and built to work across the Apollo lineup. Note that heliaAOT™ currently supports a narrower operator and data-type set (int8/int16, no int32) than heliaRT™ or TFLM, so unusual model architectures should be checked against its supported operator list first.

What does “SPOT-tuned” mean for these runtimes?

SPOT®-tuned means heliaRT™ and heliaAOT™ are co-designed specifically with Ambiq’s SPOT sub-threshold power platform and Apollo silicon. Their kernel optimizations, memory planning, and dataflow are built around how Apollo chips actually run, not a generic microcontroller target. That’s what lets both runtimes deliver real efficiency gains on real devices, not just in synthetic benchmarks.