Primary Fit | Portable baseline micro-inference | Drop-in, Apollo-tuned upgrade | Max efficiency & control for production
Performance on Apollo | Good baseline | Faster via SPOT-tuned kernels & planning | Fastest via compile-time specialization/fusion
Memory Footprint | Moderate (interpreter + ops) | Leaner than TFLM | Smallest; emits only what the model needs
Deployment & Updates | Load .tflite at runtime; very easy swaps | Same .tflite flow; backward-compatible with TFLM | Compiled artifact (C/obj/bin); update requires rebuild
Maturity / “Battle-Tested” | Highest (widely used) | High (production-hardened on Apollo) | Newer (rapidly maturing)
Op & dtype coverage | Broad ops; int8/int16/int32 | Broad ops; int8/int16/int32 (TFLM-compatible) | Focused ops; int8/int16 (no int32)
Optimization Control | Limited runtime knobs | Apollo-aware planners & tuned kernels | Most control: size/speed/power, layout/schedule
Best For | Rapid prototyping & portability | Products with frequent model refreshes | Locked-down SKUs with tight latency, memory, energy targets
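
One way to read the comparison above is as a small decision rule. The sketch below is only an illustration of that reading, not an official selection procedure. The labels `portable_baseline`, `apollo_tuned`, and `compiled`, along with the function and parameter names, are hypothetical stand-ins for the three columns, and the order in which the criteria are checked is an assumption.

```python
def suggest_option(*, needs_portability=False, needs_int32=False,
                   frequent_model_refreshes=False, tight_budgets=False):
    """Return a hypothetical label for one of the three compared options.

    The labels and the priority order are illustrative assumptions
    derived from the comparison rows, not names used by any toolchain.
    """
    # "Best For: rapid prototyping & portability" points at the baseline.
    if needs_portability:
        return "portable_baseline"

    # The compiled option lists int8/int16 only (no int32) and needs a
    # rebuild on every model update, so rule it out in those cases.
    compiled_ok = not needs_int32 and not frequent_model_refreshes

    # "Locked-down SKUs with tight latency, memory, energy targets".
    if tight_budgets and compiled_ok:
        return "compiled"

    # Otherwise fall back to the drop-in, Apollo-tuned runtime, which
    # keeps the .tflite flow and suits frequent model refreshes.
    return "apollo_tuned"


if __name__ == "__main__":
    print(suggest_option(tight_budgets=True))
    print(suggest_option(tight_budgets=True, needs_int32=True))
```

In this reading, a requirement for int32 or for frequent model swaps overrides the tight-budget preference, because the compiled column excludes int32 and requires a rebuild for each update.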