Summary
Kairos is a 4B-parameter open world model from ACE Robotics that natively integrates multimodal understanding, generation, and action prediction in a single hybrid linear attention architecture. It learns via a Native Pre-training Paradigm governed by a Cross-Embodiment Data Curriculum, which organizes open-world videos, human behavioral data, and robot interactions into a progressive developmental pathway. Formal error-accumulation bounds, derived via temporal factorization, guarantee stable long-horizon state propagation, and a Deployment-Aware System Co-Design enables real-time edge inference across embodiments. Kairos-4B ranks first on WorldModelBench Robot with a score of 9.30, outperforming models with 14B, 16B, and 28B parameters.
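This summary does not state the error-accumulation bound itself. As a generic illustration only (a textbook-style argument, not the paper's result), the following shows how controlling the error of each step can bound long-horizon drift. The symbols $f$, $\hat f$, $L$, $\varepsilon$, and $e_t$ are assumptions introduced here: $f$ is the true one-step transition, $\hat f$ the learned one, $L$ a Lipschitz constant of $\hat f$, $\varepsilon$ a bound on the one-step error, and $e_t = \lVert \hat s_t - s_t \rVert$ the state error after $t$ steps with $e_0 = 0$.

$$
e_{t+1} = \lVert \hat f(\hat s_t) - f(s_t) \rVert
\le \lVert \hat f(\hat s_t) - \hat f(s_t) \rVert + \lVert \hat f(s_t) - f(s_t) \rVert
\le L\, e_t + \varepsilon
$$

$$
e_T \le \varepsilon \sum_{k=0}^{T-1} L^{k} = \varepsilon\, \frac{L^{T} - 1}{L - 1} \quad (L \ne 1),
\qquad
e_T \le \frac{\varepsilon}{1 - L} \ \text{ for all } T \text{ when } L < 1 .
$$

Under these assumptions the error stays bounded at any horizon when the learned transition is contractive ($L < 1$), and it can grow geometrically otherwise.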
Key Contributions
- Hybrid linear attention (sliding window + dilated windows + gated linear attention) enabling real-time edge inference
- Native Pre-training with Cross-Embodiment Data Curriculum for progressive learning across data scales
- Theoretical error-accumulation bounds via temporal factorization for long-horizon stability
- #1 on WorldModelBench Robot (9.30), surpassing models 3.5–7× larger (14B to 28B parameters); also the best open model for Text-to-Image and Image-to-Video according to Artificial Analysis
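
The three components named in the first contribution can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Kairos implementation: the function names, the `window` and `dilation` parameters, the way the two local patterns are combined, and the use of a single scalar gate per time step are all hypothetical choices made here for clarity.

```python
def hybrid_local_mask(seq_len, window, dilation):
    """Causal boolean mask combining a sliding window with a dilated window.

    mask[i][j] is True when query position i may attend to key position j.
    Hypothetical combination rule: the union of the two local patterns.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            offset = i - j  # how far key j lies behind query i
            in_window = 0 <= offset < window
            on_dilated_grid = (
                0 <= offset < window * dilation and offset % dilation == 0
            )
            row.append(in_window or on_dilated_grid)
        mask.append(row)
    return mask


def gated_linear_attention(queries, keys, values, gates):
    """Recurrent gated linear attention with a scalar gate per step (assumed).

    State update:  S_t = g_t * S_{t-1} + k_t^T v_t
    Output:        o_t = q_t S_t
    The state has fixed size d_k x d_v, so cost per step does not grow
    with sequence length.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v, g in zip(queries, keys, values, gates):
        # Decay the running state by the gate, then add the new key-value outer product.
        state = [
            [g * state[a][b] + k[a] * v[b] for b in range(d_v)]
            for a in range(d_k)
        ]
        # Read out with the query: o_t[b] = sum_a q[a] * S_t[a][b].
        outputs.append(
            [sum(q[a] * state[a][b] for a in range(d_k)) for b in range(d_v)]
        )
    return outputs
```

The local mask keeps attention cost proportional to the window size rather than the full sequence, while the gated recurrence carries long-range context in a constant-size state. Both properties suit real-time edge inference, although how Kairos actually interleaves these components is not specified in this summary.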
Significance
Kairos demonstrates that a purpose-built 4B architecture for physical AI can outperform much larger general-purpose world models on embodied benchmarks while running in real time on edge hardware, establishing a new efficiency frontier for open robotic world models.