Summary
This survey from a multi-institution team (Nanyang Technological University, UC Berkeley, Stanford, and others) reviews world models, defined as predictive representations of how environments evolve under actions, and covers their role in policy learning, planning, simulation, evaluation, and data generation. It traces the field’s rapid progression from imagination-based generation through controllable, structured, and foundation-scale formulations, and connects robotic world models to navigation and autonomous driving.
Key Contributions
- Unifying framework defining three core world-model capabilities: foresight, imagination-driven planning, and data amplification
- Comprehensive taxonomy covering model-based RL, video generation, latent prediction, and foundation-scale approaches
- Companion GitHub repository (NTUMARS/Awesome-World-Model-for-Robotics-Policy) and project website with continuously updated paper lists
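As a rough illustration of the three capabilities named in the first bullet, the sketch below uses a toy 1-D environment. Every name in it (world_model, rollout, plan, amplify, GOAL, ACTIONS) is hypothetical and not taken from the survey. The hand-written dynamics stand in for a learned model, and exhaustive search stands in for whatever planner a real system would use.

```python
from itertools import product

GOAL = 5.0                  # hypothetical target position on a 1-D line
ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical discrete action set


def world_model(state, action):
    """Stand-in for a learned model: predicts the next state and reward."""
    next_state = state + action
    reward = -abs(next_state - GOAL)
    return next_state, reward


def rollout(state, actions):
    """Foresight: predict the state sequence and total reward of a plan."""
    states, total = [state], 0.0
    for a in actions:
        state, r = world_model(state, a)
        states.append(state)
        total += r
    return states, total


def plan(state, horizon=3):
    """Imagination-driven planning: score every candidate action sequence
    inside the model and return the first action of the best one."""
    best = max(product(ACTIONS, repeat=horizon),
               key=lambda seq: rollout(state, seq)[1])
    return best[0]


def amplify(seed_states, horizon=3):
    """Data amplification: generate synthetic (s, a, r, s') transitions
    by running the planner inside the model from a few seed states."""
    data = []
    for s in seed_states:
        for _ in range(horizon):
            a = plan(s, horizon)
            s_next, r = world_model(s, a)
            data.append((s, a, r, s_next))
            s = s_next
    return data


if __name__ == "__main__":
    print(rollout(0.0, (1.0, 1.0)))
    print(plan(0.0))
    print(amplify([0.0, 7.0]))
```

In this toy setting the same predictive function serves all three roles: the environment itself is never queried, only the model. That reuse is the point of the sketch, not the specific dynamics or the brute-force search.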
Significance
The survey arrives as world models are transitioning from academic curiosity to production-scale robot training infrastructure. It gives researchers and practitioners a structured map of the landscape and identifies open challenges.