DDP-WM: Disentangled Dynamics Prediction for Efficient World Models

Summary

DDP-WM introduces the Disentangled Dynamics Prediction (DDP) principle: latent state evolution is decomposed into sparse primary dynamics (driven by physical interactions) and secondary, context-driven background updates. The decomposition is realized through dynamic localization, which isolates the foreground primary dynamics, together with a cross-attention mechanism and a Low-Rank Correction Module (LRM) that handle the background updates. The result is a world model that is ~9× faster at inference than dense Transformer baselines while achieving higher task success.
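To make the decomposition concrete, here is a minimal NumPy sketch of one disentangled update over a set of latent tokens. It is an illustration under stated assumptions, not the paper's implementation: the function name `ddp_step`, the change-magnitude localization rule, the linear residual predictor `W_primary`, and the way the low-rank factors `U` and `V` are applied to the cross-attention output are all hypothetical stand-ins.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def ddp_step(z_prev, z_curr, W_primary, U, V, k):
    """One hypothetical disentangled update over N latent tokens of width d."""
    n, d = z_curr.shape

    # 1) Dynamic localization (assumed rule): the k tokens that changed the
    #    most between the last two frames are treated as foreground.
    change = np.linalg.norm(z_curr - z_prev, axis=1)
    fg = np.argsort(change)[-k:]
    bg = np.setdiff1d(np.arange(n), fg)

    z_next = z_curr.copy()

    # 2) Primary dynamics: a placeholder residual linear predictor that is
    #    applied to the foreground tokens only.
    z_next[fg] = z_curr[fg] + z_curr[fg] @ W_primary

    # 3) Background update: background tokens (queries) cross-attend to the
    #    updated foreground tokens (keys and values).
    attn = softmax(z_curr[bg] @ z_next[fg].T / np.sqrt(d), axis=-1)
    context = attn @ z_next[fg]

    # 4) Low-rank correction (assumed form): push the context through a
    #    rank-r bottleneck, U is (d, r) and V is (r, d).
    z_next[bg] = z_curr[bg] + (context @ U) @ V
    return z_next, fg

# Toy usage: 16 tokens, only tokens 3 and 7 move between frames.
rng = np.random.default_rng(0)
N, d, r = 16, 8, 2
z_prev = rng.normal(size=(N, d))
z_curr = z_prev.copy()
z_curr[[3, 7]] += 1.0
W_primary = 0.1 * rng.normal(size=(d, d))
U = 0.1 * rng.normal(size=(d, r))
V = 0.1 * rng.normal(size=(r, d))
z_next, fg = ddp_step(z_prev, z_curr, W_primary, U, V, k=2)
```

In this toy setup the expensive predictor touches only the `k` localized tokens, while every background token receives a cheap correction whose rank is bounded by `r`.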

Key Contributions

Disentangled Dynamics Prediction principle separating foreground physical interactions from background context
Four-stage decoupled process whose stages include dynamic localization → primary predictor → LRM background update
~9× inference speedup on the Push-T task vs. state-of-the-art dense models
Success rate improvement from 90% to 98% on Push-T with MPC planning
Validated across navigation, tabletop manipulation, deformable-object, and multi-body interaction tasks
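As a rough intuition for why restricting the predictor to a sparse foreground can cut inference cost, the sketch below compares attention cost for a dense model with that of a foreground-only scheme. It assumes attention cost scales as queries × keys × width and ignores projections, MLP layers, and localization overhead. The token counts are hypothetical and are not taken from the paper, so the resulting ratio is not a reproduction of the reported ~9× figure.

```python
def attention_cost(n_query, n_key, d):
    # Approximate multiply-adds for the score matrix plus the weighted sum
    # of values: roughly 2 * n_query * n_key * d.
    return 2 * n_query * n_key * d

def dense_cost(n, d):
    # Every token attends to every token.
    return attention_cost(n, n, d)

def disentangled_cost(n, k, d):
    # Foreground tokens attend among themselves; background tokens
    # cross-attend to the k foreground tokens only.
    foreground = attention_cost(k, k, d)
    background = attention_cost(n - k, k, d)
    return foreground + background

# Hypothetical sizes: 256 latent tokens, 32 of them localized as foreground.
N_TOKENS, K_FOREGROUND, WIDTH = 256, 32, 384
ratio = dense_cost(N_TOKENS, WIDTH) / disentangled_cost(N_TOKENS, K_FOREGROUND, WIDTH)
print(f"attention-only cost ratio: {ratio:.1f}x")
```

Under these simplifying assumptions the ratio reduces to N / k, so the saving grows as the foreground becomes a smaller fraction of the scene.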

Significance

Addresses the computational bottleneck of dense Transformer world models while improving performance, making real-time world-model-based planning practical for a wider range of robotic systems.
