Weekly Research Digest — 2026-05-18

11 new entries this week across 3 topic areas.


Vision-Language-Action (VLA) Models

Release | Venue | Significance
alam-algebraically-consistent-latent-action-model-vla ALAM | arXiv 2605.10819 | Algebraic structure on latent actions lifts MetaWorld MT50 success 47.9% → 85.0%
vla-forget-vision-language-action-unlearning-embodied VLA-Forget | arXiv 2604.03956 | First machine-unlearning framework targeting VLA models for safe post-deployment correction
from-pixels-to-tokens-latent-action-supervision-vla From Pixels to Tokens | arXiv 2605.04678 | Systematic study showing that image-based and action-based latent supervision are each best suited to different task types
defi-disentangled-robot-learning-forward-inverse-dynamics-pretraining DeFI | ICLR 2026 | Decouples visual forward/inverse dynamics pretraining, enabling action-free web video exploitation for VLAs

World Models for Robotics

Release | Venue | Significance
world-model-for-robot-learning-comprehensive-survey World Model Survey | arXiv 2605.00080 | Comprehensive multi-institution survey unifying world model roles in policy learning, planning, and data generation
lawm-least-action-world-models-long-horizon-physical-consistency LaWM | arXiv 2605.08279 | Variational integrator grounded in Principle of Least Action for physically consistent long-horizon prediction
one-token-per-frame-visual-bandwidth-world-models-vla-policy One Token Per Frame | arXiv 2605.07931 | Compresses world-model visual stream to 1 token/frame via adaptive pooling without performance loss
physically-native-world-models-hamiltonian-perspective Physically Native WMs | arXiv 2605.00412 | Proposes Hamiltonian World Models unifying video, 3D, and latent approaches under classical mechanics principles

Reinforcement Learning for Robotics

Release | Venue | Significance
scaling-sim-to-real-rl-robot-vla-generative-3d-worlds Scaling Sim-to-Real RL | arXiv 2603.18532 | Generative 3D worlds automate scene diversity for VLA RL fine-tuning; real-world success 21.7% → 75%
grounding-sim-to-real-generalization-dexterous-manipulation-vla Grounding Sim-to-Real | arXiv 2603.22876 | Rigorous empirical study across four sim-to-real axes for VLA dexterous manipulation policies
twinrl-vla-digital-twin-driven-rl-robotic-manipulation TwinRL-VLA | arXiv 2602.09023 | Smartphone-captured digital twin enables 100% real-world RL success in ~20 minutes per task

Generated automatically. All entries verified via web search.