Summary

LabVLA adapts vision-language-action (VLA) models to scientific laboratory automation, where policies trained on household or tabletop demonstrations fail because of transparent liquids, specialized instruments, and rigid protocol workflows. It pairs a Qwen3-VL backbone with a DiT-based action expert to map visual observations, robot state, and written lab protocols to continuous action chunks.
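To make "continuous action chunks" concrete, here is a minimal sketch of how a flow-matching action head could turn noise into a chunk at inference time. It is not LabVLA's implementation. The function name sample_action_chunk, the velocity_fn callback (a stand-in for the DiT-based action expert, which in a real policy would also be conditioned on images, robot state, and protocol text), the chunk shape, and the Euler step count are all assumptions chosen for illustration.

```python
import numpy as np


def sample_action_chunk(velocity_fn, horizon, action_dim, num_steps=10, rng=None):
    """Integrate a velocity field from noise (t=0) to an action chunk (t=1).

    velocity_fn(x, t) is a hypothetical stand-in for the action expert.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal((horizon, action_dim))  # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)  # forward Euler step
    return x  # shape (horizon, action_dim)


# Toy check with an assumed chunk of 8 timesteps of a 7-dimensional command.
target = np.zeros((8, 7))


def toy_velocity(x, t):
    # Exact velocity of the straight-line path that ends at `target` when t=1.
    return (target - x) / (1.0 - t)


chunk = sample_action_chunk(toy_velocity, horizon=8, action_dim=7,
                            rng=np.random.default_rng(0))
```

The toy velocity field always points at a fixed target, so the integrated chunk should land on that target. A learned action expert would replace toy_velocity, and the time direction (noise at t=0, actions at t=1) is a convention that some implementations reverse.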

Key Contributions

  • First VLA pipeline targeting scientific laboratory protocols and diverse lab robot embodiments
  • Two-stage training: action token pretraining followed by flow-matching policy learning
  • Simulated scientific workspaces that capture lab-specific objects and transparent-liquid dynamics
  • Higher performance than generic VLA baselines on laboratory protocol benchmarks
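
The second training stage named above, flow-matching policy learning, can be sketched as a regression loss on velocities. This is a generic illustration under stated assumptions, not the authors' training code: the linear noise-to-action path, the uniform time sampling, the array shapes, and the names flow_matching_loss and velocity_fn are all hypothetical. The first stage (action token pretraining) is not shown because the summary does not say how actions are tokenized.

```python
import numpy as np


def flow_matching_loss(velocity_fn, actions, rng):
    """Mean-squared error between predicted and target velocities.

    actions: demonstration chunks, shape (batch, horizon, action_dim).
    velocity_fn(x_t, t): hypothetical stand-in for the action expert.
    """
    batch = actions.shape[0]
    noise = rng.standard_normal(actions.shape)
    t = rng.uniform(size=(batch, 1, 1))      # one time in [0, 1) per sample
    x_t = (1.0 - t) * noise + t * actions    # linear path from noise to actions
    target_velocity = actions - noise        # derivative of x_t with respect to t
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target_velocity) ** 2))


# Toy data: 4 chunks of 8 timesteps with 7 action dimensions (assumed sizes).
demo_actions = np.random.default_rng(0).standard_normal((4, 8, 7))


def oracle_velocity(x_t, t):
    # Knows the demonstrations, so it can recover the exact target velocity.
    return (demo_actions - x_t) / (1.0 - t)


def zero_velocity(x_t, t):
    # Untrained baseline that predicts no motion at all.
    return np.zeros_like(x_t)


oracle_loss = flow_matching_loss(oracle_velocity, demo_actions, np.random.default_rng(1))
zero_loss = flow_matching_loss(zero_velocity, demo_actions, np.random.default_rng(1))
```

The oracle drives the loss to roughly zero while the zero predictor does not, which is the behavior a trained action expert would be expected to approach. In practice the predictor would be a neural network updated by gradient descent in an autodiff framework rather than a NumPy function.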

Significance

Opens a new application domain for VLAs in life science and chemistry laboratory automation, where precise manipulation of delicate instruments and liquids is essential.