Embodied Robotics Research

Tag: preference-optimization

1 item with this tag.

  • Jun 03, 2026

    FlowPRO: Reward-Free Reinforced Fine-Tuning of Flow-Matching VLAs via Proximalized Preference Optimization

    • vla
    • reinforcement-fine-tuning
    • flow-matching
    • preference-optimization
    • reward-free
    • tencent-robotics

