Sufficiency of real‑world teleoperation–trained foundation models for general manipulation

Determine whether the paradigm of collecting large-scale real-world teleoperation datasets and training robotic foundation models exclusively on this data is, by itself, sufficient to achieve general-purpose manipulation capabilities across diverse tasks and environments.

Background

Recent efforts in robotics have sought to replicate the approach behind large language models (LLMs) by scaling up real-world datasets and training robotic foundation models on teleoperation data. The paper notes that although this direction is promising, the variation encountered in mobile manipulation, especially for humanoids, implies substantially higher data requirements and costs than fixed tabletop setups.

Uncertainty about whether teleoperation data alone will suffice directly motivates the paper’s exploration of visual sim-to-real as a complementary path, aiming to reduce reliance on expensive real-world data collection while enabling robust, long-horizon loco-manipulation on humanoid robots.

References

While it remains unclear whether this path alone will suffice for general manipulation, it is clear that mobile manipulation will encounter substantially more variation than fixed tabletop setups and will therefore demand far more data.

VIRAL: Visual Sim-to-Real at Scale for Humanoid Loco-Manipulation (2511.15200 - He et al., 19 Nov 2025) in Section 1 (Introduction)