Jointly scaling human data and model capacity for improved planning and composition
Determine whether jointly scaling the amount of egocentric human pretraining data and the capacity of the flow-based Vision–Language–Action (VLA) policy introduced in EgoScale yields gains in dexterous robot manipulation beyond those observed when scaling data alone, in particular improved long-horizon planning and compositional generalization.
References
Looking forward, several directions remain open. While we observe no saturation within the explored regime, jointly scaling human data and model capacity may unlock further gains, including improved long-horizon planning and compositional generalization.
— EgoScale: Scaling Dexterous Manipulation with Diverse Egocentric Human Data
(2602.16710 - Zheng et al., 18 Feb 2026) in Conclusion