Overcoming the Reality Gap for Simulation-Based Policy Evaluation

Develop calibration and alignment techniques that overcome the reality gap specifically for simulation-based evaluation of robotic policies, ensuring that performance measured in simulation reliably matches performance in the real environment.

Background

Simulation is increasingly used to evaluate policies intended for real-world deployment because it offers reproducibility and scale. However, unless the simulator is carefully aligned with the real system, performance measured in simulation may not be a reliable proxy for real-world performance, which undermines the conclusions drawn from such evaluations.

The authors explicitly call out the need to bridge the reality gap for the downstream purpose of model evaluation, so that simulation-derived metrics accurately predict real-world behavior.
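To make "simulation-derived metrics accurately predict real-world behavior" concrete, the sketch below compares per-policy success rates measured in simulation and on hardware. It is only an illustration under assumptions: the three agreement measures (Pearson correlation, mean absolute gap, and a count of pairwise rank violations) are common choices rather than metrics prescribed by the cited paper, and the function names and success rates are hypothetical.

```python
from math import sqrt


def pearson(sim, real):
    """Pearson correlation between simulated and real success rates."""
    n = len(sim)
    if n != len(real) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 entries")
    mean_sim = sum(sim) / n
    mean_real = sum(real) / n
    cov = sum((s - mean_sim) * (r - mean_real) for s, r in zip(sim, real))
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    var_real = sum((r - mean_real) ** 2 for r in real)
    if var_sim == 0 or var_real == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_sim * var_real)


def mean_abs_gap(sim, real):
    """Average absolute difference between sim and real success rates."""
    return sum(abs(s - r) for s, r in zip(sim, real)) / len(sim)


def rank_violations(sim, real):
    """Number of policy pairs that simulation orders differently from reality."""
    count = 0
    for i in range(len(sim)):
        for j in range(i + 1, len(sim)):
            if (sim[i] - sim[j]) * (real[i] - real[j]) < 0:
                count += 1
    return count


# Hypothetical success rates for four policies (illustrative numbers only).
sim_success = [0.9, 0.7, 0.5, 0.6]
real_success = [0.8, 0.6, 0.5, 0.3]

print("Pearson r:", pearson(sim_success, real_success))
print("Mean absolute gap:", mean_abs_gap(sim_success, real_success))
print("Rank violations:", rank_violations(sim_success, real_success))
```

The three measures answer different questions. Correlation and rank violations indicate whether simulation orders policies the way the real environment does, while the mean absolute gap indicates whether the absolute numbers can be trusted. A simulator can rank policies correctly and still overestimate every one of them.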

References

While methods to reduce this gap are very similar to the ones used for sim-to-real transfer, an open question is how to overcome the reality gap for the specific downstream purpose of model evaluations such that the performance of a policy in simulation matches the performance in the real environment.

The Reality Gap in Robotics: Challenges, Solutions, and Best Practices (2510.20808 - Aljalbout et al., 23 Oct 2025) in Section 7.5 (Simulation for Large Robotics Models)