Evaluation of embodied intelligence

Develop rigorous, generalizable methodologies and metrics to evaluate embodied intelligent systems, determining how to assess progress and performance across changing tasks and environments while accounting for learning and adaptation.

Background

The paper argues that embodied agents face unique challenges not adequately addressed by conventional machine learning, including non-stationarity, safety, and adaptation. Standard benchmarks and validation approaches often do not capture these complexities. The authors highlight that, unlike passive perception tasks with static datasets, robots act in and alter their environments, complicating evaluation.

This motivates new evaluation methodologies that can measure whether learning agents are making progress and whether they generalize safely to novel tasks and environments.

References

Finally, an open question in embodied intelligence is how these systems should be evaluated. At heart, how do we as a community know if progress is being made?

From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence (2110.15245 - Roy et al., 2021) in Section 1 (Introduction)