Quantify the synthetic-to-real performance gap for egocentric HOI detection

Determine the magnitude of the performance gap, for egocentric hand–object interaction (HOI) detection, between models trained on synthetic data and models trained on real-world labeled data.

Background

The paper investigates leveraging synthetic data to improve egocentric hand–object interaction (HOI) detection and notes that progress is hindered by a domain gap between synthetic and real data. Establishing how large this gap is provides a baseline for evaluating adaptation methods and for understanding when synthetic data alone may suffice.

The authors evaluate across VISOR, EgoHOS, and ENIGMA-51 and study multiple adaptation regimes (unsupervised, semi-supervised, and fully supervised), making the quantification of the synthetic-to-real gap central to the study’s aims.

References

"As a result, several key open questions still need to be addressed: 1) How large is the gap between synthetic and real data?"

Leveraging Synthetic Data for Enhancing Egocentric Hand-Object Interaction Detection (2603.29733 - Leonardi et al., 31 Mar 2026), Section 1 (Introduction)