Evaluate the benefits of aligning synthetic data to target domains

Investigate whether aligning synthetic data to the target real domain—by matching objects, grasps, and environments—yields additional performance gains over generic synthetic data for egocentric hand–object interaction detection.

Background

Domain-specific alignment may reduce the domain gap by tailoring synthetic assets to the target dataset’s distribution (e.g., specific kitchens, tools, or hand poses).

The authors propose alignment strategies guided by unlabeled real data and pre-trained models (e.g., DINOv2, MMPose) to test whether targeted generation improves adaptation.
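One way to picture alignment guided by unlabeled real data is as a retrieval step: embed real frames and candidate synthetic assets with a pre-trained image encoder, then keep the synthetic assets that lie closest to the real distribution. The sketch below is illustrative only and is not the authors' pipeline. It assumes embeddings have already been extracted (for example with a model such as DINOv2) and substitutes random arrays for them; the function name `select_aligned_assets`, the max-cosine-similarity score, and the toy dimensions are all hypothetical choices.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def select_aligned_assets(real_emb, synth_emb, k):
    """Return indices and scores of the k synthetic assets closest to the real data.

    Hypothetical scoring rule: each synthetic asset is scored by its highest
    cosine similarity to any real embedding.
    """
    real = l2_normalize(real_emb)
    synth = l2_normalize(synth_emb)
    sim = synth @ real.T              # shape: (n_synth, n_real)
    scores = sim.max(axis=1)          # best real match per synthetic asset
    order = np.argsort(-scores)[:k]   # highest scores first
    return order, scores[order]


# Stand-in embeddings (assumption): real data and "aligned" assets share a
# cluster, while "generic" assets sit in a different region of feature space.
rng = np.random.default_rng(0)
real_emb = rng.normal(size=(50, 16)) + 2.0
generic = rng.normal(size=(5, 16)) - 2.0
aligned = rng.normal(size=(5, 16)) + 2.0
synth_emb = np.vstack([generic, aligned])  # indices 0-4 generic, 5-9 aligned

top_idx, top_scores = select_aligned_assets(real_emb, synth_emb, k=5)
print("selected synthetic assets:", sorted(top_idx.tolist()))
```

In a real experiment the selected subset would be used to drive targeted generation or training, and its downstream detection performance compared against a generic synthetic set of the same size; whether that yields a gain is exactly the question posed here.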

References

As a result, several key open questions still need to be addressed: 1) How large is the gap between synthetic and real data? 2) What are its main causes? 3) How can it be minimized? 4) Can synthetic data fully replace real-world data? 5) Is it possible to leverage synthetic data when real-world data is unlabeled? 6) Can it improve performance when only a small amount of real-world labeled data is available? 7) What scale of synthetic data is required? 8) Does aligning synthetic data more closely with real-world objects and environments provide additional benefits?

Leveraging Synthetic Data for Enhancing Egocentric Hand-Object Interaction Detection (2603.29733 - Leonardi et al., 31 Mar 2026) in Section 1 (Introduction)