Use synthetic data to aid learning with limited real labels

Determine whether augmenting training with synthetic data improves performance when only a small fraction of real-world egocentric hand–object interaction data is labeled.

Background

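Semi-supervised domain adaptation (SSDA) aims to maximize performance on a target domain (here, real-world data) when only a small number of target labels are available, alongside labeled source data (here, synthetic data). If synthetic data can compensate for this label scarcity, the cost of annotating real-world data could be substantially reduced.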

The study evaluates SSDA settings with varying proportions of labeled real data to quantify the benefit of synthetic augmentation.
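To make the evaluation protocol concrete, the following is a minimal sketch of how training sets with varying labeled-real proportions might be constructed. It is an illustration under stated assumptions, not the study's actual pipeline: the function name build_ssda_splits, the sample identifiers, the dataset sizes, and the fractions 0.1, 0.25, and 0.5 are all hypothetical.

```python
import random


def build_ssda_splits(real_ids, synthetic_ids, labeled_fractions, seed=0):
    """Build one SSDA training configuration per labeled fraction.

    Hypothetical helper: for each fraction, a subset of the real samples
    is treated as labeled, the remainder as unlabeled, and the full
    synthetic set is always included as labeled source data.
    """
    rng = random.Random(seed)
    shuffled = list(real_ids)
    rng.shuffle(shuffled)  # shuffle once so the labeled subsets are nested

    splits = {}
    for frac in labeled_fractions:
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"labeled fraction must be in [0, 1], got {frac}")
        n_labeled = round(frac * len(shuffled))
        splits[frac] = {
            "labeled_real": shuffled[:n_labeled],
            "unlabeled_real": shuffled[n_labeled:],
            "synthetic": list(synthetic_ids),
        }
    return splits


# Illustrative sizes and fractions only (assumptions, not from the study).
real = [f"real_{i}" for i in range(1000)]
synth = [f"synth_{i}" for i in range(5000)]
splits = build_ssda_splits(real, synth, labeled_fractions=[0.1, 0.25, 0.5])

for frac, parts in splits.items():
    print(frac, len(parts["labeled_real"]), len(parts["unlabeled_real"]),
          len(parts["synthetic"]))
```

Shuffling once and slicing prefixes keeps the labeled subsets nested, so every sample labeled at a smaller fraction is also labeled at a larger one. Under this design, differences between settings reflect the amount of labeled real data rather than which samples happened to be drawn. Training the same model on each configuration with and without the synthetic part would then give a direct estimate of the benefit of synthetic augmentation at each label budget.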

References

As a result, several key open questions still need to be addressed: 1) How large is the gap between synthetic and real data? 2) What are its main causes? 3) How can it be minimized? 4) Can synthetic data fully replace real-world data? 5) Is it possible to leverage synthetic data when real-world data is unlabeled? 6) Can it improve performance when only a small amount of real-world labeled data is available?

Leveraging Synthetic Data for Enhancing Egocentric Hand-Object Interaction Detection (2603.29733 - Leonardi et al., 31 Mar 2026) in Section 1 (Introduction)