Exploiting Heterogeneous Motion Data for Generalizable Interaction-to-Reaction Models

Develop learning methods that effectively exploit heterogeneous and sparse motion datasets across single-person, human–human interaction, and human–scene interaction domains to train a generalizable interaction-to-reaction motion generation model.

Background

The paper highlights that available motion datasets differ markedly in domain coverage and supervision modality, ranging from large-scale single-person actions to smaller, fragmented human–human and human–scene interaction datasets. This heterogeneity, combined with limited interaction data, hampers the learning of unified motion priors or transferable representations.

As a consequence, end-to-end models trained within a single domain tend to overfit to that domain’s distribution and generalize poorly to new interaction configurations. The authors explicitly frame this as an open problem: how to leverage all available yet sparse, domain-specific resources to build a generalizable interaction-to-reaction model.
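As a toy illustration of the heterogeneity described above, the sketch below models three dataset domains with different supervision fields (a lone pose track for single-person data, a partner track for human–human data, a scene representation for human–scene data) and a domain-balanced sampler that keeps sparse interaction clips from being drowned out by abundant single-person clips. All names, record layouts, and dataset sizes here are hypothetical placeholders, not taken from the paper or any real dataset.

```python
import random
from dataclasses import dataclass, field

# Hypothetical record type illustrating the supervision gap: single-person
# clips carry only one pose track, while interaction clips add a partner
# track or a scene representation.
@dataclass
class MotionClip:
    domain: str                  # "single", "human-human", or "human-scene"
    actor_poses: list            # per-frame pose parameters (placeholder floats)
    partner_poses: list = field(default_factory=list)  # empty for single-person
    scene_points: list = field(default_factory=list)   # empty unless human-scene

def make_toy_datasets():
    """Toy datasets mimicking the size imbalance: single-person motion is
    plentiful, interaction data is comparatively scarce."""
    single = [MotionClip("single", [0.0] * 60) for _ in range(1000)]
    h2h = [MotionClip("human-human", [0.0] * 60, partner_poses=[0.0] * 60)
           for _ in range(50)]
    h2s = [MotionClip("human-scene", [0.0] * 60, scene_points=[(0, 0, 0)] * 128)
           for _ in range(30)]
    return {"single": single, "human-human": h2h, "human-scene": h2s}

def balanced_batch(datasets, batch_size=6, rng=None):
    """Draw equally from each domain so sparse interaction clips appear as
    often as abundant single-person clips within every training batch."""
    rng = rng or random.Random(0)
    per_domain = batch_size // len(datasets)
    batch = []
    for clips in datasets.values():
        batch.extend(rng.sample(clips, per_domain))
    return batch

if __name__ == "__main__":
    datasets = make_toy_datasets()
    batch = balanced_batch(datasets)
    print([clip.domain for clip in batch])
```

Balanced sampling is only one of several possible mitigations (others include domain-specific adapters or staged pretraining); it is shown here solely to make the imbalance concrete.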

References

Effectively exploiting all available yet sparse and domain-specific resources remains an open problem for building a generalizable interaction-to-reaction model.

ReMoGen: Real-time Human Interaction-to-Reaction Generation via Modular Learning from Diverse Data  (2604.01082 - Ye et al., 1 Apr 2026) in Section 1 (Introduction), subsection “(1) Data scarcity and heterogeneity”