Paired Multi-Identity Data for Group Photos: Construction and Effective Exploitation

Construct large-scale datasets of group photos that provide multiple reference images per identity, and develop training and inference methods that effectively exploit such paired multi-identity data to enable controllable, identity-consistent image generation while mitigating copy-paste artifacts.

Background

The paper identifies a widespread failure mode in identity customization models: excessive similarity to a single reference image leads to copy-paste artifacts that suppress natural variations in pose, expression, and lighting. A key cause is the scarcity of datasets that provide multiple reference images per identity, especially in multi-person scenes, which forces most methods into reconstruction-based training that reinforces copying behavior.

To address this gap, the authors introduce MultiID-2M, a large-scale dataset with paired references and group photos, and WithAnyone, a training paradigm that leverages contrastive identity losses and extended negatives. Even so, the authors explicitly note that building such datasets at scale and designing algorithms that fully exploit multi-reference, multi-identity data for controllable generation remain open challenges.
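To make the idea of a contrastive identity loss with extra negatives concrete, here is a minimal sketch of a generic InfoNCE-style loss over face embeddings. It is not the WithAnyone loss, whose exact form is not given in this summary. The function name `identity_infonce`, the temperature value, and the assumption that embeddings come from some face-recognition encoder (not shown) are all illustrative.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def identity_infonce(generated, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one generated face embedding (hypothetical helper).

    generated: (d,) embedding of the face in the generated image
    positive:  (d,) embedding of a reference image of the same identity
    negatives: (n, d) embeddings of other identities; supplying more rows
               corresponds to using an extended pool of negatives
    """
    g = l2_normalize(np.asarray(generated, dtype=float))
    p = l2_normalize(np.asarray(positive, dtype=float))
    n = l2_normalize(np.asarray(negatives, dtype=float))

    # Cosine similarities: the positive pair first, then every negative.
    logits = np.concatenate([[g @ p], n @ g]) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability

    # Negative log-probability of picking the positive among all candidates.
    return -(logits[0] - np.log(np.exp(logits).sum()))
```

In this sketch the loss is near zero when the generated embedding matches its reference and is far from every negative, and it grows as negatives become more similar or more numerous. Because the positive can be any reference of the same identity, not the exact image being reconstructed, a loss of this shape rewards identity similarity without rewarding pixel-level copying.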

References

Constructing datasets with multiple references per identity, particularly in group photos, and developing methods to effectively exploit such data remain open challenges.

WithAnyone: Towards Controllable and ID Consistent Image Generation (2510.14975 - Xu et al., 16 Oct 2025) in Section 1 (Introduction)