Do unpaired auxiliary modalities improve textual tasks?
Determine whether unpaired auxiliary modalities such as images and audio can provide useful information to improve performance on textual tasks within the Unpaired Multimodal Representation Learning (UML) framework introduced in this work.
References
Furthermore, we evaluate how multimodal data enhances image and audio classification; it remains to show if they can, in turn, offer useful information for textual tasks.
— Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
(2510.08492 - Gupta et al., 9 Oct 2025), Section: Conclusions and Limitations