Optimization strategies to boost multimodal pretraining Transformer performance
Determine effective combinations of pretext objectives and losses, together with training strategies for them, that improve the performance of Transformer-based multimodal pretraining while controlling optimization challenges such as balancing multiple loss terms (see the sketch below) and avoiding overly complex objectives.
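As one illustration of the loss-balancing challenge, the sketch below combines several common pretext losses (masked language modeling, image-text matching, image-text contrastive) with learnable homoscedastic-uncertainty weights in the style of Kendall et al. (2018). This is a minimal hypothetical example, not a method prescribed by the survey; the class name `UncertaintyWeightedLoss` and the task names are assumptions for illustration.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine several pretext losses with learnable weights.

    Uses the common simplification of homoscedastic-uncertainty
    weighting (Kendall et al., 2018):
        total = sum_i( exp(-s_i) * L_i + s_i ),
    where s_i = log(sigma_i^2) is a learnable log-variance per
    objective. Noisy, hard-to-fit losses are down-weighted
    automatically instead of via hand-tuned coefficients.
    """

    def __init__(self, task_names):
        super().__init__()
        # One learnable scalar log-variance per pretext objective.
        self.log_vars = nn.ParameterDict(
            {name: nn.Parameter(torch.zeros(())) for name in task_names}
        )

    def forward(self, losses):
        # losses: dict mapping task name -> scalar loss tensor
        total = torch.zeros(())
        for name, loss in losses.items():
            s = self.log_vars[name]
            total = total + torch.exp(-s) * loss + s
        return total

# Usage: suppose one pretraining step yields three pretext losses
# (in practice these come from the model's forward pass).
weighter = UncertaintyWeightedLoss(["mlm", "itm", "itc"])
losses = {
    "mlm": torch.tensor(2.3, requires_grad=True),
    "itm": torch.tensor(0.7, requires_grad=True),
    "itc": torch.tensor(1.1, requires_grad=True),
}
total_loss = weighter(losses)
total_loss.backward()  # gradients reach the log-variances too
```

Learned weighting is only one option; fixed coefficients tuned by grid search or gradient-norm-based schemes are equally plausible baselines for this open problem.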
References
How to boost the performance for multimodal pretraining Transformers is an open problem.
— Multimodal Learning with Transformers: A Survey
(arXiv:2206.06488, Xu et al., 2022), Discussion under Subsubsection "Task-Agnostic Multimodal Pretraining" (Section 4.1.1)