Text-centric Alignment for Multi-Modality Learning (2402.08086v2)

Published 12 Feb 2024 in cs.LG, cs.CL, and cs.CV

Abstract: This research paper addresses the challenge of modality mismatch in multimodal learning, where the modalities available during inference differ from those available at training. We propose the Text-centric Alignment for Multi-Modality Learning (TAMML) approach, an innovative method that utilizes LLMs with in-context learning and foundation models to enhance the generalizability of multimodal systems under these conditions. By leveraging the unique properties of text as a unified semantic space, TAMML demonstrates significant improvements in handling unseen, diverse, and unpredictable modality combinations. TAMML not only adapts to varying modalities but also maintains robust performance, showcasing the potential of foundation models in overcoming the limitations of traditional fixed-modality frameworks in embedding representations. This study contributes to the field by offering a flexible, effective solution for real-world applications where modality availability is dynamic and uncertain.
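The abstract's central idea, converting every available modality into text so that a single text-based model can handle arbitrary modality combinations at inference, can be sketched roughly as follows. This is an illustrative sketch, not the paper's actual implementation: the converter functions are hypothetical stand-ins for the foundation models (e.g., an image captioner or a tabular-to-text serializer) that such a pipeline would call.

```python
# Sketch of text-centric alignment: map each modality to text, then join
# the pieces into one unified string a downstream LLM could consume.
# Converters here are toy stand-ins for real foundation models.

def image_to_text(image_path: str) -> str:
    # Stand-in for an image-captioning foundation model.
    return f"An image described from {image_path}."

def table_to_text(row: dict) -> str:
    # Serialize tabular features into a sentence, an LLM-friendly format.
    return "; ".join(f"{k} is {v}" for k, v in row.items())

CONVERTERS = {
    "image": image_to_text,
    "table": table_to_text,
    "text": lambda t: t,  # text passes through unchanged
}

def align_to_text(sample: dict) -> str:
    """Map whichever modalities are present into one unified text string."""
    parts = [CONVERTERS[m](v) for m, v in sample.items() if m in CONVERTERS]
    return " ".join(parts)

# The same function handles any modality combination seen at inference time:
print(align_to_text({"table": {"age": 3, "breed": "corgi"}}))
print(align_to_text({"text": "A friendly dog.", "image": "dog.jpg"}))
```

Because every modality lands in the same textual semantic space, the downstream model never needs to know which combination was present during training, which is the property the abstract credits for robustness under modality mismatch.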
