Differential Multimodal Transformers (2507.15875v1)

Published 17 Jul 2025 in cs.AI and cs.MM

Abstract: Small LLMs have gained significant popularity due to their efficiency and growing capabilities. However, incorporating additional modalities, such as vision, can exacerbate the challenge of limited context windows by introducing noise. Recent studies have highlighted that Transformer attention mechanisms often disproportionately focus on irrelevant contexts. In this work, we extend the Differential Attention mechanism, originally designed for text-only models, to the text-vision model PaliGemma. Our aim is to evaluate its ability to mitigate noisy information retrieval and reduce hallucinations. To this end, we fine-tuned the PaliGemma 3B model using LoRA, incorporating Differential Attention, and experimented with various parameter settings and configurations. We demonstrate that Differential Attention can be adapted and integrated into the fine-tuning of existing models to enhance noisy information retrieval and question-answering capabilities.
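The abstract does not spell out the mechanism, but Differential Attention (Ye et al., 2024, "Differential Transformer") computes two softmax attention maps from two query/key groups and subtracts them, so attention noise common to both maps on irrelevant context cancels out. Below is a minimal single-head PyTorch sketch of that idea. The class name, shapes, and the plain learnable scalar for the mixing weight λ are illustrative assumptions (the original paper reparameterizes λ), and this is not the paper's actual PaliGemma integration.

# Minimal sketch of Differential Attention, assuming a single-head toy
# setting. Names and shapes are illustrative, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentialAttention(nn.Module):
    def __init__(self, d_model: int, d_head: int, lambda_init: float = 0.8):
        super().__init__()
        # Two query/key groups; their attention maps will be subtracted.
        self.q_proj = nn.Linear(d_model, 2 * d_head, bias=False)
        self.k_proj = nn.Linear(d_model, 2 * d_head, bias=False)
        self.v_proj = nn.Linear(d_model, d_head, bias=False)
        self.out_proj = nn.Linear(d_head, d_model, bias=False)
        # The paper reparameterizes lambda; a plain learnable scalar
        # initialized to lambda_init is an assumption made here for brevity.
        self.lmbda = nn.Parameter(torch.tensor(lambda_init))
        self.scale = d_head ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q1, q2 = self.q_proj(x).chunk(2, dim=-1)
        k1, k2 = self.k_proj(x).chunk(2, dim=-1)
        v = self.v_proj(x)
        # Two softmax attention maps over the same value vectors.
        a1 = F.softmax(q1 @ k1.transpose(-2, -1) * self.scale, dim=-1)
        a2 = F.softmax(q2 @ k2.transpose(-2, -1) * self.scale, dim=-1)
        # Differential attention: the subtraction cancels common-mode
        # attention mass placed on irrelevant context.
        out = (a1 - self.lmbda * a2) @ v
        return self.out_proj(out)

In the setup the abstract describes, an attention variant like this would not be trained from scratch: the authors fine-tune the existing PaliGemma 3B model with LoRA, i.e., low-rank adapters on a largely frozen base, while incorporating Differential Attention.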
