ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion (2401.02993v2)

Published 4 Jan 2024 in cs.CL and cs.AI

Abstract: Retrieval-based augmentation (RA), which incorporates knowledge from an external database into LLMs, has been highly successful on knowledge-intensive (KI) tasks. Integrating retrievals into non-knowledge-intensive (NKI) tasks, however, remains challenging. Existing work concatenates retrievals with the input to improve model performance, but this lengthens the input and substantially raises the computational cost of the attention mechanism. This paper proposes a new RA paradigm named ReFusion, a computation-efficient Retrieval representation Fusion with bi-level optimization. Unlike previous work, ReFusion fuses retrieval representations directly into the model's hidden states. Specifically, ReFusion leverages an adaptive retrieval integrator to seek the optimal combination of the proposed ranking schemes across different model layers. Experimental results demonstrate that ReFusion achieves superior and robust performance on various NKI tasks.
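The abstract describes the core mechanism only at a high level: rather than appending retrieved passages to the token sequence, retrieval representations are fused directly into a layer's hidden states, and an adaptive integrator learns which ranking scheme to apply at each layer via bi-level optimization (architecture weights tuned on validation loss while model weights are tuned on training loss). The following is a minimal PyTorch sketch of that idea, based only on the abstract; the module name, the two toy ranking schemes, and the Gumbel-softmax relaxation of the per-layer choice are illustrative assumptions, not the authors' implementation.

```python
# A minimal, self-contained sketch of retrieval representation fusion:
# top-k retrieval vectors are fused directly into one layer's hidden
# states instead of being concatenated to the input tokens. All names
# (RetrievalFusion, the two schemes, etc.) are assumptions for
# illustration, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalFusion(nn.Module):
    """Fuses top-k retrieval representations into a layer's hidden states.

    Two toy ranking schemes (uniform mean vs. attention-weighted sum) are
    mixed by learnable per-layer architecture weights relaxed with
    Gumbel-softmax, standing in for the paper's adaptive retrieval
    integrator learned by bi-level optimization.
    """
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)    # map retrievals into the model space
        self.alpha = nn.Parameter(torch.zeros(2))  # architecture weights over the schemes

    def forward(self, hidden, retrieved, tau: float = 1.0):
        # hidden:    (batch, seq_len, d_model)  layer hidden states
        # retrieved: (batch, k, d_model)        top-k retrieval representations
        r = self.proj(retrieved)

        # Scheme 1: uniform mean over the k retrievals, broadcast to every token.
        mean_fused = r.mean(dim=1, keepdim=True).expand_as(hidden)

        # Scheme 2: rank retrievals per token with scaled dot-product attention.
        scores = torch.einsum("bsd,bkd->bsk", hidden, r) / hidden.size(-1) ** 0.5
        attn_fused = torch.einsum("bsk,bkd->bsd", scores.softmax(dim=-1), r)

        # Gumbel-softmax relaxation softly selects between the two schemes;
        # in a bi-level setup, alpha would be updated on validation loss.
        w = F.gumbel_softmax(self.alpha, tau=tau, hard=False)
        return hidden + w[0] * mean_fused + w[1] * attn_fused


# Usage: fuse k=4 retrieval vectors into one layer's hidden states.
batch, seq_len, k, d_model = 2, 16, 4, 768
fusion = RetrievalFusion(d_model)
hidden = torch.randn(batch, seq_len, d_model)
retrieved = torch.randn(batch, k, d_model)
out = fusion(hidden, retrieved)
print(out.shape)  # torch.Size([2, 16, 768])
```

Note the efficiency argument implicit in the sketch: fusion adds k fixed-size vectors per layer instead of k retrieved passages to the token sequence, so attention cost stays quadratic in the original sequence length rather than growing with concatenated retrievals, which is the kind of saving the abstract claims.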
