
Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs (2404.01430v1)

Published 1 Apr 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Recent advances in LLMs have enhanced their ability to process long input contexts. This capability is particularly crucial for tasks that retrieve knowledge from an external datastore, which can result in long inputs. However, recent studies show a positional bias in LLMs: performance varies depending on where useful information is located within the input sequence. In this study, we conduct extensive experiments to investigate the root causes of positional bias. Our findings indicate that positional bias stems primarily from the inherent positional preferences of different models, and that prompt-based solutions alone are inadequate for overcoming these preferences. To address the positional bias of pre-trained LLMs, we develop a Position-Aware Parameter Efficient Fine-Tuning (PAPEFT) approach that combines a data augmentation technique with a parameter-efficient adapter to encourage a uniform attention distribution across the input context. Our experiments demonstrate that the proposed approach effectively reduces positional bias, improving LLMs' effectiveness in handling long context sequences for various tasks that require externally retrieved knowledge.
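The abstract describes PAPEFT only at a high level, so the sketch below is a hedged illustration of one plausible reading of its two ingredients: randomly permuting the order of retrieved passages when building fine-tuning prompts (so no passage position is systematically privileged) and attaching a standard parameter-efficient adapter, here LoRA via the Hugging Face peft library. The helper name build_augmented_prompt, the base model choice, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the two PAPEFT ingredients named in the abstract:
# (1) position-permutation data augmentation over retrieved passages, and
# (2) a parameter-efficient adapter (here LoRA via the Hugging Face `peft` library).
# Model choice, hyperparameters, and the helper below are illustrative assumptions,
# not the paper's actual recipe.

import random
from typing import List, Optional

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM


def build_augmented_prompt(question: str, passages: List[str],
                           seed: Optional[int] = None) -> str:
    """Shuffle the retrieved passages so no position is systematically privileged."""
    rng = random.Random(seed)
    order = list(range(len(passages)))
    rng.shuffle(order)  # a fresh permutation per training example
    context = "\n\n".join(f"[Doc {i + 1}] {passages[j]}" for i, j in enumerate(order))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Data-augmentation step: the gold passage lands at a random position.
    prompt = build_augmented_prompt(
        "Who proposed the Transformer architecture?",
        ["Passage about the Transformer ...", "Distractor passage ...", "Another distractor ..."],
        seed=0,
    )
    print(prompt)

    # Adapter step: freeze the base model and train only a small LoRA adapter.
    base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # small public stand-in
    lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"])
    model = get_peft_model(base, lora)
    model.print_trainable_parameters()  # only the adapter weights are trainable
```

In this reading, the permutation step decorrelates answer position from the training signal while the adapter keeps the number of updated parameters small; the paper's actual augmentation and adapter design may differ.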

Authors (8)
  1. Zheng Zhang (486 papers)
  2. Fan Yang (877 papers)
  3. Ziyan Jiang (16 papers)
  4. Zheng Chen (221 papers)
  5. Zhengyang Zhao (23 papers)
  6. Chengyuan Ma (20 papers)
  7. Liang Zhao (353 papers)
  8. Yang Liu (2253 papers)
Citations (2)