SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization (2505.11166v1)

Published 16 May 2025 in cs.CL and cs.AI

Abstract: Despite advances in pretraining with extended context lengths, LLMs still struggle to effectively utilize real-world long-context information, primarily due to insufficient long-context alignment caused by data quality issues, training inefficiencies, and the lack of well-designed optimization objectives. To address these limitations, we propose $\textbf{S}$h$\textbf{o}$rt-to-$\textbf{Lo}$ng $\textbf{P}$reference $\textbf{O}$ptimization ($\textbf{SoLoPO}$), a framework that decouples long-context preference optimization (PO) into two components, supported by both theoretical and empirical evidence: short-context PO and short-to-long reward alignment (SoLo-RA). Short-context PO leverages preference pairs sampled from short contexts to enhance the model's contextual knowledge utilization. Meanwhile, SoLo-RA explicitly encourages reward-score consistency for responses conditioned on short and long contexts that contain identical task-relevant information, which facilitates transferring the model's short-context abilities to long-context scenarios. SoLoPO is compatible with mainstream preference optimization algorithms while substantially improving the efficiency of data construction and training. Experimental results show that SoLoPO strengthens all of these algorithms, yielding better length and domain generalization across various long-context benchmarks, along with notable improvements in computational and memory efficiency.
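The decoupling described in the abstract can be illustrated with a small sketch. Assuming a DPO-style implicit reward $r = \beta \log\frac{\pi_\theta(y\mid x)}{\pi_{\mathrm{ref}}(y\mid x)}$, the combined objective below pairs a standard short-context DPO term with a hypothetical SoLo-RA penalty that pulls the chosen response's reward under the long context toward its reward under the short context. The function names, the `alpha` weight, and the absolute-difference distance are illustrative assumptions, not the paper's exact formulation:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def implicit_reward(logp_policy: float, logp_ref: float, beta: float = 0.1) -> float:
    # DPO-style implicit reward: beta * log(pi_theta(y|x) / pi_ref(y|x))
    return beta * (logp_policy - logp_ref)

def solopo_loss(logp_s_chosen: float, logp_s_rejected: float, logp_l_chosen: float,
                ref_s_chosen: float, ref_s_rejected: float, ref_l_chosen: float,
                beta: float = 0.1, alpha: float = 1.0) -> float:
    """Sketch of a SoLoPO-style objective (hypothetical formulation).

    Short-context DPO term + reward-alignment penalty between the
    chosen response's reward under short vs. long context.
    """
    # Short-context PO: preference pair sampled under the short context.
    r_chosen_short = implicit_reward(logp_s_chosen, ref_s_chosen, beta)
    r_rejected_short = implicit_reward(logp_s_rejected, ref_s_rejected, beta)
    dpo_term = -math.log(sigmoid(r_chosen_short - r_rejected_short))

    # SoLo-RA: encourage the same response to score consistently when
    # conditioned on the long context with identical task-relevant info.
    r_chosen_long = implicit_reward(logp_l_chosen, ref_l_chosen, beta)
    alignment_term = abs(r_chosen_short - r_chosen_long)

    return dpo_term + alpha * alignment_term
```

When the long-context log-probabilities already match the short-context ones, the alignment term vanishes and the objective reduces to ordinary short-context DPO; the `alpha` weight trades off preference learning against short-to-long transfer.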

Authors (11)
  1. Huashan Sun (7 papers)
  2. Shengyi Liao (4 papers)
  3. Yansen Han (2 papers)
  4. Yu Bai (136 papers)
  5. Yang Gao (761 papers)
  6. Cheng Fu (12 papers)
  7. Weizhou Shen (18 papers)
  8. Fanqi Wan (20 papers)
  9. Ming Yan (190 papers)
  10. Ji Zhang (176 papers)
  11. Fei Huang (409 papers)