REFA: Reference Free Alignment for multi-preference optimization (2412.16378v3)
Abstract: We introduce $\textbf{REFA}$, a family of reference-free alignment methods that optimize over multiple user preferences while enforcing fine-grained length control. Our approach integrates deviation-based weighting to emphasize high-quality responses, length normalization to prevent trivial short-response solutions, and an EOS-probability regularizer to mitigate dataset-induced brevity biases. Theoretically, we show that under the Uncertainty Reduction with Sequence Length Assertion (URSLA) framework, naive length normalization can still incentivize length-based shortcuts. In contrast, REFA corrects these subtle incentives, guiding models toward genuinely more informative and higher-quality outputs. Empirically, REFA achieves a new $\textbf{state-of-the-art}$ among reference-free alignment methods, generating richer responses that align more closely with human preferences. Notably, REFA improves performance on the AlpacaEval2 benchmark, achieving a $\textbf{26.6\%}$ Length-Controlled Win Rate (LC-WR) and $\textbf{24.2\%}$ Win Rate (WR).
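The abstract names three ingredients (deviation-based weighting, length normalization, an EOS-probability regularizer) without giving the objective itself. Below is a minimal PyTorch sketch of how such a reference-free, multi-preference loss *could* be assembled; the function name, the softmax form of the weights, the sign of the EOS penalty, and the hyperparameters `beta` and `lam` are all assumptions for illustration, not the paper's actual REFA loss.

```python
# Hypothetical sketch of a reference-free multi-preference loss with
# length normalization and an EOS-probability regularizer. The exact
# REFA objective is not given in the abstract; everything below is
# an illustrative assumption.
import torch
import torch.nn.functional as F

def refa_style_loss(token_logps, masks, rewards, eos_probs, beta=1.0, lam=0.1):
    """
    token_logps: (B, K, T) per-token log-probs for K candidate responses per prompt
    masks:       (B, K, T) 1 for response tokens, 0 for padding
    rewards:     (B, K)    scalar quality scores for each candidate
    eos_probs:   (B, K)    model probability of emitting EOS at each response's end
    """
    lengths = masks.sum(-1).clamp(min=1)                    # (B, K)
    # Length-normalized sequence log-probability: dividing by length removes
    # the trivial incentive to prefer short responses.
    avg_logp = (token_logps * masks).sum(-1) / lengths      # (B, K)
    # Deviation-based weighting (assumed softmax form): candidates scoring
    # above the per-prompt mean reward receive more weight.
    dev = rewards - rewards.mean(dim=-1, keepdim=True)      # (B, K)
    weights = F.softmax(beta * dev, dim=-1)                 # (B, K)
    # Preference term: raise likelihood of higher-quality responses.
    pref_loss = -(weights * avg_logp).sum(-1).mean()
    # EOS regularizer (assumed sign): penalizing high EOS probability
    # counteracts a dataset-induced bias toward premature termination.
    eos_reg = eos_probs.mean()
    return pref_loss + lam * eos_reg
```

This is only one plausible instantiation: the paper's theoretical point is that naive length normalization alone (the `avg_logp` term) can still reward length-based shortcuts, which is why the EOS term and deviation weighting are coupled with it rather than used in isolation.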