Normalization Mechanisms to Reduce Length Bias

Design normalization mechanisms that reduce length bias in LLM-as-a-Judge so that verbose responses do not receive inflated scores relative to concise, accurate ones.

Background

Length bias leads LLM judges to prefer longer responses even when the additional text does not improve factual quality. An attacker can exploit this by padding a response with extra text to raise its score.

The paper identifies the need for normalization techniques that correct or compensate for response length effects during evaluation.
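As one illustration of what such a correction could look like, the sketch below removes the linear trend of judge score on log response length across a batch of scored responses. This is a generic regression-residual approach chosen for illustration, not a method taken from the paper. The function name `length_adjusted_scores`, the use of log length, and the choice of token counts as the length measure are all assumptions.

```python
import math
from statistics import mean


def length_adjusted_scores(scores, lengths):
    """Return judge scores with the linear effect of log(length) removed.

    Hypothetical helper, not from the paper. Fits an ordinary least squares
    line of score on log(length) over the batch, then subtracts the fitted
    length effect from each score. The batch mean score is preserved.
    """
    if len(scores) != len(lengths):
        raise ValueError("scores and lengths must have the same size")
    if not scores:
        return []

    # Lengths (e.g. token counts) must be positive for the log to be defined.
    log_len = [math.log(n) for n in lengths]
    mean_len = mean(log_len)
    mean_score = mean(scores)

    var = sum((x - mean_len) ** 2 for x in log_len)
    if var == 0:
        # All responses have the same length, so there is nothing to correct.
        return list(scores)

    cov = sum((x - mean_len) * (s - mean_score) for x, s in zip(log_len, scores))
    slope = cov / var  # estimated score change per unit of log(length)

    return [s - slope * (x - mean_len) for s, x in zip(scores, log_len)]


# Usage: scores that rise purely with length collapse to the batch mean.
adjusted = length_adjusted_scores([6.0, 7.0, 8.0], [100, 200, 400])
print(adjusted)
```

In this usage example the scores are an exact linear function of log length, so the adjusted values all come out at approximately 7.0. A correction of this kind also removes any legitimate relationship between length and quality, so it is best read as a baseline rather than a solution to the open problem.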

References

The open research problems in this context are: Design normalization mechanisms for reducing length bias.

Security in LLM-as-a-Judge: A Comprehensive SoK (2603.29403 - Masoud et al., 31 Mar 2026) in Section 7.3, Length and Style Bias Exploitation (Challenges and Open Problems)