Quantification and Mitigation of Positional Bias in LLM Judging

Quantify and reduce positional bias in LLM-as-a-Judge evaluation tasks so that scores and preferences reflect true response quality rather than prompt position.

Background

Empirical studies show that LLM judges may systematically prefer responses based on where they appear in the prompt, independent of actual merit, which affects both pairwise and listwise evaluations.
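
To make this effect concrete, the following is a minimal sketch of one way to measure order sensitivity in a pairwise setting: judge every pair twice, once in each order, and count how often the verdict survives the swap. The source does not prescribe this protocol. The `Judge` callable, its "first"/"second" return values, and the function name `positional_bias_stats` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical judge interface (an assumption, not from the source):
# takes (prompt, response_in_first_slot, response_in_second_slot)
# and returns "first" or "second" to name the preferred slot.
Judge = Callable[[str, str, str], str]


def positional_bias_stats(
    judge: Judge, items: Sequence[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Judge each (prompt, a, b) item in both orders and summarize.

    consistency:     fraction of items where the same response wins
                     in both orders (1.0 means no order sensitivity).
    first_slot_rate: fraction of all verdicts that favor the first
                     slot (0.5 is expected from an unbiased judge
                     when every item is shown in both orders).
    """
    if not items:
        raise ValueError("items must not be empty")

    consistent = 0
    first_picks = 0
    for prompt, a, b in items:
        original = judge(prompt, a, b)  # a occupies the first slot
        swapped = judge(prompt, b, a)   # b occupies the first slot

        a_wins_original = original == "first"
        a_wins_swapped = swapped == "second"
        consistent += a_wins_original == a_wins_swapped

        first_picks += (original == "first") + (swapped == "first")

    n = len(items)
    return {
        "consistency": consistent / n,
        "first_slot_rate": first_picks / (2 * n),
    }
```

Under these assumptions, a judge that always picks the first slot would score a consistency of 0.0 and a first-slot rate of 1.0, while a judge that ignores position entirely would score a consistency of 1.0. Listwise evaluation would need a different statistic, such as rank correlation across permutations, which this sketch does not cover.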

The source paper (Masoud et al., 2026) highlights the need for measurement protocols that quantify this bias and for mitigation strategies that reduce it in practice.
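
As an illustration of what a mitigation strategy could look like, the sketch below applies a simple order-swapping approach: query the judge in both orders and accept a winner only when the two verdicts agree, otherwise report a tie. This is one commonly used technique and is not claimed to be the method the source paper proposes. The `Judge` interface and the name `debiased_verdict` are hypothetical.

```python
from typing import Callable

# Hypothetical judge interface (an assumption, not from the source):
# takes (prompt, response_in_first_slot, response_in_second_slot)
# and returns "first" or "second" to name the preferred slot.
Judge = Callable[[str, str, str], str]


def debiased_verdict(judge: Judge, prompt: str, a: str, b: str) -> str:
    """Return "a", "b", or "tie" after judging in both orders.

    A response wins only if the judge prefers it regardless of which
    slot it occupies. Disagreement between the two orders is treated
    as a tie rather than trusted in either direction.
    """
    original = judge(prompt, a, b)  # a occupies the first slot
    swapped = judge(prompt, b, a)   # b occupies the first slot

    a_wins_original = original == "first"
    a_wins_swapped = swapped == "second"

    if a_wins_original and a_wins_swapped:
        return "a"
    if not a_wins_original and not a_wins_swapped:
        return "b"
    return "tie"
```

The trade-off in this design is cost and coverage: it doubles the number of judge calls, and a strongly biased judge will produce many ties, which reduces the number of usable comparisons instead of correcting them.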

References

The open research problems in this context are: Quantify and reduce positional bias in evaluation tasks.

Security in LLM-as-a-Judge: A Comprehensive SoK (2603.29403 - Masoud et al., 31 Mar 2026) in Section 7.2, Positional Bias and Evaluation Manipulation (Challenges and Open Problems)