Generalizability of score range adjustment for numerical bias mitigation
Determine whether score range adjustment, a mitigation for alignment-induced numerical bias in LLM-as-a-judge evaluations, generalizes beyond the setting in which it was shown to be effective.
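To make the question concrete, the following is a minimal sketch of one step such a study would likely need: mapping scores elicited under an adjusted range back onto a common scale so that results from different ranges can be compared. It assumes score range adjustment means asking the judge for a score in a different numeric range than the original one; the source does not specify the procedure, so the linear mapping, the function name `rescale_score`, and the example ranges are all hypothetical illustrations rather than the authors' method.

```python
def rescale_score(score, src_range, dst_range):
    """Linearly map a judge score from src_range onto dst_range.

    Hypothetical helper: the linear mapping is an assumption made for
    illustration, not a procedure taken from the cited paper.
    """
    src_lo, src_hi = src_range
    dst_lo, dst_hi = dst_range
    if src_hi <= src_lo or dst_hi <= dst_lo:
        raise ValueError("each range must satisfy low < high")
    if not src_lo <= score <= src_hi:
        raise ValueError("score lies outside the source range")
    # Position of the score within the source range, between 0.0 and 1.0.
    fraction = (score - src_lo) / (src_hi - src_lo)
    return dst_lo + fraction * (dst_hi - dst_lo)


# Example: scores elicited on an assumed 1-10 range, compared on a 1-5 scale.
adjusted_scores = [1, 5.5, 10]
comparable = [rescale_score(s, (1, 10), (1, 5)) for s in adjusted_scores]
```

A generalizability study could apply the same mapping across several tasks and judge models, then check whether the score distributions on the common scale shift in a consistent direction.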
References
Our mitigation methods, such as score range adjustment, are heuristic and task-specific. They represent a practical first step rather than a fundamental solution to numerical bias. Although effective in our setting, its generalizability remains uncertain, so further exploration in this direction is meaningful future work.
— Exploring the Effects of Alignment on Numerical Bias in Large Language Models
(2601.16444 - Sato et al., 23 Jan 2026) in Limitations, item (iv)