Verifying whether majority voting eliminates imperfect reasoning in self-improving VLM judges
Determine whether majority voting–based filtering of synthetic preference pairs for closed-ended tasks, as used in the self-improving vision-language model (VLM) judge training framework, eliminates all imperfect reasoning in the judge's decisions.
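To make the question concrete, here is a minimal Python sketch of consistency-based filtering. It is an illustration under stated assumptions, not the paper's implementation: the function name `majority_vote_filter`, the `(answer, reasoning)` sample format, and the `min_agreement` threshold are all hypothetical. In this sketch the filter looks only at final answers, so a reasoning trace that reaches the majority answer for the wrong reasons still passes.

```python
from collections import Counter


def majority_vote_filter(samples, min_agreement=0.5):
    """Keep samples whose final answer matches the majority answer.

    `samples` is a list of (answer, reasoning) tuples drawn for one
    closed-ended question (hypothetical format, assumed for illustration).
    Returns (majority_answer, kept_reasonings, discarded_reasonings),
    or None when no answer exceeds the agreement threshold.
    """
    if not samples:
        return None

    # Count final answers only; the reasoning text is never inspected.
    counts = Counter(answer for answer, _ in samples)
    top_answer, top_count = counts.most_common(1)[0]

    # Require a strict majority above the (assumed) threshold.
    if top_count / len(samples) <= min_agreement:
        return None

    kept = [r for a, r in samples if a == top_answer]
    discarded = [r for a, r in samples if a != top_answer]
    return top_answer, kept, discarded


# Toy data: three traces agree on "B", but two of them are flawed.
samples = [
    ("B", "reads the chart correctly"),
    ("B", "guesses without looking at the image"),
    ("B", "misreads the axis but lands on B"),
    ("A", "confuses the two legends"),
]
result = majority_vote_filter(samples)
```

With the toy data above, all three traces that end in "B" are kept, including the two flawed ones. Agreement on the final answer says nothing about how that answer was reached, which is why the question of residual imperfect reasoning stays open.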
References
While we cannot conclusively verify that majority voting eliminates all imperfect reasoning, the empirical advantages shown in Section~\ref{sec:analysis_reasoning} suggest that consistency-based filtering may provide more robust supervision than correctness checking alone for learning generalizable judgment criteria.
— Self-Improving VLM Judges Without Human Annotations
(2512.05145 - Lin et al., 2 Dec 2025) in Appendix: Correctness Filter Negative Example