Scaling Laws for Moral Machine Judgment in Large Language Models

Published 25 Jan 2026 in cs.CY and cs.HC | (2601.17637v1)

Abstract: Autonomous systems increasingly require moral judgment capabilities, yet whether these capabilities scale predictably with model size remains unexplored. We systematically evaluate 75 LLM configurations (0.27B--1000B parameters) using the Moral Machine framework, measuring alignment with human preferences in life-death dilemmas. We observe a consistent power-law relationship with distance from human preferences ($D$) decreasing as $D \propto S^{{-0.10\pm0.01}$} ($R^2=0.50$, $p<0.001$) where $S$ is model size. Mixed-effects models confirm this relationship persists after controlling for model family and reasoning capabilities. Extended reasoning models show additional 16\% improvement beyond scale effects. The relationship holds across diverse architectures, while variance decreases at larger scales, indicating systematic emergence of more reliable moral judgment with computational scale. These findings extend scaling law research to value-based judgments and provide empirical foundations for artificial intelligence governance.