Can Language Models Represent the Past without Anachronism?

Published 28 Apr 2025 in cs.CL | (2505.00030v1)

Abstract: Before researchers can use LLMs to simulate the past, they need to understand the risk of anachronism. We find that prompting a contemporary model with examples of period prose does not produce output consistent with period style. Fine-tuning produces output that is stylistically convincing enough to fool an automated judge, but human evaluators can still distinguish it from authentic historical text. We tentatively conclude that pretraining on period prose may be required to reliably simulate historical perspectives for social research.