Do human-level analogy models use human-like mechanisms?

Determine whether foundation models that solve analogical reasoning tasks at human-level performance (such as large language models evaluated on analogy benchmarks) rely on reasoning mechanisms similar to those observed in human cognition, or instead on alternative computational strategies that diverge from human reasoning.

Background

The paper raises concerns that strong task performance does not necessarily imply mechanistic similarity to humans. For analogical reasoning, recent work shows LLMs achieving human-level results, but it remains contested whether these models rely on cognitive-like processes or on different computational strategies that exploit training data statistics.

This debate is situated within the broader theme of distinguishing predictive success from explanatory insight. The authors point to evidence that performance often degrades outside familiar training regimes, motivating targeted evaluations to determine if the mechanisms align with human reasoning.

References

It remains debated, for example, whether superhuman-level game-playing models trained purely on move sequences truly encode the game's rules40–42,47,48, or whether models that solve analogies at a human level possess reasoning mechanisms like those of humans49.

From Prediction to Understanding: Will AI Foundation Models Transform Brain Science? (2509.17280 - Serre et al., 21 Sep 2025) in Main text, section 'Prediction is not Explanation'