Abstract reasoning from minimal examples in frontier foundation models

Determine learning principles and model designs that enable frontier foundation models (e.g., GPT-5 and Grok 4) to correctly infer structured transformation rules from a handful of input–output matrix pairs and generalize these rules to novel test matrices in the Abstraction and Reasoning Corpus (ARC-AGI).

Background

ARC-AGI evaluates the ability to infer abstract transformation rules from very few examples and apply them to unseen instances, a capability at which humans excel but with which current AI systems struggle. The paper motivates its approach by noting that even cutting-edge models fail at this kind of few-shot abstraction and transfer.
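
To make the task format concrete, the following toy Python sketch mimics an ARC-style few-shot problem: a hidden transformation rule must be recovered from two input–output grid pairs and applied to a test grid. The grids, the hidden rule, and the two-element hypothesis space are invented for illustration; real ARC-AGI tasks draw on a far larger, open-ended space of transformations that cannot be enumerated this way.

# Toy ARC-style task (illustrative only, not an official ARC-AGI item).
# Hidden rule: reflect each grid left-to-right.
train_pairs = [
    ([[1, 0, 0],
      [1, 1, 0],
      [1, 0, 0]],
     [[0, 0, 1],
      [0, 1, 1],
      [0, 0, 1]]),
    ([[2, 2, 0],
      [0, 2, 0],
      [0, 0, 0]],
     [[0, 2, 2],
      [0, 2, 0],
      [0, 0, 0]]),
]
test_input = [[0, 3, 3],
              [0, 0, 3],
              [0, 0, 0]]

def mirror(grid):
    # Candidate transformation: horizontal reflection.
    return [list(reversed(row)) for row in grid]

def identity(grid):
    # Candidate transformation: no change.
    return [row[:] for row in grid]

# Brute-force search over a tiny, hand-picked hypothesis space.
for name, rule in {"mirror": mirror, "identity": identity}.items():
    if all(rule(x) == y for x, y in train_pairs):
        print(name, "->", rule(test_input))
        break

Running the sketch selects the mirror rule and applies it to the test grid; the hard part, which this enumeration sidesteps entirely, is abstracting the rule when the hypothesis space is unbounded.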

The authors propose Vision-Language Synergy Reasoning (VLSR) and Modality-Switch Self-Correction (MSSC) as partial advances, reporting measurable accuracy gains from both. However, the overarching challenge of enabling frontier models to reason abstractly and robustly from minimal examples remains unresolved and is presented as a core open problem.
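
As a rough, hypothetical illustration of the modality-switch idea, the sketch below wires a propose-verify-revise loop: a textual solver produces a candidate output, a visual checker inspects it, and the solver retries on mismatch. All function names and bodies (propose_textually, verify_visually, mssc_solve) are toy stand-ins invented here, not the paper's implementation, which invokes vision-language models in each role.

# Hypothetical sketch of a modality-switch self-correction loop.
# The stand-in functions below are placeholders, not the paper's API.

def propose_textually(rule, grid, feedback=None):
    # Stand-in for an LLM executing the inferred rule as text;
    # `feedback` would carry the visual checker's critique.
    if rule == "mirror":
        return [list(reversed(row)) for row in grid]
    return [row[:] for row in grid]

def verify_visually(rule, grid, prediction):
    # Stand-in for switching modalities: render the prediction and ask
    # a vision model whether it is consistent with the abstracted rule.
    return rule != "mirror" or prediction == [list(reversed(row)) for row in grid]

def mssc_solve(rule, grid, max_rounds=3):
    prediction = propose_textually(rule, grid)
    for _ in range(max_rounds):
        if verify_visually(rule, grid, prediction):
            return prediction  # accepted after passing the visual check
        prediction = propose_textually(rule, grid, feedback="visual mismatch")
    return prediction  # best effort after exhausting correction rounds

print(mssc_solve("mirror", [[0, 3, 3], [0, 0, 3], [0, 0, 0]]))
# -> [[3, 3, 0], [3, 0, 0], [0, 0, 0]]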

References

Abstract reasoning from minimal examples remains a core unsolved problem for frontier foundation models such as GPT-5 and Grok 4. These models still fail to infer structured transformation rules from a handful of examples, an ability that is a key hallmark of human intelligence.

Think Visually, Reason Textually: Vision-Language Synergy in ARC (arXiv:2511.15703, Zhang et al., 19 Nov 2025), Abstract, p. 1.