ARC-AGI
A concept-learning benchmark that evaluates few-shot abstraction, pattern induction, and program-like reasoning from input–output examples.
About ARC AGI
The Abstraction and Reasoning Corpus (ARC-AGI) probes whether models can infer underlying rules from a handful of grid-based input–output demonstrations and generalize to novel test cases. Tasks require discovering transformations such as symmetry, compositional rules, object grouping, and algorithmic procedures—without natural language supervision.
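To make the task format concrete, here is a minimal sketch of how an ARC-style task could be represented and attacked by brute-force rule search. The grid encoding, the candidate transformation set, and the `infer_rule` helper are illustrative assumptions, not the official benchmark tooling; real ARC tasks involve far richer transformations than these simple geometric ones.

```python
# Hedged sketch: an ARC-style task as train/test grid pairs, solved by
# searching a small hypothetical library of candidate transformations.
# Names (infer_rule, CANDIDATES) are assumptions for illustration only.

def identity(g): return g
def hflip(g): return [row[::-1] for row in g]          # mirror left-right
def vflip(g): return g[::-1]                           # mirror top-bottom
def rot90(g): return [list(r) for r in zip(*g[::-1])]  # rotate 90° clockwise

CANDIDATES = {"identity": identity, "hflip": hflip,
              "vflip": vflip, "rot90": rot90}

def infer_rule(train_pairs):
    """Return the first candidate consistent with every demonstration."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in train_pairs):
            return name, fn
    return None, None

# A toy task: two demonstrations of the same hidden rule, one test input.
task = {
    "train": [
        ([[1, 0], [2, 3]], [[0, 1], [3, 2]]),
        ([[5, 5, 0], [0, 1, 1]], [[0, 5, 5], [1, 1, 0]]),
    ],
    "test": [[7, 0], [0, 8]],
}

name, fn = infer_rule(task["train"])
print(name, fn(task["test"]))  # → hflip [[0, 7], [8, 0]]
```

Both demonstrations are consistent only with a horizontal flip, so the solver applies that rule to the test grid. Real solvers must search a vastly larger, compositional hypothesis space rather than a fixed list.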
ARC-AGI is designed to assess systematic generalization and compositional reasoning beyond surface statistics. Scoring is pass/fail per task, aggregated as the fraction of tasks solved; even simple-looking problems often demand coherent internal representations and planning akin to writing a short program.
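The pass/fail scoring described above can be sketched as exact-match grid comparison aggregated over tasks. The function names below are assumptions for illustration, not the official evaluation harness.

```python
# Hedged sketch: ARC-style scoring. A task counts as solved only if every
# predicted test grid matches the expected grid exactly; the benchmark
# score is the fraction of tasks solved. Names are illustrative.

def task_solved(predicted, expected):
    """Exact match on all test grids; partial credit is not awarded."""
    return (len(predicted) == len(expected)
            and all(p == e for p, e in zip(predicted, expected)))

def arc_score(results):
    """results: list of (predicted_grids, expected_grids), one per task."""
    solved = sum(task_solved(p, e) for p, e in results)
    return solved / len(results)

results = [
    ([[[1, 2]]], [[[1, 2]]]),  # exact match  -> solved
    ([[[0, 0]]], [[[0, 1]]]),  # one cell off -> failed, no partial credit
]
print(arc_score(results))  # → 0.5
```

The all-or-nothing criterion is what makes the benchmark unforgiving: a grid that is one cell away from correct scores the same as a blank answer.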