AI Benchmarking
Our database of benchmark results, featuring the performance of leading AI models on challenging tasks. It includes results from benchmarks evaluated internally by Epoch AI as well as data collected from external sources. Explore trends in AI capabilities across time, by benchmark, or by model.
Benchmarking updates
December 22, 2025
We benchmarked several open-weight Chinese models on FrontierMath Tiers 1-3. Top scores lag the overall frontier by about 7 months.
Read the thread
December 23, 2025
Why benchmarking is hard: exploring the challenges and complexities at each stage of the benchmarking pipeline.
Understand the challenges
December 23, 2025
AI capabilities progress has sped up: According to our ECI frontier model improvement nearly doubled in 2024.
Explore the data