Our database of benchmark results tracks the performance of leading AI models on challenging tasks. It includes results from benchmarks evaluated internally by Epoch AI as well as data collected from external sources. Explore trends in AI capabilities across time, by benchmark, or by model.
We added nine new external benchmarks to the hub, spanning agentic work, cybersecurity, algorithm engineering, forecasting, and research-level physics.
Claude Fable 5 achieves a new high score of 161 on the ECI, beating GPT-5.5 Pro by 1 point. This is the first time Anthropic has taken the lead on the ECI in over a year.
Claude Fable 5 scores 87% on FrontierMath Tiers 1–3 and 88% on Tier 4, continuing Anthropic's streak of rapid gains in math.