Our database of benchmark results tracks the performance of leading AI models on challenging tasks. It includes results from benchmarks evaluated internally by Epoch AI as well as data collected from external sources. Explore trends in AI capabilities across time, by benchmark, or by model.
Claude Fable 5 achieves a new high score of 161 on the ECI, beating GPT-5.5 Pro by 1 point. This is the first time Anthropic has taken the lead on the ECI in over a year.
Claude Fable 5 scores 87% on FrontierMath Tiers 1–3 and 88% on Tier 4, continuing Anthropic's streak of rapid gains in math.
We took another look at the capability gap between open-weight and proprietary models. Since the start of the year, open-weight models have lagged the state of the art by four months.