Few domains test AI reasoning as clearly as mathematics, where answers can be verified automatically and the hardest problems extend to the frontier of human knowledge. Epoch tracks how AI is performing on mathematical tasks over time, including through FrontierMath, our own benchmark of expert-level problems designed to test the limits of what today's best systems can do.

