FrontierMath

A math benchmark testing the limits of AI

Unprecedented difficulty

Each problem demands hours of work from expert mathematicians. Even the most advanced AI systems today, including GPT-4 and Gemini, solve less than 2% of them.

True evaluation

All problems are new and unpublished, eliminating data contamination concerns that plague existing benchmarks.

Mathematical depth

Created in collaboration with over 60 mathematicians, FrontierMath spans the full spectrum of modern mathematics, from algebraic geometry to Zermelo–Fraenkel set theory.

Help shape the future of AI in mathematics

We are hosting a competition to establish rigorous human performance baselines for FrontierMath. The competition offers a prize pool of over $30,000, and your participation will contribute directly to measuring AI progress on challenging mathematical problems.

Learn more
Impressions of our research-level problems
(top 25% of difficulty)

“These are extremely challenging... I think they will resist AIs for several years at least.”

Terence Tao, Fields Medalist (2006)

“Getting even one question right would be well beyond what we can do now, let alone saturating them.”

Timothy Gowers, Fields Medalist (1998)

“These are genuinely hard problems... most of them look well above my pay grade.”

Evan Chen, International Mathematical Olympiad Coach

Learn more

Exploring AI’s mathematical limits

Read the full academic paper introducing FrontierMath, including its methodology, evaluation procedures, and detailed analysis.

Read more