GDPval

GDPval evaluates models on workplace tasks drawn from nine sectors of the U.S. economy, including finance and insurance, government, healthcare, manufacturing, and information. These sectors were selected because they collectively account for the largest share of U.S. GDP. Within each sector, the benchmark covers the five highest-earning predominantly digital occupations, for a total of 44 occupations and 1,320 tasks.

Tasks represent actual work deliverables, such as documents, spreadsheets, and presentations. Scoring uses blinded pairwise comparisons: a domain expert sees only the task and two unlabeled deliverables (the model’s output and a human expert’s), and ranks them without knowing which is which. The result is a win, tie, or loss for the model against the human baseline.

Methodology

We source GDPval results from the public GDPval leaderboard, which reports at least two aggregate metrics: a win-rate metric and a wins-plus-ties metric. On this page, the default chart uses the win-rate metric, and the dropdown exposes the wins-plus-ties metric when it is available in the exported data.
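As an illustration of how the two metrics relate, here is a minimal sketch that derives both from a list of per-comparison outcomes. It assumes each graded comparison is recorded as "win", "tie", or "loss" from the model's perspective and that all comparisons are pooled with equal weight. The leaderboard's actual aggregation (for example, whether it averages per task or per occupation first) is not specified here, and the function name `aggregate_outcomes` and the output keys are hypothetical.

```python
from collections import Counter

VALID_OUTCOMES = ("win", "tie", "loss")


def aggregate_outcomes(outcomes):
    """Pool win/tie/loss labels into two rates.

    Assumption: every comparison carries equal weight. This is an
    illustrative pooling, not the leaderboard's documented procedure.
    """
    counts = Counter(outcomes)

    # Reject labels outside the three outcomes described in the text.
    unknown = set(counts) - set(VALID_OUTCOMES)
    if unknown:
        raise ValueError(f"unexpected outcome labels: {sorted(unknown)}")

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no graded comparisons to aggregate")

    return {
        # Share of comparisons where the model's deliverable was preferred.
        "win_rate": counts["win"] / total,
        # Share where the model's deliverable was preferred or judged equal.
        "wins_plus_ties": (counts["win"] + counts["tie"]) / total,
    }


if __name__ == "__main__":
    sample = ["win", "tie", "loss", "loss"]  # made-up outcomes for illustration
    print(aggregate_outcomes(sample))
```

Under this definition the wins-plus-ties value is always at least as large as the win rate, with the gap equal to the share of ties.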
