All publications
  
      
      
data insight · 1 min read
Open-weight models lag the state of the art by around 3 months on average
    
    
    
  
  
      
      
report · 14 min read
What does OSWorld tell us about AI’s ability to use computers?
We review OSWorld, a prominent computer use benchmark. Tasks are relatively simple, many don’t require GUIs, and success often hinges on interpreting ambiguous instructions. The benchmark is also not stable over time.

report · 25 min read
Could decentralized training solve AI’s power problem?
We illustrate a decentralized 10 GW training run across a dozen sites spanning thousands of kilometers. Developers are likely to scale datacenters to multi-gigawatt levels before adopting decentralized training.
    
    
    
  
  
Follow us on X: @EpochAIResearch
          
      
      
          
          
      
    
        
Conventional wisdom in AI is that large-scale pretraining needs to happen in massive contiguous datacenter campuses. But is this true?
Our research suggests that conducting 10 GW training runs across two dozen sites — linked by a network spanning thousands of km — is feasible.
We've launched a new tool to track AI progress!
The tool addresses one of the field's biggest challenges: benchmark saturation.
It's called the Epoch Capabilities Index (ECI) — here's what makes it different:
Stanford mathematician Ravi Vakil, president of the American Mathematical Society, expects AI’s impact on mathematics to come as a phase change, not a slow climb.
Every major shift in math has caught experts off guard, he says. This one will be no different, except that all our predictions will be even more wrong.
-- Timestamps --
00:00:00 – Playing games with imperfect information against AI
00:02:35 – When AI will learn to be truly creative
00:03:48 – AI’s impact will be even more unpredictable than the internet
00:08:02 – What an “AlphaGo moment” would look like for math
00:10:35 – How AI will actually be useful in mathematical research
00:12:20 – Writing “wow”-level math problems for AI
00:15:06 – On a 0-10 scale, AI will change math 8 + 3i
00:16:17 – Is math the next chess?
We evaluated Claude Haiku 4.5 on several benchmarks.
Even with reasoning disabled, Haiku 4.5 performs similarly to or better than early lightweight reasoning models, like o1-mini.
If you ran GPT-5 infinitely many times on FrontierMath—our extremely challenging math benchmark—would it eventually solve every problem?
Probably not. From what we can tell, it caps out below 50%.
What about throwing in *every* available model? Infinitely many times? 🧵

OpenAI is experiencing one of the fastest revenue growth rates in corporate history, with annualized revenue rising 3x a year, from $2 billion at the end of 2023 to $13 billion by August 2025.
  
  
      
      
newsletter · 8 min read
Less than 70% of FrontierMath is within reach for today’s models

newsletter · 7 min read
OpenAI is projecting unprecedented revenue growth

data insight · 1 min read
OpenAI's revenue has been growing 3x a year since 2024

data insight · 3 min read
Most of OpenAI’s 2024 compute went to experiments

report · 23 min read
Evaluating Gemini 2.5 Deep Think's math capabilities
Improved use of knowledge and precision, helpful for research, more conceptual in geometry, but limited creativity and citation issues.

newsletter · 8 min read
How many digital workers could OpenAI deploy?

data insight · 1 min read
AI capabilities have steadily improved over the past year

update · 2 min read
Introducing the AI Companies Data Hub
Our new AI Companies Data Hub tracks key economic and operational data, including frontier AI companies’ revenue, funding, valuations, staff counts, compute spending, and product usage.

newsletter · 7 min read
Why GPT-5 used less training compute than GPT-4.5 (but GPT-6 probably won’t)

data insight · 2 min read
AI developers accurately report GPQA Diamond scores for recent models

newsletter · 8 min read
The huge potential implications of long-context inference

report · 8 min read
What will AI look like in 2030?
If scaling persists to 2030, AI investments will reach hundreds of billions of dollars and require gigawatts of power. Benchmarks suggest AI could improve productivity in valuable areas such as scientific R&D.
    
    
    
  
  
      
      
data insight · 3 min read
What did it take to train Grok 4?

newsletter · 11 min read
Three challenges facing compute-based AI policies

newsletter · 10 min read
Compute scaling will slow down due to increasing lead times

data insight · 5 min read
LLMs have not yet solved the hardest problems on high school math contests

data insight · 2 min read
GPT-5 and GPT-4 were both major leaps in benchmarks from the previous generation

newsletter · 7 min read
Why future AI agents will be trained to work together

data insight · 5 min read
Frontier AI performance becomes accessible on consumer hardware within a year

paper · 4 min read
How much power will frontier AI training demand in 2030?
The power required to train the largest frontier models is growing by more than 2x per year, and is on trend to reach multiple gigawatts by 2030.

data insight · 3 min read
Compute is not a bottleneck for robotic manipulation

newsletter · 10 min read
We didn’t learn much from the IMO

newsletter · 10 min read
Quantifying the algorithmic improvement from reasoning models

data insight · 2 min read
Training open-weight models is becoming more data intensive

newsletter · 12 min read
Why China isn’t about to leap ahead of the West on compute

data insight · 2 min read
Frontier training runs will likely stop getting longer by around 2027

report · 31 min read
Evaluating Grok 4’s math capabilities
It's good at involved computations, improving at proofs, and useful for literature search. It still favors low-level grinds and leans on background knowledge.
    
    
    
  
      
      
newsletter · 11 min read
After the ChatGPT moment: Measuring AI’s adoption

update · 15 min read
How to run SWE-bench Verified in one hour on one machine
We are releasing a public registry of optimized Docker images for SWE-bench. This allows us to run SWE-bench Verified in 62 minutes on a single GitHub Actions VM.

newsletter · 15 min read
What will the IMO tell us about AI math capabilities?

newsletter · 9 min read
How big could an “AI Manhattan Project” get?

data insight · 4 min read
LLMs now accept longer inputs, and the best models can use them more effectively

newsletter · 9 min read
AI and explosive growth redux

paper · 4 min read
Inference economics of language models
We investigate how speed trades off against cost in language model inference. We find that inference latency scales with the square root of model size and the cube root of memory bandwidth, among other results.

newsletter · 19 min read
Do the biorisk evaluations of AI labs actually measure the risk of developing bioweapons?

report · 11 min read
What skills does SWE-bench Verified evaluate?
We take a deep dive into SWE-bench Verified, a prominent agentic coding benchmark. While one of the best public tests of AI coding agents, it is limited by its focus on simple bug fixes in familiar open-source repositories.

data insight · 3 min read
LLM providers offer a trade-off between accuracy and speed

data insight · 8 min read
Over 30 AI models have been trained at the scale of GPT-4

newsletter · 8 min read
Beyond benchmark scores: Analyzing o3-mini’s mathematical reasoning

data insight · 3 min read
Power requirements of leading AI supercomputers have doubled every 13 months
    
    
    
  
  
      
      
data insight · 1 min read
Private-sector companies own a dominant share of GPU clusters

data insight · 2 min read
The US hosts the majority of GPU cluster performance, followed by China

data insight · 1 min read
Acquisition costs of leading AI supercomputers have doubled every 13 months

data insight · 2 min read
The computational performance of leading AI supercomputers has doubled every nine months

update · 9 min read
What is Epoch?
Our director explains Epoch AI’s mission and how we decide our priorities. In short, we work on projects to understand the trajectory of AI, share this knowledge publicly, and inform important decisions about AI.

newsletter · 11 min read
GPQA Diamond: What’s left?

report · 35 min read
How many AI models will exceed compute thresholds?
We project how many notable AI models will exceed training compute thresholds. Model counts grow rapidly, from 10 above 1e26 FLOP by 2026 to over 200 by 2030.

data insight · 4 min read
Widespread adoption of new numeric formats took 3-4 years in past cycles

newsletter · 7 min read
Is AI already superhuman on FrontierMath?

newsletter · 8 min read
How fast can algorithms advance capabilities?

newsletter · 10 min read
How far can reasoning models scale?

newsletter · 10 min read
Where’s my ten minute AGI?

newsletter · 12 min read
The case for multi-decade AI timelines

paper · 4 min read
Trends in AI supercomputers
AI supercomputers double in performance every 9 months, cost billions of dollars, and require as much power as mid-sized cities. Companies now own 80% of all AI supercomputers, while governments’ share has declined.
    
    
    
  
  
      
      
data insight · 2 min read
LLM responses to benchmark questions are getting longer over time

data insight · 7 min read
The combined revenues of leading AI companies grew by over 9x in 2023-2024

newsletter · 4 min read
The real reason AI benchmarks haven’t reflected economic impacts

newsletter · 15 min read
Most AI value will come from broad automation, not from R&D

paper · 5 min read
GATE: Modeling the trajectory of AI and automation
We introduce a compute-centric model of AI automation and its economic effects, illustrating key dynamics of AI development. The model suggests large AI investments and subsequent economic growth.

update · 1 min read
FrontierMath competition: Setting benchmarks for AI evaluation
We are hosting a competition to establish rigorous human performance baselines for FrontierMath. With a prize pool of $10,000, your participation will contribute directly to measuring AI progress in solving challenging mathematical problems.

data insight · 5 min read
LLM inference prices have fallen rapidly but unequally across tasks

newsletter · 10 min read
What AI can currently do is not the story

report · 9 min read
Train once, deploy many: AI and increasing returns
AI's “train-once-deploy-many” advantage yields increasing returns: doubling compute more than doubles output by increasing models' inference efficiency and enabling more deployed inference instances.

data insight · 4 min read
Leading AI chip designs are used for around four years in frontier training

newsletter · 14 min read
The promise of reasoning models

data insight · 3 min read
Biology AI models are scaling 2-4x per year after rapid growth from 2019-2021
    
    
    
  
      
      
newsletter · 11 min read
AI progress is about to speed up

newsletter · 13 min read
Algorithmic progress likely spurs more spending on compute, not less

data insight · 7 min read
The stock of computing power from NVIDIA chips is doubling every 10 months

data insight · 1 min read
US models currently outperform non-US models

data insight · 1 min read
Models with downloadable weights currently lag behind the top-performing models

data insight · 1 min read
Accuracy increases with estimated training compute

newsletter · 22 min read
How much energy does ChatGPT use?

update · 3 min read
A more systematic and transparent AI benchmarking hub
We've overhauled our AI benchmarking infrastructure to provide more transparent, systematic, and up-to-date evaluations of AI model capabilities.

newsletter · 14 min read
What went into training DeepSeek-R1?

update · 2 min read
Announcing our expanded biology AI coverage
We've expanded our Biology AI Dataset, now covering 360+ models. Our analysis reveals rapid scaling from 2017-2021, followed by a notable slowdown in biological model development.

newsletter · 16 min read
AGI could drive wages below subsistence level

update · 2 min read
Clarifying the creation and use of the FrontierMath benchmark
We clarify that OpenAI commissioned Epoch AI to produce 300 math questions for the FrontierMath benchmark. They own these and have access to the statements and solutions, except for a 50-question holdout set.
    
    
    
  
  
      
      
data insight · 4 min read
Chinese language models have scaled up more slowly than their global counterparts

newsletter · 12 min read
How has DeepSeek improved the Transformer architecture?

update · 9 min read
2024 impact report
Epoch's Impact Report for 2024 highlights influential research on AI's trajectory, the launch of FrontierMath, an expanded AI data hub, engagement with leaders, $7M raised, and more.

data insight · 3 min read
Frontier open models may surpass 10²⁶ FLOP of training compute before 2026

newsletter · 16 min read
The economic consequences of automating remote work

data insight · 4 min read
Training compute growth is driven by larger clusters, longer training, and better hardware

newsletter · 9 min read
Moravec’s paradox and its implications

newsletter · 10 min read
How do mixture-of-experts models compare to dense models in inference?

newsletter · 8 min read
Frontier language models have become much smaller

update · 1 min read
Announcing Gradient Updates: Our new weekly newsletter
We are announcing Gradient Updates, Epoch AI’s new weekly newsletter focused on timely and important questions in AI.

newsletter · 8 min read
What did US export controls mean for China’s AI capabilities?

report · 7 min read
What is the future of AI in mathematics? Interviews with leading mathematicians
How will AI transform mathematics? Fields Medalists and other leading mathematicians discuss whether they expect AI to automate advanced math research.
    
    
    
  
  
      
      
update · 6 min read
Introducing the distributed training interactive simulator
We introduce and walk you through an interactive tool that simulates distributed training runs of large language models under ideal conditions.

update · 2 min read
Introducing Epoch AI's AI benchmarking hub
We are launching the AI Benchmarking Hub: a platform presenting our evaluations of leading models on challenging benchmarks, with analysis of trends in AI capabilities.

report · 15 min read
Hardware failures won’t limit AI scaling
Hardware failures won't limit AI training scale: GPU memory checkpointing enables training with millions of GPUs despite failures.

paper · 6 min read
FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI
FrontierMath is a new benchmark of expert-level math problems designed to measure AI's mathematical abilities. See how leading AI models perform against the collective mathematics community.

report · 37 min read
How far behind are open models?
Analysis of open vs. closed AI models reveals the best open model today matches closed models in performance and training compute, but with a one-year lag.

paper · 14 min read
Data movement bottlenecks to large-scale model training: Scaling past 1e28 FLOP
Data movement bottlenecks limit LLM scaling beyond 2e28 FLOP, with a "latency wall" at 2e31 FLOP. We may hit these limits in ~3 years. Aggressive batch size scaling could potentially overcome them.

data insight · 1 min read
AI training cluster sizes increased by more than 20x since 2016

data insight · 1 min read
Performance per dollar improves around 30% each year

data insight · 1 min read
The computational performance of machine learning hardware has doubled every 2.3 years

data insight · 1 min read
The NVIDIA A100 has been the most popular hardware for training notable machine learning models

data insight · 1 min read
Leading ML hardware becomes 40% more energy-efficient each year

data insight · 1 min read
Performance improves 13x when switching from FP32 to tensor-INT8

update · 1 min read
Introducing Epoch AI's machine learning hardware database
Our new database covers hardware used to train AI models, featuring over 100 accelerators (GPUs and TPUs) across the deep learning era.
    
    
    
  
  
      
      
data insight · 8 min read
Leading AI companies have hundreds of thousands of cutting-edge AI chips

data insight · 1 min read
The power required to train frontier AI models is doubling annually

report · 10 min read
Interviewing AI researchers on automation of AI R&D
AI could speed up AI R&D, especially in coding and debugging. We explore predictions on automation and researchers' suggestions for AI R&D evaluations.

report · 83 min read
Can AI scaling continue through 2030?
We investigate four constraints to scaling AI training: power, chip manufacturing, data, and latency. We predict 2e29 FLOP runs will be feasible by 2030.

data insight · 1 min read
The length of time spent training notable models is growing

data insight · 1 min read
Language models compose the large majority of large-scale AI models

data insight · 1 min read
Most large-scale models are developed by US companies

data insight · 1 min read
The pace of large-scale model releases is accelerating

data insight · 1 min read
Almost half of large-scale models have published, downloadable weights

data insight · 1 min read
The size of datasets used to train language models doubles approximately every six months

data insight · 1 min read
Training compute costs are doubling every eight months for the largest AI models

data insight · 1 min read
The training compute of notable AI models has been doubling roughly every six months

data insight · 1 min read
Training compute has scaled up faster for language than vision

update · 1 min read
Announcing Epoch AI’s data hub
We're launching a hub for data and visualizations, featuring our databases on notable and large-scale AI models for users and researchers.
    
    
    
  
  
      
      
paper · 6 min read
Will we run out of data? Limits of LLM scaling based on human-generated data
If trends continue, language models will fully utilize the stock of human-generated public text between 2026 and 2032.

paper · 4 min read
How much does it cost to train frontier AI models?
The cost of training top AI models has grown 2-3x annually for the past eight years. By 2027, the largest models could cost over a billion dollars.

report · 20 min read
Training compute of frontier AI models grows by 4-5x per year
Our expanded AI model database shows that training compute grew 4-5x/year from 2010 to 2024, with similar trends in frontier and large language models.

paper · 10 min read
Do the returns to software R&D point towards a singularity?
Returns to R&D are key to growth dynamics and AI development. Our paper introduces new empirical techniques to estimate this vital parameter.

paper · 4 min read
Chinchilla scaling: A replication attempt
We replicate Hoffmann et al.’s parametric scaling law estimates, finding issues and providing better-fitting estimates that align with their other methods.

report · 16 min read
Tracking large-scale AI models
We present a dataset of 81 large-scale models, from AlphaGo to Gemini, developed across 18 countries, at the leading edge of scale and capabilities.

report · 9 min read
Optimally allocating compute between inference and training
AI labs should spend comparable resources on training and inference, assuming they can flexibly balance compute between the two to maintain performance.

paper · 3 min read
Algorithmic progress in language models
Progress in pretrained language model performance outpaces expectations, occurring at a pace equivalent to doubling computational power every 5 to 14 months.

update · 10 min read
2023 impact report
In 2023, Epoch published nearly 20 reports on AI, added hundreds of models to our database, helped with government policies, and raised over $7 million.

report · 23 min read
Biological sequence models in the context of the AI directives
Our expanded database now includes biological sequence models, highlighting potential regulatory gaps and the growth of training compute in these models.

paper · 3 min read
How predictable is language model benchmark performance?
We investigate large language model performance, finding that compute-focused extrapolations are a promising way to forecast AI capabilities.
    
    
    
  
  
      
      
paper · 4 min read
Limits to the energy efficiency of CMOS microprocessors
How far can the energy efficiency of CMOS microprocessors be pushed before hitting physical limits? We find room for a further 50 to 1000x improvement.

paper · 2 min read
AI capabilities can be significantly improved without expensive retraining
While scaling compute is key to improving LLMs, post-training enhancements can offer gains equivalent to 5-20x more compute at less than 1% of the cost.

paper · 3 min read
Who is leading in AI? An analysis of industry AI research
Industry has emerged as a driving force in AI. We compare top companies on research impact, training runs, and contributions to algorithmic innovations.

report · 31 min read
Challenges in predicting AI automation
Economists propose various approaches to predicting AI's automation of valuable tasks, but disagreements persist, with no consensus on the best method.

report · 27 min read
Trends in machine learning hardware
FLOP/s performance in 47 ML hardware accelerators doubled every 2.3 years. Switching from FP32 to tensor-FP16 led to a further 10x performance increase.

update · 1 min read
Announcing Epoch AI's updated parameter, compute and data trends database
Our database now spans over 700 ML systems, tracking parameters, datasets, and training compute details for notable machine learning models.

paper · 11 min read
Explosive growth from AI: A review of the arguments
Our new article explores whether deployment of advanced AI systems could lead to growth rates ten times higher than those of today’s frontier economies.

report · 27 min read
Trading off compute in training and inference
We characterize techniques that induce a tradeoff between spending resources on training and inference, outlining their implications for AI governance.

report · 10 min read
The limited benefit of recycling foundation models
Reusing pretrained models can save on training costs, but it's unlikely to significantly boost AI capabilities beyond modest improvements.

update · 3 min read
Epoch AI and FRI mentorship program summer 2023
We’re launching the Epoch and FRI mentorship program for women, non-binary, and transgender people interested in AI forecasting.

report · 14 min read
Direct Approach interactive model
When could transformative AI be achieved? We present a simple, user-adjustable model of key inputs that forecasts the date TAI could be deployed.
    
    
    
  
  
      
      
viewpoint · 26 min read
A compute-based framework for thinking about the future of AI
AI’s potential to automate labor could alter the course of human history. The availability of compute is the most important factor driving progress in AI.

viewpoint · 1 min read
Please report your compute
Compute is essential for AI performance, yet often underreported. Adopting reporting norms would improve research, forecasts, and policy decisions.

report · 10 min read
The Direct Approach
We propose a method using neural scaling laws to estimate the compute needed to train AI models to reach human-level performance on various tasks.

paper · 2 min read
Power laws in speedrunning and machine learning
Our model suggests ML benchmarks aren’t near saturation. While large improvements are rare, we find 1 OOM gains happen roughly once in every 50 instances.

update · 1 min read
Announcing Epoch AI’s dashboard of key trends and figures in machine learning
Our dashboard provides key data from our research on machine learning and is a valuable resource for understanding the present and future of the field.

update · 1 min read
2022 impact report
Our impact report for 2022.

report · 66 min read
Trends in the dollar training cost of machine learning systems
How much does it cost to train AI models? Looking at 124 ML systems from between 2009 and 2022, we find the cost has grown by approximately 0.5 OOM/year.

report · 6 min read
Scaling laws literature review
I have collected a database of scaling laws for different tasks and architectures, and reviewed dozens of papers in the scaling law literature.

update · 1 min read
An interactive model of AI takeoff speeds
We have developed an interactive website showcasing a new model of AI takeoff speeds.

report · 16 min read
Literature review of transformative artificial intelligence timelines
We summarize and compare several models and forecasts predicting when transformative AI will be developed.

paper · 2 min read
Revisiting algorithmic progress
Examining over 100 computer vision models, we find that every 9 months, better algorithms contribute the equivalent of a doubling of compute budgets.
    
    
    
  
  
      
      
        paper
      
      
      
      
         · 
        
         3 min read
        
      
    
    
      Will we run out of ML data? Evidence from projecting dataset size trends
    
    
    
      We project dataset growth in language and vision domains, estimating future limits to training by evaluating the availability of unlabeled data over time.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         12 min read
        
      
    
    
      The longest training run
    
    
    
      Training runs of large ML systems will likely last less than 14-15 months, as shorter runs starting later use better hardware and algorithms.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         22 min read
        
      
    
    
      A time-invariant version of Laplace’s rule
    
    
    
      We discuss estimating event probabilities with past data, addressing issues with Laplace’s rule and proposing a modification to improve accuracy.
    
    
    
  
  
      
      
        paper
      
      
      
      
         · 
        
         2 min read
        
      
    
    
      Machine learning model sizes and the parameter gap
    
    
    
      Since 2018, the size of ML models has been growing 10 times faster than before. Around 2020, model sizes saw a significant jump, increasing by 1OOM.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         14 min read
        
      
    
    
      Trends in GPU price-performance
    
    
    
      Improvements in hardware are central to AI progress. Using data on 470 GPUs from 2006 to 2021, we find that FLOP/s per dollar doubles every ~2.5 years.
    
    
    
  
  
      
      
        update
      
      
      
      
         · 
        
         4 min read
        
      
    
    
      Announcing Epoch AI: A research initiative investigating the road to transformative AI
    
    
    
      We are a new research initiative forecasting developments in AI. Come join us!
    
    
    
  
  
      
      
        paper
      
      
      
      
         · 
        
         7 min read
        
      
    
    
      Compute trends across three eras of machine learning
    
    
    
      We’ve compiled a comprehensive dataset of the training compute of AI models, providing key insights into AI development.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         24 min read
        
      
    
    
      Estimating training compute of deep learning models
    
    
    
      We describe two approaches for estimating the training compute of Deep Learning systems, by counting operations and looking at GPU time.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         8 min read
        
      
    
    
      What’s the backward-forward FLOP ratio for neural networks?
    
    
    
      Determining the backward-forward FLOP ratio for neural networks, to help calculate their total training compute.
    
    
    
  
  
      
      
        report
      
      
      
      
         · 
        
         9 min read
        
      
    
    
      How to measure FLOP for neural networks empirically?
    
    
    
      Computing the utilization rate for multiple Neural Network architectures.