Not all AI progress comes from throwing more and better hardware at the problem. Improvements to algorithms, data quality, and training techniques can dramatically expand what AI systems are capable of, enabling models to reach the same performance with less computation. Epoch tracks these compute efficiency gains, often called algorithmic progress, over time, examining how quickly they are occurring, what is driving them, and what they mean for the pace of future AI progress.



Can Chinese and open model companies compete with the frontier through e.g. distillation and talent?

An opinionated guide to “algorithmic progress” and why it matters

We interviewed 18 people across RL environment startups, neolabs, and frontier labs about the state of the field and where it's headed.

The existing debate rests on data and assumptions that are shakier than most people realize. To make progress, we need better evidence, and experiments are the best way to get it on the margin.

OpenAI focused on scaling post-training on a smaller model
Many multi-agent setups are based on fancy prompts, but this is unlikely to persist
Reasoning models were as big an improvement as the Transformer, at least on some benchmarks
This week's issue is a guest post by Henry Josephson, who is a research manager at UChicago's XLab and an AI governance intern at Google DeepMind.
Available evidence suggests that rapid growth in reasoning training can continue for a year or so.

AI reasoning models will achieve superhuman performance in math and coding, yet their economic applications will lag behind, limiting real-world impact.
Algorithmic progress in AI may not reduce compute spending—instead, it could drive higher investment as efficiency unlocks new opportunities.
This Gradient Updates issue explores DeepSeek-R1's architecture, training cost, and pricing, showing how it rivals OpenAI's o1 at 30x lower cost.
This Gradient Updates issue goes over the major changes that went into DeepSeek's most recent model.

Epoch AI presents its first podcast, exploring AI scaling trends and discussing power demands, chip production, data needs, and how continued progress could transform labor markets and potentially accelerate global economic growth to unprecedented levels.
This Gradient Updates issue explores how mixture-of-experts models compare to dense models in inference, focusing on costs, efficiency, and decoding dynamics.
In this Gradient Updates weekly issue, Ege discusses how frontier language models have unexpectedly reversed course on scaling, with current models an order of magnitude smaller than GPT-4.

The returns to R&D are crucial in determining the dynamics of growth and potentially the pace of AI development. Our new paper offers new empirical techniques and estimates for this key parameter.

Progress in pretrained language model performance surpasses what we’d expect from merely increasing computing resources, occurring at a pace equivalent to doubling computational power every 5 to 14 months.
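
As a rough illustration of what such a doubling time implies, here is a minimal sketch of the arithmetic. The 5- and 14-month figures come from the summary above; the 3-year horizon is an arbitrary example, not a result from the paper.

```python
def effective_compute_multiplier(years: float, doubling_time_months: float) -> float:
    """Factor by which a fixed compute budget effectively grows after `years`
    of algorithmic progress with the given doubling time (in months)."""
    return 2 ** (12 * years / doubling_time_months)

# Compare the fast and slow ends of the 5-14 month range over a 3-year horizon.
for months in (5, 14):
    print(f"doubling every {months} months -> "
          f"{effective_compute_multiplier(3, months):.1f}x over 3 years")
```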

While scaling compute for training is key to improving LLM performance, some post-training enhancements can offer gains equivalent to training with 5 to 20x more compute at less than 1% the cost.
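
A toy illustration of such a compute-equivalent gain, restating the 5-20x and under-1% figures above; the size of the base training run is arbitrary and not drawn from the report.

```python
# An enhancement costing a small fraction of the base run that matches the
# performance of a much larger run.
base_compute = 1e24                         # FLOP of the base training run (arbitrary)
enhancement_cost = 0.01 * base_compute      # "less than 1% of the cost"
for gain in (5, 20):
    print(f"{gain}x gain: spend {enhancement_cost:.1e} extra FLOP, "
          f"match a {gain * base_compute:.1e} FLOP training run")
```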

We explore several techniques that induce a tradeoff between spending more resources on training or on inference and characterize the properties of this tradeoff. We outline some implications for AI governance.

While reusing pretrained models often saves training costs on large training runs, it is unlikely that model recycling will result in more than a modest increase in AI capabilities.

We develop a model for predicting record improvements in video game speedrunning and apply it to predicting machine learning benchmarks. This model suggests that machine learning benchmarks are not close to saturation, and that large sudden improvements are infrequent, but not ruled out.

We use a dataset of over a hundred computer vision models from the last decade to investigate how better algorithms and architectures have enabled researchers to use compute and data more efficiently. We find that every 9 months, the introduction of better algorithms contributes the equivalent of a doubling of compute budgets.
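
A rough sketch of how such an algorithmic doubling time combines with growth in physical compute budgets into a single effective-compute growth rate. The 9-month figure is from the study above; the 6-month budget doubling time is an illustrative assumption, not a result from the paper.

```python
ALG_DOUBLING_MONTHS = 9.0   # better algorithms (from the study above)
HW_DOUBLING_MONTHS = 6.0    # assumed growth in compute budgets (illustrative)

# Exponential growth rates (1 / doubling time) add for independent factors.
combined = 1 / (1 / ALG_DOUBLING_MONTHS + 1 / HW_DOUBLING_MONTHS)
print(f"Effective compute doubles every {combined:.1f} months")  # ~3.6
```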

Determining the backward-forward FLOP ratio for neural networks, to help calculate their total training compute.
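
A minimal sketch of how a backward-forward FLOP ratio feeds into a training compute estimate. A ratio of roughly 2 and a forward cost of about 2 FLOP per parameter per token are common rules of thumb (yielding the familiar C ≈ 6ND); they are used here as assumptions rather than as figures quoted from the report.

```python
def training_flop(params: float, tokens: float, backward_forward_ratio: float = 2.0) -> float:
    """Estimate total training compute from a backward:forward FLOP ratio."""
    forward = 2 * params * tokens                # ~2 FLOP per parameter per token
    backward = backward_forward_ratio * forward  # backward pass relative to forward
    return forward + backward

# Example: a hypothetical 70B-parameter model trained on 1.4T tokens.
print(f"{training_flop(70e9, 1.4e12):.2e} FLOP")  # ~5.9e23
```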