Machine Learning Trends

Our ML Trends dashboard offers curated key numbers, visualizations, and insights that showcase the significant growth and impact of artificial intelligence.

Last updated on Jan 13, 2025

Display growth values in:

Training compute

Likely

67303.0 x/year

The training compute of frontier AI models has grown by since 2020.

90% confidence interval: to .

Computational performance

Likely

1.35 x/year

The amount of FLOP/s for GPUs in FP32 precision grows by . A similar trend is observed for FP16.

90% confidence interval: to .

Algorithmic improvements

Plausible

5.1 %/year

The physical compute required to achieve a given performance in language models is declining at a rate of .

90% confidence interval: to .

Training cost

Likely

3086.6 x/year

The cost in USD of training frontier models has grown by per year since 2020.

90% confidence interval: to .

Hardware acquisition cost

Likely

297.6 x/year

The acquisition cost in USD of the hardware used to train frontier AI models has grown by since 2016.

90% confidence interval: to .

Compute Trends

Frontier training compute

Likely

67303.0 x

The training compute of frontier AI models has grown by since 2020.

90% confidence interval: to .

Power Usage

Likely

249.4 x

The power required to train frontier AI models has grown by since 2020.

90% confidence interval: to .

Report

Training compute of frontier AI models grows by 4-5x per year

Our expanded AI model database shows that the compute used to train recent models grew 4-5x yearly from 2010 to May 2024. We find similar growth in frontier models, recent large language models, and models from leading companies.

Most compute used in a training run

Plausible

3.5e26 FLOP

The total training compute for the final training run of Grok 3 is estimated at 3.5e26 FLOP.

Data Trends

Language training dataset size

Likely

3.7 x

The training dataset size for language models has grown by since 2010.

90% confidence interval: to .

When will the largest training runs use all public human-generated text?

Plausible

2028

The median projected year in which most of the effective stock of publicly available human-generated text will be used in a training run is 2028.

90% confidence interval: 2026 to 2033.

Paper

Will we run out of data? Limits of LLM scaling based on human-generated data

We estimate the stock of human-generated public text at around 300 trillion tokens. If trends continue, language models will fully utilize this stock between 2026 and 2032, or even earlier if intensely overtrained.

Largest training dataset used to train an LLM

Uncertain

> 30 trillion tokens

Llama 4 models were trained using a collection of over 30 trillion tokens from text, image, and video datasets. This makes them the AI models with the largest publicly confirmed datasets.

Stock of data on the internet

Plausible

510 trillion tokens

The amount of tokens in the indexed web, the portion of the web that is publicly accessible from search engines, is estimated at 510 trillion tokens.

95% confidence interval: 130 trillion tokens to 2100 trillion tokens.

Hardware Trends

Computational performance

Likely

1.35 x

The amount of FLOP/s for GPUs in FP32 precision is growing at . A similar trend is observed for FP16.

90% confidence interval: to .

Lower-precision number formats

Plausible

9 x

The average performance gain in FLOP/s from switching from FP32 to tensor-FP16 is 9x.

Memory capacity

Likely

1.2 x

DRAM capacity (Byte) is growing by .

90% confidence interval: to .

Memory bandwidth

Likely

1.18 x

DRAM bandwidth in Byte/s is growing by .

90% confidence interval: to .

Report

Trends in Machine Learning Hardware

FLOP/s performance in 47 ML hardware accelerators doubled every 2.3 years. Switching from FP32 to tensor-FP16 led to a further 10x performance increase. Memory capacity and bandwidth doubled every 4 years.

Highest performing GPU in Tensor-FP16

Likely

2.25e15 FLOP/s

The highest performance of a GPU in Tensor-FP16, from the NVIDIA B200 SXM, is 2.25e15 FLOP/s.

Highest performing GPU in INT8

Likely

4.5e15 OP/s

The highest performance of a GPU in INT8, from the NVIDIA B200 SXM, is 4.5e15 OP/s.

Algorithmic Progress

Compute-efficiency in language models

Plausible

5.1 %

The physical compute required to achieve a given performance in language models is declining at a rate of .

90% confidence interval: to .

Compute-efficiency in computer vision models

Plausible

331.1 %

The physical compute required to achieve a given performance in computer vision models is declining at a rate of .

95% confidence interval: to .

Contribution of algorithmic innovation

Plausible

35%

The improvements to compute efficiency explain roughly 35% of performance improvements in language modeling since 2014, vs 65% explained by increases in scale.

Paper

Algorithmic Progress in Language Models

Progress in language model performance surpasses what we’d expect from merely increasing computing resources, occurring at a pace equivalent to doubling computational power every 5 to 14 months.

Chinchilla scaling laws

Plausible

20 tokens per parameter

The ratio of data to parameters to achieve compute-optimal scaling for LLMs is 20 tokens per parameter.

Investment Trends

Training cost

Likely

3086.6 x

The cost in USD of training frontier models has grown by per year since 2020.

90% confidence interval: to .

Hardware acquisition cost

Likely

297.6 x

The acquisition cost in USD of the hardware used to train frontier AI models has grown by since 2016.

90% confidence interval: to .

Paper

How Much Does It Cost to Train Frontier AI Models?

The cost of training frontier AI models has grown by a factor of 2 to 3x per year for the past eight years, suggesting that the largest models will cost over a billion dollars by 2027.

Most expensive AI model

Uncertain

$460 million

The total amortized cost of developing Grok-4, including hardware, electricity, staff compensation, and preliminary experiments, is estimated at $460 million USD.

Most Expensive AI Training Cluster

Uncertain

$3 billion

The acquisition cost of the hardware to train Grok-3, including GPUs, other server components, and networking, is estimated at $3 billion USD. The hardware used to train Grok 4 may have been more expensive.

Biological Models

Training compute

Likely

8.7 x

The training compute of biological sequence models has been growing by since 2018.

Key DNA sequence database

Likely

8.3 x

The number of sequences stored in GenBank (INSDC) has been growing by between 2022 and 2023.

Report

Biological Sequence Models in the Context of the AI Directives

The expanded Epoch database now includes biological sequence models, revealing potential regulatory gaps in the White House’s Executive Order on AI and the growth of the compute used in their training.

Most compute-intensive biological sequence model

Likely

1.1e24 FLOP

The training compute for ESM3 (98B), the most compute-intensive biological sequence model to date, is estimated at 1.1e24 FLOP.

Protein sequence data

Uncertain

~7 billion entries

The number of unique entries across protein sequence databases is estimated at ~7 billion entries.

Acknowledgements

We thank Tom Davidson, Lukas Finnveden, Charlie Giattino, Zach Stein-Perlman, Misha Yagudin, Jai Vipra, Patrick Levermore, Carl Shulman, Ben Bucknall and Daniel Kokotajlo for their feedback.

Several people have contributed to the design and maintenance of this dashboard, including Jaime Sevilla, Pablo Villalobos, Anson Ho, Tamay Besiroglu, Ege Erdil, Ben Cottier, Matthew Barnett, David Owen, Robi Rahman, Lennart Heim, Marius Hobbhahn, David Atkinson, Keith Wynroe, Christopher Phenicie, Nicole Maug, Aleksandar Kostovic, Alex Haase, Robert Sandler, Edu Roldan and Andrew Lucas.

Use this work

Epoch AI’s work is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.

Cite this work as

      Epoch AI (2023), "Key Trends and Figures in Machine Learning". Published online at epoch.ai. Retrieved from: 'https://epoch.ai/trends' [online resource]
    

BibTeX citation

  @misc{epoch2023aitrends,
    title="Key Trends and Figures in Machine Learning",
    author={{Epoch AI}},
    year=2023,
    url={https://epoch.ai/trends},
    note={Accessed: }
  }

Epoch AI’s work is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.

Cite this work as

                
                  Epoch AI (2023), "Key Trends and Figures in Machine Learning". Published online at epoch.ai. Retrieved from: 'https://epoch.ai/trends' [online resource]

BibTeX citation

  
  @misc{epoch2023aitrends,
    title={Key Trends and Figures in Machine Learning},
    author={{Epoch AI}},
    year={2023},
    url={https://epoch.ai/trends},
    note={Accessed: }
  }
  

              

If you spot an error or would like to provide feedback, please reach out at .