The computational performance of leading AI supercomputers has doubled every nine months

Since 2019, the computational power of leading AI supercomputers has grown by 2.5x per year, enabling both faster training and far larger models. GPT-3’s original two-week training run in 2020 would now take under two hours on xAI’s Colossus.

This growth was enabled by two factors: the number of chips deployed per cluster has increased by 1.6x per year, and performance per chip has also improved by 1.6x annually.
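These two factors compound multiplicatively: 1.6 × 1.6 ≈ 2.6x per year, in line with the fitted overall rate of 2.5x (growth rates estimated by separate regressions need not multiply exactly).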

Published: April 30, 2025

Last updated: April 30, 2025

Learn more

Data

Data come from our AI Supercomputers dataset, which collects information on 726 supercomputers with dedicated AI accelerators, spanning from 2010 to the present. We estimate that these supercomputers represent approximately 10-20% (by performance) of all AI chips produced through March 2025. We focus on the 501 AI supercomputers that became operational in 2019 or later, since these are most relevant to modern AI training.
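For illustration, restricting to post-2019 systems is a simple filter. The sketch below is hypothetical: the file name and the column `first_operational` are placeholders, not the dataset's actual schema.

```python
import pandas as pd

# Hypothetical file and column names, for illustration only.
df = pd.read_csv("ai_supercomputers.csv", parse_dates=["first_operational"])

# Keep systems that became operational in 2019 or later, the period
# most relevant to modern AI training (501 systems in our dataset).
modern = df[df["first_operational"].dt.year >= 2019]
```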

For more information about the data, see Pilz et al., which describes the supercomputers dataset and analyzes key trends, and the dataset documentation.

Analysis

We measure theoretical computational performance as the number of 16-bit floating-point operations per second (FLOP/s), based on reported hardware specifications. Performance values are either taken directly from reported data or calculated from processor specifications and chip counts. After excluding supercomputers without known 16-bit performance data, we retain 482 observations. From these, we identify 57 “leading” AI supercomputers, defined as those that ranked among the top 10 most powerful supercomputers at the time they first became operational. We then run log-linear regressions to estimate annual growth rates in computational performance, number of chips, and per-chip performance for these leading supercomputers.
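As a rough sketch of this procedure, the core steps are a top-10 filter and a log-linear fit. This is illustrative rather than our exact pipeline, and the column names `first_operational` and `flops_16bit` are hypothetical placeholders:

```python
import numpy as np
import pandas as pd
from scipy import stats

def mark_leading(df: pd.DataFrame, k: int = 10) -> pd.DataFrame:
    """Flag systems ranked in the top k by performance when they first came online."""
    df = df.sort_values("first_operational").reset_index(drop=True)
    flags = []
    for i in range(len(df)):
        online = df.loc[:i, "flops_16bit"]  # systems operational by this date
        stronger = (online > df.loc[i, "flops_16bit"]).sum()
        flags.append(stronger < k)  # in the top k if fewer than k stronger systems
    return df.assign(leading=flags)

def annual_growth(df: pd.DataFrame, col: str = "flops_16bit") -> float:
    """Regress log10 performance on time; return the implied annual growth factor."""
    t = df["first_operational"]
    years = t.dt.year + t.dt.dayofyear / 365.25  # dates as decimal years
    fit = stats.linregress(years, np.log10(df[col]))
    return 10 ** fit.slope  # e.g. 2.5 means 2.5x per year

# Usage: growth in total performance among leading systems.
# leading = mark_leading(modern)
# print(annual_growth(leading[leading["leading"]]))
```

Exponentiating the fitted slope of log10 performance against time converts it into a multiplicative annual growth factor.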

To estimate confidence intervals, we first calculate the standard error of the slope from the regression line. Using this standard error, we construct a 90% confidence interval by adding and subtracting 1.645 times the standard error from the estimated slope. Confidence intervals are rounded to the smallest number of significant figures needed to clearly distinguish the estimate from the upper and lower bounds.
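Concretely, the interval construction might look like the following, building on the same log-linear fit (a minimal sketch, not our exact code):

```python
from scipy import stats

def growth_ci(years, log10_values, z: float = 1.645):
    """Point estimate and 90% confidence interval for the annual growth factor."""
    fit = stats.linregress(years, log10_values)
    lo = fit.slope - z * fit.stderr  # z = 1.645 for a two-sided 90% interval
    hi = fit.slope + z * fit.stderr
    # Exponentiate to convert log10 slopes into multiplicative annual factors.
    return 10 ** fit.slope, (10 ** lo, 10 ** hi)
```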

Our results are as follows:

| Chip quantity | Chip performance | Total performance |
|---------------|------------------|-------------------|
| 1.6x per year (1.5 - 1.8) | 1.6x per year (1.5 - 1.7) | 2.5x per year (2.4 - 2.7) |

Note that AI training used different numerical precisions throughout our study period. While 32-bit precision was still common in 2019, most training in the early 2020s likely used 16-bit formats, and by 2025 some AI training workloads had begun to move to 8-bit precision. When we instead consider the highest performance available across these three precisions, we find the following trends:

| Chip quantity | Chip performance | Total performance |
|---------------|------------------|-------------------|
| 1.5x per year (1.3 - 1.6) | 1.8x per year (1.6 - 2.0) | 2.6x per year (2.3 - 2.8) |
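The best-available series underlying this table can be derived by taking each system's maximum across the per-precision figures, then rerunning the same regressions on that series. A minimal sketch, again with hypothetical column names:

```python
# Hypothetical per-precision columns; NaN where a figure is unreported.
precision_cols = ["flops_32bit", "flops_16bit", "flops_8bit"]
df["flops_best"] = df[precision_cols].max(axis=1)  # pandas ignores NaN by default
```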

Find an explanation and discussion of numerical precision in Appendix B.8.

Assumptions

We chose to focus on AI supercomputers that became operational in 2019 or later. This cutoff was set before data collection, in order to cover the period with the best data availability and the greatest relevance to modern machine learning practices. Some large clusters built before 2019 for traditional scientific research rather than AI computing would appear among the top 10 if the start date were earlier. We assume these are not indicative of current growth trends, but including them would likely yield slower growth rate estimates, especially in the first half of the study period.

Explore this data

The AI supercomputers dataset will be released in early May. Visit this page to be notified about its release.