Investigating the trajectory of AI for the benefit of society
Epoch AI is a research institute investigating key trends and questions that will shape the trajectory and governance of AI
Featured Work
Essential publications and resources.
The stock of computing power from NVIDIA chips is doubling every 10 months
Total available computing power from NVIDIA chips has grown by approximately 2.3x per year since 2019, enabling the training of ever-larger models. The Hopper generation of NVIDIA AI chips currently accounts for 77% of the total computing power across all of NVIDIA's AI hardware. At this pace of growth, older generations tend to contribute less than half of cumulative compute around four years after their introduction.
Note that this analysis does not include TPUs or other specialized AI accelerators, for which less data is available. TPUs may provide total computing power comparable to that of NVIDIA chips.
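The growth rates above can be converted between an annual multiplier and a doubling time; the 2.3x/year figure and the 10-month doubling time describe the same trend. A minimal sketch of that conversion (the function name is ours, for illustration):

```python
import math

def doubling_time_months(annual_growth_factor: float) -> float:
    """Convert an annual growth multiplier into a doubling time in months."""
    return 12 * math.log(2) / math.log(annual_growth_factor)

# 2.3x/year growth in the NVIDIA compute stock -> ~10-month doubling time
print(round(doubling_time_months(2.3)))     # 10
# 4.7x/year growth in training compute -> ~5.4-month doubling time
print(round(doubling_time_months(4.7), 1))  # 5.4
```

The same function recovers both headline doubling times on this page from their stated annual growth rates.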
Over 20 AI models have been trained at the scale of GPT-4
The largest AI models today are trained with over 10²⁵ floating-point operations (FLOP) of compute. The first model trained at this scale was GPT-4, released in March 2023. Since then, we have identified over 25 publicly announced AI models from 11 different AI developers that we believe to exceed the 10²⁵ FLOP training compute threshold.
Training a model at this scale costs tens of millions of dollars with current hardware. Despite the high cost, we expect a proliferation of such models: an average of roughly two models over this threshold were announced every month during 2024. Models trained at this scale will be subject to additional requirements under the EU AI Act, which comes into force in August 2025.
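To see why the 10²⁵ FLOP threshold corresponds to frontier-scale runs, training compute can be estimated as chip count × peak FLOP/s per chip × utilization × training time. The figures below (25,000 H100-class GPUs at roughly 10¹⁵ FLOP/s, 40% utilization, a 90-day run) are illustrative assumptions, not reported details of any specific model:

```python
# Rough training-compute estimate: chips x peak FLOP/s x utilization x seconds.
# All numbers below are illustrative assumptions, not figures for a real model.
num_gpus = 25_000               # assumed H100-class GPU count
peak_flops_per_gpu = 1e15       # ~1e15 FLOP/s per GPU (order of magnitude)
utilization = 0.40              # assumed model FLOP utilization
training_seconds = 90 * 86_400  # assumed 90-day training run

total_flop = num_gpus * peak_flops_per_gpu * utilization * training_seconds
print(f"{total_flop:.1e} FLOP")  # 7.8e+25 FLOP, above the 1e25 threshold
```

Under these assumptions the run lands at roughly 8×10²⁵ FLOP, comfortably above the threshold; halving any single factor would still leave it well over 10²⁵.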
The training compute of notable AI models is doubling roughly every five months
Since 2010, the training compute used to create AI models has been growing at a rate of 4.7x per year. Most of this growth comes from increased spending, although improvements in hardware have also played a role.
Leading AI companies have hundreds of thousands of cutting-edge AI chips
The world's leading tech companies—Google, Microsoft, Meta, and Amazon—own AI computing power equivalent to hundreds of thousands of NVIDIA H100s. This compute is used both for their in-house AI development and for cloud customers, including many top AI labs such as OpenAI and Anthropic. Google may have access to the equivalent of over one million H100s, mostly from their TPUs. Microsoft likely has the single largest stock of NVIDIA accelerators, with around 500k H100-equivalents.
A large share of AI computing power is collectively held by groups other than these four, including other cloud companies such as Oracle and CoreWeave, compute users such as Tesla and xAI, and national governments. We highlight Google, Microsoft, Meta, and Amazon as they are likely to have the most compute, and there is little public data for others.
The power required to train frontier AI models is doubling annually
Training frontier models requires a large and growing amount of power for GPUs, servers, cooling, and other equipment. This growth is driven mainly by rising GPU counts; power draw per GPU is also increasing, but only by a few percent per year.
Training compute has grown even faster, at around 4x per year. However, hardware efficiency (a 12x improvement over the last ten years), the adoption of lower-precision number formats (an 8x improvement), and longer training runs (a 4x increase) together account for a roughly 2x/year decrease in power requirements relative to training compute.
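These factors multiply out consistently: 12x (hardware efficiency) × 8x (lower precision) × 4x (longer runs) is 384x over ten years, or about 1.8x per year, matching the roughly 2x/year gap between 4x/year compute growth and annual power doubling. A quick check:

```python
# Check that the stated efficiency factors compound to ~2x/year over a decade.
hardware_gain = 12   # hardware efficiency improvement over ten years
precision_gain = 8   # gain from lower-precision number formats
duration_gain = 4    # gain from longer training runs

total_gain = hardware_gain * precision_gain * duration_gain  # 384x per decade
annual_gain = total_gain ** (1 / 10)                         # ~1.81x per year
print(total_gain, round(annual_gain, 2))                     # 384 1.81
```

An annualized 1.8x efficiency gain is close to the 2x/year implied by compute growing 4x/year while power doubles, so the decomposition is internally consistent.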
Partner with Epoch AI
We're proud to consult with select stakeholders on projects aligned with our mission. Considering commissioning work from Epoch AI?
Contact us