AI Supercomputers

Our database of over 500 supercomputers (also known as computing clusters) tracks large hardware facilities for AI training and inference and maps them across the globe.

By Konstantin Pilz, Robi Rahman, James Sanders and Lennart Heim

Last updated July 15, 2025

Disclaimer: Our dataset covers an estimated 10–20% of existing global aggregate AI supercomputer performance as of March 2025. Planned systems are subject to change and carry inherently lower confidence. While coverage varies across companies, sectors, and hardware types due to uneven reporting, we believe the overall distribution remains broadly representative. Future country shares may change dramatically as exponential growth continues. Chinese systems are anonymized and their specifications are rounded.

Data insights

Selected insights from this dataset.


The computational performance of leading AI supercomputers has doubled every nine months

The computational performance of the leading AI supercomputers has grown by 2.5x annually since 2019. This has enabled vastly more powerful training runs: if 2020’s GPT-3 were trained on xAI’s Colossus, the original two-week training run could be completed in under two hours.

This growth was enabled by two factors: the number of chips deployed per cluster has increased by 1.6x per year, and performance per chip has also improved by 1.6x annually.
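These two factors compound: 1.6x more chips at 1.6x the per-chip performance yields roughly 2.5x more total performance each year, which corresponds to a doubling time of about nine months. A minimal sketch of the arithmetic, using the rounded growth rates quoted above:

import math

chip_count_growth = 1.6   # annual growth in chips per cluster (rounded figure from above)
per_chip_growth = 1.6     # annual growth in performance per chip (rounded figure from above)

total_growth = chip_count_growth * per_chip_growth    # ~2.56x per year, matching the ~2.5x figure
doubling_time = 12 / math.log2(total_growth)          # ~8.8 months, i.e. roughly nine months

print(f"combined annual growth: {total_growth:.2f}x")
print(f"doubling time: {doubling_time:.1f} months")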


Acquisition costs of leading AI supercomputers have doubled every 13 months

AI supercomputers have become increasingly expensive. Since 2019, the cost of the computing hardware for leading supercomputers has increased at a rate of 1.9x per year. In June 2022, the most expensive cluster was Oak Ridge National Laboratory’s Frontier, with a reported cost of $200M. Three years later, as of June 2025, the most expensive supercomputer is xAI’s Colossus, estimated to use over $7B of hardware.


Power requirements of leading AI supercomputers have doubled every 13 months

Leading AI supercomputers are becoming ever more energy-intensive, using more power-hungry chips in greater numbers. In January 2019, Summit at Oak Ridge National Laboratory had the highest power capacity of any AI supercomputer at 13 MW. Today, xAI’s Colossus supercomputer uses 280 MW, over 20x as much.

Colossus relies on mobile generators because the local grid has insufficient power capacity for so much hardware. In the future, we may see frontier models trained across geographically distributed supercomputers, similar to the training setup for Gemini 1.0, to mitigate the difficulty of delivering enormous amounts of power to a single location.


Private-sector companies own a dominant share of AI supercomputers

The private sector’s share of global AI computing capacity has grown from 40% in 2019 to 80% in 2025. Though many early leading supercomputers, such as Summit, were run by government and academic labs, the total installed computing power of public-sector supercomputers has increased at only 1.8x per year and has been rapidly outpaced by private-sector supercomputers, whose total computing power has grown at 2.7x per year. The rising economic importance of AI has spurred the private sector to build more and faster supercomputers for training and inference.

As of May 2025, the largest known public AI supercomputer, Lawrence Livermore’s El Capitan, achieves less than a quarter of the computational performance of the largest known industry AI supercomputer, xAI’s Colossus.
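As a rough consistency check, a minimal sketch of the share shift implied by these growth rates (the 40%, 80%, 1.8x, and 2.7x figures above are rounded, so the result is only approximate):

public_share_2019, private_share_2019 = 0.60, 0.40   # rounded 2019 shares from above
public_growth, private_growth = 1.8, 2.7             # rounded annual growth rates from above
years = 6                                            # 2019 to 2025

public_2025 = public_share_2019 * public_growth ** years
private_2025 = private_share_2019 * private_growth ** years
private_share_2025 = private_2025 / (public_2025 + private_2025)

# Prints roughly 88%, in the same ballpark as the ~80% figure given the rounding above.
print(f"implied private-sector share in 2025: {private_share_2025:.0%}")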


The US hosts the majority of AI supercomputers, followed by China

As of May 2025, the United States hosts about three-quarters of global AI supercomputer performance, with China in second place at 15%. Meanwhile, traditional high-performance computing leaders like Germany, Japan, and France now play marginal roles in the AI supercomputing landscape. This shift largely reflects the increased dominance of major technology companies, which are predominantly based in the United States.


FAQ

What is an AI supercomputer?

A supercomputer is a system of many processors (such as CPUs and GPUs) that can efficiently work together and achieve a high level of performance. When supercomputers use specialized AI chips and support large-scale AI training and deployment workloads, we refer to them as AI supercomputers. They are sometimes also referred to as “GPU clusters” or “AI datacenters”.

AI supercomputers are used for training or serving neural network models. Therefore, they typically support number formats favorable for AI training and inference, such as FP16 or INT8, contain compute units optimized for matrix multiplication, have high-bandwidth memory, and rely on AI accelerators rather than CPUs for most of their calculations. A more detailed definition can be found in our documentation and in section 2 of our paper.

How do you measure performance for AI supercomputers?

We provide the total computational rate of the hardware in the computing cluster, which is the performance of each ML hardware chip times the number of chips.
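For illustration, a minimal sketch of this calculation; the chip count and per-chip rate below are assumed, order-of-magnitude placeholder values, not entries from the dataset:

num_chips = 16_384            # hypothetical cluster size
per_chip_flop_s = 1.0e15      # assumed per-chip performance at 16-bit precision, order of magnitude only

cluster_flop_s = num_chips * per_chip_flop_s
print(f"total cluster performance: {cluster_flop_s:.2e} FLOP/s")   # ~1.6e19 FLOP/s under these assumptions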

Non-AI supercomputers are often evaluated using the LINPACK benchmark, which measures performance on dense FP64 linear algebra. However, AI workloads rely on smaller number representations such as FP16 or INT8, so benchmarks like LINPACK are not an ideal measure of their performance. Furthermore, most organizations do not measure or report their performance on LINPACK or other benchmarks such as MLPerf, so we do not record these.

Which types of organizations own AI supercomputers?

Historically, government and academic research organizations, such as Oak Ridge National Laboratory and China’s National Supercomputing Center in Wuxi (home of the Sunway TaihuLight system), have owned many of the top supercomputers. In recent years, however, most AI supercomputers have been owned by cloud computing providers such as AWS, Google, and Microsoft Azure, or by AI developers such as Meta and xAI.

How was the AI supercomputers dataset created?

The data was primarily collected from machine learning papers, publicly available news articles, press releases, and existing lists of supercomputers.

We created a list of potential supercomputers by using the Google Search API to search key terms like “AI supercomputer” and “GPU cluster” from 2019 to 2025, then used GPT-4o to extract any supercomputers mentioned in the resulting articles. We also added supercomputers from publicly available lists such as the Top500 and MLPerf, as well as from GPU rental marketplaces. For each potential supercomputer, we manually searched for public information such as the number and type of chips used, when it was first operational, reported performance, owner, and location. A detailed description of our methods can be found in the documentation and Appendix A of our paper.
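For illustration, a minimal sketch of what such a search-and-extract step could look like; this is not our actual collection code, and the API key, search engine ID, and prompt are hypothetical placeholders:

import requests
from openai import OpenAI

GOOGLE_API_KEY = "..."      # hypothetical Google Custom Search API key
SEARCH_ENGINE_ID = "..."    # hypothetical programmable search engine ID
client = OpenAI()           # assumes OPENAI_API_KEY is set in the environment

def search_articles(query: str, num: int = 10) -> list[str]:
    """Return result URLs for a query from the Google Custom Search JSON API."""
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": GOOGLE_API_KEY, "cx": SEARCH_ENGINE_ID, "q": query, "num": num},
    )
    resp.raise_for_status()
    return [item["link"] for item in resp.json().get("items", [])]

def extract_supercomputers(article_text: str) -> str:
    """Ask GPT-4o to list any AI supercomputers mentioned in an article."""
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": "List any AI supercomputers or GPU clusters mentioned below, "
                       "with owner, chip type, and chip count if stated:\n\n" + article_text,
        }],
    )
    return completion.choices[0].message.content

# Example usage: gather candidate article URLs for one of the key terms.
candidate_urls = search_articles('"AI supercomputer" OR "GPU cluster"')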

How do you estimate details like performance, cost, or power usage?

Performance is sometimes reported by the owner of the cluster, or in news reports. Otherwise, it is calculated based on the performance per chip of the hardware used in the cluster, times the number of chips.

Costs are sometimes reported by the owner or sponsor of the cluster. Otherwise, costs are estimated from the cost per chip of the hardware, times the number of chips, multiplied by adjustment factors for intra- and inter-server network hardware.

Power draw is sometimes reported by the owner or operator of the cluster. Otherwise, it is estimated from the power draw per chip, times the number of chips, multiplied by adjustment factors for other hardware and the power usage effectiveness (PUE) of the datacenter.
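For illustration, a minimal sketch of the cost and power estimates; every number below (chip count, per-chip cost and power, adjustment factors, PUE) is an assumed placeholder, not a value used in the dataset:

num_chips = 16_384
cost_per_chip_usd = 30_000     # assumed hardware cost per accelerator
power_per_chip_w = 700         # assumed accelerator power draw in watts

network_overhead = 1.5         # assumed adjustment for intra- and inter-server networking hardware
other_hardware_overhead = 1.3  # assumed adjustment for CPUs, storage, and other server components
pue = 1.2                      # assumed datacenter power usage effectiveness

estimated_cost_usd = num_chips * cost_per_chip_usd * network_overhead
estimated_power_w = num_chips * power_per_chip_w * other_hardware_overhead * pue

print(f"estimated hardware cost: ${estimated_cost_usd / 1e9:.2f}B")   # ~$0.74B under these assumptions
print(f"estimated power capacity: {estimated_power_w / 1e6:.0f} MW")  # ~18 MW under these assumptions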

Detailed methodology definitions can be found in the paper and documentation.

How accurate is the information about each supercomputer?

We strive to accurately convey the reported specifications of each supercomputer. The Status field indicates our assessment of whether the supercomputer is currently operational, not yet operational, or decommissioned. The Certainty field indicates our assessment of the likelihood that the cluster exists in roughly the form specified in the dataset and the linked sources. If you find mistakes or additional information regarding any supercomputers in the dataset, please email data@epoch.ai.

How is the dataset licensed?

We have released a public dataset with a CC-BY license. This public dataset includes all of our data on supercomputers outside of China and Hong Kong, along with anonymized data on supercomputers within China, with values rounded to one significant figure and names and links removed. This dataset is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.

Data on Chinese supercomputers is stored privately to protect the data sources. For inquiries about this data, please contact Robi Rahman at robi@epoch.ai.

How up-to-date is the data?

Although we strive to maintain an up-to-date database, new supercomputers are constantly under construction, so there will inevitably be some that have not yet been added. Generally, major supercomputers should be added within one month of their announcement, and others are added periodically during reviews. If you notice a missing supercomputer, you can notify us at data@epoch.ai.

How can I access this data?

Download the data in CSV format (a brief loading example follows below).
Explore the data using our interactive tools.
View the data directly in a table format.
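For example, once downloaded, the CSV can be inspected with standard tools such as pandas; the filename below is a hypothetical placeholder for wherever you saved the file:

import pandas as pd

df = pd.read_csv("ai_supercomputers.csv")   # hypothetical local path to the downloaded CSV
print(df.columns.tolist())                  # inspect available fields, e.g. the Status and Certainty fields described above
print(len(df), "rows in this export")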

Who can I contact with questions or comments about the dataset?

For general inquiries about the project and the paper, please reach out to Konstantin at kfp15@georgetown.edu. For inquiries about the dataset, please contact Robi at robi@epoch.ai.

Documentation

This dataset tracks AI supercomputers, identified from sources including model training reports, news articles, press releases, and web search results. Additional information about our approach to identifying supercomputers and collecting data about them can be found in the accompanying documentation.

Read the complete documentation

Use this work

Licensing

Epoch AI's data is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.

Citation

Konstantin Pilz, Robi Rahman, James Sanders and Lennart Heim, ‘Trends in AI Supercomputers’. Published online at epoch.ai. Retrieved from ‘https://epoch.ai/data/ai-supercomputers’ [online resource]. Accessed .

BibTeX Citation

@misc{EpochAISupercomputers2025,
  title = {Trends in AI Supercomputers},
  author = {Konstantin Pilz and Robi Rahman and James Sanders and Lennart Heim},
  year = {2025},
  month = {04},
  url = {https://epoch.ai/data/ai-supercomputers},
  note = {Accessed: }
}

Download this data

AI Supercomputers (CSV), updated July 15, 2025