GPU Clusters Documentation

Inclusion

The dataset focuses on GPU clusters, which contain large numbers of processors known as accelerators and are deployed on a contiguous campus without significant physical separation. GPU clusters are sometimes configured as AI supercomputers, in which the accelerators are linked with high-bandwidth networking hardware.

Criteria

A cluster should be included in this database if both of the following criteria are satisfied:

- The system contains chips that can accelerate AI workloads when used with the right infrastructure and software. These include NVIDIA’s V100, A100, H100, and GB200 GPUs, Google’s TPUs, and other chips commonly used to train frontier AI models.
- The system has high theoretical performance relative to other GPU clusters at the time it was built. To train state-of-the-art machine learning models, developers typically need more compute than was used for previous models. Hardware performance also improves rapidly, following an exponential trend over time, so we use a dynamic threshold during the study period. For inclusion, the theoretical performance of the system must be at least 1% of that of the largest known GPU cluster that existed on the date when it first became operational.
  - Theoretical performance is calculated by multiplying the number of chips by their theoretical maximum (non-sparse) FLOP/s value, using the highest FLOP/s figure available across 32-, 16-, or 8-bit number formats. See here for the minimum FLOP/s count required at any point in time to be included.
Outside of the study period, we apply an inclusion threshold that does not scale over time:
- For systems before 2017, the cluster is included if its theoretical performance is at least 10^16 FLOP/s, in any numerical precision, which roughly corresponds to 1% of the highest performance of any pre-2017 cluster.
- For systems after 2024, the cluster is included if its theoretical performance is equivalent to at least 1,000 H100s, which is 1% of the size of the leading cluster as of the end of 2024.
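The performance calculation and thresholds above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's actual tooling: the function names, the `largest_known_flops` parameter, and the per-chip figure `H100_PEAK_FLOPS` are assumptions introduced here. The year boundaries for the dynamic threshold are read off the two fixed-threshold bullets, and the sketch assumes "equivalent to 1,000 H100s" is measured with the same highest-available dense FLOP/s metric.

```python
from typing import Mapping, Optional

# Assumption: approximate vendor-quoted dense (non-sparse) 8-bit peak for one
# H100, used only for illustration. Substitute the figure you actually rely on.
H100_PEAK_FLOPS = 1.979e15

PRE_2017_THRESHOLD_FLOPS = 1e16  # fixed threshold stated for pre-2017 systems


def theoretical_performance(num_chips: int,
                            peak_flops_by_bits: Mapping[int, float]) -> float:
    """Number of chips times the highest dense peak FLOP/s among the
    32-, 16-, and 8-bit formats supplied for that chip."""
    eligible = [flops for bits, flops in peak_flops_by_bits.items()
                if bits in (32, 16, 8)]
    if not eligible:
        raise ValueError("need a peak FLOP/s value for 32-, 16-, or 8-bit")
    return num_chips * max(eligible)


def meets_performance_threshold(perf_flops: float,
                                first_operational_year: int,
                                largest_known_flops: Optional[float] = None) -> bool:
    """Apply the fixed thresholds outside 2017-2024 and the 1% dynamic
    threshold inside it (boundaries assumed from the bullets above)."""
    if first_operational_year < 2017:
        return perf_flops >= PRE_2017_THRESHOLD_FLOPS
    if first_operational_year > 2024:
        return perf_flops >= 1000 * H100_PEAK_FLOPS
    if largest_known_flops is None:
        raise ValueError("dynamic threshold needs the largest known cluster's "
                         "performance on the operational date")
    return perf_flops >= 0.01 * largest_known_flops


# Hypothetical example: 1,000 chips with assumed 16-bit and 8-bit dense peaks.
example_perf = theoretical_performance(1000, {16: 9.89e14, 8: 1.979e15})
example_included = meets_performance_threshold(example_perf, 2025)
```

The caller supplies the largest known cluster's performance for the dynamic case, since that value depends on the date the system first became operational.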

We also include planned clusters if we anticipate that they will meet the above requirements once they become operational. These are indicated with a value of “Planned” in the Status field, and can be shown or hidden in the visualization.

Data sources

We collect and maintain data on GPU clusters from a variety of sources, including machine learning papers, publicly available news articles, press releases, and existing lists of supercomputers.

We created a list of potential clusters by using the Google Search API to search for key terms like “AI supercomputer” and “GPU cluster” from 2019 to 2025, then used GPT-4o to extract any clusters or supercomputers mentioned in the resulting articles. We also added supercomputers from publicly available lists such as Top500 and MLPerf, as well as from GPU rental marketplaces. For each potential cluster, we manually searched for public information such as the number and type of chips used, when it first became operational, reported performance, owner, and location. A detailed description of our methods can be found in Appendix A of our paper.
