AI Data Centers

About AI data centers

What is an AI data center?

We define an AI data center as a collection of one or more buildings which are located near each other, have a shared hardware owner or facility operator, and which run hardware specialized for AI. This hardware includes GPUs or custom chips like Google's TPUs. These data centers may be used to experiment on, train, and deploy AI models.

We don't use a hard limit for how close together the data center buildings have to be, but a rule of thumb is less than 10 km apart. The buildings may or may not be networked together.

We do not distinguish between campuses of multiple buildings and individual buildings in what we classify as a data center.

What are the largest AI data centers?

The largest known AI data center by IT power is Colossus 2, owned by SpaceXAI in Memphis, TN, drawing 946 MW. It is followed by Anthropic-Amazon New Carlisle (Amazon, 910 MW), Microsoft Fairwater Atlanta (Microsoft, 636 MW), Meta Prometheus (Meta, 631 MW), and OpenAI Stargate Abilene (Oracle, 421 MW).

The largest AI data center by compute is Colossus 2, owned by SpaceXAI in Memphis, TN, with 1,112k H100-eq. It is followed by Microsoft Fairwater Atlanta (Microsoft, 768k H100-eq), Meta Prometheus (Meta, 763k H100-eq), Anthropic-Amazon New Carlisle (Amazon, 687k H100-eq), and OpenAI Stargate Abilene (Oracle, 510k H100-eq).

Where are AI data centers located?

The United States has the most large-scale AI data centers, concentrated in Texas (13 sites covered), Virginia (6 sites covered), Ohio (5 sites covered), Nebraska (4 sites covered), and Iowa (4 sites covered). International facilities include Huawei Horinger in Hohhot, China and DayOne Nusajaya in Johor Bahru, Malaysia. Many new sites are clustered near major power infrastructure and fiber networks in the American Midwest and South.

How much power do AI data centers consume?

The 72 AI data centers we cover have a combined 11.2 GW of IT power capacity. Total facility power is 20-50% higher due to cooling and other infrastructure overhead, pushing the real-world footprint of these 72 sites to 14.6 GW of capacity. This is more than New York City's peak demand, at 11 GW. However, the average power consumption of these data centers is typically 60–80% of capacity, due to idle time and maintenance. Individual sites range from under 50 MW—enough to power a small city like Albany, NY—to over 1000 MW, enough to power a mid-sized city like San Diego, CA.

What hardware do AI data centers use?

Major AI data centers deploy NVIDIA H100, H200, and B200 GPUs, Google TPU v5 and v6 chips, and AWS Trainium2 accelerators. Hardware varies by owner: Google uses custom TPUs, Amazon deploys Trainium, while Meta, xAI, and Microsoft primarily use NVIDIA GPUs.

More about this dataset

Documentation

Epoch’s AI Data Centers Hub is an independent database covering the construction timelines of major AI data centers through high-resolution satellite imagery, permits, and public documents.

Read the complete documentation

Downloads

AI Data Centers: All Datasets

ZIP, Updated Jul. 14, 2026

AI Data Centers

CSV, Updated Jul. 14, 2026

AI Data Center Timelines

CSV, Updated Jul. 14, 2026

Data Center Cooling: Chillers

CSV, Updated Jun. 8, 2026

Data Center Cooling: Cooling Towers

CSV, Updated Jan. 20, 2026

Citations

Epoch AI’s data is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.

Citation

Epoch AI, 'AI Data Centers'. Published online at epoch.ai. Retrieved from 'https://epoch.ai/data/ai-data-centers' [online resource]. Accessed 14 Jul 2026.

BibTeX Citation

@misc{EpochAIDataCenters2026, title = {{AI Data Centers}}, author = {{Epoch AI}}, year = {2026}, month = {7}, url = {https://epoch.ai/data/ai-data-centers}, note = {Accessed: 14 Jul 2026} }

Frequently asked questions

How do you choose what data centers to cover?

We mostly prioritize AI data centers by computing capacity. However, we have a lower bar for capacity internationally compared to the US, to achieve more globally balanced coverage. We also limit coverage to data centers that become operational in 2024 or later, with some exceptions. This includes planned data centers, but only if those data centers are under construction and have a significant chance of being completed. In practice, this limits most of the future coverage to 2–3 years from now.

Why are most of the data centers in this database located in the United States?

Based on our prior work we believe most, but not all, of the largest data centers are in the United States. Additionally, focusing on the US first allowed us to become familiar with permitting standards, providing an additional source for us to validate our methodology. We continue to expand coverage in other countries.

What information do you collect about AI data centers?

We collect general information about each data center, such as the location, owner and users. We also collect satellite, aerial and drone images, and estimate key metrics over time as each data center evolves. Every data center we cover has a timeline of total IT power capacity, compute capacity, and capital cost.

How do you estimate IT power, compute, and cost?

IT power is estimated for each data center based on the available evidence, which may include cooling equipment observed in satellite imagery, permitting documents, and statements from companies. Compute is usually calculated from IT power, based on the energy efficiency of chips that we believe are most likely to be used (which is mostly informed by the Chip Owners dataset). In the uncommon case where the specific type and quantity of chips is reported, then compute estimates are based on that. Capital costs are entirely calculated from IT power, based on a general cost-per-watt model. We estimate construction start dates, operation dates and expansion dates primarily from commercial satellite imagery, but also publicly available imagery and commercial drone imagery. For details, see the methodology.

How comprehensive is the data?

The database is the result of a continuous effort since mid-2025 to cover the largest AI data centers globally, most of which are in the US. Our database covers an estimated 27% of AI compute that has been delivered by chip manufacturers globally as of April 2026, assuming a one-quarter lag from chip sales to deployment. We continue to expand coverage in the US and other countries.

The coverage is strongest for the largest AI data centers between 2024 and 2028, because we have prioritized finding those. We believe the database captures the majority of record-holding facilities (in terms of compute capacity) during that period.

How is this database different from the GPU Clusters database?

The deprecated GPU Clusters database covers computing clusters used for AI and other applications. These clusters may make up just part of the total compute present in one data center building or campus. GPU Clusters has broad coverage across time and space, but only covers an estimated 10–20% of global compute as of March 2025. In contrast, this database looks at AI data centers at a project level, and focuses on the largest current and upcoming data centers to achieve higher compute coverage. It also uses primary sources much more, including permitting and satellite imagery, to get greater detail and accuracy on individual data centers.

How confident are you in the data?

We model our uncertainty both in terms of quantities and construction timelines. We expect that 80% of the time, a given IT power capacity estimate is accurate within a factor of 1.4x. For compute capacity, this factor increases to 1.5x, and for cost, the factor is 1.6x. As for the timelines, we expect that 80% of the time, our estimated date is within 6 months of the actual date.

These confidence levels are informed by data on how our estimates differ from the most credible reference values, and modeling on top of that data. We expect our estimates to become more accurate as we find more data centers and do more research on how they work.

What do “Speculative” and “Likely” mean next to data center users?

By default, if we list a user, owner, or other affiliate of a data center, we have strong evidence for it. However, sometimes we are uncertain about these affiliations, especially the data center users. We indicate this uncertainty with “Speculative” and “Likely” tags.

“Speculative” means that we have no record of an affiliation, but we have some reason to believe it. For example, we speculate that Anthropic will use two Amazon data centers in Mississippi because the New York Times reported that at least one of the Anthropic-Amazon Project Rainier data centers is located there, but they didn’t specify which ones.

“Likely” means we have some record of an affiliation, but we aren’t confident. For example, OpenAI is a “Likely” user for Microsoft Fairwater because a Microsoft spokesperson stated it “initially will be used to train OpenAI models”, but Microsoft’s partnership with OpenAI has been weakening since 2023.

Can an entire data center be used to train one AI model?

This is possible, but does not happen often in practice. The total capacity of a data center is more like a cap on the size of a training run in that data center. Data centers often run multiple jobs in parallel and/or get partitioned for multiple users. Even if the entire data center is deployed on a single job, hardware failures will slightly reduce the capacity. Relatedly, when we report the total compute capacity of a data center in 8-bit OP/s, this is the theoretical peak capacity for number formats of 8 bits or above. In practice, the computational performance is typically 20–50% of the peak value, due to inefficiencies.

Can any of these data centers be networked together to do bigger training runs?

A previous analysis concluded that distributed training runs across distant data centers is technically feasible. AI companies have also proposed distributed training runs. Our database will document which data centers are networked together as we learn more.

Featured

Publications

Data explorers

Benchmarks by Epoch AI

AI Progress

Industry

Infrastructure

Impacts

Publications

Papers & Reports

Data Insights

Newsletter

Podcast

Data explorers

Capabilities

Models

Data Centers

Chip Owners

Companies

Polling on AI Use

Benchmarks by Epoch AI

MirrorCode

Epoch Capabilities Index

FrontierMath: Open Problems

FrontierMath: Tiers 1-4

Scaling

Software progress

Open models

Capabilities

Math

Leading companies

Finances

Geopolitics

Chips

Data centers

Energy

Adoption and use

Economic impact

Future of AI

About Epoch AI

Donate

Team

Careers

Consultations

For press

Transparency

AI Data Centers

Display

Data

Explore the world's AI data centers on an interactive map

We cover 72 data centers

About AI data centers

What is an AI data center?

What are the largest AI data centers?

Where are AI data centers located?

How much power do AI data centers consume?

What hardware do AI data centers use?

Related work

More about this dataset

Documentation

Downloads

Citations

Citation

BibTeX Citation

Frequently asked questions

How do you choose what data centers to cover?

Why are most of the data centers in this database located in the United States?

What information do you collect about AI data centers?

How do you estimate IT power, compute, and cost?

How comprehensive is the data?

How is this database different from the GPU Clusters database?

How confident are you in the data?

What do “Speculative” and “Likely” mean next to data center users?

Can an entire data center be used to train one AI model?

Can any of these data centers be networked together to do bigger training runs?

Trusted by leaders at OpenAI, DeepMind, and governments worldwide

Trusted by leaders at OpenAI, DeepMind,
and governments worldwide