Epoch’s AI Chip Sales dataset tracks aggregate AI accelerator shipments and total compute capacity across major chip designers and manufacturers.
The data is available on our website as a visualization or table, and can be downloaded as a CSV file, updated daily. For a quick-start example of loading the data and working with it in your research, see this Google Colab demo notebook.
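As a minimal sketch of working with the download, the snippet below loads a CSV and aggregates shipments by designer. The column names and rows are illustrative assumptions, not the dataset's actual schema; see the Colab notebook for the real quick-start.

```python
# Load a CSV and sum estimated units per designer. The columns and
# values below are hypothetical stand-ins for the downloaded file.
import csv
import io
from collections import defaultdict

csv_text = """designer,chip,year,estimated_units
Nvidia,H100,2023,1500000
Google,TPU v5e,2023,900000
"""

units_by_designer = defaultdict(int)
for row in csv.DictReader(io.StringIO(csv_text)):
    units_by_designer[row["designer"]] += int(row["estimated_units"])
```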
If you would like to ask any questions about the data, or suggest companies that should be added, feel free to contact us at data@epoch.ai.
Epoch’s data is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.
This dataset currently tracks estimates of total sales or shipments of the flagship AI chips designed or sold by Amazon (Trainium/Inferentia), AMD, Huawei, Nvidia, and Google (TPU).
For most chip types and time periods, exact chip sales are not disclosed by the relevant companies, though Nvidia has provided the most informative direct disclosures on total Hopper and Blackwell sales. All of our figures are estimates grounded in data and research, though often with significant uncertainty.
Below, we break down specific methodologies and assumptions for each chip designer. In short:
Nvidia is the largest seller of AI chips, which they call “GPUs”. Here, we explain how we estimated total Nvidia AI GPU shipments from 2022 onwards. We focus on Nvidia’s flagship AI GPUs, not its gaming GPUs (which are used by some AI practitioners) or chips for self-driving cars.
We estimate Nvidia AI GPU sales based on Nvidia’s revenue from AI compute, reported and estimated prices of its GPUs, and announcements and Nvidia statements about which chip types were shipped over time.
We then compare the results of this model with a public disclosure from Nvidia that it had shipped 4M Hopper GPUs and 3M Blackwell GPUs through October 2025.
The code for this model can be found here.
The revenue-based model uses this approach:
We restrict our data to 2022 onwards, which should capture a very large majority of the existing stock of Nvidia chips, given a typical lifespan of roughly 3-6 years1 and rapid growth in sales over time.
Nvidia’s AI GPU revenue
Nvidia breaks out revenue for its “Data Center” division, which sells products intended for data centers. Data Center has been >80% of Nvidia’s total revenue since 2024 and includes all of Nvidia’s mainstream AI chips, but not gaming GPUs and its self-driving car chips (which are out of scope for our counts).
Nvidia breaks down “Data Center” further into Compute and Networking, with Compute making up 80-85% of Data Center in recent years.2 Networking means cables, switches, and other components used to connect chips together. This is vitally important for creating chip clusters, but we mostly ignore networking revenue for the purpose of estimating chip counts.
A small nuance is that “Compute” is more than physical compute sales. Nvidia describes (page 5) its Compute sales as including hardware (GPUs, CPUs, and interconnects), cloud compute revenue (DGX Cloud), and AI-related software products.3 Software and cloud are likely a very small share; for example, they were $2 billion in revenue in 2024, compared to ~$100B in total Compute revenue. We assume that ~95-99% of Compute revenue is from hardware sales. Additionally, Nvidia’s cloud division mostly uses compute that Nvidia sells and then rents back.4
This revenue data is summarized below. Note that Nvidia’s fiscal year ends in January and runs ~11 months ahead of the calendar year (e.g. FY25 ended January 26, 2025). Because their quarters do not line up with calendar quarters, in our visualizations we use interpolation to distribute GPU counts by calendar quarter.
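The calendar-quarter interpolation can be sketched as follows: a fiscal quarter's GPU count is split across the calendar quarters it overlaps, in proportion to days of overlap. The dates and unit count below are illustrative, not actual Nvidia figures.

```python
# Distribute a fiscal quarter's units across overlapping calendar
# quarters by day-count overlap. Numbers are illustrative placeholders.
from datetime import date

def overlap_days(a_start, a_end, b_start, b_end):
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max((end - start).days + 1, 0)

def distribute(fq_start, fq_end, units, calendar_quarters):
    total = (fq_end - fq_start).days + 1
    return {
        label: units * overlap_days(fq_start, fq_end, cq_start, cq_end) / total
        for label, (cq_start, cq_end) in calendar_quarters.items()
    }

# Example: a fiscal quarter running late Oct 2024 to late Jan 2025
# (roughly the shape of Nvidia's Q4 FY25) spanning two calendar quarters.
alloc = distribute(
    date(2024, 10, 28), date(2025, 1, 26), 900_000,
    {"2024Q4": (date(2024, 10, 1), date(2024, 12, 31)),
     "2025Q1": (date(2025, 1, 1), date(2025, 3, 31))},
)
```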
| Period | Data center revenue | Compute revenue |
|---|---|---|
| Fiscal year 2023 (approx. Feb 2022 - Jan 2023) | $15.0B | $11.3B |
| Fiscal year 2024 (Feb 2023 - Jan 2024) | $47.5B | $38.9B |
| Fiscal year 2025 (Feb 2024 - Jan 2025) | $115.3B | $102.3B |
| FY 2026, first three quarters (Feb 2025 - Oct 2025) | $131.4B | $111.0B |
Nvidia’s revenue mix over time
Given total revenue and prices of chips over time, the next step is allocating quarterly revenue to chip type. Nvidia doesn’t consistently break down sales by chip, but they provide substantial details in earnings commentary.
Nvidia’s Hopper generation (H100) launched in late 2022 and ramped massively in 2023 while A100s ramped down. The next major generation, Blackwell, ramped in early to mid-2025. And Nvidia’s China-spec chips made up around 15% of sales while their export was legal (late 2022 to October 2023, and ~Feb 2024 to April 2025; the H200 will also be sold in China in 2026).
An abbreviated timeline with notes is provided below, and more detailed estimates and notes can be found in this sheet.
Note that we round the sales of data center GPUs that don’t appear in this list down to zero. The main GPU this applies to is the lower-grade L40, released in early 2023. There is little information available about L40 sales, but they may have made up a small portion of 2023-2024 sales. We also bucket some GPUs together, like H200s with H100s and B100s with B200s.
| GPU | First shipped | Approx. revenue share | Notes |
|---|---|---|---|
| A100 | June 2020 | Dominant through late 2022 | Flagship pre-Hopper. By September 2023, A100 revenue was “declining sequentially” and at the “tail end” of the architecture. |
| H100 and H200 “Hopper” (bucketed with 2024’s H200 due to lack of breakout and similar specs) | Late 2022 | ~80% of 2023-2024; ~15-20% in first half of 2025 | Flagship AI chip of the post-ChatGPT era. Started shipping by October 2022, and H100 revenue was “much higher than A100” by January 2023. Hoppers were still sold through late 2025. By April 2025, the transition to Blackwell was “nearly complete” with Blackwell at ~70%, but the following quarter H100/H200 sales actually increased. |
| A800 | Oct 2022 | A800 and H800 ~20% through late 2023 | China-compliant A100 variant; banned Oct 2023. |
| H800 | Late 2022 | (see A800) | China-compliant H100 variant; banned Oct 2023. China was 19% of data center sales during the relevant period. |
| H20 | Feb 2024 | ~13% from Feb 2024 to April 2025 | Heavily-downgraded Hopper variant for China; eventually banned April 2025. China was about 13% of 2024 revenue, and H20 had $4.6B in sales in spring 2025, or ~13% of compute sales. |
| “Blackwell” B200 (bucketed with B100) | Late 2024 | Majority by mid-2025 | Next generation after Hopper. We bucket the B200 with the less powerful B100 due to lack of breakout, and presumed low B100 share. Most B200s are sold in GB200 systems (see notes on B200/B300 below). Quarter ending Jan 2025 had $11B of Blackwell revenue, or ~33% share of data center. Next quarter, Blackwell was “nearly 70%” of data center compute. |
| B300 (aka Blackwell Ultra) | Mid 2025 | Majority by late 2025, displacing B200 | Mid-cycle refresh of B200 with “tens of billions” in sales by July 2025, perhaps roughly half of Blackwell. The next quarter it reached two-thirds of Blackwell. |
Chip prices
Note on servers vs chips
To estimate how many GPUs Nvidia shipped, the next step is to divide their Compute revenue by average prices per chip, which we collect using reports and estimates from the media and analysts.
As mentioned above, their compute revenue includes a small amount of software and cloud revenue. Another potentially significant nuance is that Nvidia sells some GPUs bundled into complete servers, which means Nvidia’s Compute revenue captures revenue from non-chip, non-networking server components as well.5 This would increase the effective Compute revenue that Nvidia makes per AI chip.
For context, most Nvidia AI chips are deployed in servers, which are necessary for training and commercial-scale inference, rather than individual cards. Most of the cost of a server is the GPUs themselves, but server components add costs: one bill-of-materials estimate of an 8-GPU H100 server has the full server sold at a ~38% premium vs the 8 GPUs, including networking components. And as we note below, Blackwell NVL72 systems command a broadly similar premium vs individual Blackwell GPUs, with a roughly 40k to 50k server price per GPU for GB200 NVL72 vs 30-40k for standalone B200s.
However, Nvidia’s overall “server premium” is probably much lower than the ~30-40% figures above:
Overall, a detailed analysis of the Nvidia server market is outside our scope here. We choose uncertainty intervals about chip prices wide enough to incorporate this uncertainty about the server premium.
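The core of the revenue-based model can be sketched as dividing estimated hardware revenue by an average-price interval to get a shipment interval. All numbers below are illustrative placeholders, not our published estimates.

```python
# Divide hardware revenue by a price interval to get a shipment interval.
# Parameters are illustrative, not actual model outputs.
def shipments_interval(compute_revenue, hardware_share, price_low, price_high):
    hw_revenue = compute_revenue * hardware_share
    # A higher price implies fewer chips, so the price bounds swap roles.
    return hw_revenue / price_high, hw_revenue / price_low

low, high = shipments_interval(
    compute_revenue=100e9,   # roughly FY25 Compute revenue scale
    hardware_share=0.97,     # ~95-99% of Compute is hardware (see above)
    price_low=25_000, price_high=32_000,
)
```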
A100
Standalone A100s cost around $10k in 2023 per CNBC, while SemiAnalysis described it as “$10k+”. A DGX A100 server had a launch MSRP of $199k in 2020 or $25k per GPU (maybe the last Nvidia flagship AI product to have an MSRP), though this includes a server premium, and is before bulk discounts.
For A800s specifically, Financial Times reports a price of ~$10k each in 2023. This is probably a standalone chip price.
Overall, we use an uncertainty interval of $10k to $15k for A100s in 2022-2023, given that reports generally give a lower end of 10k, and a possible server premium.
H100
The Hopper generation has perhaps the best direct evidence of average price, thanks to a disclosure from CEO Jensen Huang himself: Nvidia made a total of $100B in revenue from selling 4 million Hopper GPUs through October 2025 (excluding China). That works out to $25k per GPU on average, though both figures appear to be rounded.
Other reports include:
Overall, there is a relative consensus of reports of H100 prices in the low to mid 20k range by 2024 and 2025. We have more uncertainty about the average price in 2023, when there was reportedly a widespread shortage of H100s.
H200 vs H100
In 2024, Nvidia started selling the H200, a mid-cycle update to the H100. The main difference is improved memory: 141 GB vs 80 GB for the H100. High-bandwidth memory generally costs $10-20 per GB, so the H200 is ~$1k more expensive to produce than the H100, implying a price increase of ~$4k if Nvidia applied its overall GPU gross margin of ~75% to this additional cost.
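The H200 premium estimate above can be spelled out as arithmetic: extra HBM capacity times per-GB cost gives the added production cost, and dividing by (1 minus the gross margin) translates that cost into price at Nvidia's ~75% margin.

```python
# Worked version of the H200 vs H100 price-increase estimate above.
extra_memory_gb = 141 - 80          # H200 vs H100 HBM capacity, in GB
cost_per_gb = (10, 20)              # $/GB range for high-bandwidth memory
gross_margin = 0.75                 # Nvidia's approximate GPU gross margin

extra_cost = tuple(extra_memory_gb * c for c in cost_per_gb)     # ($610, $1,220)
extra_price = tuple(c / (1 - gross_margin) for c in extra_cost)  # ~($2.4k, $4.9k)
```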
Nvidia never broke out H200 vs H100 sales, but at least one analyst predicted that the majority of Hopper sales would be H200s from H2 2024. So it is possible that the introduction of the H200 meaningfully increased the average price of Hoppers (relative to if all sales were H100).7
Reconciling with Jensen Huang’s volume disclosures
In late 2025, Jensen Huang claimed that Nvidia had sold a total of 4 million Hopper GPUs for $100 billion in total, excluding sales to China (that is, 4M H100s and H200s, not including H20s or H800s). This suggests an average sales price of ~$25k across the generation, ignoring the possibility of rounding.
A deeper look into Hopper revenue complicates this picture: if we estimate total Hopper revenue by multiplying Compute revenue by our estimates of the Hopper share of Nvidia’s GPU sales, this yields a total of ~$126B, not including networking. This could be consistent with “$100B” due to rounding. But it suggests an average price closer to ~$30k (which would imply ~4.2M total sales) rather than $25k (which would mean ~5M total sales, a figure less likely to be rounded down to 4M).
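The reconciliation arithmetic above, made explicit: the same revenue estimate implies different unit counts under the two average-price readings.

```python
# Implied Hopper unit counts under two average-price readings of our
# ~$126B Hopper revenue estimate (see text).
est_hopper_revenue = 126e9
units_at_30k = est_hopper_revenue / 30_000   # ~4.2M, close to the disclosed 4M
units_at_25k = est_hopper_revenue / 25_000   # ~5.0M, hard to round down to 4M
```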
There are a few reasons our estimated Hopper revenue could differ from the reported $100 billion besides rounding:
Our overall credence intervals for average Hopper (H100/H200) prices are listed below. Because the H100 and H200 were on sale for over three years, we break down our estimates by year.
| Year | Low price | High price | Geometric mean | Notes |
|---|---|---|---|---|
| 2022 | 27k | 35k | ~30.7k | Low volume in 2022 in any case |
| 2023 | 27k | 35k | ~30.7k | |
| 2024 | 25k | 32k | ~28.2k | H100s were reportedly 20k-25k by late 2024. This may be pushed up by the H200. Adjusted upwards due to possible server revenue and CEO’s volume disclosure. |
| 2025 | 22k | 30k | ~25.6k | Entire H100 server costs reportedly in low 20k range by 2025, but H200s are more expensive. Adjusted upwards due to possible server revenue and CEO’s volume disclosure. |
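The "Geometric mean" column in these price tables is simply the geometric mean of each interval's low and high bounds:

```python
# Geometric mean of a price interval's bounds, as used in the tables.
import math

def geo_mean(low, high):
    return math.sqrt(low * high)

hopper_2024 = geo_mean(25_000, 32_000)   # ~28.3k
```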
H800
Reuters reported that 8-GPU H800 servers were sold for 2 million yuan, or ~$280k. This is similar to SemiAnalysis’ 2023 report of $270k for an HGX H100 server, suggesting a similar price for the GPU. One article notes a range of “street” prices for H800s from $35k up to $69k, though the higher end is almost certainly not representative.
Overall, we use a range of 25k to 35k, identical to what we use for H100s.
H20
Based on several sources, H20 unit prices were typically around $12-15k. Reuters reports they sold for $12k to 15k “per card” in February 2024, while TechPowerUp reported $12k in July 2024. Reuters also reported in February 2025 that “Analysts estimate Nvidia shipped approximately 1 million H20 units in 2024, generating over $12 billion in revenue for the company.”
Given these reports, we use the following price assumptions:
| Year | Low price | High price | Geometric mean | Notes |
|---|---|---|---|---|
| 2024 | 10k | 15k | ~12.2k | Most reports in 2024 converge on 12k to 15k |
| 2025 | 10k | 13k | ~11.4k | |
Blackwell (B200 and B300)
Nvidia’s next-gen Blackwell chips were released in late 2024 and come in three variants: B100, B200, and B300 (aka Blackwell Ultra).
B200s and B300s are the flagship GPUs of Blackwell, with B300 being a mid-cycle upgrade. They are mostly deployed in the form of GB200 and GB300 systems, which contain one Nvidia Grace CPU per two Blackwell GPUs, with the most common configuration being NVL72 server racks containing 72 Blackwell GPUs.8 GB200/GB300 systems have a 10% higher FLOP/s rating per GPU than standalone B200/B300s.
Overall, NVL72 appears to be the overwhelmingly popular form factor for Blackwell.9 Given the predominance of NVL72, B100s are likely a low-volume product and we group them together with B200s.
How much do B200s and B300s cost? Let’s start with B200 and discuss how the B300 differs.
For estimating the price of (G)B200s, there are two possible approaches: (1) find the price of standalone B200 GPUs, or (2) estimate the price-per-GPU of NVL72 systems, and adjust downwards for non-GPU costs like networking.
For standalone B200s:
Meanwhile, the price of GB200 NVL72s is fairly well documented:
However, the effective price of GB200s (Nvidia Compute revenue per GPU shipped) will be lower than this:
Overall, these NVL72 figures point to a range of 35k to 40k per GPU. We use a price range of 33k to 42k for GB200s (a somewhat unprincipled widening of 35k-40k, which is a narrow range but would be a reasonable choice for a middle-50% interval).
GB300 vs GB200
How might GB300s differ in price from the GB200?
A naive model of applying a 10-20% premium on our GB200 interval, with no correlation between GB300 premium and GB200 price, suggests a GB300 price range of 38k to 49k. The higher end is consistent with higher-end reports of GB300 NVL72 systems bought by Apple.
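This naive model can be sketched as a small Monte Carlo: draw a GB200 price and an independent 10-20% premium, multiply, and take quantiles. Uniform draws are an assumption here; our actual model may use different distributions.

```python
# Monte Carlo sketch of the naive GB300 pricing model: GB200 price times
# an independent 10-20% premium. Uniform draws are a simplification.
import random

random.seed(0)
samples = sorted(
    random.uniform(33_000, 42_000) * random.uniform(1.10, 1.20)
    for _ in range(100_000)
)
p5, p95 = samples[5_000], samples[95_000]   # roughly the cited 38k-49k span
```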
Nvidia CEO Jensen Huang disclosed that through October 2025, Nvidia had shipped a total of 4M Hopper GPUs (H100s and H200s) and 3M Blackwell GPUs, excluding the Chinese market in both cases (technically, he cited 6M Blackwell GPUs, but he was referring to the two individual dies within each GPU package).12
How does this compare with our estimates? Our model puts overall Nvidia GPU shipments through October 2025:
These aren’t perfectly consistent, especially the Blackwell total, which is ~15% lower than 3 million. Because the 3M Blackwell count is derived from 6M Blackwell dies divided by two, rounding doesn’t explain the gap: ~2.5M Blackwell GPUs would be 5M dies.13
Some other potential explanations for the Blackwell difference:
Given that the conclusions are not far off, we won’t attempt to fully resolve this difference. We intend to continue modeling Nvidia chip shipments after October 2025, and not just Hopper and Blackwell, so our focus is on building the best comprehensive model of Nvidia sales.
Google designs custom AI accelerators called Tensor Processing Units (TPUs), which are used by Google DeepMind for training and inference, by Google at large for AI/machine learning tasks, and by external cloud customers such as Anthropic.14
Google partners with Broadcom, a major chip designer, to produce the TPU. Google designs the high-level TPU architecture, while Broadcom handles physical design and supply logistics, including placing manufacturing orders with TSMC. In short, Google buys TPUs from Broadcom, which oversees their production.
Google does not disclose its TPU volumes, but we have a significant amount of evidence we can use to estimate these volumes. We use the following methodology:
We focus on 2024-2025 sales for now because there is more information available on total TPU spending.
Given that many inputs are uncertain, we use Monte Carlo simulations and report results with confidence intervals. The full model can be found here.
Broadcom reports AI semiconductor revenue each quarter, but this includes both custom AI accelerators (so-called XPUs, which include TPUs) and networking products. However, on earnings calls, Broadcom’s leadership has said that XPUs account for roughly 65-70% of its AI semiconductor revenue, which allows us to isolate XPU revenue. For example, Broadcom’s FY2024 AI semiconductor revenue was $12.2B, implying $8-8.5B in XPU revenue.
Note that Broadcom’s fiscal quarters run about two months ahead of the calendar: for example, their Q4 2025 ended on November 2. When visualizing our results by quarter, we use interpolation to distribute TPU counts by calendar quarter.
The next step is estimating Google’s share of that XPU revenue:
In 2024, Broadcom’s two primary XPU customers were Google and Meta. Meta reported spending approximately $987M with Broadcom in 2024.15 Subtracting disclosed customer spending from total XPU revenue implies that Google’s TPU-related spend with Broadcom in 2024 was roughly $7-7.5B.
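The 2024 subtraction above as arithmetic: Broadcom's XPU revenue (65-70% of AI semiconductor revenue) minus Meta's disclosed spend leaves an implied range for Google's spend.

```python
# Back out Google's implied 2024 TPU spend from Broadcom disclosures.
ai_semi_revenue = 12.2e9          # Broadcom FY2024 AI semiconductor revenue
xpu_share = (0.65, 0.70)          # XPU share of AI semis per earnings calls
meta_spend = 0.987e9              # Meta's disclosed 2024 spend with Broadcom

xpu_revenue = tuple(ai_semi_revenue * s for s in xpu_share)    # ~$7.9-8.5B
google_spend = tuple(r - meta_spend for r in xpu_revenue)      # ~$6.9-7.6B, i.e. "roughly $7-7.5B"
```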
Broadcom had almost exactly $20B in AI semiconductor revenue in 2025. They said XPUs were a 65% share of AI semiconductor revenue in Q3 2025, similar to 2024, so the full-year share was likely similar as well, implying around $13B in XPU revenue.
We can estimate Google’s TPU/XPU spend a few ways:
From the previous step, we have an estimate of how much Google spent on TPUs with Broadcom. To translate this spending into TPU production volumes, we divide estimated revenue by per-chip prices. This has two steps:
Bill of materials cost model
The cost of manufacturing a TPU can be broken down into four components.
Our estimates draw from publicly available information including wafer pricing, memory costs by generation, and packaging costs derived from TSMC’s capacity and revenue disclosures. The estimates vary by TPU generation, since newer chips use more advanced processes, larger dies, and more memory.
Broadcom margins
Broadcom may make around a 60% gross profit margin on TPUs, with a plausible range of 50% to 65%:
A 60% margin would mean that a chip that costs Broadcom $1k to produce would then sell for $1k / (1 - 60%) = $2,500.
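The margin arithmetic above as a one-liner: a gross margin applies to the selling price, so price equals cost divided by (1 minus margin).

```python
# Convert a production cost to a selling price at a given gross margin.
def price_from_cost(cost, margin):
    return cost / (1 - margin)

example_price = price_from_cost(1_000, 0.60)   # ~$2,500, as in the example
```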
Summary of results
Our cost estimates cover the TPU module only and exclude networking. While networking is a meaningful system-level cost, Broadcom counts it separately from its XPU revenue. We therefore exclude networking to ensure consistency between the revenue data and the cost model.
Our estimates are generally aligned with external reports. One semiconductor research firm estimated prices of $3k for the v5e, $6k for the v5p, and $4k for the v6e, all of which align closely with our median estimates. However, the same source estimated the v7 (formerly v6p) at $8k, significantly lower than our $12k median. Other reporting places the v7 higher: one analyst estimated $13k, while The Information reported $12k. These latter estimates align well with our own.
To convert total TPU revenue into chip counts by generation, we estimate the quarterly production mix across TPU versions over time. The production mix matters because TPU generations differ substantially in cost. Inference-optimized chips are cheaper than training-focused chips with larger memory footprints, and newer generations are more expensive than older ones due to advanced process nodes and packaging. We model the quarterly production mix as probability distributions across TPU versions, with shares normalized to sum to 100%.
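A minimal sketch of the production-mix modeling: draw uncertain shares per TPU version and normalize them to sum to 100%. The share ranges here are illustrative placeholders, not our actual parameters.

```python
# Draw raw (unnormalized) shares per TPU version, then normalize so the
# quarterly production mix sums to 100%. Ranges are illustrative.
import random

random.seed(1)
raw = {"v5p": random.uniform(0.1, 0.3),
       "v6e": random.uniform(0.5, 0.8),
       "v7": random.uniform(0.0, 0.2)}
total = sum(raw.values())
mix = {k: v / total for k, v in raw.items()}
```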
Our production mix estimates are informed by discussions with experts as well as industry reporting. Several key insights shape our modeling approach:
First, our understanding is that TPU production ramps are much shorter than (e.g.) Nvidia or AMD GPU ramps. Unlike GPU production, which typically involves gradual multi-quarter transitions with significant overlap between generations, TPU production tends to follow a ‘one chip at a time’ pattern. Production switches relatively quickly from one generation to the next, with minimal overlap. For example, when v5e production ended and v6e began in Q4 2024, there was no extended period of parallel production. Rather, the line switched from making one chip to the other.
One nuance to the above is that inference-optimized chips (v4i, v4, v5e, v6e) and training-focused chips (v5p, v7) tend to be produced in parallel. Within each of these two classes, production switches from one generation to the next rather than running multiple generations in parallel.
Next, TPU models ramp into production before or around the time that Google makes a “preview” announcement, and are already in full production by the time Google announces that they are “generally available”. SemiAnalysis noted that “Google started announcing TPUs as they ramp into production rather than after the next generation was being deployed.” For example, for TPU v5e, meaningful production volumes began several months before the August 2023 GCP preview announcement.
Here is a summary table and plot describing when each TPU reached general availability. The full parameters can be found in our model.
| TPU Model | Preview availability date19 | Description | Production mix notes |
|---|---|---|---|
| v4i | Q1 2020 | Inference only chip | Included as a small share in Q4 2022 production before transitioning to v4. |
| v4 | May 2022 | Efficient and scalable chip. | Primary production in Q1 - Q2 FY23, declining rapidly in Q3 FY23 as v5e ramps. Production ends by Q3 FY23. |
| v5e | August 2023 | Small, cost-optimized inference chip. | Production begins earlier than v5e Google Cloud preview date, with meaningful volume starting Q2 FY23. Becomes dominant by Q3-Q4 FY23 as the production line quickly switches from v4 to v5e. Production ends as v6e ramps in Q4 FY24 |
| v5p | Dec 2023 | Training-focused chip. | Ramps alongside v5e starting Q1 FY24. Since the v5p is a high-performance training-focused chip with significantly more memory than the v6e, it may remain better suited for certain workloads even after v6e reaches full volume. We assume v5p production continues at relatively constant levels through FY25 before transitioning to v7. |
| v6e | Oct 2024 | Next-gen cost-optimized chip | Appears in pilot volumes in late 2024. We model full production volume of v6e starting in Q4 2024 and continued volume in 2025. Minimal overlap with v5e production given the production line switches from one to the other. |
| v7 | Nov 2025 | Powerful next-gen chip | Enters production in H2 2025. Contributes a minority but meaningful share in Q3 2025, with large-scale ramp deferred to 2026. |
Chip volumes are calculated by dividing quarterly TPU revenue by the weighted-average price per chip. We implement this using Monte Carlo simulations to propagate uncertainty from all input parameters including revenue estimates, production mix shares, manufacturing costs, margins, and more. Results are reported as median values with 90% confidence intervals.
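The volume calculation can be sketched as a minimal Monte Carlo: sample quarterly TPU revenue and a weighted-average price, divide, and report the median with a 90% interval. The distributions and numbers below are illustrative, not our actual parameters.

```python
# Monte Carlo sketch: revenue / weighted-average price -> chip counts,
# reported as a median with a 90% interval. Inputs are illustrative.
import random

random.seed(42)
counts = sorted(
    random.uniform(2.5e9, 3.5e9) / random.uniform(4_000, 7_000)
    for _ in range(100_000)
)
median = counts[50_000]
ci_90 = (counts[5_000], counts[95_000])
```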
This analysis relies on indirect estimation from proxies, which introduces uncertainty in a few areas:
This model estimates Google’s TPU production volumes by combining Broadcom’s publicly disclosed AI semiconductor revenue with bottom-up cost estimates for each TPU generation. We isolate custom accelerator revenue from Broadcom’s reported totals, estimate Google’s share, model per-chip manufacturing costs, and infer the quarterly production mix from product timelines and industry reporting. Dividing estimated spending by weighted-average prices gives us chip production volumes.
There are several sources of uncertainty including the allocation of Broadcom revenue between customers, contract-specific pricing, and the production mix between TPU generations. Results are reported as median values from Monte Carlo simulations with confidence intervals reflecting these uncertainties.
Amazon has ramped up its deployment of its custom Trainium and Inferentia AI chips in recent years, with Trainium2 launching in late 2024 and the newest Trainium3 becoming generally available in December 2025. This ramp is largely on behalf of its largest compute customer, Anthropic. We model volumes for 2024-2025 only, due to limited evidence and presumed low volumes before 2024.
In summary:
In its Q4 2025 earnings on February 5, 2026, Amazon stated that “Trainium2 is fully subscribed with 1.4 million chips landed.” This is a very informative disclosure but requires some interpretation:
There are two time-lag issues that run in different directions: deployed chips vs delivered chips, and the earnings release coming ~35 days after the end of the year. Given this, 1.4M still seems like the best central estimate of the Trainium2 stock at the end of 2025, though this could add (say) 10% error bars.
Below, we sketch a methodology based on evidence from Trainium data centers. While we now primarily defer to Amazon’s disclosure of Trainium volumes, we used the evidence below for our Trainium estimates we initially published in early January 2026 (which, for transparency, initially yielded a point estimate of 2.5M Trainium2 in total, roughly 1.8x higher than Amazon’s confirmed 1.4 million).
Our Frontier AI Data Centers dataset tracks two large Amazon–Anthropic data centers, both part of their larger “Project Rainier”, located in New Carlisle, Indiana20, and Madison, Mississippi. Both data centers probably primarily or exclusively use Trainium2, though this is only officially confirmed for New Carlisle. Put together, we estimate that ~1M Trainium2 were online in the two Project Rainier sites at the end of 2025, up from ~750k by mid-2025, based on a mix of analysis of satellite data and statements from Amazon.
Note: the y-axis of this graph is in H100e. 1 H100e ~= 1.5 Trainium2.
Several lines of evidence (independent of the total Trainium2 volume disclosures) support that Rainier makes up a large chunk of total Trainium2:
What about upper bounds on Rainier’s share of Trainium2? SemiAnalysis estimates that Anthropic’s Trainium capacity in Q1, largely predating these two campuses, was roughly 15% of its Q4 2025 capacity. This conceivably could be a mix of Trainium1 and Trainium2, though we aren’t aware of any reports of Anthropic using any Amazon chips older than Trainium2.

To help inform our estimates, we create two quantitative models with parameters derived from the evidence from Project Rainier and Trainium wafer allocations. Neither model represents our all-things-considered evidence, since they don’t fully incorporate the information from Amazon’s disclosure of Trainium volumes, but they illustrate the implications of the other evidence above.
| Parameter | Central estimate for end-2025 | 90% uncertainty interval | Notes |
|---|---|---|---|
| New Carlisle Trainium2 | ~714k Trainium2 units | 500k to 1M | We estimate 714k Trn2 in late December. Amazon confirmed “nearly” 500k by October, 1M planned by EOY. |
| Madison Trainium2 | ~320k Trainium2 units | ~190k to ~540k | Our data center estimates generally have an 80% CI of 1.5x in either direction (i.e. ~213k to ~480k), which we convert to a 90% CI using z-scores. (The interval on New Carlisle is tighter than this due to the Amazon statements.) |
| Rainier’s share of Anthropic’s Trainium2 | – | 70% to 85% | 15% of Anthropic’s Trainium2 stock predates Rainier per SemiAnalysis |
| Anthropic’s share of all Trainium2 | – | 60% to 90% | Rough guess: Rainier makes up the majority (likely >60%) of cloud Trainium deployments, Anthropic likely primary user of Trainium overall |
| Model output of total Trainium2 stock | 1.9 million Trainium2 units | 1.3 million to 2.7 million Trainium2 units | Trainium2 total based on Rainier scale and Rainier/Anthropic share estimates. Implemented in Guesstimate |
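The Guesstimate model in the table above can be roughly replicated in a few lines: sample each parameter and scale the Rainier volumes up by the two share estimates. Uniform sampling over the stated intervals is a simplification of the actual model.

```python
# Rough Monte Carlo replication of the Rainier-based Trainium2 model:
# (New Carlisle + Madison) / Rainier share / Anthropic share.
# Uniform draws over the table's intervals are a simplification.
import random

random.seed(7)
totals = sorted(
    (random.uniform(500e3, 1e6) + random.uniform(190e3, 540e3))
    / random.uniform(0.70, 0.85)   # Rainier's share of Anthropic's Trainium2
    / random.uniform(0.60, 0.90)   # Anthropic's share of all Trainium2
    for _ in range(100_000)
)
median = totals[50_000]   # lands near the ~1.9M central estimate above
```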
Trainium3 became generally available in December 2025, and Amazon said in early February that Trainium3 was “now delivering production workloads and seeing strong demand” without clarifying volumes.
While there is limited info, the most useful public anchor may be Morgan Stanley’s forecast that TSMC would produce ~180k Trainium3 chips in 2025 (and 1.1M Trainium2 chips, quite consistent with a 1.4M cumulative total). However, these estimates date back to February 2025, making them quite uncertain. In addition, the numbers are based on TSMC’s wafer allocations for CoWoS (one of the last stages of chip production), which happens ahead of chip shipment/delivery.
As of writing in February 2026, we omit a model of Trainium3 volumes, at the risk of underestimating total Trainium stocks.
We also have relatively limited evidence about Amazon AI chips pre-Trainium2.
SemiAnalysis’s chart, which groups 2024 production of Trainium1 and Inferentia together, puts that total at roughly 14% of the Trainium2 total across 2024-2025 (cumulative orange bars vs blue bars). If 1.4 million Trainium2 were deployed by February 2026 (likely an underestimate of Trainium2 production by then, and to a lesser degree of production through 2025), this suggests a total of ~200k of these older chips produced in 2024.
In addition, Omdia estimated that Amazon deployed 1.3 million Trainium and Inferentia chips in total in 2024. Since Trainium2 volumes were likely in the low-hundreds of thousands in 2024 (given SemiAnalysis reporting, as well as a total volume of 1.4m and likely ramp-up in 2025), this would imply roughly 1 million Trainium1 and Inferentia in 2024. However, we don’t know how Omdia broke this down; they could have estimated much higher Trainium2 volumes than suggested by other reports.
Trainium1 and Inferentia2 are only around 30% as powerful as Trainium2 on paper. Even with the high count of 1 million, given 1.4 million Trainium2 deployed in 2024-2025, these older chips would make up around 20% of the compute-equivalent total in Amazon’s custom AI chips, so this uncertainty is not highly consequential.
Given our estimates of cumulative Trainium stocks, how might they have been shipped over time?
Our data center model gives some clues: Project Rainier was ramping heavily by mid-2025. We estimate that New Carlisle and Madison were already at a combined 750k Trainium2 by the middle of 2025, just over half of Amazon’s disclosed cumulative 1.4M total for 2024-2025, and over one-third of the median estimate of the data-center-based model. These estimates are uncertain but suggest that Trainium2 shipments weren’t too back-loaded in the second half of 2025.
SemiAnalysis also helpfully estimated relative growth of Trainium package production over time, showing that total Trainium2 production in 2025 was ~4x higher than in 2024, implying that 2024 Trn2 production was ~25% of 2025 production and ~20% of the cumulative total, though they note there is a time delay between package production and server shipments.21

This same chart also shows relative growth in Trainium1 and Trainium2 production for every quarter in 2024 and 2025. Trainium1 peaked in mid-2024 and then ramped down, while Trainium2 ramped in 2024 and grew more modestly in 2025 (with H1 making up 40-45% of the 2025 total).
To avoid taking these quarterly estimates too literally, and because variance in package volumes could get smoothed out in actual deliveries, we model this as a steady growth rate of between 11% and 22% per quarter (corresponding to H1 shares of 45% and 40% of 2025 volumes, respectively).
Overall, this yields a rough timeline of Amazon AI chip deliveries.
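As a rough illustration, the steady-growth calibration can be derived by solving for the quarterly growth rate that reproduces a given H1 share of annual volume. This is a sketch, not the exact implementation in our notebooks:

```python
def h1_share(g):
    """Share of a year's volume shipped in H1, given steady quarterly growth g."""
    r = 1 + g
    quarters = [r ** i for i in range(4)]  # relative volumes for Q1..Q4
    return (quarters[0] + quarters[1]) / sum(quarters)

def growth_for_h1_share(target, lo=0.0, hi=1.0, tol=1e-6):
    """Invert h1_share by bisection (h1_share decreases as growth rises)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h1_share(mid) > target:
            lo = mid  # growth too low: H1 share still above target
        else:
            hi = mid
    return (lo + hi) / 2

# An H1 share of ~45% implies ~11%/quarter growth; ~40% implies ~22%/quarter
print(round(growth_for_h1_share(0.45), 3), round(growth_for_h1_share(0.40), 3))
```

Note that faster growth pushes more volume into later quarters, so the higher 22% growth rate corresponds to the lower 40% H1 share.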
There is limited media and analyst coverage of Trainium costs, though SemiAnalysis estimates Trainium2 at a ~$4k unit cost per chip for Amazon.
This is much lower than competitors like the Nvidia H100 (priced at >$20k each), despite Trainium2 having ~66% of the FLOP/s spec of an H100 and a similar amount of memory. The figure is plausible because Amazon’s custom chip effort is heavily vertically integrated: Amazon acquired the chip designer Annapurna Labs. For comparison, the Nvidia B200, which is more complex and has 2x the memory of Trainium2, likely costs only around $6k to produce.
Amazon still partners with external chip design firms Marvell and Alchip, but this is reportedly much less lucrative for those design partners than (e.g.) Google’s TPU is for Broadcom.22
Huawei’s Ascend 910-series chips are China’s leading domestically-designed AI accelerators, with the 910B and 910C being the primary models in recent years. Huawei Ascend volumes are covered fairly extensively by other analysts, so we primarily summarize and synthesize this external work. We focus on 2024 and 2025, as there is less coverage of previous years, and it is likely that the vast majority of Huawei Ascends are less than two years old.
For the Ascend 910B in 2024, analysts largely agree on 400,000–450,000 units. 36kr reported roughly 400,000 shipments, while both SemiAnalysis and Bernstein came in at 450,000. The Financial Times was a notable outlier at just 200,000. For the newer 910C, only SemiAnalysis provided an estimate: around 50,000 units in limited initial production.
2024 totals: ~400k 910B and ~50k 910C
Estimates diverge more for 2025, especially on the 910B/910C mix. SemiAnalysis projects 152,000 910Bs and 653,000 910Cs—a significant shift toward the newer chip. Bernstein sees a more even split at 350,000 of each. Bloomberg reported ~300,000 910Cs for 2025, with plans to double that in 2026.
The U.S. Commerce Department offered a much lower figure—just 200,000 total Ascends. This number is out of step with other estimates and may be scoped to chips produced entirely in China’s indigenous supply chain. Until the relevant export controls were tightened recently, Huawei relied on imports from TSMC to acquire logic dies for Ascends, as well as HBM imports.
For our 910C estimate, we have greater confidence in SemiAnalysis’s higher figure of ~600,000, due to its far greater detail (the other analyst estimates are simply reported in the media) and SemiAnalysis’s relative expertise in semiconductors.
2025 totals (best estimate): ~200k 910B and ~600k 910C
Given these estimates of Ascend volumes, what is the total computing power of the Ascend stock? The challenge is that there are multiple published specs for both the 910B and 910C. The 910B is commonly quoted at 320 16-bit TFLOP/s, but subvariant specs range from 280 to 400 TFLOP/s. Similarly, the 910C is most often quoted at 800 TFLOP/s (possibly rounded), though SemiAnalysis claims 780 TFLOP/s.
To handle this uncertainty, we use a Monte Carlo model, with a range of 280 to 400 TFLOP/s for the 910B and 780 to 800 TFLOP/s for the 910C. In addition, for the purpose of this dataset, we use the maximum OP/s rating for 8-bit (or greater) number formats. For the 910B, INT8 specs are 2x higher, or 560 to 800 TOP/s. While published INT8 specs are not available for the 910C, we assume the 910C’s maximum INT8 throughput is approximately 2x its FP16 rating, or ~1500 to 1600 TOP/s.
We also apply uncertainty intervals for unit counts based on the research above.
This model yields 180K H100-equivalents in 2024 (90% credence interval: 130K–240K), of which ~140K comes from 910Bs and ~40K from 910Cs, and 530K H100-equivalents in 2025 (450K–620K), split between ~70K from 910Bs and ~460K from 910Cs.
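The Monte Carlo calculation can be sketched as follows. The unit-count intervals here are illustrative stand-ins for the 2025 analyst ranges above, and we use the H100’s dense 8-bit rating of 1,979 TOP/s as the reference:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
H100_TOPS = 1979  # H100 SXM dense 8-bit TOP/s, the reference chip

# 8-bit throughput ranges (TOP/s) from the spec discussion above
tops_910b = rng.uniform(560, 800, N)
tops_910c = rng.uniform(1500, 1600, N)

# Illustrative 2025 unit-count intervals (assumed; see analyst estimates above)
units_910b = rng.uniform(150_000, 250_000, N)
units_910c = rng.uniform(450_000, 750_000, N)

h100_eq = (units_910b * tops_910b + units_910c * tops_910c) / H100_TOPS
lo, med, hi = np.percentile(h100_eq, [5, 50, 95])
print(f"2025 Ascend stock: ~{med:,.0f} H100-equivalents ({lo:,.0f}-{hi:,.0f})")
```

With these assumed ranges the median lands near our ~530K figure for 2025, though the exact interval depends on the unit-count assumptions.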
AMD sells AI chips called Instinct GPUs. They are arguably the largest competitor to Nvidia in selling AI chips that are not custom-designed for specific companies/applications (such as Google/Broadcom’s TPU and Amazon’s Trainium).23
We estimate AMD’s AI chip sales24 in three steps: estimating AMD’s quarterly Instinct GPU revenue, estimating the average selling price (ASP) of each GPU model, and estimating the revenue mix across GPU models over time.
With quarterly Instinct revenue, GPU pricing, and production mix estimates, we can calculate unit volumes by dividing revenue by weighted-average prices. We implement this using Monte Carlo simulation to model uncertainty from all input parameters and report results as median values with confidence intervals. The full model can be found here.
AMD reports its “data center” revenue each quarter (as in, revenue from its division that sells products intended for data centers), but they do not break out how much revenue comes from Instinct AI GPUs versus other products like its EPYC CPUs. To estimate Instinct revenue, we rely on company disclosures and work backwards from a few anchor points.
The most important anchor is AMD’s disclosure that Instinct revenue exceeded $5 billion in 2024. A secondary anchor helps us distribute this $5B across quarters in 2024. In AMD’s Q2 2024 earnings call, Lisa Su, AMD’s CEO, noted that Instinct MI300 revenue exceeded $1 billion for the first time. This helps ground how we model the quarterly distribution of the $5+ billion for 2024.
For 2025, we are less certain about Instinct GPU revenue.
AMD’s data center segment grew overall in 2025, but experienced headwinds in Q1 and Q2 2025 from two factors. First, new U.S. export controls in April 2025 restricted AMD’s MI308 chips designed for the Chinese market, resulting in an $800M charge in Q2. Second, AMD faced unexpectedly weak demand for its new MI325X GPU. We estimate that Instinct’s share of datacenter revenue fell from ~50% in Q4 2024 to ~40% in Q1 2025 and ~30% in Q2 2025.
This decline reversed in Q3 2025 when AMD introduced a new GPU generation, the MI350 series, which “ramped really nicely” and drove Q3 growth. We estimate Instinct represented 35-50% of Q3 2025’s $4.34B datacenter revenue, or ~$1.5-2.3B in revenue.
We have meaningful uncertainty because AMD does not regularly disclose the EPYC/Instinct split. The actual revenue distribution could differ, particularly in quarters where we have limited direct information.
| Quarter | DC Revenue | Instinct Share | Instinct Revenue | Notes |
|---|---|---|---|---|
| Q1 2024 | $2.34B | ~30% | ~$0.7B | Assumption based on ramp trajectory |
| Q2 2024 | $2.83B | ~36% | >$1B | Company disclosure |
| Q3 2024 | $3.55B | ~42% | ~$1.5B | Interpolation |
| Q4 2024 | $3.86B | ~47% | ~$1.8B | Interpolation |
| 2024 Total | $12.6B | 44% | ~$5B | AMD FY2024 |
| Q1 2025 | $3.67B | 35-45% | $1.3-1.7B | MI308 export controls, MI325X weak demand |
| Q2 2025 | $3.24B | 25-40% | $0.8-1.3B | $800M MI308 charge, continued softness |
| Q3 2025 | $4.34B | 40-53% | $1.7-2.4B | MI350 ramp |
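As a sanity check, the quarterly shares in the table can be multiplied through and compared against AMD’s two disclosed anchors:

```python
# Data center revenue ($B) and estimated Instinct share, from the table above
dc_rev = {"Q1 2024": 2.34, "Q2 2024": 2.83, "Q3 2024": 3.55, "Q4 2024": 3.86}
share  = {"Q1 2024": 0.30, "Q2 2024": 0.36, "Q3 2024": 0.42, "Q4 2024": 0.47}

instinct = {q: round(dc_rev[q] * share[q], 2) for q in dc_rev}
total = sum(instinct.values())

# Anchor 1: FY2024 Instinct revenue exceeded $5B (total should be just above 5.0)
# Anchor 2: Q2 2024 MI300 revenue exceeded $1B (Q2 entry should exceed 1.0)
print(instinct, f"FY2024 total: ${total:.2f}B")
```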
Average Selling Price
In order to convert Instinct revenue into unit volumes, we need to estimate what AMD sells each Instinct GPU for. Our estimates primarily draw from industry analyst reports.
We model each ASP as a uniform distribution within ranges anchored on media reports and analyst estimates.
| GPU Model | Price Range | Source |
|---|---|---|
| MI250X | $8,000-12,000 | Massed Compute |
| MI300A | $6,000-7,500 | Derived from El Capitan cost / # of units |
| MI300X | $10,000-15,000 | Tom’s Hardware; Citi via SeekingAlpha (Microsoft pays ~$10k) |
| MI325X | $10,000-20,000 | Low end reflects hyperscaler discounts amid soft demand |
| MI350X | $22,000-25,000 | Seeking Alpha (reports of $25k pricing) |
| MI355X | $25,000-30,000 | Seeking Alpha |
GPU Revenue Mix Over Time
Estimating the quarterly split between GPU models is necessary to measure the total compute capacity of AMD chips because GPUs vary in cost-effectiveness.
We model the revenue mix using product launch timelines, known customer deployments, and company commentary about production ramps. We estimated most 2024 sales came from the MI300 series, with a small share for the legacy MI250X, MI300A, and MI300X. In Q3 2025, the next-gen MI350 series launched ahead of schedule and we estimate that it quickly captured the majority of Instinct revenue. A more detailed timeline is shown below, with the full parameters available in our model.
We have significant uncertainty about the revenue mix because AMD does not disclose precise product shares. The actual mix could differ substantially, particularly during product transition periods when multiple generations overlap.
| GPU | Release date | Description | Revenue share notes |
|---|---|---|---|
| MI250X | Nov 2021 | Previous-gen flagship GPU | Legacy product with minor share in early 2024 |
| MI300A | Jan 2023 | First 300 series, designed for HPC over AI | Most of the 44.5K MI300A for El Capitan were delivered in Q2 and Q3 2024. Ramped down after |
| MI300X | Dec 2023 | Flagship AI GPU, competing with Nvidia H100 | Ramped through 2024, phasing out in 2025 for MI325X |
| MI308X | Dec 2023 | China-spec 300X to meet export controls | Unclear if high-volume shipments ever took place, exports definitively blocked in April 2025 |
| MI325X | Oct 2024 | 300X refresh with more memory, akin to Nvidia H200 | Ramped from a 5-20% share in Q1 2025 to 15-40% by Q2 2025 despite soft demand; launch timing against the B200 hurt |
| MI350 | Jun 2025 | Next-gen GPU, competing with Nvidia Blackwell | Volume production ahead of schedule; primary driver of Q3 2025 growth |
Volume Calculation
With quarterly Instinct revenue estimates, ASP distributions for each model, and production mix shares, we calculate unit volumes by multiplying total GPU revenue by each model’s production share to get model-specific revenue, then dividing by that model’s ASP.
Rather than working with point estimates, we sample from all the probability distributions simultaneously and compute the resulting unit volumes using Monte Carlo simulations.
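A minimal sketch of this sampling step for a single quarter, using the Q3 2025 revenue range from the table above; the MI350 mix share and the blended ASP for older models are hypothetical values for illustration, with the full parameter set in the linked model:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

instinct_rev = rng.uniform(1.7e9, 2.4e9, N)   # Q3 2025 Instinct revenue ($), from table
mi350_share = rng.uniform(0.5, 0.7, N)        # MI350-series revenue share (assumed)
asp_mi350 = rng.uniform(22_000, 25_000, N)    # MI350X ASP range, from table
asp_other = rng.uniform(10_000, 20_000, N)    # blended ASP for older models (assumed)

# Units = model-specific revenue divided by that model's sampled ASP
units = (instinct_rev * mi350_share / asp_mi350
         + instinct_rev * (1 - mi350_share) / asp_other)
lo, med, hi = np.percentile(units, [5, 50, 95])
print(f"Q3 2025 units: ~{med:,.0f} (90% interval {lo:,.0f}-{hi:,.0f})")
```

Sampling all parameters jointly lets uncertainty in revenue, mix, and pricing propagate into the final interval rather than being collapsed into a point estimate.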
Limitations
Our analysis has several sources of uncertainty:
Revenue allocation - AMD doesn’t report the EPYC/Instinct split for most quarters. We anchor on disclosed data points, but if AMD’s actual EPYC/Instinct mix differs from our estimates (for example, because EPYC grew faster or slower than we assumed), our GPU revenue estimates would be biased.
Production mix - We estimate GPU model shares from product timelines and deployment reports, but AMD doesn’t disclose actual splits. The actual splits could differ substantially from our estimates, especially during product transitions. This matters because different models have different prices and misestimating the mix between a $15,000 and $25,000 GPU directly affects unit counts.
Pricing - GPU pricing varies by customer, contract structure, and timing in ways our ASP ranges may not fully capture. Large bulk purchases and shifts in customer demand can meaningfully affect average per-GPU prices. We model ASPs as uniform distributions within reported ranges, but actual contract prices likely have more complex distributions.
Export controls - The impact of the MI308 export restrictions on Q2 2025 and subsequent quarters remains uncertain. AMD disclosed the $800M Q2 charge and a $1.5-1.8B annual impact, but the full effect on product mix, pricing, and customer allocation is difficult to model precisely.
Deployment lag - Our estimates reflect chips produced and sold by AMD, not necessarily chips deployed and operational in customer data centers. There may be deployment lags between when AMD recognizes revenue and when chips are online. For quarterly analysis, these timing differences could shift volumes between periods.
Summary
AMD’s Instinct GPUs compete with Nvidia’s data center accelerators but represent a much smaller share of the AI compute market. We estimate Instinct GPU volumes in 2024 and 2025 to understand AMD’s position in the AI chip market.
AMD bundles its Instinct GPU revenue with EPYC server CPU revenue in a single data center segment. This makes it more difficult to track AMD’s revenue from their Instinct GPU line. We estimate unit volumes by inferring GPU revenue and dividing by average chip prices. The main revenue anchor point is AMD’s statement that 2024 Instinct revenue exceeded $5 billion. We also use their Q2 2024 disclosure that MI300 revenue surpassed $1 billion that quarter. These two points let us estimate how GPU revenue was distributed across other quarters, accounting for product launches, media reports on adoption, and production schedules AMD discussed in earnings calls.
We estimate prices for each GPU model from analyst reports and media coverage of customer contracts. Hyperscalers like Microsoft pay around $10,000 per MI300X in volume, while newer models like the MI350X sell for $22,000-25,000. Production mix across GPU generations comes from product timelines, known deployments, and company statements about manufacturing. We run Monte Carlo simulations to propagate uncertainty rather than using point estimates.
Our AI Chip Sales hub currently covers Nvidia, AMD, Google TPU, Amazon Trainium, and Huawei AI chips, omitting chip families from other designers. We believe these are collectively minor compared to the families we do cover, likely <10% of the global stock of AI computing power. This conclusion comes from a combination of: research on specific companies (albeit relatively limited and shallow in most cases); industry estimates and supply-chain analysis of AI chip inputs like high-bandwidth memory and CoWoS; and tacit knowledge (if an AI chip company not on our list had achieved the scale of AMD or Trainium, we would be hearing more about it).
We give a brief overview below. For reference, we estimate that the five chip designers we cover sold over 10M H100-equivalents of AI chips in 2025 (~20M cumulative), worth hundreds of billions of dollars.
Meta has a custom AI chip program called MTIA, in partnership with Broadcom. In March 2026, they disclosed that they had deployed “hundreds of thousands” of MTIA chips. Because they only recently launched the MTIA 300 chip, it is likely that most of these hundreds of thousands consist of the older MTIA 100 and 200 chips, both a small fraction of H100 in computing power.
Microsoft also has a custom AI chip program called Maia, which they have deployed to some degree, with limited public information.
Intel sells a series of AI chips (“Gaudi”). In 2024, they failed to achieve their Gaudi sales target of just $500M. In late 2024, they guided a more ambitious sales target for 2025 of 200k to 250k Gaudi3 chips (roughly equivalent to 1 H100 each on paper), worth ~$3 billion. But Intel appears to have not broken out Gaudi revenue since then, and they disclosed $361 million in Gaudi-related inventory writeoffs in mid-2025, so it is doubtful that they achieved even this relatively modest guidance.
Cambricon, a Chinese AI chip startup, reportedly sold around 150k AI chips in 2025, with plans to triple production in 2026.
There are several relatively established AI chip startups, including Cerebras, Groq, and SambaNova. Groq sells powerful, low-latency AI chips, and they told investors that they expected $500M in revenue in 2025. They were later quasi-acquired by Nvidia, so their product lineup may be absorbed by Nvidia going forward. Cerebras also works on low-latency chips and has signed a major deal with OpenAI, but there is limited information on revenue or chip shipments to date. We are not aware of concrete public information on SambaNova sales, though we have not looked hard.
While this is not an exhaustive list of AI chip companies, any company not on this list is likely to be less established, and may not have developed or sold any chips.
Non-AI/non-datacenter chips
We also don’t track consumer GPUs and other chips that are not billed as commercial/data center-grade AI chips. Some of these may be useful for small-scale AI work. The most important category here may be Nvidia’s non-datacenter chips: gaming GPUs, professional-grade GPUs, and “automotive” (self-driving/robotics) chips. Close to 90% of Nvidia’s revenue has come from data center sales in recent years, so these segments are much smaller than Nvidia’s flagship data center products, setting aside their reduced suitability for major AI workloads.
Consumer devices in general can do some edge inference of small AI models, even when powered by CPUs that are much less capable of the parallel operations used in AI (e.g. matrix multiplications) than GPUs. But collectively they do not seem very significant. Consider that AI chips are already a sizable share of all computer chips (~$800B in global semiconductor revenue in 2025, vs ~$200B for Nvidia data center alone), and dedicated AI chips are far more suitable than consumer CPUs for AI work.
The AI Chip Production dataset describes estimated AI accelerator shipments in “timeline” batches broken down by time periods, chip designer, and chip type. The level of specificity in time period and chip type can vary by the available information and data.
We provide a comprehensive guide to the database’s fields below. This includes example field values as reference.
If you would like to ask any questions about the database, or request a field that should be added, feel free to contact us at data@epoch.ai.
This is the primary data record, batching estimates of AI chip shipments by time period and chip type.
| Column | Type | Definition | Example value | Coverage |
|---|---|---|---|---|
Organizations involved in chip design
| Column | Type | Definition | Example value | Coverage |
|---|---|---|---|---|
Metadata table with specifications for each chip model. This is a consolidated set of key fields; most of the data is synced with the Epoch ML Hardware dataset using the linked record.
| Column | Type | Definition | Example value | Coverage |
|---|---|---|---|---|
2026-03-23
Trainium model updates
Updated Trainium2 model to incorporate volume disclosures from Amazon, rather than using inferences from Project Rainier deployments. This reduced our estimate of Trainium2 volumes substantially, from ~2.5M to ~1.4M units. See methodology for more information.
2026-01-29
TPU model updates
Updated TPU volume and compute estimates with revised assumptions on production mix and margins based on expert feedback.
In summary, we now model more rapid transitions to new TPU generations. Specific changes to our production mix estimates include:
Revisions to Broadcom’s gross profit margins:
Implications:
Code refactoring
Ported our TPU and AMD models to Python Jupyter notebooks, which we previously used for Nvidia, and refactored all three models to use similar formats and helper functions. These can be found in Epoch AI’s ai-chip-counts repository.
The new models are functionally the same as before, with some specific modeling updates:
Data refactoring
Reorganized our “Timelines by chip” tables into a single table, with substantially similar columns, instead of multiple tables broken down by designer. In addition, the rows in the table are now all broken down by calendar quarter, where previously they were often broken down by fiscal quarters that do not line up with calendar quarters (e.g. for Google TPU and Nvidia) or other non-quarter periods such as years or half-years. This does not affect our graph views, which previously already interpolated the table results into calendar quarters.
Our models still output intermediate results broken down by fiscal quarter when they are distinct from calendar quarters, which you can view in the respective code notebooks (see full methodology).
Download the AI Chip Sales dataset as individual CSV files for specific data types, or as a complete package containing all datasets.
This data was collected by Epoch AI’s employees and collaborators, including John Croxton, Josh You, Venkat Somala, and Yafah Edelman.
This documentation was written by Josh You and Venkat Somala.
See estimates of chip lifespan in “Analysis” here. It is possible that some 2022-era chips have already been retired but we don’t explicitly model retirements/failures for simplicity.
For 2022, an explicit Networking breakout was not available, so we estimate/interpolate this as 75% Compute (networking share has generally gone down since 2020).
As a very minor point, Nvidia actually reports two sets of breakdowns. Nvidia has two “reportable segments”: “Compute & Networking” and “Graphics”. Nvidia also reports a separate breakdown over several “market platforms”, including Data Center. Within Data Center (which is what we use), revenue is further broken into “Compute” and “Networking”. For whatever reason, Data Center revenue (“Compute” plus “Networking”) is ~1% smaller than the “Compute & Networking” reportable segment.
Nvidia’s cloud reportedly runs mostly on rented capacity from Oracle and neoclouds. Nvidia also uses external cloud compute for at least some of the compute it uses for internal research and model training. This suggests that Nvidia owns very little of its own hardware (besides undelivered inventory, which we don’t include in our counts).
Networking equipment within a server would count as Networking revenue.
“we think if it was any higher than that, OEMs and ODMs would be screaming”, referring to third parties that sell Nvidia servers or assemble them as a service.
The fact that Nvidia received an export license to sell H200s, but not H100s, to China in late 2025 is perhaps a sign that the H100 was out of production or irrelevant by that time.
Alternatively, they can be deployed in 8-GPU B200 servers or standalone.
Nvidia said that by April 2025, hyperscalers were deploying almost 1000 NVL72 racks per week. At $2 to 3 million in revenue each, this would be 13 * 1000 * 2-3 million = $26B to 39B in revenue, versus $41 billion in total data center revenue the following quarter.
Full quote (while holding up a physical B200 die): “This will cost, you know, 30, 40 thousand dollars”
Company wide gross margin is about 75%, and could be higher for AI GPUs specifically.
The graph says 6M Blackwell GPUs, but the small text on the slide (“Blackwell and Rubin are 2 GPUs per chip”) indicates that this is a case of Jensen math: he is counting the two GPU dies in each package as separate GPUs. So 6M actually means 3M Blackwell GPUs as traditionally understood (under this accounting, a standalone B200 contains two GPUs, and an NVL72 system contains 144). This is not a reference to the fact that GB200/GB300 superchips contain two (or four) GPUs each.
And a CEO of a publicly traded company discussing sales figures would most likely round down, not up.
Recently, Google/Broadcom have also decided to start selling TPUs directly, e.g. to Anthropic and to other cloud companies. These deliveries will likely begin in 2026 and the TPU volumes we estimate are presumably owned entirely by Google.
This was disclosed because Broadcom’s CEO, Hock Tan, is on Meta’s board. Not all of this revenue from Meta necessarily went to XPUs, in which case Google’s share of XPUs would be even higher.
The assumption that TPU spend with Broadcom scales proportionally with overall CapEx may not hold exactly if spending shifts toward data centers, networking, or other areas.
For example, they forecast in Q2 that overall gross margins would decline “primarily due to a higher mix of XPUs in AI revenue” implying XPUs have a lower margin than their overall AI business. Hock Tan also didn’t push back against a caller who said “custom [chips] is probably dilutive within semis”.
The values in the Squiggle modeling may very slightly differ from the values in the table due to small stochastic variation from independent Monte Carlo draws, as well as minor implementation differences in how distributions are represented and sampled, despite identical parameter ranges and distributional assumptions. The values reported above are from the Python notebook that we derive the final estimates from.
If a TPU version did not have a Preview period and instead moved directly to General Availability (GA), we list the GA date.
Amazon often uses the name “Project Rainier” to describe New Carlisle specifically.
These numbers are produced by digitizing the chart image.
“Amazon and Annapurna are heavily cost focused and drive their suppliers hard. Compared to Broadcom’s ASIC deals, the Trainium projects have much less profit pool available to the chip design partners Alchip and Marvell.”
Not including AMD’s gaming GPUs which can be used for AI/ML but are not designed for it.