Changelog

2026-03-23

Trainium model updates

Updated Trainium2 model to incorporate volume disclosures from Amazon, rather than using inferences from Project Rainier deployments. This reduced our estimate of Trainium2 volumes substantially, from ~2.5M to ~1.4M units. See methodology for more information.

2026-01-29

TPU model updates

Updated TPU volume and compute estimates with revised assumptions on production mix and margins based on expert feedback.

In summary, we now model more rapid transitions to new TPU generations. Specific changes to our production mix estimates include:

  • Removed TPU v3 from our 2022 estimates; limited v4i to late 2022
  • Shifted v5e earlier; ended v4 after H1 2024
  • 2024 modeled as v5e/v5p mix with increased v5p share
  • Delayed v6e to Q3 2024; becomes dominant by Q1 2025
  • v5e phased out by end of 2024; Q3 2025 transitions from v6e to v7

Revisions to Broadcom’s gross profit margins:

  • FY23: 55–70% (down from 65–75%)
  • Post-FY23: 50–65% (down from 50–70%)

Implications:

  • These changes have a significant impact on the implied compute capacity of TPUs. Compared to the previous version, the updated model estimates 36% more compute capacity from TPUs, due to faster ramps of more advanced TPUs such as v6e.
  • The effect on unit volumes and total cost is much smaller. The updated model estimates 5.28% more TPU units and 2.73% higher total cost than the prior model.
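To see why a faster ramp of newer generations moves implied compute capacity much more than unit counts, consider that per-chip performance differs severalfold across TPU generations. The sketch below uses published peak dense BF16 figures per chip, but the unit mixes are purely hypothetical, not our actual estimates:

```python
# Why shifting the production mix toward newer TPUs raises compute far more
# than units: per-chip FLOP/s varies severalfold across generations.
# Per-chip peak dense BF16 performance (FLOP/s), from public spec sheets.
FLOPS = {"v5e": 197e12, "v5p": 459e12, "v6e": 918e12}

def totals(mix):
    """mix: {generation: units}. Returns (total units, total FLOP/s)."""
    units = sum(mix.values())
    compute = sum(n * FLOPS[g] for g, n in mix.items())
    return units, compute

# Hypothetical mixes for illustration only (not Epoch AI's figures):
old_mix = {"v5e": 600_000, "v5p": 300_000, "v6e": 100_000}
new_mix = {"v5e": 300_000, "v5p": 300_000, "v6e": 400_000}  # faster v6e ramp

u_old, c_old = totals(old_mix)
u_new, c_new = totals(new_mix)
print(f"units change:   {u_new / u_old - 1:+.1%}")
print(f"compute change: {c_new / c_old - 1:+.1%}")
```

With identical unit totals, reallocating 300k units from v5e to v6e alone raises total compute by over 60%, illustrating how the revised ramp assumptions can leave unit counts nearly unchanged while substantially increasing implied compute.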

Code refactoring

Ported our TPU and AMD models to Python Jupyter notebooks, the format we previously used for Nvidia, and refactored all three models to use a common structure and shared helper functions. These can be found in Epoch AI’s ai-chip-counts repository.

The new models are functionally the same as before, with some specific modeling updates:

  • We now model AMD and TPU price uncertainty as log-normally distributed, instead of uniformly and normally distributed respectively (Nvidia was already log-normal). This is a more appropriate choice given the exponential changes in chip price-performance over time, though it has only a minor impact on our median estimates and confidence intervals.
  • We discovered and fixed a bug in the Nvidia code that applied higher prices for 2024–2025 than intended, which had reduced the unit count estimate. Because the fix alone produced an estimate of H100 and H200 shipments higher than what Nvidia has disclosed, we also revised our price assumptions slightly upwards, resulting in almost no net change to cumulative H100/H200 sales.
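The log-normal price assumption in the first bullet can be sketched with the standard library alone. The parameters below (median price and log-space spread) are illustrative placeholders, not our actual estimates:

```python
# Sketch of sampling a per-unit chip price from a log-normal distribution.
# Parameters are illustrative, not Epoch AI's actual estimates.
import math
import random

random.seed(0)

MEDIAN_PRICE = 20_000.0   # hypothetical median unit price, USD
SIGMA = 0.3               # hypothetical log-space standard deviation

def sample_price() -> float:
    # lognormvariate takes the mean and stddev of the underlying normal;
    # using log(median) as the mean yields the desired median price.
    return random.lognormvariate(math.log(MEDIAN_PRICE), SIGMA)

prices = [sample_price() for _ in range(100_000)]
prices.sort()
median = prices[len(prices) // 2]
print(f"sample median: ${median:,.0f}")  # close to MEDIAN_PRICE
```

Unlike a normal distribution, the log-normal can never produce negative prices, and its right skew matches the multiplicative nature of price-performance changes over time.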

Data refactoring

Reorganized our “Timelines by chip” tables into a single table with largely the same columns, instead of separate tables for each designer. All rows are now broken down by calendar quarter; previously, many were broken down by fiscal quarters that do not align with calendar quarters (e.g. for Google TPU and Nvidia), or by other non-quarter periods such as years or half-years. This does not affect our graph views, which already interpolated the table results into calendar quarters.

Our models still output intermediate results broken down by fiscal quarter when they are distinct from calendar quarters, which you can view in the respective code notebooks (see full methodology).
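One way to convert fiscal quarters into calendar quarters, consistent with the interpolation described above, is a day-weighted allocation: each fiscal quarter's total is split across the overlapping calendar quarters in proportion to overlapping days. The sketch below is an assumption about the general approach, not the models' exact code; the dates approximate an Nvidia fiscal quarter (fiscal years end in late January) and the volume is illustrative:

```python
# Day-weighted reallocation of a fiscal-quarter total into calendar quarters.
# Dates and quantities are illustrative, not the models' actual figures.
from collections import defaultdict
from datetime import date, timedelta

def calendar_quarter(d: date) -> str:
    return f"{d.year}Q{(d.month - 1) // 3 + 1}"

def allocate(start: date, end: date, total: float) -> dict:
    """Split `total` over calendar quarters spanned by [start, end),
    weighting each quarter by its number of overlapping days."""
    days = defaultdict(int)
    d = start
    while d < end:
        days[calendar_quarter(d)] += 1
        d += timedelta(days=1)
    span = (end - start).days
    return {q: total * n / span for q, n in days.items()}

# Approximate Nvidia fiscal Q1 FY2025: 2024-01-29 to 2024-04-28
split = allocate(date(2024, 1, 29), date(2024, 4, 28), 1_000_000)
for q, units in sorted(split.items()):
    print(q, round(units))
```

Here 63 of the 90 days fall in 2024Q1 and 27 in 2024Q2, so the fiscal-quarter total is split 70/30 between the two calendar quarters.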