Epoch’s AI Chip Components dataset collects our estimates of the share of global advanced semiconductor manufacturing capacity consumed by the leading AI accelerator designers, by quarter, from Q1 2024 through Q4 2025. The designers covered are NVIDIA, AMD, Google, and Amazon.
Throughout this methodology, “Google” and “Amazon” refer to the component consumption associated with Google TPUs and AWS Trainium chips, respectively. These are end-customer attributions. Broadcom is Google’s long-running TPU ASIC partner, while AWS Trainium is developed by AWS’s Annapurna Labs with support from external ASIC partners such as Marvell and Alchip. We attribute the associated logic, HBM, and CoWoS demand to Google and Amazon because they are the designers and these chips are built for and deployed in their AI infrastructure.
We focus on three supply chain components:
Advanced-node logic wafers: 3–5 nm wafers fabricated at TSMC (N3/N3E/N3P and N5/4N/4NP nodes), which form the silicon dies inside AI accelerators.
CoWoS advanced packaging wafers: Chip-on-Wafer-on-Substrate (CoWoS) packaging is used to integrate multiple chiplets and high-bandwidth memory stacks onto a single substrate.
High-Bandwidth Memory (HBM): Specialized DRAM stacks placed next to the logic die inside the accelerator package to quickly feed data to the accelerator’s compute cores.
We focus on the key semiconductor components used in chip packages, not the broader infrastructure required to deploy those chips. In particular, our estimates do not cover rack-level, networking, or data-center equipment such as servers, optics, cooling systems, or power.
Completed AI chips are usually sold weeks or months after their main inputs were fabricated, packaged, or attached. For example, the logic wafer, CoWoS capacity, and HBM used in a chip sold in Q2 may have been consumed in Q1. To compare chip demand with available component supply, we therefore assign each chip’s component demand to the quarter when those inputs were likely used, not the quarter when the finished chip was sold.
Our estimation model proceeds in three steps. First, we estimate quarterly chip production by designer and chip type using our AI Chip Sales dataset and adjust for manufacturing lags, inventory accumulation, and stranded inventory due to export controls (write-downs). Second, we translate those chip volumes into demand for logic wafers, CoWoS wafers, and HBM using per-chip specifications and yield assumptions. Third, we compare each designer’s component demand against our estimates of total quarterly component supply.
This lets us estimate both absolute component consumption and each designer’s share of the global supply pool.
The model has two sides. The first side is component demand. We estimate how many chips of each type were produced in each quarter, then estimate how much logic wafer capacity, CoWoS packaging capacity, and HBM each chip required. This requires chip shipment estimates, manufacturing lag assumptions, inventory adjustments, and per-chip bill-of-materials parameters.
The second side is component supply. We estimate the total quarterly global supply of each component. For logic wafers, we work backward from TSMC’s per-node wafer revenue. For CoWoS, we interpolate from reported capacity benchmarks in wafers per month. For HBM, we build quarterly market estimates from supplier disclosures, HBM market-share estimates, and annual market-size anchors.
We then calculate the share of supply by dividing component demand by component supply.
Our analysis proceeds in several steps:
We estimate total global quarterly production for each supply chain component. Each component uses a different approach.
We estimate global advanced-node logic wafer supply by working backward from TSMC’s per-node quarterly wafer revenue. Per-node wafer revenue (3 nm and 5 nm) is sourced directly from TSMC’s quarterly earnings disclosures and can be found in this input sheet. Our estimates refer to 12-inch wafers, the standard wafer size for advanced-node logic fabrication. Each 12-inch wafer is later cut into dozens of usable AI accelerator dies, depending on the die size and yield.
To convert quarterly wafer revenue into wafer counts, we divide revenue by the estimated per-wafer price for each node. We use the following price ranges:
| Node | P5 | Median | P95 |
|---|---|---|---|
| 3 nm | $17,000 | $20,000 | $22,000 |
| 4 nm | $16,000 | $18,500 | $20,000 |
| 5 nm | $16,000 | $17,500 | $19,000 |
After estimating TSMC wafer counts, we scale up to global supply by dividing by TSMC’s estimated share of global advanced-node capacity. Analysts often estimate that TSMC accounts for roughly 90% of global advanced-node capacity, but this figure may include 7 nm capacity. Since our model focuses on 5 nm and below, TSMC’s share of the relevant capacity pool could be slightly different. We therefore model TSMC’s share of global 5 nm-and-below logic wafer capacity as a range, using 85% / 90% / 95% for our low, median, and high cases.
| Quarter | TSMC 3nm Wafer Revenue | TSMC 5nm Wafer Revenue | TSMC 3nm Wafers (k) | TSMC 5nm Wafers (k) | Estimated Global 3nm Wafers (k) | Estimated Global 5nm Wafers (k) |
|---|---|---|---|---|---|---|
| 2024 Q1 | $1.45B | $6.07B | 72 | 347 | 80 | 386 |
| 2024 Q2 | $2.79B | $6.48B | 139 | 370 | 155 | 411 |
| 2024 Q3 | $4.02B | $6.35B | 201 | 363 | 223 | 403 |
| 2024 Q4 | $6.01B | $7.92B | 300 | 453 | 334 | 503 |
| 2025 Q1 | $4.87B | $7.74B | 244 | 442 | 271 | 491 |
| 2025 Q2 | $6.03B | $9.33B | 302 | 533 | 335 | 592 |
| 2025 Q3 | $6.64B | $10.81B | 332 | 617 | 369 | 686 |
| 2025 Q4 | $8.01B | $10.08B | 400 | 576 | 445 | 640 |
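The conversion behind this table can be sketched in a few lines. This is a minimal sketch using the median price and TSMC-share assumptions above; the function and constant names are ours:

```python
# Sketch: convert TSMC per-node wafer revenue into estimated global wafer
# counts, using the median price and TSMC-share assumptions described above.
MEDIAN_WAFER_PRICE = {"3nm": 20_000, "5nm": 17_500}  # USD per 12-inch wafer
TSMC_SHARE = 0.90  # median assumed TSMC share of 5 nm-and-below capacity

def global_wafers(node: str, tsmc_wafer_revenue_usd: float) -> float:
    """Estimate the global wafer count for one node in one quarter."""
    tsmc_wafers = tsmc_wafer_revenue_usd / MEDIAN_WAFER_PRICE[node]
    return tsmc_wafers / TSMC_SHARE

# Example: 2025 Q4, 3 nm, $8.01B of TSMC wafer revenue -> ~445k wafers
print(round(global_wafers("3nm", 8.01e9) / 1_000))  # 445
```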
We focus on 3 nm and 5 nm nodes since leading AI accelerators from all four tracked designers use either 3 nm or 5 nm logic dies. Accelerators marketed on TSMC’s “4nm” tier (for example, B200 and B300) are part of the N5 family and are priced using N5-family wafer prices in our model.
As a cross-check, SemiAnalysis estimated TSMC’s 3 nm wafer shipments at approximately 215k, 285k, 300k, and 385k wafers in Q1–Q4 2025, respectively. These figures are broadly consistent with our revenue-based estimates, though SemiAnalysis’s figures are somewhat lower in each quarter. Our median estimates imply 244k, 302k, 332k, and 400k TSMC 3 nm wafers over the same period, or about 13% higher in Q1, 6% higher in Q2, 11% higher in Q3, and 4% higher in Q4.
One reason for this discrepancy could be that we use a fixed 3 nm wafer ASP across quarters. In reality, TSMC’s blended 3 nm ASP likely varies over time as the mix shifts across N3 variants and customers. Different customers may also face different wafer prices depending on product complexity, volume commitments, contract terms, and whether they are buying early-node or more mature-node capacity. If the true blended 3 nm ASP in a given quarter was higher than our fixed $20,000 median assumption, our revenue-based method would overestimate wafer counts for that quarter.
Overall, our estimates are within the same ballpark as SemiAnalysis’s estimates.
CoWoS (Chip-on-Wafer-on-Substrate) is TSMC’s advanced packaging technology used for virtually all high-end AI accelerators. We estimate CoWoS capacity in wafers per month (WPM) from public industry reports.
TSMC does not directly disclose packaging capacity figures, so we rely on benchmarks reported by industry analysts at year-end. We use the following anchor points:
| Year-end | WPM (low) | WPM (high) | Source |
|---|---|---|---|
| End of 2023 | 13,000 | 16,000 | SemiWiki |
| End of 2024 | 35,000 | 40,000 | SemiWiki |
| End of 2025 | 65,000 | 75,000 | SemiWiki, TrendForce |
| End of 2026 | 90,000 | 110,000 | SemiWiki, DigiTimes |
These benchmarks represent end-of-year capacity snapshots, not annual averages. We linearly interpolate between year-end anchors to construct a monthly WPM profile, then sum monthly values within each calendar quarter to obtain quarterly capacity. This approach assumes a smooth ramp in TSMC’s CoWoS capacity throughout the period.
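A minimal sketch of this interpolation, using the midpoints of the reported ranges as anchor values (the exact anchor values and interpolation grid in our model may differ):

```python
import numpy as np

# Sketch: build a monthly CoWoS wafers-per-month (WPM) profile by linear
# interpolation between year-end anchors, then sum months into quarters.
# Anchor values here are midpoints of the reported ranges above.
anchor_months = np.array([0, 12, 24, 36])                  # months after end-2023
anchor_wpm = np.array([14_500, 37_500, 70_000, 100_000])   # midpoint WPM

months = np.arange(1, 25)                                  # Jan 2024 .. Dec 2025
monthly_wpm = np.interp(months, anchor_months, anchor_wpm)

# Quarterly capacity in wafers = sum of the three monthly values per quarter
quarterly_wafers = monthly_wpm.reshape(8, 3).sum(axis=1)
print(quarterly_wafers.round(-3))  # eight quarters, 2024 Q1 .. 2025 Q4
```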
We anchor HBM supply to publicly disclosed estimates of the total HBM market size. Micron has stated the total HBM addressable market was approximately $18B in 2024 and $35B in 2025. Goldman Sachs estimated a $36B HBM market for 2025, corroborating Micron’s figure.
We measure HBM supply in USD to match these market-size estimates. HBM prices vary by generation, including HBM2e, HBM3, and HBM3e, so we apply generation-specific price assumptions drawn from analyst estimates when converting physical GB consumed into dollar values.
To estimate quarterly HBM supply, we do not simply divide the annual market totals evenly across quarters, because the HBM market did not grow smoothly quarter over quarter. Supply and revenue were affected by sharp product ramps, changing HBM3E qualification timelines, customer pull-ins, and temporary shipment fluctuations, including a reported contraction in HBM shipment volumes in Q1 2025.
Instead, we build quarterly estimates using manufacturer disclosures and market-share estimates. Specifically, we combine reported or estimated HBM revenue from the major HBM suppliers, primarily SK Hynix and Micron, with Counterpoint Research’s estimates of each supplier’s share of the global HBM market. For a given quarter, we estimate the total global HBM market size as:
Global HBM market = Company HBM revenue / Company HBM market share
For SK Hynix, we estimate HBM revenue from its quarterly DRAM revenue and company disclosures about HBM as a share of DRAM revenue. For example, SK Hynix stated that HBM was roughly 30% of DRAM revenue in 2024 Q3 and was 40% in 2024 Q4. We combine these disclosures with SK Hynix’s DRAM revenue and Counterpoint’s estimated HBM market share to infer the total quarterly HBM market size.
For Micron, we use its HBM revenue disclosures as direct anchors where available. For example, Micron stated that HBM revenue exceeded $1B in its fiscal Q2 2025, grew by nearly 50% sequentially in fiscal Q3 2025, and reached nearly $2B in fiscal Q4 2025. We use these disclosures to estimate Micron’s quarterly HBM revenue and then scale to the total market using Counterpoint’s estimated Micron HBM market share.
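The scaling step is a simple division; the sketch below uses hypothetical figures rather than our model’s exact inputs:

```python
# Sketch: infer total quarterly HBM market size from one supplier's HBM
# revenue and its estimated market share. Figures are hypothetical.
def hbm_market_size(company_hbm_revenue_usd: float, company_share: float) -> float:
    return company_hbm_revenue_usd / company_share

# A supplier with $3.0B of quarterly HBM revenue and an estimated 60% share
# implies a ~$5.0B global HBM market that quarter.
print(round(hbm_market_size(3.0e9, 0.60) / 1e9, 2))  # 5.0
```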
We then select the most informative company-based estimate for each quarter and check the resulting quarterly path against annual market-size anchors, ensuring consistency with the reported $18B (2024) and $35B (2025) global HBM markets.
This approach produces a quarterly HBM supply path that reflects the actual ramp in the market rather than imposing a smooth linear trend.
| Quarter | Estimated Global HBM Market |
|---|---|
| 2024 Q1 | $2.5B |
| 2024 Q2 | $3.5B |
| 2024 Q3 | $4.9B |
| 2024 Q4 | $7.5B |
| 2025 Q1 | $6.5B |
| 2025 Q2 | $7.9B |
| 2025 Q3 | $9.5B |
| 2025 Q4 | $11.2B |
These quarterly estimates have meaningful uncertainty because HBM suppliers do not consistently report stand-alone HBM revenue, and market-share estimates vary across analysts. They also rely heavily on Counterpoint’s quarterly market-share estimates, which may themselves be inaccurate or uncertain. We place more confidence in the annual totals, which are close to public market-size anchors, than in any single quarter estimate. Still, this approach gives a more realistic quarterly profile of the HBM ramp than simply splitting the annual market size evenly across quarters.
Estimating how much of each supply chain component a given chip designer consumed in a quarter requires two inputs: how many chips of each type they produced, and how much of each component each chip requires. Multiplying those together gives component consumption. We then aggregate across chip types and designers to get the total demand.
Our estimates of how many chips of each type were produced are based on data from our AI Chip Sales hub as well as company-disclosed inventory values. The subsections below detail each step: chip volume inputs, the manufacturing lag adjustment, accounting for inventory, and the conversion from units to component consumption.
Quarterly shipment volumes for each chip come from the Epoch AI Chip Sales Hub, which has per-accelerator unit estimates for every quarter in 2024 and 2025.
The Chip Sales Hub estimates quarterly chip shipments in part from chip designers’ disclosed revenue. But the quarter in which a chip is sold is not necessarily the quarter in which its semiconductor components were manufactured or consumed. Logic dies, HBM, and advanced packaging capacity are consumed before the finished accelerator is recognized as revenue or shipped to a customer.
To estimate quarterly component consumption, we therefore shift chip volumes backward from the quarter in which they are sold to the quarter or quarters in which their inputs were likely consumed. This is necessarily an approximation: production timelines vary by chip, designer, inventory strategy, supply chain dynamics, and quarter.
| Component | Lag Assumed | Lag Range | Source |
|---|---|---|---|
| Logic die fabrication | 10 weeks | 8–14 weeks | BCG/SIA, ECI |
| HBM memory | 8 weeks | 8–16 weeks | Electronic Component |
| CoWoS packaging | 8 weeks | 6–10 weeks | Electronic Component |
For each component, we define “consumption” to match the timing convention used in the corresponding supply denominator.
For logic wafers, we count consumption at wafer completion / shipment, not wafer start. Our logic supply denominator is derived from TSMC’s process-node wafer revenue, and TSMC reports node mix as a share of wafer revenue from shipments. For example, in 4Q25, TSMC noted that “shipments of 3-nanometer accounted for 28% of total wafer revenue.” Since the supply denominator is based on completed wafer shipments, we align demand to the point when the relevant logic wafer is completed and transferred. We assume this occurs roughly 10 weeks before the finished accelerator is shipped or sold. A 10-week lag is shorter than a full calendar quarter, so the associated consumption splits across the prior quarter and the shipment quarter. We use a 10/13–3/13 split: roughly 77% of logic demand is assigned to the prior quarter and 23% to the shipment quarter.
For CoWoS, the denominator is different. Public CoWoS estimates are usually reported as wafers per month of packaging capacity, not revenue from completed packages. We therefore count CoWoS consumption when a chip enters and occupies the packaging flow. This better matches the capacity constraint: a package consumes scarce CoWoS capacity while it is moving through the line, not only when it exits. We assign HBM to the same timing because HBM is committed to the package at the CoWoS stage.
We assume CoWoS and HBM consumption occurs roughly eight weeks before the finished accelerator is shipped or sold. An eight-week lag is shorter than a full calendar quarter, so the associated consumption often falls across both the prior quarter and the shipment quarter. We therefore use a simple 2/3–1/3 split: two-thirds of HBM and CoWoS demand is assigned to the prior quarter, and one-third is assigned to the shipment quarter. Intuitively, this assumes production is spread roughly evenly across the quarter. For a chip shipped during a given quarter, an eight-week lookback will usually place most of the packaging-stage activity in the prior quarter, but a meaningful share remains in the same quarter, especially for chips shipped later in the quarter. This split is an approximation, but it better captures the shorter lag for HBM and CoWoS than assigning all consumption to the prior quarter.
This timing treatment is approximate, but the principle is consistent: logic demand is aligned to completed wafer shipments because the supply denominator is shipment-based, while CoWoS demand is aligned to packaging-line usage because the supply denominator is capacity-based.
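The back-shift can be expressed as a simple weighted reallocation. The sketch below, with illustrative volumes, shows how a quarter’s consumption combines same-quarter and next-quarter shipments:

```python
# Sketch: reallocate per-quarter shipments into per-quarter component
# consumption using the split fractions described above.
LOGIC_SPLIT = (10 / 13, 3 / 13)   # (prior quarter, shipment quarter), 10-week lag
COWOS_HBM_SPLIT = (2 / 3, 1 / 3)  # 8-week lag

def lag_adjust(shipments: list[float], split: tuple[float, float]) -> list[float]:
    """consumption[q] = prior_share * shipments[q+1] + same_share * shipments[q].

    The final modeled quarter therefore needs an estimate of next-quarter
    shipments, which is addressed in the next subsection.
    """
    prior, same = split
    return [prior * shipments[q + 1] + same * shipments[q]
            for q in range(len(shipments) - 1)]

# With flat shipments the split washes out: consumption equals shipments.
print(lag_adjust([100_000, 100_000, 100_000], LOGIC_SPLIT))  # ~[100000, 100000]
```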
One consequence of the lag between chip sales and manufacturing is that estimating Q4 2025 consumption requires information about Q1 2026 shipment volumes. Many of the chips sold in Q1 2026 would have already consumed semiconductor inputs in Q4 2025, or even earlier, depending on the stage of production.
We therefore need to estimate production that occurred before the end of 2025 but was not captured in Q4 2025 revenue. The available evidence varies by designer, and we take a different approach for each. For AMD, Q1 2026 shipment volumes can be estimated from its Q1 FY2026 earnings release, so most Q4 2025 component consumption is captured through the standard lagged-shipment path. For NVIDIA, Q1 2026 shipment volumes are not yet observable, so we use the Jan. 2026 inventory balance to proxy for late-2025 production not yet captured in revenue. For Google and Amazon, balance-sheet inventory is not informative because their chips are manufactured through external partners, so we extrapolate Q1 2026 volumes from 2025 production trajectories.
For NVIDIA, Q1 2026 shipment volumes are not yet fully observable. To account for this, we use NVIDIA’s year-end 2025 inventory balances as a proxy for production that had occurred but had not yet been recognized as revenue. Work-in-process inventory captures chips and components that are still moving through the production process, while finished-goods inventory captures completed products that have not yet been sold. Both categories can therefore reflect logic wafers, HBM, and advanced packaging capacity consumed before year-end 2025. We use these balances to estimate the portion of Q4 2025 production, and associated component consumption, that is not captured by Q4 2025 sales.
This approach relies on a sell-through assumption: WIP that is close to completion, along with finished-goods inventory on the year-end balance sheet, is assumed to convert into shipments within roughly one quarter. This assumption, while not strictly true, is sensible for NVIDIA. NVIDIA is the market leader in AI accelerators, and demand for its GPUs remains exceptionally high, so year-end inventory is more likely to reflect near-term shipments rather than excess supply. However, NVIDIA can hold GPUs in inventory for several quarters in some cases. For example, Reuters reported that NVIDIA had roughly 700,000 H200s in stock at the end of 2025 that they likely intended to sell to Chinese customers once H200 exports were approved, which we cover further below.
Inventory is segmented into three categories, each representing a different stage of the supply chain.
Raw materials are physical components and materials received but not yet used in the production process. For data center GPUs, we assume the GPU-attributable share of raw materials is primarily HBM, alongside substrates, power-delivery components, and other auxiliary inputs. We make this assumption because HBM is one of the highest-cost purchased inputs in advanced AI GPUs and was supply-constrained in 2025, making it more likely to be procured ahead of final assembly than lower-value components.
Work-in-process inventory consists of chips and components at intermediate stages of assembly. Some WIP may have consumed only logic wafer capacity: for example, fabricated logic dies that have not yet entered CoWoS packaging. Other WIP may be further along and have already consumed logic, HBM, and CoWoS capacity. Since CoWoS packaging attaches HBM to the logic die through an interposer, HBM must be available before the packaging step can begin.
Finished goods are completed accelerators that have passed through all manufacturing stages but have not yet been sold. Finished goods have consumed logic wafers, HBM, CoWoS packaging, and auxiliary components. Because they are completed chips, we first estimate when they were added to FG inventory, then apply the standard manufacturing lag to estimate when their semiconductor inputs were consumed.
The inventory balances for NVIDIA are:
| Inventory Category | NVIDIA Jan ‘26 |
|---|---|
| Raw materials | $3.8B |
| Work-in-process | $8.8B |
| Finished goods | $8.8B |
| Total | $21.4B |
Before converting inventory balances to units, we first isolate the GPU share of total inventory. NVIDIA’s inventory spans multiple business lines, so we proxy the GPU share using GPU revenue as a fraction of total revenue, which we estimate at 70%–80%.
We then convert GPU-attributable inventory balances into implied chip units. GAAP inventory is recorded at manufacturing cost, not selling price or market value. NVIDIA’s finished-goods balance, therefore, is the production cost of those chips rather than the revenue they would generate. To estimate unit counts from dollar balances, we divide inventory value by a per-chip bill-of-materials cost.
This bill-of-materials estimate is built bottom-up: we estimate logic die fabrication cost from wafer prices, dies per wafer, die yields, and chiplets per GPU; HBM cost from HBM capacity and price per GB; CoWoS packaging cost from packaging wafer prices, packages per wafer, and packaging yield; and then add auxiliary component costs. We also adjust logic and HBM costs to account for expected losses during packaging, since failed packaging attempts can scrap the attached die and memory. This gives BOM estimates for each chip, which we use to convert inventory balances into implied unit counts. You can find our BOM estimates for each chip here.
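A minimal sketch of the dollars-to-units conversion, using placeholder values rather than our published BOM estimates:

```python
# Sketch: convert an inventory balance (recorded at cost) into implied chip
# units via a per-chip BOM cost. The BOM figure below is a placeholder, not
# one of our published estimates.
def implied_units(inventory_usd: float, gpu_share: float, bom_cost_usd: float) -> float:
    """GPU-attributable inventory dollars divided by per-chip BOM cost."""
    return inventory_usd * gpu_share / bom_cost_usd

# Hypothetical: $8.8B finished goods, 75% GPU-attributable, $10,000 BOM/chip
print(round(implied_units(8.8e9, 0.75, 10_000)))  # 660000 units
```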
We attribute inventory to the quarter in which the underlying semiconductor inputs were consumed, rather than the quarter in which the inventory appears on the balance sheet. This matters because raw materials, work-in-process inventory, and finished goods represent different stages of production.
For raw materials, we do not apply an additional manufacturing lag. We assume the data center GPU share of raw materials is primarily HBM, alongside substrates and other components. The relevant event is the purchase or receipt of HBM, not the completion of a chip. We therefore phase raw-materials inventory directly across the quarters in which we estimate the HBM was procured. Since HBM3E supply ramped most sharply in the second half of 2025, we attribute 30% of the year-end stockpile to Q3 2025 and 70% to Q4 2025.
For work-in-process, we apply a separate inventory timing treatment rather than the standard shipment lag. WIP is a point-in-time inventory balance: it represents chips or components that NVIDIA already owns or controls, but that have not yet reached finished-goods inventory. We model WIP as a mix of chips that have consumed only logic wafer capacity and chips that have also consumed HBM and packaging capacity.
At the median, we assume 70% of WIP has reached the packaging stage and has therefore consumed logic, HBM, and CoWoS. The reasoning is based on the relative time a chip spends at each stage of WIP. A GPU enters WIP when its logic die is received from TSMC. It then spends roughly 1–4 weeks in wafer testing and dicing before CoWoS packaging begins, followed by another ~8 weeks moving through CoWoS packaging, final assembly, and testing before reaching finished goods. This means less than one-third of a chip’s time in WIP is spent in the logic-only stage, and more than two-thirds is spent past CoWoS start. Thus, we assume that, at any given snapshot, roughly 70% of WIP units have already consumed HBM and CoWoS capacity. We model this share as uncertain (range 55%–82%) to capture variation in lead times across chip generations and year-end production effects.
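A sketch of this two-stage split, assuming a triangular sampling distribution over the stated range (our model’s actual sampling scheme may differ):

```python
import random

# Sketch: split a WIP balance into logic-only and full-stack tranches, sampling
# the packaging-stage share from a triangular distribution over the stated
# 55%-82% range with a 70% mode.
def sample_wip_split(wip_usd: float) -> tuple[float, float]:
    full_stack_share = random.triangular(0.55, 0.82, 0.70)  # low, high, mode
    logic_only = wip_usd * (1 - full_stack_share)   # consumed logic wafers only
    full_stack = wip_usd * full_stack_share         # consumed logic + HBM + CoWoS
    return logic_only, full_stack

logic_only_usd, full_stack_usd = sample_wip_split(8.8e9)  # NVIDIA Jan '26 WIP
```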
We attribute WIP component consumption mostly to Q4 2025. Chips still in WIP at NVIDIA’s fiscal year-end on Jan. 26, 2026, were likely in active production shortly before the snapshot. Even for WIP chips that had reached CoWoS packaging or final testing, the associated logic wafer fabrication likely occurred roughly 8–12 weeks earlier, placing it mostly in calendar Q4 2025. HBM and CoWoS integration occur later in the process and are therefore also likely to fall in Q4 for chips still in WIP at the Jan. 26 snapshot.
This treatment may overstate Q4 2025 consumption because NVIDIA’s fiscal year-end falls nearly four weeks after calendar year-end. Some inventory captured in the Jan. 26 balance corresponds to component consumption in early calendar Q1 2026. We accept this approximation for simplicity and because the WIP balance is intended to capture NVIDIA’s late-2025 production pipeline rather than precisely date each unfinished unit.
For finished goods, we apply the standard manufacturing lag. Finished goods are completed accelerators, so we can treat their assigned production quarter as a completion event. We back-shift logic die fabrication using the 10-week lag (10/13 prior quarter, 3/13 same quarter) and CoWoS/HBM integration using the standard two-thirds prior-quarter / one-third same-quarter split. NVIDIA’s finished-goods balance is phased across fiscal Q2, Q3, Q4 2025 at 5%, 10%, 85%, respectively. We then map those fiscal-quarter finished-goods balances into calendar-quarter component consumption using the standard manufacturing lags.
The core NVIDIA timing assumptions are:
| Inventory Category | Q3 | Q4 | Notes |
|---|---|---|---|
| Raw materials | 30% | 70% | HBM was supply-constrained throughout 2025. Memory makers ramped HBM3e output through the year, so the stockpile that NVIDIA holds at year-end was acquired progressively, with the largest portion arriving in Q4. The split reflects this gradual accumulation. |
| Work-in-process – chips with only logic completed | 0% | 100% | WIP chips with only logic completed would have had their logic dies fabricated in the few weeks immediately before year-end. We attribute 100% of their consumption to Q4. |
| Work-in-process – full-stack chips | 0% | 100% | Both logic and CoWoS+HBM steps fall in Q4 in the current model. |
We apply the following consumption assumptions:
Raw materials: We model HBM as 30%–70% of GPU-attributable raw materials, with a median of 50%.
Work-in-process: We attribute WIP consumption across chip types with meaningful uncertainty. NVIDIA WIP is assumed to consist predominantly of Blackwell-generation chips: B300 at a median of 80%, with a 70%–90% range, and B200 at a median of 20%, with a 10%–30% range. By Q4 2025, the Blackwell ramp had largely displaced Hopper in the production pipeline, and B300 accounted for 63% of unit sales. Since we expect the B300 share of sales to continue rising, we model NVIDIA WIP as 80% B300 at the median.
Finished goods: We use NVIDIA’s full January 2026 finished-goods balance of $8.8B, adjusted for stranded H200 inventory as described below. After deducting stranded H200 inventory, we allocate the remaining finished-goods balance across Blackwell chips, with B300 at a median of 75% and B200 at a median of 25%.
AMD is treated differently because we can estimate Instinct shipments from its Q1 FY2026 earnings. With the standard manufacturing lag, those Q1 2026 shipments already capture much of AMD’s Q4 2025 logic, HBM, and CoWoS consumption. This means AMD’s Dec. 2025 WIP balance is largely an intermediate pipeline snapshot of chips that later flowed into Q1 2026 shipments. Modeling that WIP separately would risk counting the same production twice.
However, shipment-based volumes may still undercount AMD’s component consumption if AMD manufactured Instinct GPUs that consumed semiconductor inputs but remained unsold after the Q1 FY2026 shipment window. This risk is higher for AMD than for NVIDIA. AMD’s Instinct sales were more uneven, with weaker MI325X demand than expected, MI308X export-control disruption, and even a sequential decline in Data Center AI revenue from Q4 FY2025 to Q1 FY2026. These factors make AMD’s sell-through assumption weaker: completed Instinct GPUs may have accumulated in inventory rather than flowing cleanly into near-term shipments.
AMD’s inventory disclosures support this treatment. AMD said its 2025 inventory increase was primarily to support the continued ramp of Data Center products in advanced process technology nodes. The composition of the inventory build is also informative: from FY2024 year-end to FY2025 year-end, raw materials increased from roughly $350M to $909M (+159%), WIP increased from roughly $4.3B to $4.8B (+11%), and finished goods increased from roughly $1.1B to $2.2B (+105%). This suggests that AMD’s production pipeline remained large, but the incremental inventory build was concentrated in inputs procured ahead of future production and completed products sitting on the balance sheet, rather than a proportional increase in unfinished WIP.
Finished goods are therefore different from WIP. AMD’s Mar. 2026 finished-goods balance represents completed products that still had not shipped after the Q1 FY2026 shipment window. These GPUs consumed logic, HBM, and CoWoS capacity, but are not captured by the Q1 FY2026 shipment path. We therefore exclude AMD WIP, but include a separate finished-goods adjustment for completed Instinct GPUs that remained unsold at the end of Q1 FY2026.
AMD raw materials also remain relevant because HBM procurement is a separate event from packaging-stage HBM attachment. Raw-material HBM may have been purchased before a chip entered packaging, so it is not fully captured by the shipment-based lag adjustment.
AMD’s Mar. 2026 inventory balances are:
| Inventory Category | AMD Mar ‘26 |
|---|---|
| Raw materials | $752M |
| Work-in-process | $4.7B |
| Finished goods | $2.5B |
| Total | $8.0B |
For AMD, we estimate that Instinct GPUs generated approximately $7.2B of revenue in 2025, compared with AMD’s total revenue of $34.6B, or roughly 21% of total revenue. However, AMD attributed its 2025 inventory increase to advanced-process data center products, a category that includes Instinct GPUs, EPYC server CPUs, and related data center products. Since the inventory buildup was concentrated in advanced data center products rather than spread evenly across AMD’s full business, we model Instinct GPUs as a somewhat higher share of inventory than their share of total revenue. We use a 15%–70% range, with a median of 40%, to reflect both the Instinct ramp through 2025 and the fact that AMD’s data center inventory also includes a sizable non-GPU component.
We then convert Instinct-attributable inventory balances into implied chip units using the same BOM approach used for NVIDIA. GAAP inventory is recorded at manufacturing cost, so we divide inventory value by a per-chip BOM cost, built bottom-up from logic die fabrication, HBM, CoWoS packaging, packaging yield losses, and auxiliary components.
For AMD raw materials, we do not apply an additional manufacturing lag. The relevant event is the purchase or receipt of HBM, not the completion of a chip. We therefore phase AMD raw-material HBM procurement directly across the quarters in which we estimate the HBM was procured. We use AMD’s Mar 28, 2026 raw-materials balance ($752M) and spread it as 30% to Q4 2025 and 70% to Q1 2026.
We use AMD’s Mar. 28, 2026 finished-goods balance of $2.5B, multiplied by the Instinct share of inventory: 15%–70%, with a median of 40%. We treat this as cost-basis inventory, not revenue, and convert it into component consumption using chip-specific BOM assumptions rather than ASPs. Since finished goods are completed products, the adjustment is modeled as full-BOM content: logic, HBM, CoWoS packaging, and auxiliary components.
Rather than assigning this balance to the quarter in which it appears on the balance sheet, we attribute it to the quarters in which the underlying chips were completed and added to finished-goods inventory, then apply the standard manufacturing lag to back-shift component consumption. Allocation across SKUs is as before: MI325X 20%, MI350X 30%, and MI355X 50% at the median.
| Quarter completed and added to FG | p5 | p50 | p95 |
|---|---|---|---|
| Q3 FY2025 | 5% | 20% | 30% |
| Q4 FY2025 | 20% | 35% | 50% |
| Q1 FY2026 | 30% | 45% | 70% |
Each Monte Carlo sample’s three shares are normalized to sum to 100%. We then apply the standard manufacturing lag to each completion quarter: logic die fabrication uses the 10-week lag, so roughly 77% of consumption shifts one quarter earlier and 23% remains in the completion quarter, while CoWoS/HBM integration follows the 8-week lag at two-thirds prior-quarter / one-third same-quarter. For example, FG units completed in Q1 2026 contribute their logic consumption mostly to Q4 2025 and split their CoWoS/HBM consumption two-thirds to Q4 2025 and one-third to Q1 2026.
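A sketch of the sampling and normalization step, treating the (p5, p50, p95) values as the (left, mode, right) of triangular draws for simplicity (an approximation of our model’s actual sampling scheme):

```python
import numpy as np

# Sketch: sample completion-quarter shares, normalize them, and record the
# lag splits applied afterward.
rng = np.random.default_rng(0)

def sample_fg_shares() -> np.ndarray:
    draws = np.array([
        rng.triangular(0.05, 0.20, 0.30),  # completed in Q3 FY2025
        rng.triangular(0.20, 0.35, 0.50),  # completed in Q4 FY2025
        rng.triangular(0.30, 0.45, 0.70),  # completed in Q1 FY2026
    ])
    return draws / draws.sum()  # normalize each sample to sum to 100%

shares = sample_fg_shares()
LOGIC_SPLIT = (10 / 13, 3 / 13)   # prior-quarter, same-quarter
COWOS_HBM_SPLIT = (2 / 3, 1 / 3)
```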
This adjustment is an inferred inventory signal, not a disclosed product-level inventory figure. AMD inventory is company-wide, finished goods can be affected by shipment timing and customer acceptance, and export controls can distort the relationship between production, inventory, and revenue. We therefore use broad uncertainty ranges and interpret the adjustment as a way to capture plausible completed-but-unsold Instinct production, rather than as a precise estimate of AMD’s unsold GPU units.
For Google and Amazon, we take a different approach. Unlike NVIDIA and AMD, which hold finished chip inventory, Google and Amazon rely on external backend partners for chip manufacturing and packaging, so finished chips that haven’t been shipped yet do not appear on their balance sheets. Given that the inventory approach is infeasible for Google and Amazon, we extrapolate Q1 2026 volumes from their 2025 production trajectories.
For Amazon, we project forward from the 2025 Trainium2 trajectory. Volumes grew at roughly 17% quarter-on-quarter through 2025: 210k, 250k, 290k, and 335k units. A log-linear regression, a geometric-mean projection, and a cross-check against AWS’s announced FY2026 capex guidance of $200B all converge on approximately 390,000 units for Q1 2026. For simplicity, we model Amazon’s Q1 2026 ASIC production as Trainium2, though some Trainium3 production is already underway. We apply ±30% uncertainty bands to account for ramp-timing uncertainty and ambiguity around the Trainium2/Trainium3 generation mix.
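The geometric-mean projection is straightforward to reproduce:

```python
# Sketch: project Q1 2026 Trainium2 volumes by geometric-mean growth from the
# 2025 quarterly trajectory above.
volumes_2025 = [210_000, 250_000, 290_000, 335_000]   # Q1..Q4 2025 units
g = (volumes_2025[-1] / volumes_2025[0]) ** (1 / 3)    # ~1.17 mean QoQ growth
q1_2026 = volumes_2025[-1] * g
print(round(q1_2026, -3))  # ~391,000 units, near the ~390k central estimate
```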
For Google, we estimate Q1 2026 TPU volumes using Broadcom’s fiscal Q2 2026 AI semiconductor guidance. Broadcom guided to $10.7B of AI semiconductor revenue for fiscal Q2 2026. We first separate AI networking from custom ASIC revenue. On Broadcom’s fiscal Q1 2026 earnings call, management stated that AI networking represented roughly one-third of AI revenue in fiscal Q1 and was expected to rise to about 40% in fiscal Q2. We therefore estimate fiscal Q2 AI networking revenue at $4.28B and the remaining custom ASIC / XPU revenue pool at $6.42B.
We assume Google TPUs account for 85% of Broadcom’s custom ASIC / XPU revenue in fiscal Q2 2026. This implies Google TPU-related revenue of approximately $5.46B for Broadcom in fiscal Q2 2026.
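The arithmetic behind this estimate, with the networking-share and TPU-share figures above as the assumptions:

```python
# Sketch: back out Google TPU-related revenue from Broadcom's fiscal Q2 2026
# AI guidance, using the networking-share and TPU-share assumptions above.
ai_revenue = 10.7e9            # Broadcom fiscal Q2 2026 AI semiconductor guidance
networking_share = 0.40        # AI networking share of AI revenue
tpu_share_of_asic = 0.85       # assumed Google share of custom ASIC / XPU pool

asic_revenue = ai_revenue * (1 - networking_share)  # ~$6.42B
tpu_revenue = asic_revenue * tpu_share_of_asic
print(round(tpu_revenue / 1e9, 2))  # ~5.46 ($B)
```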
We added this fiscal Q2 2026 TPU revenue row to the input sheet used by our TPU volumes script. The script then converts the fiscal-quarter revenue estimate into calendar-quarter TPU volumes using the same revenue-to-unit conversion logic used for prior TPU quarters. This gives us a calendar Q1 2026 TPU volume estimate that can be fed into the component-consumption model and lagged backward into the quarters in which logic wafers, HBM, and CoWoS packaging capacity were consumed. The resulting Q1 2026 TPU unit estimates are:
| TPU Chip Type | Estimated Units in Q1 26 | 90% CI |
|---|---|---|
| TPU v7 | 310k | 232k – 399k |
| TPU v6e | 131k | 29k – 229k |
| Total | 441k | 261k – 628k |
Changes to export control restrictions resulted in chips that were manufactured but could not be sold. These chips consumed logic wafer, HBM, and advanced packaging capacity during production, but would not appear in revenue if they were blocked from sale by export restrictions. We therefore add the write-down volumes back into demand and attribute consumption to the quarters in which we estimate the chips were manufactured.
NVIDIA disclosed a $1.9B inventory write-down on H20 GPUs following US export restrictions in April 2025. Converting at H20 bill-of-materials cost yields approximately 600,000–700,000 units. This aligns with independently reported estimates of 600,000–700,000 H20 GPUs caught by export controls, providing a point of validation for our BOM model.
We assume the affected H20s were finished and added to inventory primarily in the months immediately before the restrictions took effect. We weight the split toward Q1 2025 because H20 sales volumes remained high in both Q4 2024 and Q1 2025, at roughly 350,000 units per quarter, implying that much of the H20 output from earlier quarters was likely sold rather than held in inventory. The April 2025 write-down, therefore, was likely from a more recent inventory build-up, with most of the affected units produced in Q1 2025 shortly before the new restrictions took effect.
The table below shows the median assumed timing of when the affected H20 units were completed and added to inventory.
| Month Completed | Share of H20 Writeoff Units |
|---|---|
| December 2024 | 9% |
| January 2025 | 15% |
| February 2025 | 23% |
| March 2025 | 31% |
| April 2025 | 22% |
| Total | 100% |
While the units were mostly completed in Q1 and early Q2 2025, the component inputs for these chips were consumed before the finished GPUs entered inventory. Applying our manufacturing lags, H20s completed in February, March, and April 2025 consumed logic wafer capacity in roughly December 2024, January 2025, and February 2025, respectively; their HBM and CoWoS capacity was consumed in roughly the same months, given the slightly shorter eight-week lag.
As a result, even though the affected H20s were finished mostly in Q1 2025 and early Q2 2025, a meaningful share of their semiconductor inputs was consumed in Q4 2024.
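A sketch of the monthly back-shift, rounding the lags to whole months for simplicity:

```python
# Sketch: back-shift monthly H20 completion shares to component-consumption
# months, rounding the 10-week logic lag to two months for simplicity.
completion_shares = {  # month completed -> share of write-off units
    "2024-12": 0.09, "2025-01": 0.15, "2025-02": 0.23,
    "2025-03": 0.31, "2025-04": 0.22,
}

def shift_month(ym: str, months_back: int) -> str:
    """Return the year-month string months_back months earlier."""
    y, m = map(int, ym.split("-"))
    m -= months_back
    while m < 1:
        m, y = m + 12, y - 1
    return f"{y}-{m:02d}"

# e.g. units completed in 2025-03 consumed logic wafers around 2025-01
logic_consumption = {shift_month(m, 2): s for m, s in completion_shares.items()}
```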
AMD recorded a gross $800M write-down on MI308X inventory in Q2 2025, partially reversed by $360M in Q4 2025 when export licenses were granted and those units were sold. The net unresolved provision is $440M. Since the sold chips and the written-down chips came from the same production batch, we treat both tranches identically for timing purposes: converting the full batch at MI308X bill-of-materials cost and applying the same Q4 2024 / Q1 2025 production split used for H20. The Q4 2025 MI308X shipments that appear in the Chip Sales Hub are excluded from the standard shipped-units calculation to avoid double-counting.
Reuters reported that NVIDIA had around 700,000 H200 units in inventory at the end of 2025. These chips consumed supply chain capacity during production but had not yet generated revenue by year-end.
We attribute these units to the quarters in which they were produced rather than treating them as Q4 2025 inventory. The table below shows the median assumed timing of when the stranded H200 units were completed and added to inventory.
| Quarter Completed | Share of H200 Inventory Units |
|---|---|
| 2025 Q1 | 15% |
| 2025 Q2 | 22% |
| 2025 Q3 | 45% |
| 2025 Q4 | 18% |
| Total | 100% |
This spread is based on two assumptions and carries significant uncertainty. First, NVIDIA is unlikely to hold large volumes of finished GPUs in inventory across many quarters, especially for products that still have strong demand. H200s made up the majority of units sold in Q4 2024, with roughly 680,000 units sold, suggesting that the H200 production line was operating at scale entering 2025. H200 sales also remained meaningful in Q1 2025, at roughly 330,000 units. We therefore assume most of the stranded H200 inventory came from relatively recent production rather than from chips built early in 2025 and held for the rest of the year.
Second, we still taper H200 production by late 2025 because Blackwell was moving into full production and sales. As Blackwell absorbed more of NVIDIA’s HBM, CoWoS, and advanced-node logic capacity, H200 production likely declined. This is why we do not place most stranded H200 production in Q4, even though the inventory was reported at year-end. Instead, we model the largest share in Q3: recent enough to plausibly remain in inventory at year-end, but early enough that H200 lines and supply commitments may still have been meaningfully active before the Blackwell ramp fully displaced Hopper-generation production.
In the finished goods accounting, we value the 700,000 stranded H200 units at H200 bill-of-materials cost ($3,480) and deduct that amount ($2.45B) from NVIDIA’s total finished goods balance before allocating residual dollars to Blackwell chips. This ensures H200 and Blackwell production are attributed to their respective consumption quarters without double-counting.
Once we have quarterly unit volumes for each chip, including shipment volumes, inventory adjustments, and export-control write-down adjustments, we translate those units into consumption of each supply chain component using per-chip bill-of-materials specifications and yield estimates.
Logic wafer consumption is driven by dies per wafer (DPW), die yield, and packaging yield:
Logic wafers = Units ÷ (DPW × logic_die_yield × packaging_yield)
CoWoS packaging is performed on 12-inch wafers at TSMC’s advanced packaging facilities. Multiple packages are produced per wafer (PPW), and not all survive the packaging process.
CoWoS wafers = Units ÷ (PPW × packaging_yield)
HBM demand is expressed in USD to match the market-level supply denominator.
HBM ($) = (Units × HBM_per_chip_GB × price_per_GB) ÷ packaging_yield
All bill-of-materials parameters, including die sizes, dies per wafer, logic yields, packages per wafer, and packaging yields, have P5, median, and P95 uncertainty ranges. We sample across these ranges in the Monte Carlo analysis.
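A minimal sketch of this conversion with sampled parameters. All parameter values below are illustrative placeholders, not our published BOM estimates:

```python
import numpy as np

# Sketch: convert unit volumes into component consumption using the formulas
# above, sampling BOM parameters over uncertainty ranges. Every parameter
# value here is an illustrative placeholder, not our published estimate.
rng = np.random.default_rng(42)
N = 10_000  # Monte Carlo samples

units = 500_000                                    # chips produced in a quarter
dpw = rng.triangular(25, 30, 35, N)                # dies per logic wafer
logic_yield = rng.triangular(0.70, 0.80, 0.90, N)  # logic die yield
ppw = rng.triangular(14, 16, 18, N)                # packages per CoWoS wafer
pkg_yield = rng.triangular(0.85, 0.90, 0.95, N)    # packaging yield
hbm_gb = 192                                       # HBM per chip (GB)
price_per_gb = rng.triangular(10, 12, 14, N)       # USD per GB of HBM

logic_wafers = units / (dpw * logic_yield * pkg_yield)
cowos_wafers = units / (ppw * pkg_yield)
hbm_usd = units * hbm_gb * price_per_gb / pkg_yield

print(np.percentile(logic_wafers, [5, 50, 95]).round())  # P5 / median / P95
```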
We also translate physical component demand into dollar values to estimate quarterly semiconductor and component spend by AI chip designer. The component cost chart breaks total spend into four categories: logic, CoWoS, HBM, and auxiliary components.
Logic cost is calculated as logic wafers consumed multiplied by the relevant node-specific wafer price. Node prices are drawn from the parameters sheet. At the median, we use approximately $19,000 per wafer for 3 nm and $17,500 per wafer for 5 nm.
CoWoS cost is calculated as CoWoS wafers consumed multiplied by the relevant packaging wafer price. CoWoS-S is priced at $12,000 per wafer and CoWoS-L at $17,000 per wafer.
HBM demand is already modeled in dollar terms, so HBM cost is taken directly from the model.
Auxiliary cost captures per-unit components not included in logic, CoWoS, or HBM, such as the PCB and power-delivery components. We apply a fixed auxiliary cost per chip ranging from approximately $250 per unit for TPU v5e to approximately $800 per unit for MI355X.
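Putting the four categories together for a single chip type and quarter (a sketch with the median prices quoted above; the auxiliary cost is a placeholder, not a specific chip’s value):

```python
# Sketch: combine the four cost categories for one chip type in one quarter.
# Prices are the medians quoted above; the auxiliary cost is a placeholder.
def total_component_spend(logic_wafers: float, cowos_wafers: float,
                          hbm_usd: float, units: float) -> float:
    LOGIC_PRICE = 19_000   # USD per 3 nm wafer (median)
    COWOS_PRICE = 12_000   # USD per CoWoS-S wafer
    AUX_PER_UNIT = 500     # USD per chip, illustrative
    return (logic_wafers * LOGIC_PRICE + cowos_wafers * COWOS_PRICE
            + hbm_usd + units * AUX_PER_UNIT)
```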
All four cost components are assigned to the quarter in which production consumes the relevant supply-chain inputs. They include demand from shipped units, WIP and finished-goods inventory builds and drawdowns, export-control write-downs, and HBM raw-material inventory changes. The chart covers Q1 2024 through Q4 2025.
In 2025, the top 4 US AI accelerator designers consumed an estimated:
AI now dominates the advanced packaging and HBM markets, but is still a single-digit-percent consumer of leading-edge logic capacity. Apple alone likely consumes more N3-family wafers than all four AI accelerator designers combined. The supply-chain bottleneck for AI accelerators in 2024 and 2025 was likely packaging and memory, which aligns with repeated statements from TSMC’s CEO that CoWoS was “sold out through 2025 and into 2026” while leading-edge logic remained more elastic.
NVIDIA dominates and consolidated share materially through 2025:
| Component | NVIDIA 2024 Share | NVIDIA 2025 Share | Change |
|---|---|---|---|
| Logic | 5.4% | 7.4% | +2.0 pp |
| CoWoS Packaging | 53.4% | 58.5% | +5.1 pp |
| HBM | 45.6% | 69.3% | +23.7 pp |
NVIDIA’s HBM share jump is the most significant of the components: a 24 percentage-point gain YoY. NVIDIA’s HBM consumption grew far faster than the global HBM market itself. While the market roughly doubled from $18B to $35B, NVIDIA’s HBM consumption nearly tripled from ~$8.4B to ~$24.3B.
Two effects drove this outsized growth: (1) NVIDIA’s average HBM-per-chip rose sharply with Blackwell. The B300 has 288 GB of HBM3e per chip versus 80 GB on H100, so as the mix shifted toward B300 in 2H 2025, NVIDIA’s per-unit HBM consumption ballooned. And (2) US HBM export restrictions took effect in December 2024, sharply reducing China’s HBM consumption and redistributing a meaningful share of the supply pool toward US AI accelerator designers.
The bottlenecks to AI chip manufacturing can shift from year to year as production volumes grow, chip specifications change, and the supply of each key component expands at different rates.
In late 2024 and early 2025, CoWoS was acutely supply-constrained, then eased as TSMC ramped capacity. We estimate that our four tracked AI designers consumed 98% of the total CoWoS supply in Q4 2024 and 99% in Q1 2025. From Q2 2025 onward, TSMC expanded CoWoS capacity rapidly, and AI consumption eased to 83% in Q2 2025, 86% in Q3, and 84% in Q4. This is consistent with TSMC’s CoWoS WPM roughly doubling from end-2024 to end-2025 (from ~39k to ~66k WPM) and it also matches industry reporting that CoWoS-S in particular saw inventory build in mid-2025 even as CoWoS-L stayed tight for Blackwell.
HBM moved in the opposite direction. While supply was already constrained in 2024, it became extremely tight in H2 2025. Our four tracked AI designers consumed roughly 64–68% of global HBM throughout 2024, which was meaningful but not yet fully binding. By 2025, with Blackwell ramping and carrying far more HBM per chip than Hopper, and US HBM export restrictions sharply curtailing China’s HBM consumption, the four designers’ share climbed steeply to 92% in Q3 2025 and 97% in Q4. By the second half of the year, AI accelerator designers were essentially absorbing all incremental HBM the memory makers could produce, with SK Hynix and Micron sold out for the year.
NVIDIA captured capacity in a tight HBM market. Its individual HBM consumption nearly tripled while the global HBM market only roughly doubled, pulling NVIDIA’s share from 46% to 69%. This is consistent with NVIDIA’s own commentary that it was adding to inventory and “ordering to secure long lead-time components” to meet Blackwell demand and support future architecture ramps.
Each period in the demand tables includes an “Other” row that captures global component supply not assigned to the four tracked designers. This residual can include several things: component capacity consumed by chip designers we do not track, capacity that was available but not utilized in that period, upstream inventory held by partners we do not directly model, and error in our component-consumption estimates.
The “Other” row ensures that stacked bar charts sum to 100% of global supply. It should be interpreted as an unattributed residual, not as a clean estimate of unused capacity. A positive residual may represent real demand from untracked designers, unused capacity, upstream inventory, or underestimation of tracked-designer demand. A zero residual does not necessarily mean the four tracked designers consumed all available supply; it can also mean that uncertainty or timing assumptions pushed estimated demand above the median supply denominator.
While the residuals should not be treated as precise point estimates, they can still be informative. In some cases, changes in the residual line up with known market dynamics and provide a useful signal about demand outside the four tracked designers. For example, in Q4 2024, the HBM “Other” residual rises to 36% of our estimated global HBM supply, or about $2.65B out of a $7.5B quarterly market. More than one-third of global HBM supply that quarter was not accounted for by the four major US chip designers. That would be surprising if the residual were just model noise. But it lines up with reports that Chinese companies aggressively stockpiled HBM before US HBM export controls took effect at the end of December 2024.
SemiAnalysis estimated that Samsung exported around 7 million HBM stacks to Chinese companies in December 2024 alone. If those were mostly 16GB HBM2e stacks, then at $12–$15 per GB this would imply roughly $1.3B–$1.7B of HBM exports to China in December alone. We assume 16GB rather than 24GB stacks because the Huawei Ascend 910B has 64GB of HBM2e across four stacks. This December stockpiling accounts for a meaningful share of the Q4 2024 “Other” residual and helps explain why the residual increased sharply from Q3 to Q4.
By Q1 2025, after export controls began to bite, the HBM residual fell significantly to 15% of estimated quarterly supply, or about $940M. This decline is consistent with China’s HBM consumption falling after the controls took effect.
We therefore interpret “Other” as a meaningful but imperfect signal. At the annual level, the residuals appear reasonably robust. For full-year 2025, the probability that tracked demand exceeds total supply is approximately 0% for CoWoS and approximately 5% for HBM. This suggests the annual residuals are likely real, though their composition remains uncertain. For CoWoS, the residual likely reflects a mix of unattributed AI demand and some unutilized capacity during the capacity ramp. For HBM, the residual likely reflects a mix of untracked demand, upstream inventory, and remaining model error.
Quarter-level residuals require more caution because these estimates are more sensitive to timing assumptions. CoWoS in Q1 2025 is a fragile quarter: tracked demand and median supply are close, so a meaningful share of Monte Carlo draws imply tracked demand above supply. HBM in Q4 2025 is also sensitive because inventory assumptions concentrate a large amount of year-end WIP and finished-goods consumption into that quarter. In these cases, the residual should be read as a directional signal rather than a precise estimate of spare capacity or untracked demand.
We estimate total AI chip component spend at:
| Year | Logic | CoWoS | HBM | Auxiliary | Total |
|---|---|---|---|---|---|
| 2024 | $3.1B | $4.0B | $12.1B | $2.8B | $22.0B |
| 2025 | $7.0B | $8.3B | $31.5B | $5.3B | $52.1B |
HBM stands out as the highest-cost component, accounting for 60% of all AI chip component spend in 2025. Total component spend grew from $3.2B in Q1 2024 to $17.4B in Q4 2025, a 5.4× increase over two years. NVIDIA alone accounts for $13.4B (77%) of Q4 2025 component spend.
We compare our estimates against independently published analyst and industry estimates to validate the methodology’s core assumptions. Strong convergence across independently derived estimates increases confidence in the results.
Our central estimate of approximately 377,000 CoWoS wafers for NVIDIA in 2025 (~59% of global supply) aligns with independent published estimates:
JP Morgan also estimates NVIDIA’s 2024 CoWoS consumption at 192,000 wafers, close to our estimate of 187,000.
Our AMD estimate of approximately 55,000 CoWoS wafers for 2025 aligns closely with JP Morgan’s estimate of 52,000 wafers. SemiWiki’s estimate of AMD at 8% of global CoWoS capacity demand matches our result of 8.6%. JP Morgan also estimated AMD at 45,000 CoWoS wafers in 2024, which is close to our estimate of 49,000 wafers.
Our Google TPU share estimate of approximately 13% of global 2025 CoWoS demand is consistent with SemiWiki’s estimate of 13%. Global Semi Research estimates Google’s 2025 CoWoS demand at approximately 80,000 wafers, essentially identical to our estimate of 81,000 wafers. JP Morgan’s estimates are higher: 96,000 Broadcom CoWoS wafers in 2025 and 70,000 in 2024, compared with our estimates of 81,000 in 2025 and 51,000 in 2024. Since their methodology is not public, it is difficult to say which modeling assumptions explain the discrepancy. JP Morgan also buckets this demand under Broadcom, which could include some non-TPU ASIC projects, such as Meta’s custom ASICs.
Our 2025 HBM share estimates are broadly consistent with external figures. TrendForce estimates NVIDIA’s share of global HBM value consumed in 2025 at 68%, AMD’s at 10%, Google’s at 9%, and Amazon’s at 8%. Our median estimates are NVIDIA at 69%, AMD at 7%, Google at 8%, and Amazon at 6%.
Companies report revenue at the segment level, not the accelerator level. Allocating revenue to individual chip generations requires assumptions about product mix that can introduce meaningful error, particularly during transition quarters when multiple product generations are sold simultaneously.
Manufacturing lag estimates are derived from public industry sources and represent typical durations. If actual lags differ — for example, if TSMC prioritized AI chip fabrication turnaround — the lag-adjusted supply window would shift, changing the denominator and calculated shares. The assumption that supply is evenly distributed within each calendar year is a simplification that introduces modest error in periods of rapid capacity change.
Companies do not disclose the composition of raw materials, WIP, or finished goods at the accelerator level. Our assumptions about what sits in each inventory bucket — HBM in raw materials, fabricated dies in WIP, completed packages in finished goods — are working models, not directly observed facts. If the true accounting differs, our estimates of consumption from inventory could be off.
The two-stage WIP model is an assumption about the average stage of chips in the WIP pipeline; the true distribution is unobservable from public filings.
We do not model inventory for Google and Amazon. This leads to a likely understatement of their supply chain consumption, particularly for HBM, where raw-material inventory ahead of CoWoS packaging could be substantial, given the TPU and Trainium ramps.
The parameter isolating GPU-related inventory from total company inventory is a broad estimate. The revenue-share proxy may not accurately reflect the inventory composition on any given balance sheet date.
Per-chip BOM specifications, including die sizes, dies per wafer, logic yield rates, packages per CoWoS wafer, packaging yields, and HBM capacities, are estimated from public chip specifications, analyst reports, and media sources rather than disclosed by chip designers or TSMC. These specifications directly affect the conversion of unit volumes into wafer and HBM consumption; errors propagate into all share estimates.