The constraint on AI compute moved from the chip to the power plant

Key takeaways

The real GPU chokepoint was never the logic die (about $300 of an H100's ~$3,320 bill of materials) but the high-bandwidth memory and TSMC's CoWoS packaging that assembles it, both largely sold out into 2026.
Power, not silicon, is now the binding constraint: a single large training site draws 100 to 1,000 megawatts, and the substations and grid interconnection that feed it run on multi-year timelines a chip order cannot expedite.
Global data-centre electricity use is on track to more than double, from about 415 TWh in 2024 to roughly 945 TWh by 2030 in the IEA base case.
Hyperscalers have shifted from buying renewable credits to locking up firm, around-the-clock generation, with 13 nuclear deals committing more than 9.7 GW by May 2026, though the IEA still expects gas and coal to meet about 40% of the new demand through 2030.
The cost lands on ratepayers: PJM's market monitor tied billions in higher capacity payments to data-centre growth, and households near major clusters in Virginia, Texas, and Georgia are already seeing 8 to 15% rate increases.

For two years the AI compute story has been told as a chip story: who has the most GPUs, and who is stuck in Nvidia’s queue. That framing is a year out of date. The binding constraint on frontier AI has moved off the silicon and onto the power line, and that shift changes who actually controls the frontier.

The most contested object in AI is no longer something you can hold in your hand. It is a transformer at a substation, and the contract that says whose load it serves first.

The chip was never quite the chip

Start with the thing everyone watched. An Nvidia H100, the workhorse of the 2024 buildout, sells for roughly $25,000, but its estimated bill of materials runs around $3,320: about $300 for the logic die, around $1,350 for the high-bandwidth memory, and roughly $750 for the packaging that bonds the two together. The expensive, scarce parts of a “GPU shortage” were never the processor. They were the memory and the step that assembles it.
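To see how lopsided that breakdown is, here is a minimal sketch that turns the quoted figures into shares of the bill of materials. The dollar amounts come from the paragraph above; the "other (not itemized)" bucket is only the arithmetic remainder, an assumption about what the unlisted $920 represents rather than a sourced line item.

```python
# Estimated H100 figures quoted in the article (USD).
SELL_PRICE = 25_000
BOM_TOTAL = 3_320
BOM_PARTS = {"logic die": 300, "HBM": 1_350, "packaging": 750}


def bom_shares(parts: dict, total: float) -> dict:
    """Share of the total BOM per component, plus the unitemized remainder."""
    shares = {name: cost / total for name, cost in parts.items()}
    # Assumption: whatever the article does not itemize is lumped together here.
    shares["other (not itemized)"] = (total - sum(parts.values())) / total
    return shares


shares = bom_shares(BOM_PARTS, BOM_TOTAL)
memory_and_packaging = shares["HBM"] + shares["packaging"]

for name, share in shares.items():
    print(f"{name:>22}: {share:6.1%}")
print(f"HBM + packaging share of BOM: {memory_and_packaging:.1%}")
print(f"BOM as share of selling price: {BOM_TOTAL / SELL_PRICE:.1%}")
```

On these estimates the logic die is roughly 9 percent of the bill of materials, while memory and packaging together are roughly 63 percent.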

That step is CoWoS, the advanced packaging process at TSMC that stacks memory next to the compute die. It is the tightest link in the chain. TSMC’s chief executive C.C. Wei described its capacity as sold out through 2025 and into 2026, and Nvidia has booked more than half of the available 2026 to 2027 allocation for its own parts. Capacity is climbing, from roughly 75,000 wafers a month in 2025 toward 135,000 by 2027, and demand still outruns it.

The memory tells the same story. SK Hynix supplies most of the high-bandwidth memory Nvidia ships, Micron has signaled it can meet only 55 to 60 percent of demand, and HBM3e contract prices rose about 20 percent for 2026. A price increase in memory is itself the tell. This is an industry built on falling cost per bit, raising prices because it cannot make enough.

So when people say Nvidia controls AI compute, the claim is half right. Nvidia holds roughly 80 to 90 percent of the accelerator market and more than 90 percent of training, and CUDA keeps customers from leaving. But the chokepoint inside that dominance sits one layer down, at a packaging line in Taiwan and a memory fab in Korea. Control the die and you still wait on the things that surround it.

What is actually scarce is power

You can argue about packaging and memory all day and still miss the larger constraint: the GPU only matters once you can power it.

The numbers are not subtle. The IEA reports that global data center electricity demand grew 17 percent in 2025, and demand from AI-focused sites surged about 50 percent in a single year. Global data center consumption sat near 415 terawatt-hours in 2024, about 1.5 percent of world electricity. The IEA’s base case has it roughly doubling to about 945 TWh by 2030. In the United States, utility power delivered to data centers is forecast to rise from about 62 gigawatts in 2025 to roughly 76 by 2026.
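Those figures imply a compound growth rate that is easy to check. The sketch below uses only the numbers quoted above; the helper name `cagr` is ours, and the implied world total is a back-of-envelope inference from the rounded 1.5 percent share, not an IEA figure.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the article.
TWH_2024, TWH_2030 = 415, 945
SHARE_OF_WORLD_2024 = 0.015        # "about 1.5 percent of world electricity"
US_GW_2025, US_GW_2026 = 62, 76

growth_multiple = TWH_2030 / TWH_2024
annual_rate = cagr(TWH_2024, TWH_2030, 2030 - 2024)
implied_world_twh = TWH_2024 / SHARE_OF_WORLD_2024   # rough, from a rounded share
us_one_year_growth = US_GW_2026 / US_GW_2025 - 1

print(f"2024 to 2030 multiple: {growth_multiple:.2f}x")
print(f"Implied compound annual growth: {annual_rate:.1%}")
print(f"Implied world electricity use in 2024: {implied_world_twh:,.0f} TWh")
print(f"US data center power, 2025 to 2026: {us_one_year_growth:+.1%}")
```

The base case works out to growth of a little under 15 percent a year, sustained for six years.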

Global data-centre electricity use is set to more than double by 2030. Worldwide data-centre electricity consumption: 2024 actual vs the IEA base-case projection for 2030. Sources: Brookings (2024 baseline); IEA, Energy and AI (2030 base-case projection).

Year	Global data-centre electricity use
2024	415 TWh
2030 (base case)	945 TWh

A single large training facility now draws between 100 and 1,000 megawatts, the load of a small city. And unlike a chip order, you cannot expedite a substation. Transmission upgrades and grid interconnection run on a clock measured in years, not the 36 to 52 week lead times that GPUs were running.
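To put a site's power draw on the same scale as the global terawatt-hour figures, here is a rough conversion from megawatts to annual energy. The `utilization` parameter is an assumption (1.0 means the site runs flat out all year, which overstates real consumption); the 415 TWh baseline is the 2024 figure quoted earlier.

```python
HOURS_PER_YEAR = 8_760          # 365 days; ignores leap years
GLOBAL_DC_TWH_2024 = 415        # 2024 global data center use quoted in the article


def annual_twh(site_mw: float, utilization: float = 1.0) -> float:
    """Annual energy in TWh for a site drawing `site_mw` megawatts.

    `utilization` is an assumed average load factor, not a sourced figure.
    """
    mwh = site_mw * HOURS_PER_YEAR * utilization
    return mwh / 1_000_000      # 1 TWh = 1,000,000 MWh


for mw in (100, 1_000):
    twh = annual_twh(mw)
    share = twh / GLOBAL_DC_TWH_2024
    print(f"{mw:>5} MW -> {twh:.3f} TWh/yr ({share:.2%} of 2024 global data center use)")
```

At full load, a single 1,000 megawatt site would use about 8.76 TWh a year, roughly 2 percent of everything the world's data centers consumed in 2024.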

The training-cost math makes the point in reverse. Epoch AI’s accounting of a frontier training run puts hardware at 47 to 67 percent of the cost and energy at only 2 to 6 percent. Energy looks cheap on that ledger. It is not the bill that constrains you. It is whether the grid can deliver the megawatts at all, on the timeline your training run needs them.

Who controls the power now owns the queue

Watch where the money went, and the new control layer is obvious. The hyperscalers stopped buying renewable energy credits and started buying physical, around-the-clock generation, much of it nuclear.

Amazon signed an $18 billion, 1.9 gigawatt agreement with Talen Energy for the Susquehanna plant in June 2025. Microsoft underwrote a 20-year deal to restart the 835 megawatt Three Mile Island Unit 1, targeted for 2027. Meta has contracted up to 6.6 gigawatts across legacy operators and advanced reactor developers. Google committed to 500 megawatts of Kairos Power reactors. By May 2026, every major hyperscaler had at least one nuclear deal, 13 in total, committing more than 9.7 gigawatts. Hyperscaler capital spending is forecast to clear $600 billion in 2026, roughly three-quarters of it tied to AI infrastructure, and Microsoft alone has become the world’s largest corporate clean-power buyer at about 40 gigawatts contracted.

This is the honest part of the ledger, the part the press releases skip. The clean-energy framing has an asterisk: the IEA expects about 40 percent of the additional data center demand through 2030 to be met by natural gas and coal. Nuclear is the headline. Gas is filling much of the gap underneath it.

The bill that lands on someone else

The grid does not bill the data center for the cost it imposes on everyone else. PJM, the operator covering 13 states and 65 million people, attributes about 7.9 gigawatts of added demand to data centers in its 2025 to 2026 delivery year and roughly 12 gigawatts the year after. Its market monitor found that stripping data centers out of the forecast would cut regional capacity payments by $9.33 billion, around 64 percent. That cost flows through to ratepayers. A March 2026 Consumer Reports analysis found households near major clusters in Virginia, Texas, and Georgia already seeing 8 to 15 percent rate increases.
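The market monitor's two numbers imply a total that the paragraph does not state. The sketch below backs it out; it assumes the 64 percent is measured against total capacity payments, and the per-person figure is purely illustrative, since capacity costs are not actually allocated evenly per head.

```python
# Figures quoted in the article.
DC_ATTRIBUTED_USD = 9.33e9      # capacity payments attributed to data centers
DC_SHARE = 0.64                 # "around 64 percent"
PJM_POPULATION = 65e6           # people in PJM's footprint

# Assumption: the 64 percent is a share of total capacity payments.
implied_total_usd = DC_ATTRIBUTED_USD / DC_SHARE
implied_without_dc_usd = implied_total_usd - DC_ATTRIBUTED_USD

# Illustrative only: real allocation varies by utility, zone, and customer class.
per_person_usd = DC_ATTRIBUTED_USD / PJM_POPULATION

print(f"Implied total capacity payments: ${implied_total_usd / 1e9:.2f}B")
print(f"Implied total without data centers: ${implied_without_dc_usd / 1e9:.2f}B")
print(f"Attributed cost spread evenly: ${per_person_usd:.0f} per person")
```

Read that way, the total is roughly $14.6 billion, of which a little over $5 billion would remain without data center load.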

Trace the line. A $1,350 memory stack goes into a GPU, the GPU goes into a 200 megawatt hall, the hall pulls on a regional grid, and a retiree in northern Virginia pays more for air conditioning. The compute story and the electricity-bill story are the same story, told at different altitudes.

What got cheaper, so you keep the ledger honest

The buildout is not all cost inflation. Per unit of intelligence, compute keeps getting dramatically cheaper. Epoch finds the cost to run a model at a fixed level of performance has been halving roughly every two months. Stanford’s AI Index put the drop at 280-fold over 18 months, to about $0.07 per million tokens at GPT-3.5 quality.
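The two estimates can be checked against each other with a line of logarithms. This is a minimal sketch using only the figures quoted above; the function names are ours, and it assumes a smooth exponential decline, which real price series only approximate.

```python
import math


def halving_time_months(fold_drop: float, months: float) -> float:
    """Halving time implied by a `fold_drop`-fold decline over `months`."""
    return months / math.log2(fold_drop)


def fold_drop_from_halving(months: float, halving_months: float) -> float:
    """Total fold decline after `months` at a fixed halving time."""
    return 2 ** (months / halving_months)


# Stanford AI Index figures quoted in the article.
FOLD_DROP, MONTHS, END_PRICE = 280, 18, 0.07   # USD per million tokens

implied_halving = halving_time_months(FOLD_DROP, MONTHS)
implied_start_price = END_PRICE * FOLD_DROP
fold_at_two_month_halving = fold_drop_from_halving(MONTHS, 2)

print(f"Halving time implied by 280x over 18 months: {implied_halving:.2f} months")
print(f"Implied starting price: ${implied_start_price:.2f} per million tokens")
print(f"Fold drop if cost halves every 2 months for 18 months: {fold_at_two_month_halving:.0f}x")
```

A 280-fold drop over 18 months implies a halving time of about 2.2 months, so the Stanford figure and Epoch's "roughly every two months" describe the same slope.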

So both things are true at once. The price of a given answer is collapsing, and total electricity demand is exploding, because cheaper inference invites far more of it. That is the oldest pattern in energy economics wearing new clothes. Efficiency does not reduce consumption when it creates demand faster than it saves power.
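Economists usually call this the rebound effect, or the Jevons paradox. A toy model makes the condition explicit. The sketch below assumes constant-elasticity demand and that price per query falls in step with energy per query; both are simplifying assumptions, and the elasticity values are hypothetical, not estimates for AI inference.

```python
def total_energy_ratio(efficiency_gain: float, elasticity: float) -> float:
    """Ratio of new to old total energy use after an efficiency gain.

    Assumptions (illustrative, not from the article):
      - energy per query falls by a factor of `efficiency_gain`
      - price per query falls by the same factor
      - demand follows Q ~ price ** (-elasticity)
    Queries then rise by gain ** elasticity while energy per query falls
    by 1 / gain, so total energy scales by gain ** (elasticity - 1).
    """
    return efficiency_gain ** (elasticity - 1)


# Hypothetical elasticities, to show the three regimes.
for elasticity in (0.5, 1.0, 1.5):
    ratio = total_energy_ratio(2, elasticity)
    print(f"elasticity {elasticity}: total energy x{ratio:.2f} after a 2x efficiency gain")
```

In this model total energy falls only when elasticity is below 1. Above 1, every efficiency gain raises total consumption, which is the regime the paragraph describes.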

The scoreboard worth reading

Two things decide the next 18 months, and neither is a chip launch. The first is regulatory: Constellation’s complaint at FERC over PJM’s co-location rules will determine whether the next wave of plants can wire power straight to a campus or must route it through a contested grid. The second is physical: whether CoWoS packaging actually reaches that 135,000-wafer monthly run rate, because until it does, every other number is theoretical.

The scoreboard everyone reads still counts GPUs. The one that matters counts megawatts, and it is kept by utilities, regulators, and a handful of nuclear operators who were nearly bankrupt a decade ago. If you want to know who will be training frontier models in 2028, do not count their chips. Find out whose substation is already energized.

Edited by Aditya Marin Gasga

Frequently asked questions

Is there really an AI chip shortage in 2026?

The shortage is real but misnamed. The scarce parts are high-bandwidth memory and the CoWoS packaging that assembles it, both of which remain largely sold out into 2026 and beyond. Logic dies are not the binding constraint, which is why "chip shortage" understates the problem.

What is actually limiting AI data centers, chips or power?

Increasingly, power. Chips can be ordered with lead times measured in weeks to months, while the substations, transmission, and grid interconnection that energize a large facility run on multi-year timelines. A single large training site can draw 100 to 1,000 megawatts, and that capacity cannot be expedited the way a hardware order can.

Why are electricity bills rising near data centers?

Large data centers add demand that drives up regional capacity costs, and those costs are shared across all ratepayers. PJM's market monitor tied billions in higher capacity payments to data center growth, and a 2026 Consumer Reports analysis found households near major clusters in Virginia, Texas, and Georgia seeing rate increases of 8 to 15 percent.

Are AI companies really going nuclear?

Yes, and at scale. By May 2026 every major hyperscaler had signed at least one nuclear deal, together committing more than 9.7 gigawatts, including Amazon's 1.9 gigawatt Talen agreement and Microsoft's deal to restart Three Mile Island Unit 1. Nuclear supplies firm, around-the-clock power, though the IEA still expects gas and coal to meet a large share of near-term demand growth.

About Aditya Marin Gasga

Founding Editor

Aditya Marin Gasga is the founding editor of The Counter Brief and Head of Growth at Demand Nexus, its parent company, where he works on sourcing qualified pipeline across SDR, content, and paid channels. His background is in performance marketing and demand generation. He studied business administration at Northumbria University.