AWS Quietly Raises GPU Instance Prices 20% Effective July 1

AWS is raising EC2 Capacity Block prices for Nvidia GPU instances by approximately 20% effective July 1, the second hike of 2026. Here is the new rate book.

Published

June 30, 2026

Logan Pierce

Amazon Web Services is raising prices on EC2 Capacity Blocks for ML, the reservation product enterprises use to lock in Nvidia GPU capacity for large-scale training runs, by approximately 20% effective July 1, 2026. The change covers the P6-B300, P6-B200, P5, P5e, P5en, and P4de instance families, which together represent the Blackwell B300 and B200, the H100, the H200, and the prior-generation A100 silicon that anchor most frontier-model training on AWS. New per-accelerator rates include $14.04 per accelerator hour for P6-B300 and $12.355 for P6-B200.

What Just Changed in AWS Capacity Blocks for ML

The pricing change is narrow on purpose. AWS framed the move, as it did in January, as a quarterly supply-and-demand adjustment to one purchasing option, not a broad price hike. The reservation product itself lets customers lock accelerator capacity inside EC2 UltraClusters for windows from one day to several weeks, booked up to eight weeks ahead, paying a published reservation fee plus an operating-system fee. On-demand instances, savings plans, and other AWS prices are unchanged, so the customer most affected is the team that needs guaranteed multi-thousand-GPU capacity for a training run, not the developer spinning up a single H100.

Instance family	Region scope	Rate per accelerator hour (from July 1, 2026)
P6-B300 (Blackwell B300)	All Regions except GovCloud	$14.04
P6-B200 (Blackwell B200)	All Regions except GovCloud	$12.355
P5 (H100)	All US Regions	$5.191
P5 (H100)	All non-US Regions	$4.72
P5e (H200)	All Regions	$5.97
P5en (H200 enhanced)	All US Regions	$6.865
P5en (H200 enhanced)	All non-US Regions	$6.241
P4de (A100)	All US Regions	$2.214
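
The per-accelerator figures translate into instance and reservation costs with simple multiplication. The sketch below is illustrative only: it assumes 8 accelerators per instance (a count the article does not state), uses the US rate where the table splits by region, and excludes the operating-system fee. The function names and the example booking are hypothetical.

```python
# Rates per accelerator hour from the table above (US Regions where split).
RATES_USD = {
    "P6-B300": 14.04,
    "P6-B200": 12.355,
    "P5": 5.191,
    "P5e": 5.97,
    "P5en": 6.865,
    "P4de": 2.214,
}

# Assumption: 8 accelerators per instance; not stated in the article.
ACCELERATORS_PER_INSTANCE = 8


def instance_hourly_rate(family: str) -> float:
    """Reservation fee per instance hour, excluding the operating-system fee."""
    return RATES_USD[family] * ACCELERATORS_PER_INSTANCE


def reservation_cost(family: str, instances: int, days: int) -> float:
    """Total reservation fee for `instances` instances held for `days` days."""
    return instance_hourly_rate(family) * instances * days * 24


print(f"P5e per instance hour: ${instance_hourly_rate('P5e'):.2f}")
# Hypothetical booking: 64 P5e instances (512 accelerators) for 14 days.
print(f"64 x P5e for 14 days: ${reservation_cost('P5e', 64, 14):,.2f}")
```

Under the 8-accelerator assumption, the P5e rate works out to $47.76 per instance hour, which is 20% above the $39.80 per hour that p5e.48xlarge reached after the January round.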

This Is the Second Hike of 2026, and Bigger Than the First

AWS raised EC2 Capacity Block prices by approximately 15% on January 4, 2026, lifting p5e.48xlarge from $34.61 to $39.80 per hour and p5en.48xlarge from $36.18 to $41.61, according to AWS pricing detail published at the time. The July move is the second hike of 2026, five percentage points steeper on a base that was already reset six months ago.

The January round hit higher-priced regions harder in dollar terms. In US West (N. California), p5e rose from $43.26 to $49.75 per hour, the same roughly 15% in percentage terms but a $6.49 increase against the $5.19 rise on the $34.61 rate. Compounded, the two moves come to roughly 38% (1.15 × 1.20 = 1.38), a meaningful departure from the cost assumptions baked into the AI budgets that enterprises set at the start of 2026. Capacity Block pricing has moved in the other direction before: AWS reduced Capacity Block prices three times, once in 2024 and twice in 2025, before flipping to hikes this year.
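
The compounding can be checked in a few lines. This is a minimal sketch that treats both hikes as exact percentages (15% and 20%), whereas AWS describes each as approximate; the helper name is hypothetical.

```python
def compound_increase(*pct_steps: float) -> float:
    """Cumulative percentage increase from applying each step in sequence."""
    factor = 1.0
    for pct in pct_steps:
        factor *= 1 + pct / 100
    return (factor - 1) * 100


cumulative = compound_increase(15, 20)
print(f"Cumulative increase: {cumulative:.1f}%")  # 38.0%

# Applied to the pre-January p5e.48xlarge rate cited above.
pre_january = 34.61
implied_july = pre_january * (1 + cumulative / 100)
print(f"Implied p5e.48xlarge rate after both hikes: ${implied_july:.2f}")  # $47.76
```

The result is 38%, not the 35% that adding the two percentages would suggest, because the second hike applies to a base the first one already raised.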

The contrast with December 2025 is sharper still. At re:Invent in Las Vegas, AWS cut prices on some On-Demand and Savings Plan GPU instances by up to 45%, and explicitly excluded Capacity Blocks from that reduction. Two AWS purchasing options have moved in opposite directions within roughly seven months: pay-as-you-go getting cheaper, guaranteed-capacity reservations getting more expensive.

Why AWS Can Raise Prices in a Hurry

AWS is not raising prices because its business is slowing. AWS segment sales grew 28% year-over-year to $37.6 billion in Q1 2026, the fastest growth rate at AWS in 15 quarters, Amazon said in its Q1 financial release. Operating income at the AWS segment hit $14.2 billion in the same period. AI is doing the work: Amazon said Bedrock processed more tokens in the first quarter than in all prior years combined, with 170% growth in customer spend quarter-over-quarter. AI infrastructure demand is also giving 1990s tech brands a second youth as legacy server, storage, and networking vendors ride the same build-out cycle.

The supply side is the other half of the story. Amazon has committed roughly $200 billion in capital expenditure for 2026, the bulk of it earmarked for AWS and AI infrastructure. AWS has landed more than 2.1 million AI chips over the past 12 months and is on track to receive 1 million Nvidia GPUs starting in 2026 under a deal running through end-2027. Amazon’s custom-silicon business, spanning Graviton, Trainium, and Nitro, has crossed a $20 billion annual revenue run rate and is growing triple digits year-over-year, according to Amazon’s Q1 2026 financial release.

AWS Q1 2026 segment revenue: $37.6 billion, up 28% year-over-year
Amazon 2026 capex commitment: ~$200 billion, weighted to AWS and AI
AI chips landed by AWS in the past 12 months: 2.1 million+
Nvidia GPUs headed to AWS by end-2027 under the existing deal: 1 million

Andy Jassy argued on the Q1 earnings call that every query and inference carries a real price tag, and that customers building frontier models need that capacity reserved rather than auctioned. The July price book is the company testing whether demand is inelastic enough to fund the build-out it has committed to.

Free cash flow shows the cost of that build-out. Amazon’s trailing-twelve-month free cash flow fell to $1.2 billion from $25.9 billion a year earlier, driven by a $59.3 billion year-over-year increase in property and equipment purchases tied primarily to AI investments, per Amazon’s Q1 release. AWS’s chips business topping a $20 billion run rate is one answer to that bill; higher reservation rates on the Nvidia GPU side are another.

The Choice Now Facing Azure and Google Cloud

Microsoft Azure and Google Cloud both sell equivalent GPU reservations for Nvidia H100 and H200 systems, and as of late June 2026 neither has announced a matching hike since the January round.

The strategic options are binary, and both carry a price. Undercutting AWS on the same reservation product would let one of the rival clouds absorb the cost-sensitive AI workloads that AWS just made more expensive, the workloads that often scale across providers anyway. Following AWS preserves industry margins but cedes the headline-grabbing price-cut narrative. Both Microsoft and Google have committed their own capex to AI infrastructure and need revenue against it.

The framing matters as much as the number. AWS has consistently described Capacity Block pricing as dynamic and supply-driven, telling IT Pro after the January hike that “EC2 Capacity Blocks for ML pricing are dynamic and vary based on supply and demand patterns… This price adjustment reflects the supply/demand patterns we expect this quarter.” Cloud economist Corey Quinn pushed back at the time.

“This was AWS updating the published base rates on their pricing page… That’s a policy decision, not supply/demand.”

Quinn, in a LinkedIn post quoted by InfoQ, framed the change as a deliberate rate update applied uniformly across regions rather than a market response. Steve Wade, founder of Platform Fix, made the precedent point on the same thread: “Once the door is open, it doesn’t close. Every FinOps team just added a new line to their risk register.” Wade’s point is the second-order one. AWS moving first makes the next hyperscaler hike a smaller customer shock, and shifts the burden of proving “we are cheaper” onto whichever cloud chooses not to follow.

What Customers Can Switch To, and What They Cannot

The substitution question is sharper than it looks. Capacity Blocks for ML are not the same product as on-demand GPU instances, and practitioners on the r/aws community have noted that for these specific instance families the reservation product is often the only practical way to obtain capacity at all. The published on-demand rate is, for many customers, a number that exists only on paper.

Enterprise discount agreements do not soften the hit. AWS discounts are typically percentage-based, so a 20% published price increase translates into a 20% effective cost increase even on a deeply negotiated deal. Customers that locked in 2026 budgets at January rates are now working against arithmetic they did not plan for. Capacity Block pricing is dynamic by design, with AWS scheduling quarterly reviews that can change the published rate between bookings. Customers that want certainty should already be on contracts that fix the unit price.
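
The discount arithmetic is easy to verify. The following sketch assumes a flat percentage discount applied to the published rate; the discount levels are hypothetical, and the rates are the p5e.48xlarge January figure from this article plus exactly 20%.

```python
def effective_rate(list_rate: float, discount_pct: float) -> float:
    """Rate actually paid after a flat percentage discount off the list rate."""
    if not 0 <= discount_pct < 100:
        raise ValueError("discount_pct must be in [0, 100)")
    return list_rate * (1 - discount_pct / 100)


def effective_increase_pct(old_list: float, new_list: float, discount_pct: float) -> float:
    """Percentage change in the discounted rate when the list rate changes."""
    old = effective_rate(old_list, discount_pct)
    new = effective_rate(new_list, discount_pct)
    return (new - old) / old * 100


old_list = 39.80            # p5e.48xlarge per hour after the January round
new_list = old_list * 1.20  # assumes the July hike is exactly 20%

for discount in (0, 10, 30):  # hypothetical negotiated discount levels
    paid = effective_rate(new_list, discount)
    change = effective_increase_pct(old_list, new_list, discount)
    print(f"{discount:>2}% discount: pays ${paid:.2f}/hour, up {change:.1f}%")
```

The discount factor cancels out of the ratio, so the increase is 20% at every discount level. A larger discount lowers the dollar amount of the increase but not its percentage.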

The realistic alternatives are narrow. AWS’s own Graviton and Trainium lines, Trainium especially, now used by Anthropic and OpenAI for AWS workloads, sit outside the Capacity Block product and have not been repriced. Outside AWS, Azure’s ND H100 v5 series and Google Cloud’s A3 with H100 are the comparable reservation products, and as of late June 2026 neither has matched the hike. The cross-cloud substitution that works for inference at small scale does not work for a multi-thousand-GPU training run that needs a single contiguous reservation window inside one provider’s cluster fabric.

The Bigger Question Hiding Inside the Hike

The constraint may not be chips at all. A senior DevSecOps engineer quoted by InfoQ put it bluntly after the January round: “The supply in this case is electricity. The CEO of Microsoft has said he has warehouses full of GPUs that haven’t been installed yet. He doesn’t have anywhere to put them.” If the binding constraint is power and data-center floor space, price hikes do not free up more capacity for the customers willing to pay for it.

They do ration demand. AWS executives said on the Q1 call that AWS’s chips business is growing triple digits, that 2.1 million chips have been landed in twelve months, and that Amazon is on track for one million Nvidia GPUs starting in 2026. The 2026 capex commitment of around $200 billion assumes customers will absorb higher unit prices to fund that build. If customers instead defer training runs, shift to cheaper custom silicon, or move workloads to a rival cloud, the math on the capex changes before the GPU shipments arrive.

That is the contest that begins on July 1. Microsoft and Google will decide whether to undercut, follow, or hold and watch. Andy Jassy wrote in his 2026 shareholder letter that AWS is not investing $200 billion in capex “on a hunch.” The new Capacity Block price book is the first quarterly test of that claim, and the answer will come from whichever rival cloud announces its next move.

Frequently Asked Questions

How much are AWS GPU instance prices going up on July 1, 2026?

AWS is raising EC2 Capacity Block reservation prices for ML by approximately 20% on P6-B300, P6-B200, P5, P5e, P5en, and P4de instance families. The new per-accelerator rates include $14.04 for P6-B300 and $12.355 for P6-B200, with all other AWS prices unchanged.

Is this the first AWS GPU price hike in 2026?

No. AWS raised EC2 Capacity Block ML prices by approximately 15% on January 4, 2026, lifting p5e.48xlarge from $34.61 to $39.80 per hour and p5en.48xlarge from $36.18 to $41.61. The July 1 change is the second hike of the year.

Why is AWS raising prices now?

AWS attributes the change to quarterly supply-and-demand patterns. The hike lands as AWS reports 28% year-over-year growth to $37.6 billion in Q1 2026 segment revenue and Amazon commits roughly $200 billion in capex for 2026, the bulk of it to AWS and AI infrastructure.

Should AI customers switch to Azure or Google Cloud instead?

As of late June 2026, neither Microsoft Azure nor Google Cloud has announced a matching hike on its equivalent GPU reservations. The practical constraint is that reservation products of this kind are often the only way to obtain Nvidia H100, H200, or Blackwell capacity at all, so substitution is limited for multi-thousand-GPU training runs that need a single contiguous reservation window.