Coinbase Goes Dark For 7 Hours After AWS Chillers Fail In Virginia

Coinbase went dark for almost seven hours on Thursday night because a room got too hot in Northern Virginia. The largest U.S. crypto exchange could not match orders, settle trades, or process transfers between roughly 8 p.m. ET on May 7 and 5:29 a.m. PDT on May 8, after multiple chillers failed inside an Amazon Web Services data hall and AWS availability zone use1-az4 lost power. The timing was savage. CEO Brian Armstrong had spent Tuesday telling 700 laid-off employees that AI now does their work in days, then spent Thursday telling Wall Street that non-technical staff would soon ship code to production. Hours later, the centralized exchange he runs collapsed because of a single physical room.

The outage was confined to one of six availability zones in AWS’s US-EAST-1 region, the oldest and most heavily loaded cloud hub on the internet. Most Coinbase services rode it out. The exchange did not. Armstrong has now promised a review of the latency-versus-resilience tradeoffs that left the matching engine inside one room.

Customer funds were never at risk. The reputational damage is another story.

What Actually Broke Inside the Virginia Data Hall

AWS engineers logged the first thermal anomaly in availability zone use1-az4 at 5:25 p.m. PDT on May 7, according to the AWS Health Dashboard event archive. Multiple chillers in a single data hall failed at roughly the same time. Temperatures climbed past safe operating thresholds. Power tripped off the racks to protect the hardware. EC2 instances and EBS volumes hosted in the affected zone went down with it.

By 6:47 p.m. PDT, AWS was warning that other services depending on the impaired EC2 and EBS resources would degrade too. By 8:06 p.m. PDT, the company conceded recovery was “slower than originally anticipated” and told customers to launch in unaffected zones or restore from EBS snapshots. The Register’s running incident log shows engineers were still recovering racks more than 12 hours after the initial event, working in what AWS described as a controlled and safe manner.
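AWS's advice to launch in unaffected zones has a wrinkle: zone IDs such as use1-az4 are consistent across accounts, but zone names such as us-east-1a map to different IDs in different accounts. The sketch below shows one way to pick usable zones by ID. It assumes data shaped like the list returned by EC2's DescribeAvailabilityZones call; the sample mapping and the helper name usable_zone_names are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical sample shaped like the "AvailabilityZones" list returned by
# EC2 DescribeAvailabilityZones. The name-to-ID mapping differs per account.
SAMPLE_ZONES: List[Dict[str, str]] = [
    {"ZoneName": "us-east-1a", "ZoneId": "use1-az4", "State": "available"},
    {"ZoneName": "us-east-1b", "ZoneId": "use1-az6", "State": "available"},
    {"ZoneName": "us-east-1c", "ZoneId": "use1-az1", "State": "available"},
]


def usable_zone_names(zones: List[Dict[str, str]],
                      impaired_ids: Iterable[str]) -> List[str]:
    """Return names of zones that are available and not flagged as impaired."""
    impaired = set(impaired_ids)
    return sorted(
        z["ZoneName"]
        for z in zones
        if z["State"] == "available" and z["ZoneId"] not in impaired
    )


# Exclude the zone ID named in the incident, whatever name it has locally.
print(usable_zone_names(SAMPLE_ZONES, {"use1-az4"}))
```

Filtering on the ID rather than the name matters because two teams in different accounts can both say "us-east-1a" and mean different buildings.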

This is the second US-EAST-1 failure to take chunks of the internet offline in seven months, though the first was a software fault rather than a physical one. An October 2025 DNS race condition inside DynamoDB knocked more than 70 AWS services and a long list of dependent platforms offline for roughly 15 hours.

The Exchange Failed Because Coinbase Designed It That Way

Here is the part most coverage glided past. AWS lost one zone out of six. The standard cloud architecture playbook says: distribute across zones, survive single-zone failure, move on. Coinbase’s general infrastructure did exactly that. The matching engine did not.

Armstrong said so himself. “Exchanges have unique architectures that optimize for latency and co-location of clients,” he wrote in a public post on X explaining the root cause. “It is possible to make exchanges resistant to AZ failures, but this can introduce latency delays that are not desirable along with breaking customer co-location.”

Translation: the matching engine sits in one zone on purpose. Institutional traders pay for co-location, the practice of parking their own servers in the same physical building as the exchange’s matching engine to shave microseconds off round-trip times. Spread that engine across three zones for fault tolerance and you slow it down. Slow it down and the algorithmic desks who generate the volume Coinbase needs go elsewhere.

So the company chose speed. That choice held up fine until a chiller failed.
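The tradeoff can be put in rough numbers. The sketch below is a toy model, not Coinbase's design: the function name commit_latency_us and every microsecond figure are illustrative assumptions. It treats an order as committed once a required number of replicas in other zones have acknowledged it, with replicas contacted in parallel.

```python
from typing import List


def commit_latency_us(processing_us: int,
                      replica_rtts_us: List[int],
                      acks_needed: int) -> int:
    """Time until an order is acknowledged by `acks_needed` remote replicas.

    Replicas are contacted in parallel, so the wait is set by the slowest
    of the fastest `acks_needed` round trips.
    """
    if acks_needed == 0:
        return processing_us  # single-zone: no cross-zone wait at all
    if acks_needed > len(replica_rtts_us):
        raise ValueError("not enough replicas for the requested acks")
    fastest = sorted(replica_rtts_us)[:acks_needed]
    return processing_us + fastest[-1]


# Assumed figures: 50 us of local work, 600 us and 900 us cross-zone round trips.
single_zone = commit_latency_us(50, [], 0)
one_remote_ack = commit_latency_us(50, [600, 900], 1)
two_remote_acks = commit_latency_us(50, [600, 900], 2)

print(single_zone, one_remote_ack, two_remote_acks)
```

Under these assumed inputs, requiring even one cross-zone acknowledgment multiplies commit latency several times over, which is the cost a latency-sensitive desk would object to.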

Coinbase’s own initial messaging tried to widen the blame, claiming failures “across multiple AWS zones.” Amazon disputed that account directly to Decrypt, saying only one availability zone was affected. The company’s backup procedures, which are supposed to isolate exactly this scenario, did not fire automatically. Engineers ended up running disaster recovery by hand.
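For readers wondering what "firing automatically" usually means, a common pattern is a controller that promotes a standby after several consecutive failed health checks. The sketch below illustrates that general pattern only. The class name, the threshold of three, and the choice of use1-az6 as the standby are all hypothetical, and nothing here describes Coinbase's actual procedures.

```python
from dataclasses import dataclass


@dataclass
class FailoverController:
    """Promotes a standby zone after N consecutive failed health checks."""
    primary: str
    standby: str
    threshold: int = 3   # hypothetical: consecutive failures needed to trigger
    failures: int = 0
    active: str = ""

    def __post_init__(self) -> None:
        self.active = self.primary

    def record_check(self, healthy: bool) -> str:
        """Feed one health-check result; return the zone now serving traffic."""
        if self.active != self.primary:
            return self.active  # already failed over; failing back is manual here
        # A healthy result resets the streak; a failure extends it.
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.threshold:
            self.active = self.standby
        return self.active


ctl = FailoverController(primary="use1-az4", standby="use1-az6")
for result in [True, False, False, False]:
    serving = ctl.record_check(result)
print(serving)
```

Requiring consecutive failures keeps a single dropped probe from triggering a costly switch, at the price of a slower reaction to a real outage.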

A Brutal Week, Compressed

Strip the AWS angle out and the calendar tells its own story.

  • Tuesday, May 5: Armstrong sends a 7 a.m. email cutting roughly 700 jobs, about 14 percent of the company. System access for affected staff is revoked before some finish reading the message.
  • Thursday, May 7 (after market): Coinbase posts a $394 million GAAP net loss on $1.41 billion in revenue, a 30.5 percent year-on-year decline. Earnings per share come in at a $1.49 loss against analyst expectations of a 27 cent profit.
  • Thursday, May 7 (evening): AWS use1-az4 overheats. Coinbase begins logging elevated error rates across multiple services around 8 p.m. ET.
  • Friday, May 8 (5:29 a.m. PDT): All markets re-enabled for trading on web and mobile after roughly six and a half hours in a cancel-only or fully halted state.

The Earnings Call Quote That Aged Badly

On the same day the chiller failed, Armstrong was telling investors his vision for AI inside the company. He compared the rollout of agentic coding to self-driving cars, suggesting AI agents are “getting to a place where they’re actually safer than human drivers.”

“There will be a point, I think, in the future, where nontechnical people will be able to write code, AI agents will be able to review it and check it for security, and improve the quality of it. And actually, in certain situations, have it go to production, but that’s not yet the case today.”

Armstrong said this on Coinbase’s Q1 2026 conference call, captured in the company’s Q1 2026 earnings transcript. He clarified that today, human engineers still review every line before it ships, with multiple reviews on sensitive systems. The qualifier did not survive the news cycle.

The Tuesday layoff email had already lit the fuse. Armstrong wrote that engineers ship in days what used to take teams weeks, that non-technical teams are “shipping production code,” and that the goal was to rebuild Coinbase as “an intelligence, with humans around the edge aligning it.” That phrasing is preserved in Fast Company’s full-text reproduction of the layoff memo.

Then the matching engine went offline.

Gergely Orosz, the former Uber and Skype engineer who writes The Pragmatic Engineer, did the obvious thing. “Unfortunate optics for Coinbase to have an hours-long outage when customers could not trade, a few days after their CEO said how non-technical teams are shipping code to production,” Orosz wrote on X to his 310,000-plus followers. He pointed out that the dependency on a single AWS region was a deliberate engineering choice, not an accident, and called the timing “terrible advertising.”

Why Co-location Is Worth Defending Anyway

The reason exchanges chase low latency is competitive, not vain. High-frequency market makers are how a venue keeps spreads tight and order books deep. They will not co-locate against an exchange that runs its matching engine on a multi-zone architecture if it costs them a millisecond of edge. Coinbase’s own derivatives connectivity documentation confirms the platform sits inside US-EAST-1 specifically so professional clients can co-locate.

The October 2025 outage had already shown what concentration in US-EAST-1 can cost. Recurring failures inside the same region by the same provider raise the obvious question of whether US-EAST-1’s age, density, and load have made it structurally riskier than newer AWS regions. Coinbase has been answering that question with silence and a re-architecture promise.

The Numbers Coinbase Did Not Want Stacked This Week

  • $394 million: Q1 2026 GAAP net loss, versus a $65.6 million profit in the year-ago quarter.
  • $1.41 billion: Q1 2026 revenue, down 30.5 percent year over year and 21 percent sequentially.
  • $482 million: Unrealized losses on crypto assets held for investment, mostly tied to Bitcoin’s slide.
  • 700 employees: Headcount cut in the May 5 layoff round, roughly 14 percent of the workforce.
  • $50 to $60 million: Restructuring charge expected in Q2 2026, per Coinbase’s SEC disclosures captured in CNBC’s Q1 2026 earnings recap.
  • 52 percent: Coinbase share price decline from its October 2025 high through Thursday’s close.
  • 8.6 percent: Coinbase’s all-time-high share of global crypto trading volume reported during the quarter, the lone bright spot in the deck.
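Several of these figures imply others that the deck does not state. The back-of-envelope sketch below derives them from the numbers in the list. The inputs are rounded, so the outputs are approximations, and the helper name implied_base is just an illustration.

```python
def implied_base(current: float, pct_decline: float) -> float:
    """Value before a decline of `pct_decline` percent that left `current`."""
    return current / (1 - pct_decline / 100)


# Revenue in $ billions, from the reported declines.
prior_year_revenue = implied_base(1.41, 30.5)
prior_quarter_revenue = implied_base(1.41, 21)

# Headcount before the cut, if 700 jobs were about 14 percent of staff.
headcount_before = 700 / 0.14

# Year-over-year swing in net income, $ millions (profit to loss).
net_income_swing = 65.6 + 394

print(f"Implied year-ago revenue:      ${prior_year_revenue:.2f}B")
print(f"Implied prior-quarter revenue: ${prior_quarter_revenue:.2f}B")
print(f"Implied pre-layoff headcount:  {headcount_before:.0f}")
print(f"Net income swing:              ${net_income_swing:.1f}M")
```

Run as written, this puts year-ago revenue near $2.03 billion, prior-quarter revenue near $1.78 billion, and pre-layoff headcount near 5,000.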

The Decentralization Contradiction Nobody Solves

An industry built on the rhetoric of removing single points of failure runs almost entirely on three of them: AWS, Microsoft Azure, and Google Cloud. The May 7 chiller failure exposed the gap between the marketing and the wiring diagram. CME Group’s derivatives platform was disrupted alongside Coinbase. FanDuel was knocked offline. Quartz’s coverage of the AWS data center outage noted the pattern: companies that had fully distributed across zones recovered quickly, while those with single-zone dependencies sat dark for hours.

For crypto specifically, the contradiction cuts deeper because the asset class sells itself on the absence of central intermediaries. The custodian failed. Customer funds were safe. But for nearly seven hours, holders of those funds could not act on them, and that is the part the marketing rarely acknowledges.

What Armstrong Actually Promised To Change

Armstrong’s X post committed to two specific things. First, Coinbase will revisit the latency-versus-resilience tradeoffs that kept the exchange in one zone. Second, the company will work to shorten future outages caused by an availability zone migration, even if it cannot eliminate them entirely. Neither commitment includes a deadline. Neither specifies whether co-location customers will be moved or whether the matching engine itself will be restructured.

The detailed technical postmortem has not been published. Coinbase said it will appear once the AWS retrospective lands. Until then, the company is asking customers to trust that the same engineering organization that designed a single-zone matching engine will now redesign it without breaking the institutional revenue stream that depends on the original choice.

Coverage from Gergely Orosz at The Pragmatic Engineer has been particularly sharp on the gap between Armstrong’s AI-native messaging and the actual reliability bar a regulated trading venue has to clear.

How This Connects to the Wider AI-Agent Push

Armstrong’s vision for billions of AI agents transacting on-chain, much of it through Coinbase’s Base network, is real strategy, not just rhetoric. The x402 payments protocol Coinbase contributed to the Linux Foundation now lists Cloudflare, AWS, Stripe, Shopify and Google among its participants, and 99 percent of x402 transactions in Q1 settled in USDC. The plumbing is being built.

That plumbing depends on the exchange staying online when it matters. The same week Coinbase laid off 14 percent of staff, the broader industry race to make AI agents real money-movers continued, as we covered in our breakdown of Meta’s Hatch and Google’s Remy launches in the agentic AI wars. Coinbase’s prediction-market revenue is forecast to hit $100 million annualized by year-end, and the Deribit acquisition closed at $4.29 billion during the quarter. The strategic surface is expanding while the engineering organization shrinks.

For the on-chain agent thesis specifically, our reporting on the Solana and Google Cloud Pay.sh launch enabling AI agents to pay in USDC shows how fast the rest of the field is moving. None of it works if the venues those agents route through cannot tolerate a failed chiller.

What Comes After the Postmortem

The detailed technical summary Armstrong promised will land in the next few weeks. The most consequential paragraph in it will be the one explaining what changed about the matching engine, if anything. The second-most consequential will be whatever it says about co-location customers and whether they were consulted on the redesign.

Frequently Asked Questions

Were My Coinbase Funds At Risk During The Outage?

No. Coinbase confirmed in its post-incident X update on May 8 that customer balances remained safe throughout the disruption. The outage affected order matching, trade execution, and wallet transfers, but custody of underlying assets is held in cold storage and segregated wallets that were not impaired by the AWS zone failure. If you see lingering discrepancies in your transaction history, contact Coinbase support directly through the help portal because the company is still reconciling delayed activity.

Why Did Coinbase Go Down When Other Crypto Exchanges Stayed Up?

Coinbase runs its primary matching engine inside a single AWS availability zone, use1-az4 in Northern Virginia, to keep latency low for institutional clients who pay for server co-location. Competitors that route across multiple zones or use different cloud providers were not affected. CEO Brian Armstrong confirmed this design choice in his May 8 post on X and promised an architecture review, though he did not commit to a timeline for when changes would ship.

Can I Get Compensated For Trades I Could Not Make?

Coinbase’s user agreement does not guarantee uptime, and the company has historically not issued blanket compensation for outage-related missed trades. Affected users with specific evidence of financial harm, such as a stop-loss order that could not execute, should file a formal support ticket and document the timestamp, intended trade, and market price difference. Class actions over previous outages have generally settled or been dismissed, so the practical path is direct engagement with support.

Is This Going To Happen Again?

Probably yes, until Coinbase rearchitects the matching engine. The same AWS US-EAST-1 region took down a long list of services in October 2025 through a separate DNS failure, and physical-layer issues like chiller failures are not one-off events. AWS operates six availability zones in US-EAST-1, and the standard practice is to distribute across several of them so that losing one does not take a service down. Until Coinbase completes the latency-versus-resilience review Armstrong promised, the same single-zone exposure remains in place.

What Was Brian Armstrong’s AI Coding Comment And Why Did It Blow Up?

On the May 7 earnings call, Armstrong said non-technical employees will eventually be able to write code that AI agents review, security-check, and in some cases push to production. He stressed this is not the case today and that human engineers still review every line. The comment landed badly because Coinbase had laid off 700 staff two days earlier citing AI productivity gains, then suffered a multi-hour outage hours after the call ended. Critics including Gergely Orosz called it terrible optics.

Where Can I Check If Coinbase Is Down Right Now?

The official source is status.coinbase.com, which posts component-level updates within minutes of an incident being detected. For underlying cloud issues, the AWS Health Dashboard at health.aws.amazon.com publishes regional and availability-zone alerts that often surface problems before exchange-level symptoms appear. Third-party trackers like Downdetector show user-reported spikes but lag the official sources. Bookmark the Coinbase status page if you trade actively because it is the authoritative timeline.

The longer arc is harder to brush off. A company that just told its workforce AI is the future spent seven hours unable to match trades because a piece of physical infrastructure no AI agent can patch from a chat window failed in Virginia. Whatever the postmortem says, the lesson sits in plain view: latency is a competitive moat, resilience is a regulatory floor, and Coinbase just proved it had been measuring one and ignoring the other.

Logan Pierce is a writer and web publisher with over seven years of experience covering consumer technology. He has published work on independent tech blogs and freelance bylines covering Android devices, privacy-focused software, and budget gadgets. Logan founded Oton Technology to publish clear, no-nonsense tech news and reviews based on real hands-on testing. He has personally tested and reviewed dozens of mid-range and budget Android phones, written extensively about app privacy, and built and managed multiple WordPress publications over the past decade. Logan holds a bachelor's degree in English and studied digital marketing at a certificate level.
