GPT-5.5 Catches Mythos On Cyber Tests, ARC Reveals Brittle Logic

OpenAI’s GPT-5.5 has matched Anthropic’s Mythos Preview on offensive cyber tasks, the UK AI Security Institute (AISI) reported on April 30, 2026. GPT-5.5 scored 71.4% on the hardest tier of AISI’s 95-task suite against Mythos Preview’s 68.6%, and both finished a 32-step network intrusion that a human expert needs roughly 20 hours to clear.

A separate ARC Prize Foundation study, published the next day, found that GPT-5.5 and Anthropic’s Opus 4.7 still fail problems they could not have seen in training. The two streams of evidence landed in the same week and pull in opposite directions.

The Parity Moment AISI Flagged

Mythos Preview held the top spot on AISI’s expert-tier cybersecurity tasks for two weeks before GPT-5.5 caught and slightly cleared it. The institute treats the gap as statistically meaningless. GPT-5.4, the predecessor, sat at 52.4%. Anthropic’s Opus 4.7 came in at 48.6%. Both new frontier models jumped roughly 20 percentage points over their immediate predecessors in a few months.

AISI’s GPT-5.5 cyber capability evaluation calls the parity itself the headline finding, not the leader. “A second model, from a different developer, now reaches a similar level of performance on our cyber evaluations,” the institute wrote, and warned that further jumps could land “in quick succession” if cyber gains keep arriving as a side effect of general reasoning improvements.

Inside The Last Ones, A 32-Step Network Range

The Last Ones is a simulated breach of a fictional corporate network built jointly with SpecterOps. Spread across four network segments and roughly twenty machines, it asks a model to chain initial access, lateral movement, privilege escalation, and a final objective without prompting. AISI estimates a skilled human operator needs about 20 hours.

GPT-5.5 finished the full chain in two of ten runs. Mythos Preview, the first model to crack it per AISI’s Mythos Preview cyber capability report, did so in three of ten and averaged 22 of 32 steps when it failed. Each attempt ran with a token budget of 100 million, putting even a successful run in the hundreds of dollars on API pricing.

  • 71.4%: GPT-5.5 pass rate on AISI’s hardest expert tier
  • 68.6%: Mythos Preview pass rate on the same tier
  • 2 of 10: GPT-5.5 end-to-end completions of The Last Ones
  • 20 hours: estimated human-expert time to clear all 32 steps

The shape of those numbers matters. A model that finishes a 32-step intrusion two times out of ten is unreliable on any single run, but the long tail of the distribution is what counts for offensive use. An attacker only needs one chain to land.
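The "one chain needs to land" point can be made concrete with a rough calculation. The sketch below assumes that attempts are independent and that the observed 2-in-10 completion rate is a fixed per-run probability. Both are simplifications, and the function name is illustrative rather than anything from AISI's report.

```python
def p_at_least_one(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent runs succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


# Observed rate from the article: 2 full completions in 10 runs.
# Treating it as a stable per-run probability is an assumption.
P_SINGLE = 2 / 10

for n in (1, 5, 10, 20):
    print(f"{n:>2} attempts: {p_at_least_one(P_SINGLE, n):.1%}")
```

Under those assumptions, ten attempts give roughly an 89% chance of at least one full chain, which is why a low single-run success rate offers limited comfort to defenders.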

The 95-task suite covers vulnerability research, reverse engineering, web exploitation, and cryptographic attacks. Tasks score binary pass-fail and group into four difficulty tiers. AISI’s tooling lets the models open shells, edit files, and call out to debuggers like a real operator would.

One footnote sharpens the picture. AISI also tested both models on a seven-step industrial control simulation built with Hack The Box, called Cooling Tower. No model has finished it yet, GPT-5.5 included. The cyber-physical bar is still out of reach.

The rust_vm Result Mainstream Coverage Skipped

Tucked into AISI’s report is a single task that reframes the threat picture. The challenge, called rust_vm, asks the model to reverse engineer a Rust-based virtual machine, recover its instruction set, disassemble its bytecode, reverse a custom authenticator, and solve constraints to produce a key. An expert playtester used Binary Ninja, gdb, Python, and the Z3 solver. They needed about 12 hours.

GPT-5.5 finished in 10 minutes and 22 seconds. The total API cost was $1.73.

That figure compresses the whole offensive-AI argument into one line. A task that ate half a day of an experienced reverse engineer’s time fell to a model in under eleven minutes for less than the price of a coffee. The size of the gap, not the raw capability, is what AISI wants regulators to read.
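The size of that gap is simple arithmetic on the figures AISI reported. The sketch below uses the article's numbers as-is; the 12-hour human figure is an approximation, so the ratio is indicative only.

```python
from datetime import timedelta

human_time = timedelta(hours=12)                 # expert playtester, approximate
model_time = timedelta(minutes=10, seconds=22)   # GPT-5.5 run reported by AISI
model_cost_usd = 1.73                            # total API cost reported

# Dividing one timedelta by another yields a plain float ratio.
speedup = human_time / model_time

print(f"Wall-clock speedup: about {speedup:.0f}x")
print(f"Model cost per human-hour replaced: ${model_cost_usd / 12:.2f}")
```

On these inputs the model was roughly 69 times faster than the human playtester on this one task.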

A Universal Jailbreak Found In Six Hours

AISI’s red team also tested the safeguards OpenAI ships with GPT-5.5. Six hours of expert prompting was enough to find a single bypass that got every malicious cyber query AISI had prepared past those safeguards, including in the multi-step agent runs where the model has to plan and execute over many turns.

OpenAI shipped a safeguard update in response. AISI said a configuration error in the version it received kept it from confirming whether the new defenses held. The audit cycle, in other words, has not closed.

“A second model, from a different developer, now reaches a similar level of performance on our cyber evaluations.”

The line, from AISI’s published evaluation, is the institute’s polite way of saying the parity is not a fluke. OpenAI’s internal classification rates GPT-5.5 a “high” cybersecurity risk under OpenAI’s updated Preparedness Framework, the second-highest tier, meaning the model can amplify existing attack pathways but stops short of “critical,” the bar for entirely new routes to severe harm.

The high tier carries deployment commitments. OpenAI agreed under the framework to ship monitoring, abuse detection, and rate-limiting around any high-rated production model. AISI’s universal-bypass finding tests whether those commitments translate to defenses that hold against a focused attacker.

Where ARC-AGI-3 Catches Both Models Out

Cyber benchmarks measure tasks that look broadly like training data. ARC-AGI-3 was built to do the opposite. The ARC Prize Foundation, run by Greg Kamradt, places models in 135 hand-crafted environments where no instructions are given and no prior data applies. Every environment has been solved by at least two humans without special training. Frontier models score near zero.

In a study released May 1, 2026, Kamradt’s team analyzed 160 replays and reasoning traces. GPT-5.5 scored 0.43% on the semi-private set. Opus 4.7 scored 0.18%. The ARC Prize analysis of GPT-5.5 and Opus 4.7 identifies three repeating failure modes, but the most striking finding is how differently the two models broke.

GPT-5.5 Failed To Compress

GPT-5.5 generated multiple competing hypotheses about each environment but could not commit to one. Kamradt called this “wider hypothesis generation” without the closing step. The model saw that an action sometimes rotated an object and sometimes did nothing, but never compressed the observations into a single rule.

That pattern shows up in offensive cyber work too, just less visibly. Solving a known capture-the-flag means matching a pattern. Reasoning about a brand-new system means building the model and committing to it. AISI’s rust_vm result hides the distinction because the underlying instruction set, while custom, follows familiar conventions.

Opus 4.7 Locked Onto The Wrong Game

Opus 4.7 went the opposite way. It compressed quickly, then refused to revise. “Opus had the wrong compression,” Kamradt wrote. “GPT-5.5 failed to compress.” Opus runs repeatedly mistook ARC environments for Tetris, Frogger, Sokoban, Breakout, Pong, and Boulder Dash, then kept playing those games even after the rules disagreed.

The transfer problem hit both models hard. Beating one level rarely helped on the next. Whatever a model learned in level one did not survive contact with level two. Background on the benchmark’s construction sits in the ARC-AGI-3 interactive reasoning benchmark paper.

Why Capability And Brittleness Live Together

The two evaluation streams point at the same fact from opposite sides. Cyber benchmarks reward fluency in patterns the model has seen many times. Reasoning benchmarks punish that fluency the moment the patterns no longer apply. Both labs are pushing the first lever and have done little for the second.

If AISI is right that the cyber jump came from general reasoning and agent gains rather than targeted training, the next frontier model will likely show both moves at once. More offensive capability. The same brittle compression. ARC Prize’s 2025 competition results already foreshadowed this pattern, with strong scores on training-aligned tasks and collapses on novel ones.

Frequently Asked Questions

Is GPT-5.5 available to use right now?

GPT-5.5 is in limited preview as of May 2026. OpenAI has rolled it out to enterprise customers and API testers under usage agreements that include the high-risk safety controls AISI tested. A wider ChatGPT release has not been announced. Developers can apply for access through OpenAI’s platform page; rate limits and abuse-monitoring requirements come bundled with the high-risk classification.

Does the AISI finding mean AI can hack on its own?

Not quite. GPT-5.5 finished a full corporate intrusion only twice in ten attempts, and each run cost hundreds of dollars in compute. What changed is the speed on individual subtasks. One reverse engineering task that took a human expert about twelve hours fell in under eleven minutes for $1.73. Defenders should treat the model as a force multiplier for skilled attackers, not yet as an autonomous threat actor.

How does ARC-AGI-3 differ from earlier ARC tests?

ARC-AGI-3 is interactive, not single-turn. The earlier ARC-AGI-2 asked models to fill in a missing grid pattern from a few examples. ARC-AGI-3 drops the model into 135 hand-built game environments with no instructions, where the model must figure out rules through trial and error. Humans clear them without training; frontier models score below 1%. The 2026 Kaggle round opens later this year for outside teams.

What did OpenAI say about the universal jailbreak?

OpenAI updated its safeguard stack after AISI shared the bypass details, the company told the institute. AISI then received a follow-up build, but a configuration error in that version blocked retesting, so the fix is unverified externally. OpenAI’s preparedness page lists GPT-5.5 at “high” risk on cybersecurity, the second-highest tier and the trigger for monitoring commitments around the model in production.

The next round of frontier evaluations is already in flight. AISI is iterating its 95-task suite while ARC Prize runs ARC-AGI-3 as a 2026 Kaggle competition with a $1 million prize pool. Whichever lab ships the next jump first will be tested against both, and the gap between those two scores is now the number that matters.

Logan Pierce is a writer and web publisher with over seven years of experience covering consumer technology. He has published work on independent tech blogs and freelance bylines covering Android devices, privacy-focused software, and budget gadgets. Logan founded Oton Technology to publish clear, no-nonsense tech news and reviews based on real hands-on testing. He has personally tested and reviewed dozens of mid-range and budget Android phones, written extensively about app privacy, and built and managed multiple WordPress publications over the past decade. Logan holds a bachelor's degree in English and studied digital marketing at a certificate level.

AAOI Stock Hits All-Time High as AI Optical Spending Surges

Applied Optoelectronics, Inc. (NASDAQ: AAOI), a Sugar Land, Texas fiber-optic manufacturer, reported $151.1 million in Q1 2026 revenue on May 7, a 51% jump from the same period a year earlier, with data center sales rising 154% as AI cluster buildouts pushed optical interconnect demand into a new speed tier. Management lifted full-year 2026 guidance to $1.1 billion at a May 13 investor conference, and the stock touched an all-time high of $233.67 the same day after Rosenblatt Securities raised its price target to $220, citing strong Amazon-linked 800G order momentum and upcoming Oracle customer qualifications.

Searches for “AAOI USDT” reflect a broader pattern: crypto-native traders want single-name AI infrastructure exposure without leaving their exchange accounts. The company at that intersection, building the optical transceivers that move data between the GPU racks everyone else discusses, became one of the more volatile story stocks on the Nasdaq in 2026.

Applied Optoelectronics: The Fiber Company Powering AI

AAOI designs and manufactures three families of fiber-optic networking products. Optical transceivers for hyperscale data centers come in 100G, 400G, and 800G configurations, with 1.6-terabit (Tb) modules entering volume production in late 2026. A second line covers hybrid fiber-coaxial (HFC, broadband equipment used by cable operators) products for cable television (CATV) networks. A third covers fiber-to-the-home (FTTH) telecom optics. In Q1 2026, data center revenue crossed CATV for the first time in a meaningful way, the product mix shift that Dr. Thompson Lin, AAOI’s Founder, President and Chief Executive Officer, called central to the company’s growth trajectory.

The company’s role in the AI conversation is structural, not incidental. Every large GPU training cluster needs optical interconnects to move data between chips at the speeds required for parallel processing. As cluster architectures have scaled from hundreds of GPUs to tens of thousands, copper cables hit their physical limits on bandwidth and distance. Optical fiber, which transmits data as pulses of light, handles what copper cannot. AAOI is one of a handful of pure-play vendors serving that supply chain, rather than treating optical as a side line within a broader component portfolio.

What separates AAOI from module assemblers is vertical integration. The company designs and fabricates its own laser diodes, a manufacturing capability that only a handful of competitors match. That integration shortens customer qualification timelines, because hyperscalers can test the complete signal chain at one vendor rather than coordinating across multiple component suppliers. Applied Optoelectronics has expanded its manufacturing footprint in the Houston area to roughly 900,000 square feet across multiple facilities, positioning a significant share of production inside the United States at a time when supply chain geography is a real factor in hyperscaler procurement decisions.

Why the Optical Layer Is the Structural Bottleneck

TrendForce’s April 2026 AI optical transceiver market analysis placed the global market at $16.5 billion in 2025 and projected expansion to $26 billion in 2026, a 57% single-year increase. The total addressable market estimate for optical transceiver modules was revised upward by 43% and 46% for 2026 and 2027 respectively, driven by hyperscaler capital expenditure plans coming in consistently above consensus. LightCounting, a separate optical research firm, projected the AI cluster-specific transceiver segment alone would double in two years, from $5 billion in 2024 to more than $10 billion in 2026.

The bandwidth demand behind those numbers is architectural. AI clusters are scaling in two directions simultaneously: horizontally, by adding compute nodes, and vertically, by tightening interconnects within the rack. Both trajectories consume optical bandwidth. As GPU cluster sizes increase and rack density rises, passive copper cable solutions hit their physical limits on speed and distance, making optical connectivity a required specification rather than a premium option for new AI infrastructure builds.

Supply constraints are running parallel to the demand acceleration. The bottleneck is not the transceiver module itself but the components inside each unit: electro-absorption modulated lasers (EMLs, chips that convert electrical signals into precise optical pulses) and continuous-wave laser diodes. Fabricating those components requires precision manufacturing that cannot be scaled in a single product cycle. TrendForce named Applied Optoelectronics alongside Coherent Corp. and Lumentum Holdings, Inc. as vendors that have initiated capacity expansions and technology deployments in direct response to component shortages. AAOI management told attendees at the Needham Technology, Media and Consumer Conference that demand is expected to exceed supply through at least mid-2027.

The technology cycle is also in transition. 800G modules are becoming the standard for new AI cluster deployments this year, while early 1.6T products are entering mass production. TrendForce identifies 2026 through 2027 as the crucial window for vendors to secure design-in at tier-one hyperscale accounts, because success in that qualification cycle determines the revenue trajectory for the following two years. AAOI completed its first volume shipment of 800G modules to a major hyperscale customer in Q1 2026, a qualification milestone rather than a volume event, but qualifications precede volume orders in the optical supply chain.

  • 800G+ share of optical transceiver shipments: 19.5% in 2024, projected above 60% in 2026, per TrendForce
  • $26 billion: projected global AI optical transceiver market in 2026, up from $16.5 billion in 2025 (TrendForce)
  • 57%: single-year market growth rate as of TrendForce’s April 2026 report, one of the fastest on record for the sector
  • Mid-2027: the earliest date through which AAOI management expects demand to keep exceeding available supply capacity

AAOI’s Books: Revenue, Guidance, and the Amazon Signal

First-Quarter Results and What the Numbers Say

Q1 revenue of $151.1 million came in slightly below the Street consensus of roughly $157 million, but the mix told a clearer story. Data center revenue reached $81.4 million, growth of 154% from Q1 2025, and crossed CATV revenue of $66.8 million for the first time in the cycle. The Q1 2026 earnings filing on SEC EDGAR showed GAAP net loss widening to $14.3 million as research and development spending increased and the Texas facility ramped toward capacity. Non-GAAP gross margin came in at 29.1%, easing from 30.6% a year earlier as higher-cost 800G units entered production.

Cash and equivalents stood at $449.4 million at quarter end, following a public equity offering that added approximately $382.4 million. The company also carries $125 million in 2.75% convertible senior notes due 2030 and had $61.7 million in unused borrowing capacity. The balance sheet is funded for the capital expenditure cycle ahead, including the planned Texas expansion and manufacturing equipment for 1.6T product lines.

| Metric | Q1 2026 Actual | Q2 2026 Guidance | FY2026 Target |
| --- | --- | --- | --- |
| Revenue | $151.1M | $180M to $198M | ~$1.1B |
| Data center revenue | $81.4M | Not disclosed | Not disclosed |
| CATV revenue | $66.8M | Not disclosed | Not disclosed |
| Non-GAAP gross margin | 29.1% | 29% to 30% | Not disclosed |
| GAAP net income (loss) | ($14.3M) | Not disclosed | Not disclosed |
| Cash on hand | $449.4M | Not disclosed | Not disclosed |

The $1.1 Billion Target and Amazon’s Stake

Management raised full-year 2026 revenue guidance to $1.1 billion at the Needham Technology, Media and Consumer Conference on May 13, implying a sharp back-half acceleration. The sequencing is clear: 800G volume ramp with a second hyperscale customer is imminent, 1.6T deliveries are scheduled to begin in late Q3, and the Texas facility adds production output continuously through the year. Q2 guidance of $180 million to $198 million already signals meaningful sequential growth from Q1.
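The scale of the implied back-half acceleration can be worked out from the guidance figures. The sketch below takes the midpoint of the Q2 range as a stand-in for the actual Q2 result, which is an assumption, and treats the roughly $1.1 billion target as exactly $1,100 million.

```python
# All figures in millions of US dollars, taken from the article.
FY_TARGET_M = 1100.0           # ~$1.1B full-year guidance (treated as exact here)
Q1_ACTUAL_M = 151.1
Q2_GUIDE_M = (180.0, 198.0)    # guided range for Q2

q2_mid = sum(Q2_GUIDE_M) / 2              # assumption: Q2 lands at the midpoint
first_half = Q1_ACTUAL_M + q2_mid
second_half = FY_TARGET_M - first_half    # revenue still needed in Q3 + Q4
avg_h2_quarter = second_half / 2

print(f"First half (Q1 actual + Q2 midpoint): ${first_half:.1f}M")
print(f"Implied second half:                  ${second_half:.1f}M")
print(f"Average per H2 quarter:               ${avg_h2_quarter:.1f}M")
print(f"H2 quarterly average vs Q1:           {avg_h2_quarter / Q1_ACTUAL_M:.1f}x")
```

At the Q2 midpoint, the second half would need to deliver about $760 million, or roughly $380 million a quarter, about 2.5 times the Q1 figure.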

The relationship with Amazon.com, Inc. adds an unusual layer of forward visibility. In March 2025, AAOI issued a customer warrant to a subsidiary of Amazon.com, Inc., granting Amazon the right to purchase up to approximately 7.95 million shares, with vesting tied to Amazon’s purchasing volume over time, potentially covering up to $4 billion in total purchases. The structure means Amazon has a financial incentive aligned with AAOI’s share price, linking the two companies beyond a standard purchase order relationship. Raymond James cited management’s plan to ramp optical transceiver revenue to $1.4 billion by mid-2027 when it raised its target to $160 from $72.50, maintaining an Outperform rating on May 13.

The company locked in more than $324 million in confirmed 800G and 1.6T orders and received a $20.9 million grant from the Texas Semiconductor Innovation Fund in April 2026 to support the Sugar Land facility expansion. That combination of a strategic customer warrant, state manufacturing support, and confirmed order backlog differentiates AAOI from smaller optical vendors competing for the same hyperscale qualifications.

Where the AAOI Thesis Can Break

The bear case is not about the technology or the market size. Both are clearly growing. The question is whether a company burning cash to build manufacturing capacity can execute the production ramp fast enough, at the margins the Street expects, without a competitor closing the qualification gap at one of its major accounts.

  • Customer concentration: AAOI’s data center revenue flows from a small number of hyperscale accounts. A single large customer adjusting its order cadence can produce double-digit stock moves on earnings day, independent of broader AI infrastructure trends.
  • Negative free cash flow: Capital expenditure for the Texas expansion kept free cash flow deeply negative through the scaling period. The equity raise addressed near-term liquidity, but sustained profitability is still a future milestone, not a current fact.
  • Beta of 2.24: The stock amplifies broad market moves by more than double, meaning macro-driven risk-off sessions can produce sharp drawdowns entirely unrelated to AAOI’s own operating results or its customers’ spending plans.
  • Analyst target dispersion: Price targets ranged from $57.50 at Northland to $220 at Rosenblatt as of mid-May 2026. A spread that wide reflects genuine disagreement about what the ramp is worth at price-to-sales multiples above 20.
  • Ramp timing risk: B. Riley Securities flagged potential 800G production timing delays into the second half of 2026 even while raising its target to $129 and maintaining a Neutral rating. Q3 and Q4 execution is the central variable that either validates or deflates the annual guidance.
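The beta figure in the list above translates into a rule-of-thumb calculation. The sketch below applies the linear relationship that beta describes; it is a statistical average over past data and ignores company-specific moves, so actual sessions can differ widely. The function name is illustrative.

```python
BETA = 2.24  # AAOI beta cited in the article


def beta_implied_move(market_move_pct: float, beta: float = BETA) -> float:
    """Expected stock move (in %) for a given market move, using beta alone."""
    return beta * market_move_pct


for market_move in (-3.0, -2.0, -1.0, 1.0):
    print(f"Market {market_move:+.1f}% -> beta-implied AAOI move "
          f"{beta_implied_move(market_move):+.2f}%")
```

On this simple view, a 2% market decline corresponds to a drop of about 4.5% in the stock before any AAOI-specific news is considered.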

Insider selling in mid-May 2026 drew attention when AAOI executives unloaded a significant block of shares as the stock was near its all-time high. Executive liquidity events at peak prices are common practice, but the timing against still-negative GAAP margins and a price-to-sales ratio above 20 adds to the list of inputs active traders are watching before the July 30, 2026 earnings date.

What “AAOI USDT” Means and How to Trade the Theme

What the USDT Suffix Means for Crypto Traders

Crypto-native traders searching “AAOI USDT” are looking for a Tether-margined perpetual futures contract on AAOI’s price, using the same naming convention as BTCUSDT or ETHUSDT, but referencing a Nasdaq-listed equity rather than a cryptocurrency. The appeal is structural: a single USDT collateral pool, the ability to go long or short with leverage, no fiat onboarding, and consolidated profit and loss alongside existing crypto positions. For a trader already running a multi-asset crypto book, those mechanics represent a real friction reduction compared with opening a separate brokerage account.
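The mechanics of a USDT-margined (linear) contract can be sketched in a few lines. This is a generic illustration of margin and profit-and-loss arithmetic, not a description of any listed product: the class name, prices, and position size are hypothetical, and funding payments, fees, and liquidation are left out.

```python
from dataclasses import dataclass


@dataclass
class LinearPerpPosition:
    """Hypothetical USDT-margined position; P&L is settled in USDT."""
    side: int            # +1 for long, -1 for short
    quantity: float      # units of the underlying
    entry_price: float   # entry price in USDT
    leverage: float      # notional / initial margin

    @property
    def initial_margin(self) -> float:
        # Collateral posted in USDT to open the position.
        return self.quantity * self.entry_price / self.leverage

    def pnl(self, mark_price: float) -> float:
        # Linear payoff: moves one-for-one with price, signed by side.
        return self.side * self.quantity * (mark_price - self.entry_price)

    def return_on_margin(self, mark_price: float) -> float:
        return self.pnl(mark_price) / self.initial_margin


# Illustrative numbers only, not market quotes.
pos = LinearPerpPosition(side=+1, quantity=10, entry_price=200.0, leverage=5)
print(f"Initial margin: {pos.initial_margin:.2f} USDT")
print(f"P&L at 210:     {pos.pnl(210.0):+.2f} USDT")
print(f"Return on margin at 210: {pos.return_on_margin(210.0):+.1%}")
```

With 5x leverage, a 5% price move becomes a 25% gain or loss on posted margin, which is the amplification the leverage feature implies in both directions.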

AAOI is not currently listed as a perpetual contract on major crypto derivatives platforms. Single-name equity perpetuals require a licensed stock price oracle, compliance with the securities regulations governing swap instruments, or operation in a jurisdiction where such products are currently permitted. The regulatory landscape around tokenized equities is actively evolving, but as the SEC’s pause on broader tokenized stock access for crypto platforms in May 2026 illustrated, the distance between a single-name equity perp and a compliant crypto exchange product is wider than the search query implies.

Index Futures as the Available Proxy

For traders who want directional exposure to the AI infrastructure theme in a USDT-margined account, Nasdaq 100 (NDX) index futures offer the most accessible route. The NDX carries heavy weighting toward the hyperscale operators and semiconductor suppliers whose capital expenditure decisions drive demand for AAOI’s products: Amazon.com, Inc., Microsoft Corp., NVIDIA Corp., Alphabet Inc., and Meta Platforms, Inc. together represent a substantial share of the index. Platforms such as Phemex offer USDT-margined NDX futures, making it possible to hold a single USDT account that spans AI-themed index exposure alongside crypto positions.

| Feature | AAOI Common Stock | AAOI USDT Perp (Hypothetical) | NDX USDT Futures |
| --- | --- | --- | --- |
| Account type | Brokerage account | Crypto exchange | Crypto exchange |
| Collateral | USD | USDT | USDT |
| Long / short | Long (cash); short via margin | Both, with leverage | Both, with leverage |
| Availability (May 2026) | Yes | Not currently listed | Yes |
| Concentration risk | Single name | Single name | 100-name diversified index |
| Shareholder rights | Yes (voting, proxy) | None | None |
| Settlement | USD | USDT | USDT |

The core trade-off is concentration versus convenience. A direct AAOI position captures the full upside of the 800G ramp and $1.1 billion guidance if the company executes. It also absorbs the full customer concentration risk, the 2.24 beta amplification during risk-off sessions, and any execution shortfalls in Q3 and Q4. An NDX position dampens AAOI-specific upside while surviving an AAOI-specific earnings miss without the same single-session drawdown risk. Which approach fits a given portfolio depends on how much of the thesis is specific to AAOI’s own execution versus how much is simply a bet on AI infrastructure spending continuing at pace.

Frequently Asked Questions

What does Applied Optoelectronics make and why is it considered an AI stock?

Applied Optoelectronics manufactures optical transceivers, laser diodes, and fiber-optic networking products used in hyperscale data centers, cable TV networks, and telecom infrastructure. It qualifies as an AI stock because its data center transceiver products are the optical interconnects that physically move data between GPU clusters inside AI training and inference facilities. Every large-scale AI workload requires high-bandwidth optical links to function, and AAOI is one of a small number of pure-play vendors serving that supply chain with vertically integrated laser and module manufacturing.

What does “AAOI USDT” mean in crypto trading?

AAOI USDT refers to a USDT-margined perpetual futures contract on AAOI’s share price, using the same naming convention as crypto pairs such as BTCUSDT or ETHUSDT. It means a trader would go long or short on AAOI’s price using Tether (USDT, the largest stablecoin by trading volume) as collateral, without a traditional brokerage account. As of May 2026, AAOI is not listed as a perpetual contract on major crypto derivatives platforms.

Is Applied Optoelectronics currently profitable?

Not yet on a GAAP basis. AAOI reported a GAAP net loss of $14.3 million in Q1 2026 and a non-GAAP net loss of $4.9 million, while investing heavily in manufacturing capacity expansion in Texas and in research and development for 800G and 1.6T product lines. On a non-GAAP adjusted EBITDA basis the company was close to breakeven in Q1 2026, at a positive $966,000, and Q2 2026 guidance includes the possibility of non-GAAP net income, though GAAP profitability is a later milestone tied to the revenue ramp and margin expansion trajectory.

What is AAOI’s full-year 2026 revenue guidance?

Applied Optoelectronics raised its full-year 2026 revenue guidance to approximately $1.1 billion at the Needham Technology, Media and Consumer Conference on May 13, 2026. Q2 2026 guidance is $180 million to $198 million in revenue with non-GAAP gross margin of 29% to 30%. Management expects sequential revenue growth throughout the year, with the largest acceleration in the second half as 800G volumes ramp with a second hyperscale customer and 1.6T deliveries begin in late Q3.

How can crypto traders get USDT-margined exposure to AI infrastructure themes?

Because AAOI-specific perpetual contracts are not currently listed on crypto exchanges, the most direct available route is USDT-margined Nasdaq 100 (NDX) index futures, which carry heavy weighting toward Amazon, NVIDIA, Microsoft, Alphabet, and Meta, all significant buyers or operators of AI data center infrastructure that drives demand for optical transceivers. Some platforms, including Phemex, offer USDT-margined NDX futures that can sit alongside crypto perpetuals in a single margin account, allowing cross-asset AI infrastructure positioning without fiat onboarding.

AAOI’s next quarterly earnings are scheduled for July 30, 2026. By that date, the back-half ramp driving the $1.1 billion annual target should show up as sequential revenue acceleration and improving gross margins. If both arrive together, the operating data will have caught up with the valuation. If margins compress while revenue grows, the market will want a new answer on profitability timing before extending the multiple further.

Disclaimer: This article is for informational purposes only and does not constitute investment advice or a recommendation to buy, sell, or hold any security or financial instrument. Trading equities, derivatives, and leveraged products involves substantial risk of loss and may not be suitable for all investors. Readers should conduct independent research and consult a qualified financial professional before making any investment decisions. All figures cited are sourced from company filings and industry research accurate as of the publication date and are subject to change.

Claude AI Models Push Investors Past the Chip Trade

Claude AI models are moving the artificial intelligence (AI) investment debate from chatbot rankings toward security, finance workflow agents, cloud capacity and the hardware needed to run long tasks at scale, after Anthropic said its restricted Mythos Preview system helped partners find more than ten thousand serious vulnerabilities through Project Glasswing and Gartner raised its AI spending outlook.

Portfolio managers can stop asking whether AI remains a theme and start asking which parts of the stack collect the next dollar. The new evidence points to a wider trade: security teams, market-data owners, cloud platforms, memory suppliers, consulting firms and workflow software all sit closer to the revenue line than they did when the contest centered on raw model scores.

The Claude Signal Is No Longer a Chatbot Benchmark

Model news used to trade like a scoreboard. A lab posted a higher benchmark, investors rewarded the chip names, and software vendors promised to add a smarter assistant. That loop still exists, but the AI trade is broadening as models move from answering questions into doing multi-step work inside codebases, spreadsheets and security programs.

The cleanest public marker is the Claude Opus 4.7 release. Anthropic said the model improved on advanced software engineering, long-running coding tasks, vision and enterprise document analysis, while keeping API pricing at $5 per million input tokens and $25 per million output tokens. Application programming interface (API, the developer gateway that lets software call a model) pricing matters because usage can become a recurring cost line instead of a one-off experiment.
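How per-token pricing turns into a recurring cost line can be sketched with the two published rates. The workload below (run counts and token volumes) is entirely hypothetical and chosen only to show the arithmetic; the function name is illustrative.

```python
# Published rates cited in the article, in USD per million tokens.
INPUT_USD_PER_M = 5.0
OUTPUT_USD_PER_M = 25.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one model call at the cited per-million-token rates."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


# Hypothetical workload: 2,000 agent runs a month,
# each consuming 150k input tokens and producing 20k output tokens.
runs_per_month = 2_000
per_run = request_cost_usd(150_000, 20_000)
monthly = runs_per_month * per_run

print(f"Cost per run:  ${per_run:.2f}")
print(f"Monthly spend: ${monthly:,.2f}")
```

In this made-up scenario each run costs $1.25 and the monthly bill is $2,500, the kind of line item that scales directly with usage rather than with seat licenses.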

The stronger signal may be the model that most customers cannot use. Mythos Preview remains limited because of its computer security capability. That creates a strange investment setup: the restricted system shows what future frontier models may do, while the released system tests whether guardrails, cloud access and paid workflows can turn that power into revenue without blowing up risk budgets.

The Investment Map Has Four New Columns

The usual AI screen starts with chips, cloud and the model lab. This cycle adds workflow packaging, security response and distribution control. That matters because a portfolio built only around training clusters can miss the companies that profit when AI work becomes audited, scheduled and tied to real data.

| Signal From the Latest Cycle | Business Asset | Likely Public-Market Exposure | Weak Point |
| --- | --- | --- | --- |
| Long-running coding and document reasoning | Developer hours, app modernization and office automation | Cloud platforms, developer tools and IT services | Review cost and accuracy controls |
| Restricted cyber model | Vulnerability discovery backlog | Cybersecurity vendors, cloud security teams and cyber insurers | Patch triage capacity |
| Finance agent templates | Packaged analyst workflows | Market-data firms, consulting partners and workflow software | Audit trails and bad source data |
| Multi-cloud compute contracts | Capacity as distribution | Accelerators, memory, networking, power and data centers | Utilization risk if demand disappoints |

That table is why a new model release can move more than one basket. If the model performs, cloud usage rises. If the model performs in finance, data providers get pulled into the workflow. If it performs in security, the industry inherits a patching problem. Each outcome sends dollars to a different vendor class.

Security Turns From Cost Center to Capacity Problem

In cybersecurity, stronger models create two tradable outcomes at once. They make protection spending more urgent, and they threaten parts of the manual testing market by finding flaws faster than human teams can verify them. Anthropic says the bottleneck in Project Glasswing has shifted from finding bugs to verifying, disclosing and patching them.

  • Approximately 50 partners are working with the restricted system through Project Glasswing.
  • 6,202 of the 23,019 vulnerability estimates from the company’s scan of open-source projects were rated high or critical.
  • A 90.6 percent true positive rate was reported among 1,752 assessed findings, with 62.4 percent confirmed as high or critical severity.

The number that should worry security chiefs is smaller: 75 of 530 high or critical disclosed bugs had been patched when the company published the update. That gap is the market. Tools that verify reports, prioritize exposure, route patches and prove remediation become more valuable when discovery volume jumps.
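The size of that gap is easier to judge as ratios. The sketch below only restates the figures quoted above as shares. The variable names are illustrative, and the true-positive count is an approximation derived from the reported rate, not a number the company published.

```python
# Figures quoted in the article from the Project Glasswing update.
high_or_critical_estimates = 6_202
total_estimates = 23_019
assessed_findings = 1_752
true_positive_rate = 0.906
disclosed_high_or_critical = 530
patched = 75


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction between 0 and 1."""
    return part / whole


severe_share = share(high_or_critical_estimates, total_estimates)
patched_share = share(patched, disclosed_high_or_critical)
unpatched = disclosed_high_or_critical - patched
# Derived estimate only: rate multiplied by assessed findings, rounded.
approx_true_positives = round(assessed_findings * true_positive_rate)

print(f"High or critical share of estimates: {severe_share:.1%}")
print(f"Patched share of disclosed high or critical bugs: {patched_share:.1%}")
print(f"Still unpatched at publication: {unpatched}")
print(f"Approximate true positives among assessed findings: {approx_true_positives}")
```

On these numbers, roughly one in seven disclosed high or critical bugs had a patch at publication, which is the backlog that verification and remediation tooling would be selling into.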

There is a catch. A flood of AI-generated bug reports can bury maintainers and security teams. Investors should be careful with any vendor promising instant protection from model-driven vulnerability discovery. The durable spending may sit with firms that reduce alert noise and document the chain from finding to fix.

Finance Agents Put the Model in Analyst Workflow

The finance release is the clearest product signal for investors because it turns model capability into named jobs. Anthropic introduced finance agent templates for banking and investing work, saying each package combines instructions, governed data connectors and subagents that handle parts of the task such as comparables selection or methodology checks.

The company listed ten ready-to-run finance agents, and said the updates pair best with Opus 4.7, which led Vals AI’s Finance Agent benchmark at 64.37 percent. The agents can run as plugins in Cowork or Code, or as cookbooks for Managed Agents. Claude also works across Excel, PowerPoint and Word, with Outlook listed as coming soon.

  • Pitch builder creates target lists, runs comparables and drafts meeting books.
  • Earnings reviewer reads transcripts and filings, then flags thesis changes.
  • Model builder creates and maintains financial models from filings and data feeds.
  • Valuation reviewer checks methods against comparables and firm standards.
  • Month-end closer runs close checklists, prepares journal entries and produces reports.

For public markets, the important point is distribution. A general chatbot lives in an innovation budget. A finance agent embedded in office software, market data and approval flows can land in operating budgets. That gives data owners and consulting partners a stronger claim on AI spending than a thin wrapper around a model.

Compute Commitments Give the Trade Its Toll Roads

The second-order trade needs power, chips and reserved cloud capacity. The model lab captures attention, but the toll-road assets collect rent whenever users call the model, run agents overnight or send documents through compliance review. That is why the infrastructure numbers attached to this release cycle matter.

Cloud Capacity

The Google and Broadcom compute agreement calls for multiple gigawatts of next-generation Tensor Processing Unit capacity starting in 2027. Tensor Processing Units (TPUs, Google’s custom chips for machine learning work) give Anthropic another route around graphics processing units (GPUs, chips used for parallel AI math) and help explain why the AI trade has spread into custom silicon and networking.

Amazon Web Services (AWS, Amazon’s cloud computing arm) has its own claim. Amazon’s latest infrastructure agreement includes a $5 billion investment now, up to an additional $20 billion tied to commercial milestones, a commitment by Anthropic to spend more than $100 billion on AWS technologies over ten years, and up to 5 gigawatts of capacity.

Hardware and Memory

The macro backdrop supports the same read. Gartner’s May AI spending forecast put worldwide AI spending at $2.59 trillion in 2026, up 47 percent from the prior year, with infrastructure accounting for more than 45 percent of the market. That makes capacity a central part of the investment case, not a back-office detail.

The Public-Market Read

IDC’s AI infrastructure tracker note put 2025 AI infrastructure spending at $318 billion and projected $487 billion in 2026. IDC also flagged power generation, grid capacity, memory scarcity and storage constraints as risks. Those constraints are investable, but they are also where margin surprises can appear if capacity arrives late or costs rise faster than usage.
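The two forecasts are easier to compare once the implied growth and base figures are worked out. This sketch derives them from the numbers quoted in the Gartner and IDC paragraphs. The derived values are back-of-the-envelope arithmetic, not figures published by either firm.

```python
def growth_rate(old: float, new: float) -> float:
    """Year-over-year growth as a fraction of the old value."""
    return (new - old) / old


def implied_base(current: float, growth: float) -> float:
    """Prior-year value implied by a current value and its growth rate."""
    return current / (1 + growth)


# IDC figures quoted in the article (USD billions).
idc_2025, idc_2026 = 318.0, 487.0
# Gartner figures quoted in the article (USD trillions, growth and share as fractions).
gartner_2026, gartner_growth, infra_share_floor = 2.59, 0.47, 0.45

idc_growth = growth_rate(idc_2025, idc_2026)
gartner_2025 = implied_base(gartner_2026, gartner_growth)
infra_floor = gartner_2026 * infra_share_floor

print(f"IDC implied infrastructure growth: {idc_growth:.1%}")
print(f"Gartner implied 2025 AI spending: ${gartner_2025:.2f} trillion")
print(f"Gartner infrastructure floor for 2026: ${infra_floor:.2f} trillion")
```

The two firms scope "AI infrastructure" differently, so the absolute figures are not directly comparable. The growth rates are the more useful cross-check.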

Public Equities Get a Broader Scorecard

The clean winners are no longer limited to one chip supplier or one cloud provider. Hyperscalers gain when model demand turns into reserved capacity, enterprise access and governance controls. Custom silicon suppliers gain if customers want alternatives to GPU scarcity. Memory and networking suppliers gain when inference demand compounds after each new model generation.

Security vendors face the most mixed setup. Better AI can create more demand for scanning, remediation and identity controls, but it can also expose products that depend on expensive manual review. A company that sells triage, proof and patch workflow may benefit more than one that merely adds AI wording to an old scanner.

Software incumbents have a tougher test. If agents live inside spreadsheets, email, documents and code editors, the owner of the workflow can keep the customer relationship. If the model layer pulls work into a separate interface, some traditional software seats lose daily importance. The market will likely reward firms that own data rights, review steps and approvals because those pieces are hard to replace with a prompt box.

The Risk Is Execution, Not Imagination

The bear case is practical. Models can be powerful and still fail in production if they need too much supervision, produce hard-to-audit work or force companies to redesign approvals before savings show up. Gartner noted that enterprises still favor tactical efficiency projects over disruptive change, which means adoption may arrive as a slow budget migration rather than a sudden rewrite of the office.

That is why finance and security are useful tests. Both areas have money, urgency and repetitive work. Both also punish errors. A wrong vulnerability report wastes scarce engineering hours. A wrong financial model can flow into a client deck, audit file or capital decision.

The investment question is therefore less glamorous than the model demo. Can vendors prove verification and control at scale, while the cloud providers deliver enough capacity at the right cost? The answer will decide whether this phase of AI spending lifts many stocks or only the companies closest to usage and governance.

If agent rollouts keep moving into audited workflows, the AI trade broadens from model labs to the companies that carry, secure and govern the work. If verification costs rise faster than productivity, the same releases will look less like a straight line and more like a capex cycle with a margin test.

Disclaimer: This article is for informational purposes only and does not provide investment advice. AI and technology securities carry market, valuation, execution and regulatory risks. Consult a qualified financial professional before making investment decisions. Figures are accurate as of publication.

ChatGPT Tip Helped Drive a Georgia Child Abuse Life Sentence

The ChatGPT tip in the Corey Hickey case shows how an AI upload can become a child safety report: prosecutors say the tool flagged at least two illegal images, sent a CyberTipline report to the National Center for Missing and Exploited Children, and led Georgia investigators to a phone with more evidence.

On May 21, 2026, a Putnam County jury convicted Hickey, 38, on 31 counts. T. Wright Barksdale III, district attorney for Georgia’s Ocmulgee Judicial Circuit, said the judge imposed two life sentences without parole, three life sentences with parole eligibility and 220 years in prison to run consecutively.

The Tip That Started a Criminal Case

The public record begins with a cyber tip. The Georgia Bureau of Investigation (GBI, the state agency that handled the cybercrime inquiry) said it began investigating Hickey’s online activity in October 2025 after receiving a report from the center about possible production, possession and distribution of child sexual abuse material (CSAM, illegal material depicting child sexual exploitation). The Georgia Bureau of Investigation arrest notice says Hickey was charged with three counts of sexual exploitation of children and booked into the Putnam County Jail.

Prosecutors later said the case widened after agents confirmed the images were taken in Putnam County and seized Hickey’s phone. A forensic review of that device uncovered numerous videos, leading to additional warrants for rape, aggravated assault and child molestation. The original online alert became the doorway to a local evidence case.

The indictment shows the shift. Prosecutors said jurors convicted Hickey on three counts of rape of a child under 10, four counts of aggravated child molestation, three counts of child molestation and 21 counts of sexual exploitation of children. That mix matters because it ties the online upload to alleged in-person abuse, device evidence and a full trial record.

The AI Report Chain Has Four Hands

The technology piece can sound instant: upload, flag, report, arrest. The working chain is slower and more human. The case moved through a four-hand relay involving a company safety system, a national clearinghouse, state investigators and local prosecutors.

The CyberTipline report process at NCMEC says staff review tips and work to find a potential location so reports can be made available to the proper law enforcement agency. That routing function is why the same pipeline can serve a social network, a cloud service, a messaging app or an AI product.

| Actor | Role in a CyberTip Case | Publicly Confirmed in This Case |
| --- | --- | --- |
| OpenAI | Detects and reports prohibited uploads or requests involving CSAM. | Prosecutors said at least two images uploaded to the AI tool were flagged and sent onward. |
| NCMEC CyberTipline | Reviews reports and routes them to the law enforcement agency best placed to respond. | GBI said its inquiry began after a CyberTipline report from the center. |
| GBI CEACC Unit | Investigates online child exploitation leads with local partners. | GBI said its Child Exploitation and Computer Crimes Unit opened the inquiry and made the arrest. |
| Ocmulgee Judicial Circuit DA | Turns investigative findings into charges and trial proof. | Prosecutors said the jury returned guilty verdicts across the full indictment. |

That division of labor is easy to miss. The company can detect and report. The center can triage and route. Investigators still need warrants, devices, interviews and forensic work before a prosecutor can ask a jury for a conviction.

The Scale Behind One Tip

One Georgia prosecution landed inside a reporting system that now handles industrial-scale volume. The latest CyberTipline data from NCMEC says the service received 21.3 million reports in 2025, and provider reports included 61.8 million images, videos and other files.

  • 21.3 million reports reached the CyberTipline in 2025.
  • 61.8 million files were attached to electronic service provider reports.
  • 107,817 reports were submitted by OpenAI to the center from July to December 2025, according to OpenAI child safety reporting totals.

The data also shows why generative AI has become a child safety category of its own. NCMEC said 2025 reports included 1.5 million with a generative AI nexus, though it noted that more than 1.1 million came from Amazon AI Services detections tied to potential CSAM inside training datasets and did not include actionable offender or victim information.

That caveat is important. A report can describe a known image, a new upload, a cloud storage hit, an attempted generation or a training-data detection. Law enforcement value depends on what comes with the alert: account details, timestamps, location signals, preserved files and enough context to identify a suspect or protect a child.

Detection Happens Before a Courtroom

The case answers a common tech question in blunt terms: uploads to AI services are subject to safety checks when they involve suspected child exploitation. OpenAI says its rules bar users from exploiting, endangering or sexualizing anyone under 18, and that its systems monitor for child safety violations.

“Our Child Safety Team reports all instances of CSAM, including uploads and requests, to NCMEC.”

That line appears in OpenAI’s child exploitation safety post, which also says the company uses hash matching for known material and Thorn’s CSAM classifier for potentially novel material. Thorn is a child safety technology nonprofit whose tools are used by platforms trying to identify abuse material at scale.
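The two-stage idea (match known material by hash, send everything else to a classifier) can be sketched generically. This is not OpenAI’s or Thorn’s implementation. The exact SHA-256 digest, the `classifier` callable, the 0.9 threshold and the outcome labels are all assumptions chosen for illustration, and real deployments may use different hashing and scoring methods.

```python
import hashlib
from typing import Callable, Set


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def triage(
    data: bytes,
    known_hashes: Set[str],
    classifier: Callable[[bytes], float],
    threshold: float = 0.9,  # hypothetical cut-off, not a documented value
) -> str:
    """Route a file: hash lookup first, classifier score second."""
    if sha256_hex(data) in known_hashes:
        return "known_match"  # previously catalogued material
    if classifier(data) >= threshold:
        return "needs_human_review"  # potentially novel material
    return "no_action"


# Placeholder bytes and a stub scorer stand in for real files and a real model.
known = {sha256_hex(b"previously catalogued file")}
stub_classifier = lambda data: 0.95 if data.startswith(b"flag") else 0.1

print(triage(b"previously catalogued file", known, stub_classifier))
print(triage(b"flagged by stub", known, stub_classifier))
print(triage(b"ordinary upload", known, stub_classifier))
```

The ordering reflects the division the post describes: a hash lookup is cheap and only catches files already on a list, while the classifier stage exists for material no list has seen yet.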

AI-specific abuse adds another layer. The same service can receive an old illegal image, a newly produced file, an attempt to generate abusive content or a request asking the model to describe an uploaded file. Detection has to sort those cases quickly, then hand the highest-risk ones to a human process that can preserve evidence and escalate urgent threats.

Prosecutors Still Needed Local Evidence

A CyberTipline report can start the knock on the door, but prosecutors still have to prove crimes under state law. In Hickey’s case, the proof described by prosecutors moved from the AI upload to physical jurisdiction, a seized phone and a broader digital review.

  • Location: investigators confirmed the images tied to the tip were taken in Putnam County.
  • Device evidence: Hickey’s phone was seized after his arrest and reviewed forensically.
  • Cloud evidence: prosecutors said more than 400 additional images were found on his Google Drive.
  • Charging decision: the case grew from three initial sexual exploitation charges into a 31-count indictment.

Prosecutors said law enforcement found over 4,800 images and videos of child abuse material on the phone. That volume did not replace the jury’s job. It gave prosecutors a body of local evidence that could be tested in court, tied to a device and placed beside the testimony and forensic work needed for conviction.

The Google Drive detail also shows why these cases rarely stay inside one product. A single tip can point to a handset, then to cloud storage, then to accounts, timestamps and other services. For investigators, the first alert is often a map fragment, not the whole map.

The Privacy Trade-Off Sits in the Same File

The safety win carries a privacy tension that deserves plain language. AI companies invite users to upload files, images and videos for help, but some categories of content trigger monitoring, account action and legal reporting. The boundary is drawn around suspected harm, yet the mechanism still depends on scanning user-provided material.

The company’s U.S. privacy policy for uploaded content says user content can include prompts and uploaded files, images, audio and video. It also says personal data may be used to comply with legal obligations and protect the rights, privacy, safety or property of users, the company or third parties.

That is the bargain every large platform now has to defend. Weak detection leaves children exposed and lets offenders use mainstream services as cover. Overbroad systems can create false alarms, chill lawful use and put sensitive material in front of reviewers. The hard standard is not just catching more. It is catching better, with secure handling, clear escalation and records a court can understand.

The Alert Is Only the First Rescue

The deepest fact in the Georgia case is that an online report surfaced alleged offline abuse. Prosecutors said the victim was a 7-year-old child known to Hickey, and the uploads gave investigators a path into a case that was larger than the two images that triggered the alert.

NCMEC says its staff review tips, seek a potential location and make reports available to appropriate law enforcement. The GBI said its inquiry was part of the Internet Crimes Against Children (ICAC, a U.S. Department of Justice-backed task force program) effort housed with the state agency’s child exploitation unit. In practice, that means child safety work moves across company systems, nonprofit analysts, state investigators and county courtrooms.

For readers who suspect online child exploitation, the practical route remains direct reporting through the CyberTipline or local law enforcement rather than confronting a suspect or circulating material. The Georgia case shows why: the evidence has to be preserved, routed and handled by people trained to protect both the child and the case.

A model can refuse an upload in milliseconds, but the child is protected only when the report reaches someone with a badge.
