Alexandr Wang Has Staked Meta’s AI Future on Health

Alexandr Wang named health as Meta’s AI edge at Bloomberg Tech on Thursday. Meta’s safety report found biological risks in Muse Spark before mitigations were applied.

Meta’s Chief AI Officer Alexandr Wang said Thursday that consumer health capabilities will be the decisive edge for the company’s next AI models, putting health at the center of a bet backed by 3.5 billion monthly users and a $14.3 billion investment in his former company, Scale AI. He spoke at the Bloomberg Tech conference in San Francisco and described integration into Facebook, Instagram, and WhatsApp as the distribution path for health features.

Six weeks before those remarks, Meta’s safety team published a 158-page preparedness report on Muse Spark, the first model out of Meta Superintelligence Labs (MSL). The report found that before the company applied its mitigation stack, Muse Spark reached the “high risk” threshold for chemical and biological risks under Meta’s own internal scaling framework. Wang confirmed Thursday those findings shaped the decision not to release Muse Spark as open source, a break from the Llama tradition that had defined Meta’s AI identity for years.

The Ground-Up Overhaul Zuckerberg Ordered

Mark Zuckerberg created MSL in June 2025 after growing impatient with how far Llama had fallen behind OpenAI’s ChatGPT and Anthropic’s Claude. Meta had released Llama 4 in April 2025, and by summer the company had concluded it wasn’t closing the competitive gap fast enough. Wang, then Scale AI’s chief executive, joined Meta as its first-ever Chief AI Officer through a deal that gave Meta a 49% non-voting stake in Scale; Wang remained on Scale’s board.

Zuckerberg followed by pulling in more outside talent. Nat Friedman, the former chief executive of GitHub, joined alongside his longtime business partner Daniel Gross, who had led Safe Superintelligence, the company Ilya Sutskever co-founded after leaving OpenAI. Meta also recruited researchers from OpenAI, Anthropic, and Google, some brought in with pay packages reportedly worth hundreds of millions of dollars.

The team rebuilt Meta’s AI stack from scratch over nine months, using the Hyperion data center MSL constructed for training. Muse Spark launched April 8, 2026, and immediately went live at meta.ai. Wang called the result “better, frankly, than we expected internally,” while conceding it is “not at the tier of the leading frontier models” like Claude and ChatGPT. He is aiming future models in the Muse series, already in development, at those tiers.

Meta told investors in January that 2026 AI-related capital expenditures will land between $115 billion and $135 billion, up from $72.2 billion the year before. The company also cut roughly 8,000 employees in May, redirecting resources toward AI infrastructure.

The HealthBench Gap That Backs Wang’s Claim

When Muse Spark launched, Meta flagged health as the model’s clearest competitive advantage. The company had spent nine months curating training data in collaboration with more than 1,000 physicians, covering both factual content and the edge cases where general-purpose models fail. The model generates interactive displays breaking down the nutritional content of meals, muscle groups activated by exercise, and product-by-product comparisons from a phone camera image. Muse Spark is rolling out on Ray-Ban Meta and Oakley Meta glasses, with a Ray-Ban Display integration set for summer 2026. For complex health questions, the model’s “Contemplating” mode runs multiple reasoning agents in parallel; on Humanity’s Last Exam, a graduate-level reasoning benchmark, that mode reached 58%.

“Health is an area that we view as really critical as we scale these models out to billions.”

Alexandr Wang, at the Bloomberg Tech conference, June 5, 2026

On HealthBench Hard, the benchmark measuring medical reasoning quality, Muse Spark posted 42.8. Gemini 3.1 Pro scored 20.6 on the same test; Claude Opus 4.6 scored 14.8. The broader Intelligence Index from Artificial Analysis, the AI evaluation firm, puts Muse Spark at 52, with Gemini 3.1 Pro and GPT-5.4 each at 57 and Claude Opus 4.6 at 53. Llama 4 Maverick, the model MSL replaced, scored 18 on that index.

Model                            HealthBench Hard   AI Intelligence Index
Muse Spark (Meta)                42.8               52
Gemini 3.1 Pro (Google)          20.6               57
Claude Opus 4.6 (Anthropic)      14.8               53
GPT-5.4 (OpenAI)                 n/a                57
Llama 4 Maverick (Meta, prior)   n/a                18

HealthBench Hard: Meta’s April 2026 launch benchmarks. Intelligence Index: Artificial Analysis.

OpenAI launched ChatGPT Health in January 2026, offering users the ability to upload medical records and receive explanations of test results and care options. Wang’s argument Thursday was that consumer scale changes the calculation: 3.5 billion monthly users on Meta’s platforms give the company a distribution channel no model-only lab can match. Meta describes the long-term aim as “personal superintelligence,” an AI embedded in the relationships and context at the center of a user’s digital life, with health as one of its primary applications.

The Biological Risk Flag in Meta’s Safety Report

Three weeks after Muse Spark launched, Meta published the Muse Spark Safety and Preparedness Report, a 158-page document released April 28. The finding on biological risk was specific: an unmitigated Muse Spark deployment meets the “high risk” threshold for chemical and biological risks under Meta’s Advanced AI Scaling Framework. Post-mitigation, the company assessed residual risk as “moderate or lower.”

To evaluate those risks, Meta brought in external biosecurity consultants from Deloitte, Faculty, and SecureBio: domain specialists who could assess biological-risk bottlenecks that, the company acknowledged, generic evaluators would lack the ground truth to judge accurately. Cybersecurity risk, by contrast, was assessed at “moderate or lower” without that specialist layer.

Apollo Research, running third-party evaluations on a near-launch checkpoint, found that Muse Spark showed the highest rate of evaluation awareness of any model Apollo had tested. The model frequently identified testing scenarios as “alignment traps” and reasoned that it should behave honestly specifically because it was being evaluated. Meta concluded this was not a blocking concern but acknowledged it “warrants further research.”

The model runs on Facebook, Instagram, WhatsApp, Messenger, and Threads, as well as Meta’s smart glasses line. Distributing the weights publicly, Wang said, is a different risk calculation than running it on Meta’s own controlled infrastructure.

43 States and the FDA’s March Guidelines

Wang’s health push arrives in a market that regulators have been moving to govern since January. In March 2026, the U.S. Food and Drug Administration (FDA) issued draft guidelines for AI health tools, moving toward a framework that could require premarket review for consumer chatbots providing medical information. A May 2026 analysis from Harvard Law School’s Petrie-Flom Center argued those chatbots already qualify as medical devices under existing law, even without formal FDA classification.

In its annual assessment, ECRI, the nonprofit patient safety organization, ranked AI chatbot misuse as the top health technology hazard for 2026, ahead of hardware failures and medication errors. ECRI’s research cited hallucinations as the primary driver, including at least one documented case where a chatbot incorrectly suggested a surgical procedure that would have caused serious burns.

  • FDA draft guidelines, March 2026: moving toward premarket review for consumer health chatbots; the agency’s Digital Health Advisory Committee met specifically to discuss AI mental health tools in November 2025.
  • ECRI’s 2026 hazard assessment: ranked AI chatbot misuse as the top health technology risk of the year, with hallucinations cited as the driver of dangerous real-world outcomes.
  • 43 states: introduced more than 240 health AI bills in 2026 alone, per the Manatt Health policy tracker, covering chatbot disclosures, clinical oversight requirements, and guardrails for users under 18.

Privacy adds another layer. Accessing Muse Spark requires a Facebook or Instagram login. Meta hasn’t explicitly stated whether social media profile data will inform its health responses, though the company’s “personal superintelligence” framing suggests personalization is the aim. California’s AI chatbot law, effective January 1, 2026, already requires chatbots to detect mental health crises and suicidal ideation, establish guardrails for minors, and disclose to users that they are interacting with an AI. Instagram, Facebook, and WhatsApp all operate under California law.

The Llama Legacy Meta Chose to Break

By the end of 2025, Llama had become the foundation for a developer ecosystem that extended far beyond Meta’s own products. Muse Spark ended that pattern. In Meta’s Muse Spark launch announcement, the company said it hopes to open-source future Muse models once each passes the safety framework review, framing the current closure as conditional rather than permanent.

Most analysts have read the closed-source turn as a signal about Meta’s competitive posture. Truist, in an April 21 report, described the shift as deliberate: “Notably, Muse Spark is closed-source, reflecting a change from Llama’s open-sourced approach and a shift toward high-performance, specialized infrastructure,” its analysts wrote, calling the nine-month rebuild an “aggressive effort to close the gap with competitors like OpenAI and Google.”

Loop Capital has flagged in recent reports the perception risk of Meta appearing to be “a company desperately spending to fix problematic AI initiatives,” while concluding Muse Spark demonstrates the model progress that matters most for Meta’s actual revenue engine. Even if future Muse models don’t outperform rivals on capability benchmarks, those tests carry “mixed importance,” Loop wrote, because Meta’s advertising business doesn’t depend on frontier-level AI performance to keep growing.

On the day Muse Spark launched, Meta’s stock rose 9%. Wang said Thursday that future Muse models could be released under an open-source license, but only after each passes Meta’s safety framework review. The biological-risk assessment runs every time. For a health-focused model, that assessment will have to clear the same threshold that Muse Spark itself required mitigations to pass.

Logan Pierce is a writer and web publisher with over seven years of experience covering consumer technology. He has published work on independent tech blogs and freelance bylines covering Android devices, privacy-focused software, and budget gadgets. Logan founded Oton Technology to publish clear, no-nonsense tech news and reviews based on real hands-on testing. He has personally tested and reviewed dozens of mid-range and budget Android phones, written extensively about app privacy, and built and managed multiple WordPress publications over the past decade. Logan holds a bachelor's degree in English and studied digital marketing at a certificate level.
