Blog | MachinaLearning

Mythos Goes Dark, MCP Hits 97M, EU Omnibus Lands: Saturday Briefing, May 2, 2026

Filtering the latest cycle to the items that actually move builder, enterprise, or policy planning leaves a short list. Anthropic shipped Claude Opus 4.7 across every major cloud and quietly seeded “Mythos” (Project Glasswing) to roughly 50 defensive-security partners, the first time a frontier model has been gated behind a security-only access program at this scale. Meta’s Superintelligence Labs shipped its first flagship, Muse Spark, alongside a $115–135B 2026 capex commitment. MCP crossed 97 million installs; Stanford’s 2026 AI Index puts agent success on real computer tasks at 66%, even as a CISO survey reports only 5% believe they could contain a compromised agent. The EU AI Act Omnibus trilogue closed on April 28 with new deadlines (Dec 2, 2027 and Aug 2, 2028). A neuro-symbolic VLA result reports ~100× training-energy reduction with accuracy gains, and Big Tech AI capex is tracking ~$700B in 2026.

97M

MCP installs (cumulative)

66%

agents on real computer tasks

5%

CISOs confident they could contain a rogue agent

~$700B

Big Tech 2026 AI capex

~100×

neuro-symbolic VLA training-energy cut

Frontier Cadence: Opus 4.7 GA, Mythos Goes Dark, Muse Spark Lands at a Fraction of Llama 4’s Compute

Anthropic shipped Claude Opus 4.7 across all Claude products, the API, AWS Bedrock, Google Vertex AI, and Microsoft Foundry simultaneously. The advertised gains are concentrated in long-running coding tasks and higher-resolution vision, with pricing held at $5/$25 per million input/output tokens. The release confirms that Anthropic’s cadence on the Opus line is now point releases on weeks-to-months — planning teams should treat “Opus is at version X” as a moving target rather than a fixed SKU.
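For budgeting purposes, the quoted list price converts to a per-request cost with simple arithmetic. The sketch below assumes only the $5/$25 per million input/output token rates stated above and ignores any caching, batch, or volume discounts. The helper name and the example token counts are illustrative and are not part of any Anthropic SDK.

```python
# Illustrative cost arithmetic only; rates are the list prices quoted in the text.
INPUT_USD_PER_MTOK = 5.0    # USD per million input tokens
OUTPUT_USD_PER_MTOK = 25.0  # USD per million output tokens


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for one request (hypothetical helper)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical long-running coding task: 200k tokens in, 20k tokens out.
    print(f"${request_cost_usd(200_000, 20_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, long-running tasks that generate a lot of code tend to be dominated by the output side of the bill.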

More unusual is what didn’t ship to the public. Anthropic’s Claude Mythos Preview (Project Glasswing), announced April 7, is a general-purpose model with unusually strong cybersecurity capability — and it was distributed only to ~50 enterprise partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan, Microsoft, NVIDIA, Palo Alto Networks, and others) for defensive security work. It is the first time a major lab has gated a frontier model behind a security-focused partner program at this scale, and it sets a template that other labs can copy when capability outruns the safety case for open access.

On the OpenAI side, GPT-5.5 is now live on Amazon Bedrock alongside Codex and Bedrock Managed Agents. GPT-6 (codename “Spud”), originally targeted for April 14, slipped, with Sam Altman calling it “a few weeks out”; provisional benchmarks suggest >40% gains over GPT-5.4 on coding, reasoning, and agent tasks (HumanEval ~95%, MATH ~85%). Meta’s Muse Spark, the first flagship out of Alexandr Wang’s Superintelligence Labs, posts competitive multimodal performance at a fraction of the compute cost of Meta’s older mid-size Llama 4 variant, paired with 2026 AI capex of $115–135B, roughly double 2025 spend. Google’s Gemma 4 (Apache 2.0) and the GLM-5.1 open family round out the cadence: a 27B dense + 26B-A4B MoE intelligence-per-parameter push from Google, and an MIT-licensed Chinese release that reportedly beats GPT-5.4 on coding benchmarks.

Why It Matters

The frontier is now arriving in two tiers: a public tier where Opus, GPT-5.5, Gemma, and Muse Spark fight on benchmarks, and a partner-only tier where capability is gated behind security review. Plan internal evals against capability slices, not SKUs — and assume that the next meaningful jump in cybersecurity-relevant capability may not arrive through the public API at all.

Agent Surface Goes Mainstream: MCP at 97M Installs, 66% on Real Tasks, Workspace Agents Default-On

Anthropic’s Model Context Protocol passed 97 million cumulative installs by March 2026. At that scale MCP is no longer an experimental standard — it functions as de facto agent-tooling infrastructure. The same week, Stanford’s 2026 AI Index reported agent success on real computer-use tasks jumping from 12% to 66% year over year. Agents now navigate software roughly as well as humans on benchmark tasks — the threshold result the agentic-software thesis has been waiting on since 2024.

The product surface caught up at the same time. OpenAI workspace agents in ChatGPT Business and Enterprise now handle tasks across Slack, Gmail, and other tools with reduced human-in-the-loop intervention. AWS expanded Amazon Connect into four agentic verticals (supply chain, hiring, customer experience, plus one more), and shipped the Amazon Quick desktop assistant at What’s Next with AWS 2026. On April 28, Anthropic shipped nine Claude Connectors across Adobe, Blender, Autodesk, Ableton, Splice, Affinity by Canva, SketchUp, Resolume Arena, and Resolume Wire — available across all Claude plans. NVIDIA’s Nemotron 3 Nano Omni ships an open multimodal model unifying vision, audio, and language, with claimed up to 9× efficiency for agent workloads.

Why It Matters

Forecasts put agents inside 40% of business apps by end of 2026, up from under 5% in 2025. The lock-in fight has shifted up the stack, from the model SKU to the agent wrapper, the tool catalog (MCP), and the connector ecosystem. Procurement decisions made now will outlast whichever foundation model is currently on top.

The Agent Governance Gap: 86% of CISOs Don’t Enforce Access Policies for AI Agents

A new CISO survey delivers the counter-signal. Of the CISOs surveyed, 86% do not enforce access policies for AI agents, and only 5% believe they could contain a compromised agent. Agents commonly hold admin-level access with minimal oversight: the growing governance crisis the Frontier Model Forum (FMF) has been warning about, now quantified. Gartner separately forecasts that >40% of agent projects will fail by end of 2027 on cost and security grounds, not capability.

The frontier labs responded operationally. OpenAI, Anthropic, and Google have formed an anti-distillation pact via the Frontier Model Forum, sharing intelligence to detect adversarial distillation by Chinese competitors — the first concrete operational outcome from FMF coordination. The Mythos partner program (above) is the same instinct in a different shape: keep the highest-capability cybersecurity work behind closed-distribution review, not the public API.

Why It Matters

The capability curve is racing ahead of the access-control curve. Re-baseline agent rollouts on the assumption that tool-level evals, sandboxes, and audit trails are now the rate-limiter. The first major incident traceable to the agent-access-governance gap is now a when, not an if.
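One way to picture the tool-level controls and audit trails mentioned above is a deny-by-default gate in front of every agent tool call. What follows is a minimal sketch under stated assumptions: the `ToolPolicy` class, the agent IDs, and the tool names are all hypothetical, and none of this corresponds to a vendor API or to MCP itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ToolPolicy:
    """Deny-by-default per-agent tool allowlist with an audit log (hypothetical)."""
    allowed: dict[str, set[str]] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, agent_id: str, tool: str) -> bool:
        # An agent may call a tool only if that tool is explicitly allowlisted.
        permitted = tool in self.allowed.get(agent_id, set())
        # Record every decision, allowed or denied, for later review.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "tool": tool,
            "allowed": permitted,
        })
        return permitted


if __name__ == "__main__":
    policy = ToolPolicy(allowed={"support-agent": {"read_ticket", "draft_reply"}})
    print(policy.authorize("support-agent", "read_ticket"))  # allowlisted tool
    print(policy.authorize("support-agent", "delete_user"))  # tool not allowlisted
    print(policy.authorize("unknown-agent", "read_ticket"))  # agent not registered
    print(len(policy.audit_log))                             # one entry per decision
```

The design choice worth noting is that denial is the default and logging is unconditional: an unregistered agent or an unlisted tool is refused without any extra rule, and the refused attempt still leaves a record.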

Capital & Compute: ~$700B Big-Tech Capex, SoftBank Plans a $100B Robotics IPO, Sora Shuts Down

Big Tech AI infrastructure spend is tracking ~$700B in 2026. Fortune frames the open question as “where does the buildout end” — relevant to power supply, semiconductor allocation, and macro AI-capex risk. Meta alone committed $115–135B, roughly double 2025. SoftBank is reportedly preparing to spin out and IPO an AI-and-robotics company (“Roze”) at up to a $100B valuation — one of the largest AI IPOs to date if it lands.

The cycle’s first prominent consumer failure also crossed the wire. OpenAI shut down the Sora app six months after launch: it hit >1M downloads in its opening week, then active users collapsed below 500K against a ~$15M/day compute burn. It is the first major consumer generative-video cancellation of 2026, and a useful counterweight to the “all consumer AI scales” narrative. On the upside, Meta’s business AI now facilitates ~10M conversations/week, up from 1M at the start of 2026: a 10× ramp in four months on a B2B surface.

Why It Matters

AI capex is still accelerating, but unit economics are starting to bite at the consumer edge. Watch the May earnings cycle for any wobble in capex guidance — and watch the Sora post-mortem for what it implies about the ~$15M/day compute floor for any consumer generative-video product.
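The unit-economics pressure can be made concrete with the two Sora figures reported above. This back-of-the-envelope sketch treats the sub-500K active-user figure as the user base carrying the ~$15M/day burn, which is an assumption, so the result is a rough floor rather than a reported number. The function name is illustrative.

```python
DAILY_COMPUTE_USD = 15_000_000  # ~$15M/day compute burn, as reported in the text
ACTIVE_USERS_UPPER = 500_000    # active users fell below this figure


def cost_per_user_per_day(daily_cost_usd: float, active_users: int) -> float:
    """Daily compute cost divided evenly across active users (hypothetical helper)."""
    return daily_cost_usd / active_users


if __name__ == "__main__":
    # Fewer than 500K users means the true per-user cost is at least this value.
    floor = cost_per_user_per_day(DAILY_COMPUTE_USD, ACTIVE_USERS_UPPER)
    print(f"at least ${floor:.0f} per active user per day")
    print(f"at least ${floor * 30:.0f} per active user per 30-day month")
```

Any subscription or ad revenue per user would have to be compared against that floor, which is why the post-mortem matters for other consumer generative-video products.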

Policy Compass: EU Omnibus Closes With Delays, ChatGPT May Land Under DSA, US Goes Federal-First

On April 28, the EU AI Act Omnibus trilogue completed with new deadlines: Dec 2, 2027 for standalone Annex III high-risk systems, and Aug 2, 2028 for AI embedded in regulated products — a material slip from the original schedule. Per Handelsblatt, the European Commission is also expected to designate ChatGPT under DSA-style platform rules “within days,” subjecting OpenAI to some of the bloc’s most stringent digital-services obligations. The two decisions push compliance work in opposite directions: more time on Act-driven obligations, less time on platform-rule obligations for the largest US providers.

On the US side, the White House National Policy Framework for AI (released March 20) recommends federal preemption of state AI laws “imposing undue burdens” — a direct collision course with state-level frameworks. California EO N-5-26 directs agencies to draft AI safety requirements for state vendors covering illegal content, bias, civil rights, and free speech. New York’s RAISE Act took effect March 19 — transparency, compliance, and reporting obligations on developers of large frontier models. Compliance posture now splits into three tracks: EU Act on a 2027–2028 calendar, EU platform rules on a near-term cadence, and a US federal-vs-state preemption fight that could redraw obligations multiple times before year-end.

Why It Matters

The headline-level read is “EU bought time, US is consolidating.” The operational read is the opposite: any provider large enough to fall under DSA designation is back on the near-term clock, and US compliance teams have to plan for two regimes (federal-preemption and state-patchwork) running in parallel for at least the next two quarters.

Research Edge: Neuro-Symbolic VLA Cuts Training Energy ~100×, Hafnium-Oxide Chip Cuts 70%

A neuro-symbolic system combining neural networks with symbolic reasoning reports 95% success vs. 34% for conventional baselines on robotic tasks, with training time of 34 minutes vs. >1.5 days — the headline ~100× energy reduction comes with an accuracy gain, not a tradeoff. If the result reproduces at scale, the cost curve for embodied and robotic agents bends downward, and the “train-bigger-spend-more” default stops being the obvious frontier strategy.
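The quoted training times imply a wall-clock ratio that can be checked directly. The sketch below uses only the two durations given above and takes 1.5 days as the lower bound for the baseline, so the computed speedup is a floor. It does not derive the ~100× energy figure, which also depends on hardware power draw that the text does not report. The function name is illustrative.

```python
BASELINE_MINUTES_LOWER = 1.5 * 24 * 60  # ">1.5 days" expressed as a lower bound
NEURO_SYMBOLIC_MINUTES = 34             # reported neuro-symbolic training time


def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Ratio of baseline training time to new training time (hypothetical helper)."""
    return baseline_minutes / new_minutes


if __name__ == "__main__":
    floor = speedup(BASELINE_MINUTES_LOWER, NEURO_SYMBOLIC_MINUTES)
    print(f"wall-clock speedup: at least {floor:.1f}x")
    # Success rates reported in the text: 95% vs. 34%.
    print(f"success-rate gain: {95 - 34} percentage points")
```

The time ratio alone lands below 100×, so the headline energy number presumably reflects more than shorter training runs; the source does not break that down.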

On the hardware side, a Cambridge hafnium-oxide neuromorphic chip (published in Science Advances) mimics simultaneous neuronal compute and storage, claiming up to 70% energy reduction. Google TurboQuant at ICLR 2026 attacks KV-cache memory overhead, a dominant inference-cost bottleneck for any provider serving long-context windows. Separately, AMI Labs raised >$1B for world-model research — a record initial round for a European AI company, and a structural bet that environment simulation is the next frontier alongside (or beyond) LLM-only architectures.
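To see why KV-cache memory is such a large inference cost at long context, it helps to write out the standard size estimate: two tensors (keys and values) per layer, each holding one vector per KV head per token. The sketch below is not TurboQuant and makes no claim about how that method works. It only shows how cache size scales with context length and with bits per stored value, and it ignores quantization metadata such as scales. The model dimensions are hypothetical.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bits_per_value: int, batch_size: int = 1) -> float:
    """Estimate KV-cache size in bytes for a decoder-only transformer."""
    # Factor of 2 covers the separate key and value tensors in every layer.
    values = 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size
    return values * bits_per_value / 8


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads, head dimension 128, 128k context.
    dims = dict(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=128_000)
    for bits in (16, 8, 4):
        gib = kv_cache_bytes(bits_per_value=bits, **dims) / 2**30
        print(f"{bits:>2}-bit cache: {gib:.2f} GiB per sequence")
```

The estimate is linear in both sequence length and bit width, so halving the bits per value halves the cache for every concurrent long-context request a provider serves.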

Why It Matters

Three independent lines — neuro-symbolic constraints, neuromorphic hardware, KV-cache quantization — are pulling on the same lever: inference and training economics. The companies that close the gap between “runs on frontier infra” and “runs profitably” first will set the next round of unit economics.

The Six-Item Synthesis

If only six takeaways carry from this batch into the next planning cycle:

Opus 4.7 GA across all clouds; Mythos gated to ~50 partners. Frontier capability is now arriving in two tiers — public and partner-only.
MCP at 97M installs, agents at 66% on real tasks. Agent infrastructure is no longer experimental — the lock-in fight is over the wrapper and the tool catalog.
86% of CISOs don’t enforce agent access policies; only 5% believe they could contain a compromised agent. Capability is racing ahead of access control, and tool-level evals and audit trails are the new rate-limiter.
Big-Tech capex ~$700B in 2026, SoftBank Roze targeting $100B IPO, Sora canceled. Capex is still accelerating; consumer-AI unit economics are the first pressure point to crack.
EU Omnibus closed: Dec 2027 / Aug 2028; ChatGPT may land under DSA “within days.” Compliance teams now manage three distinct regulatory clocks, not one.
Neuro-symbolic VLA ~100× energy reduction with accuracy gains. The cost curve underneath training and inference is starting to bend — revisit unit economics on every embodied or agent workload.

References

Anthropic: Introducing Claude Opus 4.7
Anthropic Red: Claude Mythos Preview (Project Glasswing)
TechCrunch: Anthropic launches Claude Design
testingcatalog: Claude Connectors for Creative Tools
Releasebot: OpenAI Release Notes (GPT-5.5, GPT-6 status)
CNBC: OpenAI Looms Over Hyperscaler Earnings
CNBC: Meta Debuts First Major Model Since the $14B Wang Deal (Muse Spark)
NVIDIA: Nemotron 3 Nano Omni
AWS: Top Announcements of What’s Next with AWS 2026
Google Cloud: AI Agent Trends 2026
Epsilla: AI Agent Infrastructure, April 2026 (MCP installs)
Datadog: State of AI Engineering
Fortune: Big Tech to spend $700B on AI in 2026
TechCrunch: Meta business AI hits 10M weekly conversations
ScienceDaily: AI breakthrough cuts energy 100× (neuro-symbolic VLA)
ScienceDaily: Brain-like chip cuts AI energy 70% (Cambridge hafnium-oxide)
Apple ML Research: ICLR 2026 (TurboQuant context)
Nature: “World models” rise (AMI Labs raise context)
IAPP: EU AI Act Omnibus: What Just Happened
Computing: EU may classify ChatGPT under platform rules
Consumer Finance Monitor: White House National AI Policy Framework
Cooley: State AI Laws: Where Are They Now
Eversheds Sutherland: Global AI regulatory update, April 2026
llm-stats: LLM News Today (May 2026)
Fazm: New LLM Releases, April 2026
AI Agent Store: Daily AI Agent News