The week the AI calendar collided. Apple's WWDC and Microsoft's Build landed on the same day Anthropic shipped Fable 5. OpenAI filed for an IPO, killed Sora, and walked away from Disney, all in one week. A quantum chip designed by AI dropped. The pace wasn't unusual; the synchronicity was.
Irv Cassio · Weekly Briefing · Coverage window June 6 to June 12
TL;DR · 90 seconds
Issue 16 in one paragraph
Anthropic shipped Claude Fable 5 on June 9, the first Mythos-class model cleared for general use, and the same day Apple's WWDC announced a Gemini-powered Siri rebuild. Microsoft Build unveiled OpenClaw (an open-source agent framework), MXC security containers, the Scout assistant, and seven new MAI models, plus the Majorana 2 quantum chip that AI helped design. OpenAI filed for an IPO at a valuation of about $852B, killed Sora, and walked away from a $1B licensing deal with Disney. Riley Brown cloned Lovable in 2 prompts using Fable 5. Cincy AI Week ran June 9–11 in Over-the-Rhine with a city proclamation and an AI Ready Ohio expansion. Project Glasswing's count of high- and critical-severity vulnerabilities is now 10,000+. Claude Dynamic Workflows orchestrates up to 1,000 subagents (it ported Bun's 750K-line codebase from Zig to Rust). GitHub Copilot switched to metered billing, and Karpathy demonstrated an AI agent running 700 autonomous experiments over two days for an 11% training speedup.
Fable 5 SWE-Bench Pro: 80.3%
Dynamic Workflows subagents: 1,000 parallel
Bun port size (Zig → Rust): 750K LOC
Mythos zero-days found: 10,000+
OpenAI IPO target: $852B / Sep–Q4
Cincy AI Week dates: Jun 9–11, OTR
Lovable cloned in: 2 prompts
Mag 7 YTD 2026 (approx): ~−2.3%
01 · Anthropic · All users · Developers
Claude Fable 5 lands — and the backlash started within hours.
On June 9, 2026, Anthropic released Claude Fable 5, the first model from its new Mythos-class tier — the level that now sits above Opus — cleared for general use. Alongside it came Claude Mythos 5, the same underlying model with safety classifiers lifted, restricted to a handful of cyber defenders via Project Glasswing.
The numbers are striking: 80.3% on SWE-Bench Pro vs. 69.2% for Opus 4.8, 58.6% for GPT-5.5, and 54.2% for Gemini 3.1 Pro. Pricing is $10/$50 per million tokens. Hard safety classifiers fall back to Opus 4.8 on cybersecurity, biology, chemistry, and distillation requests — Anthropic says this happens in fewer than 5% of sessions.
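To make the pricing concrete, here is a minimal cost sketch. It assumes the quoted $10/$50 means $10 per million input tokens and $50 per million output tokens, which the text does not spell out; the token counts in the example are hypothetical.

```python
# Prices quoted in the text: $10 / $50 per million tokens.
# Assumption: the first figure is input, the second is output.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def session_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one session in US dollars."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical long agent run: 2M tokens read, 200K tokens written.
example = session_cost_usd(2_000_000, 200_000)
print(f"${example:.2f}")  # 2 * $10 + 0.2 * $50 = $30.00
```

Under that assumption, output tokens cost five times as much as input tokens, so long-running agents that write a lot are where the bill grows fastest.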
The headline capability isn't a benchmark — it's persistence. Fable 5 holds focus across millions of tokens and improves its own work using notes it keeps along the way. Anthropic's most-shared demo: with persistent file-based memory while playing the deck-builder Slay the Spire, Fable's performance improved 3x more than Opus 4.8, and it reached the final act three times as often.
But the backlash started within hours. A widely-shared r/singularity thread argued the celebrated demos — counting letters, spatial puzzles, commute optimization — fix bounded benchmarked failures rather than addressing deep reasoning limits. The Register documented Fable 5 refusing innocuous prompts at higher rates than Opus 4.8 — the price of those safety classifiers showing up in everyday workflows.
Newsletter angle
The first "too dangerous to ship" model just shipped — and the community is already arguing about whether it actually reasons.
Meeting discussion
Is the r/singularity critique right — are we mistaking benchmark wins on bounded failures for real reasoning progress?
Has anyone hit the safety-classifier fallback in practice? On what kind of prompt?
On June 9, 2026 — the same day as Fable 5 — Apple opened WWDC 2026 with the announcement that defines this fall: Siri AI, a ground-up rebuild of Apple's voice assistant.
The new Siri holds real back-and-forth conversations, pulls context from emails, messages, and photos, answers live questions from the web, and takes action across apps. The twist that surprised even the Apple press corps: the new Siri runs on Google Gemini under the hood. Apple Intelligence handles the on-device privacy layer; Gemini handles the heavy reasoning.
The supporting cast was substantial. Spatial Reframe repositions photo subjects after the fact with generative fill. Safari got AI tab management. iOS 27 / macOS Golden Gate / watchOS 27 / iPadOS 27 / visionOS 27 all dropped together. Performance: 70% faster photo loading, 80% faster AirDrop, and improved CPU multitasking. All devices from iPhone 11 onward get the update.
The launch caveats matter: Siri AI is English-only beta initially, won't be available in China while Apple navigates regulatory requirements, and won't ship on iOS or iPadOS in the EU at launch — a Digital Markets Act standoff that's now the visible cost of EU AI regulation for the largest consumer-AI launch of the year.
Newsletter angle
Apple gave up on building its own AI brain — and put Gemini inside Siri.
Meeting discussion
Apple-using-Gemini for Siri is one of the biggest strategic concessions in recent memory. Does that change how you think about Apple as an AI company?
EU exclusion at launch — is this the new normal for cutting-edge AI features?
Microsoft Build 2026: OpenClaw, MXC, Scout, MAI — and the end of internal Claude Code.
Microsoft Build 2026 was a unified bet on agentic Windows. Three flagship announcements set the tone:
OpenClaw — an open-source agent framework for building, testing, and deploying autonomous agents directly on Windows. Microsoft is fully embracing it; its proactive AI assistant Scout (coming this summer to Copilot) is OpenClaw-powered, and Microsoft is contributing security guardrails back to the open project.
MXC (Microsoft Execution Containers) — a security isolation layer for agents, plus VBS Enclaves for agents extending virtualization-based security to protect agent memory and runtime state from the OS kernel. Windows is the first commercial OS with hardware-rooted container isolation for AI agents.
Surface hardware refresh with dedicated neural processors. OpenClaw and MXC enter public preview next month; GA with the Windows 11 2026 Update in October.
Microsoft also rolled out Microsoft IQ (unified intelligence layer — Work IQ for enterprise grounding, Web IQ for live-world data) GA across GitHub Copilot, Foundry, and Copilot Studio. Seven new MAI models dropped — MAI-Thinking-1 is a 35B-parameter reasoning model with a 256K context window. The native GitHub Copilot desktop app went into preview.
Which is the segue to the most awkward subplot of the week: Microsoft is cancelling Claude Code for most of its Experiences and Devices division on June 30, citing runaway token costs and security governance concerns. Engineers building Windows, M365, Outlook, Teams, and Surface get pushed to Copilot CLI. Claude Code had only been deployed inside Microsoft since December 2025 — six months in, "wildly popular" was enough to get it cancelled.
Newsletter angle
Microsoft built OpenClaw, MXC, and Scout — then quietly cancelled the AI tool its engineers actually wanted.
Meeting discussion
OpenClaw as open-source agent framework — does this become a real Anthropic Managed Agents competitor, or is it Microsoft-flavored only?
For Microsoft engineers losing Claude Code: is Copilot CLI close enough that the switch is painless, or will productivity dip?
Microsoft Majorana 2: AI-designed quantum chip, 1,000x better.
Slipped quietly into Build week: Microsoft unveiled Majorana 2, the second generation of its topological-qubit quantum chip. The headline material change is bigger than it sounds.
Where Google, IBM, and most of the industry build superconducting wires from aluminum, Microsoft's Majorana 2 uses lead — a larger atom whose quantum properties are easier to stabilize at scale. The chip and its materials were co-designed using AI tools that searched the material-properties space at speeds humans couldn't match. The result Microsoft is claiming: a 1,000-fold improvement in some performance dimensions of Majorana 2 over its predecessor.
The larger context: IBM committed $10 billion to global quantum expansion and is launching Anderon, the first pure-play quantum chip foundry in the US, backed by a $1B CHIPS Act incentive — targeting fault-tolerant quantum supercomputers by 2029. Microsoft says it'll have systems by 2029 too. QuiX Quantum installed a Feed-Forward Control Unit enabling real-time adaptive operations at 150-nanosecond latency for photonic systems.
This is the year quantum stops being a research curiosity and starts being capex. The interesting AI-adjacent question: if AI tools designed Majorana 2's material stack, the next generation of AI chips will be designed by AI too — and the loop tightens.
Newsletter angle
Microsoft just shipped a quantum chip that AI helped design — and that loop is about to close.
Meeting discussion
AI-assisted chip design is real now. What's the first AI-relevant workload you'd run on a 2029 quantum supercomputer?
IBM, Microsoft, QuiX, and Anderon all racing to 2029. Will this be the year quantum starts mattering, or is 2029 just another moving target?
OpenAI's pivot trifecta: IPO filing, Sora shutdown, Disney deal killed.
Three OpenAI moves in the same week tell one story: prepare for public markets, ruthlessly.
Between June 8 and June 10, OpenAI filed a confidential draft S-1 with the SEC at a post-money valuation of around $852 billion, with Goldman Sachs and Morgan Stanley managing the offering. The listing window opens as early as September 2026 and stretches into Q4. Some analysts are already modeling a $1T valuation by the close of bookbuilding.
Then, the same week, OpenAI shut down its viral Sora AI video app and terminated a $1B licensing deal with Disney, with an executive saying the company "cannot afford to be distracted by side quests" as it streamlines ahead of the IPO.
The financial picture explains the pivot: $25B annualized revenue, ChatGPT at 900M weekly active users, but losing $1.22 for every $1 earned in Q1 2026. Public-market investors will tolerate burn — they tolerated Amazon's for a decade — but they want a coherent story about which burn. Killing the consumer video bet to focus on enterprise (where the actual revenue is) is exactly the kind of clarity an S-1 needs. Anthropic, notably, filed its own confidential S-1 the same week — what's now being called a $3T AI IPO race.
Newsletter angle
OpenAI killed Sora and walked away from $1B with Disney — because the IPO needs the story to be cleaner.
Meeting discussion
If OpenAI lists at $852B losing $1.22 per dollar, what's the metric public markets will trade on — revenue growth, gross margin, or "narrative purity"?
Anthropic vs OpenAI on the same IPO calendar — who gets a better reception, and why?
Cincy AI Week — mayor's proclamation, AI Ready Ohio, and Fable 5 mid-conference.
While the global AI calendar collided this week, Cincy AI Week (June 9–11) ran simultaneously in Over-the-Rhine — sessions at Trancept, Union Hall, Alcove, and Sommerhaus instead of a convention center. Ohio's largest AI leadership conference, produced by the Enterprise Technology Association.
Councilmember Meeka Owens (on behalf of Mayor Aftab Pureval and City Council) proclaimed June 8th as AI Day in Cincinnati, bringing together city, county, state, and federal representatives. JobsOhio's Payal Thakur announced expansion of the AI Ready Ohio program in southwest Ohio, targeting upskilling and certification of 1,000 residents in practical AI competencies.
Day one's keynote: Ravit Dotan, PhD on Responsible AI. Day two: Dinesh Maheshwari (former CTO at Groq, now humanist entrepreneur) on AI for Business, in a Capital & Compute fireside with Tyler Mantel of Roll Tack Ventures. Day three: Katie Trauth Taylor, CEO of Narratize, on AI Leadership. Cintrifuse powered the AI Demo Day pitch competition.
Claude Ambassador Chris Thomas ran workshops and joined JobsOhio leadership roundtables — and Claude Fable 5 dropped publicly during the conference itself, making Cincinnati one of the first cities where attendees were actively building on it live. The Women in AI Breakfast returned at Union Hall. Awards Luncheon closed Thursday.
Newsletter angle
Mayor proclaims AI Day, AI Ready Ohio gets 1,000-resident expansion — and Fable 5 launches mid-conference.
Meeting discussion
Of the speakers and demos this week, which session shaped your team's thinking most?
"AI Ready Ohio" certifying 1,000 residents — what's the right curriculum? What would you teach if you had 8 hours per learner?
Dynamic Workflows: 1,000 subagents and a 750K-line Bun port.
Anthropic's Dynamic Workflows feature — first announced May 28 with Opus 4.8 — is now broadly available in Claude Code, the desktop app, and the VS Code extension for Max, Team, and Enterprise users. Also live on the API, Bedrock, Vertex AI, and Microsoft Foundry.
The mechanic: Claude takes one user prompt, builds a plan, divides the work, and orchestrates up to 1,000 subagents in parallel, with separate verifier agents checking findings before results merge.
The proof-of-concept Anthropic keeps citing is real: the Bun runtime port from Zig to Rust, ~750,000 lines of code with rigorous test-suite validation, was completed using Dynamic Workflows. Combine this with Claude Code's new 5-level nested sub-agents (an agent that spawns an agent that spawns an agent) and the fallbackModel setting (Fable 5 → Opus 4.8 → Sonnet 4.6, auto-degrade on overload) and you get a coordination layer that didn't exist three months ago.
The hard part of "agents at scale" was never spinning them up — it was orchestrating them, verifying them, and not setting fire to your token budget. Dynamic Workflows answers all three.
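The plan, fan-out, verify, merge pattern described above can be sketched in a few lines. This is an illustration of the general technique, not Anthropic's implementation: `run_subagent`, `verify`, and `orchestrate` are hypothetical stand-ins, and the concurrency cap of 8 substitutes for the 1,000-subagent ceiling.

```python
import asyncio
from typing import Optional

MAX_PARALLEL = 8  # stand-in for the 1,000-subagent ceiling in the text


async def run_subagent(task: str) -> dict:
    """Hypothetical worker: a real one would call a model with tools."""
    await asyncio.sleep(0)  # placeholder for model latency
    return {"task": task, "finding": f"done:{task}"}


async def verify(result: dict) -> bool:
    """Hypothetical verifier: checks a finding before it is merged."""
    await asyncio.sleep(0)
    return result["finding"] == f"done:{result['task']}"


async def orchestrate(tasks: list) -> list:
    """Fan out tasks, verify each result, and merge only verified ones."""
    limit = asyncio.Semaphore(MAX_PARALLEL)

    async def one(task: str) -> Optional[dict]:
        async with limit:  # cap concurrency so spend stays bounded
            result = await run_subagent(task)
            return result if await verify(result) else None

    results = await asyncio.gather(*(one(t) for t in tasks))
    return [r for r in results if r is not None]


merged = asyncio.run(orchestrate([f"file_{i}.zig" for i in range(20)]))
print(len(merged))
```

The semaphore is the simplest form of budget control: it bounds how many workers run at once, while the verifier gate keeps unverified findings out of the merged result.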
Newsletter angle
Bun's entire codebase ported from Zig to Rust — by 1,000 Claude agents working in parallel.
Meeting discussion
What's the largest task in your codebase you'd hand to Dynamic Workflows? What's the failure mode you fear most?
Demo idea: live-launch a 50-agent migration task and show the orchestration tree in claude agents --json.
Prompting Fable 5: external memory beats a bigger window.
Anthropic dropped an official prompting guide alongside the Fable 5 launch — and the single most important tactic is one nobody had to think about with previous models: stop treating it like a chat assistant.
Fable 5 is built for tasks that run for hours or days inside an agent harness. The single biggest mistake teams are making in the first 72 hours is feeding it conversational prompts with everything inline. Here are the five rules worth memorizing right now:
Give it tools, not instructions. The ability to run tests, read files, and search lets Fable 5 verify itself. Step-by-step procedures are wasted on it.
State the goal and success criteria, then leave room to plan. Concise goal-framing > prescriptive process.
Use external memory — not a bigger window. Persist plans, decisions, findings to a store the model reads from. The 3x Slay the Spire improvement was specifically with file-based memory; without it, Fable performs more like Opus.
Store lessons, not chat history. Compress finished workstreams into short summaries the next stage can retrieve.
Retrieve, don't recall. Pull in the slice of context the current step actually needs. Stuffing the prompt is the new context-bleeding.
This is a different relationship with the model — closer to "delegate a project" than "answer a question." Worth a 15-minute team walkthrough.
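The three memory rules (use external memory, store lessons, retrieve instead of recall) can be sketched as a small file-backed store. This is a minimal illustration under assumptions: the `LessonStore` class, the JSON Lines layout, and the topic tags are hypothetical, not part of Anthropic's guide.

```python
import json
import tempfile
from pathlib import Path


class LessonStore:
    """Append-only JSON Lines file of short lessons, one per line."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def add(self, topic: str, lesson: str) -> None:
        """Persist a compressed summary, not the full chat history."""
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"topic": topic, "lesson": lesson}) + "\n")

    def retrieve(self, topic: str, limit: int = 3) -> list:
        """Return only the most recent lessons for the current step."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            rows = [json.loads(line) for line in f if line.strip()]
        matches = [r["lesson"] for r in rows if r["topic"] == topic]
        return matches[-limit:]


store = LessonStore(Path(tempfile.mkdtemp()) / "lessons.jsonl")
store.add("auth", "Token refresh must run before the retry loop.")
store.add("build", "Run unit tests before integration tests.")
store.add("auth", "Session cookies expire after 30 minutes.")

# Only the 'auth' slice goes into the next prompt.
context = store.retrieve("auth")
print(context)
```

A real harness would likely retrieve by semantic similarity instead of exact topic match, but the shape is the same: write short lessons as work finishes, and read back only the slice the current step needs.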
Newsletter angle
Five rules for Fable 5 the docs won't shout at you — and the one that 3x'd performance overnight.
Meeting discussion
Have you rewritten any standing prompts for Fable 5 already? Show one before/after.
Demo idea: side-by-side run of the same task with "all in prompt" vs. "tools + external memory" — measure tokens, time, quality.
Riley Brown cloned Lovable in two prompts — then said "Mythos is AGI."
Less than 24 hours after Fable 5 went live, AI builder Riley Brown posted a video on X showing he had cloned Lovable, the AI software-development platform, in just two prompts.
Prompt 1 instructed Fable 5 to copy Lovable's interface and core programming logic. Prompt 2: "make it build a Notion-style dark notes app." The clone outperformed the official Lovable on text-editing richness (titles, underlines, tables), real-time editing smoothness, one-click external browser preview, and overall UI fidelity. Riley later expanded to 5 prompts and added cloud sandbox, integrated database, and authentication.
Riley's framing — "Mythos is AGI" — set off an immediate community debate. A separate viral LinkedIn post from Benjamin Verbeek showed a similar pattern: prompt 1 builds a Spore-style 3D space-game in Lovable; prompt 2 — literally the word "multiplayer" — converts it to multiplayer with websockets, state sync, and conflict resolution. A commenter nailed why this matters: "'Make it multiplayer' is not a feature request, it is an architecture migration."
The skeptics' counter is that what's being cloned is a single notes app on top of an existing LLM — not cross-domain reasoning, not autonomous awareness. Both sides are partly right. What's undeniable: the gap between "prompt" and "working app" collapsed by an order of magnitude this week.
Newsletter angle
Lovable cloned in 2 prompts — and the clone is better than the original.
Meeting discussion
Is Riley's clone "AGI" — or is it just "Fable 5 is really good at React + Tailwind"? Where do you draw the line?
What's the next "obvious" app to clone in 2 prompts? What would actually be hard to clone, and why?
Project Glasswing: 10,000+ zero-days and a 27-year-old OpenBSD bug.
The week of Fable 5's launch, Anthropic's red team published a major update on Project Glasswing — the closed program where Claude Mythos Preview is shared with AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, NVIDIA, Palo Alto Networks, the Linux Foundation, and a few open-source maintainers.
The aggregate count is now 10,000+ high- or critical-severity vulnerabilities, including zero-days in every major operating system and every major web browser. The standout case: a 27-year-old vulnerability in OpenBSD — the OS famous for its security hardening — found by an AI model on what was effectively a first read.
What makes this discontinuous isn't discovery — it's exploitation. Mythos reproduced vulnerabilities and developed working exploits on the first attempt in over 83% of cases. Anthropic's stated reason Mythos remains restricted (and only the safety-classified Fable 5 ships): "We are not confident that everybody should have access right now."
That's a notable position from the lab that built it, and it reframes the AI safety debate from speculative ("what if AGI someday") to concrete ("we have a model that pops OpenBSD on the first try — what do we do with it?"). Expect this single dataset to drive policy conversations for the rest of 2026.
Newsletter angle
The model that pops OpenBSD on the first try — and why Anthropic refuses to ship it.
Meeting discussion
If 12 companies are sitting on a model that produces a working exploit on the first attempt in over 83% of cases, what's the half-life of the global software vulnerability backlog?
Should this kind of capability be regulated as a controlled export — like cryptography in the 90s?
Anthropic shipped two enterprise-shaped pieces alongside Fable 5: self-hosted sandboxes (public beta) and MCP tunnels (research preview).
The split is clever — the agent loop (orchestration, context, error recovery) stays on Anthropic's infrastructure, but tool execution moves to compute you control — your VPC, or a managed provider like Cloudflare, Daytona, Modal, or Vercel. Your network policies, audit logging, security tooling all apply, and files don't leave your perimeter.
MCP tunnels are the second half: a lightweight gateway you deploy makes a single outbound connection — no inbound firewall rules, no public endpoints — and your agents reach internal databases, APIs, knowledge bases, and ticketing systems as tools.
This drops into an MCP ecosystem that just crossed real escape velocity. ~97 million monthly SDK downloads, 10,000+ public servers indexed, native support across Claude, ChatGPT, Gemini, Copilot, and Cursor. Governance under the Linux Foundation (Agentic AI Foundation co-founded by Anthropic, Block, and OpenAI in December 2025). The newer development is MCP Apps, which lets tools return rich HTML interfaces in sandboxed iframes inside the chat — so an agent can produce a real interactive form or dashboard, not just text.
Newsletter angle
Anthropic's answer to "we can't send our code to your servers" — don't.
Meeting discussion
If you're at a regulated company, does the self-hosted sandbox + MCP tunnel combo finally unlock Claude for production? What's still missing?
Demo idea: stand up an MCP tunnel against a local Postgres and let Claude query it — show the network diagram with no inbound port.
The economics: Karpathy's auto-research loop + GitHub's metered billing.
Two stories from this week show the economics of agentic AI getting real.
First: Andrej Karpathy published an experiment where an AI agent autonomously conducted 700 experiments over two days to optimize neural-net training code, resulting in an 11% training speedup. Karpathy is calling the pattern "The Karpathy Loop" — autonomous-research-by-AI on AI itself. The implications are obvious: if AI can iterate on its own training pipeline overnight, the "AI improves AI" feedback loop is no longer theoretical. Multiple labs are reportedly running variations now.
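The loop itself is simple to sketch: propose a change, measure it, keep it only if the measurement improves. The toy below is not Karpathy's code. `training_time` is a made-up objective standing in for a real training run, and `propose` perturbs settings at random where a real agent would reason about the code.

```python
import random


def training_time(lr: float, batch: int) -> float:
    """Toy stand-in for a real training run; lower is better."""
    return (lr - 0.03) ** 2 * 1000 + (batch - 64) ** 2 / 100 + 10.0


def propose(config: dict, rng: random.Random) -> dict:
    """Hypothetical 'agent' step: perturb one setting at random."""
    new = dict(config)
    if rng.random() < 0.5:
        new["lr"] = max(1e-4, new["lr"] + rng.uniform(-0.01, 0.01))
    else:
        new["batch"] = max(1, new["batch"] + rng.choice([-8, 8]))
    return new


def research_loop(start: dict, experiments: int, seed: int = 0) -> tuple:
    """Run experiments one by one and keep only changes that help."""
    rng = random.Random(seed)
    best, best_time = start, training_time(**start)
    for _ in range(experiments):
        candidate = propose(best, rng)
        t = training_time(**candidate)
        if t < best_time:  # accept only measured improvements
            best, best_time = candidate, t
    return best, best_time


baseline = {"lr": 0.1, "batch": 32}
best, best_time = research_loop(baseline, experiments=700)
speedup = 1 - best_time / training_time(**baseline)
print(f"{speedup:.0%} faster than baseline")
```

Every iteration here costs one function call; in the real setting each one is a training run, which is why the pattern is tied so closely to the billing story that follows.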
Second: GitHub Copilot moved from request-based billing to usage-based metered billing on June 1. The mechanism is a virtual currency — GitHub AI Credits at $0.01 each — and the stated reason is "escalating inference costs from complex AI coding sessions made the previous unlimited subscription model unsustainable."
This is the same pressure that pushed Microsoft to cancel Claude Code internally, effective June 30. The era of all-you-can-eat AI subscriptions is closing, and the next 12 months will normalize per-token or per-credit pricing across every AI dev tool. Combined with the Karpathy Loop showing AI can usefully spend its own tokens on research, the question gets sharper: who pays for the agentic loop, and how do you cap it?
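One possible answer to "how do you cap it" is a per-task budget that refuses any spend past a limit. The sketch below uses the $0.01 credit price quoted above; the `CreditBudget` class and the credit amounts are hypothetical and do not reflect any GitHub API.

```python
CREDIT_USD = 0.01  # price per GitHub AI Credit, as quoted in the text


class BudgetExceeded(RuntimeError):
    """Raised when a task would spend past its cap."""


class CreditBudget:
    """Hypothetical per-task spending cap, tracked in credits."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_credits = round(cap_usd / CREDIT_USD)
        self.spent_credits = 0

    def charge(self, credits: int) -> None:
        """Record a spend, refusing it if it would break the cap."""
        if self.spent_credits + credits > self.cap_credits:
            raise BudgetExceeded(
                f"cap {self.cap_credits} credits, "
                f"spent {self.spent_credits}, requested {credits}"
            )
        self.spent_credits += credits

    @property
    def remaining_usd(self) -> float:
        return (self.cap_credits - self.spent_credits) * CREDIT_USD


budget = CreditBudget(cap_usd=5.00)  # 500 credits
budget.charge(120)
budget.charge(300)
try:
    budget.charge(100)  # would reach 520 credits, over the cap
except BudgetExceeded as err:
    print("stopped:", err)
```

Checking before spending, instead of after, is the design choice that matters: an agent loop that only audits its bill at the end has already paid it.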
Newsletter angle
AI agents can run 700 experiments in two days, right as your subscription becomes pay-per-token.
Meeting discussion
If your AI tool now meters per credit, does it change how you'd build a long-running agent workflow? What's the right per-task budget?
The Karpathy Loop: how long until labs run mostly AI-improving-AI cycles, with humans only setting goals?
⚙︎ Toolbelt · New weekly section · All users · Developers
The Toolbelt — what's worth installing this week.
A new recurring section: the MCPs, skills, plugins, and repos that crossed our desk this week and earned a spot in the actual workflow. Each tool is tagged for the agent harness it works with — Claude (Claude Code / Desktop), Codex (CLI & app), Qwen-local (Ollama / LM Studio), All (any MCP-capable client), and macOS (system-level).
Pro tip · from Karpathy + the Claude Code team
Ask your LLM to reply in HTML, not Markdown.
In early May, Andrej Karpathy posted a one-liner that's now made the rounds: "ask your LLM to structure its response as HTML, then view it in a browser." A day later, Thariq Shihipar (engineering lead on Claude Code at Anthropic) published "The Unreasonable Effectiveness of HTML" and said he'd stopped using markdown for AI outputs entirely.
The argument isn't aesthetic — it's information density. Markdown forces the model to approximate tables, diagrams, and side-by-side comparisons with ASCII. HTML gives it real tables, SVG charts, mockups, code diffs, and styled callouts. The model already knows HTML deeply (it's trained on it). The browser does the rendering work for free.
The progression Karpathy framed: raw text → markdown → HTML → interactive neural video. Each step trades a bit of efficiency for a lot of comprehension.
Try this prompt: "Answer in a self-contained HTML document with a <style> block. Use a table for any comparison, an SVG bar chart for any numeric distribution, and color-coded callout boxes for warnings and tips. Save it and open in a browser."
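For a sense of what such a response might look like, the sketch below builds a self-contained HTML page with a real table and an SVG bar chart, then saves it for viewing. The `render_report` helper and its styling are illustrative assumptions; the scores are the SWE-Bench Pro figures quoted earlier in this issue.

```python
import html
import tempfile
from pathlib import Path

# SWE-Bench Pro scores as quoted earlier in this issue.
SCORES = {"Fable 5": 80.3, "Opus 4.8": 69.2, "GPT-5.5": 58.6, "Gemini 3.1 Pro": 54.2}


def render_report(scores: dict) -> str:
    """Build a self-contained HTML page with a table and an SVG bar chart."""
    rows = "".join(
        f"<tr><td>{html.escape(name)}</td><td>{value:.1f}%</td></tr>"
        for name, value in scores.items()
    )
    bars = "".join(
        f'<rect x="0" y="{i * 30}" width="{value * 4:.0f}" height="20" />'
        for i, value in enumerate(scores.values())
    )
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        "<style>td{padding:4px 12px} rect{fill:#4a7}</style></head><body>"
        f"<table><tr><th>Model</th><th>Score</th></tr>{rows}</table>"
        f'<svg width="400" height="{len(scores) * 30}">{bars}</svg>'
        "</body></html>"
    )


out = Path(tempfile.mkdtemp()) / "report.html"
out.write_text(render_report(SCORES), encoding="utf-8")
print(out)  # open this path in a browser, e.g. webbrowser.open(out.as_uri())
```

The same four numbers in Markdown would be a plain table at best; here the browser draws the comparison with no extra tooling.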
Your daily-driver Mac has all the cookies, API keys, and CLI tokens. Your agent's Mac (the headless mini under the desk, or the second machine you let Claude Code drive overnight) doesn't. agentcookie watches Chrome's cookie store and a per-CLI secrets bus on your laptop, and ships the diff continuously to the agent Mac, encrypted end-to-end over your Tailscale tailnet. No cloud middleman. When the agent wakes up, it's already logged in.
Solves the most annoying problem in autonomous-agent setups: the agent stalls because GitHub, Claude.ai, or your SaaS dashboard just logged it out. It's harness-agnostic — works whether your agent runtime is Claude Code, OpenClaw, Hermes, Codex, or a custom Qwen wrapper.
Example
You go to bed. Your Mac mini runs Claude Code on a refactor that needs GitHub auth, Linear access, and a logged-in Vercel dashboard. At 2am your laptop's GitHub session auto-refreshes — agentcookie ships the new token to the mini within seconds. The agent never sees a 401 and never wakes you up.
Jesse Vincent's agentic skills framework — a methodology bundled as a plugin. Ships with composable skills for brainstorming (forces the model to refine the idea before writing code), planning, TDD, systematic debugging, and verification-before-completion (the model can't claim done without running the verify command and quoting the output). 752,000 installs as of June 1; the brainstorming skill alone is responsible for the most "wait, that's a better feature than what I asked for" moments people are reporting.
Works on Claude Code, Codex CLI, Codex App, Factory Droid, Gemini CLI, OpenCode, Cursor, and GitHub Copilot CLI. Qwen-local users can adapt it if their harness reads SKILL.md files.
Example
You say "add a dark mode toggle." Without superpowers, the model writes a toggle. With superpowers, the brainstorming skill fires first: "Are we toggling the OS preference, or a per-user override? Do we persist to localStorage or to your user record? Should the toggle live in settings or the topbar?" — five questions and a one-page design doc before any code.
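The verification-before-completion idea is easy to picture as a gate: nothing counts as done until a verify command exits cleanly, and its output is quoted back. The sketch below shows that general pattern only and is not the plugin's actual implementation; `verified_done` and both commands are hypothetical.

```python
import subprocess
import sys


def verified_done(verify_cmd: list) -> tuple:
    """Run the verify command and return (passed, quoted output)."""
    proc = subprocess.run(verify_cmd, capture_output=True, text=True)
    output = (proc.stdout + proc.stderr).strip()
    return proc.returncode == 0, output


# Hypothetical verify commands; a real project might run its test suite.
passing = [sys.executable, "-c", "print('3 passed')"]
failing = [sys.executable, "-c", "import sys; print('1 failed'); sys.exit(1)"]

for cmd in (passing, failing):
    ok, quoted = verified_done(cmd)
    status = "DONE" if ok else "NOT DONE"
    print(f"{status}: {quoted}")
```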
A research skill that searches a topic across Reddit, X, Bluesky, YouTube, TikTok, Instagram, Hacker News, Polymarket, and the open web — restricted to the last 30 days — then has a judge model score the results by what real humans actually engaged with and synthesize one grounded brief with citations. Reddit/HN/Polymarket/GitHub work zero-config; X/YouTube/TikTok unlock with a 30-second auth.
Best mental model: it's the inverse of WebSearch. WebSearch gives you ten links. last30days gives you the consensus narrative from the last month, with the receipts.
Example
Run /last30days "Claude Fable 5 vs GPT-5.5 in production" and you get a brief that quotes the actual Reddit threads where engineers are A/B-testing them, the X posts where Anthropic and OpenAI staff are sniping at each other, the Polymarket odds, and three Hacker News debates — instead of one cached Anthropic blog post.
One MCP endpoint that exposes 9,000+ apps (Slack, Gmail, Google Calendar, Notion, HubSpot, Stripe, Sheets, Salesforce, Linear, GitHub, Airtable, the list goes on) as callable tools. Plug it into Claude Desktop, Claude Code, Cursor, Codex, or any MCP-capable client — Qwen-local works if your harness supports remote MCP. No glue code, no per-app OAuth dance for each agent.
The killer use case isn't a single integration — it's composition. You don't have to write a script that touches Slack and Calendar and Sheets — the model does it in one sentence.
Example
Ask Claude: "Find every unread email from a customer this week, summarize each in 2 lines, drop the summary in our #cs-triage Slack channel, and add a row to the CS Triage Google Sheet with the customer's name and the urgency score." Zapier MCP handles all three apps. No code written.
The "stop hallucinating my framework's API" MCP. Context7 fetches live, version-pinned documentation for thousands of libraries — React, Next.js, Prisma, Tailwind, Django, Spring Boot, the cloud SDKs — and injects it into the model's context on demand. 348,660 installs as of June 1; the #1 fix for the "Claude wrote code against React 17 patterns even though we're on React 20" problem.
Works in any MCP-capable client. The Anthropic team includes it in the official Claude Code plugin marketplace; on Codex and Qwen-local you wire it as a remote MCP server.
Example
Tell the model "use Next.js App Router with the new after() API for the analytics callback." Without Context7, you might get the 2024 pages-router answer. With Context7, the model fetches the current Next.js 16 docs first and writes against the actual API as it shipped two weeks ago.
Microsoft's official browser-automation MCP — ~30,000 GitHub stars by mid-2026, second-most-popular MCP in the ecosystem behind GitHub MCP. Gives any model a real, scriptable browser: click, type, screenshot, scrape, fill forms, navigate multi-page flows. Works anywhere MCP works (Claude, Codex, Qwen-local via a compatible harness).
This is the tool that finally makes "have the agent buy the thing" or "have the agent fill out the form on the gnarly internal portal" reliable, because it's a real browser instead of a fragile HTTP scraper.
Example
"Log into our staging env, navigate to /admin/users, find the user with email foo@bar.com, screenshot their profile, and paste the screenshot into the Linear ticket." Playwright MCP handles the click-by-click; Linear MCP (or Zapier MCP) does the ticket attach.
Newsletter angle
Two MCPs and one skill replace a sprint of glue code — and the next version of "AI workflow" is just a list of plugins you installed.
Meeting discussion
Of the tools above, which two would have saved you the most time this past week if you'd had them?
Karpathy's HTML-over-markdown tip — would your team adopt it for internal reports, dashboards, or AI-generated docs?
If we standardize on Zapier MCP + Context7 + Playwright MCP across the team, what changes about how we plan a feature?
A practical equal-weight basket of 12 names for tracking the "AI economy" as a single line versus the broader market. Rebalance quarterly. Add IPO entrants on first close after listing. This is a design — pull current quotes from your data source.
Components — 12 names, equal-weight (~8.3% each)
| Ticker | Company | Why it's in |
|---|---|---|
| NVDA | NVIDIA | GPU monopoly; powers every frontier model |
| MSFT | Microsoft | Build / OpenClaw / Majorana 2; OpenAI partner |
| GOOGL | Alphabet | Gemini frontier; Apple distribution deal; DeepMind |
| AAPL | Apple | Apple Intelligence; Gemini-powered Siri; iPhone scale |
| | | Sole-source fab for NVDA + AAPL leading-edge silicon |
| ARM | Arm Holdings | Reference designs for every edge-AI SoC |
| PLTR | Palantir | Enterprise / government AI deployment layer |
| CRWD | CrowdStrike | Glasswing partner; security for the agentic era |
YTD Performance Snapshot — approximate, through ~June 11, 2026
| Ticker / basket | YTD 2026 | Note |
|---|---|---|
| AAPL | +14.3% | Strongest Mag 7; Siri/Gemini deal a tailwind |
| NVDA | +10–11% | Lagging the broader market for the first time since 2022 |
| GOOGL | leading Mag 7 | Exact figure not available; benefited from Apple deal |
| MSFT | -14.7% | Token-cost concerns + Claude Code-cancel optics |
| META | lagging | Capex shock from $145B guidance |
| Mag 7 basket (approx) | ~-2.3% | First period lagging the broader market since 2022 |
| S&P 500 (^GSPC) | ~+5–8% | 1H estimate; June month-to-date -2.59% |
AI Index basket vs. S&P 500 — YTD 2026 cumulative return
Illustrative cumulative YTD return through ~June 11, 2026. The AI Index is the 12-name equal-weight basket above (Mag-7-leaning). The Mag 7 underperforming a broader-market S&P 500 is itself the story — first stretch since 2022. Data points reconstructed from public sources; not investment guidance.
Dispersion view — every name in the basket, YTD 2026
The dispersion is the story. ORCL (OCI inference deals) leads the basket at +18%, with Apple and Alphabet keeping the Mag 7 in the green individually. At the other end, PLTR is the basket's worst performer — down ~23% YTD on valuation compression despite ~85% revenue growth — followed by MSFT and META, the two biggest AI capex hawks, at -14.7% and -8%. The S&P 500 sits +7% YTD because breadth widened.
⚠️ Live verification — this is a design of the index, not real-time pricing. Pull current quotes from SlickCharts YTD, FinanceCharts compare, or your broker for production use.
Read of the tape. Hard to make a single bullish-or-bearish call on "AI stocks" right now. The Mag 7 dispersion has widened — Apple and Alphabet up; Microsoft and Meta down — and the S&P 500 without the Mag 7 has actually returned ~7% YTD, meaning breadth has improved. On a 2026 YTD basis the dispersion story matters more than any headline number.
IPO Watchlist — what changes when these list
| Company | Status | What it does to the index |
|---|---|---|
| OpenAI | Confidential S-1 filed; target Sept–Q4 2026 at ~$852B | Add at first close. Top-3 weighting if equal-weighted. Consider trimming MSFT exposure to avoid double-counting cross-ownership. |
| Anthropic | Confidential S-1 filed same week | Add at first close. AMZN cross-ownership means similar trim decision. |
| xAI | Private, ~$200B+ private rounds; no S-1 yet | Watch for filing; smaller initial weight. |
| Databricks | Private, mooted for late-2026 / 2027 IPO | Add when listed. |
| Perplexity | Private; speculation but no filing | Hold off. |
Index composition — today vs. post-IPO (OpenAI + Anthropic added Q4)
Two listings — OpenAI in Sept–Q4 and Anthropic on the same confidential S-1 calendar — push the basket from 12 to 14 equal positions. The cross-ownership wrinkle: Microsoft's OpenAI stake and Amazon's Anthropic stake have been baked into MSFT and AMZN multiples for years. On IPO day, trim those weights to keep the index honest.
Rebalance rules to consider
On IPO add: dilute existing 12 equal weights to accommodate — e.g., 13 names at ~7.7% each, 14 names at ~7.1% each.
MSFT / OpenAI cross-ownership: if OpenAI lists and MSFT retains a meaningful stake, trim MSFT weight by the embedded OpenAI exposure to avoid double-counting.
AMZN / Anthropic: same logic.
Quarterly rebalance back to equal weight to avoid one name dominating the basket the way NVDA dominates the S&P 500.
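The rebalance rules above reduce to a little arithmetic, sketched here. The basket names are placeholders (`NAME_1` to `NAME_11`, `IPO_A`, `IPO_B`), and the 20% embedded-exposure figure is purely hypothetical; only the equal-weight percentages come from the text.

```python
def equal_weights(names: list) -> dict:
    """Give every name in the basket the same weight."""
    return {name: 1.0 / len(names) for name in names}


def trim_holder(weights: dict, holder: str, embedded: float) -> dict:
    """Cut one holder's weight by its embedded exposure, then renormalize to 1."""
    adjusted = dict(weights)
    adjusted[holder] *= 1.0 - embedded
    total = sum(adjusted.values())
    return {name: w / total for name, w in adjusted.items()}


# Placeholder basket: 11 generic names plus MSFT.
basket = [f"NAME_{i}" for i in range(1, 12)] + ["MSFT"]
for label, names in [
    ("today", basket),
    ("after one IPO", basket + ["IPO_A"]),
    ("after two IPOs", basket + ["IPO_A", "IPO_B"]),
]:
    w = equal_weights(names)
    print(f"{label}: {len(names)} names at {w['MSFT']:.1%} each")

# Hypothetical: treat 20% of MSFT's weight as embedded IPO_A exposure.
trimmed = trim_holder(equal_weights(basket + ["IPO_A"]), "MSFT", embedded=0.20)
print(f"MSFT after trim: {trimmed['MSFT']:.1%}")
```

Renormalizing after the trim spreads the removed weight across the other names, so the basket still sums to 100% while the holder's combined direct and embedded exposure stays closer to one slot.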
Why this matters for the briefing. The Cincinnati Fireside Chat on "Capital & Compute" this week was the local version of the question every CFO is asking: does the AI capex story justify the AI stock multiples? With Mag 7 underperforming and S&P breadth widening, the answer is "it's complicated and getting more so." Add in OpenAI and Anthropic as listed equities in Q4, and the "AI investability" question gets a real benchmark for the first time.