Journal/AI Weekly Digest/15–22 May 2026

AI Weekly Digest15–22 May 2026

Andrej Karpathy joins Anthropic's pre-training team; self-hosted sandboxes and MCP tunnels ship for Managed Agents; Gates Foundation commits $200M; PwC expands to 30,000 certified professionals; Google I/O kicks off the Gemini 3.5 era; GPT-5.5 lands in Microsoft Copilot.

Period
15–22 May 2026
Published
May 22, 2026
Covers
Anthropic · OpenAI · Gemini · Copilot

Dateline: May 22, 2026 | Next update: May 29, 2026

Another dense week anchored by three headline stories: Andrej Karpathy (OpenAI co-founder, Tesla AI director) joined Anthropic's pre-training team on May 19; self-hosted sandboxes and MCP tunnels launched for Claude Managed Agents at the Code with Claude London event on the same day; and the Gates Foundation — already announced May 14 — became fully detailed as Anthropic's largest mission-aligned partnership ($200M over four years). PwC expanded its Anthropic alliance to certify 30,000 professionals on Claude, and the API gained cache diagnostics in public beta.


Claude / Anthropic

Andrej Karpathy joins Anthropic — pre-training team

Announced: May 19, 2026 | Role: Member of Technical Staff, pre-training | Reports to: Nick Joseph

Andrej Karpathy — OpenAI founding team member, former Tesla senior director of AI, and the most widely followed AI educator in the world — joined Anthropic on May 19. He will work under pre-training team lead Nick Joseph with a specific mandate: start a new team that uses Claude itself to accelerate pre-training research. This is a direct bet that AI-assisted experimentation — running more ablations, summarising internal results, writing infrastructure code, and proposing training recipes — is the next competitive lever in the frontier model race.

★ What's new

Karpathy starts at Anthropic on May 19. He will lead a new group within pre-training focused on using Claude to accelerate the training research loop. He paused his education startup Eureka Labs to join. He had been publicly enthusiastic about Claude Code since January 2026. His hire comes as Anthropic's annualised revenue run rate reaches $30B and Claude Code alone generates approximately $2.5B in annualised revenue.

Technical details

Team: pre-training | Lead: Nick Joseph | Mission: AI-accelerated pre-training research | Background: OpenAI founding (2015), Tesla Autopilot (2017–2022), OpenAI (2023), Eureka Labs (2024–2026) | Education work: plans to resume in time

Best for: Watch this space — Karpathy's focus on AI-accelerated pre-training research is a long-term signal, not an immediate product change

Claude Managed Agents — self-hosted sandboxes + MCP tunnels

Announced: May 19, 2026 at Code with Claude London | Self-hosted sandboxes: public beta | MCP tunnels: research preview (request access)

Announced at Anthropic's first international developer conference — Code with Claude London — self-hosted sandboxes and MCP tunnels together solve the core blocker for enterprise agent production deployments: data and code leaving the customer's security perimeter. The agent orchestration loop stays on Anthropic's infrastructure; tool execution and private service access move inside the customer's perimeter.

★ What's new

Self-hosted sandboxes (public beta): tool execution now runs on your own infrastructure or through managed providers — Cloudflare, Daytona, Modal, or Vercel. Your network policies, audit logging, and security tooling apply by default. Resource sizing and runtime images are fully customer-controlled. MCP tunnels (research preview): a lightweight gateway you deploy makes a single outbound encrypted connection — no inbound firewall rules, no public endpoints. Agents reach internal databases, private APIs, ticketing systems, and knowledge bases as MCP tools without those services being exposed to the public internet. Also shipping: MCP server and tool configurations can now be updated on an active session (no restart required); large tool outputs over 100K tokens are automatically spilled to a sandbox file, with the model receiving a truncated preview and the path to read the full content.

Technical details

Self-hosted sandboxes: public beta | Supported providers: Cloudflare (microVMs), Daytona, Modal, Vercel | MCP tunnels: research preview, request access via Claude Console workspace settings | Orchestration stays on Anthropic infra | Live MCP config updates: yes, no session restart required | Large output spill: >100K tokens auto-spilled to sandbox file | Both under managed-agents-2026-04-01 header

Best for: Enterprises in regulated industries (finance, healthcare, legal, government) requiring agent execution inside their own security perimeter

Gates Foundation — $200M four-year partnership

Announced: May 14, 2026 | Commitment: $200M over four years | Focus: global health, education, economic mobility

Anthropic committed $200 million over four years — in grant funding, Claude usage credits, and embedded engineering support — to a partnership with the Bill and Melinda Gates Foundation. It is the Gates Foundation's largest AI commitment to date (4x the $50M it gave OpenAI in January for a narrower African clinic programme). The work is led by Anthropic's Beneficial Deployments team.

★ What's new

Healthcare focus: accelerating vaccine and drug candidate screening for neglected diseases (including polio, HPV, preeclampsia), and improving malaria and tuberculosis treatment deployment forecasts with the Institute for Disease Modeling. The partnership will serve the approximately 4.6 billion people in low- and middle-income countries who lack access to essential health services. Education focus: tools to improve K–12 outcomes in the US, sub-Saharan Africa, and India, including African-language data collection and labelling released as public goods. Economic mobility: support for solopreneur and small-business programmes. Public goods deliverables include open health datasets, evaluation benchmarks, and knowledge graphs.

Technical details

Commitment: $200M over 4 years | Components: grant funding + Claude usage credits + Anthropic engineering support | Led by: Anthropic Beneficial Deployments team | Partner: Bill and Melinda Gates Foundation | Public goods: open datasets, benchmarks, knowledge graphs released to the broader research community | Comparison: Gates Foundation gave OpenAI $50M in Jan 2026 for narrower African clinic work

Best for: Global health researchers, educators, nonprofits, and policymakers — informational for commercial users

PwC — expanded strategic alliance

Announced: May 14, 2026 | Scope: global PwC workforce

Anthropic and PwC disclosed a major expansion of their strategic alliance. The rollout starts with PwC's US teams and extends to hundreds of thousands of professionals globally, anchored by a joint Center of Excellence and a certification programme targeting 30,000 PwC professionals trained and certified on Claude.

★ What's new

Claude Code and Claude Cowork rolling out across PwC's global workforce. Joint Anthropic–PwC Center of Excellence established. 30,000 PwC professionals to be trained and certified on Claude. Deployment begins with US teams, expanding globally. This makes PwC one of the largest single enterprise deployments of Claude to date.

Technical details

Products: Claude Code + Claude Cowork | Starting geography: US, expanding globally | Center of Excellence: joint Anthropic–PwC | Certification target: 30,000 professionals

Best for: Enterprise buyers evaluating large-scale professional services deployment; consulting and advisory firms

API — cache diagnostics in public beta

Platform: Claude API | Beta header: cache-diagnosis-2026-04-07 | Availability: all API users

Prompt caching is one of the most impactful cost-reduction tools on the Claude API, but until this week there was no way to know why a cache miss happened. Cache diagnostics, now in public beta, tells you exactly where the prompt cache prefix diverged from the previous turn — making it possible to debug and fix caching issues without guesswork.

★ What's new

Pass diagnostics.previous_message_id on a Messages request and the API returns a cache_miss_reason field explaining where the cache prefix diverged. Enables systematic debugging of prompt cache misses across multi-turn agentic workflows.

Technical details

Beta header: cache-diagnosis-2026-04-07 | Parameter: diagnostics.previous_message_id | Response field: cache_miss_reason | Works on: Messages API | Use case: debug prompt cache misses in agentic and multi-turn workflows

Best for: API developers optimising cost and latency through prompt caching

Claude Code — Agent Teams, plugin fixes, session tools

Platform: terminal / VS Code / web / mobile | Availability: all plans

Claude Code shipped a broad point release focused on Agent Teams stability, plugin improvements, and session tooling polish. Agent Teams — collaborative multi-agent sessions where named teammates work together on a task — received several reliability fixes after non-ASCII names caused API failures and crashed sessions.

★ What's new

Agent Teams fixes: teammates with non-ASCII names no longer fail every API call due to invalid header encoding. Plugin marketplace: add/update now respects CLAUDE_CODE_PLUGIN_PREFER_HTTPS; /plugin returns to the Installed list after enable/disable/uninstall actions. /doctor now shows an exec-form example when a command hook is missing the command field. Skill-listing truncation moved out of startup notifications — run /doctor for the full breakdown. Pre-response stream stall recovery improved — now retries streaming once rather than falling back to the slower non-streaming path. SDK/headless MCP startup is up to 2 seconds faster with slow MCP servers (pre-wait now overlaps startup instead of blocking). Fixed: infinite loop where a skill using context: fork repeatedly re-invoked itself. Fixed: /review using a deprecated projectCards GraphQL query that errored on repos with Classic Projects.

Technical details

Agent Teams: non-ASCII name header encoding fixed | CLAUDE_CODE_PLUGIN_PREFER_HTTPS: now respected on marketplace add/update | MCP headless startup: up to 2s faster | Stream stall: retry-streaming-once before fallback | Fixed: context: fork infinite loop | Fixed: /review GraphQL projectCards deprecation | Fixed: stale 'Failed to install Anthropic marketplace' banner | Fixed: PR badge not updating after gh pr create in-session

Best for: Developers using Agent Teams, MCP servers in headless SDK mode, plugin marketplace

Usage limits — Pro and Max tightening reported

Reported: May 14, 2026 | Affects: Pro and Max consumer plans

Anthropic quietly tightened Claude usage limits for Pro and Max subscribers. The API was not affected. This is a consumer-plan change only and came shortly after the doubling of Claude Code five-hour limits via the SpaceX compute deal announced May 6. Anthropic has not published a detailed breakdown of the new limits.

★ What's new

Pro and Max plan usage limits tightened as of approximately May 14. API pricing and limits unchanged. The net effect on Claude Code five-hour limits relative to the May 6 doubling is unclear — monitor your usage dashboard if you are a heavy Pro or Max user.

Technical details

Affected: Pro ($20/mo) and Max ($100–200/mo) consumer plans | API: unaffected | Source: Axios, May 14, 2026 | No official Anthropic statement on specifics | Practical impact: most visible on long Claude Code sessions without context auto-compaction

Best for: Heavy Pro and Max subscribers — check your usage dashboard; no action needed for API users

Plans and Pricing

No API pricing changes this week. The usage limit tightening on Pro and Max plans is the only consumer-facing change and has not been quantified officially. Claude Platform on AWS consumption pricing via AWS Marketplace remains the same as native Claude API rates.

Technical details

Opus 4.7: $5/$25 per MTok | Sonnet 4.6: $3/$15 per MTok | Haiku 4.5: low-cost tier | Claude Platform on AWS: consumption via AWS Marketplace | Gates Foundation and PwC: enterprise/mission-aligned pricing, contact Anthropic | Self-hosted sandboxes + MCP tunnels: included in Managed Agents (beta)


ChatGPT / OpenAI

Dateline: May 22, 2026 | Next update: May 29, 2026

Over the past week, OpenAI has continued improving agent reliability, multimodal workflows, and enterprise integration, while quietly refining system consistency across professional use cases.

GPT-5.3 Standard — default model

Release: late 2025 | Pricing: included | Availability: all users

More reliable handling of long and mixed-format conversations.

★ What's new

Improved consistency when switching between text, document, and image-based tasks.

Technical details

Context ~128k | Output ~4k–8k | Improved multimodal context management

Best for: General use

GPT-5.3 Pro — high-reasoning model

Release: late 2025 | Pricing: Pro | Availability: Pro/Enterprise

More stable performance on layered analytical tasks.

★ What's new

Improved reasoning continuity in very long outputs.

Technical details

Context ~200k (est.) | Reduced degradation across extended reasoning chains

Best for: Deep analysis

GPT-5.3 Mini — fallback model

Release: late 2025 | Pricing: low-cost | Availability: all

Faster and smoother lightweight responses.

★ What's new

Improved routing precision between Mini and Standard models.

Technical details

Context ~64k | Better dynamic inference allocation

Best for: Quick tasks

Agent Mode

Handles multi-step workflows.

★ What's new

Improved reliability in executing structured workflows over longer sessions.

Technical details

Better task persistence | Reduced workflow interruption rates

Best for: Task delegation

Deep Research

Combines browsing and reasoning.

★ What's new

Improved handling of conflicting information across multiple sources.

Technical details

Enhanced synthesis and source-weighting pipeline

Best for: Research

Memory & Projects

Persistent context across chats.

★ What's new

Better prioritisation of active-project context over older conversational memory.

Technical details

Improved relevance filtering | More efficient memory retrieval

Best for: Ongoing workflows

Advanced Voice Mode

★ What's new

More natural transitions between conversational turns.

Technical details

Improved turn-taking latency | Better conversational pacing

Best for: Voice interaction

ChatGPT for Clinicians

Continued expansion of healthcare-oriented workflows.

★ What's new

Improved structure and readability in generated clinical summaries and administrative outputs.

Technical details

Further domain-specific tuning | Reinforced medical safety guardrails

Best for: Clinical support (non-diagnostic assistance)

Enterprise & Workflow Integrations

★ What's new

Improved reliability across integrations involving documents, collaborative workflows, and persistent projects. Better orchestration across connected tools and memory systems.

Best for: Organisational deployment

Plans and Pricing

No significant changes this week. Pricing stable | API structure unchanged.


Gemini (Google)

Date: May 22, 2026 | Next update: May 29, 2026

This week marked the kickoff of Google I/O 2026, officially shifting the ecosystem into the "Agentic Gemini Era." Google introduced a brand-new model generation, multimodal video foundational architectures, and autonomous cloud agents.

Gemini 3.5 Flash — new default model

Launched: Google I/O 2026 | Availability: Gemini app and Google Search (default)

Launched as the default engine across the Gemini app and Google Search. Runs four times faster than other frontier models in output tokens per second and outperforms Gemini 3.1 Pro across key logic and coding benchmarks.

★ What's new

Gemini 3.5 Flash is now the default model across the Gemini app and Google Search. 4x faster output token throughput than competing frontier models. Outperforms Gemini 3.1 Pro on logic and coding benchmarks. Infrastructure successfully managed massive traffic spikes post-I/O keynote with no API downtime.

Best for: High-speed agent workflows, general-purpose everyday use

Gemini Omni — new multimodal family

Launched: Google I/O 2026 | First model: Gemini Omni Flash

Introduced as Google's premier cross-modal creative model. The first rollout, Gemini Omni Flash, natively combines text, audio, images, and video inputs to generate and text-edit high-quality video outputs while maintaining perfect scene, character, and physics continuity. All output includes SynthID watermarking.

★ What's new

Gemini Omni Flash: native text + audio + image + video input, high-quality video output with scene/character/physics continuity. SynthID watermarking on all generated content. Available via Flow and YouTube Shorts.

Best for: Native video generation and editing, creative multimodal workflows

Gemini 3.5 Pro — coming next month

Status: final testing | Deployment: scheduled next month

Announced to be in final testing, with an official deployment scheduled for next month.

Best for: Watch this space — no action needed yet

Gemini Spark — autonomous cloud agent

Launched: Google I/O 2026 | Platform: Google Cloud VMs

An always-on, autonomous agent platform powered by Gemini 3.5 Flash. Running on Google Cloud VMs, Spark can review credit card statements for hidden subscriptions, track school updates from emails, compile notes into Docs, and safely perform multi-step actions across third-party apps like Instacart and OpenTable — requiring final user confirmation for purchases.

★ What's new

Gemini Spark: always-on autonomous background agent running on Google Cloud. Supports multi-step actions across third-party apps with user confirmation required for purchases. Continuous cloud automation without needing to stay in an active session.

Best for: Autonomous background tasks, continuous cloud automation

The 25-year Search upgrade

Launched: Google I/O 2026 | Rollout: concurrent with May 21 core search update

Google Search rolled out its biggest interface overhaul in over two decades, replacing the standard search box with an expanded AI Search Box. Users can input full natural language queries alongside images, video clips, and entire Chrome tabs simultaneously.

★ What's new

AI Search Box replaces the classic Google search box. Accepts text, images, video clips, and full Chrome tabs as simultaneous inputs. May 21 core ranking algorithm update running concurrently.

Best for: Advanced web search, multimodal research queries

Gemini in Chrome — Android and Desktop

Android: coming next month | Desktop: available now

Coming to mobile next month, it introduces auto browse to automate digital chores (pulling event ticket details to book local parking). Desktop users gain Skills in Chrome, which saves complex multi-tab prompts into one-click reusable tools.

★ What's new

Auto browse (mobile, next month): automates digital chores from within Chrome. Skills in Chrome (desktop): saves complex multi-tab prompts as reusable one-click tools.

Best for: Advanced web and coding agents, reusable browser automation

Android XR and Smart Glasses

Google teased an Android XR collaboration with Samsung, Gentle Monster, and Warby Parker, showcasing prototype smart glasses capable of real-world text translation, Gemini voice chatting, and real-time audio translation.

★ What's new

Prototype smart glasses: real-world text translation, Gemini voice chat, real-time audio translation. Collaboration: Samsung, Gentle Monster, Warby Parker.

Best for: Watch this space — prototype stage, no release date announced

Plans and Pricing

AI subscription tier restructuring announced at Google I/O:

Plan Price Key Includes
AI Ultra $99/mo 5x higher usage limits vs. standard $20 Pro plan, 20TB storage, priority Antigravity developer tools
AI Ultra $200/mo 20x higher usage limits, exclusive Project Genie access (interactive 3D from Street View)

Antigravity 2.0 replaces legacy developer platforms and evolves into a full agentic ecosystem with a standalone desktop application and CLI. All Gemini CLI users are urged to migrate to Antigravity CLI immediately.


Microsoft Copilot

Dateline: May 22, 2026 | Next update: May 29, 2026

The biggest story of the week is model integration: Microsoft rolled out GPT-5.5 models across Copilot experiences, significantly improving reasoning, summarization, and writing quality.

GPT-5.5 models integrated into Copilot

Announced: May 19, 2026 | Effective: immediate | Applies to: Microsoft 365 Copilot (all tiers)

GPT-5.5 models now power Copilot, offering faster responses, deeper reasoning, and smarter writing assistance across all enterprise workflows.

★ What's new

Models: GPT-5.5 Instant + GPT-5.5 Thinking now power all Copilot tiers. Improvements: reasoning quality, context understanding, summarization accuracy. Enterprise workflows benefiting: email drafting, meeting summaries, document creation, data analysis, presentations.

Technical details

Models: GPT-5.5 Instant + GPT-5.5 Thinking | Applies to: all Microsoft 365 Copilot tiers | Effective: immediate from May 19, 2026

Best for: All Copilot users — especially enterprise teams needing higher-quality outputs

Outlook email grounding in Copilot Chat

Platform: Windows + Web | Availability: General

Users can now add emails or text from threads directly into Copilot Chat prompts, enabling context-aware answers without switching between apps.

★ What's new

Implicit grounding in Outlook: insert email sections into Copilot Chat. Summarize, analyze, or extract action items from email content directly in the chat interface.

Best for: Business users managing large volumes of email

PDF opening inside Copilot Chat

Platform: Windows, Mac, Web | Availability: General

PDFs now open directly inside Copilot Chat, allowing summarization, highlighting, and Q&A without switching apps.

★ What's new

PDFs open inline in Copilot — no need to leave the chat. Summarization and extraction supported. Reduces workflow interruptions for document-heavy tasks.

Best for: Researchers, analysts, legal teams, document-heavy workflows

App Launcher "Waffle" returns

Platform: Microsoft 365 | Availability: All plans

The classic "Waffle" app launcher is back, improving navigation across Outlook, Word, Excel, Teams, OneDrive, and PowerPoint.

★ What's new

Restores the older launcher design by popular request. Quick access to all Microsoft 365 apps from a single consistent entry point.

Best for: Longtime Microsoft 365 users, productivity-focused teams

Researcher + Notebooks upgrades

Platform: Microsoft 365 Copilot | Availability: General

Researcher now provides deeper summaries and smarter document analysis. Copilot Notebooks gained new organisational features for managing research workflows.

★ What's new

Researcher: improved information gathering and analysis. Notebooks: enhanced organisation and summarization. Now competitive with standalone AI research assistants.

Best for: Knowledge workers, students, enterprise research teams

Plans and Pricing

No pricing changes this week. Updates focus on feature expansion and model improvements. GPT-5.5 included in all Copilot tiers at no additional cost. Outlook grounding, PDF opening, Waffle, and Researcher upgrades all included in Microsoft 365 — no action needed.


Filed under: AI Weekly Digest
First published: May 22, 2026

← Previous issue8–15 May 2026All issuesNext issue →23–29 May 2026