Build day, MAI models, and an NVIDIA flood

Microsoft Build dominated the day: Microsoft announced its first in-house MAI text models and a full-stack agentic partnership with NVIDIA, while NVIDIA ran its own hardware blitz at COMPUTEX. OpenAI pushed Codex past coding into general knowledge work, GitHub laid out its agent strategy, and Simon Willison shipped a WASM MicroPython sandbox that GPT-5.5 reportedly couldn't break out of.

Microsoft ships its own MAI models: a 1T reasoning model and a sparse Copilot coder

At Build, Microsoft announced two in-house text LLMs. MAI-Thinking-1 is a 1T-parameter reasoning model with 35B active parameters, available only to select early partners. MAI-Code-1-Flash is a 137B-parameter model with 5B active parameters, purpose-built for GitHub Copilot and rolling out to individual Copilot users in Visual Studio Code. Both models activate only a small share of their weights (roughly 3.5% and 3.6% respectively), which stands out given how expensive access to frontier-scale models currently is. Neither model was broadly testable at launch.

Why it matters: Microsoft building its own coding and reasoning models — and wiring them straight into Copilot and VS Code — signals it wants to reduce dependence on OpenAI for the products developers actually use daily.
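To put the active-parameter figures in proportion, here is a minimal sketch of the arithmetic in Python. The parameter counts are the ones reported above; the helper name `active_fraction` is hypothetical, and reading "active parameters" as the share of weights used at inference is an assumption, since Microsoft's architecture details are not covered here.

```python
# Reported sizes in billions of parameters: (total, active).
# The figures come from the item above; everything else is illustrative.
MODELS = {
    "MAI-Thinking-1": (1000, 35),
    "MAI-Code-1-Flash": (137, 5),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that are active (hypothetical helper)."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B parameters active ({share:.1%})")
```

Both models land at roughly one active parameter in every 28, so the two ratios are close to each other despite the large gap in total size.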

NVIDIA's COMPUTEX blitz: Cosmos 3, Nemotron 3 Ultra, RTX Spark, Jetson and a Microsoft full-stack tie-up

NVIDIA used GTC Taipei at COMPUTEX to push agentic AI across its stack. The announcements included Cosmos 3, Nemotron 3 Ultra, and RTX Spark; JetPack 7.2, which brings CUDA 13 and NemoClaw support to Jetson for physical and edge agents; and NemoClaw-based autonomous engineering agents for industrial software. Separately, at Microsoft Build, NVIDIA and Microsoft announced a unified stack spanning Windows devices, Azure cloud, and local deployment for long-running agentic workloads.

Why it matters: NVIDIA is positioning itself as the runtime for agents everywhere — edge, desktop, and cloud — which matters for anyone deciding where their long-running agent workloads will actually execute.

OpenAI pushes Codex out of the IDE and into general knowledge work

OpenAI is repositioning Codex from a coding tool to a broad productivity platform, publishing a Next Era of Knowledge Work report covering research, data analysis, workflow automation, and content creation. It also rolled out new Codex plugins, sites, and annotations aimed at analysts, marketers, designers, investors, and other non-engineering roles.

Why it matters: Codex expanding beyond code is OpenAI making an explicit play for the same general-agent territory Microsoft and Anthropic are chasing — worth watching how much is genuinely new versus repackaging existing tooling.

GitHub's plan for agents, from Kyle Daigle

In a Latent Space interview, GitHub's Kyle Daigle lays out how the platform plans to handle the strain from agentic coding that Copilot helped unleash. The discussion covers GitHub's roadmap for supporting agents at scale on the world's most popular developer platform.

Why it matters: If your agents push code, open PRs, or hit GitHub's API, how GitHub adapts its platform to agent traffic directly affects your rate limits, workflows, and tooling.

Simon Willison ships a WASM MicroPython sandbox for safe agent code execution

Willison released micropython-wasm (version 0.1a0, followed by 0.1a1), which bundles a customized WASM build of MicroPython with a wrapper that runs code via wasmtime. He also released datasette-agent-micropython 0.1a0, which lets Datasette Agent generate and execute Python safely. He reports that GPT-5.5 has so far failed to break out of the sandbox.

Why it matters: Safe code execution is the unsolved hard part of code-running agents; a small, WASM-isolated Python runtime is a practical pattern developers can adopt today rather than trusting a model not to misbehave.

Holo3.1 targets fast, local computer-use agents

H Company released Holo3.1, a model aimed at fast and local computer-use agents — the kind that operate a GUI directly. The release emphasizes running on local hardware rather than relying on a cloud API.

Why it matters: Local computer-use models are a credible alternative to sending screenshots of your desktop to a remote API, both for latency and privacy in automation workflows.

Anthropic expands Project Glasswing

Anthropic announced an expansion of Project Glasswing; the announcement outlines the initiative's broadened scope.

Why it matters: Anthropic initiatives often foreshadow shifts in how Claude is deployed and governed, so it's worth tracking even when the early framing is thin.

Nathan Lambert leaves Ai2 after the Olmo era

Nathan Lambert announced his departure from the Allen Institute for AI (Ai2), where he worked on the open Olmo models. His farewell reflects on the work and impact of that team.

Why it matters: Lambert was a central voice in Ai2's fully-open Olmo effort; his move is a notable data point on where open-model talent is heading.