Daily Digest

Claude Gets Sandboxed, and Agent Engineering Hits Its Hard-Boundary Era

Today’s AI cycle is less about another model getting smarter and more about agents being given real permissions. Once agents can read files, call tools, send requests, and work across sessions, the hard questions become containment, tool contracts, handoff state, and blast radius. Capability is moving fast; the engineering boundaries have to catch up.

Google shows Gemini Omni and Gemini 3.5 as workflow engines, not just chat models

Google published nine demos of Gemini Omni and Gemini 3.5. The positioning is clear: Gemini Omni combines reasoning with generation, while Gemini 3.5 is aimed at more complex agentic workflows. This is Google trying to turn Gemini into a multimodal execution layer across media, documents, and developer workflows.

31 May 2026

Daily Digest

Agents Are Getting Permissions, and the Security Bill Is Arriving

Today’s stories are tied together by one uncomfortable theme: software is being given more authority before the surrounding safety model is ready. AI agents can send messages, governments want operating systems to verify age, public institutions are building national language models, and founders are looking for cheaper sovereign infrastructure. Different headlines, same question: who gets permission, and who pays when it goes wrong?

Copilot Cowork shows why agent permissions are not a UX detail

PromptArmor reported that Microsoft Copilot Cowork can be abused through indirect prompt injection to exfiltrate files by sending emails or Teams messages. The worrying part is not that a model can be tricked into saying something odd. The worrying part is that the model sits inside a workflow where reading files and taking outbound actions are too closely coupled.
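The coupling described above can be made concrete with a small sketch. This is an illustration only, not how Copilot Cowork or any Microsoft API works: the tool names (`read_file`, `send_email`, `send_teams_message`) and the `SessionPolicy` class are hypothetical. The idea is that once a session has read file contents, outbound actions stop being automatic and require explicit user approval.

```python
from dataclasses import dataclass, field

# Hypothetical tool names, chosen for illustration only.
READ_TOOLS = {"read_file"}
OUTBOUND_TOOLS = {"send_email", "send_teams_message"}


@dataclass
class SessionPolicy:
    """Tracks whether the session has read file contents ("tainted")."""
    tainted: bool = False
    log: list = field(default_factory=list)

    def authorize(self, tool: str, user_approved: bool = False) -> bool:
        if tool in READ_TOOLS:
            # Reading is allowed, but it marks the session as tainted.
            self.tainted = True
            allowed = True
        elif tool in OUTBOUND_TOOLS:
            # After a read, outbound actions need explicit user approval.
            allowed = (not self.tainted) or user_approved
        else:
            # Unknown tools are denied by default.
            allowed = False
        self.log.append((tool, allowed))
        return allowed
```

With a gate like this, an injected instruction that arrives inside a file can still ask the model to send an email, but the send does not go through unless a human approves it. The trade-off is more confirmation prompts in exchange for a smaller blast radius.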

26 May 2026

Daily Digest

AI Coding Hits the Maintenance Wall, and Agents Start Dropping Constraints

There was no single giant model launch today. The more useful signal came from the engineering trenches: AI-generated issues are polluting maintainer workflows, coding agents still lose constraints over long tasks, and automation may create more review work rather than less.

1. AI-generated issues are becoming an open-source tax

Simon Willison quotes Armin Ronacher on a failure mode that every maintainer will recognize: issues rewritten by AI into confident but distorted reports, full of fake root causes and noisy implementation advice. The fix is not prettier prose; it is better raw observation.

25 May 2026

Daily Digest

Coding Agents Enter Procurement, While AI's Entry Points and Red Lines Shift

Today’s signal is unusually coherent: coding agents are moving into enterprise procurement language, Google keeps folding AI into distribution surfaces, and Simon Willison points at two less glamorous but more consequential constraints: hardware supply and privacy regulation.

1. OpenAI coding agents enter the enterprise checklist

OpenAI being named a leader for enterprise coding agents by Gartner matters less as a trophy and more as a procurement signal. Coding agents are moving from developer enthusiasm into CIO evaluation, where auditability, permissions and vendor trust decide budget.

24 May 2026

Daily Digest

GitHub Launches Stacked PRs, WordPress Supply Chain Poisoned, Stanford Report Reveals AI Disconnect

GitHub Ships Stacked PRs: No More Manual Rebase Chains

Source: GitHub Official

Key Points:
- GitHub’s “Stacked PRs” officially enters Private Preview
- Break large changes into small, independently reviewable PRs that build on each other
- Merge the entire stack in one click while keeping each layer focused
- New gh stack CLI for creating, rebasing, and pushing PR stacks from the terminal
- Stack navigator UI shows reviewers the full chain and the status of each layer
- CI runs per PR, but branch protection rules are enforced against the final target branch

Peon’s Take: This has been overdue. Previously you had to juggle git rebase -i and manually mess with base branches. Now it’s native. It is especially friendly for AI agents: npx skills add github/gh-stack teaches them to work in stacks. Breaking big diffs into small PRs stops being a chore, and review quality should improve significantly.

14 Apr 2026