The Information Machine

Version 12 2026-06-08 08:21 UTC · 234 items

Five new items arrived, all from Rohan Paul, adding two substantive findings. First, self-improving agents benefit more from stronger solver models than from stronger update-writing (evolver) models [^25100], directly e…

Version 11 2026-06-06 08:20 UTC · 229 items

The new items this pass are largely secondary coverage — practitioner guides, enterprise security posts, and memory framework surveys [^25778][^25779][^25780][^25790][^25798] — rather than primary research or new vulner…

Version 10 2026-06-05 02:38 UTC · 216 items

The security vulnerability picture has expanded from the single previously-tracked CVE (CVE-2026-25725) to a documented multi-vector pattern: John Stawinski's February 2026 prompt injection to RCE [^19799], Check Point …

Version 9 2026-06-03 18:27 UTC · 202 items

CVE-2026-25725 — a Claude Code sandbox escape via persistent configuration injection in settings.json, patched quietly by Anthropic [^22296][^23129] — converts the prior synthesis's security architecture argument from t…

Version 8 2026-05-30 08:57 UTC · 197 items

Three developments materially extend the prior synthesis. First, Anthropic's containment post-mortem [^21652] adds security architecture as a new analytical layer — finding that 93% human approval fatigue and probabilis…

Version 7 2026-05-26 19:47 UTC · 188 items

Two meaningful developments this pass. First, the code-as-agent-harness thesis advanced from a survey-level prescription [^20654] to a dedicated academic formalization (arXiv 2605.18747) [^21363] that defines three spec…

Version 6 2026-05-26 09:26 UTC · 163 items

The primary new development is a Meta+Stanford+Illinois survey paper arguing that code, not natural language, should be agents' primary working layer — directly addressing the state-loss and error-hiding failures of tex…

Version 5 2026-05-25 18:55 UTC · 151 items

Three developments distinguish this pass. First, a Meta paper showing coding agents improve significantly with structured summaries of prior attempts over raw logs [^12850] adds a constructive engineering response to th…

Version 4 2026-05-25 10:35 UTC · 93 items

Three developments distinguish this pass. First, Anthropic's engineering blog post 'Effective harnesses for long-running agents' [^19723] is now tracked as a primary source rather than through InfoQ intermediary coverage…

Version 3 2026-05-25 05:21 UTC · 82 items

This pass adds three significant developments. First, the 'Towards a Science of AI Agent Reliability' paper has achieved full academic institutionalization: a Princeton CITP seminar, a PREreview peer review, and an acti…

Version 2 2026-05-24 04:43 UTC · 62 items

This pass adds two significant developments not present in the prior synthesis. First, a wave of practitioner and commentator voices (May 18–23) has coalesced explicitly around 'reliability' as the meta-frame for produc…

Version 1 2026-05-23 08:14 UTC · 5 items

A cluster of research findings published in May 2026 collectively challenges three dominant assumptions in AI agent design. • A Stanford study argues that under equal computational budgets, a single LLM outperforms coord…