The Information Machine

Research Findings Challenge AI Agent Architecture Assumptions

Synthesis history

10 versions, newest first.

  1. Version 10 2026-06-05 02:38 UTC · 216 items

    The security vulnerability picture has expanded from the single previously-tracked CVE (CVE-2026-25725) to a documented multi-vector pattern: John Stawinski's February 2026 prompt injection to RCE [^19799], Check Point …

  2. Version 9 2026-06-03 18:27 UTC · 202 items

    CVE-2026-25725 — a Claude Code sandbox escape via persistent configuration injection in settings.json, patched quietly by Anthropic [^22296][^23129] — converts the prior synthesis's security architecture argument from t…

  3. Version 8 2026-05-30 08:57 UTC · 197 items

    Three developments materially extend the prior synthesis. First, Anthropic's containment post-mortem [^21652] adds security architecture as a new analytical layer — finding that 93% human approval fatigue and probabilis…

  4. Version 7 2026-05-26 19:47 UTC · 188 items

    Two meaningful developments this pass. First, the code-as-agent-harness thesis advanced from a survey-level prescription [^20654] to a dedicated academic formalization (arXiv 2605.18747) [^21363] that defines three spec…

  5. Version 6 2026-05-26 09:26 UTC · 163 items

    The primary new development is a Meta+Stanford+Illinois survey paper arguing that code, not natural language, should be agents' primary working layer — directly addressing the state-loss and error-hiding failures of tex…

  6. Version 5 2026-05-25 18:55 UTC · 151 items

    Three developments distinguish this pass. First, a Meta paper showing coding agents improve significantly with structured summaries of prior attempts over raw logs [^12850] adds a constructive engineering response to th…

  7. Version 4 2026-05-25 10:35 UTC · 93 items

    Three developments distinguish this pass. First, Anthropic's engineering blog post 'Effective harnesses for long-running agents' [^19723] is now tracked as a primary source rather than through InfoQ intermediary coverage…

  8. Version 3 2026-05-25 05:21 UTC · 82 items

    This pass adds three significant developments. First, the 'Towards a Science of AI Agent Reliability' paper has achieved full academic institutionalization: a Princeton CITP seminar, a PREreview peer review, and an acti…

  9. Version 2 2026-05-24 04:43 UTC · 62 items

    This pass adds two significant developments not present in the prior synthesis. First, a wave of practitioner and commentator voices (May 18–23) has coalesced explicitly around 'reliability' as the meta-frame for produc…

  10. Version 1 2026-05-23 08:14 UTC · 5 items

    A cluster of research findings published in May 2026 collectively challenge three dominant assumptions in AI agent design. • A Stanford study argues that under equal computational budgets, a single LLM outperforms coord…