This Meta + Stanford + Illinois survey paper argues that AI agents work better when code becomes their main working laye…

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-05-25

A survey paper from Meta, Stanford, and the University of Illinois argues that AI agents are more effective when code, rather than natural language, serves as their primary working layer for planning and execution.

Appears in

Research Findings Challenge AI Agent Architecture Assumptions

Extraction

Topics: ai-agents, code-generation, agentic-systems, llm-limitations

Claims

AI agents perform better when code is their primary operational layer rather than unstructured natural language.
Purely text-based LLMs struggle to maintain state across long multi-step tasks.
Text-only agent pipelines hide mistakes and convert plans into actions in brittle, error-prone ways.
Using code as the agent's working medium addresses core failure modes of LLM-native reasoning.
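
To make the claims above concrete, here is a minimal Python sketch of what treating code as the working layer could look like. It is not taken from the survey paper: the names `RunLog`, `run_plan`, and the example steps are hypothetical and chosen only for illustration. The idea shown is that each plan step is a callable, intermediate results live in an explicit state dictionary instead of in generated prose, and a failing step is recorded by name rather than silently carried forward.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class RunLog:
    """Hypothetical record of one plan execution."""
    state: dict[str, Any] = field(default_factory=dict)   # explicit task state
    completed: list[str] = field(default_factory=list)    # steps that succeeded
    errors: list[str] = field(default_factory=list)       # surfaced failures


# A step is a (name, action) pair; the action reads the shared state.
Step = tuple[str, Callable[[dict[str, Any]], Any]]


def run_plan(steps: list[Step]) -> RunLog:
    """Run steps in order, storing each result under the step's name.

    Execution stops at the first exception so that a mistake is visible
    in the log instead of propagating into later steps.
    """
    log = RunLog()
    for name, action in steps:
        try:
            log.state[name] = action(log.state)
            log.completed.append(name)
        except Exception as exc:
            log.errors.append(f"{name}: {type(exc).__name__}: {exc}")
            break
    return log


# A plan whose later steps depend on state produced by earlier ones.
plan: list[Step] = [
    ("prices", lambda s: [12.0, 8.5, 4.5]),
    ("total", lambda s: sum(s["prices"])),
    ("average", lambda s: s["total"] / len(s["prices"])),
]
result = run_plan(plan)

# The same plan with bad input: the failure is attributed to a named step.
broken: list[Step] = [
    ("prices", lambda s: []),
    ("total", lambda s: sum(s["prices"])),
    ("average", lambda s: s["total"] / len(s["prices"])),
]
failed = run_plan(broken)
```

In this sketch `result.state` holds every intermediate value by name, while `failed.errors` contains a single entry pointing at the `average` step, and `failed.completed` shows how far the plan got. A real agent would presumably generate the step functions with an LLM and run them in a sandbox; that part is outside what this example covers.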

Key quotes

The problem is that an LLM by itself is mostly a text predictor, so long tasks can lose state, hide mistakes, and turn plans into actions in fragile ways.

AI agents work better when code becomes their main working layer.