Stronger agents will not come only from larger models, but from better systems around them.

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-05-29

Rohan Paul argues that AI agent capability depends on the full surrounding system (memory, tools, context, routing, checks, and permissions), not just on model size, and that current evaluations wrongly attribute agent behavior to the model alone.


Appears in

Research Findings Challenge AI Agent Architecture Assumptions

Extraction

Topics: ai-agent-evaluation, agent-infrastructure, llm-systems, agent-benchmarks

Claims

Stronger AI agents require better surrounding systems, not merely larger underlying models.
Current AI agent benchmarks and evaluations incorrectly attribute performance solely to the model.
Real agent behavior is a product of memory, tools, context, routing, checks, and permissions working together.
Evaluating the model alone misrepresents what drives observed agent outcomes.
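
One way to make the attribution point concrete is an ablation: hold the model fixed, switch off one surrounding component at a time, and see how much of the score disappears. The sketch below is only an illustration of that idea, not a method described in the post. `AgentSystem`, `ablate`, and `toy_evaluate` are hypothetical names, and the numbers inside `toy_evaluate` are placeholders invented for the example, not benchmark results.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict

# Component names taken from the quoted list; everything else is hypothetical.
COMPONENTS = ("memory", "tools", "context", "routing", "checks", "permissions")


@dataclass(frozen=True)
class AgentSystem:
    """Hypothetical description of what surrounds the model in one agent setup."""
    model: str = "some-model"
    memory: bool = True
    tools: bool = True
    context: bool = True
    routing: bool = True
    checks: bool = True
    permissions: bool = True


def ablate(system: AgentSystem,
           evaluate: Callable[[AgentSystem], int]) -> Dict[str, int]:
    """Return, per component, how many solved tasks are lost when it is disabled."""
    baseline = evaluate(system)
    return {
        name: baseline - evaluate(replace(system, **{name: False}))
        for name in COMPONENTS
    }


def toy_evaluate(system: AgentSystem) -> int:
    """Stand-in for a real benchmark run: tasks solved out of 100.

    The weights are made up for illustration only. A real harness would
    run the agent on actual tasks instead of summing constants.
    """
    placeholder_gain = {"memory": 15, "tools": 25, "context": 10,
                        "routing": 5, "checks": 10, "permissions": 5}
    solved = 30  # assumed contribution of the bare model (placeholder)
    for name, gain in placeholder_gain.items():
        if getattr(system, name):
            solved += gain
    return solved


if __name__ == "__main__":
    for component, lost in ablate(AgentSystem(), toy_evaluate).items():
        print(f"without {component}: {lost} fewer tasks solved")
```

A report that lists only the headline score and the model name hides this breakdown. Publishing the per-component deltas alongside it would show how much of the outcome belongs to the system rather than the model. In a real system the components interact, so one-at-a-time ablation is a rough first cut rather than an exact decomposition.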

Key quotes

Stronger agents will not come only from larger models, but from better systems around them.

many AI agents are judged as if the model alone did the work, even though the real behavior also depends on memory, tools, context, routing, checks, and permissions.