The last six months in LLMs in five minutes

Simon Willison · 2026-05-19

Simon Willison's annotated PyCon US 2026 lightning talk surveys six months of LLM progress, identifying November 2025 as the inflection point when coding agents crossed a quality threshold and open-weight laptop models began outperforming expectations.

Appears in

OpenClaw Project: From Obscure CLI to Widely-Known AI Assistant

Extraction

Topics: coding-agents, open-weight-models, llm-benchmarks, model-releases, local-llms

Claims

November 2025 was a critical inflection point where coding agents crossed from 'often-work' to 'mostly-work,' becoming viable daily drivers for real development without constant error correction.
The title of best LLM changed hands five times among three major providers (Anthropic, OpenAI, Google) between September and late November 2025.
OpenClaw, a personal AI assistant that began as an obscure project called 'Warelay' in late November 2025, rose to widespread attention within three months of its first commit.
Qwen3.6-35B-A3B, a 20.9GB open-weight model runnable on a laptop, outperformed Claude Opus 4.7 on Willison's pelican-riding-a-bicycle drawing test.
Chinese lab GLM released GLM-5.1, a 1.5TB open-weight model described as very effective but requiring significant hardware investment.

Key quotes

Coding agents went from often-work to mostly-work, crossing a quality barrier where you could use them as a daily-driver to get real work done, without needing to spend most of your time fixing their stupid mistakes.

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7. That's a 20.9GB open weights model that runs on my laptop!

Drew Breunig joked to me that this is because they're the new digital pets, and a Mac Mini is the perfect aquarium for your Claw.