The Information Machine

The last six months in LLMs in five minutes

Simon Willison · 2026-05-19

Simon Willison's annotated PyCon US 2026 lightning talk surveys six months of LLM progress, identifying November 2025 as the inflection point when coding agents crossed a quality threshold and open-weight laptop models began outperforming expectations.

Extraction

Topics: coding-agents, open-weight-models, llm-benchmarks, model-releases, local-llms

Claims

  • November 2025 was a critical inflection point where coding agents crossed from 'often-work' to 'mostly-work,' becoming viable daily drivers for real development without constant error correction.
  • The title of best LLM changed hands five times among three major providers (Anthropic, OpenAI, Google) between September and late November 2025.
  • OpenClaw, a personal AI assistant that began as an obscure project called 'Warelay' in late November 2025, rose to widespread attention within three months of its first commit.
  • Qwen3.6-35B-A3B, a 20.9GB open-weight model runnable on a laptop, outperformed Claude Opus 4.7 on Willison's pelican-riding-a-bicycle drawing test.
  • Chinese lab GLM released GLM-5.1, a 1.5TB open-weight model described as very effective but requiring significant hardware investment.

Key quotes

Coding agents went from often-work to mostly-work, crossing a quality barrier where you could use them as a daily-driver to get real work done, without needing to spend most of your time fixing their stupid mistakes.
Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7. That's a 20.9GB open weights model that runs on my laptop!
Drew Breunig joked to me that this is because they're the new digital pets, and a Mac Mini is the perfect aquarium for your Claw.