Simon Willison Releases llm 0.32 Alpha Series · history

Version 5

2026-05-01 05:01 UTC · 132 items

Narrative

On April 29, 2026, Simon Willison released two alpha versions of his popular llm CLI tool and Python library in rapid succession, marking a significant architectural overhaul. The headline change in 0.32a0 is the replacement of the previous prompt/response model with a message-sequence API that allows full prior conversations to be injected without requiring SQLite as an intermediary.[1] This is a backwards-compatible refactor, but a major one that reshapes how the library models both inputs and outputs.
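To make the message-sequence idea concrete, here is a minimal sketch in plain Python. It does not use the llm library: the `Message` class and `extend_conversation` helper are hypothetical names invented for illustration, and the actual 0.32 API may look quite different. The point is only that a prior conversation can be represented as an ordinary list of turns that the caller supplies directly, with no database lookup involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for one conversation turn (not llm's own class)."""
    role: str      # assumed roles: "system", "user", or "assistant"
    content: str


def extend_conversation(history: list[Message], user_text: str) -> list[Message]:
    """Return a new message sequence: the prior turns plus one new user turn."""
    return [*history, Message("user", user_text)]


# Prior turns can come from anywhere: a JSON file, a web session, a test fixture.
history = [
    Message("user", "Name a fast sorting algorithm."),
    Message("assistant", "Quicksort is a common choice."),
]
messages = extend_conversation(history, "What is its worst-case complexity?")
```

Because the helper returns a new list rather than mutating `history`, the same stored conversation can be reused as the starting point for several different follow-up prompts.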

The new streaming API is perhaps the most technically ambitious aspect of the release: responses are now exposed as typed event parts — text, tool_call_name, tool_call_args, and reasoning — enabling downstream consumers to handle the mixed-type outputs increasingly common from modern models like Claude.[1] Willison frames this as a direct response to the reality of contemporary LLMs: "Many of today's models return mixed types of content. A prompt run against Claude might return reasoning output, then text, then a JSON request for a tool call, then more text content." The CLI immediately leverages this by rendering reasoning/thinking tokens in a distinct color and routing them to stderr, keeping piped output clean.[1] A new to_dict/from_dict serialization mechanism rounds out the release, letting Python API users store and restore responses in any storage layer rather than being coupled to SQLite.[1] The 0.32a1 patch followed the same day, fixing a bug in which tool-calling conversations were not correctly reinflated from SQLite storage — a regression introduced by the architectural changes in 0.32a0.[2][3]
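The routing behavior described above can be sketched without the library itself. In the example below, the `Event` class, its `type` and `chunk` fields, and the `render` function are all assumptions made for illustration; only the four part names (text, tool_call_name, tool_call_args, reasoning) and the to_dict/from_dict naming come from the release notes. It shows one plausible way a consumer might send text to stdout, send reasoning to stderr, reassemble tool-call arguments that arrive in fragments, and round-trip events through JSON.

```python
import io
import json
import sys
from dataclasses import asdict, dataclass
from typing import Iterable, TextIO


@dataclass
class Event:
    """Hypothetical typed stream event; the field names are assumptions."""
    type: str   # "text", "reasoning", "tool_call_name", or "tool_call_args"
    chunk: str

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "Event":
        return cls(type=data["type"], chunk=data["chunk"])


def render(events: Iterable[Event], out: TextIO = sys.stdout,
           err: TextIO = sys.stderr) -> list[dict]:
    """Write text to `out`, reasoning to `err`, and collect tool calls."""
    tool_calls: list[dict] = []
    for event in events:
        if event.type == "text":
            out.write(event.chunk)
        elif event.type == "reasoning":
            err.write(event.chunk)
        elif event.type == "tool_call_name":
            # A name event starts a new tool call.
            tool_calls.append({"name": event.chunk, "args": ""})
        elif event.type == "tool_call_args" and tool_calls:
            # Argument JSON may arrive in several fragments.
            tool_calls[-1]["args"] += event.chunk
    return tool_calls


stream = [
    Event("reasoning", "User wants the weather. "),
    Event("text", "Checking now."),
    Event("tool_call_name", "get_weather"),
    Event("tool_call_args", '{"city": '),
    Event("tool_call_args", '"Paris"}'),
]
out, err = io.StringIO(), io.StringIO()
calls = render(stream, out, err)

# Round-trip the events through JSON, as any storage layer could.
restored = [Event.from_dict(d)
            for d in json.loads(json.dumps([e.to_dict() for e in stream]))]
```

Keeping reasoning on a separate stream is what lets a shell pipeline consume only the answer text, and plain dictionaries are a convenient storage format because they serialize to JSON, a key-value store, or a database column equally well.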

A third search cycle on May 1, 2026 again found no organic community reaction in the public record. Most results were search noise: queries for "stable release" surfaced dozens of unrelated Twitter posts[4][5][6][7][8][9][10][11] that match the word "stable" but have no connection to llm. A few results are mildly notable. An Instagram post titled "LLM 0.32 just rewrote its core — and everything still ..."[12] extends social amplification to a new platform. A Bluesky post from Willison's own account[13] and a Fediverse post beginning "The LLM Python library support…"[14] were indexed, but no claims were extracted from either. A cluster of simonw/llm GitHub issues on tool usage features[15][16][17] gives ecosystem context for the kinds of problems the 0.32 refactor addresses, though none of them is explicitly about 0.32 migration. The llm-openai-plugin GitHub repository[18] and the official llm docs page for other models[19] were indexed without new content, and a general Willison plugins tag page[20] confirms ongoing plugin ecosystem activity but carries no 0.32-specific commentary.

The pattern across three search cycles is now clear: the 0.32 alpha series has been amplified by aggregators and by isolated social posts (Twitter, Instagram, Bluesky, Fediverse), but it has produced no substantive third-party analysis beyond the single explore.n1n.ai piece[21], no migration reports from plugin authors, and no Hacker News discussion thread. The llm GitHub issues on tool documentation[16] and tool-in-conversation continuity[15] show that tool calling was already a live concern in the project before the 0.32 release, which helps explain why the refactor prioritized typed streaming event parts. The open question is whether the stable release will draw the community reaction that the alpha series has not.

Timeline

2026-04-29: LLM 0.32a0 released: major backwards-compatible refactor replacing prompt/response model with message-sequence API, adding typed streaming event parts and to_dict/from_dict serialization [1][22][23]
2026-04-29: LLM 0.32a1 released same day to fix bug where tool-calling conversations were not correctly reinflated from SQLite [2][3]
2026-04-29: Third-party aggregators (Let's Data Science, daily.dev) begin indexing and republishing the 0.32a0 announcement [24][25]
2026-04-30: Dedicated third-party analytical piece on the 0.32a0 refactor indexed from explore.n1n.ai; Hacker News searches confirm no 0.32-specific community discussion; second search cycle returns primarily noise [21][28][29][30][31][32]
2026-05-01: Third search cycle: Instagram post specifically about 0.32 core rewrite indexed; Bluesky and Fediverse posts from Willison detected without extracted claims; llm GitHub issues on tool usage features surfaced; 'stable release' queries overwhelmed by unrelated Twitter noise [12][13][14][15][16][17][18][19]

Perspectives

Simon Willison

Advocates for the architectural refactor as a necessary response to modern LLMs' mixed-type outputs (reasoning, text, tool calls). Treats the alpha series as iterative public development, shipping a fix the same day as the initial alpha. Continues posting on multiple platforms (Fediverse, Bluesky) about the library.

Evolution: consistent — no new substantive statements detected; active on social platforms but no extracted claims in third cycle

[1][22][2][23][3][14][13]

Third-party tech aggregators (Let's Data Science, daily.dev)

Neutral amplification — republishing Willison's announcement without original analysis or critique.

Evolution: consistent — purely re-aggregative across all three cycles

[24][25]

Specialized AI content sites (explore.n1n.ai)

Analytical framing of the 0.32a0 refactor as significant for Python-based AI tooling broadly, though no specific claims were extracted.

Evolution: first appeared in prior cycle — no new content extracted in third cycle

[21][26][27]

Social media amplifiers (Instagram)

Neutral amplification of the 0.32 core rewrite story on a new platform, extending reach beyond tech-aggregator sites.

Evolution: new in third cycle — Instagram represents a new platform for the story's spread

[12]

Tensions

The 0.32a0 release is described as backwards-compatible, but the series is explicitly alpha: it is unclear whether plugin authors will face breaking changes and whether the new message-sequence API will settle before a stable release. [1][2]
The new to_dict/from_dict mechanism decouples the library from SQLite, but the same-day SQLite bug fix in 0.32a1 suggests the two storage paths are not yet equally exercised. [1][2][3]
No community or third-party plugin reaction to the refactor appeared in any of the three search cycles. Hacker News searches returned no 0.32-specific discussion, and all substantive content comes from Willison himself or from neutral amplifiers. [28][29][30][31][32][15][16][17]
Tool usage documentation and tool-in-conversation continuity were already open GitHub issues before the 0.32 release, suggesting the refactor addresses real developer pain points — but no plugin author has publicly responded to whether 0.32 resolves their specific needs. [15][16][17]
The broader Python LLM tooling ecosystem (e.g., pydantic-ai) is independently wrestling with the same streaming-plus-tool-call problem that llm 0.32 addresses, raising the question of whether llm's approach will converge with or diverge from emerging community conventions. [33][34]

Sources

[1] LLM 0.32a0 is a major backwards-compatible refactor — Simon Willison (2026-04-29)
[2] llm 0.32a1 — Simon Willison (2026-04-29)
[3] Release: llm 0.32a1 — reactive:simon-willison-llm-032
[4] "Get a stable job" — reactive:simon-willison-llm-032 (2026-05-01)
[5] Situation au 30 avril - 1/2 — reactive:simon-willison-llm-032 (2026-05-01)
[6] @blacknredtext Things have to be stable and have a growing population that is atleast culturally stable and consistent. ... — reactive:simon-willison-llm-032 (2026-05-01)
[7] Rudie Heyneke urges stable NSFAS leadership https://t.co/lMSMLp5Sto is important stable stable officials in nsfas is nee... — reactive:simon-willison-llm-032 (2026-04-30)
[8] 📊 SPY GEX Framework — 2026-04-30 — reactive:simon-willison-llm-032 (2026-04-30)
[9] Coolify V4 stable — reactive:simon-willison-llm-032 (2026-04-30)
[10] @seth_doe22 With all that has unfolded in the past 10 years concerning marriages, it is safe to say that there's an atta... — reactive:simon-willison-llm-032 (2026-04-30)
[11] We the stable will stay stable even when things aren't stable. — reactive:simon-willison-llm-032 (2026-04-30)
[12] LLM 0.32 just rewrote its core — and everything still ... - Instagram — reactive:simon-willison-llm-032
[13] Post by @simonwillison.net — reactive:simon-willison-llm-032
[14] Simon Willison: "The LLM Python library support…" — reactive:simon-willison-llm-032
[15] Ability to "reply" to a tool-response with a prompt carrying those tool ... — reactive:simon-willison-llm-032
[16] Documentation on how to implement tool usage for model plugins — reactive:simon-willison-llm-032
[17] c" should automatically include tools from "llm -T" in the initial prompt ... — reactive:simon-willison-llm-032
[18] GitHub - simonw/llm-openai-plugin: OpenAI plugin for LLM · GitHub — reactive:simon-willison-llm-032
[19] Other models - LLM — reactive:simon-willison-llm-032
[20] Simon Willison on plugins — reactive:simon-willison-llm-032
[21] LLM 0.32a0 Refactor: A Major Step for Python-Based AI Tooling — reactive:simon-willison-llm-032
[22] llm 0.32a0 — Simon Willison (2026-04-29)
[23] LLM 0.32a0 is a major backwards-compatible refactor — reactive:simon-willison-llm-032
[24] llm CLI package releases version 0.32a0 - Let's Data Science — reactive:simon-willison-llm-032
[25] LLM 0.32a0 is a major backwards-compatible refactor — reactive:simon-willison-llm-032
[26] n1n.ai: Enterprise Unified LLM API Gateway (One Key for All Models) — reactive:simon-willison-llm-032
[27] ai-agents — reactive:simon-willison-llm-032
[28] Yet Another LLM Rant - Hacker News — reactive:simon-willison-llm-032
[29] LLMs can be exhausting | Hacker News — reactive:simon-willison-llm-032
[30] Im genuinely blown away by llms. I'm an artist who've ... - Hacker News — reactive:simon-willison-llm-032
[31] LLMs are bullshitters. But that doesn't mean they're not useful — reactive:simon-willison-llm-032
[32] This is frankly one of the most frustrating things about LLMs — reactive:simon-willison-llm-032
[33] Streaming Tool Calls · Issue #640 · pydantic/pydantic-ai - GitHub — reactive:simon-willison-llm-032
[34] How streaming LLM APIs work | Simon Willison’s TILs — reactive:simon-willison-llm-032