The Information Machine

Version 9 2026-05-26 19:38 UTC · 226 items

New items this pass are thin: a second tweet confirming UW SyFi's multi-prize win[^20724] and the public FlashInfer Bench GitHub starter kit[^20725] add minor depth to the contest thread, and Paradigm.xyz's Attention Ke…

Version 8 2026-05-25 19:01 UTC · 221 items

vLLM's disaggregated prefilling documentation for both v0.8.5[^20326] and v0.10.2[^20327] confirms the 'experimental' label has not been lifted through the most recent tracked release, directly answering a prior open qu…

Version 7 2026-05-25 10:13 UTC · 213 items

Third-party deployment guides from Vultr[^19593] and Spheron[^19594], plus a dedicated vLLM Dynamo integration page[^19595], extend the evidence of disaggregation toolchain maturity beyond NVIDIA's own Kubernetes documentati…

Version 6 2026-05-25 04:14 UTC · 193 items

NVIDIA Dynamo's official Kubernetes documentation for disaggregated communication[^18696] is the most substantive addition this pass: it directly addresses the RDMA KV cache networking obstacle that was previously chara…

Version 5 2026-05-24 18:56 UTC · 184 items

The Native Sparse Attention paper (arXiv 2502.11089, ACL 2025) enters the thread as a substantive counterweight in the sparse-attention-as-stopgap debate: NSA designs hardware-aligned sparse attention from training tim…

Version 4 2026-05-24 11:13 UTC · 164 items

Three substantive additions this pass: (1) @superaiwatcher introduces the first explicit counter-narrative in the thread, framing sparse attention as a transitional stopgap before hardware-native linear attention, which…

Version 3 2026-05-24 04:52 UTC · 128 items

MIT HAN Lab's Adaptive Drafter paper (arXiv 2511.16665) is now confirmed as an ASPLOS'26 paper, with an open-source GitHub repository (mit-han-lab/fastrl) and MIT News coverage; secondary sources characterize the speedup as 2x, s…

Version 2 2026-05-23 05:02 UTC · 107 items

Attention-FFN disaggregation moved from a single conference mention to a concrete engineering push this pass: StepFun's StepMesh library, a vLLM RFC, and formal papers on provisioning and hardware challenges all appeare…

Version 1 2026-05-22 18:27 UTC · 80 items

MLSys 2026, the ninth annual Conference on Machine Learning and Systems, is underway in Bellevue, Washington (week of May 18–22, 2026)[^7796][^10577]. The conference's inference track is organized around four converging…