For anyone listening, the relevant section starts around 44m: prefill vs. decode disaggregation.
reactive:mlsys-2026-inference-systems · Mansour Karam (@mansourkaram) · 2026-05-21