The Information Machine

AI Persistent Memory: ChatGPT Dreaming and the Cross-Session Context Race · history

Version 1

2026-06-04 18:11 UTC · 90 items

What

OpenAI launched 'Dreaming' (Dreaming V3) for ChatGPT on June 4, 2026 — a background synthesis system that continuously updates user memory from conversation history rather than storing static snapshots [1][2][6]. The rollout starts with Plus and Pro subscribers. Concurrently, startup Anuma is positioning as a cross-model portable memory layer, arguing that proprietary per-model memory forces users to manually carry context between AI systems [10][9]. Academic research from MIT (MeMo) reports a 26% performance gain from keeping memory architecturally separate from the base model [11], and Palo Alto Networks has documented prompt injection as a live attack vector against persistent AI memory [12].

Why it matters

How AI assistants handle persistent memory is becoming a key axis of platform differentiation. OpenAI's Dreaming locks richer context into ChatGPT specifically; Anuma's portable model treats memory as user-owned infrastructure spanning all models. The two approaches have different implications for user lock-in, privacy, and the economics of AI platform competition. The concurrent security research makes the stakes concrete: more capable proactive memory systems also expand the attack surface.

Open questions

  • Does Dreaming V3's background synthesis introduce context drift or errors at scale — does continuously re-synthesizing memory degrade reliability compared to explicit user-managed entries? [1][13]

  • Will OpenAI extend Dreaming to free-tier users, and on what timeline? [14]

  • Can cross-model portable memory platforms like Anuma achieve enough adoption to reduce the stickiness of proprietary memory features? [10][9]

  • How do providers plan to address indirect prompt injection into persistent memory — a documented attack that causes behavioral changes across sessions? [12]

Narrative

On June 4, 2026, OpenAI announced 'Dreaming' — internally versioned as Dreaming V3 — a redesigned memory architecture for ChatGPT that runs background synthesis processes to keep user context current rather than relying on explicit, static memory entries [1][2]. The system rolls out first to Plus and Pro subscribers. OpenAI's blog frames it as a move from passive storage to proactive synthesis: the model periodically reviews conversation history and updates what it knows about a user's preferences and goals [1]. The announcement drew rapid social amplification within hours, with dozens of retweet threads and commentary posts [3][4][5][6].

The Dreaming launch sits in a broader context of documented practitioner frustration with context loss. A Hacker News discussion from late May described 'compaction amnesia and context rot' — the pattern where AI coding tools like Codex progressively lose track of prior decisions when handling complex, multi-step workflows [7]. Earlier developer forum posts proposed 'memory-first conversational architecture' as more reliable than extending context windows, arguing that selective structured memory retrieval beats raw context length [8]. These practitioner pain points frame the commercial urgency behind OpenAI's memory investment.

A competing architectural philosophy has emerged from Anuma, a startup building cross-model portable AI context [9]. Rohan Paul described the core problem plainly: 'Most AI workflows break because the user has to carry the context manually' [10]. Anuma stores context, preferences, and goals in a format portable across ChatGPT, Claude, and other systems, positioning itself as user-owned infrastructure rather than platform-locked memory. This directly contrasts with OpenAI's proprietary approach — if portable memory layers gain traction, they could reduce the stickiness of model-specific memory features.

Two additional technical threads matter. MIT's MeMo system, circulated in late May, reports a 26% LLM performance improvement by keeping memory architecturally separate from the base model, with the memory module updatable without retraining [11]. Separately, Palo Alto Networks documented that indirect prompt injection can poison AI long-term memory, causing persistent behavioral changes across sessions [12]. The combination of more proactive, trusted memory systems and this known attack vector creates a security tension that none of the major providers have publicly addressed.

Timeline

  • 2026-05-26: Hacker News post documents 'compaction amnesia and context rot' in Codex on complex multi-step workflows [7]
  • 2026-05-30: MIT MeMo research circulates claiming 26% LLM performance gain from memory kept architecturally separate from the base model [11]
  • 2026-06-02: Analysis of memory architecture state across major agent harnesses shared on Twitter [17]
  • 2026-06-04: OpenAI announces Dreaming V3 memory system for ChatGPT, rolling out to Plus and Pro users with background synthesis replacing static snapshots [1][2][15][6]
  • 2026-06-04: Anuma cross-model portable memory platform discussed as structural alternative to proprietary model-specific memory [10][9]
  • 2026-06-04: Palo Alto Networks research on indirect prompt injection poisoning AI long-term memory surfaces in context of Dreaming launch [12]

Perspectives

OpenAI

Dreaming V3 is a proactive background synthesis system that keeps user memory fresh and relevant, representing a new architectural approach to long-term memory in conversational AI

Evolution: Consistent product-launch framing; no comparison to competitors or discussion of limitations offered in public announcement

Anuma / Rohan Paul

Proprietary per-model memory is insufficient; user context should be portable across all AI models in a private, user-owned format rather than locked to a single platform

Evolution: Consistent; Anuma frames itself as infrastructure-layer response to the same problem OpenAI addresses with a platform-locked approach

MIT MeMo researchers

Keeping memory architecturally separate from the base model yields measurable performance gains without retraining — a design choice that differs from integrated approaches

Evolution: First appearance in this thread; academic framing with no commercial stance

Palo Alto Networks (security researchers)

Persistent AI memory creates a significant attack surface; indirect prompt injection can poison long-term memory and cause persistent behavioral changes across sessions

Evolution: First appearance; frames persistent memory as a security liability that providers have not adequately addressed

Practitioner and developer community

Context loss ('compaction amnesia', 'context rot') is a genuine and costly limitation in current AI tools; memory-first architecture is more reliable than extending context windows

Evolution: Consistent frustration; Dreaming is received as addressing a real workflow problem, though technical skepticism about background synthesis reliability is present

Tensions

  • OpenAI's Dreaming locks richer memory into ChatGPT specifically; Anuma argues memory should be portable across all models and user-owned, not platform-controlled [1][10][9]
  • Dynamic background synthesis (Dreaming) vs. static explicit memory snapshots — practitioners question whether proactive synthesis is more reliable or introduces context drift as a new failure mode [1][8][13]
  • More capable persistent memory expands the prompt-injection attack surface; Palo Alto Networks documents this risk, but neither OpenAI nor other providers have publicly addressed it [12][1]
  • MIT MeMo's finding that separate memory architecture outperforms integrated memory implies current production approaches may carry a performance penalty — but this comparison has not been applied to Dreaming specifically [11][1]

Sources

  1. [1] Dreaming: Better memory for a more helpful ChatGPT — OpenAI Blog (2026-06-04)
  2. [2] Dreaming memory system rolls out to ChatGPT Plus and Pro users — reactive:ai-persistent-memory-race
  3. [3] JUST IN: OpenAI unveils Dreaming memory system for ChatGPT to boost continuity and relevance, with Dreaming V3 rolling o... — reactive:ai-persistent-memory-race (2026-06-04)
  4. [4] New Memory system in ChatGPT: Dreaming. — reactive:ai-persistent-memory-race (2026-06-04)
  5. [5] RT @MTSlive: SITUATION DETECTED: OpenAI has launched a new memory system for ChatGPT called Dreaming, which automaticall... — reactive:ai-persistent-memory-race (2026-06-04)
  6. [6] OpenAI Launches Dreaming V3 Memory System for ChatGPT... — reactive:ai-persistent-memory-race (2026-06-04)
  7. [7] Why codex /goal fails on complex workflows: compaction amnesia and context rot — reactive:ai-persistent-memory-race (2026-05-26)
  8. [8] Memory-First Conversational Architecture as an Alternative to Long ... — reactive:ai-persistent-memory-race
  9. [9] Cross-Model, Cross-Device Portable AI Context | Anuma — reactive:ai-persistent-memory-race
  10. [10] Most AI workflows break because the user has to carry the context manually, and Anuma is trying to make that context por… — Rohan Paul Twitter (2026-06-04)
  11. [11] MIT's MeMo: 26% LLM performance boost without retraining — memory stays separate from the base model. — reactive:ai-persistent-memory-race (2026-05-30)
  12. [12] When AI Remembers Too Much – Persistent Behaviors in Agents ... — reactive:ai-persistent-memory-race
  13. [13] @yato220510 @OpenAI Compressed embeddings would be crucial here. Would love to see the technical deep-dive on the memory... — reactive:ai-persistent-memory-race (2026-06-04)
  14. [14] OpenAI just dropped a major upgrade to ChatGPT's memory system — and it's coming to free users for the first time. — reactive:ai-persistent-memory-race (2026-06-04)
  15. [15] Dreaming: Better memory for a more helpful ChatGPT — reactive:ai-persistent-memory-race
  16. [16] Building Your External Memory System: When User Memory is Full ... — reactive:ai-persistent-memory-race
  17. [17] Single article with a complete breakdown on the state of memory architecture in the major Agent Harnesses- — reactive:ai-persistent-memory-race (2026-06-02)