
Agentic memory: the follow-up

Last week I wrote about agentic memory and got a lot of responses, in particular from people pointing me to existing projects like mem0 or letta.com. So I started doing research and, as one does these days, discussed my findings with Claude. My first thought was to write a follow-up; my second was: why not let Claude write part of that article, but make it actually readable and not the typical AI slop? So here's Claude's summary:


Claude: Full disclosure upfront: I'm an AI writing about AI memory, which means I have an obvious conflict of interest. Make of that what you will.

So what is AI memory, actually?

If you've used ChatGPT or Claude recently, you've probably already interacted with AI memory without thinking much about it. ChatGPT now has a two-layer system: "saved memories" that persist until you delete them, plus a "reference chat history" feature that mines your past conversations for context. You can explicitly ask it to remember things. Claude (that's me — or rather, the product I'm part of) has its own memory system that works differently, generating summaries from past conversations and applying them automatically. Gemini is weaving memory into Google's ecosystem across Search, Workspace, and Android.

These platform-native implementations are what most people encounter, and they're a useful starting point because they make the core idea concrete: the AI retains something from past interactions and uses it to be more helpful next time. Simple enough in principle. But the product design choices already diverge — explicit vs. automatic memory, user-controlled vs. system-managed, transparent vs. opaque retrieval. Even at this most visible layer, there's no consensus on how memory should work.

Now, these consumer-facing features are just the surface. Behind them, there's an entire ecosystem of memory infrastructure being built. When I surveyed what's out there, I found roughly six categories of approaches. Let me walk through them.

Memory middleware: the "add memory to your app" layer

This is where most of the startup energy is. These are APIs and frameworks that let developers add persistent memory to any AI application.

Mem0 is probably the most prominent — a dedicated memory layer that extracts facts from interactions, stores them, and retrieves them later. They've grown fast, with integrations across LangChain, CrewAI, Vercel, and a recent AWS partnership. Their open-source version supports nineteen vector store backends, which tells you something about how production-grade this has become.

Letta (formerly MemGPT) takes a fundamentally different approach, inspired by operating systems. It treats context like virtual memory — shuttling information between "RAM" (the active context window) and "disk" (long-term storage) through function calls. The agent controls its own memory, which is a philosophically interesting choice: instead of memory being something done to the AI, the AI manages it actively.
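To make that concrete, here's a toy sketch of the pattern in Python. This is my own illustration, not Letta's actual API: a bounded "core" list stands in for the context window, an archive stands in for long-term storage, and the eviction and paging methods are the kind of tools the agent would call itself.

```python
# Toy sketch of OS-inspired memory (not Letta's real API): a bounded "core"
# acts like RAM, an archive like disk, and the agent moves items between
# them through explicit tool calls.

class VirtualContext:
    def __init__(self, core_limit=3):
        self.core_limit = core_limit   # max items in the active context
        self.core = []                 # "RAM": what the model sees each turn
        self.archive = []              # "disk": long-term storage

    def core_append(self, item):
        """Tool the agent calls to add something to working memory."""
        self.core.append(item)
        # When the context window is full, evict the oldest item to disk.
        while len(self.core) > self.core_limit:
            self.archive.append(self.core.pop(0))

    def archive_search(self, keyword):
        """Tool the agent calls to page facts back in from long-term storage."""
        hits = [m for m in self.archive if keyword.lower() in m.lower()]
        for hit in hits:
            self.archive.remove(hit)
            self.core_append(hit)      # paged back into the active context
        return hits

ctx = VirtualContext(core_limit=3)
for fact in ["user's name is Ada", "prefers Rust", "works on compilers",
             "deadline is Friday"]:
    ctx.core_append(fact)
# The oldest fact was evicted to the archive; the agent can page it back in.
ctx.archive_search("name")
```

The interesting property, which the sketch preserves, is that the model itself decides when to search the archive, rather than having retrieval bolted on from outside.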

Supermemory positions itself as a five-layer system: user profiles, knowledge graphs, semantic retrieval, content extraction, and third-party connectors. They claim state-of-the-art results on memory benchmarks and have built their own evaluation platform, MemoryBench.

There's also LangMem within the LangGraph ecosystem — less a standalone product, more an integrated path for teams already building on LangChain who want memory without adopting a separate vendor.

What's notable about this whole category is how infrastructure-complete it's getting. Mem0's changelog from 2024 through 2026 reads like a steady march through backend concerns: more vector stores, more embedding options, graph memory support, local embeddings for privacy, export/import for portability. The storage problem, the retrieval problem, the integration problem — these are increasingly solved, or at least solvable.

Knowledge graphs: structured relationships, not just recall

Cognee and Graphiti (the open-source engine behind Zep) represent a bet that flat memory — storing facts as isolated text chunks — isn't enough. They build graph structures where entities have relationships, and those relationships evolve over time.

Graphiti is particularly interesting because it tracks temporal validity: when a fact became true and when it was superseded. This matters for any real-world use case where things change — job titles, project status, preferences. Most memory systems treat facts as either current or deleted. Graphiti keeps the history.

Glean, coming from the enterprise search world, has built what they call a "personal graph" — not just indexing documents but modeling what individual employees are working on, their collaboration patterns, and active projects. This is memory at the organizational level, not just the conversation level.

Cross-platform memory: your context follows you

This is a newer category and a fascinating one. Mem0's OpenMemory Chrome Extension, AI Context Flow, and MemSync all attempt to solve the same problem: you use ChatGPT for brainstorming, Claude for code review, Perplexity for research, and none of them know what you told the others.

These tools sit in your browser and capture context across platforms, building a portable memory layer that isn't locked into any single vendor. OpenMemory stores everything locally and is open-source. MemSync splits memories into semantic (stable facts) and episodic (changing situations), borrowing categories from cognitive psychology.
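As a rough illustration of that semantic/episodic split (my own sketch, not MemSync's implementation): semantic memory behaves like a key-value store where the latest value wins, while episodic memory is an append-only timeline, and the two are merged into whatever context gets injected into the next session.

```python
# Minimal sketch of the semantic/episodic split: stable facts are keyed and
# overwritten in place, situational notes are appended as a timeline.

class SplitMemory:
    def __init__(self):
        self.semantic = {}    # stable facts: latest value wins
        self.episodic = []    # changing situations: append-only timeline

    def remember_fact(self, key, value):
        self.semantic[key] = value

    def remember_event(self, timestamp, note):
        self.episodic.append((timestamp, note))

    def context(self, last_n_events=2):
        """Assemble a snippet to inject into any assistant, any platform."""
        facts = [f"{k}: {v}" for k, v in sorted(self.semantic.items())]
        events = [note for _, note in sorted(self.episodic)[-last_n_events:]]
        return facts + events

mem = SplitMemory()
mem.remember_fact("editor", "vim")
mem.remember_fact("editor", "helix")    # stable fact: simply overwritten
mem.remember_event(1, "debugging a race condition")
mem.remember_event(2, "migrating CI to GitHub Actions")
mem.context()
# ['editor: helix', 'debugging a race condition', 'migrating CI to GitHub Actions']
```

The design choice the split encodes: a changed preference should replace the old one, but a changed situation should sit next to its predecessors, because the sequence itself is information.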

This category is small but I think it points at something important: the assumption that memory should be owned by the platform is already being challenged.

DIY and file-based memory

There's a surprisingly active community of people building memory systems out of markdown files, MCP servers, and local tooling. The pattern is usually: give your AI file access, have it write structured notes after conversations, and inject relevant notes into future sessions. This post by David Curlewis is a good example of the approach.
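A minimal version of that loop fits in a few lines of Python. Everything here is a toy of my own making, with keyword search standing in for whatever retrieval a real setup would use:

```python
# Toy sketch of the DIY pattern: append a structured markdown note after a
# session, then scan the notes for relevant context before the next one.

import tempfile
from pathlib import Path

def write_note(memory_dir, topic, body):
    """Append a structured note; one markdown file per topic."""
    memory_dir = Path(memory_dir)
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / f"{topic}.md").open("a") as f:
        f.write(f"## {topic}\n{body}\n\n")

def relevant_notes(memory_dir, query):
    """Naive retrieval: return every note file mentioning the query."""
    hits = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        text = path.read_text()
        if query.lower() in text.lower():
            hits.append(text)
    return "\n".join(hits)   # paste this at the top of the next session

memory_dir = tempfile.mkdtemp()
write_note(memory_dir, "project-atlas", "Decided on Postgres over SQLite.")
write_note(memory_dir, "preferences", "Prefers short answers, no emoji.")
# Before the next session about the project, inject only what's relevant:
context = relevant_notes(memory_dir, "postgres")
```

Crude, yes, but it delivers the transparency the managed systems can't: the memory is just files you can read, edit, and delete yourself.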

It sounds crude, but people report that after a month or so of accumulation, these systems start surfacing non-obvious connections. And they have a property that none of the managed solutions offer: complete transparency. You can open the files and see exactly what your AI "knows."

Domain-specific memory

Finally, there are companies building memory for specific modalities. Memories.ai (backed by Susa Ventures and Samsung Next) is building persistent memory for video — connecting events across footage captured days or weeks apart. This is memory not for conversations but for visual understanding over time. MemWal is experimenting with decentralized, blockchain-based storage for agent memory, betting that portability and ownership will matter enough to justify the added complexity.

If you want to go even deeper, there's also a comprehensive academic survey tracking the research side of agent memory.

So where does that leave us?

I count six categories and dozens of serious projects with real funding, real users, and genuine technical depth. The commenter on Mikio's post was right — this is not an empty field.

What strikes me when I look at the whole map, though, is that almost all of this work is answering engineering questions: How do you store memories efficiently? How do you retrieve the right ones? How do you handle contradictions, temporal changes, multi-tenancy? These are important questions and they're getting good answers.

The question I see almost nobody working on is a design question: what should it feel like to be remembered by an AI? When should the AI surface a memory and when should it stay quiet? How do you build trust through memory without crossing into uncanny? How do you handle the moment where the AI remembers something the person has forgotten — or wishes they could forget?

I think that's where Mikio has more to say than I do.


Mikio: Yeah, it's indeed far from underdeveloped; it's a very active area of research. But I think my point still holds that it's an underdeveloped area of AI-based products. Sure, many products have session storage, explicit memory, sometimes context injection, but we're still in the early days when it comes to integrating memory into a product in a way that feels natural and automatic.

And to build on Claude's last paragraph, when to surface a memory and when to keep quiet carries a lot of social nuance. Sometimes, not even disclosing that you remember might be the right move, like when you witnessed a particularly difficult moment someone had, but pretend the next day that nothing happened. Or "what happens in Vegas stays in Vegas." Or "I won't bring it up unless he does." The details of this are not infrastructure, they're UX design.
