Memory Management
Memory management is essential for intelligent agents, enabling them to retain information effectively. Agents require various types of memory, similar to human memory, to function efficiently. This chapter explores memory management, focusing on the immediate (short-term) and persistent (long-term) memory needs of agents.
Memory in agent systems refers to how agents retain and utilize information from past interactions, observations, and learning experiences. Memory enables agents to make informed decisions, maintain conversational context, and improve over time. Agent memory is generally categorized into two primary types.
● Short-Term Memory (Contextual Memory) functions similarly to working memory, holding information currently being processed or recently referenced. For agents utilizing large language models (LLMs), short-term memory primarily resides within the context window. This window contains recent messages, agent replies, tool usage results, and agent reflections from the ongoing interaction, and this recent history informs the LLM's subsequent responses and actions. The context window has a finite capacity, limiting the amount of recent information an agent can directly access. Effective short-term memory management means retaining the most pertinent information within this limited space, for example by summarizing older segments of the conversation or focusing on key details (see the first sketch after this list).
● Long-Term Memory (Persistent Memory) serves as a repository for information agents need to retain across multiple interactions, tasks, or extended periods, analogous to a long-term knowledge base. This typically involves storing data outside the agent's immediate processing environment, often in databases, knowledge graphs, or vector databases. In vector databases, information is transformed into numerical vectors and stored, allowing agents to retrieve data based on semantic similarity rather than exact keyword matches, a process known as semantic search. When an agent requires information from long-term memory, it queries the external storage, retrieves relevant data, and integrates it into the short-term context, combining prior knowledge with the current interaction (see the second sketch after this list).
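The short-term item above mentions summarizing older conversation segments to stay within the context window. Below is a minimal, framework-free sketch of that idea; the summarize() helper and the four-characters-per-token estimate are illustrative placeholders, not a real implementation.

```python
def summarize(turns: list[str]) -> str:
    # Placeholder: a real agent would ask an LLM to compress these turns
    # into a few sentences.
    return "Summary of earlier conversation: " + " | ".join(t[:40] for t in turns)

def estimate_tokens(text: str) -> int:
    return len(text) // 4  # crude heuristic: roughly 4 characters per token

def fit_context(turns: list[str], budget: int) -> list[str]:
    """Keep recent turns verbatim; fold older ones into a summary."""
    kept = list(turns)
    dropped: list[str] = []
    # Drop the oldest turns until the remaining history fits the budget.
    while len(kept) > 1 and sum(estimate_tokens(t) for t in kept) > budget:
        dropped.append(kept.pop(0))
    return ([summarize(dropped)] if dropped else []) + kept

history = [f"Turn {i}: " + "details " * 30 for i in range(10)]
print(fit_context(history, budget=120))
```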
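The long-term item describes the embed-store-retrieve cycle behind semantic search. The toy below makes that cycle runnable using a bag-of-words vector and cosine similarity; a production system would substitute a learned embedding model and a real vector database, which capture meaning beyond exact word overlap.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Bag-of-words "embedding": a stand-in for a learned embedding model.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class VectorMemory:
    """Toy long-term store: embed on write, rank by similarity on read."""

    def __init__(self) -> None:
        self.items: list[tuple[Counter, str]] = []

    def store(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]

memory = VectorMemory()
memory.store("User prefers window seats on flights.")
memory.store("User is allergic to peanuts.")
print(memory.search("allergic to peanuts", k=1))
```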
Effective memory management involves determining what information to retain, establishing methods for efficient storage and retrieval (such as summarization), and determining when to access stored data. The selection of short-term or long-term memory depends on the nature of the information and its required retention period.
Practical Applications & Use Cases
Memory management is vital for agents to track information and perform intelligently over time. This is essential for agents to surpass basic question-answering capabilities. Applications include:
Chatbots and Conversational AI: Maintaining conversation flow relies on short-term memory. Chatbots require remembering prior user inputs to provide coherent responses. Long-term memory enables chatbots to recall user preferences, past issues, or prior discussions, offering personalized and continuous interactions.
Task-Oriented Agents: Agents managing multi-step tasks need short-term memory to track previous steps, current progress, and overall goals. This information might reside in the task's context or temporary storage. Long-term memory is crucial for accessing specific user-related data not in the immediate context.
Personalized Experiences: Agents offering tailored interactions utilize long-term memory to store and retrieve user preferences, past behaviors, and personal information. This allows agents to adapt their responses and suggestions.
Learning and Improvement: Agents can refine their performance by learning from past interactions. Successful strategies, mistakes, and new information are stored in long-term memory, facilitating future adaptations. Reinforcement learning agents store learned strategies or knowledge in this way.
Information Retrieval (RAG): Agents designed for question answering treat a knowledge base as their long-term memory, a pattern commonly implemented as Retrieval Augmented Generation (RAG). The agent retrieves relevant documents or data to ground its responses (a minimal sketch follows this list).
Autonomous Systems: Robots or self-driving cars require memory for maps, routes, object locations, and learned behaviors. This involves short-term memory for immediate surroundings and long-term memory for general environmental knowledge.
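As referenced in the RAG item above, the retrieval step can be sketched in a few lines: score documents against the query, then fold the best matches into the prompt. The sample documents and keyword-overlap scorer are stand-ins for a real corpus and a semantic retriever.

```python
import re

# Illustrative knowledge base; a real agent would query a document store.
DOCS = [
    "Premium plan includes 24/7 phone support.",
    "Password resets are handled at the account settings page.",
    "Refunds are processed within 5 business days.",
]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by keyword overlap with the query.
    q = tokens(query)
    return sorted(DOCS, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def build_prompt(question: str) -> str:
    # Fold the retrieved context into the prompt sent to the model.
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do I reset my password?"))
```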
Memory enables agents to maintain history, learn, personalize interactions, and manage complex, time-dependent problems.
At a Glance
What: Agentic systems need to remember information from past interactions to perform complex tasks and provide coherent experiences. Without a memory mechanism, agents are stateless, unable to maintain conversational context, learn from experience, or personalize responses for users. This fundamentally limits them to simple, one-shot interactions, failing to handle multi-step processes or evolving user needs. The core problem is how to effectively manage both the immediate, temporary information of a single conversation and the vast, persistent knowledge gathered over time.
Why: The standardized solution is to implement a dual-component memory system that distinguishes between short-term and long-term storage. Short-term, contextual memory holds recent interaction data within the LLM's context window to maintain conversational flow. For information that must persist, long-term memory solutions use external databases, often vector stores, for efficient semantic retrieval. Agentic frameworks like the Google ADK provide specific components to manage this, such as Session for the conversation thread and State for its temporary data, while a dedicated MemoryService interfaces with the long-term knowledge base, allowing the agent to retrieve relevant past information and incorporate it into its current context; a hedged sketch of these components follows this overview.
Rule of thumb: Use this pattern when an agent needs to do more than answer a single question. It is essential for agents that must maintain context throughout a conversation, track progress in multi-step tasks, or personalize interactions by recalling user preferences and history. Implement memory management whenever the agent is expected to learn or adapt based on past successes, failures, or newly acquired information.
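The sketch below shows roughly how the ADK components named above fit together. It is hedged: recent google-adk releases expose these service methods as coroutines, earlier releases were synchronous, and exact signatures may differ by version, so treat this as illustrative rather than canonical.

```python
import asyncio
import time

from google.adk.events import Event, EventActions
from google.adk.memory import InMemoryMemoryService
from google.adk.sessions import InMemorySessionService

async def main() -> None:
    session_service = InMemorySessionService()
    memory_service = InMemoryMemoryService()

    # A Session is one conversation thread; its state dict holds temporary
    # data. A "user:" prefix scopes a value to the user across sessions.
    session = await session_service.create_session(
        app_name="support_app",
        user_id="user-123",
        state={"user:name": "Alice"},
    )

    # State is updated by appending events that carry a state_delta,
    # not by mutating session.state directly.
    update = Event(
        invocation_id="inv-1",
        author="system",
        actions=EventActions(state_delta={"task_status": "active"}),
        timestamp=time.time(),
    )
    await session_service.append_event(session, update)

    # A finished session can be folded into long-term memory and searched
    # later, typically from a memory tool wired into the agent.
    await memory_service.add_session_to_memory(session)
    results = await memory_service.search_memory(
        app_name="support_app", user_id="user-123", query="Alice"
    )
    print(results)

asyncio.run(main())
```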
Visual summary

Key Takeaways
To quickly recap the main points about memory management:
Memory is essential for agents to maintain context over time, learn from experience, and personalize interactions.
Conversational AI relies on both short-term memory for immediate context within a single chat and long-term memory for persistent knowledge across multiple sessions.
Short-term (contextual) memory is temporary, bounded by the LLM's context window or by how the framework passes context between steps.
Long-term (persistent) memory retains information across sessions, typically in external storage such as vector databases, and is accessed through search.
Frameworks like the ADK provide dedicated components for memory management: Session (the conversation thread), State (temporary conversational data), and MemoryService (searchable long-term knowledge).
ADK's SessionService manages the full lifecycle of a conversation session, including its history (events) and temporary data (state).
ADK's session.state is a dictionary for temporary conversational data. Prefixes (user:, app:, temp:) indicate where the data belongs and whether it persists.
In ADK, state should be updated via EventActions.state_delta or output_key when appending events, not by mutating the state dictionary directly (see the ADK sketch above).
ADK's MemoryService ingests information into long-term storage and lets agents search it, typically through tools.
LangChain offers practical utilities such as ConversationBufferMemory to automatically inject the history of a single conversation into a prompt, enabling an agent to recall immediate context (a short sketch follows these takeaways).
LangGraph enables advanced, long-term memory by using a store to save and retrieve semantic facts, episodic experiences, or even updatable procedural rules across different user sessions (also sketched below).
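To make the LangChain takeaway concrete, here is a minimal use of ConversationBufferMemory, LangChain's classic (now legacy) conversation-memory class; import paths may differ in newer releases.

```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()

# Each exchange is appended to the buffer...
memory.save_context({"input": "My name is Alice."},
                    {"output": "Nice to meet you, Alice!"})
memory.save_context({"input": "I'm planning a trip to Kyoto."},
                    {"output": "Great choice. When are you going?"})

# ...and the full transcript is exposed as a prompt variable, so the
# next model call can recall everything said in this conversation.
print(memory.load_memory_variables({})["history"])
```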
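And for the LangGraph takeaway, a small sketch of its cross-session store. InMemoryStore is the development backend; the namespace, keys, and values here are illustrative, and similarity search via a query argument additionally requires constructing the store with an embedding index.

```python
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()  # development backend; production stores persist

# Namespaces scope memories, e.g. per user, so facts survive across sessions.
namespace = ("user-123", "facts")

# Save a semantic fact as a key/value pair.
store.put(namespace, "diet", {"text": "User prefers vegetarian recipes."})

# Any later session for this user can read it back...
item = store.get(namespace, "diet")
print(item.value["text"])

# ...or list stored memories. Passing query=... enables similarity search
# when the store is built with an embedding index.
for found in store.search(namespace):
    print(found.key, found.value)
```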
Conclusion
We examined the essential role of memory management in agent systems, distinguishing short-lived conversational context from knowledge that persists over the long term. We discussed how each type of memory is structured and where it is applied in practice, from session state held within the context window to searchable long-term stores, as the foundation for building agents that can remember, learn, and personalize across interactions.