Enable AI agents to remember context across conversations and maintain state over time for coherent long-term interactions.
Memory management enables AI agents to remember past interactions, maintain context across conversations, and build upon previous knowledge. Agents require different types of memory, much like humans, to operate efficiently.
Short-term memory: Similar to working memory, this holds information currently being processed or recently accessed. For agents using LLMs, short-term memory primarily lives within the context window, containing recent messages, agent replies, tool usage results, and reflections.
Limitation: The context window has limited capacity. Even "long context" models merely expand this short-term memory; it remains ephemeral, lost once the session concludes, and reprocessing the full history on every turn is costly and inefficient.
Long-term memory: Acts as a repository for information agents need to retain across interactions, tasks, or extended periods. Data is typically stored outside the agent's immediate processing environment, in databases, knowledge graphs, or vector databases.
Key Feature: Information is converted into numerical vectors and stored, enabling semantic search—retrieving data based on meaning rather than exact keyword matches.
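The retrieval step can be sketched in a few lines. The snippet below uses tiny hand-written vectors purely for illustration; a real system would obtain embeddings from an embedding model and store them in a vector database:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy memory store: (text, embedding) pairs. The vectors are made up
# for illustration; real embeddings come from a model.
memory_store = [
    ("User prefers dark mode", [0.9, 0.1, 0.0]),
    ("User's favorite city is Paris", [0.1, 0.8, 0.2]),
    ("Order #123 was delayed", [0.0, 0.2, 0.9]),
]

def semantic_search(query_vec, store, top_k=1):
    """Return the stored texts closest in meaning to the query vector."""
    ranked = sorted(store,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# A query embedding near the "UI preference" region of the toy space
# retrieves the dark-mode fact even without any keyword overlap.
print(semantic_search([0.85, 0.15, 0.05], memory_store))
```

Because ranking is by vector similarity rather than string matching, a query phrased as "what theme does the user like?" would still land near the stored preference, which is the point of semantic search.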
Conversational agents: Short-term memory maintains conversation flow, while long-term memory recalls user preferences, past issues, and prior discussions for personalized interactions. (Buffer Memory, User Profiles)
Task management: Agents managing multi-step tasks use short-term memory to track progress and goals, while long-term memory accesses user-specific data not in immediate context. (Task State, Progress Tracking)
Personalization: Long-term memory stores and retrieves user preferences, past behaviors, and personal information to adapt responses and suggestions. (User Preferences, Behavioral Data)
Learning and improvement: Agents refine performance by learning from past interactions; successful strategies, mistakes, and new information are stored for future adaptations. (Reinforcement Learning, Strategy Storage)
Knowledge retrieval: Agents access a knowledge base (long-term memory) to retrieve relevant documents or data, often implemented within Retrieval-Augmented Generation. (Vector Search, Knowledge Base)
Autonomous systems: Robots or self-driving cars require memory for maps, routes, object locations, and learned behaviors, using short-term memory for immediate surroundings and long-term memory for environmental knowledge. (Spatial Memory, Learned Behaviors)
Semantic memory: Retains specific facts and concepts, such as user preferences or domain knowledge. Used to ground agent responses for more personalized and relevant interactions.
Implementation: Continuously updated user "profile" (JSON document) or collection of individual factual documents.
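A minimal sketch of the profile-document approach, assuming facts arrive as key-value pairs already extracted from conversation (the `update_profile` helper and field names are hypothetical):

```python
import json

# Hypothetical semantic-memory store: a single JSON "profile" document
# of extracted user facts, merged as new facts arrive.
def update_profile(profile: dict, new_facts: dict) -> dict:
    """Return a new profile with newly extracted facts merged in;
    later facts overwrite earlier values for the same key."""
    merged = dict(profile)
    merged.update(new_facts)
    return merged

profile = {"name": "Ada", "preferred_language": "English"}
profile = update_profile(profile,
                         {"preferred_language": "French", "timezone": "CET"})
print(json.dumps(profile, indent=2))
```

At response time, this document (or the most relevant facts from it) would be injected into the prompt to ground the agent's answer.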
Episodic memory: Recalls past events or actions. For AI agents, episodic memory captures how tasks were accomplished, often implemented through few-shot example prompting.
Implementation: Learning from past successful interaction sequences to perform tasks correctly.
Procedural memory: The memory of how to perform tasks, i.e. the agent's core instructions and behaviors, often contained in its system prompt. Agents can modify their own prompts to adapt and improve.
Technique: "Reflection" where an agent is prompted with current instructions and recent interactions, then asked to refine its own instructions.
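A minimal sketch of the reflection loop, where `call_llm` is a stand-in for any chat-completion client (it is not a real library function):

```python
# Sketch of the "reflection" technique: the model is shown its current
# instructions plus recent interactions and asked to revise the instructions.
def reflect(instructions: str, recent_interactions: list, call_llm) -> str:
    """Return revised agent instructions produced by the model."""
    transcript = "\n".join(recent_interactions)
    prompt = (
        "You maintain an AI agent's system prompt.\n\n"
        f"Current instructions:\n{instructions}\n\n"
        f"Recent interactions:\n{transcript}\n\n"
        "Rewrite the instructions to correct any mistakes visible above. "
        "Return only the revised instructions."
    )
    return call_llm(prompt)

# With a real client, the returned text would replace the system prompt.
revised = reflect("Always answer in French.",
                  ["user: please answer in English", "agent: Bonjour!"],
                  call_llm=lambda p: p)  # echo stub for illustration only
```

The returned text becomes the agent's new procedural memory; running this periodically lets the prompt absorb lessons from recent sessions.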
The Google Agent Developer Kit (ADK) offers a structured method for managing context and memory through three core concepts:
Session: An individual chat thread that logs the messages and actions (Events) for that specific interaction and also stores temporary data (State) relevant to that conversation.
Components: Unique identifiers (id, app_name, user_id), chronological record of events, state storage, and last_update_time timestamp.
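The components listed above suggest a record shaped roughly like the following dataclass. This is only an illustrative sketch using the field names from the text, not the actual `google.adk` Session class:

```python
import time
from dataclasses import dataclass, field

# Illustrative shape of an ADK-style Session record (sketch, not the real API).
@dataclass
class Session:
    id: str
    app_name: str
    user_id: str
    events: list = field(default_factory=list)   # chronological record of events
    state: dict = field(default_factory=dict)    # key-value conversation data
    last_update_time: float = field(default_factory=time.time)

    def append_event(self, event) -> None:
        """Log an event and refresh the update timestamp."""
        self.events.append(event)
        self.last_update_time = time.time()

s = Session(id="s1", app_name="support_bot", user_id="u42")
s.append_event({"author": "user", "text": "Hi"})
```

The (`app_name`, `user_id`, `id`) triple uniquely identifies a thread, which is what lets a session service store and resume conversations per user.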
State: Data stored within a Session, containing information relevant only to the current, active chat thread. It operates as a dictionary of key-value pairs.
State Prefixes (Scopes):
- user: User-specific data (persists across sessions)
- app: Application-level data (shared across all users)
- temp: Temporary data (cleared after the session)

Memory: A searchable repository of information sourced from various past chats or external sources, serving as a resource for data retrieval beyond the immediate conversation.
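The prefix-based scoping of state keys can be sketched as routing each key to a store matching its declared lifetime. This is an illustrative model of the behavior, not the actual ADK API:

```python
# Illustrative sketch of ADK-style state scoping (not the real ADK classes):
# a key's prefix decides which backing store it lands in, and therefore
# how long it survives.
class ScopedState:
    def __init__(self, user_store: dict, app_store: dict):
        self.user = user_store  # "user:" keys persist across a user's sessions
        self.app = app_store    # "app:" keys are shared across all users
        self.session = {}       # unprefixed keys live only in this session
        self.temp = {}          # "temp:" keys are cleared after the session

    def _store_for(self, key: str) -> dict:
        if key.startswith("user:"):
            return self.user
        if key.startswith("app:"):
            return self.app
        if key.startswith("temp:"):
            return self.temp
        return self.session

    def __setitem__(self, key, value):
        self._store_for(key)[key] = value

    def __getitem__(self, key):
        return self._store_for(key)[key]

user_store, app_store = {}, {}
state = ScopedState(user_store, app_store)
state["user:preferred_language"] = "French"
state["temp:scratch"] = "intermediate result"
# "user:" data survives in user_store even after this session's
# ScopedState object (and its temp/session stores) is discarded.
```

The design point is that agent code reads and writes one dictionary-like object, while persistence is decided entirely by the key prefix.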
Key Feature: Semantic search over vector embeddings enables retrieving data based on meaning rather than exact keyword matches.
Memory Bank, a managed service in the Vertex AI Agent Engine, provides agents with persistent, long-term memory. The service uses Gemini models to asynchronously analyze conversation histories to extract key facts and user preferences.
Integration: Works with Google ADK out of the box; LangGraph and CrewAI are also supported through direct API calls.
Short-term memory: context window for immediate processing
Long-term memory: persistent storage across sessions
Semantic search: vector-based retrieval by meaning
State: session-specific temporary data