Enable AI agents to remember context across conversations and maintain state over time for coherent long-term interactions.
Memory management enables AI agents to remember past interactions, maintain context across conversations, and build upon previous knowledge. Agents require different types of memory, much like humans, to operate efficiently.
Short-term memory: Similar to working memory, this holds information currently being processed or recently accessed. For agents using LLMs, short-term memory primarily lives within the context window, containing recent messages, agent replies, tool usage results, and reflections.
Limitation: The context window has limited capacity. Even "long context" models merely expand this short-term memory; it remains ephemeral, lost once the session concludes, and reprocessing the full history on every turn is costly and inefficient.
Long-term memory: Acts as a repository for information agents need to retain across interactions, tasks, or extended periods. Data is typically stored outside the agent's immediate processing environment, in databases, knowledge graphs, or vector databases.
Key Feature: Information is converted into numerical vectors and stored, enabling semantic search—retrieving data based on meaning rather than exact keyword matches.
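The retrieval step can be sketched in a few lines. The snippet below uses tiny hand-written vectors purely for illustration; a real system would obtain embeddings from an embedding model and store them in a vector database:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy memory store: (text, embedding) pairs. The vectors are made up
# for illustration; real embeddings come from a model.
memory_store = [
    ("User prefers dark mode", [0.9, 0.1, 0.0]),
    ("User's favorite city is Paris", [0.1, 0.8, 0.2]),
    ("Order #123 was delayed", [0.0, 0.2, 0.9]),
]

def semantic_search(query_vec, store, top_k=1):
    """Return the stored texts closest in meaning to the query vector."""
    ranked = sorted(store,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# A query embedding near the "UI preference" region of the toy space
# retrieves the dark-mode fact even without any keyword overlap.
print(semantic_search([0.85, 0.15, 0.05], memory_store))
```

Because ranking is by vector similarity rather than string matching, a query phrased as "what theme does the user like?" would still land near the stored preference, which is the point of semantic search.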
Conversational agents: Short-term memory maintains conversation flow, while long-term memory recalls user preferences, past issues, and prior discussions for personalized interactions. (Buffer Memory, User Profiles)
Task management: Agents managing multi-step tasks use short-term memory to track progress and goals, while long-term memory accesses user-specific data not in immediate context. (Task State, Progress Tracking)
Personalization: Long-term memory stores and retrieves user preferences, past behaviors, and personal information to adapt responses and suggestions. (User Preferences, Behavioral Data)
Learning and improvement: Agents refine performance by learning from past interactions; successful strategies, mistakes, and new information are stored for future adaptations. (Reinforcement Learning, Strategy Storage)
Knowledge retrieval: Agents access a knowledge base (long-term memory) to retrieve relevant documents or data, often implemented within Retrieval-Augmented Generation. (Vector Search, Knowledge Base)
Autonomous systems: Robots or self-driving cars require memory for maps, routes, object locations, and learned behaviors, using short-term memory for immediate surroundings and long-term memory for environmental knowledge. (Spatial Memory, Learned Behaviors)
Semantic memory: Retains specific facts and concepts, such as user preferences or domain knowledge. Used to ground agent responses for more personalized and relevant interactions.
Implementation: Continuously updated user "profile" (JSON document) or collection of individual factual documents.
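A minimal sketch of the profile-document approach, assuming facts arrive as key-value pairs already extracted from conversation (the `update_profile` helper and field names are hypothetical):

```python
import json

# Hypothetical semantic-memory store: a single JSON "profile" document
# of extracted user facts, merged as new facts arrive.
def update_profile(profile: dict, new_facts: dict) -> dict:
    """Return a new profile with newly extracted facts merged in;
    later facts overwrite earlier values for the same key."""
    merged = dict(profile)
    merged.update(new_facts)
    return merged

profile = {"name": "Ada", "preferred_language": "English"}
profile = update_profile(profile,
                         {"preferred_language": "French", "timezone": "CET"})
print(json.dumps(profile, indent=2))
```

At response time, this document (or the most relevant facts from it) would be injected into the prompt to ground the agent's answer.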
Episodic memory: Recalls past events or actions. For AI agents, episodic memory captures how tasks were accomplished, often implemented through few-shot example prompting.
Implementation: Learning from past successful interaction sequences to perform tasks correctly.
Procedural memory: The memory of how to perform tasks, i.e. the agent's core instructions and behaviors, often contained in its system prompt. Agents can modify their own prompts to adapt and improve.
Technique: "Reflection" where an agent is prompted with current instructions and recent interactions, then asked to refine its own instructions.
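A minimal sketch of the reflection loop, where `call_llm` is a stand-in for any chat-completion client (it is not a real library function):

```python
# Sketch of the "reflection" technique: the model is shown its current
# instructions plus recent interactions and asked to revise the instructions.
def reflect(instructions: str, recent_interactions: list, call_llm) -> str:
    """Return revised agent instructions produced by the model."""
    transcript = "\n".join(recent_interactions)
    prompt = (
        "You maintain an AI agent's system prompt.\n\n"
        f"Current instructions:\n{instructions}\n\n"
        f"Recent interactions:\n{transcript}\n\n"
        "Rewrite the instructions to correct any mistakes visible above. "
        "Return only the revised instructions."
    )
    return call_llm(prompt)

# With a real client, the returned text would replace the system prompt.
revised = reflect("Always answer in French.",
                  ["user: please answer in English", "agent: Bonjour!"],
                  call_llm=lambda p: p)  # echo stub for illustration only
```

The returned text becomes the agent's new procedural memory; running this periodically lets the prompt absorb lessons from recent sessions.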
The Google Agent Developer Kit (ADK) offers a structured method for managing context and memory through three core concepts:
Session: An individual chat thread that logs the messages and actions (Events) for that specific interaction and also stores temporary data (State) relevant to that conversation.
Components: Unique identifiers (id, app_name, user_id), chronological record of events, state storage, and last_update_time timestamp.
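The components listed above suggest a record shaped roughly like the following dataclass. This is only an illustrative sketch using the field names from the text, not the actual `google.adk` Session class:

```python
import time
from dataclasses import dataclass, field

# Illustrative shape of an ADK-style Session record (sketch, not the real API).
@dataclass
class Session:
    id: str
    app_name: str
    user_id: str
    events: list = field(default_factory=list)   # chronological record of events
    state: dict = field(default_factory=dict)    # key-value conversation data
    last_update_time: float = field(default_factory=time.time)

    def append_event(self, event) -> None:
        """Log an event and refresh the update timestamp."""
        self.events.append(event)
        self.last_update_time = time.time()

s = Session(id="s1", app_name="support_bot", user_id="u42")
s.append_event({"author": "user", "text": "Hi"})
```

The (`app_name`, `user_id`, `id`) triple uniquely identifies a thread, which is what lets a session service store and resume conversations per user.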
State: Data stored within a Session, containing information relevant only to the current, active chat thread. It operates as a dictionary of key-value pairs.
State Prefixes (Scopes):
- user: User-specific data (persists across sessions)
- app: Application-level data (shared across all users)
- temp: Temporary data (cleared after the session)

Memory: A searchable repository of information sourced from various past chats or external sources, serving as a resource for data retrieval beyond the immediate conversation.
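The prefix-based scoping of state keys can be sketched as routing each key to a store matching its declared lifetime. This is an illustrative model of the behavior, not the actual ADK API:

```python
# Illustrative sketch of ADK-style state scoping (not the real ADK classes):
# a key's prefix decides which backing store it lands in, and therefore
# how long it survives.
class ScopedState:
    def __init__(self, user_store: dict, app_store: dict):
        self.user = user_store  # "user:" keys persist across a user's sessions
        self.app = app_store    # "app:" keys are shared across all users
        self.session = {}       # unprefixed keys live only in this session
        self.temp = {}          # "temp:" keys are cleared after the session

    def _store_for(self, key: str) -> dict:
        if key.startswith("user:"):
            return self.user
        if key.startswith("app:"):
            return self.app
        if key.startswith("temp:"):
            return self.temp
        return self.session

    def __setitem__(self, key, value):
        self._store_for(key)[key] = value

    def __getitem__(self, key):
        return self._store_for(key)[key]

user_store, app_store = {}, {}
state = ScopedState(user_store, app_store)
state["user:preferred_language"] = "French"
state["temp:scratch"] = "intermediate result"
# "user:" data survives in user_store even after this session's
# ScopedState object (and its temp/session stores) is discarded.
```

The design point is that agent code reads and writes one dictionary-like object, while persistence is decided entirely by the key prefix.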
Key Feature: Semantic search over vector embeddings enables retrieving data based on meaning rather than exact keyword matches.
Memory Bank, a managed service in the Vertex AI Agent Engine, provides agents with persistent, long-term memory. The service uses Gemini models to asynchronously analyze conversation histories to extract key facts and user preferences.
Integration: Works with Google ADK out of the box; LangGraph and CrewAI are also supported through direct API calls.
Short-term memory: context window for immediate processing
Long-term memory: persistent storage across sessions
Semantic search: vector-based retrieval by meaning
State: session-specific temporary data