Sessions
The Session API provides persistent conversation storage for agents, with tree-structured messages (inspired by Pi), context blocks, compaction, full-text search, and AI-controllable tools. It runs entirely on Durable Object SQLite, so no external database is needed.

    import { Agent } from "agents";
    import { Session } from "agents/experimental/memory/session";

    class MyAgent extends Agent {
      session = Session.create(this)
        .withContext("soul", {
          provider: { get: async () => "You are a helpful assistant." },
        })
        .withContext("memory", {
          description: "Learned facts about the user",
          maxTokens: 1100,
        })
        .withCachedPrompt();

      async onMessage(message: unknown) {
        await this.session.appendMessage(message);
        const history = this.session.getHistory();
        const system = await this.session.freezeSystemPrompt();
        const tools = await this.session.tools();
        // Pass history, system prompt, and tools to your LLM
      }
    }

Use `Session.create(agent)` with a chainable builder. Context blocks added without an explicit `provider` option are auto-wired to SQLite.

    const session = Session.create(this)
      .withContext("soul", { provider: { get: async () => "You are helpful." } })
      .withContext("memory", { description: "Learned facts", maxTokens: 1100 })
      .withCachedPrompt()
      .onCompaction(myCompactFn)
      .compactAfter(100_000);

For full control over providers:

    import {
      Session,
      AgentSessionProvider,
      AgentContextProvider,
    } from "agents/experimental/memory/session";

    const session = new Session(new AgentSessionProvider(this), {
      context: [
        {
          label: "memory",
          description: "Notes",
          maxTokens: 500,
          provider: new AgentContextProvider(this, "memory"),
        },
        { label: "soul", provider: { get: async () => "You are helpful." } },
      ],
    });

All builder methods return `this` for chaining. Order does not matter because providers are resolved lazily on first use.

| Method | Description |
|---|---|
| `Session.create(agent)` | Static factory. `agent` is any object with a `sql` tagged template method (your Agent or Durable Object). |
| `.forSession(sessionId)` | Namespace this session by ID. Required for multi-session isolation when not using `SessionManager`. |
| `.withContext(label, options?)` | Add a context block. Refer to Context blocks. |
| `.withCachedPrompt(provider?)` | Enable system prompt persistence. The prompt is frozen on first use and survives hibernation and eviction. |
| `.onCompaction(fn)` | Register a compaction function. Refer to Compaction. |
| `.compactAfter(tokenThreshold)` | Auto-compact when the estimated token count exceeds the threshold. Requires `.onCompaction()`. |

Messages use the SessionMessage type — a minimal shape with id, role, parts, and optional createdAt. The Vercel AI SDK's UIMessage is structurally compatible and can be passed directly. The session stores messages in a tree structure via parent_id, enabling branching conversations.

    // Append: auto-parents to the latest leaf unless parentId is specified
    await session.appendMessage(message);
    await session.appendMessage(message, parentId);

    // Update an existing message (matched by message.id)
    session.updateMessage(message);

    // Delete specific messages
    session.deleteMessages(["msg-1", "msg-2"]);

    // Clear all messages and skill state
    session.clearMessages();

    // Linear history from root to the latest leaf
    const messages = session.getHistory();

    // History to a specific leaf (for branching)
    const branch = session.getHistory(leafId);

    // Get a single message
    const msg = session.getMessage("msg-1");

    // Get the newest message
    const latest = session.getLatestLeaf();

    // Count messages in path
    const count = session.getPathLength();

Messages form a tree. When you call `appendMessage` with a `parentId` that already has children, you create a branch. Use `getBranches()` to get all child messages branching from a given point:

    // Get all child messages that branch from messageId
    const branches = session.getBranches(messageId);

This powers features like response regeneration: pass the user message ID to get both the original and regenerated responses. `getHistory(leafId)` walks the chosen path.

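To make the tree model concrete, here is a minimal in-memory sketch of how a parent-linked store can produce a linear history and list branches. It is not the library's implementation: `MsgNode`, `TreeStore`, and the method bodies are hypothetical and only mirror the method names described above.

```typescript
// Hypothetical in-memory model of a parent-linked message tree.
// Illustrative only: the real Session keeps its rows in SQLite.
type MsgNode = { id: string; parentId: string | null; text: string };

class TreeStore {
  private nodes = new Map<string, MsgNode>();
  private latest: string | null = null;

  // Append under parentId, or under the most recently appended node.
  append(id: string, text: string, parentId?: string): void {
    const parent = parentId ?? this.latest;
    this.nodes.set(id, { id, parentId: parent, text });
    this.latest = id;
  }

  // Walk parent links from the leaf back to the root, then reverse.
  getHistory(leafId?: string): MsgNode[] {
    const path: MsgNode[] = [];
    let cursor: string | null = leafId ?? this.latest;
    while (cursor !== null) {
      const node = this.nodes.get(cursor);
      if (!node) break;
      path.push(node);
      cursor = node.parentId;
    }
    return path.reverse();
  }

  // All direct children of a message, in insertion order.
  getBranches(messageId: string): MsgNode[] {
    return Array.from(this.nodes.values()).filter((n) => n.parentId === messageId);
  }
}

const treeStore = new TreeStore();
treeStore.append("u1", "What is a Durable Object?");
treeStore.append("a1", "First answer");
treeStore.append("a2", "Regenerated answer", "u1"); // second child of u1: a branch

const branchIds = treeStore.getBranches("u1").map((n) => n.id);
const originalPath = treeStore.getHistory("a1").map((n) => n.id);
const regeneratedPath = treeStore.getHistory("a2").map((n) => n.id);
```

In this sketch both answers remain stored; choosing a leaf ID simply selects which path from the root is returned.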
Full-text search over the conversation history using SQLite FTS5:

    const results = session.search("deployment Friday", { limit: 10 });
    // Returns: Array<{ id, role, content, createdAt? }>

Search uses Porter stemming and Unicode tokenization, and covers all messages in the session.

Context blocks are persistent key-value sections injected into the system prompt. Each block has a label, optional description, and a provider that determines its behavior.
There are four provider types, detected by duck-typing:
| Provider | Interface | Behavior | AI tool |
|---|---|---|---|
| ContextProvider | get() | Read-only block in system prompt | — |
| WritableContextProvider | get() + set() | Writable via AI | set_context |
| SkillProvider | get() + load() + set?() | On-demand keyed documents. get() returns a metadata listing; load(key) fetches full content. | load_context, unload_context, set_context |
| SearchProvider | get() + search() + set?() | Full-text searchable entries. get() returns a summary; search(query) runs FTS5. | search_context, set_context |
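
Duck-typing here means the provider kind is inferred from which methods an object has. The sketch below shows one way that inference could work, based only on the interfaces in the table; `classifyProvider` and `ProviderKind` are hypothetical names, and the library's actual detection logic may differ.

```typescript
// Hypothetical classifier for the four provider shapes in the table above.
type ProviderKind = "readonly" | "writable" | "skill" | "search";

function classifyProvider(provider: Record<string, unknown>): ProviderKind {
  const has = (method: string): boolean => typeof provider[method] === "function";
  // load() and search() are checked before set(), because skill and
  // search providers may also expose an optional set().
  if (has("load")) return "skill";
  if (has("search")) return "search";
  if (has("set")) return "writable";
  return "readonly";
}

const readonlyKind = classifyProvider({ get: async () => "You are helpful." });
const writableKind = classifyProvider({
  get: async () => "",
  set: async (_content: string) => {},
});
const skillKind = classifyProvider({
  get: async () => "- guide: User Guide",
  load: async (_key: string) => "full document",
  set: async (_key: string, _content: string) => {},
});
const searchKind = classifyProvider({
  get: async () => "0 entries indexed",
  search: async (_query: string) => "",
});
```

The check order matters in this sketch: testing for `set()` first would misclassify a skill provider that happens to be writable.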

`AgentContextProvider`: SQLite-backed writable context. This is the default when using the builder without an explicit provider.

    import { AgentContextProvider } from "agents/experimental/memory/session";

    new AgentContextProvider(this, "memory");

`R2SkillProvider`: Cloudflare R2 bucket for on-demand document loading. Skills are listed in the system prompt as metadata; the model loads full content on demand via `load_context`.

    import { R2SkillProvider } from "agents/experimental/memory/session";

    Session.create(this).withContext("skills", {
      provider: new R2SkillProvider(env.SKILLS_BUCKET, { prefix: "skills/" }),
    });

`AgentSearchProvider`: SQLite FTS5 searchable context. Entries are indexed and searchable by the model via `search_context`.

    import { AgentSearchProvider } from "agents/experimental/memory/session";

    Session.create(this).withContext("knowledge", {
      description: "Searchable knowledge base",
      provider: new AgentSearchProvider(this),
    });

Blocks can be added and removed dynamically after initialization:

    // Add a new block (auto-wires to SQLite if no provider given)
    await session.addContext("extension-notes", {
      description: "From extension X",
      maxTokens: 500,
    });

    // Remove it
    session.removeContext("extension-notes");

    // Rebuild the system prompt to reflect changes
    await session.refreshSystemPrompt();

    // Read a single block
    const block = session.getContextBlock("memory");
    // { label, description?, content, tokens, maxTokens?, writable, isSkill, isSearchable }

    // Read all blocks
    const blocks = session.getContextBlocks();

    // Replace content entirely
    await session.replaceContextBlock("memory", "User likes coffee.");

    // Append content
    await session.appendContextBlock("memory", "\nUser prefers dark roast.");

The system prompt is built from all context blocks with headers and metadata:

    ══════════════════════════════════════════════
    SOUL (Identity) [readonly]
    ══════════════════════════════════════════════
    You are a helpful assistant.

    ══════════════════════════════════════════════
    MEMORY (Learned facts) [45% — 495/1100 tokens]
    ══════════════════════════════════════════════
    User likes coffee.
    User prefers dark roast.

The prompt can be frozen or refreshed:

    // Freeze: first call renders and persists; subsequent calls return cached value
    const prompt = await session.freezeSystemPrompt();

    // Refresh: re-render from current block state and persist
    const updated = await session.refreshSystemPrompt();

The frozen prompt survives Durable Object hibernation and eviction when `withCachedPrompt()` is enabled.
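
The difference between freezing and refreshing is easiest to see in a small model. The sketch below renders blocks in a layout that mimics the example above and caches the result until an explicit refresh. `PromptBlock`, `renderBlock`, and `CachedPrompt` are hypothetical; the rule width, the rounding, and the in-memory cache are assumptions, whereas the real Session persists the frozen prompt.

```typescript
// Hypothetical renderer and cache; not the library's implementation.
type PromptBlock = {
  label: string;
  description?: string;
  content: string;
  tokens: number;
  maxTokens?: number;
  writable: boolean;
};

const RULE = "\u2550".repeat(46); // horizontal rule; the width is an assumption

function renderBlock(block: PromptBlock): string {
  const title = block.description
    ? `${block.label.toUpperCase()} (${block.description})`
    : block.label.toUpperCase();
  let suffix = "";
  if (!block.writable) {
    suffix = " [readonly]";
  } else if (block.maxTokens) {
    const pct = Math.round((block.tokens / block.maxTokens) * 100);
    suffix = ` [${pct}% \u2014 ${block.tokens}/${block.maxTokens} tokens]`;
  }
  return [RULE, `${title}${suffix}`, RULE, block.content].join("\n");
}

class CachedPrompt {
  private cached: string | null = null;
  private getBlocks: () => PromptBlock[];

  constructor(getBlocks: () => PromptBlock[]) {
    this.getBlocks = getBlocks;
  }

  // First call renders; later calls return the same string.
  freeze(): string {
    if (this.cached === null) this.cached = this.render();
    return this.cached;
  }

  // Always re-render from the current block state.
  refresh(): string {
    this.cached = this.render();
    return this.cached;
  }

  private render(): string {
    return this.getBlocks().map(renderBlock).join("\n\n");
  }
}

const memoryBlock: PromptBlock = {
  label: "memory",
  description: "Learned facts",
  content: "User likes coffee.",
  tokens: 495,
  maxTokens: 1100,
  writable: true,
};
const cachedPrompt = new CachedPrompt(() => [memoryBlock]);

const frozen = cachedPrompt.freeze();
memoryBlock.content += "\nUser prefers dark roast.";
const stillFrozen = cachedPrompt.freeze(); // unchanged despite the edit
const refreshed = cachedPrompt.refresh(); // picks up the edit
```

Keeping the prompt stable between refreshes is what makes it cache-friendly: block edits only reach the model once the prompt is explicitly rebuilt.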
Session automatically generates tools based on the provider types of your context blocks. Pass these to your LLM alongside your own tools.

    const tools = await session.tools();
    const allTools = { ...tools, ...myTools };

The generated tools are:

- `set_context`: Generated when any writable block exists. Writes to regular blocks, skill blocks (keyed), or search blocks (keyed). Enforces `maxTokens` limits.
- `load_context`: Generated when any skill block exists. Loads full content by key from a `SkillProvider`.
- `unload_context`: Generated alongside `load_context`. Frees context space by unloading a previously loaded skill. The skill remains available for re-loading.
- `search_context`: Generated when any search block exists. Runs a full-text search within a searchable context block and returns the top 10 results by FTS5 rank.
- `session_search`: Available on `SessionManager` only. Searches across all sessions.

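
As a rough illustration of how the tool set follows from the blocks, the sketch below derives tool names from the `writable`, `isSkill`, and `isSearchable` flags that `getContextBlock()` reports. `generatedToolNames` is a hypothetical helper, not a library function. The second half shows a plain JavaScript fact worth remembering when merging: with object spread, later keys win.

```typescript
// Hypothetical helper: which tool names a set of blocks would produce.
type BlockFlags = { writable: boolean; isSkill: boolean; isSearchable: boolean };

function generatedToolNames(blocks: BlockFlags[]): string[] {
  const names: string[] = [];
  if (blocks.some((b) => b.writable)) names.push("set_context");
  if (blocks.some((b) => b.isSkill)) names.push("load_context", "unload_context");
  if (blocks.some((b) => b.isSearchable)) names.push("search_context");
  return names;
}

const readOnlyOnly = generatedToolNames([
  { writable: false, isSkill: false, isSearchable: false },
]);
const memoryAndSkills = generatedToolNames([
  { writable: true, isSkill: false, isSearchable: false },
  { writable: false, isSkill: true, isSearchable: false },
]);

// Spread order decides precedence: myTools overrides a generated tool
// that shares its name.
const sessionTools = { set_context: "generated" };
const myTools = { set_context: "custom", get_weather: "custom" };
const allTools = { ...sessionTools, ...myTools };
```

If the generated tools should take precedence instead, reverse the spread order.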
Compaction summarizes older messages to keep conversations within token limits. Original messages are preserved in SQLite — the summary is a non-destructive overlay applied at read time.

    import { generateText } from "ai";
    import { createCompactFunction } from "agents/experimental/memory/utils/compaction-helpers";

    const session = Session.create(this)
      .withContext("memory", { maxTokens: 1100 })
      .onCompaction(
        createCompactFunction({
          summarize: (prompt) =>
            generateText({ model: myModel, prompt }).then((r) => r.text),
          protectHead: 3,
          tailTokenBudget: 20000,
          minTailMessages: 2,
        }),
      )
      .compactAfter(100_000);

How compaction works:

- Protect head: the first N messages are never compacted (default 3)
- Protect tail: walk backward from the end, accumulating tokens up to a budget (default 20K tokens)
- Align boundaries: shift boundaries to avoid splitting tool call/result pairs
- Summarize middle: send the middle section to an LLM with a structured format (Topic, Key Points, Current State, Open Items)
- Store overlay: saved in the `assistant_compactions` table, keyed by `fromMessageId` and `toMessageId`
- Iterative: on subsequent compactions, the existing summary is passed to the LLM to update rather than replace

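
The head and tail protection steps can be sketched as a pure function that splits a message list into three parts. This is a simplified model under stated assumptions: `planCompaction` and its types are hypothetical, tokens are estimated as one per four characters, and boundary alignment for tool call/result pairs is omitted.

```typescript
// Hypothetical planner for the head / middle / tail split.
type CompactMsg = { id: string; text: string };
type CompactPlan = { head: CompactMsg[]; middle: CompactMsg[]; tail: CompactMsg[] };

// Assumed estimate: roughly one token per four characters.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function planCompaction(
  messages: CompactMsg[],
  opts: { protectHead: number; tailTokenBudget: number; minTailMessages: number },
): CompactPlan {
  const headEnd = Math.min(opts.protectHead, messages.length);

  // Walk backward from the end, keeping messages while the budget allows,
  // but always keep at least minTailMessages.
  let tailStart = messages.length;
  let used = 0;
  while (tailStart > headEnd) {
    const candidate = messages[tailStart - 1];
    if (!candidate) break;
    const cost = estimateTokens(candidate.text);
    const kept = messages.length - tailStart;
    if (kept >= opts.minTailMessages && used + cost > opts.tailTokenBudget) break;
    used += cost;
    tailStart -= 1;
  }

  return {
    head: messages.slice(0, headEnd),
    middle: messages.slice(headEnd, tailStart), // the span an LLM would summarize
    tail: messages.slice(tailStart),
  };
}

// Ten messages of 40 characters each, so 10 estimated tokens apiece.
const demoMessages: CompactMsg[] = Array.from({ length: 10 }, (_, i) => ({
  id: `m${i}`,
  text: "x".repeat(40),
}));
const demoPlan = planCompaction(demoMessages, {
  protectHead: 3,
  tailTokenBudget: 25,
  minTailMessages: 2,
});
```

With a 25-token budget only two 10-token messages fit in the tail, so in this run the first three and last two messages are protected and the five in between form the middle.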
When getHistory() is called, compaction overlays are applied transparently — the compacted range is replaced by a synthetic summary message.
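
A non-destructive overlay can be pictured as a read-time transformation over the stored path. The sketch below is illustrative only: `applyOverlay`, the synthetic message ID format, and the `assistant` role are assumptions, not the library's behavior.

```typescript
// Hypothetical read-time overlay: swap a stored range for one summary message.
type StoredMsg = { id: string; role: string; text: string };
type SummaryOverlay = { summary: string; fromMessageId: string; toMessageId: string };

function applyOverlay(path: StoredMsg[], overlay: SummaryOverlay): StoredMsg[] {
  const from = path.findIndex((m) => m.id === overlay.fromMessageId);
  const to = path.findIndex((m) => m.id === overlay.toMessageId);
  if (from === -1 || to === -1 || to < from) return path; // range is not on this path

  const synthetic: StoredMsg = {
    id: `compaction:${overlay.fromMessageId}:${overlay.toMessageId}`, // assumed ID format
    role: "assistant", // assumed role
    text: overlay.summary,
  };
  // Build a new array; the stored messages are left untouched.
  return [...path.slice(0, from), synthetic, ...path.slice(to + 1)];
}

const storedPath: StoredMsg[] = ["m1", "m2", "m3", "m4", "m5"].map((id) => ({
  id,
  role: "user",
  text: `message ${id}`,
}));
const compactedView = applyOverlay(storedPath, {
  summary: "Summary of m2 through m4",
  fromMessageId: "m2",
  toMessageId: "m4",
});
```

Because the function returns a new array, removing an overlay in this model would restore the full history without any data migration.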

    const result = await session.compact();

    // Or manage overlays directly
    session.addCompaction("Summary of messages 1-50", "msg-1", "msg-50");
    const overlays = session.getCompactions();

When `.compactAfter(threshold)` is set, `appendMessage()` checks the estimated token count after each write. If it exceeds the threshold, `compact()` is called automatically. Auto-compaction failure is non-fatal because the message is already saved.
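
The ordering described above (write first, then attempt compaction) is what makes a failed compaction harmless. A minimal sketch, assuming a four-characters-per-token estimate; `AutoCompactingLog` is a hypothetical class, not part of the Session API.

```typescript
// Hypothetical append wrapper showing the "save first, compact afterwards" order.
type AutoMsg = { id: string; text: string };

class AutoCompactingLog {
  readonly saved: AutoMsg[] = [];
  compactionErrors = 0;
  private threshold: number;
  private compactFn: () => Promise<void>;

  constructor(threshold: number, compactFn: () => Promise<void>) {
    this.threshold = threshold;
    this.compactFn = compactFn;
  }

  private estimatedTokens(): number {
    return this.saved.reduce((sum, m) => sum + Math.ceil(m.text.length / 4), 0);
  }

  async append(message: AutoMsg): Promise<void> {
    this.saved.push(message); // the write happens before any compaction attempt
    if (this.estimatedTokens() <= this.threshold) return;
    try {
      await this.compactFn();
    } catch {
      this.compactionErrors += 1; // non-fatal: the message is already saved
    }
  }
}

// A compaction function that always fails, to show the message still lands.
const autoCompactDemo = (async () => {
  const log = new AutoCompactingLog(5, async () => {
    throw new Error("summarizer unavailable");
  });
  await log.append({ id: "m1", text: "short" });
  await log.append({ id: "m2", text: "a much longer message" });
  return { saved: log.saved.length, errors: log.compactionErrors };
})();
```

In this model a failed compaction simply leaves the history uncompacted; the next append crosses the threshold again and retries.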
SessionManager is a registry for multiple named sessions within a single Durable Object. It provides lifecycle management, convenience methods, and cross-session search.

    import { SessionManager } from "agents/experimental/memory/session";

    const manager = SessionManager.create(this)
      .withContext("soul", { provider: { get: async () => "You are helpful." } })
      .withContext("memory", { description: "Learned facts", maxTokens: 1100 })
      .withCachedPrompt()
      .onCompaction(myCompactFn)
      .compactAfter(100_000)
      .withSearchableHistory("history");

Context blocks, prompt caching, and compaction settings are propagated to all sessions created through the manager. Provider keys are automatically namespaced by session ID.

| Method | Description |
|---|---|
| `SessionManager.create(agent)` | Static factory. |
| `.withContext(label, options?)` | Add a context block template for all sessions. |
| `.withCachedPrompt(provider?)` | Enable prompt persistence for all sessions. |
| `.onCompaction(fn)` | Register a compaction function for all sessions. |
| `.compactAfter(tokenThreshold)` | Auto-compact threshold for all sessions. |
| `.withSearchableHistory(label)` | Add a cross-session searchable history block. The model can search past conversations from any session. |

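
One way to picture a manager that propagates shared settings is a registry that holds a single template and builds a per-ID instance on first access. The sketch below is a toy model: `LazyRegistry`, its types, and the `label_sessionId` key format are all hypothetical, and the library's actual namespacing scheme may differ.

```typescript
// Hypothetical registry: one shared template, one cached instance per session ID.
type SessionTemplate = { contextLabels: string[]; compactAfter: number };
type ManagedSession = { sessionId: string; providerKeys: string[]; compactAfter: number };

class LazyRegistry {
  private cache = new Map<string, ManagedSession>();
  private template: SessionTemplate;

  constructor(template: SessionTemplate) {
    this.template = template;
  }

  getSession(sessionId: string): ManagedSession {
    let existing = this.cache.get(sessionId);
    if (!existing) {
      existing = {
        sessionId,
        // Assumed key format, used here only to show per-session isolation.
        providerKeys: this.template.contextLabels.map((label) => `${label}_${sessionId}`),
        compactAfter: this.template.compactAfter,
      };
      this.cache.set(sessionId, existing);
    }
    return existing;
  }
}

const registry = new LazyRegistry({ contextLabels: ["soul", "memory"], compactAfter: 100_000 });
const chatA = registry.getSession("chat-a");
const chatAAgain = registry.getSession("chat-a"); // same cached instance
const chatB = registry.getSession("chat-b");
```

Namespacing the keys keeps one session's `memory` block from overwriting another's, even though both come from the same template.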

    // Create a new session
    const info = manager.create("My Chat");

    // Create with metadata
    const info2 = manager.create("My Chat", {
      parentSessionId: "parent-id",
      model: "claude-sonnet-4-20250514",
      source: "web",
    });

    // Get session metadata (null if not found)
    const session = manager.get(sessionId);

    // List all sessions (ordered by updated_at DESC)
    const sessions = manager.list();

    // Rename
    manager.rename(sessionId, "New Name");

    // Delete (clears messages too)
    manager.delete(sessionId);

To work with a session's messages, get its `Session` instance:

    // Get or create the Session instance for an ID
    // Lazy: creates on first access, caches for subsequent calls
    const session = manager.getSession(sessionId);

The following convenience methods delegate to the underlying `Session` and update the session's `updated_at` timestamp:

    // Append a single message
    await manager.append(sessionId, message, parentId);

    // Add or update (upsert)
    await manager.upsert(sessionId, message, parentId);

    // Batch append (auto-chains parent IDs)
    await manager.appendAll(sessionId, messages, parentId);

    // Read history
    const history = manager.getHistory(sessionId, leafId);

    // Message count
    const count = manager.getMessageCount(sessionId);

    // Clear messages
    manager.clearMessages(sessionId);

    // Delete specific messages
    manager.deleteMessages(sessionId, ["msg-1"]);

Fork a session at a specific message. Forking copies history up to that point into a new session:

    const forked = await manager.fork(sessionId, atMessageId, "Forked Chat");
    // forked.parent_session_id === sessionId

    // Record token usage and cost against a session
    manager.addUsage(sessionId, inputTokens, outputTokens, cost);

    // Search across all sessions (FTS5)
    const results = manager.search("deployment Friday", { limit: 20 });

    // Get tools for the model (includes session_search)
    const tools = manager.tools();

Implement any of the four provider interfaces to plug in your own storage:

    // Read-only context
    const myProvider: ContextProvider = {
      get: async () => "Static content here",
    };

    // Writable context (enables set_context tool)
    const myWritable: WritableContextProvider = {
      get: async () => fetchFromMyDB(),
      set: async (content) => saveToMyDB(content),
    };

    // Skill provider (enables load_context tool)
    const mySkills: SkillProvider = {
      get: async () => "- api-ref: API Reference\n- guide: User Guide",
      load: async (key) => fetchDocument(key),
      set: async (key, content, description) =>
        saveDocument(key, content, description),
    };

    // Search provider (enables search_context tool)
    const mySearch: SearchProvider = {
      get: async () => "42 entries indexed",
      search: async (query) => searchMyIndex(query),
      set: async (key, content) => indexContent(key, content),
    };

You can also implement `SessionProvider` to replace the SQLite storage entirely:

    const myStorage: SessionProvider = {
      getMessage(id) { /* ... */ },
      getHistory(leafId?) { /* ... */ },
      getLatestLeaf() { /* ... */ },
      getBranches(messageId) { /* ... */ },
      getPathLength(leafId?) { /* ... */ },
      appendMessage(message, parentId?) { /* ... */ },
      updateMessage(message) { /* ... */ },
      deleteMessages(messageIds) { /* ... */ },
      clearMessages() { /* ... */ },
      addCompaction(summary, fromId, toId) { /* ... */ },
      getCompactions() { /* ... */ },
      searchMessages(query, limit) { /* ... */ },
    };

All storage is in Durable Object SQLite. Tables are created lazily on first use.

| Table | Purpose |
|---|---|
| `assistant_messages` | Tree-structured messages with `id`, `session_id`, `parent_id`, `role`, `content` (JSON), `created_at` |
| `assistant_compactions` | Compaction overlays with `summary`, `from_message_id`, `to_message_id` |
| `assistant_fts` | FTS5 virtual table for message search (Porter stemming, Unicode tokenization) |
| `assistant_sessions` | Session registry (`SessionManager` only) with `name`, `parent_session_id`, `model`, `source`, token/cost counters |
| `cf_agents_context_blocks` | Persistent context block storage (`AgentContextProvider`) |
| `cf_agents_search_entries` / `cf_agents_search_fts` | Searchable context entries and FTS5 index (`AgentSearchProvider`) |

- Session's tree-structured messages are inspired by Pi.
- Context blocks are inspired by Letta AI memory blocks.
- Formatting of blocks is inspired by Hermes Agent.

- Think: opinionated chat agent that uses Session for conversation storage via `configureSession()`
- Chat agents: `AIChatAgent` with its own message persistence layer
- Store and sync state: `setState()` for simpler key-value persistence