
Sessions

The Session API provides persistent conversation storage for agents, with tree-structured messages (inspired by Pi), context blocks, compaction, full-text search, and AI-controllable tools. It runs entirely on Durable Object SQLite — no external database needed.

Quick start

JavaScript
import { Agent } from "agents";
import { Session } from "agents/experimental/memory/session";
class MyAgent extends Agent {
  session = Session.create(this)
    .withContext("soul", {
      provider: { get: async () => "You are a helpful assistant." },
    })
    .withContext("memory", {
      description: "Learned facts about the user",
      maxTokens: 1100,
    })
    .withCachedPrompt();
  async onMessage(message) {
    await this.session.appendMessage(message);
    const history = this.session.getHistory();
    const system = await this.session.freezeSystemPrompt();
    const tools = await this.session.tools();
    // Pass history, system prompt, and tools to your LLM
  }
}

Creating a session

Use Session.create(agent) with a chainable builder. Context blocks without an explicit provider option are auto-wired to SQLite.

JavaScript
const session = Session.create(this)
  .withContext("soul", { provider: { get: async () => "You are helpful." } })
  .withContext("memory", { description: "Learned facts", maxTokens: 1100 })
  .withCachedPrompt()
  .onCompaction(myCompactFn)
  .compactAfter(100_000);

Direct constructor

For full control over providers:

JavaScript
import {
  Session,
  AgentSessionProvider,
  AgentContextProvider,
} from "agents/experimental/memory/session";
const session = new Session(new AgentSessionProvider(this), {
  context: [
    {
      label: "memory",
      description: "Notes",
      maxTokens: 500,
      provider: new AgentContextProvider(this, "memory"),
    },
    { label: "soul", provider: { get: async () => "You are helpful." } },
  ],
});

Builder methods

All builder methods return this for chaining. Order does not matter — providers are resolved lazily on first use.

| Method | Description |
| --- | --- |
| Session.create(agent) | Static factory. agent is any object with a sql tagged template method (your Agent or Durable Object). |
| .forSession(sessionId) | Namespace this session by ID. Required for multi-session isolation when not using SessionManager. |
| .withContext(label, options?) | Add a context block. Refer to Context blocks. |
| .withCachedPrompt(provider?) | Enable system prompt persistence. The prompt is frozen on first use and survives hibernation and eviction. |
| .onCompaction(fn) | Register a compaction function. Refer to Compaction. |
| .compactAfter(tokenThreshold) | Auto-compact when estimated token count exceeds the threshold. Requires .onCompaction(). |

Messages

Messages use the SessionMessage type — a minimal shape with id, role, parts, and optional createdAt. The Vercel AI SDK's UIMessage is structurally compatible and can be passed directly. The session stores messages in a tree structure via parent_id, enabling branching conversations.

JavaScript
// Append — auto-parents to the latest leaf unless parentId is specified
await session.appendMessage(message);
await session.appendMessage(message, parentId);
// Update an existing message (matched by message.id)
session.updateMessage(message);
// Delete specific messages
session.deleteMessages(["msg-1", "msg-2"]);
// Clear all messages and skill state
session.clearMessages();

Reading history

JavaScript
// Linear history from root to the latest leaf
const messages = session.getHistory();
// History to a specific leaf (for branching)
const branch = session.getHistory(leafId);
// Get a single message
const msg = session.getMessage("msg-1");
// Get the newest message
const latest = session.getLatestLeaf();
// Count messages in path
const count = session.getPathLength();

Branching

Messages form a tree. Calling appendMessage() with a parentId that already has children creates a branch. Use getBranches() to list the child messages that branch from a given point:

JavaScript
// Get all child messages that branch from messageId
const branches = session.getBranches(messageId);

This powers features like response regeneration — pass the user message ID to get both the original and regenerated responses. getHistory(leafId) walks the chosen path.
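
To make the tree model concrete, here is a minimal in-memory sketch of how parent links can yield a linear history and a list of branches. It is not the Session implementation: the MessageTree class and its internals are hypothetical, "latest leaf" is simplified to "most recently appended message", and only the method names appendMessage, getHistory, and getBranches mirror the API described above.

```javascript
// Hypothetical in-memory model of a message tree (not the library's SQLite storage).
class MessageTree {
  constructor() {
    this.nodes = new Map(); // id -> { message, parentId }
    this.latestLeafId = null;
  }

  // Append under parentId, or under the most recently appended message by default.
  appendMessage(message, parentId = this.latestLeafId) {
    this.nodes.set(message.id, { message, parentId });
    this.latestLeafId = message.id;
  }

  // Walk parent links from the leaf back to the root, then reverse.
  getHistory(leafId = this.latestLeafId) {
    const path = [];
    let id = leafId;
    while (id !== null && this.nodes.has(id)) {
      const node = this.nodes.get(id);
      path.push(node.message);
      id = node.parentId;
    }
    return path.reverse();
  }

  // Direct children of a message; more than one child means a branch point.
  getBranches(messageId) {
    return [...this.nodes.values()]
      .filter((node) => node.parentId === messageId)
      .map((node) => node.message);
  }
}

const tree = new MessageTree();
tree.appendMessage({ id: "u1", role: "user", parts: [] });
tree.appendMessage({ id: "a1", role: "assistant", parts: [] });
// Regenerate the response: append a second child under the same user message.
tree.appendMessage({ id: "a2", role: "assistant", parts: [] }, "u1");

const branchIds = tree.getBranches("u1").map((m) => m.id); // ["a1", "a2"]
const originalPath = tree.getHistory("a1").map((m) => m.id); // ["u1", "a1"]
const regeneratedPath = tree.getHistory("a2").map((m) => m.id); // ["u1", "a2"]
```

Passing a leaf ID selects one path through the tree, which is why both the original and the regenerated response can coexist under the same user message.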

Search

Search the conversation history with full-text queries, backed by SQLite FTS5:

JavaScript
const results = session.search("deployment Friday", { limit: 10 });
// Returns: Array<{ id, role, content, createdAt? }>

Search uses Porter stemming and Unicode tokenization, and covers all messages in the session.

Context blocks

Context blocks are persistent key-value sections injected into the system prompt. Each block has a label, optional description, and a provider that determines its behavior.

Provider types

There are four provider types, detected by duck-typing:

| Provider | Interface | Behavior | AI tool |
| --- | --- | --- | --- |
| ContextProvider | get() | Read-only block in system prompt | (none) |
| WritableContextProvider | get() + set() | Writable via AI | set_context |
| SkillProvider | get() + load() + set?() | On-demand keyed documents. get() returns a metadata listing; load(key) fetches full content. | load_context, unload_context, set_context |
| SearchProvider | get() + search() + set?() | Full-text searchable entries. get() returns a summary; search(query) runs FTS5. | search_context, set_context |
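
As an illustration of duck-typed detection, the sketch below classifies a provider by the methods it exposes and maps it to the tool names from the table. The helpers classifyProvider and toolsFor are hypothetical, and the order of the checks (search, then load, then set) is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: classify a provider by the methods it exposes.
// The check order (search, then load, then set) is an assumption.
function classifyProvider(provider) {
  const has = (name) => typeof provider[name] === "function";
  if (!has("get")) throw new TypeError("A provider must implement get()");
  if (has("search")) return "search";
  if (has("load")) return "skill";
  if (has("set")) return "writable";
  return "readonly";
}

// Tool names per provider kind, following the table above.
// Skill and search providers only get set_context when set() is implemented.
function toolsFor(provider) {
  const kind = classifyProvider(provider);
  const canSet = typeof provider.set === "function";
  const tools = {
    readonly: [],
    writable: ["set_context"],
    skill: ["load_context", "unload_context"],
    search: ["search_context"],
  }[kind];
  return canSet && kind !== "writable" ? [...tools, "set_context"] : tools;
}

const readOnly = { get: async () => "You are helpful." };
const skills = {
  get: async () => "- guide: User Guide",
  load: async (key) => `content of ${key}`,
};

const readOnlyKind = classifyProvider(readOnly); // "readonly"
const skillTools = toolsFor(skills); // ["load_context", "unload_context"]
```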

Built-in providers

AgentContextProvider — SQLite-backed writable context. This is the default when using the builder without an explicit provider.

JavaScript
import { AgentContextProvider } from "agents/experimental/memory/session";
new AgentContextProvider(this, "memory");

R2SkillProvider — Cloudflare R2 bucket for on-demand document loading. Skills are listed in the system prompt as metadata; the model loads full content on demand via load_context.

JavaScript
import { R2SkillProvider } from "agents/experimental/memory/session";
Session.create(this).withContext("skills", {
  provider: new R2SkillProvider(env.SKILLS_BUCKET, { prefix: "skills/" }),
});

AgentSearchProvider — SQLite FTS5 searchable context. Entries are indexed and searchable by the model via search_context.

JavaScript
import { AgentSearchProvider } from "agents/experimental/memory/session";
Session.create(this).withContext("knowledge", {
  description: "Searchable knowledge base",
  provider: new AgentSearchProvider(this),
});

Adding and removing context at runtime

Blocks can be added and removed dynamically after initialization:

JavaScript
// Add a new block (auto-wires to SQLite if no provider given)
await session.addContext("extension-notes", {
  description: "From extension X",
  maxTokens: 500,
});
// Remove it
session.removeContext("extension-notes");
// Rebuild the system prompt to reflect changes
await session.refreshSystemPrompt();

Reading and writing context

JavaScript
// Read a single block
const block = session.getContextBlock("memory");
// { label, description?, content, tokens, maxTokens?, writable, isSkill, isSearchable }
// Read all blocks
const blocks = session.getContextBlocks();
// Replace content entirely
await session.replaceContextBlock("memory", "User likes coffee.");
// Append content
await session.appendContextBlock("memory", "\nUser prefers dark roast.");

System prompt

The system prompt is built from all context blocks with headers and metadata:

══════════════════════════════════════════════
SOUL (Identity) [readonly]
══════════════════════════════════════════════
You are a helpful assistant.
══════════════════════════════════════════════
MEMORY (Learned facts) [45% — 495/1100 tokens]
══════════════════════════════════════════════
User likes coffee.
User prefers dark roast.

JavaScript
// Freeze — first call renders and persists; subsequent calls return cached value
const prompt = await session.freezeSystemPrompt();
// Refresh — re-render from current block state and persist
const updated = await session.refreshSystemPrompt();

The frozen prompt survives Durable Object hibernation and eviction when withCachedPrompt() is enabled.
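
The header layout shown earlier in this section can be approximated with a small renderer. This is only a sketch: renderBlock and renderSystemPrompt are hypothetical names, the rule width is a guess, and token counts are passed in because the way the library estimates tokens is not described here.

```javascript
// Hypothetical renderer that mimics the layout shown above.
const RULE = "═".repeat(46); // the width is an assumption

function renderBlock({ label, description, content, tokens, maxTokens, writable }) {
  const name = label.toUpperCase();
  const title = description ? `${name} (${description})` : name;
  let status = "";
  if (!writable) {
    status = " [readonly]";
  } else if (maxTokens) {
    // Show how much of the block's token budget is in use.
    const percent = Math.round((tokens / maxTokens) * 100);
    status = ` [${percent}% \u2014 ${tokens}/${maxTokens} tokens]`;
  }
  return [RULE, title + status, RULE, content].join("\n");
}

function renderSystemPrompt(blocks) {
  return blocks.map(renderBlock).join("\n");
}

const prompt = renderSystemPrompt([
  {
    label: "soul",
    description: "Identity",
    content: "You are a helpful assistant.",
    writable: false,
  },
  {
    label: "memory",
    description: "Learned facts",
    content: "User likes coffee.\nUser prefers dark roast.",
    tokens: 495,
    maxTokens: 1100,
    writable: true,
  },
]);
```

In this sketch the usage figure is derived from tokens and maxTokens, so a writable block without a maxTokens value simply renders without a status suffix.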

AI tools

Session automatically generates tools based on the provider types of your context blocks. Pass these to your LLM alongside your own tools.

JavaScript
const tools = await session.tools();
const allTools = { ...tools, ...myTools };

set_context

Generated when any writable block exists. Writes to regular blocks, skill blocks (keyed), or search blocks (keyed). Enforces maxTokens limits.

load_context

Generated when any skill block exists. Loads full content by key from a SkillProvider.

unload_context

Generated alongside load_context. Frees context space by unloading a previously loaded skill. The skill remains available for re-loading.

search_context

Generated when any search block exists. Full-text search within a searchable context block. Returns top 10 results by FTS5 rank.

session_search

Available on SessionManager only. Searches across all sessions.

Compaction

Compaction summarizes older messages to keep conversations within token limits. Original messages are preserved in SQLite — the summary is a non-destructive overlay applied at read time.

Setup

JavaScript
import { generateText } from "ai";
import { createCompactFunction } from "agents/experimental/memory/utils/compaction-helpers";
// myModel is a placeholder for your AI SDK model instance
const session = Session.create(this)
  .withContext("memory", { maxTokens: 1100 })
  .onCompaction(
    createCompactFunction({
      summarize: (prompt) =>
        generateText({ model: myModel, prompt }).then((r) => r.text),
      protectHead: 3,
      tailTokenBudget: 20000,
      minTailMessages: 2,
    }),
  )
  .compactAfter(100_000);

How compaction works

  1. Protect head — first N messages are never compacted (default 3)
  2. Protect tail — walk backward from the end, accumulating tokens up to a budget (default 20K tokens)
  3. Align boundaries — shift boundaries to avoid splitting tool call/result pairs
  4. Summarize middle — send the middle section to an LLM with a structured format (Topic, Key Points, Current State, Open Items)
  5. Store overlay — saved in the assistant_compactions table, keyed by fromMessageId and toMessageId
  6. Iterative — on subsequent compactions, the existing summary is passed to the LLM to update rather than replace
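
Steps 1 and 2 above can be sketched as a function that picks the middle range to summarize. The option names protectHead, tailTokenBudget, and minTailMessages come from the setup example; everything else (selectCompactionRange, the 4-characters-per-token estimate, treating minTailMessages as a floor that overrides the budget) is an assumption, and the boundary alignment of step 3 is left out.

```javascript
// Rough estimate of about 4 characters per token (an assumption for this sketch).
function estimateTokens(message) {
  const chars = message.parts.reduce((n, part) => n + (part.text ?? "").length, 0);
  return Math.ceil(chars / 4);
}

// Returns the [start, end) index range to summarize, or null if nothing qualifies.
function selectCompactionRange(
  messages,
  { protectHead = 3, tailTokenBudget = 20000, minTailMessages = 2 } = {},
) {
  // Step 1: the first protectHead messages are never compacted.
  const start = Math.min(protectHead, messages.length);

  // Step 2: walk backward, keeping messages until the tail budget is spent.
  let end = messages.length;
  let used = 0;
  while (end > start) {
    const cost = estimateTokens(messages[end - 1]);
    const kept = messages.length - end;
    if (kept >= minTailMessages && used + cost > tailTokenBudget) break;
    used += cost;
    end -= 1;
  }

  return end > start ? { start, end } : null;
}

const messages = Array.from({ length: 10 }, (_, i) => ({
  id: `msg-${i + 1}`,
  role: i % 2 === 0 ? "user" : "assistant",
  parts: [{ type: "text", text: "x".repeat(400) }], // about 100 tokens each
}));

// With a 250-token tail budget, 3 head and 2 tail messages stay verbatim.
const range = selectCompactionRange(messages, { tailTokenBudget: 250 });
// range is { start: 3, end: 8 }
```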

When getHistory() is called, compaction overlays are applied transparently — the compacted range is replaced by a synthetic summary message.
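
A read-time overlay of this kind might look like the following sketch. The function applyCompactions is hypothetical, as are the synthetic message's id scheme and its role; the field names summary, fromMessageId, and toMessageId follow the description above.

```javascript
// Hypothetical read-time overlay: replace the range fromMessageId..toMessageId
// with one synthetic summary message. The stored messages are left untouched.
function applyCompactions(history, compactions) {
  let result = history;
  for (const { summary, fromMessageId, toMessageId } of compactions) {
    const from = result.findIndex((m) => m.id === fromMessageId);
    const to = result.findIndex((m) => m.id === toMessageId);
    if (from === -1 || to === -1 || to < from) continue; // range not on this path
    const synthetic = {
      id: `compaction-${fromMessageId}-${toMessageId}`, // hypothetical id scheme
      role: "assistant", // the role of the summary message is an assumption
      parts: [{ type: "text", text: summary }],
    };
    result = [...result.slice(0, from), synthetic, ...result.slice(to + 1)];
  }
  return result;
}

const stored = ["m1", "m2", "m3", "m4", "m5"].map((id) => ({
  id,
  role: "user",
  parts: [{ type: "text", text: `message ${id}` }],
}));

const visible = applyCompactions(stored, [
  { summary: "Summary of m2 to m4", fromMessageId: "m2", toMessageId: "m4" },
]);
const visibleIds = visible.map((m) => m.id); // ["m1", "compaction-m2-m4", "m5"]
```

Because the function builds a new array, the stored history keeps all five messages, which matches the non-destructive behavior described in this section.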

Manual compaction

JavaScript
const result = await session.compact();
// Or manage overlays directly
session.addCompaction("Summary of messages 1-50", "msg-1", "msg-50");
const overlays = session.getCompactions();

Auto-compaction

When .compactAfter(threshold) is set, appendMessage() checks the estimated token count after each write. If it exceeds the threshold, compact() is called automatically. Auto-compaction failure is non-fatal — the message is already saved.
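
The ordering matters here: the write completes first, and a failed compaction never undoes it. A minimal sketch of that flow, using a plain array as the store and hypothetical names throughout (appendWithAutoCompact, estimateTokens, compact), could look like this:

```javascript
// Hypothetical wrapper showing the order of operations: save first, then try to compact.
async function appendWithAutoCompact(store, message, { threshold, estimateTokens, compact }) {
  store.push(message); // the write happens before any compaction attempt
  if (estimateTokens(store) <= threshold) return { compacted: false };
  try {
    await compact(store);
    return { compacted: true };
  } catch (error) {
    // Non-fatal: the message is already stored, so report the error and carry on.
    return { compacted: false, error };
  }
}

// Example options: count one token per message and use a compactor that always fails.
const failingOptions = {
  threshold: 0,
  estimateTokens: (store) => store.length,
  compact: async () => {
    throw new Error("summarizer unavailable");
  },
};
```

Calling appendWithAutoCompact([], message, failingOptions) resolves instead of rejecting, and the message remains in the array.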

SessionManager

SessionManager is a registry for multiple named sessions within a single Durable Object. It provides lifecycle management, convenience methods, and cross-session search.

Creating a SessionManager

JavaScript
import { SessionManager } from "agents/experimental/memory/session";
const manager = SessionManager.create(this)
  .withContext("soul", { provider: { get: async () => "You are helpful." } })
  .withContext("memory", { description: "Learned facts", maxTokens: 1100 })
  .withCachedPrompt()
  .onCompaction(myCompactFn)
  .compactAfter(100_000)
  .withSearchableHistory("history");

Context blocks, prompt caching, and compaction settings are propagated to all sessions created through the manager. Provider keys are automatically namespaced by session ID.

Builder methods

| Method | Description |
| --- | --- |
| SessionManager.create(agent) | Static factory. |
| .withContext(label, options?) | Add context block template for all sessions. |
| .withCachedPrompt(provider?) | Enable prompt persistence for all sessions. |
| .onCompaction(fn) | Register compaction function for all sessions. |
| .compactAfter(tokenThreshold) | Auto-compact threshold for all sessions. |
| .withSearchableHistory(label) | Add a cross-session searchable history block. The model can search past conversations from any session. |

Session lifecycle

JavaScript
// Create a new session
const info = manager.create("My Chat");
// Create with metadata
const info2 = manager.create("My Chat", {
  parentSessionId: "parent-id",
  model: "claude-sonnet-4-20250514",
  source: "web",
});
// Get session metadata (null if not found)
const session = manager.get(sessionId);
// List all sessions (ordered by updated_at DESC)
const sessions = manager.list();
// Rename
manager.rename(sessionId, "New Name");
// Delete (clears messages too)
manager.delete(sessionId);

Accessing sessions

JavaScript
// Get or create the Session instance for an ID
// Lazy — creates on first access, caches for subsequent calls
const session = manager.getSession(sessionId);

Message convenience methods

These delegate to the underlying Session and update the session's updated_at timestamp:

JavaScript
// Append a single message
await manager.append(sessionId, message, parentId);
// Add or update (upsert)
await manager.upsert(sessionId, message, parentId);
// Batch append (auto-chains parent IDs)
await manager.appendAll(sessionId, messages, parentId);
// Read history
const history = manager.getHistory(sessionId, leafId);
// Message count
const count = manager.getMessageCount(sessionId);
// Clear messages
manager.clearMessages(sessionId);
// Delete specific messages
manager.deleteMessages(sessionId, ["msg-1"]);

Forking

Fork a session at a specific message — copies history up to that point into a new session:

JavaScript
const forked = await manager.fork(sessionId, atMessageId, "Forked Chat");
// forked.parent_session_id === sessionId
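
Conceptually, a fork copies the path from the root to the chosen message into a fresh session. The sketch below models that with plain objects; forkSession, the generated session ID, and the choice to keep the original message IDs in the copy are all assumptions, and only the parent_session_id field comes from the example above.

```javascript
// Hypothetical in-memory fork: copy the path from the root to atMessageId into a new session.
function forkSession(sessions, sessionId, atMessageId, name) {
  const source = sessions.get(sessionId);
  const byId = new Map(source.messages.map((m) => [m.id, m]));

  // Follow parent links upward, prepending so the result runs root-first.
  const path = [];
  for (let m = byId.get(atMessageId); m; m = byId.get(m.parentId)) {
    path.unshift({ ...m });
  }
  if (path.length === 0) throw new Error(`Unknown message: ${atMessageId}`);

  const forked = {
    id: `${sessionId}-fork-${sessions.size}`, // hypothetical id scheme
    name,
    parent_session_id: sessionId,
    messages: path,
  };
  sessions.set(forked.id, forked);
  return forked;
}

const sessions = new Map([
  ["s1", {
    id: "s1",
    name: "My Chat",
    messages: [
      { id: "m1", parentId: null },
      { id: "m2", parentId: "m1" },
      { id: "m3", parentId: "m2" },
      { id: "m2b", parentId: "m1" }, // a sibling branch that the fork ignores
    ],
  }],
]);

const fork = forkSession(sessions, "s1", "m2", "Forked Chat");
const forkIds = fork.messages.map((m) => m.id); // ["m1", "m2"]
```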

Usage tracking

JavaScript
manager.addUsage(sessionId, inputTokens, outputTokens, cost);

Cross-session search

JavaScript
// Search across all sessions (FTS5)
const results = manager.search("deployment Friday", { limit: 20 });
// Get tools for the model (includes session_search)
const tools = manager.tools();

Custom providers

Implement any of the four provider interfaces to plug in your own storage:

JavaScript
// Read-only context
const myProvider = {
  get: async () => "Static content here",
};
// Writable context (enables set_context tool)
const myWritable = {
  get: async () => fetchFromMyDB(),
  set: async (content) => saveToMyDB(content),
};
// Skill provider (enables load_context tool)
const mySkills = {
  get: async () => "- api-ref: API Reference\n- guide: User Guide",
  load: async (key) => fetchDocument(key),
  set: async (key, content, description) =>
    saveDocument(key, content, description),
};
// Search provider (enables search_context tool)
const mySearch = {
  get: async () => "42 entries indexed",
  search: async (query) => searchMyIndex(query),
  set: async (key, content) => indexContent(key, content),
};
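
For a concrete starting point, the writable shape (get plus set) can be backed by a plain Map, which is handy in tests. This is a sketch only: createMemoryProvider is a hypothetical helper, and returning an empty string for a block that has never been written is an assumption about what the caller expects.

```javascript
// Hypothetical Map-backed writable provider; swap the Map for your own storage client.
function createMemoryProvider(store = new Map(), key = "default") {
  return {
    // get() returns the current block content (empty string if never written).
    get: async () => store.get(key) ?? "",
    // set() replaces the block content.
    set: async (content) => {
      store.set(key, content);
    },
  };
}

const backing = new Map();
const memoryProvider = createMemoryProvider(backing, "memory");
```

An object like memoryProvider could then be passed as the provider option of a context block, in the same way as the myWritable example above.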

You can also implement SessionProvider to replace the SQLite storage entirely:

JavaScript
const myStorage = {
  getMessage(id) {
    /* ... */
  },
  getHistory(leafId) {
    /* ... */
  },
  getLatestLeaf() {
    /* ... */
  },
  getBranches(messageId) {
    /* ... */
  },
  getPathLength(leafId) {
    /* ... */
  },
  appendMessage(message, parentId) {
    /* ... */
  },
  updateMessage(message) {
    /* ... */
  },
  deleteMessages(messageIds) {
    /* ... */
  },
  clearMessages() {
    /* ... */
  },
  addCompaction(summary, fromId, toId) {
    /* ... */
  },
  getCompactions() {
    /* ... */
  },
  searchMessages(query, limit) {
    /* ... */
  },
};

Storage tables

All storage is in Durable Object SQLite. Tables are created lazily on first use.

| Table | Purpose |
| --- | --- |
| assistant_messages | Tree-structured messages with id, session_id, parent_id, role, content (JSON), created_at |
| assistant_compactions | Compaction overlays with summary, from_message_id, to_message_id |
| assistant_fts | FTS5 virtual table for message search (porter stemming, unicode tokenization) |
| assistant_sessions | Session registry (SessionManager only) with name, parent_session_id, model, source, token/cost counters |
| cf_agents_context_blocks | Persistent context block storage (AgentContextProvider) |
| cf_agents_search_entries / cf_agents_search_fts | Searchable context entries and FTS5 index (AgentSearchProvider) |

Acknowledgments

  • Think: opinionated chat agent that uses Session for conversation storage via configureSession()
  • Chat agents: AIChatAgent with its own message persistence layer
  • Store and sync state: setState() for simpler key-value persistence