Think

@cloudflare/think lets you build a stateful AI chat agent — one that streams replies, remembers the conversation, and calls tools — by extending a single base class. You provide a model with getModel(), and Think wires up the rest of the chat lifecycle for you: the agentic loop (the model calls tools, reads the results, and keeps going until it has an answer), message persistence, streaming, client tools, stream resumption, and extensions — all backed by Durable Object SQLite.
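The agentic loop described above can be pictured with a small, library-free sketch. This is not Think's implementation: `fakeModel`, `tools`, and `runAgenticLoop` are hypothetical names invented for illustration, and the fake model stands in for a real LLM call.

```javascript
// Stand-in model (hypothetical): a real model would be an LLM call. This fake
// asks for one tool call, then answers using the tool result.
async function fakeModel(messages) {
  const toolResult = messages.find((m) => m.role === "tool");
  if (!toolResult) {
    return { type: "tool-call", toolName: "add", input: { a: 2, b: 3 } };
  }
  return { type: "text", text: `The sum is ${toolResult.content}.` };
}

// Tools the loop is allowed to execute, keyed by name.
const tools = {
  add: async ({ a, b }) => a + b,
};

// Generic agentic loop: call the model, run any tool it requests, append the
// result to the transcript, and repeat until the model returns plain text.
async function runAgenticLoop(model, toolset, userText, maxSteps = 5) {
  const messages = [{ role: "user", content: userText }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(messages);
    if (reply.type === "text") {
      messages.push({ role: "assistant", content: reply.text });
      return { text: reply.text, messages };
    }
    const tool = toolset[reply.toolName];
    if (!tool) throw new Error(`Unknown tool: ${reply.toolName}`);
    const result = await tool(reply.input);
    messages.push({ role: "assistant", content: `call ${reply.toolName}` });
    messages.push({ role: "tool", content: String(result) });
  }
  throw new Error(`No final answer after ${maxSteps} steps`);
}
```

Calling `runAgenticLoop(fakeModel, tools, "What is 2 + 3?")` resolves with the text `The sum is 5.` after one tool round trip. The `maxSteps` cap is a design choice of this sketch so that a model that never stops calling tools cannot loop forever.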

Think works as both a top-level agent (WebSocket chat to browser clients via useAgentChat) and a sub-agent (a child agent that another agent drives over RPC via chat()).

Quick start

Install

Terminal window
npm install @cloudflare/think @cloudflare/ai-chat agents ai @cloudflare/shell zod workers-ai-provider

Server

JavaScript
import { Think } from "@cloudflare/think";
import { createWorkersAI } from "workers-ai-provider";
import { routeAgentRequest } from "agents";

export class MyAgent extends Think {
  // The only required override: return the model Think should call.
  getModel() {
    return createWorkersAI({ binding: this.env.AI })(
      "@cf/moonshotai/kimi-k2.6",
    );
  }
}

export default {
  async fetch(request, env) {
    // Route agent requests to the matching agent; anything else is a 404.
    return (
      (await routeAgentRequest(request, env)) ||
      new Response("Not found", { status: 404 })
    );
  },
};

That is it. Think handles the WebSocket chat protocol, message persistence, the agentic loop, message sanitization, stream resumption, client tool support, and workspace file tools.

Client

JavaScript
import { useAgent } from "agents/react";
import { useAgentChat } from "@cloudflare/ai-chat/react";

function Chat() {
  const agent = useAgent({ agent: "MyAgent" });
  const { messages, sendMessage, status } = useAgentChat({ agent });
  return (
    <div>
      {messages.map((msg) => (
        <div key={msg.id}>
          <strong>{msg.role}:</strong>
          {msg.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null,
          )}
        </div>
      ))}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          const input = e.currentTarget.elements.namedItem("input");
          sendMessage({ text: input.value });
          input.value = "";
        }}
      >
        <input name="input" placeholder="Send a message..." />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}

Configuration

JSONC
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  // Set this to today's date
  "compatibility_date": "2026-06-04",
  "compatibility_flags": [
    "nodejs_compat"
  ],
  "ai": {
    "binding": "AI"
  },
  "durable_objects": {
    "bindings": [
      {
        "class_name": "MyAgent",
        "name": "MyAgent"
      }
    ]
  },
  "migrations": [
    {
      "new_sqlite_classes": [
        "MyAgent"
      ],
      "tag": "v1"
    }
  ]
}

Think vs AIChatAgent

Both Think and AIChatAgent extend Agent and speak the same cf_agent_chat_* WebSocket protocol. They serve different goals.

AIChatAgent is a protocol adapter. You override onChatMessage and are responsible for calling streamText, wiring tools, converting messages, and returning a Response. AIChatAgent handles the plumbing — message persistence, streaming, abort, resume — but the LLM call is entirely your concern.

Think is an opinionated framework. It makes decisions for you: getModel() returns the model, getSystemPrompt() or configureSession() sets the prompt, getTools() returns tools. The default onChatMessage runs the complete agentic loop. You override individual pieces, not the whole pipeline.

| Concern | AIChatAgent | Think |
| --- | --- | --- |
| Minimal subclass | ~15 lines (wire streamText + tools + system prompt + response) | 3 lines (getModel() only) |
| Storage | Flat SQL table | Session: tree-structured messages, context blocks, compaction, FTS5 |
| Regeneration | Destructive (old response deleted) | Non-destructive branching (old responses preserved) |
| Context management | Manual | Context blocks with LLM-writable persistent memory |
| Sub-agent RPC | Not built in | chat() with StreamCallback |
| Programmatic turns | saveMessages() | saveMessages(), submitMessages(), continueLastTurn() |
| Compaction | maxPersistedMessages (deletes oldest) | Non-destructive summaries via overlays |
| Search | Not available | FTS5 full-text search per-session and cross-session |
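To make the regeneration difference concrete, here is a minimal in-memory sketch of tree-structured messages. It is an illustration only, not Think's Session storage: `MessageTree` and its methods are hypothetical names. The idea is that each message records its parent, so a regenerated reply becomes a sibling of the old reply rather than a replacement.

```javascript
// Hypothetical in-memory message tree. Each message points at its parent, so
// regenerating a reply adds a sibling instead of overwriting the old one.
class MessageTree {
  constructor() {
    this.nodes = new Map(); // id -> { id, parentId, role, text }
    this.nextId = 1;
  }

  append(parentId, role, text) {
    const id = `m${this.nextId++}`;
    this.nodes.set(id, { id, parentId, role, text });
    return id;
  }

  // Regenerate: attach a new reply to the same parent as the old reply.
  regenerate(oldReplyId, newText) {
    const old = this.nodes.get(oldReplyId);
    if (!old) throw new Error(`Unknown message: ${oldReplyId}`);
    return this.append(old.parentId, old.role, newText);
  }

  // Walk parent links from a leaf to the root to rebuild one branch in order.
  pathTo(leafId) {
    const path = [];
    for (let id = leafId; id !== null; id = this.nodes.get(id).parentId) {
      path.unshift(this.nodes.get(id).text);
    }
    return path;
  }
}

const tree = new MessageTree();
const question = tree.append(null, "user", "Name a prime number.");
const firstReply = tree.append(question, "assistant", "7");
const secondReply = tree.regenerate(firstReply, "11");
```

After the regeneration, `tree.pathTo(secondReply)` ends in `"11"` while `tree.pathTo(firstReply)` still ends in `"7"`: both branches remain readable, which is what "old responses preserved" means in the table.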

When to use AIChatAgent

  • You need full control over the LLM call (RAG, multi-model, custom streaming)
  • You want the Response return type for HTTP middleware or testing
  • You are building a simple chatbot with no memory requirements

When to use Think

  • You want to ship fast (3-line subclass with everything wired)
  • You need persistent memory (context blocks the model can read and write)
  • You need long conversations (non-destructive compaction)
  • You need conversation search (FTS5)
  • You are building a sub-agent system (parent-child RPC with streaming)
  • You need proactive agents (programmatic turns from scheduled tasks or webhooks)
  • You need durable async submission for webhook or RPC callers
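The "non-destructive compaction" item above can be pictured as an overlay over stored messages. The sketch below is an assumption about the general technique, not Think's actual schema or API: `buildModelView`, the `from`/`to` overlay fields, and the sample data are all hypothetical.

```javascript
// Hypothetical overlay-based compaction: the stored messages are never
// deleted. A summary overlay records which range it stands in for, and the
// model-facing view swaps that range for the summary text.
function buildModelView(messages, overlays) {
  const view = [];
  let i = 0;
  while (i < messages.length) {
    const overlay = overlays.find((o) => o.from === i);
    if (overlay) {
      view.push({ role: "system", text: `Summary: ${overlay.summary}` });
      i = overlay.to + 1; // skip the messages the summary covers
    } else {
      view.push(messages[i]);
      i += 1;
    }
  }
  return view;
}

const stored = [
  { role: "user", text: "My name is Ada." },
  { role: "assistant", text: "Nice to meet you, Ada." },
  { role: "user", text: "I prefer metric units." },
  { role: "assistant", text: "Noted." },
  { role: "user", text: "How far is 10 miles?" },
];

// Covers stored[0] through stored[3], inclusive.
const overlays = [{ from: 0, to: 3, summary: "User is Ada; prefers metric." }];

const view = buildModelView(stored, overlays);
```

Here `view` holds two entries (the summary and the latest question) while `stored` still holds all five messages. A deletion-based approach would shrink `stored` itself, which is the contrast the comparison table draws.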

Choose a turn API

Think has several ways to start or continue a turn. Choose based on who starts the work and what the caller needs back.

| Use case | API |
| --- | --- |
| A browser user sends chat messages | useAgentChat over the WebSocket chat protocol |
| Server code can wait for the model response | saveMessages() |
| Server code needs fast durable acceptance and later status | submitMessages() |
| Code should create recurring prompt-driven turns or handlers | getScheduledTasks() |
| Parent code needs direct streaming RPC to a specific child | subAgent(...).chat() |
| A parent delegates work to a retained child agent | agentTool() or runAgentTool() |
| Surround a turn with idempotent app-owned side effects | startFiber() |
| Coordinate multi-step durable orchestration | Workflows |
| Add context or messages without starting a model turn | persistMessages() |
| Advanced subclass or recovery code continues an assistant turn | continueLastTurn() |

Use saveMessages() when the caller owns the trigger and can wait for the turn to finish. Use submitMessages() when timeout ambiguity would make retries unsafe.
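The retry concern can be illustrated without any Think APIs. In the sketch below, `SubmissionQueue`, its methods, and the key `"webhook-123"` are hypothetical; whether submitMessages() accepts a caller-supplied key is not something this page states, so treat the keying scheme as an assumption. The point is only that fast, keyed acceptance lets a caller retry after a timeout without starting a second turn.

```javascript
// Hypothetical submission queue, not Think's implementation. Accepting a
// submission is fast and keyed, so a caller that times out can retry with the
// same key without starting a second turn.
class SubmissionQueue {
  constructor() {
    this.submissions = new Map(); // key -> { status, text, result }
  }

  // Record the request and return immediately; duplicates are ignored.
  submit(key, text) {
    if (!this.submissions.has(key)) {
      this.submissions.set(key, { status: "pending", text, result: null });
      return { accepted: true, duplicate: false };
    }
    return { accepted: true, duplicate: true };
  }

  // Run each pending submission once; `runTurn` stands in for the model turn.
  drain(runTurn) {
    for (const entry of this.submissions.values()) {
      if (entry.status !== "pending") continue;
      entry.result = runTurn(entry.text);
      entry.status = "completed";
    }
  }

  status(key) {
    const entry = this.submissions.get(key);
    return entry ? entry.status : "unknown";
  }
}

const queue = new SubmissionQueue();
let turnsRun = 0;
const first = queue.submit("webhook-123", "Summarize the new ticket");
const retry = queue.submit("webhook-123", "Summarize the new ticket");
queue.drain((text) => {
  turnsRun += 1;
  return `done: ${text}`;
});
```

The retried submission is reported as a duplicate and only one turn runs. With a wait-for-completion call, by contrast, a caller that times out cannot tell whether the turn ran, so a blind retry risks a duplicate turn.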

Use chat() for low-level parent-to-child streaming when your code owns forwarding, cancellation, and replay policy. Use Agent tools when a parent model or workflow delegates to a child agent and you want retained child runs, event replay, abort bridging, and UI drill-in.

Use startFiber() outside Think when the durable unit is an application job around a turn: accepting a webhook once, restoring a serialized channel or thread target, posting a visible reply, or recording app-level recovery policy. Think submissions own conversation admission and turn serialization; managed fibers own external job acceptance, idempotent side effects, and application recovery.

Acknowledgments

Think's design is inspired by Pi.

Related

  • Sessions: context blocks, compaction, search, multi-session (the storage layer Think builds on)
  • Sub-agents: subAgent(), abortSubAgent(), deleteSubAgent() (the base Agent methods for spawning children)
  • Chat agents: AIChatAgent for when you need full control over the LLM call
  • Long-running agents: sub-agent delegation patterns for multi-week agent lifetimes
  • Durable execution: runFiber() and crash recovery (used by chatRecovery)
  • Browse the web: full CDP helper API reference