Agents SDK v0.12.4: chat recovery, routing retries, durable Think submissions, and Voice connection control
The latest release of the Agents SDK brings more reliable chat recovery, fixes Agent state synchronization during reconnects, adds durable submissions for Think, exposes routing retry configuration, and adds connection control for Voice agents.
@cloudflare/ai-chat now keeps server turns running when a browser or client stream is interrupted. This is useful for long-running AI responses where users refresh the page, close a tab, or temporarily lose connection. Calling stop() still cancels the server turn.
Set cancelOnClientAbort: true if browser or client aborts should also cancel the server turn:
```ts
const chat = useAgentChat({
  agent: "assistant",
  name: "user-123",
  // Also cancel the server turn when the browser or client aborts
  cancelOnClientAbort: true,
});
```

Notable bug fixes:
- Chat stream resume negotiation no longer throws when replay races with a closed WebSocket connection.
- Recovered chat continuations no longer leave useAgentChat stuck in a streaming state when the original socket disconnects before a terminal response.
- Approval auto-continuation preserves reasoning parts and persists continuation reasoning in the final message.
- isServerStreaming now resets correctly when a resumed stream moves from the fallback observer path to a transport-owned stream.
agents@0.12.4 prevents duplicate initial state frames during WebSocket connection setup. This avoids stale initial state messages overwriting state updates already sent by the client.
Agent recovery is also more reliable when tool calls span a Durable Object restart. Recovery now defers user finish hooks until after agent startup and isolates hook failures, so one failed hook does not block other recovered runs from finalizing.
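The deferral and isolation described above can be pictured with a small sketch. This is not the SDK's implementation: `FinishHook`, `HookOutcome`, and `finalizeRecoveredRuns` are hypothetical names used only to show the general pattern of waiting for startup and then catching each hook's failure separately.

```typescript
// Hypothetical types for illustration; not part of the agents package.
type FinishHook = (runId: string) => Promise<void>;

interface HookOutcome {
  runId: string;
  ok: boolean;
  error?: unknown;
}

// Run one finish hook per recovered run, but only after startup completes.
// Each hook has its own try/catch, so a failure is recorded for that run
// and the remaining runs still finalize.
async function finalizeRecoveredRuns(
  startup: Promise<void>,
  runIds: string[],
  hook: FinishHook,
): Promise<HookOutcome[]> {
  await startup; // defer hooks until the agent has finished starting

  const outcomes = runIds.map(async (runId): Promise<HookOutcome> => {
    try {
      await hook(runId);
      return { runId, ok: true };
    } catch (error) {
      return { runId, ok: false, error };
    }
  });

  return Promise.all(outcomes);
}
```

In this sketch a hook that throws for one run yields an outcome with `ok: false` for that run only; the other outcomes are unaffected.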
getAgentByName() now supports routingRetry for transient Durable Object routing failures:
```ts
import { getAgentByName } from "agents";

const agent = await getAgentByName(env.AssistantAgent, "user-123", {
  // Retry transient Durable Object routing failures up to 3 times
  routingRetry: {
    maxAttempts: 3,
  },
});
```

@cloudflare/think now supports durable programmatic submissions. submitMessages() provides durable acceptance, idempotent retries, status inspection, cancellation, and cleanup for server-driven turns that should continue after the caller returns.
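To make the terms concrete, here is a conceptual sketch of idempotent acceptance, status inspection, cancellation, and cleanup. It is an in-memory stand-in and does not show the submitMessages() signature: `SubmissionLog`, its methods, and the status values are all assumptions for illustration, and a real implementation would keep these records in durable storage rather than a `Map`.

```typescript
// Conceptual sketch only: hypothetical names, not the @cloudflare/think API.
type SubmissionStatus = "accepted" | "running" | "completed" | "cancelled";

interface Submission {
  key: string;
  messages: string[];
  status: SubmissionStatus;
}

class SubmissionLog {
  private readonly byKey = new Map<string, Submission>();

  // Accept a submission once. A retry with the same key returns the
  // existing record instead of starting a second turn.
  submit(key: string, messages: string[]): Submission {
    const existing = this.byKey.get(key);
    if (existing) return existing;
    const created: Submission = { key, messages, status: "accepted" };
    this.byKey.set(key, created);
    return created;
  }

  // Inspect the current status, or undefined if the key is unknown.
  status(key: string): SubmissionStatus | undefined {
    return this.byKey.get(key)?.status;
  }

  // Cancel only work that has not already finished.
  cancel(key: string): boolean {
    const found = this.byKey.get(key);
    if (!found) return false;
    if (found.status === "completed" || found.status === "cancelled") return false;
    found.status = "cancelled";
    return true;
  }

  // Remove finished records and report how many were dropped.
  cleanup(): number {
    let removed = 0;
    this.byKey.forEach((submission, key) => {
      if (submission.status === "completed" || submission.status === "cancelled") {
        this.byKey.delete(key);
        removed++;
      }
    });
    return removed;
  }
}
```

The key design point is that the caller chooses the key, so a network retry of the same request cannot enqueue the same turn twice.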
Think.chat() RPC turns now run inside chat recovery fibers and persist their stream chunks. Interrupted sub-agent turns can recover partial output instead of starting over.
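The idea of persisting chunks so an interrupted turn can pick up where it stopped can be sketched as follows. This is a simplified model, not the Think internals: `ChunkLog` and `runTurn` are hypothetical, the log lives in memory rather than durable storage, and the output is assumed to be a fixed list of chunks so that resuming from an offset is well defined.

```typescript
// Conceptual sketch only: hypothetical names, not @cloudflare/think internals.
class ChunkLog {
  private readonly chunks = new Map<string, string[]>();

  // Persist each chunk as it is produced so it survives an interruption.
  append(turnId: string, chunk: string): void {
    const list = this.chunks.get(turnId) ?? [];
    list.push(chunk);
    this.chunks.set(turnId, list);
  }

  // Return a copy of the chunks persisted so far for this turn.
  read(turnId: string): string[] {
    return (this.chunks.get(turnId) ?? []).slice();
  }
}

// Run a turn over `source`, skipping chunks that were already persisted.
// `failAfter` simulates an interruption once that many chunks have been
// written during this attempt.
function runTurn(
  log: ChunkLog,
  turnId: string,
  source: string[],
  failAfter: number = Infinity,
): string {
  const alreadyPersisted = log.read(turnId).length;
  let written = 0;
  for (const chunk of source.slice(alreadyPersisted)) {
    if (written >= failAfter) throw new Error("interrupted");
    log.append(turnId, chunk);
    written++;
  }
  return log.read(turnId).join("");
}
```

A second call to `runTurn` with the same turn ID appends only the chunks missing from the log, which is the "recover partial output instead of starting over" behavior in miniature.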
ChatOptions.tools has been removed from the TypeScript API. Define durable tools on the child agent or use agent tools for orchestration. Runtime options.tools values passed by legacy callers are ignored with a warning.
@cloudflare/think no longer applies pruneMessages({ toolCalls: "before-last-2-messages" }) to model context by default. The previous default could strip client-side tool results from longer multi-turn flows.
truncateOlderMessages still runs as before, so context cost remains bounded. Subclasses that relied on the old aggressive pruning can opt back in from beforeTurn:
```ts
import { Think } from "@cloudflare/think";
import { pruneMessages } from "ai";

export class MyAgent extends Think<Env> {
  // Opt back in to the previous tool-call pruning before each turn
  beforeTurn(ctx) {
    return {
      messages: pruneMessages({
        messages: ctx.messages,
        toolCalls: "before-last-2-messages",
      }),
    };
  }
}
```

@cloudflare/voice adds an enabled option to useVoiceAgent. React apps can now delay creating and connecting a VoiceClient until prerequisites such as capability tokens are ready.
```ts
const voice = useVoiceAgent({
  agent: "MyVoiceAgent",
  // Do not create or connect the VoiceClient until the token is available
  enabled: Boolean(token),
});
```

This release also fixes Workers AI speech-to-text session edge cases and withVoice text streaming from AI SDK textStream responses.
- Streamable HTTP routing: Server-to-client requests now route through the originating POST stream when no standalone SSE stream is available.
- Structured tool output: Tool output shapes are preserved when truncating older messages or oversized persisted rows.
- Non-chat Think tool steps: Think agent-tool children can complete without emitting assistant text and can return structured output through getAgentToolOutput.
- Sub-agent schedules: Stale sub-agent schedule rows are pruned when their owning facet registry entry no longer exists.
- @cloudflare/codemode: Adds a browser-safe export with an iframe sandbox executor and resolves OpenAPI specs inside the sandbox to avoid Worker Loader RPC size limits.
To update to the latest version:
```sh
npm i agents@latest @cloudflare/ai-chat@latest @cloudflare/think@latest @cloudflare/voice@latest
```

Refer to the Agents API reference and Chat agents documentation for more information.