Durable recovery

Think wraps chat turns in recoverable fibers by default (chatRecovery = true). If the Durable Object is evicted mid-stream, Think reconstructs any buffered chunks, persists partial output, and schedules either a continuation of the assistant turn or a retry of the unanswered user turn.

When chatRecovery is true, WebSocket turns, sub-agent chat() turns, durable submitMessages() executions, auto-continuations, saveMessages(), and continueLastTurn() are wrapped in runFiber.

Bounded recovery

A stream-stall watchdog abort (chatStreamStallTimeoutMs, below) is treated as just another interruption. When chatRecovery is on, a stall routes into this same bounded path: the settled partial is preserved and a continuation is scheduled, so a transient hang recovers automatically. A persistently hanging provider exhausts the attempt budget and reaches the terminal state through the same exhaustion handling as a deploy or eviction interruption: onExhausted fires, the chat:recovery:exhausted event is emitted, and the configured terminalMessage is shown instead of a raw stall error.

Configure bounded recovery by setting chatRecovery to an object:

JavaScript
export class MyAgent extends Think {
  chatRecovery = {
    maxAttempts: 6,
    stableTimeoutMs: 10_000,
    terminalMessage: "The assistant was interrupted and could not recover.",
    async onExhausted(ctx) {
      console.warn("Chat recovery exhausted", ctx.incidentId);
    },
  };

  getModel() {
    /* ... */
  }
}
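To make the attempt budget concrete, here is a minimal sketch of how a cap like maxAttempts could gate recovery. It is an illustrative model only, not Think's internal implementation: nextRecoveryStep, RecoveryStep, and BudgetConfig are hypothetical names, and the rule that exhaustion happens once the attempt number would exceed maxAttempts is an assumption.

```typescript
// Illustrative model only: these names are hypothetical, not Think APIs.
type RecoveryStep =
  | { kind: "recover"; attempt: number }
  | { kind: "exhausted"; message: string };

interface BudgetConfig {
  maxAttempts: number;
  terminalMessage: string;
}

// Decide what happens when an interruption is observed for an incident
// that has already used `attemptsSoFar` recovery attempts.
function nextRecoveryStep(
  attemptsSoFar: number,
  config: BudgetConfig,
): RecoveryStep {
  const attempt = attemptsSoFar + 1; // attempts are numbered from 1
  if (attempt > config.maxAttempts) {
    // Assumed rule: past the cap, surface the terminal message instead.
    return { kind: "exhausted", message: config.terminalMessage };
  }
  return { kind: "recover", attempt };
}
```

Under this model, a transient stall consumes one attempt and recovers, while a provider that hangs on every attempt eventually yields the exhausted step carrying the terminal message.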

The same recovery events are available through agents/observability on the chat channel; transcript repairs are emitted on the transcript channel. Refer to Observability.

onChatRecovery

Override onChatRecovery when you need provider-specific recovery, such as retrieving a stored OpenAI Responses result instead of issuing a new model call:

JavaScript
export class MyAgent extends Think {
  chatRecovery = {
    maxAttempts: 6,
    terminalMessage: "The assistant was interrupted. Please try again.",
  };

  async onChatRecovery(ctx) {
    console.log("Recovering chat turn", ctx.incidentId, ctx.attempt);
    return {}; // persist partial output and continue/retry when possible
  }
}

ChatRecoveryContext

| Field | Type | Description |
| --- | --- | --- |
| incidentId | string | Stable ID for this recovery incident |
| attempt | number | Current attempt number for this incident, starting at 1 |
| maxAttempts | number | Configured attempt cap before terminal exhaustion |
| recoveryKind | "retry" \| "continue" | Whether recovery will retry an unanswered user turn or continue a partial assistant turn |
| streamId | string | The stream ID of the interrupted turn |
| requestId | string | The request ID of the interrupted turn |
| partialText | string | Text generated before the interruption |
| partialParts | MessagePart[] | Parts accumulated before the interruption |
| recoveryData | unknown \| null | Data from this.stash() during the turn |
| messages | UIMessage[] | Current conversation history |
| lastBody | Record<string, unknown>? | Body from the interrupted turn |
| lastClientTools | ClientToolSchema[]? | Client tools from the interrupted turn |
| createdAt | number | Epoch milliseconds when the turn started |

ChatRecoveryOptions

| Field | Type | Description |
| --- | --- | --- |
| persist | boolean? | Whether to persist the partial assistant message |
| continue | boolean? | Whether to auto-continue with a new turn via continueLastTurn() |

With persist: true, the partial message is saved. With continue: true, Think calls continueLastTurn() after the agent reaches a stable state.

For pre-stream interruptions, where ctx.streamId === "" and ctx.partialText === "" but the latest persisted message is still the unanswered user message, Think retries that turn automatically unless continue is false.

TypeScript
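// Inside your Think subclass
onChatRecovery(ctx: ChatRecoveryContext): ChatRecoveryOptions {
  // Both fields are empty when the interruption happened before streaming began.
  if (!ctx.streamId && !ctx.partialText) {
    console.log("Recovering a pre-stream interruption");
  }
  return {};
}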

Use ctx.createdAt to skip stale recoveries. For example, if the interrupted turn is older than a few minutes, return { continue: false } so the partial response is preserved without starting an old continuation.
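The staleness check described above can be sketched as a small pure function. This is a minimal sketch under stated assumptions: RecoveryContextLike and RecoveryOptionsLike are local stand-ins for the fields of ChatRecoveryContext and ChatRecoveryOptions used here, and recoveryOptionsFor and the five-minute STALE_AFTER_MS threshold are hypothetical choices, not part of Think.

```typescript
// Local stand-ins for the fields used here; the real types come from Think.
interface RecoveryContextLike {
  createdAt: number; // epoch milliseconds when the turn started
}

interface RecoveryOptionsLike {
  persist?: boolean;
  continue?: boolean;
}

// Hypothetical threshold: treat turns older than five minutes as stale.
const STALE_AFTER_MS = 5 * 60 * 1000;

// Returns the options an onChatRecovery override could hand back.
// `now` is injectable so the decision is easy to test.
function recoveryOptionsFor(
  ctx: RecoveryContextLike,
  now: number = Date.now(),
): RecoveryOptionsLike {
  const ageMs = now - ctx.createdAt;
  if (ageMs > STALE_AFTER_MS) {
    // Do not start a continuation for an old turn.
    return { continue: false };
  }
  return {}; // fall back to the default recovery behavior
}
```

An onChatRecovery override could then simply return recoveryOptionsFor(ctx). Keeping the decision in a pure function lets you tune the threshold without touching the agent class.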

Repairing interrupted tool calls

When a turn is interrupted mid-flight, the transcript can contain a tool call with no settled result. Before the next provider call, Think repairs each such call so the model does not silently re-run it and the provider does not reject the transcript with AI_MissingToolResultsError. The default flips the interrupted call to an errored tool result, so the record survives and conversion still has a tool result for it.

Override repairInterruptedToolPart to customize the repaired shape. The common case is a client-resolved tool — for example an ask_user question that has no server execute and is normally answered by the user's next message. Converting it to a plain text part lets the model treat it as ordinary conversation rather than a tool error, and keeps the question verbatim through compaction:

JavaScript
export class MyAgent extends Think {
  repairInterruptedToolPart(part) {
    // Client-resolved question: keep the prompt as ordinary conversation text.
    if (part.type === "tool-ask_user") {
      const input = part.input;
      if (input?.prompt) {
        return { type: "text", text: input.prompt };
      }
    }
    // Everything else falls back to the default errored tool result.
    return super.repairInterruptedToolPart(part);
  }
}

This runs during transcript repair — before the repaired transcript is persisted and sent to the model — so the conversion shapes the current turn, not just the next one. The input is already normalized to a valid object. A returned tool part must carry a settled result (output-available, output-error, or output-denied); returning a non-tool part such as text is also fine.
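The rule for return values can be expressed as a small check, which may be handy in a unit test for a custom repairInterruptedToolPart. This is a sketch, not a Think API: isAcceptableRepair and RepairedPartLike are hypothetical names, and it assumes tool parts are identified by a type beginning with "tool-" (as in the tool-ask_user example above) and carry their status in a state field.

```typescript
// Assumed minimal shape of a message part; real parts carry more fields.
interface RepairedPartLike {
  type: string;
  state?: string;
}

// The three settled results named in the text above.
const SETTLED_STATES = new Set([
  "output-available",
  "output-error",
  "output-denied",
]);

// True when a repaired part satisfies the rule: non-tool parts are always
// fine, tool parts must carry one of the settled states.
function isAcceptableRepair(part: RepairedPartLike): boolean {
  const isToolPart = part.type.startsWith("tool-");
  if (!isToolPart) {
    return true;
  }
  return part.state !== undefined && SETTLED_STATES.has(part.state);
}
```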

Stability detection

Think provides methods to check if the agent is in a stable state — no pending tool results, no pending approvals, no active turns.

hasPendingInteraction

Returns true if any assistant message has pending tool calls (tools without results or pending approvals).

TypeScript
protected hasPendingInteraction(): boolean
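As a rough mental model of what this check looks for, the following sketch scans a message list for assistant tool parts that have not settled. It is not Think's implementation: hasPendingToolCalls, SketchPart, and SketchMessage are hypothetical, and the sketch assumes tool parts have a type beginning with "tool-" and a state field, treating any state other than a settled output as pending (which would include a call awaiting approval).

```typescript
// Assumed minimal shapes; real UI messages carry more fields.
interface SketchPart {
  type: string;
  state?: string;
}

interface SketchMessage {
  role: "user" | "assistant" | "system";
  parts: SketchPart[];
}

const RESOLVED_STATES = new Set([
  "output-available",
  "output-error",
  "output-denied",
]);

// A tool part is pending when it has no settled output yet.
function isPendingToolPart(part: SketchPart): boolean {
  if (!part.type.startsWith("tool-")) {
    return false;
  }
  return part.state === undefined || !RESOLVED_STATES.has(part.state);
}

// True if any assistant message still contains a pending tool call.
function hasPendingToolCalls(messages: SketchMessage[]): boolean {
  return messages.some(
    (message) =>
      message.role === "assistant" && message.parts.some(isPendingToolPart),
  );
}
```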

waitUntilStable

Returns a promise that resolves to true when the agent reaches a stable state, or false if the timeout is exceeded.

JavaScript
const stable = await this.waitUntilStable({ timeout: 30_000 });
if (stable) {
  await this.saveMessages([
    {
      id: crypto.randomUUID(),
      role: "user",
      parts: [{ type: "text", text: "Now that you are done, summarize." }],
    },
  ]);
}