---
title: Durable recovery
description: Bounded chat recovery, the stream-stall watchdog, repairing interrupted tool calls, and stability detection for Think agents.
image: https://developers.cloudflare.com/dev-products-preview.png
---

# Durable recovery

Think wraps chat turns in recoverable [fibers](https://developers.cloudflare.com/agents/api-reference/durable-execution/) by default (`chatRecovery = true`). If the Durable Object is evicted mid-stream, Think reconstructs any buffered chunks, persists partial output, and schedules either a continuation of the assistant turn or a retry of the unanswered user turn.

Note: Recovery is on by default and works without configuration, so most apps never need this page. Read it when you want provider-specific recovery, a stall watchdog, or control over what users see once recovery gives up.

When `chatRecovery` is `true`, WebSocket turns, sub-agent `chat()` turns, durable `submitMessages()` executions, auto-continuations, `saveMessages()`, and `continueLastTurn()` are wrapped in `runFiber`.

## Bounded recovery

Recovery is bounded: each incident gets a limited number of attempts (`maxAttempts`). When that budget is exhausted, Think stops recovering and ends the turn with the configured `terminalMessage`.

A stream-stall watchdog abort (`chatStreamStallTimeoutMs`) is treated as just another interruption. When `chatRecovery` is on, a stall routes into the same bounded path: the settled partial is preserved and a continuation is scheduled, so a transient hang recovers automatically. A persistently hanging provider exhausts the budget and goes through the **same** exhaustion handling as a deploy or eviction interruption: `onExhausted` fires, the `chat:recovery:exhausted` event is emitted, and the configured `terminalMessage` is shown instead of a raw stall error.

Configure bounded recovery by setting `chatRecovery` to an object:

JavaScript

```javascript
import { Think } from "@cloudflare/think";

export class MyAgent extends Think {
  chatRecovery = {
    maxAttempts: 6,
    stableTimeoutMs: 10_000,
    terminalMessage: "The assistant was interrupted and could not recover.",
    // Called once the attempt budget for an incident is used up.
    async onExhausted(ctx) {
      console.warn("Chat recovery exhausted", ctx.incidentId);
    },
  };

  getModel() {
    /* ... */
  }
}
```

TypeScript

```typescript
import { Think } from "@cloudflare/think";

export class MyAgent extends Think<Env> {
  override chatRecovery = {
    maxAttempts: 6,
    stableTimeoutMs: 10_000,
    terminalMessage: "The assistant was interrupted and could not recover.",
    // Called once the attempt budget for an incident is used up.
    async onExhausted(ctx) {
      console.warn("Chat recovery exhausted", ctx.incidentId);
    },
  };

  getModel() {
    /* ... */
  }
}
```

The same recovery events are available through `agents/observability` on the `chat` channel; transcript repairs are emitted on the `transcript` channel. Refer to [Observability](https://developers.cloudflare.com/agents/api-reference/observability/#chat-recovery-events).

## onChatRecovery

Override `onChatRecovery` when you need provider-specific recovery, such as retrieving a stored OpenAI Responses result instead of issuing a new model call:

JavaScript

```javascript
import { Think } from "@cloudflare/think";

export class MyAgent extends Think {
  chatRecovery = {
    maxAttempts: 6,
    terminalMessage: "The assistant was interrupted. Please try again.",
  };

  async onChatRecovery(ctx) {
    console.log("Recovering chat turn", ctx.incidentId, ctx.attempt);
    return {}; // persist partial output and continue/retry when possible
  }
}
```

TypeScript

```typescript
import { Think } from "@cloudflare/think";
import type {
  ChatRecoveryContext,
  ChatRecoveryOptions,
} from "@cloudflare/think";

export class MyAgent extends Think<Env> {
  override chatRecovery = {
    maxAttempts: 6,
    terminalMessage: "The assistant was interrupted. Please try again.",
  };

  override async onChatRecovery(
    ctx: ChatRecoveryContext,
  ): Promise<ChatRecoveryOptions> {
    console.log("Recovering chat turn", ctx.incidentId, ctx.attempt);
    return {}; // persist partial output and continue/retry when possible
  }
}
```

### ChatRecoveryContext

| Field           | Type                     | Description                                                                              |
| --------------- | ------------------------ | ---------------------------------------------------------------------------------------- |
| incidentId      | string                   | Stable ID for this recovery incident                                                     |
| attempt         | number                   | Current attempt number for this incident, starting at 1                                  |
| maxAttempts     | number                   | Configured attempt cap before terminal exhaustion                                        |
| recoveryKind    | "retry" \| "continue"    | Whether recovery will retry an unanswered user turn or continue a partial assistant turn |
| streamId        | string                   | The stream ID of the interrupted turn                                                    |
| requestId       | string                   | The request ID of the interrupted turn                                                   |
| partialText     | string                   | Text generated before the interruption                                                   |
| partialParts    | MessagePart\[\]          | Parts accumulated before the interruption                                                |
| recoveryData    | unknown \| null          | Data from this.stash() during the turn                                                   |
| messages        | UIMessage\[\]            | Current conversation history                                                             |
| lastBody        | Record<string, unknown>? | Body from the interrupted turn                                                           |
| lastClientTools | ClientToolSchema\[\]?    | Client tools from the interrupted turn                                                   |
| createdAt       | number                   | Epoch milliseconds when the turn started                                                 |
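
As an illustration of how these fields fit together, the following sketch formats a one-line log summary for a recovery incident. It is a minimal example, not part of Think: `RecoveryContextLike` is a local stand-in that copies a few field names from the table above, and `describeRecovery` is a hypothetical helper. In a real agent you would pass the `ChatRecoveryContext` received by `onChatRecovery`.

```typescript
// Local stand-in for a subset of the documented ChatRecoveryContext fields.
// Assumption: a real agent would use the type exported by "@cloudflare/think".
type RecoveryContextLike = {
  incidentId: string;
  attempt: number;
  maxAttempts: number;
  recoveryKind: "retry" | "continue";
  partialText: string;
  createdAt: number; // epoch milliseconds when the turn started
};

// Hypothetical helper: builds a compact, log-friendly summary of an incident.
// `now` is injectable so the output is deterministic in tests.
function describeRecovery(
  ctx: RecoveryContextLike,
  now: number = Date.now(),
): string {
  const ageSeconds = Math.round((now - ctx.createdAt) / 1000);
  return (
    `incident=${ctx.incidentId} attempt=${ctx.attempt}/${ctx.maxAttempts} ` +
    `kind=${ctx.recoveryKind} partialChars=${ctx.partialText.length} ` +
    `ageSeconds=${ageSeconds}`
  );
}
```

Calling `console.log(describeRecovery(ctx))` at the top of `onChatRecovery` is one way to correlate repeated attempts that share an `incidentId`.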

### ChatRecoveryOptions

| Field    | Type     | Description                                                     |
| -------- | -------- | --------------------------------------------------------------- |
| persist  | boolean? | Whether to persist the partial assistant message                |
| continue | boolean? | Whether to auto-continue with a new turn via continueLastTurn() |

With `persist: true`, the partial message is saved. With `continue: true`, Think calls `continueLastTurn()` after the agent reaches a stable state.

For pre-stream interruptions, where `ctx.streamId === ""` and `ctx.partialText === ""` but the latest persisted message is still the unanswered user message, Think retries that turn automatically unless `continue` is `false`.

TypeScript

```typescript
onChatRecovery(ctx: ChatRecoveryContext): ChatRecoveryOptions {
  if (!ctx.streamId && !ctx.partialText) {
    console.log("Recovering a pre-stream interruption");
  }
  return {};
}
```

Use `ctx.createdAt` to skip stale recoveries. For example, if the interrupted turn is older than a few minutes, return `{ continue: false }` so the partial response is preserved without starting an old continuation.
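
The stale-recovery check can be sketched as a small pure function. This is a minimal example under stated assumptions: `RecoveryContextLike` and `RecoveryOptionsLike` are local stand-ins for the documented `ChatRecoveryContext` and `ChatRecoveryOptions` types, the five-minute threshold is an arbitrary illustrative value, and `staleAwareRecovery` is a hypothetical helper name.

```typescript
// Local stand-ins for the documented shapes. Assumption: a real agent would
// import ChatRecoveryContext and ChatRecoveryOptions from "@cloudflare/think".
type RecoveryContextLike = { createdAt: number };
type RecoveryOptionsLike = { persist?: boolean; continue?: boolean };

// Hypothetical threshold: treat turns older than five minutes as stale.
const STALE_AFTER_MS = 5 * 60 * 1000;

// `now` is injectable so the policy can be tested without a real clock.
function staleAwareRecovery(
  ctx: RecoveryContextLike,
  now: number = Date.now(),
): RecoveryOptionsLike {
  const ageMs = now - ctx.createdAt;
  if (ageMs > STALE_AFTER_MS) {
    // Too old: keep the partial response but do not start a continuation.
    return { continue: false };
  }
  // Fresh enough: return no overrides and let the defaults apply.
  return {};
}
```

Inside an agent, `onChatRecovery(ctx)` could simply return `staleAwareRecovery(ctx)`. Keeping the decision in a pure function makes the threshold easy to unit test.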

## Repairing interrupted tool calls

When a turn is interrupted mid-flight, the transcript can contain a tool call with no settled result. Before the next provider call, Think repairs each such call so the model does not silently re-run it and the provider does not reject the transcript with `AI_MissingToolResultsError`. The default flips the interrupted call to an errored tool result, so the record survives and conversion still has a tool result for it.

Override `repairInterruptedToolPart` to customize the repaired shape. The common case is a client-resolved tool — for example an `ask_user` question that has no server `execute` and is normally answered by the user's next message. Converting it to a plain text part lets the model treat it as ordinary conversation rather than a tool error, and keeps the question verbatim through compaction:

JavaScript

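```javascript
import { Think } from "@cloudflare/think";

export class MyAgent extends Think {
  repairInterruptedToolPart(part) {
    const record = part;
    if (record.type === "tool-ask_user") {
      const input = record.input;
      if (input?.prompt) {
        // Keep the question as ordinary conversation text.
        return { type: "text", text: input.prompt };
      }
    }
    // Everything else falls back to the default errored tool result.
    return super.repairInterruptedToolPart(part);
  }
}
```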

TypeScript

```typescript
import { Think } from "@cloudflare/think";
import type { UIMessage } from "ai";

export class MyAgent extends Think<Env> {
  protected override repairInterruptedToolPart(
    part: UIMessage["parts"][number],
  ): UIMessage["parts"][number] {
    const record = part as Record<string, unknown>;
    if (record.type === "tool-ask_user") {
      const input = record.input as { prompt?: string } | undefined;
      if (input?.prompt) {
        // Keep the question as ordinary conversation text.
        return { type: "text", text: input.prompt };
      }
    }
    // Everything else falls back to the default errored tool result.
    return super.repairInterruptedToolPart(part);
  }
}
```

This runs during transcript repair — before the repaired transcript is persisted and sent to the model — so the conversion shapes the current turn, not just the next one. The `input` is already normalized to a valid object. A returned tool part must carry a settled result (`output-available`, `output-error`, or `output-denied`); returning a non-tool part such as text is also fine.
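
The rule for returned parts can be expressed as a small check, which may be useful in unit tests for a custom `repairInterruptedToolPart`. This is a sketch, not Think's own validation: `PartLike` is a simplified local stand-in for a message part, `isAcceptableRepair` is a hypothetical helper, and it assumes tool parts are identified by a `tool-` type prefix (as in `tool-ask_user`) and carry their status in a `state` field.

```typescript
// Simplified stand-in for a message part. Assumption: real code would use
// UIMessage["parts"][number] from "ai".
type PartLike = { type: string; state?: string };

// The settled states named in the paragraph above.
const SETTLED_STATES = new Set([
  "output-available",
  "output-error",
  "output-denied",
]);

// Hypothetical helper: returns true when a repaired part follows the rule
// "tool parts must be settled; non-tool parts are always fine".
function isAcceptableRepair(part: PartLike): boolean {
  if (!part.type.startsWith("tool-")) {
    return true; // for example, a plain text part
  }
  return part.state !== undefined && SETTLED_STATES.has(part.state);
}
```

For example, a text part passes, a `tool-ask_user` part in `output-error` passes, and the same part left in `input-available` does not.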

## Stability detection

Think provides methods to check if the agent is in a stable state — no pending tool results, no pending approvals, no active turns.

### hasPendingInteraction

Returns `true` if any assistant message has pending tool calls (tools without results or pending approvals).

TypeScript

```typescript
protected hasPendingInteraction(): boolean
```

### waitUntilStable

Returns a promise that resolves to `true` when the agent reaches a stable state, or `false` if the timeout is exceeded.

JavaScript

```javascript
const stable = await this.waitUntilStable({ timeout: 30_000 });
if (stable) {
  await this.saveMessages([
    {
      id: crypto.randomUUID(),
      role: "user",
      parts: [{ type: "text", text: "Now that you are done, summarize." }],
    },
  ]);
}
```

TypeScript

```typescript
const stable = await this.waitUntilStable({ timeout: 30_000 });
if (stable) {
  await this.saveMessages([
    {
      id: crypto.randomUUID(),
      role: "user",
      parts: [{ type: "text", text: "Now that you are done, summarize." }],
    },
  ]);
}
```