Changelog

New updates and improvements at Cloudflare.

Workers
  1. Durable Objects now supports two new location hints for Asia-Pacific: apac-ne (Northeast Asia-Pacific) and apac-se (Southeast Asia-Pacific). Use apac-ne or apac-se when you want finer-grained placement within Asia-Pacific rather than the broader apac hint.

    Use the new hints the same way as any other locationHint:

    JavaScript
    // Northeast Asia-Pacific (Japan, Korea, etc.)
    const stubNE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-ne" });
    // Southeast Asia-Pacific (Singapore, Indonesia, etc.)
    const stubSE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-se" });

    If your users are spread across all of Asia-Pacific, the existing apac hint remains the right choice. Only reach for apac-ne or apac-se when your traffic is clearly concentrated in one sub-region and you want to minimize round-trip time to that audience. The default behavior, and what we generally recommend, is to omit the location hint unless you need one: Cloudflare then creates the Durable Object as close as possible to the request that initializes it, which reduces latency.

    As with all location hints, these are best-effort suggestions. Cloudflare will place the Durable Object in a nearby data center, not necessarily the exact hinted location.
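
    The guidance above can be sketched as a small helper. This is a hypothetical function, not part of the Durable Objects API: the name chooseApacHint, the traffic counters, and the 0.8 concentration threshold are all assumptions chosen for illustration, and it only applies once you have already decided that an Asia-Pacific hint is needed.

```typescript
// Hypothetical helper: picks an Asia-Pacific location hint from observed traffic.
// The 0.8 threshold is an assumption, not a Cloudflare recommendation.
type ApacHint = "apac" | "apac-ne" | "apac-se";

interface ApacTraffic {
  northeast: number; // requests from Northeast Asia-Pacific (for example Japan, Korea)
  southeast: number; // requests from Southeast Asia-Pacific (for example Singapore, Indonesia)
}

function chooseApacHint(traffic: ApacTraffic, threshold = 0.8): ApacHint {
  const total = traffic.northeast + traffic.southeast;
  if (total === 0) return "apac"; // no data: keep the broad hint
  if (traffic.northeast / total >= threshold) return "apac-ne";
  if (traffic.southeast / total >= threshold) return "apac-se";
  return "apac"; // traffic is spread out: keep the broad hint
}
```

    The result can be passed as the locationHint option shown earlier, for example env.MY_DURABLE_OBJECT.get(id, { locationHint: chooseApacHint(traffic) }).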

    For the full list of supported hints, refer to Data location — Provide a location hint.

  1. AI agents can now deploy Workers to Cloudflare without first requiring a user to sign up, open a browser-based OAuth flow, click through the dashboard, or create an API token. When an agent tries to deploy without Cloudflare credentials, Wrangler can tell it to rerun with --temporary, then deploy the Worker to a temporary preview account.

    To try this with your agent, update to Wrangler 4.102.0 or later, make sure you are logged out (wrangler logout), and then ask your agent to build something and deploy it to Cloudflare. The agent should follow Wrangler's output and deploy using the --temporary flag.

    Diagram showing an AI agent deploying, verifying, and redeploying a Worker to a temporary account, then claiming it after authentication and moving it to a permanent account
    Terminal window
    wrangler deploy --temporary

    The temporary deployment stays live for 60 minutes. During that window, the agent can verify the Worker, redeploy changes, and return both the live Worker URL and claim URL. Opening the claim URL lets you sign in to or create a Cloudflare account and make the temporary account permanent.

    Temporary preview accounts currently support a limited set of products, including Workers, Workers Static Assets, Workers KV, D1, Durable Objects, Hyperdrive, Queues, and SSL/TLS certificates. For supported products, limits, and claim behavior, refer to Claim deployments (temporary accounts).

    For more context, refer to Temporary Cloudflare Accounts for Agents.

  1. You can create PlanetScale Postgres and MySQL databases from Cloudflare and bill PlanetScale database usage through your Cloudflare account as a pay-as-you-go customer. Cloudflare contract customers will be able to add PlanetScale usage to their contract in July. Reach out to your Cloudflare account team if you are interested.

    Create a PlanetScale database from the Cloudflare dashboard to try globally distributed Workers optimized for regional data access.

    Go to Create a PlanetScale database

    Request flow from a user to Workers, Hyperdrive caches, connection pools, and PlanetScale.

    PlanetScale databases created from Cloudflare work with Workers through Hyperdrive. Hyperdrive manages database connection pools and query caching, so you can use PlanetScale as a centralized relational database for Workers applications without changing your database drivers, object-relational mapping (ORM) libraries, or SQL tooling.

    PlanetScale usage appears on your Cloudflare invoice each billing period as a dollar total at PlanetScale's standard pricing. You can inspect per-database billing usage in PlanetScale's dashboard.

    When you create a PlanetScale database from the Cloudflare dashboard, you receive the same PlanetScale developer experience, including development branches, query insights, and Model Context Protocol (MCP) server support for agents.

    To get started, refer to PlanetScale Postgres and MySQL with Hyperdrive.

  1. The latest release of the Agents SDK makes it easier to build agents that can safely interact with real systems and keep working through interruptions.

    Agents can now browse websites through Browser Run, write code against external tools through Code Mode, use client-provided tools when delegating to Think sub-agents, and recover more reliably from deploys, Durable Object evictions, and connection churn.

    Safer browser automation

    Agents can now use Browser Run through a single durable browser_execute tool. Instead of choosing from a fixed list of actions, the model writes code against the Chrome DevTools Protocol (CDP) and can inspect pages, capture screenshots, read rendered content, debug frontend behavior, and interact with live browser sessions.

    JavaScript
    const browserTools = createBrowserTools({
      ctx: this.ctx,
      browser: this.env.BROWSER,
      loader: this.env.LOADER,
      session: { mode: "dynamic" },
    });

    Browser sessions can be one-time, reused, or promoted from one-time to persistent during a run. This is useful when an agent needs a human to log in, complete MFA, or approve a sensitive action. The run can pause, keep the same tabs and cookies, and resume after approval.

    The browser tools also add Live View URLs, optional session recording, and quick actions such as browser_markdown, browser_extract, browser_links, and browser_scrape for one-shot browsing tasks.

    Resumable code execution with approvals

    Code Mode now uses createCodemodeRuntime, connectors, and a durable execution log. This lets you give a model one codemode tool instead of a large prompt full of tool definitions. The model can discover the capabilities it needs, write code against typed globals, and reuse saved snippets.

    JavaScript
    const runtime = createCodemodeRuntime({
      ctx: this.ctx,
      executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
      connectors: [new GithubConnector(this.ctx, this.env, connection)],
    });
    const result = streamText({
      model,
      messages,
      tools: { codemode: runtime.tool() },
    });

    When the code reaches an approval-gated action, the runtime pauses execution and returns a pending approval. After approval, completed calls replay from the durable log, the approved action runs, and the same code continues. This makes it practical to build agents that create issues, update external systems, or perform other side effects without custom pause-and-resume logic for every tool.
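
    The replay behavior described above can be illustrated with a toy model. This is a minimal sketch of the general "replay completed calls from a log" idea, not the Code Mode runtime's implementation: ReplayRuntime, run, and the in-memory log array are hypothetical names invented for this example, and the real runtime persists its log durably.

```typescript
// Illustrative only: a toy replay log, not the Code Mode runtime.
interface LogEntry {
  key: string;
  result: unknown;
}

type RunResult<T> =
  | { status: "done"; value: T }
  | { status: "pending"; key: string };

class ReplayRuntime {
  pendingKey: string | null = null;
  private cursor = 0;
  private readonly log: LogEntry[];
  private readonly approved: string[];

  constructor(log: LogEntry[], approved: string[]) {
    this.log = log;
    this.approved = approved;
  }

  call<T>(key: string, needsApproval: boolean, action: () => T): T {
    // Completed calls are answered from the log instead of running again.
    if (this.cursor < this.log.length) {
      const entry = this.log[this.cursor];
      if (entry.key !== key) throw new Error("replay mismatch at " + key);
      this.cursor += 1;
      return entry.result as T;
    }
    // An unapproved gated action stops the run before any side effect happens.
    if (needsApproval && this.approved.indexOf(key) === -1) {
      this.pendingKey = key;
      throw new Error("approval required for " + key);
    }
    const result = action();
    this.log.push({ key, result });
    this.cursor += 1;
    return result;
  }
}

// Runs the same program from the top; earlier calls replay, new calls execute.
function run<T>(
  program: (rt: ReplayRuntime) => T,
  log: LogEntry[],
  approved: string[],
): RunResult<T> {
  const rt = new ReplayRuntime(log, approved);
  try {
    return { status: "done", value: program(rt) };
  } catch (err) {
    if (rt.pendingKey !== null) return { status: "pending", key: rt.pendingKey };
    throw err;
  }
}
```

    Calling run once with no approvals returns a pending result. Calling it again with the same log and the approved key replays the earlier calls, executes the gated action, and lets the program finish.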

    Better Think delegation

    Think sub-agents can now use client-defined tools over the RPC chat() path. A parent agent can pass tool schemas with clientTools and resolve tool calls through onClientToolCall. This lets delegated agents use caller-provided capabilities without requiring a browser WebSocket.

    JavaScript
    await child.chat(message, callback, {
      signal,
      clientTools: [
        {
          name: "get_user_timezone",
          description: "Get the caller's timezone",
          parameters: { type: "object" },
        },
      ],
      onClientToolCall: async ({ toolName, input }) => {
        return runClientTool(toolName, input);
      },
    });

    Think Workflows also improve step.prompt(). A prompt step now runs a full agentic turn before returning structured output, so the agent can call tools before producing the typed result. This makes Workflow steps more useful for durable triage, research, and approval flows.

    The unified Think execute tool can also include cdp.* browser capabilities alongside state.* and tools.* when Browser Run is bound.

    Voice output device selection

    Voice clients can route assistant audio to a specific output device. Use outputDeviceId with useVoiceAgent, or call client.setOutputDevice() from the framework-agnostic client.

    JavaScript
    const voice = useVoiceAgent({
      agent: "MyVoiceAgent",
      outputDeviceId: selectedSpeakerId,
    });

    Browsers without speaker-selection support continue playing through the default output device and report a non-fatal outputDeviceError.

    Reliability fixes

    This release includes several fixes for production agents:

    • useAgent and AgentClient handle WebSocket replacement more reliably during reconnects and configuration changes.
    • Chat stream replay is more reliable after reconnects, deploys, and provider errors.
    • Fiber recovery continues across multi-pass scans and backs off when recovery hooks keep failing.
    • Agent teardown continues even when the request that started teardown is canceled.
    • Large session histories use byte-budgeted reads to reduce memory pressure during startup.

    Upgrade

    To update to the latest version:

    npm i agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

    Refer to the Code Mode documentation, Browser tools documentation, Think tools documentation, and Voice documentation for more information.

  1. We are excited to announce GLM-5.2 on Workers AI, Z.ai's flagship agentic coding model.

    @cf/zai-org/glm-5.2 is a text generation model built for agentic coding workflows. With function calling and reasoning support, it can handle long codebases, multi-step planning, and tool-augmented agents.

    Key features and use cases:

    • Agentic coding: Designed for autonomous coding tasks, long-horizon planning, and complex software engineering workflows
    • Large context window: GLM-5.2 supports up to a 1,048,576 token context window. Workers AI is launching the model with a 262,144 token context window and plans to increase this in the future
    • Function calling: Build agents that invoke tools and APIs across multiple conversation turns
    • Reasoning: Tackles complex problem-solving and step-by-step reasoning tasks
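
    As a rough illustration of the context window figures above, the following sketch computes how many tokens remain for output after a prompt. It assumes the prompt and the generated output share a single context window, which is the usual convention but is not stated in this entry. The function and constant names are hypothetical.

```typescript
// Context window at launch on Workers AI, per the feature list above.
const LAUNCH_CONTEXT_TOKENS = 262144;

// Assumption: prompt tokens and output tokens draw from the same window.
function remainingOutputTokens(
  promptTokens: number,
  contextWindow: number = LAUNCH_CONTEXT_TOKENS,
): number {
  if (promptTokens < 0) throw new RangeError("promptTokens must be non-negative");
  return Math.max(0, contextWindow - promptTokens);
}
```

    With the launch window, a 200,000-token prompt leaves 62,144 tokens. Passing 1048576 as the second argument models the larger window the model itself supports.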

    Use GLM-5.2 through the Workers AI binding (env.AI.run()), the REST API at /run or /v1/chat/completions, or AI Gateway.

    Pricing is available on the model page or pricing page.

  1. You can now create custom trace spans in your Workers code using tracing.enterSpan(). Custom spans appear alongside the automatic platform instrumentation (fetch calls, KV reads, D1 queries, and other platform operations) in your traces and OpenTelemetry exports, with correct parent-child nesting.

    The API is available via import { tracing } from "cloudflare:workers" or through the handler context as ctx.tracing:

    TypeScript
    import { tracing } from "cloudflare:workers";
    export default {
      async fetch(request, env, ctx) {
        return tracing.enterSpan("handleRequest", async (span) => {
          span.setAttribute("url.path", new URL(request.url).pathname);
          const data = await env.MY_KV.get("key");
          return new Response(data);
        });
      },
    };

    Spans nest automatically based on the JavaScript async context, and are auto-ended when the callback returns or its returned promise settles. The Span object provides setAttribute(key, value) for attaching metadata and an isTraced property to check whether the current request is being sampled.
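
    One possible use of isTraced is to skip building attributes when the request is not sampled. The sketch below is an assumption-heavy illustration: SpanLike and TracerLike are hypothetical structural types that cover only the members mentioned above, withAttributes is an invented helper name, and the attribute value types other than string are a guess. In a Worker you would pass the tracing object imported from "cloudflare:workers".

```typescript
// Hypothetical structural types covering only the members described above.
interface SpanLike {
  readonly isTraced: boolean;
  setAttribute(key: string, value: string | number | boolean): void;
}

interface TracerLike {
  enterSpan<T>(name: string, callback: (span: SpanLike) => T): T;
}

// Runs `work` inside a span and attaches attributes only when the request is sampled.
function withAttributes<T>(
  tracer: TracerLike,
  name: string,
  attributes: () => { [key: string]: string | number | boolean },
  work: () => T,
): T {
  return tracer.enterSpan(name, (span) => {
    if (span.isTraced) {
      const attrs = attributes();
      for (const key of Object.keys(attrs)) span.setAttribute(key, attrs[key]);
    }
    return work();
  });
}
```

    Because the attributes are produced by a function, any expensive computation behind them runs only for sampled requests.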

    Trace waterfall showing custom spans nested alongside automatic KV and fetch instrumentation

    Tracing must be enabled in your Wrangler configuration for spans to be recorded.

    For full API details and examples, refer to Custom spans.

  1. You can now filter the Metrics tab for a Durable Objects namespace by an individual Durable Object's ID or name in the Cloudflare dashboard. Previously, metrics charts only showed aggregate, namespace-level data, making it difficult to isolate the behavior of a specific object.

    Go to Durable Objects

    The Durable Objects Metrics tab filtered to a single object by ID, showing per-object requests and errors by invocation status.

    Start typing an ID or name into the filter and select a match from the autocomplete dropdown. The autocomplete only shows objects with invocations during the selected time range, so an object that does not appear has not been invoked in that window. This does not necessarily mean the object has been deleted. Every chart on the page updates to reflect only the selected object. This makes it easier to identify and investigate a single Durable Object when debugging a high-traffic object, an error spike, or unexpected storage usage. Clear the filter to return to namespace-level metrics.

    Metrics are powered by the GraphQL Analytics API, so standard analytics behavior such as ingestion delay and sampling applies.

    For more information, refer to Metrics and analytics.

  1. Dynamic Workers usage on the Workers overview page

    Customers can now view the number of Dynamic Workers invoked during their billing period from the Workers overview page in the Cloudflare dashboard.

    This count reflects the number of Dynamic Workers that Cloudflare would bill for during the selected billing period. Dynamic Workers usage data only goes back to June 1, 2026.

    You can also query this count through the GraphQL Analytics API by using workersInvocationsByOwnerAndScriptGroups and selecting distinctDynamicWorkerCount:

    query getDynamicWorkersCount(
      $accountTag: string!
      $filter: AccountWorkersInvocationsByOwnerAndScriptGroupsFilter_InputObject
    ) {
      viewer {
        accounts(filter: { accountTag: $accountTag }) {
          workersInvocationsByOwnerAndScriptGroups(limit: 10000, filter: $filter) {
            uniq {
              distinctDynamicWorkerCount
            }
          }
        }
      }
    }

    Use variables to set the account and billing-period date range:

    {
      "accountTag": "<ACCOUNT_ID>",
      "filter": {
        "date_geq": "2026-06-01",
        "date_leq": "2026-06-30"
      }
    }
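
    The query and variables above can be combined into a single request body. The following is a minimal sketch: buildDynamicWorkersRequest is a hypothetical helper name, and clamping the start date to June 1, 2026 is an illustrative choice based on the data availability noted above, not documented API behavior.

```typescript
const DYNAMIC_WORKERS_QUERY = `
  query getDynamicWorkersCount(
    $accountTag: string!
    $filter: AccountWorkersInvocationsByOwnerAndScriptGroupsFilter_InputObject
  ) {
    viewer {
      accounts(filter: { accountTag: $accountTag }) {
        workersInvocationsByOwnerAndScriptGroups(limit: 10000, filter: $filter) {
          uniq {
            distinctDynamicWorkerCount
          }
        }
      }
    }
  }
`;

// Usage data only goes back to this date, per the note above.
const EARLIEST_USAGE_DATE = "2026-06-01";
const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Hypothetical helper: builds the JSON body for a billing-period query.
function buildDynamicWorkersRequest(accountTag: string, start: string, end: string) {
  if (!ISO_DATE.test(start) || !ISO_DATE.test(end)) {
    throw new Error("dates must use the YYYY-MM-DD format");
  }
  // ISO dates compare correctly as strings, so clamp the start lexicographically.
  const from = start < EARLIEST_USAGE_DATE ? EARLIEST_USAGE_DATE : start;
  if (end < from) throw new Error("end date is before the start date");
  return {
    query: DYNAMIC_WORKERS_QUERY,
    variables: { accountTag, filter: { date_geq: from, date_leq: end } },
  };
}
```

    The returned object would be serialized with JSON.stringify and sent as the body of an authenticated POST request to the GraphQL Analytics API endpoint.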

    For more information, refer to Dynamic Workers pricing.

  1. Pay-as-you-go customers can now view billable usage and create budget alerts directly from the product overview pages for Workers & Pages, D1, R2, Workers KV, Queues, Vectorize, Durable Objects, and Containers. A new sidebar widget shows current-period spend and the billing cycle date range, alongside a button to create a budget alert.

    The widget pulls from the same data as the Billable Usage dashboard and aligns to your billing cycle (or the current day on Free plans), so the numbers match your invoice. Enterprise contract accounts are not yet supported.

    Billable usage widget in the Durable Objects product sidebar showing current-period spend and a breakdown by service

    Selecting Create budget alert opens the budget alert flow inline so you can set a dollar threshold in the same place you are reviewing usage. Budget alerts apply to your total account-level spend across all products, not just the product page you create them from.

    For more information, refer to the Usage-based billing documentation.

  1. The pipeline field inside the pipelines binding configuration in your Wrangler configuration file has been renamed to stream. The old field is deprecated but still accepted.

    Update your configuration to use stream to avoid the deprecation warning.

    Before (deprecated):

    JSONC
    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      "pipelines": [
        {
          "binding": "MY_PIPELINE",
          "pipeline": "<STREAM_ID>"
        }
      ]
    }

    After:

    JSONC
    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      "pipelines": [
        {
          "binding": "MY_PIPELINE",
          "stream": "<STREAM_ID>"
        }
      ]
    }

    No other changes are required. The binding name, TypeScript types, and runtime API (env.MY_PIPELINE.send(...)) remain the same.
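
    If you maintain many configuration files, the rename can be scripted. The sketch below is a hypothetical migration helper that works on an already-parsed configuration object. The types and the function name are invented for this example, and parsing and writing the JSONC file is left out.

```typescript
interface PipelineBinding {
  binding: string;
  pipeline?: string; // deprecated field name
  stream?: string; // current field name
}

interface WranglerConfigLike {
  pipelines?: PipelineBinding[];
  [key: string]: unknown;
}

// Returns a copy of the config with `pipeline` renamed to `stream` in every binding.
function migratePipelineBindings(config: WranglerConfigLike): WranglerConfigLike {
  if (!config.pipelines) return config;
  const pipelines = config.pipelines.map((entry): PipelineBinding => {
    if (entry.pipeline === undefined) return entry;
    const { pipeline, ...rest } = entry;
    // If both fields are present, keep the existing `stream` value.
    return { ...rest, stream: rest.stream !== undefined ? rest.stream : pipeline };
  });
  return { ...config, pipelines };
}
```

    All other keys in the configuration are copied through unchanged, and the input object is not mutated.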

    For more information on configuring pipeline bindings, refer to Writing to streams.

  1. You can now create, update, or delete multiple secrets for your Worker in a single request using the bulk secrets endpoint.

    • Include a secret with a value to create or update.
    • Set a secret to null to delete.
    • Secrets not included in the request are left unchanged.

    The following example creates API_KEY, updates the already existing DB_PASSWORD, and deletes OLD_SECRET:

    {
      "secrets": {
        "API_KEY": { "type": "secret_text", "name": "API_KEY", "text": "my-api-key" },
        "DB_PASSWORD": { "type": "secret_text", "name": "DB_PASSWORD", "text": "my-db-password" },
        "OLD_SECRET": null
      }
    }

    You can do the same from the command line using wrangler secret bulk:

    Terminal window
    npx wrangler secret bulk < secrets.json

    To delete a key, set its value to null in the JSON file. Deletion is not supported with .env files.

    Each request supports up to 100 total operations (creates, updates, and deletes combined).
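
    When a change set exceeds that limit, it has to be split across several requests. The following sketch builds request bodies in the shape shown above and splits them into batches of at most 100 operations. The function name and input shape are hypothetical, and sending the requests is not shown.

```typescript
type SecretValue = { type: "secret_text"; name: string; text: string } | null;

interface BulkSecretsBody {
  secrets: { [name: string]: SecretValue };
}

// Limit stated above: creates, updates, and deletes combined.
const MAX_OPERATIONS = 100;

// Hypothetical helper: turns upserts and deletions into one or more request bodies.
function buildBulkSecretBodies(
  upserts: { [name: string]: string },
  deletions: string[],
): BulkSecretsBody[] {
  const operations: Array<[string, SecretValue]> = [];
  for (const name of Object.keys(upserts)) {
    operations.push([name, { type: "secret_text", name, text: upserts[name] }]);
  }
  for (const name of deletions) {
    if (Object.prototype.hasOwnProperty.call(upserts, name)) {
      throw new Error("secret is both set and deleted: " + name);
    }
    operations.push([name, null]); // null marks the secret for deletion
  }

  const bodies: BulkSecretsBody[] = [];
  for (let i = 0; i < operations.length; i += MAX_OPERATIONS) {
    const body: BulkSecretsBody = { secrets: {} };
    for (const [name, value] of operations.slice(i, i + MAX_OPERATIONS)) {
      body.secrets[name] = value;
    }
    bodies.push(body);
  }
  return bodies;
}
```

    Each returned body can be sent as its own request. Secrets that appear in neither input are omitted from every body and therefore left unchanged.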

  1. You can now attach cron schedules directly to a Workflow binding in wrangler.jsonc. Each scheduled run creates a new Workflow instance automatically, so you do not need to define a separate Worker with a scheduled handler just to trigger your Workflow on an interval.

    For example, you can configure hourly, every-15-minute, or weekday schedules on the same Workflow:

    JSONC
    {
      "workflows": [
        {
          "name": "my-scheduled-workflow",
          "binding": "MY_WORKFLOW",
          "class_name": "MyScheduledWorkflow",
          "schedules": ["0 * * * *", "*/15 * * * *", "0 9 * * MON-FRI"],
        },
      ],
    }

    Cron workloads get the same benefits as any other Workflow: built-in retries, multi-step durable execution, and configurable timeouts.

    TypeScript
    import {
      WorkflowEntrypoint,
      WorkflowEvent,
      WorkflowStep,
    } from "cloudflare:workers";
    // Runs automatically on each cron schedule defined for the MY_WORKFLOW binding in wrangler.jsonc.
    export class MyScheduledWorkflow extends WorkflowEntrypoint<Env> {
      async run(event: WorkflowEvent, step: WorkflowStep) {
        const data = await step.do("fetch source data", async () => {
          return await fetchSourceData();
        });
        // If this step fails, only this step is retried with the custom logic below
        await step.do(
          "process and store results",
          {
            retries: { limit: 5, delay: "30 seconds", backoff: "exponential" },
            timeout: "10 minutes",
          },
          async () => {
            await processAndStore(data);
          },
        );
      }
    }

    This makes it easier to build recurring, scheduled jobs such as database backups, invoice generation, report aggregation, and cleanup tasks without wiring up a separate Cron Trigger entrypoint.

    For more information, refer to Trigger Workflows.

  1. The latest release of the Agents SDK adds four new ways to build with @cloudflare/think: on-demand Agent Skills, chat messengers (starting with Telegram), declarative scheduled tasks, and durable reasoning steps inside Workflows. This release also significantly hardens durable chat recovery, so turns reliably ride through deploys, evictions, and stalled model streams in production.

    Agent Skills (experimental)

    Give an agent a catalog of on-demand instructions, resources, and scripts. A skill source adds a catalog to the system prompt, and the model activates a skill only when a task matches — so a large library of capabilities does not bloat every prompt.

    JavaScript
    import { Think, skills } from "@cloudflare/think";
    import bundledSkills from "agents:skills";
    export class SkillsAgent extends Think {
      getSkills() {
        return [
          bundledSkills,
          skills.r2(this.env.SKILLS_BUCKET, { prefix: "skills/" }),
        ];
      }
    }

    The agents:skills import bundles a local ./skills directory through the Agents Vite plugin (one directory per skill, each with a SKILL.md). Skills can also load from R2 or a manifest. When skills are available, Think exposes activate_skill, read_skill_resource, and an optional run_skill_script tool. Skill loading is resilient: a duplicate or failing source is skipped with a warning instead of breaking the agent.

    Agent Skills are experimental, and script execution in particular is early. The API may change in a future release. We would love your feedback — tell us what you are building and what is missing in the Agents repository.

    Messengers

    Connect a Think agent directly to a chat platform. Think owns the webhook route, conversation routing, durable reply fiber, and streamed delivery back to the provider. Telegram ships as the first provider.

    JavaScript
    import { Think } from "@cloudflare/think";
    import {
      defineMessengers,
      ThinkMessengerStateAgent,
    } from "@cloudflare/think/messengers";
    import telegramMessenger from "@cloudflare/think/messengers/telegram";
    export { ThinkMessengerStateAgent };
    export class SupportAgent extends Think {
      getMessengers() {
        return defineMessengers({
          telegram: telegramMessenger({
            token: this.env.TELEGRAM_BOT_TOKEN,
            userName: "support_bot",
            secretToken: this.env.TELEGRAM_WEBHOOK_SECRET_TOKEN,
          }),
        });
      }
    }

    Each Chat SDK thread maps to its own Think sub-agent by default, so group chats and direct messages do not share memory. Multiple bots, custom conversation routing, and custom providers are all supported.

    Scheduled tasks

    Declare recurring, timezone-aware prompts and handlers with a typed domain-specific language (DSL). Think reconciles the declarations on startup and re-arms the next occurrence after each run, backed by durable idempotent submissions.

    JavaScript
    import { Think, defineScheduledTasks } from "@cloudflare/think";
    export class DigestAgent extends Think {
      getScheduledTasks() {
        return defineScheduledTasks({
          weeklyCommitReport: {
            schedule: "every week on monday at 09:00",
            prompt:
              "Compile my GitHub commits for the last week and summarize them.",
          },
          workout: {
            schedule: "every day at 08:00 in Europe/London",
            prompt: "Start my workout.",
          },
        });
      }
    }

    Think Workflows

    Run a model-driven reasoning step inside a Cloudflare Workflow with ThinkWorkflow and step.prompt(), with durable typed structured output, long waits, and approval gates.

    JavaScript
    import { z } from "zod";
    import { ThinkWorkflow } from "@cloudflare/think/workflows";
    const draftSchema = z.object({
      title: z.string(),
      summary: z.string(),
      labels: z.array(z.string()),
    });
    export class TriageWorkflow extends ThinkWorkflow {
      async run(event, step) {
        const draft = await step.prompt("triage-issue", {
          prompt: `Triage issue #${event.payload.issueNumber}`,
          output: draftSchema,
          timeout: "3 days",
        });
        await step.do("apply-labels", async () => {
          await this.agent.applyLabels(draft.labels);
        });
      }
    }

    Production hardening for durable chat recovery

    Durable chat turns have always been designed to survive a mid-turn deploy or Durable Object eviction. This release is a major hardening pass on that machinery for production.

    • Better recovery during deploys. Turns now ride through continuous deploys and evictions without losing completed work or re-running tools that already ran.
    • A live "recovering…" signal. useAgentChat exposes a new isRecovering flag, so a recovering turn shows progress instead of looking frozen. Most UIs render isStreaming || isRecovering as "busy".
    • Stalled streams recover. Set chatStreamStallTimeoutMs to route a hung provider stream into the same recovery path instead of leaving an infinite spinner.
    • Sub-agents re-attach. On parent recovery, an in-flight agentTool() child is re-attached to its result rather than abandoned and re-run, so long-running children no longer lose work under deploys.
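
    The "busy" rule from the list above can be expressed as a small pure function. This is a sketch under assumptions: the ChatFlags type and both function names are hypothetical, and treating recovery as taking precedence when both flags are set is an illustrative choice.

```typescript
// Hypothetical shape holding the two flags described above.
interface ChatFlags {
  isStreaming: boolean;
  isRecovering: boolean;
}

type ChatStatus = "idle" | "streaming" | "recovering";

// True whenever the UI should show a busy indicator.
function isBusy(flags: ChatFlags): boolean {
  return flags.isStreaming || flags.isRecovering;
}

// Assumption: recovery takes precedence when both flags are set.
function chatStatus(flags: ChatFlags): ChatStatus {
  if (flags.isRecovering) return "recovering";
  if (flags.isStreaming) return "streaming";
  return "idle";
}
```

    A UI could disable the input while isBusy returns true and use chatStatus to choose between a streaming indicator and a "recovering…" label.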

    MCP transport improvements

    • Resumable streams: In-flight tool calls over Server-Sent Events (SSE) survive a dropped connection. Clients reconnect with Last-Event-ID and replay anything they missed.
    • Readable server IDs: addMcpServer accepts an optional id, so tools surface as readable keys (for example tool_github_create_pull_request) instead of opaque connection IDs.
    • Better handling of concurrent requests: Overlapping JSON-RPC requests are now correctly correlated to their responses across the HTTP and RPC transports.
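
    The Last-Event-ID mechanism behind resumable streams can be sketched in a few lines. This toy buffer illustrates the general SSE replay pattern and is not the Agents SDK's implementation. ReplayBuffer and formatSse are hypothetical names, and numeric event IDs are an assumption made for simplicity.

```typescript
interface SseEvent {
  id: number;
  data: string;
}

// Keeps sent events so a reconnecting client can catch up.
class ReplayBuffer {
  private readonly events: SseEvent[] = [];
  private nextId = 1;

  append(data: string): SseEvent {
    const event: SseEvent = { id: this.nextId, data };
    this.nextId += 1;
    this.events.push(event);
    return event;
  }

  // Returns the events after the given Last-Event-ID header value (null on first connect).
  replayAfter(lastEventId: string | null): SseEvent[] {
    if (lastEventId === null) return this.events.slice();
    const last = Number(lastEventId);
    if (!isFinite(last)) return this.events.slice(); // unknown ID: resend everything
    return this.events.filter((event) => event.id > last);
  }
}

// Serializes one event in the SSE wire format, one "data:" line per line of payload.
function formatSse(event: SseEvent): string {
  const dataLines = event.data.split("\n").map((line) => "data: " + line);
  return "id: " + event.id + "\n" + dataLines.join("\n") + "\n\n";
}
```

    On reconnect, a server would read the Last-Event-ID request header, call replayAfter with it, and write the formatted events before resuming the live stream.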

    Other improvements

    • Compaction — A Session's tokenCounter now also drives the compaction boundary decision ("what to compress"), not just the fire/no-fire trigger.
    • @cloudflare/worker-bundler — Adds a virtualModules option to createWorker to provide in-memory module source during bundling.
    • Client-tool continuations — Parallel tool results now coalesce into a single continuation, immediate resume requests attach to the pending continuation, and server-side needsApproval continuations resume reliably after approval.

    Upgrade

    To update to the latest version:

    npm i agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest

    Refer to the Agents API reference and Chat agents documentation for more information.

  1. You can now share local dev sessions through Cloudflare Tunnel and get a public URL when using either Wrangler or the Cloudflare Vite plugin. This is useful when you need to share a preview, test a webhook, or access your app from another device.

    Vite local dev tunnel demo

    To start a tunnel, press t in Wrangler or t + Enter in Vite while your dev server is running. For details on setting up a named tunnel, refer to Share a local dev server.

  1. You can now view the size of your Hyperdrive database connection pools, giving you the ability to self-diagnose connection issues. Using the Cloudflare dashboard or the hyperdrivePoolSizesAdaptiveGroups dataset in the GraphQL Analytics API, you can see waitingClients, currentPoolSize, availablePoolSlots, and maxPoolSize for each of your configurations.

    A new Pool connections chart has been added to the Metrics tab of each Hyperdrive configuration in the Cloudflare dashboard. You can use the location selector to drill down into specific locations hosting your connection pool by airport code.

    Hyperdrive pool size metrics chart

    The chart shows:

    • Waiting clients: Client requests waiting for an available connection.
    • Open connections: Active connections to your database.
    • Pool size maximum: Your configured origin connection limit.

    Connection contention appears as a spike in waiting clients, or when open connections consistently approach the pool size maximum. If your open connections regularly approach this limit, consider contacting Cloudflare to increase your Hyperdrive connection limit.

    Pool size metrics

    The hyperdrivePoolSizesAdaptiveGroups dataset in the GraphQL Analytics API exposes the following key connection pool metrics for each Hyperdrive configuration:

    Under avg:

    • currentPoolSize — Average number of connections currently open in the pool.
    • availablePoolSlots — Average number of pool connections available for checkout.
    • waitingClients — Average number of clients waiting for a connection from the pool.

    Under max:

    • maxPoolSize — Configured maximum size of the connection pool.
    • currentPoolSize — Peak number of connections open in the pool.
    • waitingClients — Peak number of clients waiting for a connection from the pool.
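
    The contention signals described earlier can be checked programmatically against these metrics. The following is a minimal sketch: the PoolSample shape, the classifyPool name, and the 0.9 "near limit" ratio are assumptions for illustration, and fetching samples from the GraphQL Analytics API is not shown.

```typescript
// One data point built from the metrics listed above.
interface PoolSample {
  waitingClients: number;
  currentPoolSize: number;
  maxPoolSize: number;
}

type PoolHealth = "ok" | "near-limit" | "contended";

// Assumption: 0.9 of the maximum counts as "approaching" the limit.
function classifyPool(samples: PoolSample[], nearLimitRatio = 0.9): PoolHealth {
  const usable = samples.filter((sample) => sample.maxPoolSize > 0);
  if (usable.length === 0) return "ok";
  // Any waiting clients means requests queued for a connection.
  if (usable.some((sample) => sample.waitingClients > 0)) return "contended";
  // "Consistently" is modeled as every sample being near the maximum.
  const alwaysNearLimit = usable.every(
    (sample) => sample.currentPoolSize / sample.maxPoolSize >= nearLimitRatio,
  );
  return alwaysNearLimit ? "near-limit" : "ok";
}
```

    A "near-limit" or "contended" result over a sustained time range is the situation in which the entry above suggests asking Cloudflare to raise your connection limit.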

    For more information, refer to Metrics and analytics and Connection pooling.

  1. In your Worker's dashboard, there is now a dedicated Domains tab where you can purchase a new domain through Cloudflare Registrar and have it automatically connected, add an existing domain, and manage all of your Worker's routing in one place.

    The new Domains tab in the Workers dashboard

    You can also enable or disable your workers.dev subdomain and Preview URLs, put them behind Cloudflare Access to require sign-in, and jump directly to analytics or domain overview for any connected domain.

    To get started, go to Workers & Pages, select a Worker, and open the Domains tab.

    Go to Workers & Pages

  1. The latest release of the Agents SDK brings more reliable chat recovery, fixes Agent state synchronization during reconnects, adds durable submissions for Think, exposes routing retry configuration, and adds connection control for Voice agents.

    Chat recovery improvements

    @cloudflare/ai-chat now keeps server turns running when a browser or client stream is interrupted. This is useful for long-running AI responses where users refresh the page, close a tab, or temporarily lose connection. Calling stop() still cancels the server turn.

    Set cancelOnClientAbort: true if browser or client aborts should also cancel the server turn:

    JavaScript
    const chat = useAgentChat({
      agent: "assistant",
      name: "user-123",
      cancelOnClientAbort: true,
    });

    Notable bug fixes:

    • Chat stream resume negotiation no longer throws when replay races with a closed WebSocket connection.
    • Recovered chat continuations no longer leave useAgentChat stuck in a streaming state when the original socket disconnects before a terminal response.
    • Approval auto-continuation preserves reasoning parts and persists continuation reasoning in the final message.
    • isServerStreaming now resets correctly when a resumed stream moves from the fallback observer path to a transport-owned stream.

    Agent state and routing fixes

    agents@0.12.4 prevents duplicate initial state frames during WebSocket connection setup. This avoids stale initial state messages overwriting state updates already sent by the client.

    Agent recovery is also more reliable when tool calls span a Durable Object restart. Recovery now defers user finish hooks until after agent startup and isolates hook failures, so one failed hook does not block other recovered runs from finalizing.

    getAgentByName() now supports routingRetry for transient Durable Object routing failures:

    JavaScript
    import { getAgentByName } from "agents";
    const agent = await getAgentByName(env.AssistantAgent, "user-123", {
      routingRetry: {
        maxAttempts: 3,
      },
    });

    Durable Think submissions

    @cloudflare/think now supports durable programmatic submissions. submitMessages() provides durable acceptance, idempotent retries, status inspection, cancellation, and cleanup for server-driven turns that should continue after the caller returns.

    Think.chat() RPC turns now run inside chat recovery fibers and persist their stream chunks. Interrupted sub-agent turns can recover partial output instead of starting over.

    ChatOptions.tools has been removed from the TypeScript API. Define durable tools on the child agent or use agent tools for orchestration. Runtime options.tools values passed by legacy callers are ignored with a warning.

    Think message pruning behavior change

    @cloudflare/think no longer applies pruneMessages({ toolCalls: "before-last-2-messages" }) to model context by default. The previous default could strip client-side tool results from longer multi-turn flows.

    truncateOlderMessages still runs as before, so context cost remains bounded. Subclasses that relied on the old aggressive pruning can opt back in from beforeTurn:

    JavaScript
    import { Think } from "@cloudflare/think";
    import { pruneMessages } from "ai";
    export class MyAgent extends Think {
      beforeTurn(ctx) {
        return {
          messages: pruneMessages({
            messages: ctx.messages,
            toolCalls: "before-last-2-messages",
          }),
        };
      }
    }

    Voice agent connection control

    @cloudflare/voice adds an enabled option to useVoiceAgent. React apps can now delay creating and connecting a VoiceClient until prerequisites such as capability tokens are ready.

    JavaScript
    const voice = useVoiceAgent({
      agent: "MyVoiceAgent",
      enabled: Boolean(token),
    });

    This release also fixes Workers AI speech-to-text session edge cases and withVoice text streaming from AI SDK textStream responses.

    Other improvements

    • Streamable HTTP routing — Server-to-client requests now route through the originating POST stream when no standalone SSE stream is available.
    • Structured tool output — Tool output shapes are preserved when truncating older messages or oversized persisted rows.
    • Non-chat Think tool steps — Think agent-tool children can complete without emitting assistant text and can return structured output through getAgentToolOutput.
    • Sub-agent schedules — Stale sub-agent schedule rows are pruned when their owning facet registry entry no longer exists.
    • @cloudflare/codemode — Adds a browser-safe export with an iframe sandbox executor and resolves OpenAPI specs inside the sandbox to avoid Worker Loader RPC size limits.

    Upgrade

    To update to the latest version:

    Terminal window
    npm i agents@latest @cloudflare/ai-chat@latest @cloudflare/think@latest @cloudflare/voice@latest

    Refer to the Agents API reference and Chat agents documentation for more information.

  1. Multiple security vulnerabilities were disclosed by the React team and Vercel affecting React Server Components and Next.js. These include denial of service, middleware and proxy bypass, server-side request forgery, cross-site scripting, and cache poisoning issues across a range of severity levels.

    We strongly recommend updating your application and its dependencies immediately. Patched versions are available for React (react-server-dom-webpack, react-server-dom-parcel, and react-server-dom-turbopack 19.0.6, 19.1.7, and 19.2.6) and Next.js (15.5.16 and 16.2.5).
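
    As a rough way to check an installed version against the patched releases listed above, the following sketch compares a version with the patched release on the same minor line. The helper name patchStatus and the "unknown" result are assumptions made for illustration; they are not part of any Cloudflare, React, or Next.js tooling. The "react" key stands in for the react-server-dom-* packages.

```javascript
// Patched releases named in this entry, keyed by "major.minor" release line.
// The value is the minimum patch number that contains the fixes.
const PATCHED = {
  react: { "19.0": 6, "19.1": 7, "19.2": 6 },
  next: { "15.5": 16, "16.2": 5 },
};

// Hypothetical helper: returns "patched", "vulnerable", or "unknown".
// "unknown" covers prerelease versions and release lines that this entry
// does not mention; check the upstream advisories for those.
function patchStatus(pkg, version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  const lines = PATCHED[pkg];
  if (!match || !lines) return "unknown";
  const minPatch = lines[`${match[1]}.${match[2]}`];
  if (minPatch === undefined) return "unknown";
  return Number(match[3]) >= minPatch ? "patched" : "vulnerable";
}

console.log(patchStatus("next", "15.5.12"));
console.log(patchStatus("react", "19.2.6"));
```

    The sketch deliberately reports "unknown" instead of guessing for any release line not listed here, so that a missing entry is never mistaken for a safe version.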

    WAF protections

    Cloudflare WAF rules deployed in response to prior React Server Component CVEs (CVE-2025-55184 and CVE-2026-23864) already provide coverage for the newly disclosed denial-of-service vulnerabilities. These rules are enabled by default with a Block action for all customers using the Cloudflare Managed Ruleset, including Free plan customers using the Free Managed Ruleset.

    | Ruleset | Rule description | Rule ID | Default action |
    | --- | --- | --- | --- |
    | Cloudflare Managed Ruleset | React - DoS - CVE-2025-55184 | 2694f1610c0b471393b21aef102ec699 | Block |
    | Cloudflare Managed Ruleset | React - DoS - CVE-2026-23864 | aaede80b4d414dc89c443cea61680354 | Block |

    The existing rules detect the underlying attack patterns generically. As a result, they apply to the new CVE-2026-23870 denial-of-service vulnerability in Server Components and the corresponding Next.js advisory GHSA-8h8q-6873-q5fj.

    Cloudflare is investigating whether WAF rules can be safely and effectively deployed for three of the high-severity advisories: CVE-2026-23870 / GHSA-8h8q-6873-q5fj, GHSA-267c-6grr-h53f, and GHSA-mg66-mrh9-m8jx. These vulnerabilities were shared with Cloudflare with minimal advance notice, so this investigation is still in progress. If a managed WAF rule can mitigate these advisories without potentially breaking application behavior, Cloudflare will add it and announce it through the WAF changelog.

    Several of the disclosed vulnerabilities cannot be blocked by the WAF. We strongly recommend updating your applications rather than relying solely on WAF mitigations.

    Customers on Pro, Business, or Enterprise plans should ensure that Managed Rules are enabled.

    Next.js adapters

    Vinext: Vinext is a Vite plugin that reimplements the Next.js API surface. Vinext's latest release is not vulnerable to any of the disclosed CVEs. Vinext's architecture differs from stock Next.js in ways that sidestep the affected code paths. For example, it does not implement the PPR resume protocol, does not expose Pages Router data-route endpoints, and strips internal headers such as x-nextjs-data at request boundaries. As an extra layer of defense, we added a React 19.2.6 or later requirement when running vinext init (PR #1118, PR #1112) to prevent accidentally running a vulnerable version of React with Vinext.

    OpenNext on Cloudflare: OpenNext is an adapter that lets you deploy Next.js apps to the Cloudflare Workers platform. OpenNext itself is not directly vulnerable to the React denial-of-service CVE, but users must update the Next.js version in their application. The OpenNext team has updated the adapter to further harden against these vectors and released a new version of the Cloudflare adapter. Test fixtures and examples have been updated to use patched versions (PR #1255).

    Summary of disclosed vulnerabilities

    | Advisory | Severity | Issue | WAF status |
    | --- | --- | --- | --- |
    | CVE-2026-23870 / GHSA-8h8q-6873-q5fj | High | Denial of service in Server Components | WAF rules in place: 2694f1610c0b471393b21aef102ec699, aaede80b4d414dc89c443cea61680354. Cloudflare is investigating additional managed WAF coverage |
    | GHSA-267c-6grr-h53f | High | Middleware bypass via segment-prefetch routes | Cloudflare is investigating if this can be safely and effectively mitigated by a managed WAF rule |
    | GHSA-mg66-mrh9-m8jx | High | Denial of service via connection exhaustion in Cache Components | Cloudflare is investigating if this can be safely and effectively mitigated by a managed WAF rule |
    | GHSA-492v-c6pp-mqqv | High | Middleware bypass via dynamic route parameter injection | Not possible to safely enable a managed WAF rule without potentially breaking application behavior |
    | GHSA-c4j6-fc7j-m34r | High | SSRF via WebSocket upgrades | Not possible to safely enable a managed WAF rule without potentially breaking application behavior |
    | GHSA-36qx-fr4f-26g5 | High | Middleware bypass in Pages Router i18n | Custom WAF rule possible; global managed rule could potentially break application behavior |
    | GHSA-ffhc-5mcf-pf4q | Moderate | XSS via CSP nonces | Custom WAF rule possible; global managed rule could potentially break application behavior |
    | GHSA-gx5p-jg67-6x7h | Moderate | XSS in beforeInteractive scripts | Not possible to safely enable a managed WAF rule without potentially breaking application behavior |
    | GHSA-h64f-5h5j-jqjh | Moderate | Denial of service in Image Optimization API | Custom WAF rule possible; global managed rule could potentially break application behavior |
    | GHSA-wfc6-r584-vfw7 | Moderate | Cache poisoning in RSC responses | Custom WAF rule possible; global managed rule could potentially break application behavior |
    | GHSA-vfv6-92ff-j949 | Low | Cache poisoning via RSC cache-busting collisions | Not possible to safely enable a managed WAF rule without potentially breaking application behavior |
    | GHSA-3g8h-86w9-wvmq | Low | Middleware redirect cache poisoning | Custom WAF rule possible; global managed rule could potentially break application behavior |
  1. You can now get a single unified trace across Worker-to-Worker subrequests, with trace context propagating automatically. Previously, automatic tracing produced disconnected traces when a Worker called another Worker through a service binding or Durable Object.

    Unified trace showing nested spans across a Durable Object subrequest and a service binding call

    This means you can:

    • Follow a request through your entire Worker architecture in one trace view
    • See service binding and Durable Object calls as nested child spans instead of separate traces
    • Debug cross-Worker request flows in the Cloudflare dashboard or in an external observability platform via OpenTelemetry

    Tracing must be enabled in your Wrangler configuration for traces to be recorded. Check out Workers tracing to get started.

    Up next, we are working on external trace context propagation using W3C Trace Context standards, which will allow traces from your Workers to link with traces from services outside of Cloudflare.

  1. You can now use @cloudflare/dynamic-workflows to run a Workflow inside a Dynamic Worker, ensuring durable execution for code that is loaded at runtime.

    The Worker Loader loads Dynamic Workers on demand, which previously made durability challenging. Even within a Dynamic Worker, a Workflow might sleep for hours or days between steps, and by the time it resumes, the original Dynamic Worker code would no longer be in memory.

    The library solves this by tagging each Workflow instance with metadata that identifies which Dynamic Worker to load — for example, a tenant ID — then reloading the matching Dynamic Worker through the Worker Loader whenever a Workflow awakens.

    Because Dynamic Workers are created on-demand, you do not have to register each Workflow up front or manage them individually. Load the Workflow code in the Dynamic Worker when it is needed, and the Workflows engine handles persistence and retries behind the scenes. Your Workflow code itself is unaffected by the routing and behaves as normal.

    This unlocks patterns where the Workflow code itself is dynamic. For example, this is useful with:

    • SaaS platforms where each tenant defines their own automation, such as onboarding sequences, approval chains, or billing retry logic.
    • AI agent frameworks where agents generate and execute multi-step plans at runtime, surviving restarts and waiting for human approval between tool calls.
    • Multi-tenant job systems where each customer submits their own processing logic and every step persists progress and retries on failure.

    TypeScript
    import {
      createDynamicWorkflowEntrypoint,
      DynamicWorkflowBinding,
      wrapWorkflowBinding,
      type WorkflowRunner,
    } from "@cloudflare/dynamic-workflows";
    export { DynamicWorkflowBinding };
    interface Env {
      WORKFLOWS: Workflow;
      LOADER: WorkerLoader;
    }
    function loadTenant(env: Env, tenantId: string) {
      return env.LOADER.get(tenantId, async () => ({
        compatibilityDate: "2026-01-01",
        mainModule: "index.js",
        // fetchTenantCode is an application-defined helper (not shown here)
        // that returns the tenant's module source.
        modules: { "index.js": await fetchTenantCode(tenantId) },
        // The Dynamic Worker uses this exactly like a real Workflow binding;
        // every create() is tagged with { tenantId } automatically.
        env: { WORKFLOWS: wrapWorkflowBinding({ tenantId }) },
      }));
    }
    // The entrypoint name must match `class_name` in the workflows binding of your Wrangler config file.
    export const DynamicWorkflow = createDynamicWorkflowEntrypoint<Env>(
      async ({ env, metadata }) => {
        const stub = loadTenant(env, metadata.tenantId as string);
        return stub.getEntrypoint("TenantWorkflow") as unknown as WorkflowRunner;
      },
    );
    export default {
      fetch(request: Request, env: Env) {
        const tenantId = request.headers.get("x-tenant-id")!;
        return loadTenant(env, tenantId).getEntrypoint().fetch(request);
      },
    };

    For a full walkthrough, refer to the Dynamic Workflows guide.

  1. Pay-as-you-go customers can now monitor usage-based costs and configure spend alerts through two new features: the Billable Usage dashboard and Budget alerts.

    Billable Usage dashboard

    The Billable Usage dashboard provides daily visibility into usage-based costs across your Cloudflare account. The data comes from the same system that generates your monthly invoice, so the figures match your bill.

    The dashboard displays:

    • A bar chart showing daily usage charges for your billing period
    • A sortable table breaking down usage by product, including total usage, billable usage, and cumulative costs
    • An option to view previous billing periods

    Usage data aligns to your billing cycle, not the calendar month. The total usage cost shown at the end of a completed billing period matches the usage overage charges on your corresponding invoice.

    To access the dashboard, go to Manage Account > Billing > Billable Usage.

    Screenshot of the Billable Usage dashboard in the Cloudflare dashboard

    Budget alerts

    Budget alerts allow you to set dollar-based thresholds for your account-level usage spend. You receive an email notification when your projected monthly spend reaches your configured threshold, giving you proactive visibility into your bill before month-end.

    To configure a budget alert:

    1. Go to Manage Account > Billing > Billable Usage.
    2. Select Set Budget Alert.
    3. Enter a budget threshold amount greater than $0.
    4. Select Create.

    Alternatively, configure alerts via Notifications > Add > Budget Alert.

    Create Budget Alert modal in the Cloudflare dashboard

    You can create multiple budget alerts at different dollar amounts. The notifications system automatically deduplicates alerts if multiple thresholds trigger at the same time. Budget alerts are calculated daily based on your usage trends and fire once per billing cycle when your projected spend first crosses your threshold.
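
    To make the "fire once when projected spend first crosses the threshold" behavior concrete, here is a minimal sketch. It assumes a straight-line projection from spend so far, which is an illustration only: this entry does not describe how Cloudflare computes projected spend, and the function names projectCycleSpend and newlyCrossedThresholds are hypothetical.

```javascript
// ASSUMPTION: linear extrapolation of spend to the end of the billing cycle.
// The real projection is "based on your usage trends" and may differ.
function projectCycleSpend(spendToDate, daysElapsed, daysInCycle) {
  if (daysElapsed <= 0) return 0;
  return (spendToDate / daysElapsed) * daysInCycle;
}

// Returns the thresholds that the projection has reached and that have not
// already fired in this billing cycle (each threshold fires at most once).
function newlyCrossedThresholds(thresholds, projected, alreadyFired) {
  return thresholds.filter((t) => projected >= t && !alreadyFired.has(t));
}

// Example: $40 spent after 10 days of a 30-day cycle, with alerts at
// $50, $100, and $200, where the $50 alert fired earlier in the cycle.
const projected = projectCycleSpend(40, 10, 30);
const toFire = newlyCrossedThresholds([50, 100, 200], projected, new Set([50]));
console.log(projected, toFire);
```

    In this example the projection is $120, so only the $100 alert is new: the $50 alert has already fired this cycle and the $200 threshold has not been reached.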

    Both features are available to Pay-as-you-go accounts with usage-based products (Workers, R2, Images, etc.). Enterprise contract accounts are not supported.

    For more information, refer to the Usage based billing documentation.

  1. Binary frames received on a WebSocket are now delivered to the message event as Blob objects by default. This matches the WebSocket specification and standard browser behavior. Previously, binary frames were always delivered as ArrayBuffer. The binaryType property on WebSocket controls the delivery type on a per-WebSocket basis.

    This change has been active for Workers with compatibility dates on or after 2026-03-17, via the websocket_standard_binary_type compatibility flag. We should have documented this change when it shipped but didn't. We're sorry for the trouble that caused. If your Worker handles binary WebSocket messages and assumes event.data is an ArrayBuffer, the frames will arrive as Blob instead, and a naive instanceof ArrayBuffer check will silently drop every frame.

    To opt back into ArrayBuffer delivery, assign binaryType before calling accept(). This works regardless of the compatibility flag:

    JavaScript
    const resp = await fetch("https://example.com", {
      headers: { Upgrade: "websocket" },
    });
    const ws = resp.webSocket;
    // Opt back into ArrayBuffer delivery for this WebSocket.
    ws.binaryType = "arraybuffer";
    ws.accept();
    ws.addEventListener("message", (event) => {
      if (typeof event.data === "string") {
        // Text frame.
      } else {
        // event.data is an ArrayBuffer because we set binaryType above.
      }
    });

    If you are not ready to migrate and want to keep ArrayBuffer as the default for all WebSockets in your Worker, add the no_websocket_standard_binary_type flag to your Wrangler configuration file.
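
    If you would rather migrate the handler than opt out, one approach is to accept either delivery type. The sketch below normalizes a message payload to a Uint8Array; frameToBytes is a hypothetical helper name, and encoding text frames as UTF-8 bytes is a choice made for this example.

```javascript
// Hypothetical helper: normalizes any WebSocket message payload to bytes.
async function frameToBytes(data) {
  // Text frame: encode as UTF-8 (an assumption made for this sketch).
  if (typeof data === "string") return new TextEncoder().encode(data);
  // Binary frame delivered with binaryType = "arraybuffer".
  if (data instanceof ArrayBuffer) return new Uint8Array(data);
  // Binary frame delivered as a Blob (the new default).
  if (data instanceof Blob) return new Uint8Array(await data.arrayBuffer());
  throw new TypeError("Unexpected WebSocket message payload type");
}

// Usage inside a Worker, given an accepted WebSocket `ws`:
// ws.addEventListener("message", async (event) => {
//   const bytes = await frameToBytes(event.data);
//   // ...process bytes...
// });
```

    Reading a Blob is asynchronous, so a handler written this way may finish processing messages out of order. If ordering matters, queue the messages or set binaryType to "arraybuffer" as shown earlier.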

    This change has no effect on the Durable Object hibernatable WebSocket webSocketMessage handler, which continues to receive binary data as ArrayBuffer.

    For more information, refer to WebSockets binary messages.

  1. Workflows limits have been raised to the following:

    | Limit | Previous | New |
    | --- | --- | --- |
    | Concurrent instances (running in parallel) | 10,000 | 50,000 |
    | Instance creation rate (per account) | 100/second per account | 300/second per account, 100/second per workflow |
    | Queued instances per Workflow [1] | 1 million | 2 million |

    These increases apply to all users on the Workers Paid plan. Refer to the Workflows limits documentation for more details.

    Footnotes

    1. Queued instances are instances that have been created or awoken and are waiting for a concurrency slot.

  1. Local Explorer is a browser-based interface and REST API for viewing and editing local resource data during development. It removes the need to write throwaway scripts or dig through .wrangler/state to understand what data your Worker has stored locally.

    Local Explorer is available in Wrangler 4.82.1+ and the Cloudflare Vite plugin 1.32.0+. Start a local development session and press e in your terminal, or navigate to /cdn-cgi/explorer on your local dev server.

    Supported resources

    Local Explorer supports five resource types and works across multiple Workers running locally:

    • KV — Browse keys, view values and metadata, create, update, and delete key-value pairs.
    • R2 — List objects, view metadata, upload files, and delete objects. Supports directory views and multi-select.
    • D1 — Browse tables and rows, run arbitrary SQL queries, and edit schemas in a full data studio.
    • Durable Objects (SQLite storage) — Browse individual object SQLite tables, run SQL queries, and edit schemas.
    • Workflows — List instances, view status and step history, trigger new runs, and pause, resume, restart, or terminate instances.

    OpenAPI-powered REST API

    Local Explorer exposes a REST API at /cdn-cgi/explorer/api that provides programmatic access to the same operations available in the browser. The root endpoint returns an OpenAPI specification describing all available endpoints, parameters, and response formats.

    Terminal window
    curl http://localhost:8787/cdn-cgi/explorer/api

    Point an AI coding agent at /cdn-cgi/explorer/api and it can discover and interact with your local resources without manual setup. This enables iterative development loops where an agent can populate test data in KV or D1, inspect Durable Object state, trigger Workflow runs, or upload files to R2.
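
    As an illustration of discovering the API programmatically, the sketch below fetches the specification and prints each operation it describes. It assumes a dev server on localhost:8787 (as in the curl example) and that the root endpoint responds with the OpenAPI document as JSON; the function names are hypothetical.

```javascript
// Lists "METHOD path" pairs from the `paths` object of an OpenAPI document.
function listOperations(spec) {
  const methods = new Set(["get", "put", "post", "delete", "patch", "head", "options"]);
  const operations = [];
  for (const [path, item] of Object.entries(spec.paths ?? {})) {
    for (const method of Object.keys(item)) {
      // Skip non-operation keys such as "parameters" or "summary".
      if (methods.has(method)) operations.push(`${method.toUpperCase()} ${path}`);
    }
  }
  return operations;
}

// ASSUMPTION: the specification is served as JSON at the API root.
async function printLocalExplorerOperations(baseUrl = "http://localhost:8787") {
  const response = await fetch(`${baseUrl}/cdn-cgi/explorer/api`);
  const spec = await response.json();
  for (const operation of listOperations(spec)) console.log(operation);
}

// With a local dev session running:
// printLocalExplorerOperations();
```

    Keeping listOperations separate from the network call makes it easy to run against a saved copy of the specification as well.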

    For more details, refer to the Local Explorer documentation.

  1. The simultaneous open connections limit has been relaxed. Previously, each Worker invocation was limited to six open connections at a time for the entire lifetime of each connection, including while reading the response body. Now, a connection is freed as soon as response headers arrive, so the six-connection limit only constrains how many connections can be in the initial "waiting for headers" phase simultaneously.

    Before: New connections are blocked until an earlier connection fully completes

    A 7th fetch is queued until an earlier connection fully completes, including reading its entire response body

    After: New connections can start as soon as response headers arrive

    A 7th fetch starts as soon as any earlier connection receives its response headers

    This means Workers can now have many more connections open at the same time without queueing, as long as no more than six are waiting for their initial response. This eliminates the Response closed due to connection limit exception that could previously occur when the runtime canceled stalled connections to prevent deadlocks.
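
    The effect can be seen in a toy scheduling model. The sketch below is not the runtime's implementation: it assumes every connection spends a fixed time waiting for headers and a fixed time reading the body, and the timings and the simulate function are hypothetical.

```javascript
// Toy model: `limit` slots, and each connection holds a slot either until its
// headers arrive ("headers") or until its body is fully read ("body").
function simulate({ connections, limit, headersMs, bodyMs, releaseOn }) {
  const holdMs = releaseOn === "headers" ? headersMs : headersMs + bodyMs;
  const slotFreeAt = new Array(limit).fill(0); // time at which each slot frees up
  const intervals = [];
  for (let i = 0; i < connections; i++) {
    // Start each connection on the slot that becomes free earliest.
    let slot = 0;
    for (let j = 1; j < limit; j++) {
      if (slotFreeAt[j] < slotFreeAt[slot]) slot = j;
    }
    const start = slotFreeAt[slot];
    slotFreeAt[slot] = start + holdMs;
    intervals.push({ start, end: start + headersMs + bodyMs });
  }
  // The number of open connections only rises when one starts, so it is
  // enough to count the open connections at each start time.
  let peakOpen = 0;
  for (const a of intervals) {
    const open = intervals.filter((b) => b.start <= a.start && a.start < b.end).length;
    peakOpen = Math.max(peakOpen, open);
  }
  const totalMs = Math.max(...intervals.map((c) => c.end));
  return { peakOpen, totalMs };
}

const workload = { connections: 12, limit: 6, headersMs: 10, bodyMs: 50 };
console.log(simulate({ ...workload, releaseOn: "body" }));    // previous behavior
console.log(simulate({ ...workload, releaseOn: "headers" })); // new behavior
```

    With these hypothetical timings, holding slots through the body caps the model at 6 open connections and finishes in 120 ms, while releasing on headers lets all 12 be open at once and finishes in 70 ms.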

    Previously, the runtime used a deadlock avoidance algorithm that watched each open connection for I/O activity. If all six connections appeared idle — even momentarily — the runtime would cancel the least-recently-used connection to make room for new requests. In practice, this heuristic was fragile. For example, when a response used Content-Encoding: gzip, the runtime's internal decompression created brief gaps between read and write operations. During these gaps, the connection appeared stalled despite being actively read by the Worker. If multiple connections hit these gaps at the same time, the runtime could spuriously cancel a connection that was working correctly. By only counting connections during the waiting-for-headers phase — where the runtime is fully in control and there is no ambiguity about whether the connection is active — this class of bug is eliminated entirely.

    Before: Connections could be canceled during brief internal pauses

    A connection with gaps from gzip decompression appears idle and is canceled by the runtime

    After: Connections complete normally regardless of internal pauses

    The same connection completes normally because the body phase is no longer counted against the limit