Changelog

New updates and improvements at Cloudflare.

Developer platform
  1. The latest release of the Agents SDK makes it easier to run long work in the background, drive turns through one entry point, and keep chat agents working through deploys, evictions, and reconnects.

    This release adds first-class detached (background) sub-agent runs with live progress and durable milestones, a single runTurn turn-admission entry point, and a large round of recovery and reliability fixes that continue converging @cloudflare/think and @cloudflare/ai-chat onto one model.

    Background sub-agents with progress and milestones

    runAgentTool can now dispatch a sub-agent without blocking the calling turn. A detached run returns a handle immediately and is owned by a durable, eviction-surviving backbone instead of being abandoned when the dispatching turn ends.

    JavaScript
    class OrdersAgent extends Think {
      async startImport(input) {
        // Fire-and-forget, or wire a durable completion callback
        // (by method name, like schedule()):
        await this.runAgentTool(ImportAgent, {
          input,
          detached: { onFinish: "onImportDone", maxBudgetMs: 60 * 60 * 1000 },
        });
      }
      // result.status: "completed" | "error" | "aborted" | "interrupted"
      async onImportDone(run, result) {}
    }

    Highlights:

    • Durable, exactly-once-on-the-happy-path completion via a warm fast path plus a self-scheduling reconcile backbone that survives eviction and deploys.
    • Bounded. An absolute maxBudgetMs ceiling (default 24h) and cancelAgentTool(runId) keep abandoned runs from holding a concurrency slot forever.
    • detached: { notify: true } lets a finished background run inject a message back into the chat so the model reacts to the result — no hand-wired onFinish needed.

    Sub-agents can also report mid-run progress that rides their own turn stream back to the parent's connected clients:

    JavaScript
    // Inside the child sub-agent:
    await this.reportProgress({
      fraction: 0.6,
      phase: "deploying",
      message: "Generating menu page…",
    });

    Progress surfaces on AgentToolRunState.progress via useAgentToolEvents, so a background-runs tray can render a live bar without drilling in, and the latest snapshot is persisted for inspection after eviction. Naming a milestone promotes a signal to a durable, replayable row, and detached: { onMilestones } can surface a milestone as a synthetic chat message ("narrate" for a cheap status line, or "react" to drive a model turn).

    One entry point for turns: runTurn

    @cloudflare/think adds a public runTurn(options) facade that unifies turn admission behind a single mode:

    JavaScript
    await this.runTurn({ mode: "wait", messages }); // saveMessages / continueLastTurn
    await this.runTurn({ mode: "submit", messages }); // durable submitMessages
    await this.runTurn({ mode: "stream", messages }); // chat()

    stream mode accepts array and function inputs to match wait mode, and all entry points now route through a shared internal admission path that throws a clear error on nested blocking admissions that previously could deadlock.

    Recovery and reliability

    A large part of this release continues hardening recovery and converging @cloudflare/think and @cloudflare/ai-chat onto one model:

    • Stream stall watchdog. AIChatAgent can detect and recover from a hung model/transport stream via the opt-in chatStreamStallTimeoutMs watchdog. With chatRecovery enabled the stall routes into the same bounded-recovery machinery a deploy or eviction uses; otherwise it surfaces as a terminal stream error so the spinner clears.
    • Interrupted tool-call repair. AIChatAgent now repairs a transcript with a dead server-tool call before re-entering inference (parity with @cloudflare/think), so a recovered turn no longer fails with AI_MissingToolResultsError. An overridable repairInterruptedToolPart(part) hook lets apps customize the repaired shape.
    • Stuck status after reconnect. Fixed AI SDK status getting stuck when a reconnect races a turn that has been accepted but has not started streaming yet, so the UI now renders the in-flight turn instead of settling on ready.
    • Live "recovering…" on connect. AIChatAgent now replays the recovering status to a client that connects mid-recovery, so useAgentChat's isRecovering reflects in-progress recovery immediately instead of appearing frozen.
    • Terminal connection failures. The client stops reconnecting on terminal WebSocket close events and exposes them via connectionError / onConnectionError on AgentClient, useAgent, and useAgentChat.
    • Agent-tool child recovery. A healthy long-running sub-agent run is no longer abandoned as interrupted after a deploy (both @cloudflare/think and AIChatAgent).
    • Workflows from sub-agent facets. Agent Workflows can now start from sub-agent facets, with callbacks and Workflow RPC routed back to the originating facet.
    • Plus forward-progress crediting convergence, broadcast-first give-up ordering, an event-driven auto-continuation barrier, and structured row-size compaction in AIChatAgent.

    Other improvements

    • Shared chat React core. A new agents/chat/react entry exposes useAgentChat, transport helpers, and shared wire types, with syncMessagesToServer for server-authoritative transcript storage. @cloudflare/think/react and @cloudflare/ai-chat/react are now thin wrappers over it.
    • Optional ai peer. The root agents and @cloudflare/codemode runtimes no longer reference AI SDK types, so they bundle without ai / zod installed; AI-specific entry points still require the peer when imported. just-bash likewise moves to an optional peer used only by the skills bash runner.
    • Code Mode. The default DynamicWorkerExecutor timeout increases from 30s to 60s, executions now dispose the dynamically-loaded Worker and its RPC stub after each run (fixing a flaky isolate-shutdown assertion), connector imports are cleaned up, and the outer MCP tool-call context is passed to openApiMcpServer request callbacks.
    • Voice. Voice turns now support AI SDK fullStream responses (and warn when textStream is used).
    • MCP. McpAgent server-to-client requests can now be sent from callbacks that do not inherit the agent's async context, including callbacks reached through Worker Loader RPC.
    • Experimental: server actions and channels. This release lays groundwork for guarded server actions (action() / getActions() with a durable replay ledger and approvals) and a unified channels surface (configureChannels(), deliverNotice()). Both are experimental and their APIs may change, so we don't recommend depending on them yet.

    Upgrade

    To update to the latest version:

    npm i agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest @cloudflare/codemode@latest @cloudflare/voice@latest

    Refer to the Think documentation, Code Mode documentation, and Agents documentation for more information.

  1. Durable Objects now supports a us jurisdiction, letting you create Durable Objects that only run and store data within the United States. Use the us jurisdiction when you need to keep a Durable Object's compute and storage inside the United States to meet data residency requirements.

    Create a namespace restricted to the us jurisdiction the same way as any other jurisdiction:

    JavaScript
    // Worker
    export default {
      async fetch(request, env) {
        const usSubnamespace = env.MY_DURABLE_OBJECT.jurisdiction("us");
        const stub = usSubnamespace.getByName("general");
        return stub.fetch(request);
      },
    };

    Workers may still access Durable Objects constrained to the us jurisdiction from anywhere in the world. The jurisdiction constraint only controls where the Durable Object itself runs and persists data.

    For the full list of supported jurisdictions, refer to Data location — Restrict Durable Objects to a jurisdiction.

  1. AI Search now gives you more control over similarity cache freshness. Similarity cache helps reduce latency and inference cost by reusing responses for semantically similar queries.

    With these updates, you can choose how long responses are eligible for reuse and clear cached responses when they may be stale.

    Cache duration now defaults to 48 hours

    Previously, AI Search cached responses for a fixed duration of 30 days. Cached responses now use the instance's cache_ttl setting, and the default is 48 hours.

    You can set cache_ttl when creating or updating an instance to choose a cache duration from 10 minutes to 6 days.

    Use a shorter TTL when your source content changes frequently and freshness is more important. Use a longer TTL when your content is stable and you want more cache reuse.

    For example, set cache_ttl to 518400 to retain cached responses for 6 days:

    {
      "cache_ttl": 518400
    }
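The value 518400 is simply 6 days expressed in seconds. As a minimal sketch, the conversion and the 10 minute to 6 day range described above can be checked before sending an update. The helper name `cacheTtlSeconds` is hypothetical, and the API may accept only specific values within this range, so treat the range check as an assumption rather than the service's own validation.

```javascript
// Bounds taken from the text above: 10 minutes to 6 days, in seconds.
const MIN_CACHE_TTL = 10 * 60; // 600
const MAX_CACHE_TTL = 6 * 24 * 60 * 60; // 518400

// Hypothetical helper: converts a duration to seconds and checks the range.
function cacheTtlSeconds({ days = 0, hours = 0, minutes = 0 }) {
  const seconds = ((days * 24 + hours) * 60 + minutes) * 60;
  if (seconds < MIN_CACHE_TTL || seconds > MAX_CACHE_TTL) {
    throw new RangeError(
      `cache_ttl ${seconds}s is outside ${MIN_CACHE_TTL}-${MAX_CACHE_TTL}s`,
    );
  }
  return seconds;
}

// Request body for the 48-hour default, written out explicitly.
const body = JSON.stringify({ cache_ttl: cacheTtlSeconds({ days: 2 }) });
```

Refer to the similarity cache documentation for the values the API actually accepts.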

    Purge cached responses

    You can also purge all cached responses for an instance on demand. Purging cached responses does not delete indexed content or source files.

    It prevents AI Search from reusing previous cached responses, so subsequent similar queries generate fresh answers and repopulate the cache.

    Terminal window
    curl -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai-search/instances/$INSTANCE_NAME/purge_cache" \
    -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"

    You can also purge cached responses from the instance settings page in the Cloudflare dashboard.
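The same purge request can be issued from JavaScript, for example from a script that runs after a content import. This is a rough sketch that only mirrors the curl command above; `buildPurgeCacheRequest` is a hypothetical helper, and the response shape is not covered here.

```javascript
// Hypothetical helper: builds the URL and fetch() options for the purge call.
function buildPurgeCacheRequest(accountId, instanceName, apiToken) {
  const url =
    "https://api.cloudflare.com/client/v4/accounts/" +
    encodeURIComponent(accountId) +
    "/ai-search/instances/" +
    encodeURIComponent(instanceName) +
    "/purge_cache";
  return {
    url,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}` },
    },
  };
}

// Usage (performs a real network call, so it is left commented out):
// const { url, init } = buildPurgeCacheRequest(ACCOUNT_ID, INSTANCE_NAME, TOKEN);
// const response = await fetch(url, init);
```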

    Refer to similarity cache for the full list of supported cache_ttl values and more details about cache behavior.

  1. Workflows makes it easier to build reliable multi-step applications that can recover when downstream systems fail. Rollback handlers now receive the original step context via a ctx object for the step being rolled back. This includes ctx.step.name, ctx.step.count, ctx.attempt, and the step config with defaults applied.

    The step configuration includes the retry and timeout settings used for that step, so you can customize your step recovery logic according to those fields.

    TypeScript
    await step.do(
      "create charge",
      async () => {
        const charge = await createCharge();
        return { chargeId: charge.id };
      },
      {
        rollback: async ({ ctx, output, error }) => {
          // `output` is the value returned by the step being rolled back.
          const { chargeId } = output as { chargeId: string };
          await refundCharge(chargeId, {
            // `ctx` is the original step context, including step name, count, attempt, and config.
            reason: `${ctx.step.name}: ${error.message}`,
          });
        },
        rollbackConfig: {
          // `rollbackConfig` controls retries and timeout for the rollback handler.
          retries: { limit: 3, delay: "30 seconds", backoff: "linear" },
          timeout: "5 minutes",
        },
      },
    );

    Refer to rollback options to learn more.

  1. R2 SQL now supports window functions, SELECT DISTINCT, set operations, and additional aggregates, making it easier to write analytical queries without preprocessing your data elsewhere.

    R2 SQL is Cloudflare's serverless, distributed SQL engine for querying Apache Iceberg tables stored in R2 Data Catalog.

    New capabilities

    • Window functions: ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE, and aggregates with an OVER (...) clause, including PARTITION BY and explicit frames
    • QUALIFY: filter rows based on a window function result
    • DISTINCT: SELECT DISTINCT, DISTINCT ON (...), and the DISTINCT modifier on aggregates such as COUNT(DISTINCT ...)
    • Set operations: UNION, UNION ALL, INTERSECT, and EXCEPT
    • Grouping extensions: GROUPING SETS, ROLLUP, and CUBE
    • Exact aggregates: MEDIAN, PERCENTILE_CONT, ARRAY_AGG, and STRING_AGG

    Examples

    Rank rows with a window function

    SELECT customer_id, region,
    ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) AS rank_in_region
    FROM my_namespace.sales_data

    Filter with QUALIFY

    SELECT customer_id, region, total_amount
    FROM my_namespace.sales_data
    QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= 3

    Combine tables with a set operation

    SELECT customer_id FROM my_namespace.sales_data
    EXCEPT
    SELECT customer_id FROM my_namespace.archived_sales

    The named WINDOW clause is not supported — inline the OVER (...) specification at each call site. For the full syntax reference, refer to the SQL reference. For supported features and performance guidance, refer to Limitations and best practices.
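To make the QUALIFY example concrete, the sketch below reproduces its semantics on an in-memory array: partition by region, order each partition by total_amount descending, number the rows, and keep row numbers up to 3. The sample rows and the `topNPerRegion` helper are invented for illustration; this shows what the query computes, not how R2 SQL executes it.

```javascript
// Sample rows standing in for my_namespace.sales_data (invented for illustration).
const sales = [
  { customer_id: "c1", region: "emea", total_amount: 50 },
  { customer_id: "c2", region: "emea", total_amount: 90 },
  { customer_id: "c3", region: "emea", total_amount: 70 },
  { customer_id: "c4", region: "emea", total_amount: 10 },
  { customer_id: "c5", region: "apac", total_amount: 30 },
];

// Equivalent of:
//   QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= n
function topNPerRegion(rows, n) {
  // PARTITION BY region
  const partitions = new Map();
  for (const row of rows) {
    if (!partitions.has(row.region)) partitions.set(row.region, []);
    partitions.get(row.region).push(row);
  }
  const result = [];
  for (const partition of partitions.values()) {
    // ORDER BY total_amount DESC
    const ordered = [...partition].sort((a, b) => b.total_amount - a.total_amount);
    ordered.forEach((row, index) => {
      const rowNumber = index + 1; // ROW_NUMBER()
      if (rowNumber <= n) result.push(row); // QUALIFY ... <= n
    });
  }
  return result;
}

const top3 = topNPerRegion(sales, 3).map((row) => row.customer_id);
```

Here the lowest emea row (c4) is filtered out while the single apac row is kept. The SQL query has no outer ORDER BY, so the order of its result rows is not guaranteed the way the array order is here.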

  1. The Routes page in the Cloudflare dashboard now shows the routes across all of your connectors — Cloudflare Mesh and Cloudflare Tunnel routes alongside Cloudflare WAN and Magic Transit static routes — in a single table, instead of a separate routes view per product.

    The unified Routes page in the Cloudflare dashboard, showing routes across connectors in a single table

    From the unified Routes page you can:

    • Visualize your network with an interactive map that shows how your destinations flow through to your connectors — including equal-cost multi-path (ECMP) routes where the same prefix is served by several connectors. Select a node to filter the table down to the routes behind it.
    • See every route in one table, with its destination, type, connector, priority, and source, and filter or sort to find what you need.
    • Create, edit, and delete routes of any supported type without leaving the page. When adding a Cloudflare WAN or Magic Transit static route, you now pick the next hop by connector name instead of typing its IP.
    • Manage virtual networks from a dedicated tab.
    • Test a route to see which connector and next hop a destination resolves to before you commit a change.

    To find it, go to Networking > Routes in the dashboard sidebar.

    Your existing routes, APIs, and configurations are unchanged — this is a dashboard experience that brings them together in one place. Learn how to add routes and manage virtual networks.

  1. Durable Objects now supports two new location hints for Asia-Pacific: apac-ne (Northeast Asia-Pacific) and apac-se (Southeast Asia-Pacific). Use apac-ne or apac-se when you want finer-grained placement within Asia-Pacific rather than the broader apac hint.

    Use the new hints the same way as any other locationHint:

    JavaScript
    // Northeast Asia-Pacific (Japan, Korea, etc.)
    const stubNE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-ne" });
    // Southeast Asia-Pacific (Singapore, Indonesia, etc.)
    const stubSE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-se" });

    If your users are spread across all of Asia-Pacific, the existing apac hint remains the right choice. Only reach for apac-ne or apac-se when your traffic is clearly concentrated in one sub-region and you want to minimize round-trip time to that audience. The default behavior, and what we generally recommend, is to omit the location hint unless you need one. Without a hint, the Durable Object is created as close as possible to the request that initializes it, which reduces latency.

    As with all location hints, these are best-effort suggestions. Cloudflare will place the Durable Object in a nearby data center, not necessarily the exact hinted location.
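One way to apply this guidance is to pass a sub-region hint only when the caller is clearly in one of the two sub-regions, and otherwise pass no hint at all. The sketch below assumes a hypothetical country-to-hint table containing only the countries named in the example comments above; a real deployment would need its own mapping.

```javascript
// Hypothetical mapping from ISO country code to sub-region hint.
// Only the countries named in the example above are listed.
const SUB_REGION_HINTS = new Map([
  ["JP", "apac-ne"], // Japan
  ["KR", "apac-ne"], // Korea
  ["SG", "apac-se"], // Singapore
  ["ID", "apac-se"], // Indonesia
]);

// Returns a hint for known countries, or undefined to fall back to default placement.
function pickLocationHint(countryCode) {
  return SUB_REGION_HINTS.get(countryCode);
}

// Inside a Worker (not runnable outside the Workers runtime):
// const hint = pickLocationHint(request.cf?.country);
// const stub = hint
//   ? env.MY_DURABLE_OBJECT.get(id, { locationHint: hint })
//   : env.MY_DURABLE_OBJECT.get(id);
```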

    For the full list of supported hints, refer to Data location — Provide a location hint.

  1. Durable Objects now remain alive for the duration of active outbound connections created via connect() or an outbound WebSocket. Previously, a Durable Object would be evicted after 70-140 seconds of no incoming traffic, even if the object had an open outbound connection, which is a common pattern when streaming responses from a large language model (LLM) over TCP or an outbound WebSocket.

    With this change, each active outbound connection prevents eviction. Once all outbound connections close, the standard 70-140 second inactivity window applies before the Durable Object is evicted.

    Before: streaming connections were cut off by eviction

    Timeline showing a Durable Object evicted 70-140 seconds after the last incoming request, cutting off an in-flight LLM stream while the outbound connection is still open

    After: active outbound connections keep the Durable Object alive

    Timeline showing the same outbound stream completing because the active connection keeps the Durable Object alive, with the inactivity window starting only after the connection closes

    If you are building agents on Cloudflare, this is especially relevant. An agent that streams tokens from an LLM while calling models, or that performs long-running tasks over an outbound connection, now stays alive for the duration of that connection instead of being evicted mid-stream.

    Limits:

    • Each outbound connection keeps the Durable Object alive for a maximum of 15 minutes. After 15 minutes, the connection stops preventing eviction (the connection itself continues operating), and the standard eviction rules resume.
    • The Durable Object's existing per-account instance limits still apply.
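The interaction between the 15-minute cap and the 70-140 second inactivity window can be worked through with a small calculation. This is a simplified model of the rules as stated above, not the runtime's actual scheduler: it assumes timestamps in milliseconds, ignores any other activity that would keep the object alive, and uses hypothetical names throughout.

```javascript
const MAX_KEEP_ALIVE_MS = 15 * 60 * 1000; // per-connection cap from the limits above
const INACTIVITY_WINDOW_MS = { min: 70 * 1000, max: 140 * 1000 };

// Hypothetical model: when could eviction happen, given the last incoming
// request and the outbound connections opened by the Durable Object?
function evictionWindow(lastIncomingAt, connections) {
  let keepAliveUntil = lastIncomingAt;
  for (const { openedAt, closedAt } of connections) {
    // A connection stops preventing eviction when it closes or after 15 minutes.
    const cappedEnd = Math.min(closedAt, openedAt + MAX_KEEP_ALIVE_MS);
    keepAliveUntil = Math.max(keepAliveUntil, cappedEnd);
  }
  return {
    earliest: keepAliveUntil + INACTIVITY_WINDOW_MS.min,
    latest: keepAliveUntil + INACTIVITY_WINDOW_MS.max,
  };
}

// A 5-minute LLM stream opened 1 second after the last incoming request.
const shortStream = evictionWindow(0, [{ openedAt: 1000, closedAt: 301000 }]);
// A 20-minute stream: only the first 15 minutes count.
const longStream = evictionWindow(0, [{ openedAt: 0, closedAt: 1200000 }]);
```

Under this model the 5-minute stream pushes the inactivity window to start when the stream closes, while the 20-minute stream stops protecting the object at the 15-minute mark even though the connection itself stays open.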

    For more information, refer to Lifecycle of a Durable Object.

  1. AI agents can now deploy Workers to Cloudflare without first requiring a user to sign up, open a browser-based OAuth flow, click through the dashboard, or create an API token. When an agent tries to deploy without Cloudflare credentials, Wrangler can tell it to rerun with --temporary, then deploy the Worker to a temporary preview account.

    To try this with your agent, update to Wrangler 4.102.0 or later, make sure you are logged out (wrangler logout), and then ask your agent to build something and deploy it to Cloudflare. The agent should follow Wrangler's output and deploy using the --temporary flag.

    Diagram showing an AI agent deploying, verifying, and redeploying a Worker to a temporary account, then claiming it after authentication and moving it to a permanent account

    Terminal window
    wrangler deploy --temporary

    The temporary deployment stays live for 60 minutes. During that window, the agent can verify the Worker, redeploy changes, and return both the live Worker URL and claim URL. Opening the claim URL lets you sign in to or create a Cloudflare account and make the temporary account permanent.

    Temporary preview accounts currently support a limited set of products, including Workers, Workers Static Assets, Workers KV, D1, Durable Objects, Hyperdrive, Queues, and SSL/TLS certificates. For supported products, limits, and claim behavior, refer to Claim deployments (temporary accounts).

    For more context, refer to Temporary Cloudflare Accounts for Agents.

  1. exec() is now available for Containers. Use this.ctx.container.exec() to start processes inside a running Container, stream standard input and output, inspect exit codes, and signal each process.

    Call exec() from a class extending Container, or from another Durable Object through this.ctx.container. The associated Container must already be running.

    This example starts the Container when needed, then reads its Node.js version:

    src/index.js
    import { Container } from "@cloudflare/containers";

    export class MyContainer extends Container {
      async readVersion() {
        if (!this.ctx.container.running) {
          await this.start();
        }
        const process = await this.ctx.container.exec(["node", "--version"]);
        const output = await process.output();
        const decoder = new TextDecoder();
        return {
          exitCode: output.exitCode,
          stdout: decoder.decode(output.stdout),
          stderr: decoder.decode(output.stderr),
        };
      }
    }

    The command array starts an executable directly, without an implicit shell. Invoke a shell explicitly for pipes, redirects, or variable expansion.
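For example, a pipe or redirect has to be wrapped in an explicit shell invocation. The sketch below builds such a command array; `viaShell` is a hypothetical helper, and it assumes the Container image provides `sh` on its PATH.

```javascript
// Hypothetical helper: wraps a shell script in an explicit `sh -c` invocation
// so that pipes, redirects, and variable expansion are interpreted by the shell.
function viaShell(script) {
  return ["sh", "-c", script];
}

const command = viaShell("node --version | tr -d v > /tmp/node-version.txt");

// Inside a class extending Container (not runnable outside the Workers runtime):
// const process = await this.ctx.container.exec(command);
```

Passing the script as a single array element keeps the quoting simple: the shell receives it as one argument instead of being assembled from separately escaped pieces.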

    One RPC method can coordinate multiple exec() calls in one caller-to-Durable Object round trip. It can also pass byte-oriented ReadableStream input or return streamed output with flow control.

    For options and streaming examples, refer to Execute commands.

  1. You can create PlanetScale Postgres and MySQL databases from Cloudflare and, as a pay-as-you-go customer, bill PlanetScale database usage through your Cloudflare account. Cloudflare contract customers will be able to add PlanetScale usage to their contract in July, so reach out to your Cloudflare account team if you are interested.

    Create a PlanetScale database from the Cloudflare dashboard to check out globally distributed Workers optimized for regional data access.

    Request flow from a user to Workers, Hyperdrive caches, connection pools, and PlanetScale.

    PlanetScale databases created from Cloudflare work with Workers through Hyperdrive. Hyperdrive manages database connection pools and query caching, so you can use PlanetScale as a centralized relational database for Workers applications without changing your database drivers, object-relational mapping (ORM) libraries, or SQL tooling.

    PlanetScale usage appears on your Cloudflare invoice each billing period as a dollar total at PlanetScale's standard pricing. You can inspect per-database billing usage in PlanetScale's dashboard.

    When you create a PlanetScale database from the Cloudflare dashboard, you receive the same PlanetScale developer experience, including development branches, query insights, and Model Context Protocol (MCP) server support for agents.

    To get started, refer to PlanetScale Postgres and MySQL with Hyperdrive.

  1. You can now configure Artifacts namespaces, repos, and tokens directly from the Cloudflare dashboard.

    Artifacts is Git-compatible storage that lets you store repos on Cloudflare and interact with them using standard Git workflows.

    You can view and create namespaces, which are top-level containers for repos:

    Artifacts namespaces dashboard showing namespace search and create namespace controls

    You can view, create, fork, and search repos within a namespace:

    Artifacts repositories dashboard showing repo source, access, and created columns

    You can open a repo to view its files and copy its Git remote URL.

    Artifacts repository overview showing files, commits, token management, and quick actions

    You can also provision tokens directly from the dashboard to scope Git access to a single repo, with read tokens for clone, fetch, and pull workflows, or write tokens when a client needs to push changes.

    To get started, go to the Cloudflare dashboard and select Storage & databases > Artifacts.

    If you are enrolled in the Artifacts beta, you can use the dashboard to set up Artifacts. If you would like to join the beta, complete the request form.

  1. The latest release of the Agents SDK makes it easier to build agents that can safely interact with real systems and keep working through interruptions.

    Agents can now browse websites through Browser Run, write code against external tools through Code Mode, use client-provided tools when delegating to Think sub-agents, and recover more reliably from deploys, Durable Object evictions, and connection churn.

    Safer browser automation

    Agents can now use Browser Run through a single durable browser_execute tool. Instead of choosing from a fixed list of actions, the model writes code against the Chrome DevTools Protocol (CDP) and can inspect pages, capture screenshots, read rendered content, debug frontend behavior, and interact with live browser sessions.

    JavaScript
    const browserTools = createBrowserTools({
      ctx: this.ctx,
      browser: this.env.BROWSER,
      loader: this.env.LOADER,
      session: { mode: "dynamic" },
    });

    Browser sessions can be one-time, reused, or promoted from one-time to persistent during a run. This is useful when an agent needs a human to log in, complete MFA, or approve a sensitive action. The run can pause, keep the same tabs and cookies, and resume after approval.

    The browser tools also add Live View URLs, optional session recording, and quick actions such as browser_markdown, browser_extract, browser_links, and browser_scrape for one-shot browsing tasks.

    Resumable code execution with approvals

    Code Mode now uses createCodemodeRuntime, connectors, and a durable execution log. This lets you give a model one codemode tool instead of a large prompt full of tool definitions. The model can discover the capabilities it needs, write code against typed globals, and reuse saved snippets.

    JavaScript
    const runtime = createCodemodeRuntime({
      ctx: this.ctx,
      executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
      connectors: [new GithubConnector(this.ctx, this.env, connection)],
    });
    const result = streamText({
      model,
      messages,
      tools: { codemode: runtime.tool() },
    });

    When the code reaches an approval-gated action, the runtime pauses execution and returns a pending approval. After approval, completed calls replay from the durable log, the approved action runs, and the same code continues. This makes it practical to build agents that create issues, update external systems, or perform other side effects without custom pause-and-resume logic for every tool.

    Better Think delegation

    Think sub-agents can now use client-defined tools over the RPC chat() path. A parent agent can pass tool schemas with clientTools and resolve tool calls through onClientToolCall. This lets delegated agents use caller-provided capabilities without requiring a browser WebSocket.

    JavaScript
    await child.chat(message, callback, {
      signal,
      clientTools: [
        {
          name: "get_user_timezone",
          description: "Get the caller's timezone",
          parameters: { type: "object" },
        },
      ],
      onClientToolCall: async ({ toolName, input }) => {
        return runClientTool(toolName, input);
      },
    });

    Think Workflows also improve step.prompt(). A prompt step now runs a full agentic turn before returning structured output, so the agent can call tools before producing the typed result. This makes Workflow steps more useful for durable triage, research, and approval flows.

    The unified Think execute tool can also include cdp.* browser capabilities alongside state.* and tools.* when Browser Run is bound.

    Voice output device selection

    Voice clients can route assistant audio to a specific output device. Use outputDeviceId with useVoiceAgent, or call client.setOutputDevice() from the framework-agnostic client.

    JavaScript
    const voice = useVoiceAgent({
      agent: "MyVoiceAgent",
      outputDeviceId: selectedSpeakerId,
    });

    Browsers without speaker-selection support continue playing through the default output device and report a non-fatal outputDeviceError.

    Reliability fixes

    This release includes several fixes for production agents:

    • useAgent and AgentClient handle WebSocket replacement more reliably during reconnects and configuration changes.
    • Chat stream replay is more reliable after reconnects, deploys, and provider errors.
    • Fiber recovery continues across multi-pass scans and backs off when recovery hooks keep failing.
    • Agent teardown continues even when the request that started teardown is canceled.
    • Large session histories use byte-budgeted reads to reduce memory pressure during startup.

    Upgrade

    To update to the latest version:

    npm i agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest

    Refer to the Code Mode documentation, Browser tools documentation, Think tools documentation, and Voice documentation for more information.

  1. These updates introduce new features for optimizing and manipulating images with Images:

    • New composite option: Control how overlays are blended with the base image.
    • Percentage widths: Set the dimensions of an overlay as a fraction of the dimensions of the base image.
    • New fit modes: Use aspect-crop to always preserve the target aspect ratio or scale-up to always enlarge images.
    • New upscale parameter: Apply AI upscaling to produce sharper, more detailed results when enlarging images.

  1. We are excited to announce GLM-5.2 on Workers AI, Z.ai's flagship agentic coding model.

    @cf/zai-org/glm-5.2 is a text generation model built for agentic coding workflows. With function calling and reasoning support, it can handle long codebases, multi-step planning, and tool-augmented agents.

    Key features and use cases:

    • Agentic coding: Designed for autonomous coding tasks, long-horizon planning, and complex software engineering workflows
    • Large context window: GLM-5.2 supports up to a 1,048,576 token context window. Workers AI is launching the model with a 262,144 token context window and plans to increase this in the future
    • Function calling: Build agents that invoke tools and APIs across multiple conversation turns
    • Reasoning: Tackles complex problem-solving and step-by-step reasoning tasks

    Use GLM-5.2 through the Workers AI binding (env.AI.run()), the REST API at /run or /v1/chat/completions, or AI Gateway.

    Pricing is available on the model page or pricing page.

  1. VPC Network bindings now support the connect() Socket API for raw TCP connections to private destinations, in addition to HTTP traffic via fetch().

    This means Workers can now open TCP sockets to any private service reachable through the bound Cloudflare Tunnel, Cloudflare Mesh, or Cloudflare WAN on-ramp — Redis, Memcached, MQTT, custom binary protocols, or any other TCP-based service.

    JSONC
    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      "vpc_networks": [
        {
          "binding": "PRIVATE_NETWORK",
          "network_id": "cf1:network",
          "remote": true
        }
      ]
    }

    At runtime, use connect() on the binding to open a TCP socket to a private destination:

    TypeScript
    export default {
      async fetch(request: Request, env: Env) {
        // Open a TCP connection to a private Redis instance
        const socket = await env.PRIVATE_NETWORK.connect("10.0.1.50:6379");
        // Write a Redis PING command
        const writer = socket.writable.getWriter();
        await writer.write(new TextEncoder().encode("PING\r\n"));
        await writer.close();
        return new Response(socket.readable);
      },
    };
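The example above sends PING as a plain inline command. For commands that carry arguments, Redis clients normally encode the request in the RESP protocol as an array of bulk strings. The sketch below shows that encoding as a standalone function; `encodeRespCommand` is a hypothetical helper, not part of the binding API, and its output could be passed to `writer.write()` in place of the inline bytes.

```javascript
const encoder = new TextEncoder();

// Hypothetical helper: encodes a Redis command as a RESP array of bulk strings.
// Each bulk string is prefixed with its length in bytes, not characters.
function encodeRespCommand(...args) {
  let text = `*${args.length}\r\n`;
  for (const arg of args) {
    const byteLength = encoder.encode(arg).length;
    text += `$${byteLength}\r\n${arg}\r\n`;
  }
  return encoder.encode(text);
}

const ping = encodeRespCommand("PING");
const get = encodeRespCommand("GET", "key");
```

Measuring the length after UTF-8 encoding matters for non-ASCII arguments, where the character count and the byte count differ.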

    For more details, refer to VPC Networks and the Workers Binding API.

  1. You can now create custom trace spans in your Workers code using tracing.enterSpan(). Custom spans appear alongside the automatic platform instrumentation (fetch calls, KV reads, D1 queries, and other platform operations) in your traces and OpenTelemetry exports, with correct parent-child nesting.

    The API is available via import { tracing } from "cloudflare:workers" or through the handler context as ctx.tracing:

    TypeScript
    import { tracing } from "cloudflare:workers";

    export default {
      async fetch(request, env, ctx) {
        return tracing.enterSpan("handleRequest", async (span) => {
          span.setAttribute("url.path", new URL(request.url).pathname);
          const data = await env.MY_KV.get("key");
          return new Response(data);
        });
      },
    };

    Spans nest automatically based on the JavaScript async context, and are auto-ended when the callback returns or its returned promise settles. The Span object provides setAttribute(key, value) for attaching metadata and an isTraced property to check whether the current request is being sampled.

    Trace waterfall showing custom spans nested alongside automatic KV and fetch instrumentation

    Tracing must be enabled in your Wrangler configuration for spans to be recorded.

    For full API details and examples, refer to Custom spans.

  1. AI Gateway logs now capture the user agent of the client that made each request, making it easier to identify which SDK, library, or application sent the traffic flowing through your gateway. For example, you can tell apart requests coming from openai-python versus a custom application or a Cloudflare Worker.

    The user agent appears alongside the other details in each log entry, and you can filter logs by user agent (equals, does not equal, or contains) in the dashboard.

    For more information, refer to Logging.
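
    If your own application calls the gateway directly, a descriptive user agent makes its traffic easier to find with the filter. The following is a minimal sketch, assuming your runtime lets you override the `User-Agent` header on outgoing requests. `gatewayHeaders` and the `name/version` values are illustrative and not part of any SDK:

```typescript
// Hypothetical helper: build request headers with an explicit, filterable User-Agent.
function gatewayHeaders(apiKey: string, appName: string, version: string): Record<string, string> {
  return {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    // A stable name/version pair is easy to match with an "equals" or "contains" filter.
    "User-Agent": `${appName}/${version}`,
  };
}

// Usage (not executed here): pass the headers to fetch() together with your gateway URL.
// await fetch(gatewayUrl, { method: "POST", headers: gatewayHeaders(key, "orders-worker", "1.4.2"), body });
console.log(gatewayHeaders("<API_KEY>", "orders-worker", "1.4.2")["User-Agent"]);
```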

  1. You can now filter the Metrics tab for a Durable Objects namespace by an individual Durable Object's ID or name in the Cloudflare dashboard. Previously, metrics charts only showed aggregate, namespace-level data, making it difficult to isolate the behavior of a specific object.

    Go to Durable Objects

    The Durable Objects Metrics tab filtered to a single object by ID, showing per-object requests and errors by invocation status.

    Start typing an ID or name into the filter and select a match from the autocomplete dropdown. Every chart on the page then updates to reflect only the selected object, which makes it easier to investigate a single Durable Object when debugging a high-traffic object, an error spike, or unexpected storage usage. The autocomplete only shows objects with invocations during the selected time range, so an object that does not appear was not invoked in that window; it has not necessarily been deleted. Clear the filter to return to namespace-level metrics.

    Metrics are powered by the GraphQL Analytics API, so standard analytics behavior such as ingestion delay and sampling applies.

    For more information, refer to Metrics and analytics.

  1. Cloudflare's Terraform v5 Provider lets developers manage their Cloudflare infrastructure using a configuration-as-code approach. It is released every 2 to 3 weeks so that you can always manage the latest features in the platform. This week, we launched Terraform v5.20.0, which adds 24 new resources, bumps the underlying Go SDK to cloudflare-go v7, and includes a range of bug fixes and state upgraders based on community feedback.

    New resources

    • cloudflare_ai_search_namespace: Manage AI Search namespaces
    • cloudflare_custom_csr: Manage custom certificate signing requests
    • cloudflare_dls_prefix_binding: Manage DLS regional service prefix bindings
    • cloudflare_flagship_app: Manage Flagship feature flag apps
    • cloudflare_flagship_flag: Manage Flagship feature flags
    • cloudflare_google_tag_gateway: Manage Google Tag Gateway
    • cloudflare_load_balancer_monitor_group: Manage load balancer monitor groups
    • cloudflare_oauth_client: Manage IAM OAuth clients
    • cloudflare_origin_cloud_region: Manage origin cloud regions (v2 endpoints)
    • cloudflare_secrets_store: Manage Secrets Store instances
    • cloudflare_secrets_store_secret: Manage Secrets Store secrets
    • cloudflare_share: Manage resource shares
    • cloudflare_share_recipient: Manage share recipients
    • cloudflare_share_resource: Manage shared resources
    • cloudflare_zero_trust_device_deployment_groups: Manage Zero Trust device deployment groups
    • cloudflare_zero_trust_dlp_data_class: Manage DLP data classes
    • cloudflare_zero_trust_dlp_data_tag: Manage DLP data tags
    • cloudflare_zero_trust_dlp_data_tag_category: Manage DLP data tag categories
    • cloudflare_zero_trust_dlp_sensitivity_group: Manage DLP sensitivity groups
    • cloudflare_zero_trust_dlp_sensitivity_level: Manage DLP sensitivity levels
    • cloudflare_zero_trust_dlp_sensitivity_level_order: Manage DLP sensitivity level ordering
    • cloudflare_zero_trust_resource_library_application: Manage Zero Trust resource library applications
    • cloudflare_zero_trust_resource_library_category: Manage Zero Trust resource library categories
    • cloudflare_zero_trust_tunnel_warp_connector_config: Manage WARP connector tunnel configurations

    Features

    • cache: add create (POST) method for smart_tiered_cache
    • cache: update OPCR config to v2 endpoints
    • dlp: promote classification Stainless config to main
    • dlp: add custom prompt topics endpoint
    • email_security_block_sender: state upgrader for v4 to v5 migration
    • email_security_impersonation_registry: state upgrader for v4 to v5 migration
    • email_security_trusted_domains: state upgrader for v4 to v5 migration
    • snippets: add Terraform id_property annotations for snippet and snippet_rules
    • bump Go SDK to cloudflare-go v7

    Bug fixes

    • account_member: missing upgrade path from v5.0–v5.15
    • authenticated_origin_pulls_settings: nil pointer panic
    • bot_management: restore content_bots_protection handling in model.go
    • dns_record: prevent FQDN normalization from swallowing name shortening changes
    • list: nullify empty nested objects to prevent inconsistent result after apply
    • load_balancer_pool: accept early-v5 object-shape state at schema_version=0
    • load_balancer_pool: add UseStateForUnknown for load_shedding attribute to prevent drift
    • r2_custom_domain: restore degraded-response handling in resource.go
    • regional_hostname: update cloudflare-go imports from v6 to v7
    • secrets_store: fix model/schema parity and guard acceptance tests
    • spectrum_application: accept early-v5 object-shape state at schema_version=0
    • worker: preserve observability.traces.propagation_policy across reads
    • worker: add propagation_policy to observability defaults
    • worker_version: restore handwritten D1 database_id handling
    • workers_custom_domain: missing CertId field in state migration
    • workers_script: restore annotations Read workaround stripped by codegen
    • zero_trust_access_identity_provider: change read_only from computed to optional
    • zero_trust_access_identity_provider: add UseStateForUnknown to SAML-only config fields
    • zero_trust_access_identity_provider: use UseNonNullStateForUnknown on scim_config fields
    • zero_trust_access_policy: populate account_id when migrating zone-scoped v4 state
    • zero_trust_access_policy: missing common_names transform in migration
    • gracefully handle nil pointer dereference when config has attributes_flat during migration
    • set initial schema version to 500 for all new resources

    Refactors

    • Extracted MoveState nil guard into shared helper

    For more information

  1. @cf/moonshotai/kimi-k2.7-code is now available on Workers AI. Kimi K2.7 Code is a code-optimized variant of the Kimi K2 family, built on a Mixture-of-Experts architecture with 1T total parameters and 32B active per token.

    Improved coding and agent performance

    K2.7 Code delivers meaningful gains over K2.6 on coding and agentic benchmarks:

    • +21.8% on Kimi Code Bench v2
    • +11.0% on Program Bench
    • +31.5% on MLS Bench Lite

    Reasoning efficiency

    K2.7 Code uses 30% fewer reasoning tokens compared to K2.6, reducing overthinking and lowering inference cost for reasoning-heavy workloads.

    Key capabilities

    • 262.1k token context window for retaining full conversation history, tool definitions, and codebases across long-running agent sessions
    • Long-horizon coding with improved instruction following and higher end-to-end coding task success rates
    • Vision inputs for processing images alongside text
    • Thinking mode with configurable reasoning depth via chat_template_kwargs.thinking
    • Multi-turn tool calling for building agents that invoke tools across multiple conversation turns
    • Structured outputs with JSON schema support

    Differences from Kimi K2.6

    If you are migrating from Kimi K2.6, note the following:

    • K2.7 Code is optimized for coding tasks with improved benchmark performance and reasoning efficiency
    • Cached input token pricing is $0.19 per million tokens (vs $0.16 for K2.6)
    • API usage is identical — no parameter changes required

    Get started

    Use Kimi K2.7 Code through the Workers AI binding (env.AI.run()), the REST API at /ai/run, or the OpenAI-compatible endpoint at /v1/chat/completions. You can also use AI Gateway with any of these endpoints.

    For more information, refer to the Kimi K2.7 Code model page and pricing.
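
    As a starting point, the sketch below builds chat inputs for the binding. `AiBindingLike` is an assumed minimal shape for `env.AI`, `buildKimiInputs` and `askKimi` are hypothetical helpers, and treating `chat_template_kwargs.thinking` as a boolean is an assumption; check the model page for the accepted values.

```typescript
const KIMI_MODEL = "@cf/moonshotai/kimi-k2.7-code";

// Assumed minimal shape of the Workers AI binding; the real type is provided by the runtime.
interface AiBindingLike {
  run(model: string, inputs: unknown): Promise<unknown>;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build chat inputs. `thinking` is forwarded through chat_template_kwargs;
// its boolean type is an assumption of this sketch.
function buildKimiInputs(prompt: string, thinking: boolean) {
  const messages: ChatMessage[] = [
    { role: "system", content: "You are a careful coding assistant." },
    { role: "user", content: prompt },
  ];
  return { messages, chat_template_kwargs: { thinking } };
}

// Call the model through a binding-like object, for example `env.AI` inside a Worker.
async function askKimi(ai: AiBindingLike, prompt: string): Promise<unknown> {
  return ai.run(KIMI_MODEL, buildKimiInputs(prompt, true));
}

console.log(JSON.stringify(buildKimiInputs("Write a binary search in Go.", false)));
```

    Inside a Worker's `fetch` handler, this would be called as `await askKimi(env.AI, "...")`.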

  1. Browser Run's /snapshot endpoint now supports a formats parameter that lets you return multiple page formats in a single API call. Previously, /snapshot returned only HTML content and a screenshot. You can now also include Markdown and the accessibility tree in the same response.

    These formats are particularly useful for AI agent workflows:

    • Markdown provides a token-efficient representation of page content that LLMs can process directly, without parsing HTML markup.
    • The accessibility tree provides a structured representation of a page's elements, including roles, labels, and hierarchy, helping LLMs understand page structure and navigate its contents.

    The following example returns a screenshot, Markdown, and the accessibility tree in one call:

    Terminal window
    curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/snapshot' \
      -H 'Authorization: Bearer <apiToken>' \
      -H 'Content-Type: application/json' \
      -d '{
        "url": "https://example.com/",
        "formats": ["screenshot", "markdown", "accessibilityTree"]
      }'

    You must request at least two formats. If you only need one, use the respective single-format endpoint such as /screenshot or /markdown.

    Refer to the /snapshot documentation for the full list of accepted values.
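
    The same call can be assembled in code. The sketch below mirrors the curl example and checks the two-format minimum before sending anything. `buildSnapshotRequest` is a hypothetical helper, the `SnapshotFormat` type lists only the three values shown above, and de-duplicating the list is a client-side choice rather than documented API behavior.

```typescript
// Format names taken from the example above; the documentation lists the full set.
type SnapshotFormat = "screenshot" | "markdown" | "accessibilityTree";

interface SnapshotRequest {
  endpoint: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build the endpoint and fetch() options for a /snapshot call.
function buildSnapshotRequest(
  accountId: string,
  apiToken: string,
  url: string,
  formats: SnapshotFormat[],
): SnapshotRequest {
  // Drop duplicates so ["markdown", "markdown"] is not mistaken for two formats.
  const unique = formats.filter((format, index) => formats.indexOf(format) === index);
  if (unique.length < 2) {
    throw new Error("/snapshot needs at least two formats; use a single-format endpoint instead");
  }
  return {
    endpoint: `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/snapshot`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}`, "Content-Type": "application/json" },
      body: JSON.stringify({ url, formats: unique }),
    },
  };
}

const snapshotRequest = buildSnapshotRequest("<accountId>", "<apiToken>", "https://example.com/", [
  "screenshot",
  "markdown",
  "accessibilityTree",
]);
console.log(snapshotRequest.endpoint, snapshotRequest.init.body);
```

    The result can be passed straight to `fetch(snapshotRequest.endpoint, snapshotRequest.init)`.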

  1. Dynamic Workers usage on the Workers overview page

    You can now view the number of Dynamic Workers invoked during your billing period from the Workers overview page in the Cloudflare dashboard.

    This count reflects the number of Dynamic Workers that Cloudflare would bill for during the selected billing period. Dynamic Workers usage data only goes back to June 1, 2026.

    You can also query this count through the GraphQL Analytics API by using workersInvocationsByOwnerAndScriptGroups and selecting distinctDynamicWorkerCount:

    query getDynamicWorkersCount(
      $accountTag: string!
      $filter: AccountWorkersInvocationsByOwnerAndScriptGroupsFilter_InputObject
    ) {
      viewer {
        accounts(filter: { accountTag: $accountTag }) {
          workersInvocationsByOwnerAndScriptGroups(limit: 10000, filter: $filter) {
            uniq {
              distinctDynamicWorkerCount
            }
          }
        }
      }
    }

    Use variables to set the account and billing-period date range:

    {
      "accountTag": "<ACCOUNT_ID>",
      "filter": {
        "date_geq": "2026-06-01",
        "date_leq": "2026-06-30"
      }
    }

    For more information, refer to Dynamic Workers pricing.
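
    If you run this query from a script, the date range can be computed rather than typed by hand. The sketch below assumes the billing period lines up with a calendar month, which may not match your account. `monthFilter` and `buildGraphqlBody` are hypothetical helpers, and the resulting body is meant to be sent as a POST to the GraphQL Analytics API endpoint with your API token.

```typescript
interface DateRangeFilter {
  date_geq: string;
  date_leq: string;
}

const DYNAMIC_WORKERS_QUERY = `
query getDynamicWorkersCount(
  $accountTag: string!
  $filter: AccountWorkersInvocationsByOwnerAndScriptGroupsFilter_InputObject
) {
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      workersInvocationsByOwnerAndScriptGroups(limit: 10000, filter: $filter) {
        uniq {
          distinctDynamicWorkerCount
        }
      }
    }
  }
}`;

// Build the filter for one calendar month (month is 1-12).
// Assumption: the billing period matches the calendar month.
function monthFilter(year: number, month: number): DateRangeFilter {
  if (month < 1 || month > 12 || Math.floor(month) !== month) {
    throw new RangeError("month must be an integer from 1 to 12");
  }
  const pad = (n: number): string => (n < 10 ? `0${n}` : `${n}`);
  // Day 0 of the following month is the last day of the requested month.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  return {
    date_geq: `${year}-${pad(month)}-01`,
    date_leq: `${year}-${pad(month)}-${pad(lastDay)}`,
  };
}

// Serialize the query and variables into a GraphQL request body.
function buildGraphqlBody(accountTag: string, filter: DateRangeFilter): string {
  return JSON.stringify({ query: DYNAMIC_WORKERS_QUERY, variables: { accountTag, filter } });
}

console.log(monthFilter(2026, 6));
```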

  1. AI Search now supports namespace-level Wrangler commands, making it easier to manage namespaces from your terminal, scripts, and agent workflows.

    The following commands are available:

    Command                               Description
    wrangler ai-search namespace list     List AI Search namespaces
    wrangler ai-search namespace create   Create a new AI Search namespace
    wrangler ai-search namespace get      Get details for a namespace
    wrangler ai-search namespace update   Update a namespace description
    wrangler ai-search namespace delete   Delete an AI Search namespace

    Create a namespace for a new application or tenant directly from the CLI:

    Terminal window
    wrangler ai-search namespace create docs-production --description "Production documentation search"

    List namespaces with pagination or filter by name or description:

    Terminal window
    wrangler ai-search namespace list --search docs --page 1 --per-page 10

    Use --json with list, create, get, and update to return structured output that automation and AI agents can parse directly.

    Instance-level commands also now support a --namespace flag, so you can interact with instances inside a specific namespace from the CLI:

    Terminal window
    wrangler ai-search list --namespace docs-production

    For full usage details, refer to the AI Search Wrangler commands documentation.
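
    For scripted use, it can help to assemble the argument list in one place. The sketch below builds arguments for `wrangler ai-search namespace list` using only the flags shown above. `namespaceListArgs` is a hypothetical helper, and because the shape of the `--json` output is not described here, the sketch stops short of parsing it.

```typescript
interface NamespaceListOptions {
  search?: string;
  page?: number;
  perPage?: number;
  json?: boolean;
}

// Hypothetical helper: build the argument list for `wrangler ai-search namespace list`.
function namespaceListArgs(options: NamespaceListOptions = {}): string[] {
  const args = ["ai-search", "namespace", "list"];
  if (options.search !== undefined) args.push("--search", options.search);
  if (options.page !== undefined) args.push("--page", String(options.page));
  if (options.perPage !== undefined) args.push("--per-page", String(options.perPage));
  if (options.json) args.push("--json");
  return args;
}

console.log(["wrangler", ...namespaceListArgs({ search: "docs", page: 1, perPage: 10, json: true })].join(" "));
```

    Passing the array to a process API such as Node's `execFile("wrangler", args)` avoids shell quoting problems with search terms that contain spaces.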

  1. The Flagship API reference is now available. You can use the Cloudflare API to create and update apps, and to create, update, delete, and list feature flags without using the dashboard.

    For example, create a new boolean flag with the API:

    Terminal window
    curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/flagship/apps/$APP_ID/flags \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      -d '{
        "key": "new-checkout",
        "enabled": true,
        "default_variation": "off",
        "variations": {
          "off": false,
          "on": true
        },
        "rules": []
      }'

    To create an API token, go to Account API Tokens in the Cloudflare dashboard and search for Flagship.

    The API reference includes endpoints for Flagship apps, flags, changelog entries, and flag evaluation. Agents can also use the Flagship reference in the Cloudflare skill to create and manage Flagship resources.

    Refer to the Flagship documentation to learn more about evaluating feature flags from your applications.
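
    The curl example above translates directly into code. The sketch below builds the same request for a boolean flag. `buildCreateFlagRequest` is a hypothetical helper, the `BooleanFlag` type covers only the fields shown in the example, and the check that `default_variation` names a defined variation is a client-side sanity check rather than a documented API rule.

```typescript
// Fields taken from the curl example above; the API reference lists the full schema.
interface BooleanFlag {
  key: string;
  enabled: boolean;
  default_variation: string;
  variations: Record<string, boolean>;
  rules: unknown[];
}

// Hypothetical helper: build the endpoint and fetch() options for creating a flag.
function buildCreateFlagRequest(accountId: string, appId: string, apiToken: string, flag: BooleanFlag) {
  // Client-side sanity check: the default must be one of the defined variations.
  if (!Object.prototype.hasOwnProperty.call(flag.variations, flag.default_variation)) {
    throw new Error(`default_variation "${flag.default_variation}" is not a defined variation`);
  }
  return {
    endpoint: `https://api.cloudflare.com/client/v4/accounts/${accountId}/flagship/apps/${appId}/flags`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiToken}` },
      body: JSON.stringify(flag),
    },
  };
}

const createFlag = buildCreateFlagRequest("<ACCOUNT_ID>", "<APP_ID>", "<API_TOKEN>", {
  key: "new-checkout",
  enabled: true,
  default_variation: "off",
  variations: { off: false, on: true },
  rules: [],
});
console.log(createFlag.endpoint, createFlag.init.body);
```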