Changelog
New updates and improvements at Cloudflare.
The latest release of the Agents SDK ↗ makes it easier to run long work in the background, drive turns through one entry point, and keep chat agents working through deploys, evictions, and reconnects.
This release adds first-class detached (background) sub-agent runs with live progress and durable milestones, a single `runTurn` turn-admission entry point, and a large round of recovery and reliability fixes that continue converging `@cloudflare/think` and `@cloudflare/ai-chat` onto one model.

`runAgentTool` can now dispatch a sub-agent without blocking the calling turn. A detached run returns a handle immediately and is owned by a durable, eviction-surviving backbone instead of being abandoned when the dispatching turn ends.

```js
class OrdersAgent extends Think {
  async startImport(input) {
    // Fire-and-forget, or wire a durable completion callback
    // (by method name, like schedule()):
    await this.runAgentTool(ImportAgent, {
      input,
      detached: { onFinish: "onImportDone", maxBudgetMs: 60 * 60 * 1000 },
    });
  }

  // result.status: "completed" | "error" | "aborted" | "interrupted"
  async onImportDone(run, result) {}
}
```

Highlights:
- Durable, exactly-once-on-the-happy-path completion via a warm fast path plus a self-scheduling reconcile backbone that survives eviction and deploys.
- Bounded. An absolute `maxBudgetMs` ceiling (default 24h) and `cancelAgentTool(runId)` keep abandoned runs from holding a concurrency slot forever.
- `detached: { notify: true }` lets a finished background run inject a message back into the chat so the model reacts to the result, with no hand-wired `onFinish` needed.
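As a rough sketch of what a completion callback might do with the four `result.status` values listed in the example, the helper below maps each status to a follow-up action. Only the status strings come from the release notes. The `DetachedResult` shape, the `followUpFor` helper, and the action names are assumptions made for illustration and are not part of the SDK.

```typescript
// Status values are taken from the release notes; everything else is hypothetical.
type DetachedStatus = "completed" | "error" | "aborted" | "interrupted";

interface DetachedResult {
  status: DetachedStatus;
}

type FollowUp = "notify-user" | "retry-later" | "log-and-drop";

// Decide what the parent agent does once a detached run has finished.
function followUpFor(result: DetachedResult): FollowUp {
  switch (result.status) {
    case "completed":
      return "notify-user";
    case "interrupted":
      // The run was cut short rather than failing by itself, so a retry is plausible.
      return "retry-later";
    case "error":
    case "aborted":
      return "log-and-drop";
  }
}
```

A callback such as `onImportDone(run, result)` could call `followUpFor(result)` and branch on the returned action.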
Sub-agents can also report mid-run progress that rides their own turn stream back to the parent's connected clients:
```js
// Inside the child sub-agent:
await this.reportProgress({
  fraction: 0.6,
  phase: "deploying",
  message: "Generating menu page…",
});
```

Progress surfaces on `AgentToolRunState.progress` via `useAgentToolEvents`, so a background-runs tray can render a live bar without drilling in, and the latest snapshot is persisted for inspection after eviction. Naming a `milestone` promotes a signal to a durable, replayable row, and `detached: { onMilestones }` can surface a milestone as a synthetic chat message (`"narrate"` for a cheap status line, or `"react"` to drive a model turn).

`@cloudflare/think` adds a public `runTurn(options)` facade that unifies turn admission behind a single `mode`:

```js
await this.runTurn({ mode: "wait", messages }); // saveMessages / continueLastTurn
await this.runTurn({ mode: "submit", messages }); // durable submitMessages
await this.runTurn({ mode: "stream", messages }); // chat()
```

`stream` mode accepts array and function inputs to match `wait` mode, and all entry points now route through a shared internal admission path that throws a clear error on nested blocking admissions that previously could deadlock.

A large part of this release continues hardening recovery and converging
`@cloudflare/think` and `@cloudflare/ai-chat` onto one model:

- Stream stall watchdog. `AIChatAgent` can detect and recover from a hung model/transport stream via the opt-in `chatStreamStallTimeoutMs` watchdog. With `chatRecovery` enabled the stall routes into the same bounded-recovery machinery a deploy or eviction uses; otherwise it surfaces as a terminal stream error so the spinner clears.
- Interrupted tool-call repair. `AIChatAgent` now repairs a transcript with a dead server-tool call before re-entering inference (parity with `@cloudflare/think`), so a recovered turn no longer fails with `AI_MissingToolResultsError`. An overridable `repairInterruptedToolPart(part)` hook lets apps customize the repaired shape.
- Stuck status after reconnect. Fixed AI SDK `status` getting stuck when a reconnect races a turn that has been accepted but has not started streaming yet, so the UI now renders the in-flight turn instead of settling on `ready`.
- Live "recovering…" on connect. `AIChatAgent` now replays the recovering status to a client that connects mid-recovery, so `useAgentChat`'s `isRecovering` reflects in-progress recovery immediately instead of appearing frozen.
- Terminal connection failures. The client stops reconnecting on terminal WebSocket close events and exposes them via `connectionError`/`onConnectionError` on `AgentClient`, `useAgent`, and `useAgentChat`.
- Agent-tool child recovery. A healthy long-running sub-agent run is no longer abandoned as `interrupted` after a deploy (both `@cloudflare/think` and `AIChatAgent`).
- Workflows from sub-agent facets. Agent Workflows can now start from sub-agent facets, with callbacks and Workflow RPC routed back to the originating facet.
- Plus forward-progress crediting convergence, broadcast-first give-up ordering, an event-driven auto-continuation barrier, and structured row-size compaction in `AIChatAgent`.
- Shared chat React core. A new `agents/chat/react` entry exposes `useAgentChat`, transport helpers, and shared wire types, with `syncMessagesToServer` for server-authoritative transcript storage. `@cloudflare/think/react` and `@cloudflare/ai-chat/react` are now thin wrappers over it.
- Optional `ai` peer. The root `agents` and `@cloudflare/codemode` runtimes no longer reference AI SDK types, so they bundle without `ai`/`zod` installed; AI-specific entry points still require the peer when imported. `just-bash` likewise moves to an optional peer used only by the skills bash runner.
- Code Mode. The default `DynamicWorkerExecutor` timeout increases from 30s to 60s, executions now dispose the dynamically-loaded Worker and its RPC stub after each run (fixing a flaky isolate-shutdown assertion), connector imports are cleaned up, and the outer MCP tool-call context is passed to `openApiMcpServer` request callbacks.
- Voice. Voice turns now support AI SDK `fullStream` responses (and warn when `textStream` is used).
- MCP. `McpAgent` server-to-client requests can now be sent from callbacks that do not inherit the agent's async context, including callbacks reached through Worker Loader RPC.
- Experimental: server actions and channels. This release lays groundwork for guarded server actions (`action()`/`getActions()` with a durable replay ledger and approvals) and a unified channels surface (`configureChannels()`, `deliverNotice()`). Both are experimental and their APIs may change, so we don't recommend depending on them yet.
To update to the latest version:
```sh
# Use the command for your package manager.
npm i agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest @cloudflare/codemode@latest @cloudflare/voice@latest
yarn add agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest @cloudflare/codemode@latest @cloudflare/voice@latest
pnpm add agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest @cloudflare/codemode@latest @cloudflare/voice@latest
bun add agents@latest @cloudflare/think@latest @cloudflare/ai-chat@latest @cloudflare/codemode@latest @cloudflare/voice@latest
```

Refer to the Think documentation, Code Mode documentation, and Agents documentation for more information.
Durable Objects now supports a `us` jurisdiction, letting you create Durable Objects that only run and store data within the United States. Use the `us` jurisdiction when you need to keep a Durable Object's compute and storage inside the United States to meet data residency requirements.

Create a namespace restricted to the `us` jurisdiction the same way as any other jurisdiction:

```js
// Worker
export default {
  async fetch(request, env) {
    const usSubnamespace = env.MY_DURABLE_OBJECT.jurisdiction("us");
    const stub = usSubnamespace.getByName("general");
    return stub.fetch(request);
  },
};
```

Workers may still access Durable Objects constrained to the `us` jurisdiction from anywhere in the world. The jurisdiction constraint only controls where the Durable Object itself runs and persists data.

For the full list of supported jurisdictions, refer to Data location - Restrict Durable Objects to a jurisdiction.
AI Search now gives you more control over similarity cache freshness. Similarity cache helps reduce latency and inference cost by reusing responses for semantically similar queries.
With these updates, you can choose how long responses are eligible for reuse and clear cached responses when they may be stale.
Previously, AI Search cached responses for a fixed duration of 30 days. Cached responses now use the instance's `cache_ttl` setting, and the default is 48 hours.

You can set `cache_ttl` when creating or updating an instance to choose a cache duration from 10 minutes to 6 days.

Use a shorter TTL when your source content changes frequently and freshness is more important. Use a longer TTL when your content is stable and you want more cache reuse.

For example, set `cache_ttl` to `518400` to retain cached responses for 6 days:

```json
{ "cache_ttl": 518400 }
```

You can also purge all cached responses for an instance on demand. Purging cached responses does not delete indexed content or source files. It prevents AI Search from reusing previous cached responses, so subsequent similar queries generate fresh answers and repopulate the cache.

```sh
curl -X POST "https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/ai-search/instances/$INSTANCE_NAME/purge_cache" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"
```

You can also purge cached responses from the instance settings page in the Cloudflare dashboard.

Refer to similarity cache for the full list of supported `cache_ttl` values and more details about cache behavior.
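Because `cache_ttl` is expressed in seconds, a small conversion helper can make the intent easier to read. The sketch below assumes only the bounds quoted above (10 minutes to 6 days). The `toCacheTtl` helper and the `Duration` type are hypothetical, and the service may accept only specific values within that range, so check the supported list before relying on an arbitrary number.

```typescript
// Bounds quoted in the text above: 10 minutes to 6 days, in seconds.
const MIN_CACHE_TTL_SECONDS = 10 * 60;
const MAX_CACHE_TTL_SECONDS = 6 * 24 * 60 * 60;

interface Duration {
  days?: number;
  hours?: number;
  minutes?: number;
}

// Hypothetical helper: converts a duration to seconds and rejects values
// outside the documented range.
function toCacheTtl(duration: Duration): number {
  const seconds =
    (duration.days ?? 0) * 86400 +
    (duration.hours ?? 0) * 3600 +
    (duration.minutes ?? 0) * 60;
  if (seconds < MIN_CACHE_TTL_SECONDS || seconds > MAX_CACHE_TTL_SECONDS) {
    throw new RangeError(`cache_ttl of ${seconds}s is outside the 10 minute to 6 day range`);
  }
  return seconds;
}

// Request body fragments for creating or updating an instance.
const sixDays = { cache_ttl: toCacheTtl({ days: 6 }) }; // 518400
const twoDays = { cache_ttl: toCacheTtl({ hours: 48 }) }; // 172800
```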
Workflows makes it easier to build reliable multi-step applications that can recover when downstream systems fail. Rollback handlers now receive the original step context via a `ctx` object for the step being rolled back. This includes `ctx.step.name`, `ctx.step.count`, `ctx.attempt`, and the step `config` with defaults applied.

The step configuration includes the retry and timeout settings used for that step, so you can customize your step recovery logic according to those fields.

```ts
await step.do(
  "create charge",
  async () => {
    const charge = await createCharge();
    return { chargeId: charge.id };
  },
  {
    rollback: async ({ ctx, output, error }) => {
      // `output` is the value returned by the step being rolled back.
      const { chargeId } = output as { chargeId: string };
      await refundCharge(chargeId, {
        // `ctx` is the original step context, including step name, count, attempt, and config.
        reason: `${ctx.step.name}: ${error.message}`,
      });
    },
    rollbackConfig: {
      // `rollbackConfig` controls retries and timeout for the rollback handler.
      retries: { limit: 3, delay: "30 seconds", backoff: "linear" },
      timeout: "5 minutes",
    },
  },
);
```

Refer to rollback options to learn more.
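One way a rollback handler might use the extra context is to build a more descriptive reason string. The sketch below is an assumption-heavy illustration: `step.name`, `step.count`, and `attempt` are named in the text, but the `RollbackCtx` type, the `ctx.config` location and shape, and the `rollbackReason` helper are hypothetical.

```typescript
// `step.name`, `step.count`, and `attempt` come from the text above.
// The `config` location and shape are assumptions for illustration.
interface RollbackCtx {
  step: { name: string; count: number };
  attempt: number;
  config: {
    retries: { limit: number; delay: string | number; backoff?: string };
    timeout: string | number;
  };
}

// Hypothetical helper: records which step ran, which attempt failed,
// and the retry limit the step was configured with.
function rollbackReason(ctx: RollbackCtx, error: Error): string {
  const { name, count } = ctx.step;
  const limit = ctx.config.retries.limit;
  return `${name} #${count}, attempt ${ctx.attempt} (retry limit ${limit}): ${error.message}`;
}
```

Inside a handler, the result could be passed as the `reason` in place of the shorter template string used in the example.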
R2 SQL now supports window functions, `SELECT DISTINCT`, set operations, and additional aggregates, making it easier to write analytical queries without preprocessing your data elsewhere.

R2 SQL is Cloudflare's serverless, distributed SQL engine for querying Apache Iceberg ↗ tables stored in R2 Data Catalog.

- Window functions: `ROW_NUMBER`, `RANK`, `DENSE_RANK`, `PERCENT_RANK`, `CUME_DIST`, `NTILE`, `LAG`, `LEAD`, `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`, and aggregates with an `OVER (...)` clause, including `PARTITION BY` and explicit frames
- QUALIFY: filter rows based on a window function result
- DISTINCT: `SELECT DISTINCT`, `DISTINCT ON (...)`, and the `DISTINCT` modifier on aggregates such as `COUNT(DISTINCT ...)`
- Set operations: `UNION`, `UNION ALL`, `INTERSECT`, and `EXCEPT`
- Grouping extensions: `GROUPING SETS`, `ROLLUP`, and `CUBE`
- Exact aggregates: `MEDIAN`, `PERCENTILE_CONT`, `ARRAY_AGG`, and `STRING_AGG`

```sql
SELECT customer_id, region,
  ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) AS rank_in_region
FROM my_namespace.sales_data
```

```sql
SELECT customer_id, region, total_amount
FROM my_namespace.sales_data
QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= 3
```

```sql
SELECT customer_id FROM my_namespace.sales_data
EXCEPT
SELECT customer_id FROM my_namespace.archived_sales
```

The named `WINDOW` clause is not supported. Inline the `OVER (...)` specification at each call site. For the full syntax reference, refer to the SQL reference. For supported features and performance guidance, refer to Limitations and best practices.
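To make the semantics of the `QUALIFY ROW_NUMBER() ... <= 3` query concrete, the sketch below reproduces the same "top N rows per partition" logic over an in-memory array. It is an illustration of what the query computes, not of how R2 SQL executes it. The `Sale` type and `topPerRegion` helper are hypothetical, and rows with equal `total_amount` may be ordered differently by a real engine.

```typescript
interface Sale {
  customer_id: string;
  region: string;
  total_amount: number;
}

// Emulates:
//   QUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= n
function topPerRegion(rows: Sale[], n: number): Sale[] {
  // PARTITION BY region: group rows by their region value.
  const byRegion: Record<string, Sale[]> = {};
  for (const row of rows) {
    if (!byRegion[row.region]) byRegion[row.region] = [];
    byRegion[row.region].push(row);
  }

  const kept: Sale[] = [];
  for (const region of Object.keys(byRegion)) {
    // ORDER BY total_amount DESC: ROW_NUMBER is the 1-based position in this order.
    const ranked = byRegion[region].slice().sort((a, b) => b.total_amount - a.total_amount);
    // QUALIFY ... <= n: keep only the first n rows of each partition.
    for (const row of ranked.slice(0, n)) kept.push(row);
  }
  return kept;
}
```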
The Routes page in the Cloudflare dashboard now shows the routes across all of your connectors — Cloudflare Mesh and Cloudflare Tunnel routes alongside Cloudflare WAN and Magic Transit static routes — in a single table, instead of a separate routes view per product.

From the unified Routes page you can:
- Visualize your network with an interactive map that shows how your destinations flow through to your connectors — including equal-cost multi-path (ECMP) routes where the same prefix is served by several connectors. Select a node to filter the table down to the routes behind it.
- See every route in one table, with its destination, type, connector, priority, and source, and filter or sort to find what you need.
- Create, edit, and delete routes of any supported type without leaving the page. When adding a Cloudflare WAN or Magic Transit static route, you now pick the next hop by connector name instead of typing its IP.
- Manage virtual networks from a dedicated tab.
- Test a route to see which connector and next hop a destination resolves to before you commit a change.
To find it, go to Networking > Routes in the dashboard sidebar.
Your existing routes, APIs, and configurations are unchanged. This is a dashboard experience that brings them together in one place. Learn how to add routes and manage virtual networks.
Durable Objects now supports two new location hints for Asia-Pacific: `apac-ne` (Northeast Asia-Pacific) and `apac-se` (Southeast Asia-Pacific). Use `apac-ne` or `apac-se` when you want finer-grained placement within Asia-Pacific rather than the broader `apac` hint.

Use the new hints the same way as any other `locationHint`:

```js
// Northeast Asia-Pacific (Japan, Korea, etc.)
const stubNE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-ne" });
// Southeast Asia-Pacific (Singapore, Indonesia, etc.)
const stubSE = env.MY_DURABLE_OBJECT.get(id, { locationHint: "apac-se" });
```

If your users are spread across all of Asia-Pacific, the existing `apac` hint remains the right choice. Only reach for `apac-ne` or `apac-se` when your traffic is clearly concentrated in one sub-region and you want to minimize round-trip time to that audience. The default behavior, and what we generally recommend, is to not add a location hint unless absolutely needed. Without a hint, the Durable Object is created as close as possible to the initializing request to reduce latency.

As with all location hints, these are best-effort suggestions. Cloudflare will place the Durable Object in a nearby data center, not necessarily the exact hinted location.
For the full list of supported hints, refer to Data location — Provide a location hint.
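The guidance on choosing between `apac`, `apac-ne`, and `apac-se` can be sketched as a small decision helper. This assumes you have already decided that an Asia-Pacific hint is warranted and that you have request counts per sub-region. The `pickApacHint` helper, the `ApacTraffic` shape, and the 0.8 threshold for "clearly concentrated" are all hypothetical.

```typescript
type ApacHint = "apac" | "apac-ne" | "apac-se";

// Hypothetical input: request counts observed per sub-region.
interface ApacTraffic {
  northeast: number;
  southeast: number;
}

// `threshold` is an assumed cutoff for "clearly concentrated in one sub-region".
function pickApacHint(traffic: ApacTraffic, threshold = 0.8): ApacHint {
  const total = traffic.northeast + traffic.southeast;
  if (total === 0) return "apac"; // no data: fall back to the broad hint
  if (traffic.northeast / total >= threshold) return "apac-ne";
  if (traffic.southeast / total >= threshold) return "apac-se";
  return "apac"; // traffic is spread out
}
```

The returned value could then be passed as `locationHint` when getting a stub, as in the example above.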
Durable Objects now remain alive for the duration of active outbound connections created via `connect()` or an outbound WebSocket. Previously, a Durable Object would be evicted after 70-140 seconds of no incoming traffic, even if the object had an open outbound connection, which is a common pattern when streaming responses from a large language model (LLM) over TCP or an outbound WebSocket.

With this change, each active outbound connection prevents eviction. Once all outbound connections close, the standard 70-140 second inactivity window applies before the Durable Object is evicted.
If you are building agents on Cloudflare, this is especially relevant. An agent that streams tokens from an LLM while calling models, or that performs long-running tasks over an outbound connection, now stays alive for the duration of that connection instead of being evicted mid-stream.
Limits:
- Each outbound connection keeps the Durable Object alive for a maximum of 15 minutes. After 15 minutes, the connection stops preventing eviction (the connection itself continues operating), and the standard eviction rules resume.
- The Durable Object's existing per-account instance limits still apply.
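The 15-minute cap can be worked through with a little arithmetic. The sketch below computes when a connection stops preventing eviction, using only the rule stated above: protection ends when the connection closes or 15 minutes after it opened, whichever comes first. The `OutboundConnection` shape and both helpers are hypothetical, timestamps are plain epoch milliseconds, and the 70-140 second inactivity window that follows is not modeled.

```typescript
const MAX_KEEP_ALIVE_MS = 15 * 60 * 1000; // 15-minute cap per outbound connection

interface OutboundConnection {
  openedAt: number; // epoch milliseconds
  closedAt?: number; // undefined while the connection is still open
}

// When this one connection stops preventing eviction: at close, or at the cap.
function protectionEndsAt(conn: OutboundConnection): number {
  const cap = conn.openedAt + MAX_KEEP_ALIVE_MS;
  return conn.closedAt === undefined ? cap : Math.min(conn.closedAt, cap);
}

// With several connections, the object stays protected until the last one ends.
function protectedUntil(conns: OutboundConnection[]): number | undefined {
  if (conns.length === 0) return undefined;
  return Math.max(...conns.map((conn) => protectionEndsAt(conn)));
}
```

For example, a stream opened at `t = 0` and never closed stops protecting the object at `t = 900000` ms, even though the connection itself keeps operating.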
For more information, refer to Lifecycle of a Durable Object.
AI agents can now deploy Workers to Cloudflare without first requiring a user to sign up, open a browser-based OAuth flow, click through the dashboard, or create an API token. When an agent tries to deploy without Cloudflare credentials, Wrangler can tell it to rerun with `--temporary`, then deploy the Worker to a temporary preview account.

To try this with your agent, update to Wrangler 4.102.0 or later, make sure you are logged out (`wrangler logout`), and then ask your agent to build something and deploy it to Cloudflare. The agent should follow Wrangler's output and deploy using the `--temporary` flag.

```sh
wrangler deploy --temporary
```

The temporary deployment stays live for 60 minutes. During that window, the agent can verify the Worker, redeploy changes, and return both the live Worker URL and claim URL. Opening the claim URL lets you sign in to or create a Cloudflare account and make the temporary account permanent.
Temporary preview accounts currently support a limited set of products, including Workers, Workers Static Assets, Workers KV, D1, Durable Objects, Hyperdrive, Queues, and SSL/TLS certificates. For supported products, limits, and claim behavior, refer to Claim deployments (temporary accounts).
For more context, refer to Temporary Cloudflare Accounts for Agents ↗.
`exec()` is now available for Containers. Use `this.ctx.container.exec()` to start processes inside a running Container, stream standard input and output, inspect exit codes, and signal each process.

Call `exec()` from a class extending `Container`, or from another Durable Object through `this.ctx.container`. The associated Container must already be running.

This example starts the Container when needed, then reads its Node.js version:

```js
// src/index.js
import { Container } from "@cloudflare/containers";

export class MyContainer extends Container {
  async readVersion() {
    if (!this.ctx.container.running) {
      await this.start();
    }

    const process = await this.ctx.container.exec(["node", "--version"]);
    const output = await process.output();
    const decoder = new TextDecoder();

    return {
      exitCode: output.exitCode,
      stdout: decoder.decode(output.stdout),
      stderr: decoder.decode(output.stderr),
    };
  }
}
```

The command array starts an executable directly, without an implicit shell. Invoke a shell explicitly for pipes, redirects, or variable expansion.
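Invoking a shell explicitly usually means passing the whole command line as a single argument to the shell. The sketch below shows one way to build such a command array. It assumes the container image ships a POSIX `sh`, and the `viaShell` helper and the example command are hypothetical.

```typescript
// Wraps a shell command line so it can be used as an exec() command array.
// Assumes the container image provides a POSIX `sh`.
function viaShell(commandLine: string): string[] {
  return ["sh", "-c", commandLine];
}

// Runs the executable directly: no pipes, redirects, or variable expansion.
const direct: string[] = ["node", "--version"];

// Runs through the shell, so the pipe and redirect are interpreted.
const piped: string[] = viaShell("node --version | tr -d 'v' > /tmp/node-version.txt");
```

Either array could then be passed to `this.ctx.container.exec(...)` as in the example above.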
One RPC method can coordinate multiple `exec()` calls in one caller-to-Durable Object round trip. It can also pass byte-oriented `ReadableStream` input or return streamed output with flow control.

For options and streaming examples, refer to Execute commands.
You can create PlanetScale Postgres and MySQL databases from Cloudflare and bill PlanetScale database usage through your Cloudflare account as a pay-as-you-go customer. Cloudflare contract customers will be able to add PlanetScale usage to their contract in July, so reach out to your Cloudflare account team if you are interested.
Create a PlanetScale database from the Cloudflare dashboard to check out globally distributed Workers optimized for regional data access.
PlanetScale databases created from Cloudflare work with Workers through Hyperdrive. Hyperdrive manages database connection pools and query caching, so you can use PlanetScale as a centralized relational database for Workers applications without changing your database drivers, object-relational mapping (ORM) libraries, or SQL tooling.
PlanetScale usage appears on your Cloudflare invoice each billing period as a dollar total at PlanetScale's standard pricing ↗. You can introspect per-database billing usage via PlanetScale's dashboard ↗.
When you create a PlanetScale database from the Cloudflare dashboard, you receive the same PlanetScale developer experience, including development branches, query insights, and Model Context Protocol (MCP) server support for agents.
To get started, refer to PlanetScale Postgres and MySQL with Hyperdrive.
You can now configure Artifacts namespaces, repos, and tokens directly from the Cloudflare dashboard.
Artifacts is Git-compatible storage that lets you store repos on Cloudflare and interact with them using standard Git workflows.
You can view and create namespaces, which are top-level containers for repos:

You can view, create, fork, and search repos within a namespace:

You can open a repo to view its files and copy its Git remote URL.

You can also provision tokens directly from the dashboard to scope Git access to a single repo, with read tokens for clone, fetch, and pull workflows, or write tokens when a client needs to push changes.
To get started, go to the Cloudflare dashboard ↗ and select Storage & databases > Artifacts.
If you are enrolled in the Artifacts beta, you can use the dashboard to set up Artifacts. If you would like to join the beta, complete the request form ↗.
The latest release of the Agents SDK ↗ makes it easier to build agents that can safely interact with real systems and keep working through interruptions.
Agents can now browse websites through Browser Run, write code against external tools through Code Mode, use client-provided tools when delegating to Think sub-agents, and recover more reliably from deploys, Durable Object evictions, and connection churn.
Agents can now use Browser Run through a single durable `browser_execute` tool. Instead of choosing from a fixed list of actions, the model writes code against the Chrome DevTools Protocol (CDP) and can inspect pages, capture screenshots, read rendered content, debug frontend behavior, and interact with live browser sessions.

```js
const browserTools = createBrowserTools({
  ctx: this.ctx,
  browser: this.env.BROWSER,
  loader: this.env.LOADER,
  session: { mode: "dynamic" },
});
```

Browser sessions can be one-time, reused, or promoted from one-time to persistent during a run. This is useful when an agent needs a human to log in, complete MFA, or approve a sensitive action. The run can pause, keep the same tabs and cookies, and resume after approval.

The browser tools also add Live View URLs, optional session recording, and quick actions such as `browser_markdown`, `browser_extract`, `browser_links`, and `browser_scrape` for one-shot browsing tasks.

Code Mode now uses `createCodemodeRuntime`, connectors, and a durable execution log. This lets you give a model one `codemode` tool instead of a large prompt full of tool definitions. The model can discover the capabilities it needs, write code against typed globals, and reuse saved snippets.

```js
const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [new GithubConnector(this.ctx, this.env, connection)],
});

const result = streamText({
  model,
  messages,
  tools: { codemode: runtime.tool() },
});
```

When the code reaches an approval-gated action, the runtime pauses execution and returns a pending approval. After approval, completed calls replay from the durable log, the approved action runs, and the same code continues. This makes it practical to build agents that create issues, update external systems, or perform other side effects without custom pause-and-resume logic for every tool.
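The pause, approve, and replay cycle described above can be illustrated with a toy model. The sketch below is not how Code Mode is implemented. It is a minimal in-memory stand-in, assuming a log that stores one result per completed call, and the `Call`, `Outcome`, and `runWithReplay` names are hypothetical.

```typescript
interface Call {
  name: string;
  needsApproval: boolean;
  run: () => string;
}

type Outcome =
  | { status: "done"; results: string[] }
  | { status: "pending"; action: string };

// `log` stands in for the durable execution log: one stored result per completed call.
function runWithReplay(calls: Call[], log: string[], approved: string[]): Outcome {
  for (let i = 0; i < calls.length; i++) {
    const call = calls[i];
    // Completed on an earlier pass: reuse the stored result and do not re-run.
    if (i < log.length) continue;
    // Approval-gated and not yet approved: pause and report what is pending.
    if (call.needsApproval && approved.indexOf(call.name) === -1) {
      return { status: "pending", action: call.name };
    }
    // First execution of this call: run it and record the result.
    log.push(call.run());
  }
  return { status: "done", results: log.slice() };
}
```

Running the same list of calls a second time with the approval granted skips the calls already in the log, so their side effects happen only once.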
Think sub-agents can now use client-defined tools over the RPC `chat()` path. A parent agent can pass tool schemas with `clientTools` and resolve tool calls through `onClientToolCall`. This lets delegated agents use caller-provided capabilities without requiring a browser WebSocket.

```js
await child.chat(message, callback, {
  signal,
  clientTools: [
    {
      name: "get_user_timezone",
      description: "Get the caller's timezone",
      parameters: { type: "object" },
    },
  ],
  onClientToolCall: async ({ toolName, input }) => {
    return runClientTool(toolName, input);
  },
});
```

Think Workflows also improve `step.prompt()`. A prompt step now runs a full agentic turn before returning structured output, so the agent can call tools before producing the typed result. This makes Workflow steps more useful for durable triage, research, and approval flows.

The unified Think execute tool can also include `cdp.*` browser capabilities alongside `state.*` and `tools.*` when Browser Run is bound.

Voice clients can route assistant audio to a specific output device. Use `outputDeviceId` with `useVoiceAgent`, or call `client.setOutputDevice()` from the framework-agnostic client.

```js
const voice = useVoiceAgent({
  agent: "MyVoiceAgent",
  outputDeviceId: selectedSpeakerId,
});
```

Browsers without speaker-selection support continue playing through the default output device and report a non-fatal `outputDeviceError`.

This release includes several fixes for production agents:
- `useAgent` and `AgentClient` handle WebSocket replacement more reliably during reconnects and configuration changes.
- Chat stream replay is more reliable after reconnects, deploys, and provider errors.
- Fiber recovery continues across multi-pass scans and backs off when recovery hooks keep failing.
- Agent teardown continues even when the request that started teardown is canceled.
- Large session histories use byte-budgeted reads to reduce memory pressure during startup.
To update to the latest version:
```sh
# Use the command for your package manager.
npm i agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
yarn add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
pnpm add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
bun add agents@latest @cloudflare/think@latest @cloudflare/codemode@latest @cloudflare/ai-chat@latest @cloudflare/voice@latest
```

Refer to the Code Mode documentation, Browser tools documentation, Think tools documentation, and Voice documentation for more information.
These updates introduce new features for optimizing and manipulating images with Images:
- New `composite` option: Control how overlays are blended with the base image.
- Percentage widths: Set the dimensions of an overlay as a fraction of the dimensions of the base image.
- New `fit` modes: Use `aspect-crop` to always preserve the target aspect ratio or `scale-up` to always enlarge images.
- New `upscale` parameter: Apply AI upscaling to produce sharper, more detailed results when enlarging images.
We are excited to announce GLM-5.2 on Workers AI, Z.ai's flagship agentic coding model.
`@cf/zai-org/glm-5.2` is a text generation model built for agentic coding workflows. With function calling and reasoning support, it can handle long codebases, multi-step planning, and tool-augmented agents.

Key features and use cases:
- Agentic coding: Designed for autonomous coding tasks, long-horizon planning, and complex software engineering workflows
- Large context window: GLM-5.2 supports up to a 1,048,576 token context window. Workers AI is launching the model with a 262,144 token context window and plans to increase this in the future
- Function calling: Build agents that invoke tools and APIs across multiple conversation turns
- Reasoning: Tackles complex problem-solving and step-by-step reasoning tasks
Use GLM-5.2 through the Workers AI binding (`env.AI.run()`), the REST API at `/run` or `/v1/chat/completions`, or AI Gateway.

Pricing is available on the model page or pricing page.
VPC Network bindings now support the `connect()` Socket API for raw TCP connections to private destinations, in addition to HTTP traffic via `fetch()`.

This means Workers can now open TCP sockets to any private service reachable through the bound Cloudflare Tunnel, Cloudflare Mesh, or Cloudflare WAN on-ramp, such as Redis, Memcached, MQTT, custom binary protocols, or any other TCP-based service.

```jsonc
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  "vpc_networks": [
    {
      "binding": "PRIVATE_NETWORK",
      "network_id": "cf1:network",
      "remote": true
    }
  ]
}
```

```toml
[[vpc_networks]]
binding = "PRIVATE_NETWORK"
network_id = "cf1:network"
remote = true
```

At runtime, use `connect()` on the binding to open a TCP socket to a private destination:

```ts
export default {
  async fetch(request: Request, env: Env) {
    // Open a TCP connection to a private Redis instance
    const socket = await env.PRIVATE_NETWORK.connect("10.0.1.50:6379");

    // Write a Redis PING command
    const writer = socket.writable.getWriter();
    await writer.write(new TextEncoder().encode("PING\r\n"));
    await writer.close();

    return new Response(socket.readable);
  },
};
```

For more details, refer to VPC Networks and the Workers Binding API.
You can now create custom trace spans in your Workers code using `tracing.enterSpan()`. Custom spans appear alongside the automatic platform instrumentation (fetch calls, KV reads, D1 queries, and other platform operations) in your traces and OpenTelemetry exports, with correct parent-child nesting.

The API is available via `import { tracing } from "cloudflare:workers"` or through the handler context as `ctx.tracing`:

```ts
import { tracing } from "cloudflare:workers";

export default {
  async fetch(request, env, ctx) {
    return tracing.enterSpan("handleRequest", async (span) => {
      span.setAttribute("url.path", new URL(request.url).pathname);
      const data = await env.MY_KV.get("key");
      return new Response(data);
    });
  },
};
```

Spans nest automatically based on the JavaScript async context, and are auto-ended when the callback returns or its returned promise settles. The `Span` object provides `setAttribute(key, value)` for attaching metadata and an `isTraced` property to check whether the current request is being sampled.
Tracing must be enabled in your Wrangler configuration for spans to be recorded.
For full API details and examples, refer to Custom spans.
AI Gateway logs now capture the user agent of the client that made each request, making it easier to identify which SDK, library, or application sent the traffic flowing through your gateway. For example, you can tell apart requests coming from `openai-python` versus a custom application or a Cloudflare Worker.

The user agent appears alongside the other details in each log entry, and you can filter logs by user agent (equals, does not equal, or contains) in the dashboard.
For more information, refer to Logging.
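The three user agent filter operators can be sketched as a simple predicate over log entries. This is only an illustration of the matching semantics, assuming case-sensitive comparison. The `GatewayLog` shape, the operator identifiers, and the sample user agent strings are hypothetical and do not reflect the AI Gateway API.

```typescript
type UserAgentOp = "equals" | "does_not_equal" | "contains";

// Hypothetical log entry shape for illustration.
interface GatewayLog {
  id: string;
  userAgent: string;
}

// Returns true when the entry's user agent satisfies the filter.
function matchesUserAgent(entry: GatewayLog, op: UserAgentOp, value: string): boolean {
  switch (op) {
    case "equals":
      return entry.userAgent === value;
    case "does_not_equal":
      return entry.userAgent !== value;
    case "contains":
      return entry.userAgent.indexOf(value) !== -1;
  }
}

// Made-up sample entries.
const logs: GatewayLog[] = [
  { id: "1", userAgent: "openai-python/1.0.0" },
  { id: "2", userAgent: "my-custom-app" },
];

// Keep only requests whose user agent mentions openai-python.
const fromOpenAiPython = logs.filter((entry) => matchesUserAgent(entry, "contains", "openai-python"));
```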
You can now filter the Metrics tab for a Durable Objects namespace by an individual Durable Object's ID or name in the Cloudflare dashboard. Previously, metrics charts only showed aggregate, namespace-level data, making it difficult to isolate the behavior of a specific object.
Start typing an ID or name into the filter and select a match from the autocomplete dropdown. The autocomplete only shows objects with invocations during the selected time range, so an object that does not appear has not been invoked in that window. This does not necessarily mean the object has been deleted. Every chart on the page updates to reflect only the selected object. This makes it easier to identify and investigate a single Durable Object when debugging a high-traffic object, an error spike, or unexpected storage usage. Clear the filter to return to namespace-level metrics.
Metrics are powered by the GraphQL Analytics API, so standard analytics behavior such as ingestion delay and sampling applies.
For more information, refer to Metrics and analytics.
Cloudflare's Terraform v5 Provider makes it easy for developers to manage their Cloudflare infrastructure using a configuration as code approach. It releases every 2-3 weeks ↗ to ensure that you can always manage the latest features in the platform. This week, we launched Terraform v5.20.0, which adds 24 new resources, bumps the underlying Go SDK to cloudflare-go v7, and includes a range of bug fixes and state upgraders based on community feedback.
- cloudflare_ai_search_namespace: Manage AI Search namespaces
- cloudflare_custom_csr: Manage custom certificate signing requests
- cloudflare_dls_prefix_binding: Manage DLS regional service prefix bindings
- cloudflare_flagship_app: Manage Flagship feature flag apps
- cloudflare_flagship_flag: Manage Flagship feature flags
- cloudflare_google_tag_gateway: Manage Google Tag Gateway
- cloudflare_load_balancer_monitor_group: Manage load balancer monitor groups
- cloudflare_oauth_client: Manage IAM OAuth clients
- cloudflare_origin_cloud_region: Manage origin cloud regions (v2 endpoints)
- cloudflare_secrets_store: Manage Secrets Store instances
- cloudflare_secrets_store_secret: Manage Secrets Store secrets
- cloudflare_share: Manage resource shares
- cloudflare_share_recipient: Manage share recipients
- cloudflare_share_resource: Manage shared resources
- cloudflare_zero_trust_device_deployment_groups: Manage Zero Trust device deployment groups
- cloudflare_zero_trust_dlp_data_class: Manage DLP data classes
- cloudflare_zero_trust_dlp_data_tag: Manage DLP data tags
- cloudflare_zero_trust_dlp_data_tag_category: Manage DLP data tag categories
- cloudflare_zero_trust_dlp_sensitivity_group: Manage DLP sensitivity groups
- cloudflare_zero_trust_dlp_sensitivity_level: Manage DLP sensitivity levels
- cloudflare_zero_trust_dlp_sensitivity_level_order: Manage DLP sensitivity level ordering
- cloudflare_zero_trust_resource_library_application: Manage Zero Trust resource library applications
- cloudflare_zero_trust_resource_library_category: Manage Zero Trust resource library categories
- cloudflare_zero_trust_tunnel_warp_connector_config: Manage WARP connector tunnel configurations
- cache: add create (POST) method for smart_tiered_cache
- cache: update OPCR config to v2 endpoints
- dlp: promote classification Stainless config to main
- dlp: add custom prompt topics endpoint
- email_security_block_sender: state upgrader for v4 to v5 migration
- email_security_impersonation_registry: state upgrader for v4 to v5 migration
- email_security_trusted_domains: state upgrader for v4 to v5 migration
- snippets: add Terraform `id_property` annotations for snippet and snippet_rules
- bump Go SDK to cloudflare-go v7
- account_member: missing upgrade path from v5.0–v5.15
- authenticated_origin_pulls_settings: nil pointer panic
- bot_management: restore `content_bots_protection` handling in model.go
- dns_record: prevent FQDN normalization from swallowing name shortening changes
- list: nullify empty nested objects to prevent inconsistent result after apply
- load_balancer_pool: accept early-v5 object-shape state at schema_version=0
- load_balancer_pool: add `UseStateForUnknown` for `load_shedding` attribute to prevent drift
- r2_custom_domain: restore degraded-response handling in resource.go
- regional_hostname: update cloudflare-go imports from v6 to v7
- secrets_store: fix model/schema parity and guard acceptance tests
- spectrum_application: accept early-v5 object-shape state at schema_version=0
- worker: preserve `observability.traces.propagation_policy` across reads
- worker: add `propagation_policy` to observability defaults
- worker_version: restore handwritten D1 `database_id` handling
- workers_custom_domain: missing `CertId` field in state migration
- workers_script: restore annotations Read workaround stripped by codegen
- zero_trust_access_identity_provider: change `read_only` from computed to optional
- zero_trust_access_identity_provider: add `UseStateForUnknown` to SAML-only config fields
- zero_trust_access_identity_provider: use `UseNonNullStateForUnknown` on scim_config fields
- zero_trust_access_policy: populate `account_id` when migrating zone-scoped v4 state
- zero_trust_access_policy: missing `common_names` transform in migration
- gracefully handle nil pointer dereference when config has `attributes_flat` during migration
- set initial schema version to 500 for all new resources

Extracted `MoveState` nil guard into shared helper
`@cf/moonshotai/kimi-k2.7-code` is now available on Workers AI. Kimi K2.7 Code is a code-optimized variant of the Kimi K2 family, built on a Mixture-of-Experts architecture with 1T total parameters and 32B active per token.

K2.7 Code delivers meaningful gains over K2.6 on coding and agentic benchmarks:
- +21.8% on Kimi Code Bench v2
- +11.0% on Program Bench
- +31.5% on MLS Bench Lite
K2.7 Code uses 30% fewer reasoning tokens compared to K2.6, reducing overthinking and lowering inference cost for reasoning-heavy workloads.
- 262.1k token context window for retaining full conversation history, tool definitions, and codebases across long-running agent sessions
- Long-horizon coding with improved instruction following and higher end-to-end coding task success rates
- Vision inputs for processing images alongside text
- Thinking mode with configurable reasoning depth via `chat_template_kwargs.thinking`
- Multi-turn tool calling for building agents that invoke tools across multiple conversation turns
- Structured outputs with JSON schema support
If you are migrating from Kimi K2.6, note the following:
- K2.7 Code is optimized for coding tasks with improved benchmark performance and reasoning efficiency
- Cached input token pricing is $0.19 per M tokens (vs $0.16 for K2.6)
- API usage is identical — no parameter changes required
Use Kimi K2.7 Code through the Workers AI binding (`env.AI.run()`), the REST API at `/ai/run`, or the OpenAI-compatible endpoint at `/v1/chat/completions`. You can also use AI Gateway with any of these endpoints.

For more information, refer to the Kimi K2.7 Code model page and pricing.
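As a rough illustration of how a request body for the OpenAI-compatible endpoint might be assembled, the sketch below builds a chat payload with the optional `chat_template_kwargs.thinking` setting. `buildKimiRequest` is a hypothetical helper, and treating `thinking` as a boolean is an assumption; check the model page for the values the model actually accepts.

```typescript
// Model ID taken from the announcement.
const KIMI_MODEL = "@cf/moonshotai/kimi-k2.7-code";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface KimiChatRequest {
  model: string;
  messages: ChatMessage[];
  // Assumption: `thinking` is a boolean toggle. The accepted values are not
  // listed in this entry, so verify them against the model page.
  chat_template_kwargs?: { thinking: boolean };
}

// Hypothetical helper: builds a single-turn request body.
function buildKimiRequest(prompt: string, thinking?: boolean): KimiChatRequest {
  const request: KimiChatRequest = {
    model: KIMI_MODEL,
    messages: [{ role: "user", content: prompt }],
  };
  // Leave the field out entirely when the caller has no preference,
  // so the model's default behavior applies.
  if (thinking !== undefined) {
    request.chat_template_kwargs = { thinking };
  }
  return request;
}
```

The resulting object can be serialized with `JSON.stringify()` and sent as the body of a `/v1/chat/completions` request.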
Browser Run's `/snapshot` endpoint now supports a `formats` parameter that lets you return multiple page formats in a single API call. Previously, `/snapshot` returned only HTML content and a screenshot. You can now also include Markdown and the accessibility tree in the same response.

These formats are particularly useful for AI agent workflows:
- Markdown provides a token-efficient representation of page content that LLMs can process directly, without parsing HTML markup.
- The accessibility tree provides a structured representation of a page's elements, including roles, labels, and hierarchy, helping LLMs understand page structure and navigate its contents.
The following example returns a screenshot, Markdown, and the accessibility tree in one call:
```sh
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/snapshot' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "formats": ["screenshot", "markdown", "accessibilityTree"]
  }'
```

```typescript
import Cloudflare from "cloudflare";

const client = new Cloudflare({
  apiToken: process.env["CLOUDFLARE_API_TOKEN"],
});

const snapshot = await client.browserRendering.snapshot.create({
  account_id: process.env["CLOUDFLARE_ACCOUNT_ID"],
  url: "https://example.com/",
  formats: ["screenshot", "markdown", "accessibilityTree"],
});

console.log(snapshot.markdown);
console.log(snapshot.accessibilityTree);
```

```typescript
interface Env {
  BROWSER: BrowserRun;
}

export default {
  async fetch(request, env): Promise<Response> {
    return await env.BROWSER.quickAction("snapshot", {
      url: "https://example.com/",
      formats: ["screenshot", "markdown", "accessibilityTree"],
    });
  },
} satisfies ExportedHandler<Env>;
```

You must request at least two formats. If you only need one, use the respective single-format endpoint such as `/screenshot` or `/markdown`.

Refer to the `/snapshot` documentation for the full list of accepted values.
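Because `/snapshot` requires at least two formats, it can help to check the list before sending a request. The following is a minimal client-side sketch; `buildSnapshotBody` is a hypothetical helper, and counting duplicate entries as a single format is an assumption about how the API treats them.

```typescript
interface SnapshotBody {
  url: string;
  formats: string[];
}

// Hypothetical guard mirroring the "at least two formats" rule.
// It does not check format names, since the full list of accepted
// values lives in the /snapshot documentation.
function buildSnapshotBody(url: string, formats: string[]): SnapshotBody {
  // De-duplicate while preserving the caller's order.
  const unique = formats.filter((format, index) => formats.indexOf(format) === index);
  if (unique.length < 2) {
    throw new Error(
      "/snapshot needs at least two formats; use a single-format endpoint instead",
    );
  }
  return { url, formats: unique };
}
```

For example, `buildSnapshotBody("https://example.com/", ["screenshot", "markdown"])` returns a body that can be passed as the JSON payload, while a single-element list throws before any request is made.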

Customers can now view the number of Dynamic Workers invoked during their billing period from the Workers overview page in the Cloudflare dashboard.
This count reflects the number of Dynamic Workers that Cloudflare would bill for during the selected billing period. Dynamic Workers usage data only goes back to June 1, 2026.
You can also query this count through the GraphQL Analytics API by using `workersInvocationsByOwnerAndScriptGroups` and selecting `distinctDynamicWorkerCount`:

```graphql
query getDynamicWorkersCount(
  $accountTag: string!
  $filter: AccountWorkersInvocationsByOwnerAndScriptGroupsFilter_InputObject
) {
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      workersInvocationsByOwnerAndScriptGroups(limit: 10000, filter: $filter) {
        uniq {
          distinctDynamicWorkerCount
        }
      }
    }
  }
}
```

Use variables to set the account and billing-period date range:

```json
{
  "accountTag": "<ACCOUNT_ID>",
  "filter": {
    "date_geq": "2026-06-01",
    "date_leq": "2026-06-30"
  }
}
```

For more information, refer to Dynamic Workers pricing.
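If you generate the query variables programmatically, the date range can be derived from a year and month. This is a minimal sketch that assumes the billing period lines up with a calendar month, as in the sample variables; `monthFilter` and `buildVariables` are hypothetical helper names, and your actual billing period may differ.

```typescript
interface DateRangeFilter {
  date_geq: string;
  date_leq: string;
}

// Assumption: the billing period is a full calendar month.
// `month` is 1-based (1 = January).
function monthFilter(year: number, month: number): DateRangeFilter {
  if (!Number.isInteger(month) || month < 1 || month > 12) {
    throw new RangeError("month must be an integer from 1 to 12");
  }
  // Day 0 of the following month is the last day of the requested month.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const mm = month < 10 ? `0${month}` : `${month}`;
  return {
    date_geq: `${year}-${mm}-01`,
    date_leq: `${year}-${mm}-${lastDay}`,
  };
}

// Builds the variables object in the same shape as the sample variables.
function buildVariables(accountTag: string, year: number, month: number) {
  return { accountTag, filter: monthFilter(year, month) };
}
```

Calling `buildVariables("<ACCOUNT_ID>", 2026, 6)` produces the same variables as the sample.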
AI Search now supports namespace-level Wrangler commands, making it easier to manage namespaces from your terminal, scripts, and agent workflows.
The following commands are available:
| Command | Description |
| --- | --- |
| `wrangler ai-search namespace list` | List AI Search namespaces |
| `wrangler ai-search namespace create` | Create a new AI Search namespace |
| `wrangler ai-search namespace get` | Get details for a namespace |
| `wrangler ai-search namespace update` | Update a namespace description |
| `wrangler ai-search namespace delete` | Delete an AI Search namespace |

Create a namespace for a new application or tenant directly from the CLI:

```sh
wrangler ai-search namespace create docs-production --description "Production documentation search"
```

List namespaces with pagination or filter by name or description:

```sh
wrangler ai-search namespace list --search docs --page 1 --per-page 10
```

Use `--json` with `list`, `create`, `get`, and `update` to return structured output that automation and AI agents can parse directly.

Instance-level commands also now support a `--namespace` flag, so you can interact with instances inside a specific namespace from the CLI:

```sh
wrangler ai-search list --namespace docs-production
```

For full usage details, refer to the AI Search Wrangler commands documentation.
The Flagship API reference is now available. You can use the Cloudflare API to create and update apps, and to create, update, delete, and list feature flags without using the dashboard.
For example, create a new boolean flag with the API:
```sh
curl https://api.cloudflare.com/client/v4/accounts/$ACCOUNT_ID/flagship/apps/$APP_ID/flags \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{
    "key": "new-checkout",
    "enabled": true,
    "default_variation": "off",
    "variations": {
      "off": false,
      "on": true
    },
    "rules": []
  }'
```

To create an API token, go to Account API Tokens ↗ in the Cloudflare dashboard and search for Flagship.
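If you create flags from a script, the request body can be built with a small typed helper. The sketch below reproduces the payload shape from the example request; `FlagPayload` and `buildBooleanFlag` are hypothetical names, and the field types are inferred from that single example, not taken from the API reference.

```typescript
// Shape inferred from the example request; consult the API reference
// for the authoritative schema.
interface FlagPayload {
  key: string;
  enabled: boolean;
  default_variation: string;
  variations: Record<string, boolean>;
  rules: unknown[];
}

// Hypothetical helper: builds the body for a boolean flag with
// "off"/"on" variations and no targeting rules.
function buildBooleanFlag(
  key: string,
  enabled: boolean,
  defaultVariation: "on" | "off" = "off",
): FlagPayload {
  if (key.trim() === "") {
    throw new Error("Flag key must not be empty");
  }
  return {
    key,
    enabled,
    default_variation: defaultVariation,
    variations: { off: false, on: true },
    rules: [],
  };
}
```

`JSON.stringify(buildBooleanFlag("new-checkout", true))` yields the same JSON document as the `-d` argument in the example request.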
The API reference includes endpoints for Flagship apps, flags, changelog entries, and flag evaluation. Agents can also use the Flagship reference in the Cloudflare skill ↗ to create and manage Flagship resources.
Refer to the Flagship documentation to learn more about evaluating feature flags from your applications.