Changelog

New updates and improvements at Cloudflare.

Developer platform
  1. Workers AI now supports structured JSON outputs with JSON mode, which allows you to request a structured output response when interacting with AI models.

    This makes it much easier to retrieve structured data from your AI models, and avoids the error-prone need to parse large unstructured text responses to extract your data.

    JSON mode in Workers AI is compatible with the OpenAI SDK's structured outputs response_format API, which can be used directly in a Worker:

    JavaScript
    import { OpenAI } from "openai";

    // Define your JSON schema for a calendar event
    const CalendarEventSchema = {
      type: "object",
      properties: {
        name: { type: "string" },
        date: { type: "string" },
        participants: { type: "array", items: { type: "string" } },
      },
      required: ["name", "date", "participants"],
    };

    export default {
      async fetch(request, env) {
        const client = new OpenAI({
          apiKey: env.OPENAI_API_KEY,
          // Optional: use AI Gateway to bring logs, evals & caching to your AI requests
          // https://developers.cloudflare.com/ai-gateway/usage/providers/openai/
          // baseURL: "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai"
        });

        const response = await client.chat.completions.create({
          model: "gpt-4o-2024-08-06",
          messages: [
            { role: "system", content: "Extract the event information." },
            {
              role: "user",
              content: "Alice and Bob are going to a science fair on Friday.",
            },
          ],
          // Use the `response_format` option to request a structured JSON output
          response_format: {
            // Set json_schema and provide a schema, or json_object and parse it yourself
            type: "json_schema",
            // "calendar_event" is an illustrative schema name
            json_schema: { name: "calendar_event", schema: CalendarEventSchema },
          },
        });

        // The structured output arrives as a JSON string in message.content
        const event = JSON.parse(response.choices[0].message.content);
        return Response.json({
          calendar_event: event,
        });
      },
    };

    To learn more about JSON mode and structured outputs, visit the Workers AI documentation.
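
    If you request json_object instead of json_schema and parse the output yourself, it helps to check the result before using it. The following is a minimal sketch, not part of the Workers AI or OpenAI APIs: parseCalendarEvent and REQUIRED_FIELDS are hypothetical names, and the checks mirror the calendar event schema above.

```javascript
// Hypothetical helper: parse a JSON-mode response and verify that it
// matches the required fields of the calendar event schema.
const REQUIRED_FIELDS = ["name", "date", "participants"];

function parseCalendarEvent(rawText) {
  // JSON.parse throws a SyntaxError if the model returned malformed JSON
  const event = JSON.parse(rawText);
  if (event === null || typeof event !== "object" || Array.isArray(event)) {
    throw new TypeError("Expected a JSON object");
  }
  for (const field of REQUIRED_FIELDS) {
    if (!(field in event)) {
      throw new Error(`Missing required field: ${field}`);
    }
  }
  if (typeof event.name !== "string" || typeof event.date !== "string") {
    throw new TypeError("name and date must be strings");
  }
  const participantsOk =
    Array.isArray(event.participants) &&
    event.participants.every((p) => typeof p === "string");
  if (!participantsOk) {
    throw new TypeError("participants must be an array of strings");
  }
  return event;
}
```

    A caller would pass the raw message text to parseCalendarEvent and handle the thrown error, for example by retrying the request.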

  1. Workflows now supports up to 4,500 concurrent (running) instances, up from the previous limit of 100. This limit will continue to increase during the Workflows open beta. This increase applies to all users on the Workers Paid plan, and takes effect immediately.

    Review the Workflows limits documentation and/or dive into the get started guide to start building on Workflows.

  1. You can now interact with the Images API directly in your Worker.

    This allows more fine-grained control over transformation request flows and cache behavior. For example, you can resize, manipulate, and overlay images without requiring them to be accessible through a URL.

    The Images binding can be configured in the Cloudflare dashboard for your Worker or in the Wrangler configuration file in your project's directory:

    JSONC
    {
      "images": {
        "binding": "IMAGES", // i.e. available in your Worker on env.IMAGES
      },
    }

    Within your Worker code, you can interact with this binding by using env.IMAGES.

    Here's how you can rotate, resize, and blur an image, then output the image as AVIF:

    TypeScript
    const info = await env.IMAGES.info(stream);
    // stream contains a valid image, and width/height is available on the info object
    const response = (
      await env.IMAGES.input(stream)
        .transform({ rotate: 90 })
        .transform({ width: 128 })
        .transform({ blur: 20 })
        .output({ format: "image/avif" })
    ).response();
    return response;

    For more information, refer to Images Bindings.

  1. Super Slurper can now migrate data from any S3-compatible object storage provider to Cloudflare R2. This includes transfers from services like MinIO, Wasabi, Backblaze B2, and DigitalOcean Spaces.

    For more information on Super Slurper and how to migrate data from your existing S3-compatible storage buckets to R2, refer to our documentation.

  1. We've updated the Workers AI text generation models to include context windows and limits definitions and changed our APIs to estimate and validate the number of tokens in the input prompt, not the number of characters.

    This update allows developers to use larger context windows when interacting with Workers AI models, which can lead to better and more accurate results.

    Our catalog page provides more information about each model's supported context window.

  1. Zaraz at zone level to Tag management at account level

    Previously, you could only configure Zaraz by going to each individual zone under your Cloudflare account. Now, if you’d like to get started with Zaraz or manage your existing configuration, you can navigate to the Tag Management section on the Cloudflare dashboard – this will make it easier to compare and configure the same settings across multiple zones.

    These changes will not alter any existing configuration or entitlements for zones you already have Zaraz enabled on. If you’d like to edit existing configurations, you can go to the Tag Setup section of the dashboard, and select the zone you'd like to edit.

  1. Workers for Platforms is an architecture wherein a centralized dispatch Worker processes incoming requests and routes them to isolated sub-Workers, called User Workers.

    Previously, when a new User Worker was uploaded, there was a short delay before it became available for dispatch. This meant that even though an API request could return a 200 OK response, the script might not yet be ready to handle requests, causing unexpected failures for platforms that immediately dispatch to new Workers.

    With this update, first-time uploads of User Workers are now deployed synchronously. A 200 OK response guarantees the script is fully provisioned and ready to handle traffic immediately, ensuring more predictable deployments and reducing errors.
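
    The dispatch pattern described above can be sketched as follows. This is an illustration under assumptions, not a reference implementation: the binding name DISPATCHER and the rule that the first hostname label names the User Worker are both hypothetical choices.

```javascript
// Sketch of a dispatch Worker's request handling.
// Assumption: `env.DISPATCHER` is a dispatch namespace binding (name is hypothetical).
async function dispatch(request, env) {
  // Assumption: the first hostname label identifies the User Worker,
  // e.g. customer-a.example.com -> "customer-a"
  const workerName = new URL(request.url).hostname.split(".")[0];
  try {
    const userWorker = env.DISPATCHER.get(workerName);
    return await userWorker.fetch(request);
  } catch (e) {
    if (e.message.startsWith("Worker not found")) {
      // No User Worker with this name exists in the namespace
      return new Response("", { status: 404 });
    }
    // Any other failure is surfaced as a server error
    return new Response(e.message, { status: 500 });
  }
}
```

    With synchronous first-time uploads, a platform can call a handler like this immediately after the upload API returns 200 OK.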

  1. We've updated the Workers AI pricing to include the latest models and how model usage maps to Neurons.

    • Each model's core input format(s) (tokens, audio seconds, images, etc) now include mappings to Neurons, making it easier to understand how your included Neuron volume is consumed and how you are charged at scale
    • Per-model pricing, instead of the previous bucket approach, allows us to be more flexible on how models are charged based on their size, performance and capabilities. As we optimize each model, we can then pass on savings for that model.
    • You will still only pay for what you consume: Workers AI inference is serverless, and not billed by the hour.

    Going forward, models will be launched with their associated Neuron costs, and we'll be updating the Workers AI dashboard and API to reflect consumption in both raw units and Neurons. Visit the Workers AI pricing page to learn more about Workers AI pricing.

  1. Auto-fixing Workers Name in Git Repo

    Small misconfigurations shouldn’t break your deployments. Cloudflare is introducing automatic error detection and fixes in Workers Builds, identifying common issues in your wrangler.toml or wrangler.jsonc and proactively offering fixes, so you spend less time debugging and more time shipping.

    Here's how it works:

    1. Before running your build, Cloudflare checks your Worker's Wrangler configuration file (wrangler.toml or wrangler.jsonc) for common errors.
    2. Once you submit a build, if Cloudflare finds an error it can fix, it will submit a pull request to your repository that fixes it.
    3. Once you merge this pull request, Cloudflare will run another build.

    We're starting with fixing name mismatches between your Wrangler file and the Cloudflare dashboard, a top cause of build failures.

    This is just the beginning. We want your feedback on what other errors we should catch and fix next. Let us know in the Cloudflare Developers Discord, #workers-and-pages-feature-suggestions.

  1. You can now customize a queue's message retention period, from a minimum of 60 seconds to a maximum of 14 days. Previously, it was fixed to the default of 4 days.

    You can customize the retention period on the settings page for your queue, or using Wrangler:

    Update message retention period
    $ wrangler queues update my-queue --message-retention-period-secs 600
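
    Because the flag takes seconds, a small conversion helper can prevent out-of-range values. This is a sketch using only the limits stated above (60 seconds to 14 days); retentionSeconds is a hypothetical helper, not part of Wrangler.

```javascript
// Limits taken from the announcement: minimum 60 seconds, maximum 14 days.
const MIN_RETENTION_SECS = 60;
const MAX_RETENTION_SECS = 14 * 24 * 60 * 60; // 1,209,600 seconds

// Hypothetical helper: convert a duration to seconds and validate the range.
function retentionSeconds({ days = 0, hours = 0, minutes = 0 } = {}) {
  const secs = ((days * 24 + hours) * 60 + minutes) * 60;
  if (secs < MIN_RETENTION_SECS || secs > MAX_RETENTION_SECS) {
    throw new RangeError(
      `Retention must be between ${MIN_RETENTION_SECS} and ${MAX_RETENTION_SECS} seconds, got ${secs}`
    );
  }
  return secs;
}

console.log(retentionSeconds({ minutes: 10 })); // 600, the value used in the command above
```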

    This feature is available on all new and existing queues. If you haven't used Cloudflare Queues before, get started with the Cloudflare Queues guide.

  1. We've added an example prompt to help you get started with building AI agents and applications on Cloudflare Workers, including Workflows, Durable Objects, and Workers KV.

    You can use this prompt with your favorite AI model, including Claude 3.5 Sonnet, OpenAI's o3-mini, Gemini 2.0 Flash, or Llama 3.3 on Workers AI. Models with large context windows will allow you to paste the prompt directly: provide your own prompt within the <user_prompt></user_prompt> tags.

    Terminal window
    {paste_prompt_here}
    <user_prompt>
    user: Build an AI agent using Cloudflare Workflows. The Workflow should run when a new GitHub issue is opened on a specific project with the label 'help' or 'bug', and attempt to help the user troubleshoot the issue by calling the OpenAI API with the issue title and description, and a clear, structured prompt that asks the model to suggest 1-3 possible solutions to the issue. Any code snippets should be formatted in Markdown code blocks. Documentation and sources should be referenced at the bottom of the response. The agent should then post the response to the GitHub issue. The agent should run as the provided GitHub bot account.
    </user_prompt>

    This prompt is still experimental, but we encourage you to try it out and provide feedback.

  1. Super Slurper now transfers data from cloud object storage providers like AWS S3 and Google Cloud Storage to Cloudflare R2 up to 5x faster than it did before.

    We moved from a centralized service to a distributed system built on the Cloudflare Developer Platform (using Cloudflare Workers, Durable Objects, and Queues) to both improve performance and increase system concurrency. We'll share more details about how we did it soon.

    Time to copy 75,000 objects from AWS S3 to R2 decreased from 15 minutes 30 seconds (old) to 3 minutes 25 seconds (after performance improvements).
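
    As a quick check on those figures, the arithmetic for this single benchmark works out as follows (a worked example only; actual speedups depend on the source provider and object sizes):

```javascript
// Convert the two reported durations to seconds and compare them.
function toSeconds(minutes, seconds) {
  return minutes * 60 + seconds;
}

const beforeSecs = toSeconds(15, 30); // 930 seconds
const afterSecs = toSeconds(3, 25); // 205 seconds
const speedup = beforeSecs / afterSecs;

console.log(speedup.toFixed(1)); // "4.5", in line with "up to 5x faster"
```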

    For more information on Super Slurper and how to migrate data from existing object storage to R2, refer to our documentation.

  1. Previously, all viewers watched "the live edge," or the latest content of the broadcast, synchronously. If a viewer paused for more than a few seconds, the player would automatically "catch up" when playback started again. Seeking through the broadcast was only available once the recording was available after it concluded.

    Starting today, customers can make a small adjustment to the player embed or manifest URL to enable the DVR experience for their viewers. By offering this feature as an opt-in adjustment, our customers are empowered to pick the best experiences for their applications.

    When building a player embed code or manifest URL, just add dvrEnabled=true as a query parameter. There are some things to be aware of when using this option. For more information, refer to DVR for Live.
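
    One way to add the parameter without breaking an existing query string is the standard URL API. This is a minimal sketch; withDvr is a hypothetical helper and the manifest URL in the usage line is a placeholder.

```javascript
// Hypothetical helper: append dvrEnabled=true to a player embed or manifest URL,
// preserving any query parameters that are already present.
function withDvr(url) {
  const u = new URL(url);
  u.searchParams.set("dvrEnabled", "true");
  return u.toString();
}

// Placeholder URL for illustration only
console.log(withDvr("https://example.com/abc/manifest/video.m3u8"));
```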

  1. Import repo or choose template

    You can now create a Worker by:

    • Importing a Git repository: Choose an existing Git repo on your GitHub/GitLab account and set up Workers Builds to deploy your Worker.
    • Deploying a template with Git: Choose from a brand new selection of production ready examples to help you get started with popular frameworks like Astro, Remix and Next or build stateful applications with Cloudflare resources like D1 databases, Workers AI or Durable Objects! When you're ready to deploy, Cloudflare will set up your project by cloning the template to your GitHub/GitLab account, provisioning any required resources and deploying your Worker.

    With every push to your chosen branch, Cloudflare will automatically build and deploy your Worker.

    To get started, go to the Workers dashboard.

    These new features are available today in the Cloudflare dashboard to a subset of Cloudflare customers, and will be coming to all customers in the next few weeks. Don't see it in your dashboard, but want early access? Add your Cloudflare Account ID to this form.

  1. AI Gateway adds two more ways to handle requests: Request Timeouts and Request Retries, making it easier to keep your applications responsive and reliable.

    Timeouts and retries can be used with either the Universal Endpoint or directly with a supported provider.

    Request timeouts

    A request timeout allows you to trigger fallbacks or a retry if a provider takes too long to respond.

    To set a request timeout directly to a provider, add a cf-aig-request-timeout header.

    Provider-specific endpoint example
    curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
      --header 'Authorization: Bearer {cf_api_token}' \
      --header 'Content-Type: application/json' \
      --header 'cf-aig-request-timeout: 5000' \
      --data '{"prompt": "What is Cloudflare?"}'

    Request retries

    A request retry automatically retries failed requests, so you can recover from temporary issues without intervening.

    To set up request retries directly to a provider, add the following headers:

    • cf-aig-max-attempts (number)
    • cf-aig-retry-delay (number)
    • cf-aig-backoff ("constant" | "linear" | "exponential")
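
    The timeout and retry headers can be assembled in one place before calling a provider. The sketch below is illustrative: gatewayHeaders is a hypothetical helper, and treating the timeout and delay values as milliseconds is an assumption based on the 5000 used in the example above.

```javascript
const BACKOFF_MODES = ["constant", "linear", "exponential"];

// Hypothetical helper: build the AI Gateway timeout and retry headers.
// Assumption: timeout and retry delay are expressed in milliseconds.
function gatewayHeaders({ timeoutMs, maxAttempts, retryDelayMs, backoff } = {}) {
  const headers = {};
  if (timeoutMs !== undefined) {
    headers["cf-aig-request-timeout"] = String(timeoutMs);
  }
  if (maxAttempts !== undefined) {
    headers["cf-aig-max-attempts"] = String(maxAttempts);
  }
  if (retryDelayMs !== undefined) {
    headers["cf-aig-retry-delay"] = String(retryDelayMs);
  }
  if (backoff !== undefined) {
    if (!BACKOFF_MODES.includes(backoff)) {
      throw new RangeError(`backoff must be one of: ${BACKOFF_MODES.join(", ")}`);
    }
    headers["cf-aig-backoff"] = backoff;
  }
  return headers;
}
```

    The returned object can be spread into the headers of a fetch call alongside Authorization and Content-Type.
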
  1. AI Gateway has added three new providers: Cartesia, Cerebras, and ElevenLabs, giving you even more options for providers you can use through AI Gateway. Here's a brief overview of each:

    • Cartesia provides text-to-speech models that produce natural-sounding speech with low latency.
    • Cerebras delivers low-latency AI inference to Meta's Llama 3.1 8B and Llama 3.3 70B models.
    • ElevenLabs offers text-to-speech models with human-like voices in 32 languages.

    To get started with AI Gateway, just update the base URL. Here's how you can send a request to Cerebras using cURL:

    Example fetch request
    curl -X POST https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/cerebras/chat/completions \
      --header 'content-type: application/json' \
      --header 'Authorization: Bearer CEREBRAS_TOKEN' \
      --data '{
        "model": "llama-3.3-70b",
        "messages": [
          {
            "role": "user",
            "content": "What is Cloudflare?"
          }
        ]
      }'
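
    The same request can be prepared from JavaScript. This sketch mirrors the cURL command above; cerebrasChatRequest is a hypothetical helper that only builds the URL and options, so the caller decides when to send them with fetch.

```javascript
// Hypothetical helper: build the URL and fetch options for a Cerebras chat
// completion routed through AI Gateway. All argument values are placeholders.
function cerebrasChatRequest({ accountTag, gateway, token, prompt }) {
  const url = `https://gateway.ai.cloudflare.com/v1/${accountTag}/${gateway}/cerebras/chat/completions`;
  const init = {
    method: "POST",
    headers: {
      "content-type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({
      model: "llama-3.3-70b",
      messages: [{ role: "user", content: prompt }],
    }),
  };
  return { url, init };
}

// Usage inside an async function:
//   const { url, init } = cerebrasChatRequest({ accountTag, gateway, token, prompt: "What is Cloudflare?" });
//   const response = await fetch(url, init);
```
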
  1. Screenshot of Terraform defining a Zone

    Cloudflare's v5 Terraform Provider is now generally available. With this release, Terraform resources are now automatically generated based on OpenAPI Schemas. This change brings alignment across our SDKs, API documentation, and now Terraform Provider. The new provider boosts coverage by increasing support for API properties to 100%, adding 25% more resources, and adding more than 200 data sources. Going forward, this will also reduce the barriers to bringing more resources into Terraform across the broader Cloudflare API. This is a small but important step toward making more of our platform manageable through GitOps, so you can manage Cloudflare just like you do your other infrastructure.

    The Cloudflare Terraform Provider v5 is a ground-up rewrite of the provider and introduces breaking changes for some resource types. Please refer to the upgrade guide for best practices, or the blog post on automatically generating Cloudflare's Terraform Provider for more information about the approach.

  1. We've revamped the Workers Metrics dashboard.

    Now you can easily compare metrics across Worker versions, understand the current state of a gradual deployment, and review key Workers metrics in a single view. This new interface enables you to:

    • Drag-and-select using a graphical timepicker for precise metric selection.
    • Use histograms to visualize cumulative metrics, allowing you to bucket and compare rates over time.
    • Focus on Worker versions by directly interacting with the version numbers in the legend.
    • Monitor and compare active gradual deployments.
    • Track error rates across versions, grouped by both version and invocation status.
    • Measure how Smart Placement improves request duration.

    Learn more about metrics.

  1. Workers for Platforms customers can now attach static assets (HTML, CSS, JavaScript, images) directly to User Workers, removing the need to host separate infrastructure to serve the assets.

    This allows your platform to serve entire front-end applications from Cloudflare's global edge, utilizing caching for fast load times, while supporting dynamic logic within the same Worker. Cloudflare automatically scales its infrastructure to handle high traffic volumes, enabling you to focus on building features without managing servers.

    What you can build

    Static Sites: Host and serve HTML, CSS, JavaScript, and media files directly from Cloudflare's network, ensuring fast loading times worldwide. This is ideal for blogs, landing pages, and documentation sites because static assets can be efficiently cached and delivered closer to the user, reducing latency and enhancing the overall user experience.

    Full-Stack Applications: Combine asset hosting with Cloudflare Workers to power dynamic, interactive applications. If you're an e-commerce platform, you can serve your customers' product pages and run inventory checks from within the same Worker.

    index.js
    export default {
      async fetch(request, env) {
        const url = new URL(request.url);

        // Check real-time inventory
        if (url.pathname === "/api/inventory/check") {
          const product = url.searchParams.get("product");
          const inventory = await env.INVENTORY_KV.get(product);
          return new Response(inventory);
        }

        // Serve static assets (HTML, CSS, images)
        return env.ASSETS.fetch(request);
      },
    };

    Get Started: Upload static assets using the Workers for Platforms API or Wrangler. For more information, visit our Workers for Platforms documentation.

  1. You can now transform HTML elements with streamed content using HTMLRewriter.

    Methods like replace, append, and prepend now accept Response and ReadableStream values as Content.

    This can be helpful in a variety of situations. For instance, you may have a Worker in front of an origin, and want to replace an element with content from a different source. Prior to this change, you would have to load all of the content from the upstream URL and convert it into a string before replacing the element. This slowed down overall response times.

    Now, you can pass the Response object directly into the replace method, and HTMLRewriter will immediately start replacing the content as it is streamed in. This makes responses faster.

    index.js
    class ElementRewriter {
      async element(element) {
        // able to replace elements while streaming content
        // the fetched body is not buffered into memory as part
        // of the replace
        let res = await fetch("https://upstream-content-provider.example");
        element.replace(res);
      }
    }

    export default {
      async fetch(request, env, ctx) {
        let response = await fetch("https://site-to-replace.com");
        return new HTMLRewriter()
          .on("[data-to-replace]", new ElementRewriter())
          .transform(response);
      },
    };

    For more information, see the HTMLRewriter documentation.

  1. We have released new Workers bindings API methods, allowing you to connect Workers applications to AI Gateway directly. These methods simplify how Workers calls AI services behind your AI Gateway configurations, removing the need to use the REST API and manually authenticate.

    To add an AI binding to your Worker, include the following in your Wrangler configuration file:

    Add an AI binding to your Worker.
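
    A minimal sketch of that configuration, assuming a wrangler.jsonc file. The binding name AI is an assumption; it determines the property name on env (here, env.AI).

```jsonc
{
  "ai": {
    "binding": "AI" // assumed name; available in your Worker on env.AI
  }
}
```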

    With the new AI Gateway binding methods, you can now:

    • Send feedback and update metadata with patchLog.
    • Retrieve detailed log information using getLog.
    • Execute universal requests to any AI Gateway provider with run.

    For example, to send feedback and update metadata using patchLog:

    Send feedback and update metadata using patchLog:
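
    A hedged sketch of such a call, assuming an AI binding named AI and a gateway with the placeholder ID "my-gateway". The feedback, score, and metadata values are illustrative, and sendFeedback is a hypothetical wrapper.

```javascript
// Hypothetical wrapper around the AI Gateway binding's patchLog method.
// Assumptions: `env.AI` is the AI binding and "my-gateway" is your gateway ID.
async function sendFeedback(env, logId) {
  const gateway = env.AI.gateway("my-gateway");
  await gateway.patchLog(logId, {
    feedback: 1, // illustrative: positive feedback
    score: 100, // illustrative score
    metadata: { user: "123" }, // illustrative metadata
  });
}
```
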
  1. Browser Rendering now supports 10 concurrent browser instances per account and 10 new instances per minute, up from the previous limits of 2.

    This allows you to launch more browser tasks from Cloudflare Workers.

    To manage concurrent browser sessions, you can use Queues or Workflows:

    index.js
    import puppeteer from "@cloudflare/puppeteer";

    export default {
      async queue(batch, env) {
        for (const message of batch.messages) {
          const browser = await puppeteer.launch(env.BROWSER);
          const page = await browser.newPage();
          try {
            // The queue payload is delivered on message.body
            await page.goto(message.body.url, {
              waitUntil: message.body.waitUntil,
            });
            // Process page...
          } finally {
            await browser.close();
          }
        }
      },
    };
  1. Stream's generated captions leverage Workers AI to automatically transcribe audio and provide captions to the player experience. We have added support for these languages:

    • cs - Czech
    • nl - Dutch
    • fr - French
    • de - German
    • it - Italian
    • ja - Japanese
    • ko - Korean
    • pl - Polish
    • pt - Portuguese
    • ru - Russian
    • es - Spanish

    For more information, learn about adding captions to videos.

  1. Hyperdrive now automatically configures your Cloudflare Tunnel to connect to your private database.

    Automatic configuration of Cloudflare Access and Service Token in the Cloudflare dashboard for Hyperdrive.

    When creating a Hyperdrive configuration for a private database, you only need to provide your database credentials and set up a Cloudflare Tunnel within the private network where your database is accessible. Hyperdrive will automatically create the Cloudflare Access, Service Token, and Policies needed to secure and restrict your Cloudflare Tunnel to the Hyperdrive configuration.

    To create a Hyperdrive for a private database, you can follow the Hyperdrive documentation. You can still manually create the Cloudflare Access, Service Token, and Policies if you prefer.

    This feature is available from the Cloudflare dashboard.

  1. You can now have up to 1000 Workers KV namespaces per account.

    Workers KV namespace limits were increased from 200 to 1000 for all accounts. Higher limits for Workers KV namespaces enable better organization of key-value data, such as by category, tenant, or environment.

    Consult the Workers KV limits documentation for the rest of the limits. This increased limit is available for both the Free and Paid Workers plans.