# Build Agents on Cloudflare
URL: https://developers.cloudflare.com/agents/
import {
CardGrid,
Description,
Feature,
LinkButton,
LinkTitleCard,
PackageManagers,
Plan,
RelatedProduct,
Render,
TabItem,
Tabs,
} from "~/components";
Build and deploy AI-powered Agents on Cloudflare that can autonomously perform tasks, communicate with clients in real time, persist state, execute long-running and recurring tasks on a schedule, send emails, run asynchronous workflows, browse the web, query data from your Postgres database, call AI models, support human-in-the-loop use cases, and more.
#### Ship your first Agent
Use the agent starter template to create your first Agent with the `agents-sdk`:
```sh
# install it
npm create cloudflare@latest agents-starter -- --template=cloudflare/agents-starter
# and deploy it
npx wrangler@latest deploy
```
Head to the guide on [building a chat agent](/agents/getting-started/build-a-chat-agent) to learn how to build and deploy an Agent to production.
If you're already building on [Workers](/workers/), you can install the `agents-sdk` package directly into an existing project:
```sh
npm i agents-sdk
```
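Once installed, defining an Agent takes only a few lines. The following is a minimal sketch; the class name and response text are illustrative placeholders:
```ts
import { Agent } from "agents-sdk";

// A minimal Agent: responds to HTTP requests routed to this instance
export class MyAgent extends Agent {
  async onRequest(request: Request) {
    return new Response("Hello from your Agent!");
  }
}
```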
Dive into the [Agent SDK reference](/agents/api-reference/sdk/) to learn more about how to use the `agents-sdk` package and define an `Agent`.
#### Why build agents on Cloudflare?
We built the `agents-sdk` with a few things in mind:
- **Batteries (state) included**: Agents come with [built-in state management](/agents/examples/manage-and-sync-state/), with the ability to automatically sync state between an Agent and clients, trigger events on state changes, and read+write to each Agent's SQL database.
- **Communicative**: You can connect to an Agent via [WebSockets](/agents/examples/websockets/) and stream updates back to clients in real time. Handle a long-running response from a reasoning model, stream the results of an [asynchronous workflow](/agents/examples/run-workflows/), or build a chat app using the `useAgent` hook included in the `agents-sdk`.
- **Extensible**: Agents are code. Use the [AI models](/agents/examples/using-ai-models/) you want, bring your own headless browser service, pull data from your database hosted in another cloud, and add your own methods to your Agent and call them.
Agents built with `agents-sdk` can be deployed directly to Cloudflare and run on top of [Durable Objects](/durable-objects/) — which you can think of as stateful micro-servers that can scale to tens of millions of instances — and are able to run wherever they need to. Run your Agents close to a user for low-latency interactivity, close to your data for throughput, or anywhere in between.
***
#### Build on the Cloudflare Platform
Build serverless applications and deploy instantly across the globe for exceptional performance, reliability, and scale.
Observe and control your AI applications with caching, rate limiting, request retries, model fallback, and more.
Build full-stack AI applications with Vectorize, Cloudflare’s vector database. Adding Vectorize enables you to perform tasks such as semantic search, recommendations, and anomaly detection, or to provide context and memory to an LLM.
Run machine learning models, powered by serverless GPUs, on Cloudflare's global network.
Build stateful agents that guarantee execution, including automatic retries and persistent state that runs for minutes, hours, days, or weeks.
---
# Configuration
URL: https://developers.cloudflare.com/agents/api-reference/configuration/
import { MetaInfo, Render, Type, WranglerConfig } from "~/components";
An Agent is configured like any other Cloudflare Workers project, and uses [a wrangler configuration](/workers/wrangler/configuration/) file to define where your code is and what services (bindings) it will use.
The typical file structure for an Agent project created from `npm create cloudflare@latest agents-starter -- --template cloudflare/agents-starter` follows:
```sh
.
|-- package-lock.json
|-- package.json
|-- public
|   `-- index.html
|-- src
|   `-- index.ts // your Agent definition
|-- test
|   |-- index.spec.ts // your tests
|   `-- tsconfig.json
|-- tsconfig.json
|-- vitest.config.mts
|-- worker-configuration.d.ts
`-- wrangler.jsonc // your Workers & Agent configuration
```
Below is a minimal `wrangler.jsonc` file that defines the configuration for an Agent, including the entry point, `durable_object` namespace, and code `migrations`:
```jsonc
{
  "$schema": "node_modules/wrangler/config-schema.json",
  "name": "agents-example",
  "main": "src/index.ts",
  "compatibility_date": "2025-02-23",
  "compatibility_flags": ["nodejs_compat"],
  "durable_objects": {
    "bindings": [
      {
        // Required:
        "name": "MyAgent", // How your Agent is called from your Worker
        "class_name": "MyAgent", // Must match the class name of the Agent in your code
        // Optional: set this if the Agent is defined in another Worker script
        "script_name": "the-other-worker"
      },
    ],
  },
  "migrations": [
    {
      "tag": "v1",
      // Mandatory for the Agent to store state
      "new_sqlite_classes": ["MyAgent"],
    },
  ],
  "observability": {
    "enabled": true,
  },
}
```
The configuration includes:
- A `main` field that points to the entry point of your Agent, which is typically a TypeScript (or JavaScript) file.
- A `durable_objects` field that defines the [Durable Object namespace](/durable-objects/reference/glossary/) that your Agents will run within.
- A `migrations` field that defines the code migrations that your Agent will use. This field is mandatory and must contain at least one migration. The `new_sqlite_classes` field is mandatory for the Agent to store state.
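To see how this configuration surfaces in code, here is a minimal sketch of a Worker that uses the `MyAgent` binding defined above (the `Env` shape follows the Agents SDK conventions covered in the SDK reference):
```ts
import { Agent, AgentNamespace, getAgentByName } from "agents-sdk";

interface Env {
  // Matches the "name" field of the durable_objects binding above
  MyAgent: AgentNamespace<MyAgent>;
}

export class MyAgent extends Agent<Env> {
  // Your Agent implementation
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Create or retrieve a named Agent instance and forward the request to it
    const agent = await getAgentByName(env.MyAgent, "my-unique-agent-id");
    return agent.fetch(request);
  },
};
```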
---
# API Reference
URL: https://developers.cloudflare.com/agents/api-reference/
import { DirectoryListing } from "~/components"
---
# Agents SDK
URL: https://developers.cloudflare.com/agents/api-reference/sdk/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
At its most basic, an Agent is a JavaScript class that extends the `Agent` class from the `agents-sdk` package. An Agent encapsulates all of the logic for an Agent, including how clients can connect to it, how it stores state, the methods it exposes, and any error handling.
```ts
import { Agent } from "agents-sdk";

class MyAgent extends Agent {
  // Define methods on the Agent
}

export default MyAgent;
```
An Agent can have many (millions of) instances: each instance is a separate micro-server that runs independently of the others. This allows Agents to scale horizontally: an Agent can be associated with a single user, or many thousands of users, depending on the agent you're building.
Instances of an Agent are addressed by a unique identifier: that identifier (ID) can be the user ID, an email address, GitHub username, a flight ticket number, an invoice ID, or any other identifier that uniquely identifies the instance and whom it is acting on behalf of.
## The Agent class
Writing an Agent requires you to define a class that extends the `Agent` class from the `agents-sdk` package.
An Agent has the following class methods:
```ts
import { Agent, Connection, ConnectionContext, WSMessage } from "agents-sdk";

interface Env {
  // Define environment variables & bindings here
}

// Pass the Env as a TypeScript type argument
// Any services connected to your Agent or Worker as Bindings
// are then available on this.env.
class MyAgent extends Agent<Env> {
  // Called when an Agent is started (or woken up)
  async onStart() {
    // Can access this.env and this.state
    console.log("Agent started");
  }

  // Called when a HTTP request is received
  // Can be connected to routeAgentRequest to automatically route
  // requests to an individual Agent.
  async onRequest(request: Request) {
    console.log("Received request!");
  }

  // Called when a WebSocket connection is established
  async onConnect(connection: Connection, ctx: ConnectionContext) {
    console.log("Connected!");
    // Check the request at ctx.request
    // Authenticate the client
    // Give them the OK.
    connection.accept();
  }

  // Called for each message received on the WebSocket connection
  async onMessage(connection: Connection, message: WSMessage) {
    console.log(`message from client ID: ${connection.id}`);
    // Send messages back to the client
    connection.send("Hello!");
  }

  // WebSocket error and disconnection (close) handling.
  async onError(connection: Connection, error: unknown): Promise<void> {
    console.error(`WS error: ${error}`);
  }

  async onClose(connection: Connection, code: number, reason: string, wasClean: boolean): Promise<void> {
    console.log(`WS closed: ${code} - ${reason} - wasClean: ${wasClean}`);
    connection.close();
  }

  // Called when the Agent's state is updated
  // via this.setState or the useAgent hook from the agents-sdk/react package.
  async onStateUpdate(state: any) {
    // 'state' will be typed if you supply a type parameter to the Agent class.
  }
}

export default MyAgent;
```
:::note
To learn more about how to manage state within an Agent, refer to the documentation on [managing and syncing state](/agents/examples/manage-and-sync-state/).
:::
You can also define your own methods on an Agent: it's technically valid to publish an Agent that only exposes your own methods, and to create/get Agents directly from a Worker.
Your own methods can access the Agent's environment variables and bindings on `this.env`, read and update state via `this.state` and `this.setState`, and call other methods on the Agent via `this.yourMethodName`.
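For example, a custom method might look like the following sketch (the method name and state fields are illustrative):
```ts
import { Agent } from "agents-sdk";

export class MyAgent extends Agent {
  // A custom method: callable from other methods on this Agent,
  // or from a Worker that holds a handle to this instance.
  async recordVisit(userId: string) {
    // Read state on this.state; update it with this.setState
    this.setState({ ...this.state, lastUserId: userId });
    // Bindings and environment variables are available on this.env
  }
}
```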
## Calling Agents from Workers
You can create and run an instance of an Agent directly from a Worker in one of three ways:
1. Using the `routeAgentRequest` helper: this will automatically map requests to an individual Agent based on the `/agents/:agent/:name` URL pattern. The value of `:agent` will be the name of your Agent class converted to `kebab-case`, and the value of `:name` will be the name of the Agent instance you want to create or retrieve.
2. Calling `getAgentByName`, which will create a new Agent instance if none exists by that name, or retrieve a handle to an existing instance.
3. The [Durable Objects stub API](/durable-objects/api/id/), which provides a lower level API for creating and retrieving Agents.
These three patterns are shown below: we recommend using either `routeAgentRequest` or `getAgentByName`, which help avoid some boilerplate.
```ts
import { Agent, AgentNamespace, getAgentByName, routeAgentRequest } from 'agents-sdk';

interface Env {
  // Define your Agent on the environment here
  // Passing your Agent class as a TypeScript type parameter allows you to call
  // methods defined on your Agent.
  MyAgent: AgentNamespace<MyAgent>;
}

export default {
  async fetch(request, env, ctx): Promise<Response> {
    // Routed addressing
    // Automatically routes HTTP requests and/or WebSocket connections to /agents/:agent/:name
    // Best for: connecting React apps directly to Agents using useAgent from agents-sdk/react
    (await routeAgentRequest(request, env)) || Response.json({ msg: 'no agent here' }, { status: 404 });

    // Named addressing
    // Best for: convenience method for creating or retrieving an agent by name/ID.
    let namedAgent = getAgentByName(env.MyAgent, 'my-unique-agent-id');
    // Pass the incoming request straight to your Agent
    let namedResp = (await namedAgent).fetch(request);

    // Durable Objects-style addressing
    // Best for: controlling ID generation, associating IDs with your existing systems,
    // and customizing when/how an Agent is created or invoked
    const id = env.MyAgent.newUniqueId();
    const agent = env.MyAgent.get(id);
    // Pass the incoming request straight to your Agent
    let resp = await agent.fetch(request);

    return Response.json({ hello: 'visit https://developers.cloudflare.com/agents for more' });
  },
} satisfies ExportedHandler<Env>;

export class MyAgent extends Agent<Env> {
  // Your Agent implementation goes here
}
```
---
# Calling LLMs
URL: https://developers.cloudflare.com/agents/concepts/calling-llms/
import { Render } from "~/components";
### Understanding LLM providers and model types
Different LLM providers offer models optimized for specific types of tasks. When building AI systems, choosing the right model is crucial for both performance and cost efficiency.
#### Reasoning Models
Models like OpenAI's o1, Anthropic's Claude, and DeepSeek's R1 are particularly well-suited for complex reasoning tasks. These models excel at:
- Breaking down problems into steps
- Following complex instructions
- Maintaining context across long conversations
- Generating code and technical content
For example, when implementing a travel booking system, you might use a reasoning model to analyze travel requirements and generate appropriate booking strategies.
#### Instruction Models
Models like GPT-4 and Claude Instant are optimized for following straightforward instructions efficiently. They work well for:
- Content generation
- Simple classification tasks
- Basic question answering
- Text transformation
These models are often more cost-effective for straightforward tasks that do not require complex reasoning.
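As an illustrative sketch, you might route tasks to different model tiers based on their complexity. The task categories and model identifiers here are assumptions, not recommendations:
```ts
// A sketch of routing tasks to different model tiers.
type TaskKind = "reasoning" | "instruction";

function pickModel(kind: TaskKind): string {
  switch (kind) {
    case "reasoning":
      // Complex, multi-step analysis: use a reasoning model
      return "o1";
    case "instruction":
      // Straightforward generation or classification: use a cheaper instruction model
      return "gpt-4o-mini";
  }
}

// Usage: choose the model before calling your provider's API
const model = pickModel("instruction");
```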
---
# Human in the Loop
URL: https://developers.cloudflare.com/agents/concepts/human-in-the-loop/
import { Render, Note, Aside } from "~/components";
### What is Human-in-the-Loop?
Human-in-the-Loop (HITL) workflows integrate human judgment and oversight into automated processes. These workflows pause at critical points for human review, validation, or decision-making before proceeding. This approach combines the efficiency of automation with human expertise and oversight where it matters most.

#### Understanding Human-in-the-Loop workflows
In a Human-in-the-Loop workflow, processes are not fully automated. Instead, they include designated checkpoints where human intervention is required. For example, in a travel booking system, a human may want to confirm the booking before an agent follows through with the transaction. The workflow manages this interaction, ensuring that:
1. The process pauses at appropriate review points
2. Human reviewers receive necessary context
3. The system maintains state during the review period
4. Review decisions are properly incorporated
5. The process continues once approval is received
### Best practices for Human-in-the-Loop workflows
#### Long-Term State Persistence
Human review processes do not operate on predictable timelines. A reviewer might need days or weeks to make a decision, especially for complex cases requiring additional investigation or multiple approvals. Your system needs to maintain perfect state consistency throughout this period, including:
- The original request and context
- All intermediate decisions and actions
- Any partial progress or temporary states
- Review history and feedback
:::note[Tip]
[Durable Objects](/durable-objects/) provide an ideal solution for managing state in Human-in-the-Loop workflows, offering persistent compute instances that maintain state for hours, weeks, or months.
:::
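A minimal sketch of this pattern, assuming an Agent that records a pending action in its state and exposes explicit approve/reject methods (all names and the state shape here are hypothetical):
```ts
import { Agent } from "agents-sdk";

interface BookingState {
  pendingBooking?: { flight: string; price: number };
  status: "idle" | "awaiting_approval" | "confirmed" | "rejected";
}

// State type is passed as the second type parameter to Agent
export class BookingAgent extends Agent<unknown, BookingState> {
  // Pause: record the proposed action and wait for a human decision.
  async proposeBooking(flight: string, price: number) {
    this.setState({ pendingBooking: { flight, price }, status: "awaiting_approval" });
  }

  // Resume: a reviewer approves the pending action.
  async approve() {
    const booking = this.state.pendingBooking;
    if (!booking) return;
    // ... call the booking API here (hypothetical) ...
    this.setState({ pendingBooking: undefined, status: "confirmed" });
  }

  // Or the reviewer rejects it.
  async reject() {
    this.setState({ pendingBooking: undefined, status: "rejected" });
  }
}
```
Because the Agent's state is persisted, the review can happen days or weeks later without losing the original request or context.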
#### Continuous Improvement Through Evals
Human reviewers play a crucial role in evaluating and improving LLM performance. Implement a systematic evaluation process where human feedback is collected not just on the final output, but on the LLM's decision-making process. This can include:
- Decision Quality Assessment: Have reviewers evaluate the LLM's reasoning process and decision points, not just the final output.
- Edge Case Identification: Use human expertise to identify scenarios where the LLM's performance could be improved.
- Feedback Collection: Gather structured feedback that can be used to fine-tune the LLM or adjust the workflow. [AI Gateway](/ai-gateway/evaluations/add-human-feedback/) can be a useful tool for setting up an LLM feedback loop.
#### Error handling and recovery
Robust error handling is essential for maintaining workflow integrity. Your system should gracefully handle various failure scenarios, including reviewer unavailability, system outages, or conflicting reviews. Implement clear escalation paths for handling exceptional cases that fall outside normal parameters.
The system should maintain stability during paused states, ensuring that no work is lost even during extended review periods. Consider implementing automatic checkpointing that allows workflows to be resumed from the last stable state after any interruption.
---
# Concepts
URL: https://developers.cloudflare.com/agents/concepts/
import { DirectoryListing } from "~/components";
---
# Tools
URL: https://developers.cloudflare.com/agents/concepts/tools/
### What are tools?
Tools enable AI systems to interact with external services and perform actions. They provide a structured way for agents and workflows to invoke APIs, manipulate data, and integrate with external systems. Tools form the bridge between AI decision-making capabilities and real-world actions.
### Understanding tools
In an AI system, tools are typically implemented as function calls that the AI can use to accomplish specific tasks. For example, a travel booking agent might have tools for:
- Searching flight availability
- Checking hotel rates
- Processing payments
- Sending confirmation emails
Each tool has a defined interface specifying its inputs, outputs, and expected behavior. This allows the AI system to understand when and how to use each tool appropriately.
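For example, a tool's interface might be described like this sketch. The shape is illustrative; protocols like MCP formalize it with JSON Schema:
```ts
// A sketch of a tool definition: name, description, and a typed handler.
interface Tool<Input, Output> {
  name: string;
  description: string;
  execute: (input: Input) => Promise<Output>;
}

const searchFlights: Tool<{ from: string; to: string; date: string }, { flights: string[] }> = {
  name: "search_flights",
  description: "Search available flights between two airports on a date",
  execute: async ({ from, to, date }) => {
    // Call your flight search API here (hypothetical placeholder)
    return { flights: [`${from}-${to} on ${date}`] };
  },
};
```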
### Common tool patterns
#### API integration tools
The most common type of tools are those that wrap external APIs. These tools handle the complexity of API authentication, request formatting, and response parsing, presenting a clean interface to the AI system.
#### Model Context Protocol (MCP)
The [Model Context Protocol](https://modelcontextprotocol.io/introduction) provides a standardized way to define and interact with tools. Think of it as an abstraction on top of APIs designed for LLMs to interact with external resources. MCP defines a consistent interface for:
- **Tool Discovery**: Systems can dynamically discover available tools
- **Parameter Validation**: Tools specify their input requirements using JSON Schema
- **Error Handling**: Standardized error reporting and recovery
- **State Management**: Tools can maintain state across invocations
#### Data processing tools
Tools that handle data transformation and analysis are essential for many AI workflows. These might include:
- CSV parsing and analysis
- Image processing
- Text extraction
- Data validation
---
# Agents
URL: https://developers.cloudflare.com/agents/concepts/what-are-agents/
import { Render } from "~/components";
### What are agents?
An agent is an AI system that can autonomously execute tasks by making decisions about tool usage and process flow. Unlike traditional automation that follows predefined paths, agents can dynamically adapt their approach based on context and intermediate results. Agents are also distinct from co-pilots (e.g. traditional chat applications) in that they can fully automate a task, as opposed to simply augmenting and extending human input.
- **Agents** → non-linear, non-deterministic (can change from run to run)
- **Workflows** → linear, deterministic execution paths
- **Co-pilots** → augmentative AI assistance requiring human intervention
### Example: Booking vacations
If this is your first time working with agents, this example will illustrate how an agent works in a context like booking a vacation. If you are already familiar with agents, feel free to skip ahead.
Imagine you're trying to book a vacation. You need to research flights, find hotels, check restaurant reviews, and keep track of your budget.
#### Traditional workflow automation
A traditional automation system follows a predetermined sequence:
- Takes specific inputs (dates, location, budget)
- Calls predefined API endpoints in a fixed order
- Returns results based on hardcoded criteria
- Cannot adapt if unexpected situations arise

#### AI Co-pilot
A co-pilot acts as an intelligent assistant that:
- Provides hotel and itinerary recommendations based on your preferences
- Can understand and respond to natural language queries
- Offers guidance and suggestions
- Requires human decision-making and action for execution

#### Agent
An agent combines AI's ability to make judgments with the ability to call the relevant tools to execute the task. An agent's output will be nondeterministic given:
- Real-time availability and pricing changes
- Dynamic prioritization of constraints
- Ability to recover from failures
- Adaptive decision-making based on intermediate results

An agent can dynamically generate an itinerary and execute on booking reservations, similar to what you would expect from a travel agent.
### Three primary components of agent systems
- **Decision Engine**: Usually an LLM (Large Language Model) that determines action steps
- **Tool Integration**: APIs, functions, and services the agent can utilize
- **Memory System**: Maintains context and tracks task progress
#### How agents work
Agents operate in a continuous loop of the following steps (sketched in code after this list):
1. **Observing** the current state or task
2. **Planning** what actions to take, using AI for reasoning
3. **Executing** those actions using available tools (often APIs or [MCPs](https://modelcontextprotocol.io/introduction))
4. **Learning** from the results (storing results in memory, updating task progress, and preparing for next iteration)
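In code, this loop might look like the following sketch. Every helper here (`plan`, `execute`) is a hypothetical stub standing in for a real model call or tool invocation:
```ts
// An illustrative sketch of the agent loop; all helpers are hypothetical stubs.
type Action = { tool: string; input: string };
type Result = { summary: string; taskComplete: boolean };

// Stub: ask an AI model to pick the next action (placeholder logic)
async function plan(task: string, memory: string[]): Promise<Action> {
  return { tool: "search", input: task };
}

// Stub: invoke the chosen tool (placeholder logic)
async function execute(action: Action): Promise<Result> {
  return { summary: `ran ${action.tool}`, taskComplete: true };
}

async function runAgentLoop(task: string) {
  const memory: string[] = []; // memory system: tracks context and progress
  let done = false;
  while (!done) {
    // Observe + Plan: decide the next action from the task and memory
    const action = await plan(task, memory);
    // Execute: run the action with an available tool
    const result = await execute(action);
    // Learn: store the result and check whether the task is complete
    memory.push(result.summary);
    done = result.taskComplete;
  }
}
```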
---
# Workflows
URL: https://developers.cloudflare.com/agents/concepts/workflows/
import { Render } from "~/components";
## What are workflows?
A workflow is the orchestration layer that coordinates how an agent's components work together. It defines the structured paths through which tasks are processed, tools are called, and results are managed. While agents make dynamic decisions about what to do, workflows provide the underlying framework that governs how those decisions are executed.
### Understanding workflows in agent systems
Think of a workflow like the operating procedures of a company. The company (agent) can make various decisions, but how those decisions get implemented follows established processes (workflows). For example, when you book a flight through a travel agent, they might make different decisions about which flights to recommend, but the process of actually booking the flight follows a fixed sequence of steps.
Let's examine the components that make up a basic agent workflow.
### Core components of a workflow
A workflow typically consists of several key elements (see the sketch after this list):
1. **Input Processing**
The workflow defines how inputs are received and validated before being processed by the agent. This includes standardizing formats, checking permissions, and ensuring all required information is present.
2. **Tool Integration**
Workflows manage how external tools and services are accessed. They handle authentication, rate limiting, error recovery, and ensuring tools are used in the correct sequence.
3. **State Management**
The workflow maintains the state of ongoing processes, tracking progress through multiple steps and ensuring consistency across operations.
4. **Output Handling**
Results from the agent's actions are processed according to defined rules, whether that means storing data, triggering notifications, or formatting responses.
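As a sketch using [Cloudflare Workflows](/workflows/), these elements map naturally onto defined steps. The step names, payload shape, and logic here are illustrative assumptions:
```ts
import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from "cloudflare:workers";

type Env = {};
type Params = { destination: string };

// An illustrative Workflow: each step maps to one of the elements above.
export class BookingWorkflow extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    // 1. Input processing: validate the payload before doing any work
    const input = await step.do("validate-input", async () => {
      if (!event.payload.destination) throw new Error("destination is required");
      return event.payload;
    });

    // 2. Tool integration: call an external service (hypothetical placeholder)
    const flights = await step.do("search-flights", async () => {
      return [`flight to ${input.destination}`];
    });

    // 3. State management: each step's result is persisted, so the
    // Workflow can resume from here after a failure or restart.
    // 4. Output handling: return the processed result
    return { flights };
  }
}
```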
---
# Getting started
URL: https://developers.cloudflare.com/agents/getting-started/
import { DirectoryListing } from "~/components"
---
# Testing your Agents
URL: https://developers.cloudflare.com/agents/getting-started/testing-your-agent/
import { Render, PackageManagers, WranglerConfig } from "~/components"
Because Agents run on Cloudflare Workers and Durable Objects, they can be tested using the same tools and techniques as Workers and Durable Objects.
## Writing and running tests
### Setup
:::note
The `agents-starter` template and new Cloudflare Workers projects already include the relevant `vitest` and `@cloudflare/vitest-pool-workers` packages, as well as a valid `vitest.config.js` file.
:::
Before you write your first test, install the necessary packages:
```sh
npm install vitest@~3.0.0 --save-dev --save-exact
npm install @cloudflare/vitest-pool-workers --save-dev
```
Ensure that your `vitest.config.js` file is identical to the following:
```js
import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config";

export default defineWorkersConfig({
  test: {
    poolOptions: {
      workers: {
        wrangler: { configPath: "./wrangler.toml" },
      },
    },
  },
});
```
### Add the Agent configuration
Add a `durableObjects` configuration to `vitest.config.js` with the name of your Agent class:
```js
import { defineWorkersConfig } from '@cloudflare/vitest-pool-workers/config';

export default defineWorkersConfig({
  test: {
    poolOptions: {
      workers: {
        main: './src/index.ts',
        miniflare: {
          durableObjects: {
            NAME: 'MyAgent',
          },
        },
      },
    },
  },
});
```
### Write a test
:::note
Review the [Vitest documentation](https://vitest.dev/) for more information on testing, including the test API reference and advanced testing techniques.
:::
Tests use the `vitest` framework. A basic test suite for your Agent can validate how your Agent responds to requests, but can also unit test your Agent's methods and state.
```ts
import { env, createExecutionContext, waitOnExecutionContext, SELF } from 'cloudflare:test';
import { describe, it, expect } from 'vitest';
import worker from '../src';
import { Env } from '../src';

interface ProvidedEnv extends Env {}

describe('make a request to my Agent', () => {
  // Unit testing approach
  it('responds with state', async () => {
    // Provide a valid URL that your Worker can use to route to your Agent
    // If you are using routeAgentRequest, this will be /agents/:agent/:name
    const request = new Request('http://example.com/agents/my-agent/agent-123');
    const ctx = createExecutionContext();
    const response = await worker.fetch(request, env, ctx);
    await waitOnExecutionContext(ctx);
    expect(await response.json()).toMatchObject({ hello: 'from your agent' });
  });

  // Integration testing approach using SELF
  it('also responds with state', async () => {
    const request = new Request('http://example.com/agents/my-agent/agent-123');
    const response = await SELF.fetch(request);
    expect(await response.json()).toMatchObject({ hello: 'from your agent' });
  });
});
```
### Run tests
Running tests is done using the `vitest` CLI:
```sh
$ npm run test
# or run vitest directly
$ npx vitest
```
```sh output
MyAgent
✓ should return a greeting (1 ms)
Test Files 1 passed (1)
```
Review the [documentation on testing](/workers/testing/vitest-integration/get-started/write-your-first-test/) for additional examples and test configuration.
## Running Agents locally
You can also run an Agent locally using the `wrangler` CLI:
```sh
$ npx wrangler dev
```
```sh output
Your Worker and resources are simulated locally via Miniflare. For more information, see: https://developers.cloudflare.com/workers/testing/local-development.
Your worker has access to the following bindings:
- Durable Objects:
  - MyAgent: MyAgent
Starting local server...
[wrangler:inf] Ready on http://localhost:53645
```
This spins up a local development server that runs the same runtime as Cloudflare Workers, and allows you to iterate on your Agent's code and test it locally without deploying it.
Visit the [`wrangler dev`](https://developers.cloudflare.com/workers/wrangler/commands/#dev) docs to review the CLI flags and configuration options.
---
# Browse the web
URL: https://developers.cloudflare.com/agents/examples/browse-the-web/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Agents can browse the web using the [Browser Rendering](/browser-rendering/) API or your preferred headless browser service.
### Browser Rendering API
The [Browser Rendering API](/browser-rendering/) allows you to spin up headless browser instances, render web pages, and interact with websites through your Agent.
You can define a method that uses Puppeteer to pull the content of a web page, parse the DOM, and extract relevant information by calling the OpenAI model:
```ts
import { Agent } from 'agents-sdk';
import puppeteer from '@cloudflare/puppeteer';
import { OpenAI } from 'openai';

interface Env {
  BROWSER: Fetcher;
  OPENAI_API_KEY: string;
  MODEL: string;
}

export class MyAgent extends Agent<Env> {
  async browse(browserInstance: Fetcher, urls: string[]) {
    let responses = [];
    for (const url of urls) {
      const browser = await puppeteer.launch(browserInstance);
      const page = await browser.newPage();
      await page.goto(url);
      await page.waitForSelector('body');
      const bodyContent = await page.$eval('body', (element) => element.innerHTML);
      const client = new OpenAI({
        apiKey: this.env.OPENAI_API_KEY,
      });

      let resp = await client.chat.completions.create({
        model: this.env.MODEL,
        messages: [
          {
            role: 'user',
            content: `Return a JSON object with the product names, prices and URLs with the following format: { "name": "Product Name", "price": "Price", "url": "URL" } from the website content below. ${bodyContent}`,
          },
        ],
        response_format: {
          type: 'json_object',
        },
      });
      responses.push(resp);
      await browser.close();
    }

    return responses;
  }
}
```
You'll also need to install the `@cloudflare/puppeteer` package and add the following to the wrangler configuration of your Agent:
```sh
npm install @cloudflare/puppeteer --save-dev
```
```jsonc
{
  // ...
  "browser": {
    // Must match the binding name in your Agent's Env interface
    "binding": "BROWSER"
  }
  // ...
}
```
### Browserbase
You can also use [Browserbase](https://docs.browserbase.com/integrations/cloudflare/typescript) by using the Browserbase API directly from within your Agent.
Once you have your [Browserbase API key](https://docs.browserbase.com/integrations/cloudflare/typescript), you can add it to your Agent by creating a [secret](/workers/configuration/secrets/):
```sh
cd your-agent-project-folder
npx wrangler@latest secret put BROWSERBASE_API_KEY
```
```sh output
Enter a secret value: ******
Creating the secret for the Worker "agents-example"
Success! Uploaded secret BROWSERBASE_API_KEY
```
Install the `@cloudflare/puppeteer` package and use it from within your Agent to call the Browserbase API:
```sh
npm install @cloudflare/puppeteer
```
```ts
import { Agent } from 'agents-sdk';

interface Env {
  BROWSERBASE_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  // Your Browserbase API key is available on this.env.BROWSERBASE_API_KEY
  // from any method on your Agent.
}
```
---
# Build an MCP Server
URL: https://developers.cloudflare.com/agents/examples/build-mcp-server/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
import { Aside } from '@astrojs/starlight/components';
[Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) is an open standard that allows AI agents and assistants (like [Claude Desktop](https://claude.ai/download) or [Cursor](https://www.cursor.com/)) to interact with services directly. If you want users to access your service through an AI assistant, you can spin up an MCP server for your application.
### Building an MCP Server on Workers
With Cloudflare Workers and the [workers-mcp](https://github.com/cloudflare/workers-mcp/) package, you can turn any API or service into an MCP server with minimal setup. Just define your API methods as TypeScript functions, and workers-mcp takes care of tool discovery, protocol handling, and request routing. Once deployed, MCP clients like Claude can connect and interact with your service automatically.
Below, we walk through an example of building an MCP server that fetches weather data from an external API and exposes it as an MCP tool that Claude can call directly:
**How it works:**
* **TypeScript methods as MCP tools:** Each public method in your class is exposed as an MCP tool that agents can call. In this example, `getWeather` is the tool that fetches data from an external weather API.
* **Automatic tool documentation:** JSDoc comments define the tool description, parameters, and return values, so Claude knows exactly how to call your method and interpret the response.
* **Built-in MCP compatibility:** The `ProxyToSelf` class translates incoming requests into the relevant JS RPC methods
* **Enforced type safety:** Parameter and return types are automatically derived from your TypeScript definitions
## Get Started
Follow these steps to create and deploy your own MCP server on Cloudflare Workers.
### Create a new Worker
If you haven't already, install [Wrangler](https://developers.cloudflare.com/workers/wrangler/) and log in:
```bash
npm install wrangler
wrangler login
```
Initialize a new project:
```bash
npx create-cloudflare@latest my-mcp-worker
cd my-mcp-worker
```
### Configure your wrangler file
When you run the setup command, it will build your Worker using the configuration in your `wrangler.toml` or `wrangler.jsonc` file. By default, no additional configuration is needed, but if you have multiple Cloudflare accounts, you'll need to specify your account ID as shown below.
```toml
name = "my-mcp-worker"
main = "src/index.ts"
compatibility_date = "2025-03-03"
account_id = "your-account-id"
```
### Install the MCP tooling
Inside your project directory, install the [workers-mcp](https://github.com/cloudflare/workers-mcp) package:
```bash
npm install workers-mcp
```
This package provides the tools needed to run your Worker as an MCP server.
### Configure your Worker to support MCP
Run the following setup command:
```bash
npx workers-mcp setup
```
This guided installation process takes a brand new or existing Workers project and adds the required tooling to turn it into an MCP server:
- Automatic documentation generation
- Shared-secret security using Wrangler Secrets
- Installs a local proxy so you can access it from your MCP desktop clients (like Claude Desktop)
### Set up the MCP Server
The setup command will automatically update your `src/index.ts` with the following code:
```ts
import { WorkerEntrypoint } from 'cloudflare:workers';
import { ProxyToSelf } from 'workers-mcp';

export default class MyWorker extends WorkerEntrypoint {
  /**
   * A warm, friendly greeting from your new MCP server.
   * @param name {string} The name of the person to greet.
   * @return {string} The greeting message.
   */
  sayHello(name: string) {
    return `Hello from an MCP Worker, ${name}!`;
  }

  /**
   * @ignore
   */
  async fetch(request: Request): Promise<Response> {
    // ProxyToSelf handles MCP protocol compliance.
    return new ProxyToSelf(this).fetch(request);
  }
}
```
This converts your Cloudflare Worker into an MCP server, enabling interactions with AI assistants. **The key components are:**
- **WorkerEntrypoint:** The WorkerEntrypoint class handles all incoming request management and routing. This provides the structure needed to expose MCP tools within the Worker.
- **Tool Definition:** Methods, for example, `sayHello`, are annotated with JSDoc, which automatically registers the method as an MCP tool. AI assistants can call this method dynamically, passing a name and receiving a greeting in response. Additional tools can be defined using the same pattern.
- **ProxyToSelf:** MCP servers must follow a specific request/response format. ProxyToSelf ensures that incoming requests are properly routed to the correct MCP tools. Without this, you would need to manually parse requests and validate responses.
**Note:** Every public method that is annotated with JSDoc becomes an MCP tool that is discoverable by AI assistants.
### Add data fetching to the MCP
When building an MCP, in many cases you will need to connect to a resource or an API to take an action. To do this, you can use the `fetch` method to make direct API calls, such as in the example below for fetching the current weather:
```ts
import { WorkerEntrypoint } from 'cloudflare:workers';
import { ProxyToSelf } from 'workers-mcp';

export default class WeatherWorker extends WorkerEntrypoint {
  /**
   * Get current weather for a location
   * @param location {string} City name or zip code
   * @return {object} Weather information
   */
  async getWeather(location: string) {
    // Connect to a weather API
    const response = await fetch(`https://api.weather.example/v1/${location}`);
    const data = (await response.json()) as { temp: number; conditions: string; forecast: string };

    return {
      temperature: data.temp,
      conditions: data.conditions,
      forecast: data.forecast
    };
  }

  /**
   * @ignore
   */
  async fetch(request: Request): Promise<Response> {
    // ProxyToSelf handles MCP protocol compliance
    return new ProxyToSelf(this).fetch(request);
  }
}
```
### Deploy the MCP server
Update your `wrangler.toml` with the appropriate configuration, then deploy your Worker:
```bash
npx wrangler deploy
```
Your MCP server is now deployed globally, and all of your public class methods are exposed as MCP tools that AI assistants can interact with.
#### Updating your MCP server
When you make changes to your MCP server, run the following command to update it:
```bash
npm run deploy
```
**Note:** If you change method names, parameters, or add/remove methods, Claude and other MCP clients will not see these updates until you restart them. This is because MCP clients cache the tool metadata for performance reasons.
### Connecting MCP clients to your server
The `workers-mcp setup` command automatically configures Claude Desktop to work with your MCP server. To use your MCP server through other [MCP clients](https://modelcontextprotocol.io/clients), you'll need to configure them manually.
#### Cursor
To get your Cloudflare MCP server working in [Cursor](https://modelcontextprotocol.io/clients#cursor), you need to combine the `command` and `args` from your config file into a single string and use type `command`.
In Cursor, create an MCP server entry with:
- **type**: `command`
- **command**: `/path/to/workers-mcp run your-mcp-server-name https://your-server-url.workers.dev /path/to/your/project`
For example, using the same configuration as above, your Cursor command would be:
```
/Users/username/path/to/my-new-worker/node_modules/.bin/workers-mcp run my-new-worker https://my-new-worker.username.workers.dev /Users/username/path/to/my-new-worker
```
#### Other MCP clients
For [Windsurf](https://modelcontextprotocol.io/clients#windsurf-editor) and other [MCP clients](https://modelcontextprotocol.io/clients#client-details), update your configuration file to include your worker using the same format as Claude Desktop:
```json
{
"mcpServers": {
"your-mcp-server-name": {
"command": "/path/to/workers-mcp",
"args": [
"run",
"your-mcp-server-name",
"https://your-server-url.workers.dev",
"/path/to/your/project"
],
"env": {}
}
}
}
```
Make sure to replace the placeholders with your actual server name, URL, and project path.
### Coming soon
The Model Context Protocol spec is [actively evolving](https://github.com/modelcontextprotocol/specification/discussions) and we're working on expanding Cloudflare's MCP support. Here's what we're working on:
- **Remote MCP support**: Connect to MCP servers directly without requiring a local proxy
- **Authentication**: OAuth support for secure MCP server connections
- **Real-time communication**: SSE (Server-Sent Events) and WebSocket support for persistent connections and stateful interactions between clients and servers
- **Extended capabilities**: Native support for more MCP protocol capabilities like [resources](https://modelcontextprotocol.io/docs/concepts/resources), [prompts](https://modelcontextprotocol.io/docs/concepts/prompts) and [sampling](https://modelcontextprotocol.io/docs/concepts/sampling)
---
# Examples
URL: https://developers.cloudflare.com/agents/examples/
import { DirectoryListing, PackageManagers } from "~/components";
Agents running on Cloudflare can:
---
# Manage and sync state
URL: https://developers.cloudflare.com/agents/examples/manage-and-sync-state/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Every Agent has built-in state management capabilities, including built-in storage and synchronization between the Agent and frontend applications. State within an Agent is:
* Persisted across Agent restarts: data is permanently persisted within the Agent.
* Automatically serialized/deserialized: you can store any JSON-serializable data.
* Immediately consistent within the Agent: read your own writes.
* Thread-safe for concurrent updates.
Agent state is stored in a SQL database that is embedded within each individual Agent instance: you can interact with it using the higher-level `this.setState` API (recommended) or by directly querying the database with `this.sql`.
#### State API
Every Agent has built-in state management capabilities. You can set and update the Agent's state directly using `this.setState`:
```ts
import { Agent } from "agents-sdk";

export class MyAgent extends Agent {
  // Update state in response to events
  async incrementCounter() {
    this.setState({
      ...this.state,
      counter: this.state.counter + 1,
    });
  }

  // Handle incoming messages
  async onMessage(message) {
    if (message.type === "update") {
      this.setState({
        ...this.state,
        ...message.data,
      });
    }
  }

  // Handle state updates
  onStateUpdate(state, source: "server" | Connection) {
    console.log("state updated", state);
  }
}
```
If you're using TypeScript, you can also provide a type for your Agent's state by passing it as the _second_ [type parameter](https://www.typescriptlang.org/docs/handbook/2/generics.html#using-type-parameters-in-generic-constraints) to the `Agent` class definition.
```ts
import { Agent } from "agents-sdk";

interface Env {}

// Define a type for your Agent's state
interface FlightRecord {
  id: string;
  departureIata: string;
  arrival: Date;
  arrivalIata: string;
  price: number;
}

// Pass in the type of your Agent's state as the second type parameter
export class MyAgent extends Agent<Env, FlightRecord> {
  // This allows this.setState and the onStateUpdate method to
  // be typed:
  async onStateUpdate(state: FlightRecord) {
    console.log("state updated", state);
  }

  async someOtherMethod() {
    this.setState({
      ...this.state,
      price: this.state.price + 10,
    });
  }
}
```
### Synchronizing state
Clients can connect to an Agent and stay synchronized with its state using the React hooks provided as part of `agents-sdk/react`.
A React application can call `useAgent` to connect to a named Agent over WebSockets:
```ts
import { useState } from "react";
import { useAgent } from "agents-sdk/react";

function StateInterface() {
  const [state, setState] = useState({ counter: 0 });

  const agent = useAgent({
    agent: "thinking-agent",
    name: "my-agent",
    onStateUpdate: (newState) => setState(newState),
  });

  const increment = () => {
    agent.setState({ counter: state.counter + 1 });
  };

  return (
    <div>
      <div>Count: {state.counter}</div>
      <button onClick={increment}>Increment</button>
    </div>
  );
}
```
The state synchronization system:
* Automatically syncs the Agent's state to all connected clients
* Handles client disconnections and reconnections gracefully
* Provides immediate local updates
* Supports multiple simultaneous client connections
Common use cases:
* Real-time collaborative features
* Multi-window/tab synchronization
* Live updates across multiple devices
* Maintaining consistent UI state across clients
When new clients connect, they automatically receive the current state from the Agent, ensuring all clients start with the latest data.
### SQL API
Every individual Agent instance has its own SQL (SQLite) database that runs _within the same context_ as the Agent itself. This means that inserting or querying data within your Agent is effectively zero-latency: the Agent doesn't have to round-trip across a continent or the world to access its own data.
You can access the SQL API within any method on an Agent via `this.sql`. The SQL API accepts template literals, with interpolated values passed as query parameters:
```ts
export class MyAgent extends Agent {
  async onRequest(request: Request) {
    let userId = new URL(request.url).searchParams.get('userId');
    // 'users' is just an example here: you can create arbitrary tables and define your own schemas
    // within each Agent's database using SQL (SQLite syntax).
    let user = await this.sql`SELECT * FROM users WHERE id = ${userId}`;
    return Response.json(user);
  }
}
```
You can also supply a [TypeScript type argument](https://www.typescriptlang.org/docs/handbook/2/generics.html#using-type-parameters-in-generic-constraints) to the query, which will be used to infer the type of the result:
```ts
type User = {
  id: string;
  name: string;
  email: string;
};

export class MyAgent extends Agent {
  async onRequest(request: Request) {
    let userId = new URL(request.url).searchParams.get('userId');
    // Supply the type parameter to the query when calling this.sql
    // This assumes the results return one or more User rows with "id", "name", and "email" columns
    const user = await this.sql<User>`SELECT * FROM users WHERE id = ${userId}`;
    return Response.json(user);
  }
}
```
You do not need to specify an array type (`User[]` or `Array<User>`) as `this.sql` will always return an array of the specified type.
Providing a type parameter does not validate that the result matches your type definition. In TypeScript, properties (fields) that do not exist or do not conform to the type you provided will be dropped. If you need to validate the shape of the results, we recommend a library such as [zod](https://zod.dev/) or your own validator logic.
:::note
Learn more about the zero-latency SQL storage that powers both Agents and Durable Objects [on our blog](https://blog.cloudflare.com/sqlite-in-durable-objects/).
:::
The SQL API exposed to an Agent is similar to the one [within Durable Objects](/durable-objects/api/sql-storage/), where the Durable Object SQL methods are available on `this.ctx.storage.sql`. You can use the same SQL queries with the Agent's database, create tables, and query data, just as you would with Durable Objects or [D1](/d1/).
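For example, you can create tables from within your Agent, such as in its `onStart` handler. The table schema here is illustrative:
```ts
import { Agent } from "agents-sdk";

export class MyAgent extends Agent {
  async onStart() {
    // Create a table for this Agent instance if it does not exist yet.
    // Define whatever schema your Agent needs: this one is an example.
    this.sql`CREATE TABLE IF NOT EXISTS users (
      id TEXT PRIMARY KEY,
      name TEXT NOT NULL,
      email TEXT
    )`;
  }
}
```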
### Use Agent state as model context
You can combine the state and SQL APIs in your Agent with its ability to [call AI models](/agents/examples/using-ai-models/) to include historical context within your prompts to a model. Modern Large Language Models (LLMs) often have very large context windows (up to millions of tokens), which allows you to pull relevant context into your prompt directly.
For example, you can use an Agent's built-in SQL database to pull history, query a model with it, and append to that history ahead of the next call to the model:
```ts
import { Agent } from 'agents-sdk';
import { OpenAI } from 'openai';

interface Env {
  OPENAI_API_KEY: string;
  MODEL?: string;
}

interface Prompt {
  userId: string;
  user: string;
  system?: string;
}

export class ReasoningAgent extends Agent<Env> {
  async callReasoningModel(prompt: Prompt) {
    let result = this.sql`SELECT * FROM history WHERE user = ${prompt.userId} ORDER BY timestamp DESC LIMIT 1000`;
    let context = [];
    for await (const row of result) {
      context.push(row.entry);
    }

    const client = new OpenAI({
      apiKey: this.env.OPENAI_API_KEY,
    });

    // Combine user history with the current prompt
    const systemPrompt = prompt.system || 'You are a helpful assistant.';
    const userPrompt = `${prompt.user}\n\nUser history:\n${context.join('\n')}`;

    try {
      const completion = await client.chat.completions.create({
        model: this.env.MODEL || 'o3-mini',
        messages: [
          { role: 'system', content: systemPrompt },
          { role: 'user', content: userPrompt },
        ],
        temperature: 0.7,
        max_tokens: 1000,
      });

      // Store the response in history
      this
        .sql`INSERT INTO history (timestamp, user, entry) VALUES (${new Date()}, ${prompt.userId}, ${completion.choices[0].message.content})`;

      return completion.choices[0].message.content;
    } catch (error) {
      console.error('Error calling reasoning model:', error);
      throw error;
    }
  }
}
```
This works because each instance of an Agent has its _own_ database, and the state stored in that database is private to that Agent: whether it's acting on behalf of a single user, a room or channel, or a deep research tool. By default, you don't have to manage contention or reach out over the network to a centralized database to retrieve and store state.
---
# Retrieval Augmented Generation
URL: https://developers.cloudflare.com/agents/examples/rag/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Agents can use Retrieval Augmented Generation (RAG) to retrieve relevant information and use it to augment [calls to AI models](/agents/examples/using-ai-models/). Store a user's chat history to use as context for future conversations, summarize documents to bootstrap an Agent's knowledge base, and/or use data from your Agent's [web browsing](/agents/examples/browse-the-web/) tasks to enhance your Agent's capabilities.
You can use the Agent's own [SQL database](/agents/examples/manage-and-sync-state) as the source of truth for your data and store embeddings in [Vectorize](/vectorize/) (or any other vector-enabled database) to allow your Agent to retrieve relevant information.
### Vector search
:::note
If you're brand-new to vector databases and Vectorize, visit the [Vectorize tutorial](/vectorize/get-started/intro/) to learn the basics, including how to create an index, insert data, and generate embeddings.
:::
You can query a vector index (or indexes) from any method on your Agent: any Vectorize index you attach is available on `this.env` within your Agent. If you've [associated metadata](/vectorize/best-practices/insert-vectors/#metadata) with your vectors that maps back to data stored in your Agent, you can then look up the data directly within your Agent using `this.sql`.
Here's an example of how to give an Agent retrieval capabilities:
```ts
import { Agent } from "agents-sdk";

interface Env {
  AI: Ai;
  VECTOR_DB: Vectorize;
}

export class RAGAgent extends Agent<Env> {
  // Other methods on our Agent
  // ...
  async queryKnowledge(userQuery: string) {
    // Turn a query into an embedding
    const queryVector = await this.env.AI.run('@cf/baai/bge-base-en-v1.5', {
      text: [userQuery],
    });

    // Retrieve results from our vector index
    let searchResults = await this.env.VECTOR_DB.query(queryVector.data[0], {
      topK: 10,
      returnMetadata: 'all',
    });

    let knowledge = [];
    for (const match of searchResults.matches) {
      console.log(match.metadata);
      knowledge.push(match.metadata);
    }

    // Use the metadata to re-associate the vector search results
    // with data in our Agent's SQL database
    let results = this.sql`SELECT * FROM knowledge WHERE id IN (${knowledge.map((k) => k.id)})`;

    // Return them
    return results;
  }
}
```
You'll also need to connect your Agent to your vector indexes:
```jsonc
{
  // ...
  "vectorize": [
    {
      "binding": "VECTOR_DB",
      "index_name": "your-vectorize-index-name"
    }
  ]
  // ...
}
```
If you have multiple indexes you want to make available, you can provide an array of `vectorize` bindings.
#### Next steps
* Learn more on how to [combine Vectorize and Workers AI](/vectorize/get-started/embeddings/)
* Review the [Vectorize query API](/vectorize/reference/client-api/)
* Use [metadata filtering](/vectorize/reference/metadata-filtering/) to add context to your results
---
# Run Workflows
URL: https://developers.cloudflare.com/agents/examples/run-workflows/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Agents can trigger asynchronous [Workflows](/workflows/), allowing your Agent to run complex, multi-step tasks in the background. This can include post-processing files that a user has uploaded, updating the embeddings in a [vector database](/vectorize/), and/or managing long-running user-lifecycle email or SMS notification workflows.
Because an Agent is just like a Worker script, it can create Workflows defined in the same project (script) as the Agent _or_ in a different project.
:::note[Agents vs. Workflows]
Agents and Workflows have some similarities: they can both run tasks asynchronously. For straightforward tasks that are linear or need to run to completion, a Workflow can be ideal: steps can be retried, they can be cancelled, and can act on events.
Agents do not have to run to completion: they can loop, branch, and run forever, and they can also interact directly with users (over HTTP or WebSockets). An Agent can trigger multiple Workflows as it runs, and can thus be used to coordinate and manage Workflows to achieve its goals.
:::
## Trigger a Workflow
An Agent can trigger one or more Workflows from within any method, whether from an incoming HTTP request, a WebSocket connection, on a delay or schedule, and/or from any other action the Agent takes.
Triggering a Workflow from an Agent is no different from [triggering a Workflow from a Worker script](/workflows/build/trigger-workflows/):
```ts
import { Agent, AgentNamespace } from 'agents-sdk';
import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from 'cloudflare:workers';

interface Env {
  MY_WORKFLOW: Workflow;
  MyAgent: AgentNamespace<MyAgent>;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    let userId = request.headers.get("user-id");
    // Trigger a schedule that runs a Workflow
    // Pass it a payload
    let { id } = await this.schedule(300, "runWorkflow", { id: userId, flight: "DL264", date: "2025-02-23" });
  }

  async runWorkflow(data) {
    let instance = await this.env.MY_WORKFLOW.create({
      id: data.id,
      params: data,
    });

    // Schedule another task that checks the Workflow status every 5 minutes...
    await this.schedule("*/5 * * * *", "checkWorkflowStatus", { id: instance.id });
  }
}

export class MyWorkflow extends WorkflowEntrypoint {
  async run(event: WorkflowEvent, step: WorkflowStep) {
    // Your Workflow code here
  }
}
```
You'll also need to make sure your Agent [has a binding to your Workflow](/workflows/build/trigger-workflows/#workers-api-bindings) so that it can call it:
```jsonc
{
  // ...
  // Create a binding between your Agent and your Workflow
  "workflows": [
    {
      // Required:
      // Must match the binding name on your Agent's Env (MY_WORKFLOW above)
      "name": "MY_WORKFLOW",
      "class_name": "MyWorkflow",
      // Optional: set the script_name field if your Workflow is defined in a
      // different project from your Agent
      "script_name": "email-workflows"
    }
  ],
  // ...
}
```
## Trigger a Workflow from another project
You can also call a Workflow that is defined in a different Workers script from your Agent by setting the `script_name` property in the `workflows` binding of your Agent:
```jsonc
{
// Required:
"name": "EMAIL_WORKFLOW",
"class_name": "MyWorkflow",
// Optional: set tthe script_name field if your Workflow is defined in a
// different project from your Agent
"script_name": "email-workflows"
}
```
Refer to the [cross-script calls](/workflows/build/workers-api/#cross-script-calls) section of the Workflows documentation for more examples.
---
# Schedule tasks
URL: https://developers.cloudflare.com/agents/examples/schedule-tasks/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
An Agent can schedule tasks to be run in the future by calling `this.schedule(when, callback, data)`, where `when` can be a delay, a `Date`, or a cron string; `callback` is the name of the method to call; and `data` is an object to pass to that method.
Scheduled tasks can do anything a request or message from a user can: make requests, query databases, send emails, and read and write state. Scheduled tasks can invoke any regular method on your Agent.
### Scheduling tasks
You can call `this.schedule` within any method on an Agent, and schedule tens-of-thousands of tasks per individual Agent:
```ts
import { Agent } from "agents-sdk"

export class SchedulingAgent extends Agent {
  async onRequest(request) {
    // Handle an incoming request
    // Schedule a task 10 minutes (600 seconds) from now
    // Calls the "checkFlights" method
    let { id } = await this.schedule(600, "checkFlights", { flight: "DL264", date: "2025-02-23" });
    return Response.json({ id });
  }

  async checkFlights(data) {
    // Invoked when our scheduled task runs
    // We can also call this.schedule here to schedule another task
  }
}
```
:::caution
Tasks that set a callback for a method that does not exist will throw an exception: ensure that the method named in the `callback` argument of `this.schedule` exists on your `Agent` class.
:::
You can schedule tasks in multiple ways:
```ts
// schedule a task to run in 10 seconds
let task = await this.schedule(10, "someTask", { message: "hello" });

// schedule a task to run at a specific date
let task = await this.schedule(new Date("2025-01-01"), "someTask", {});

// schedule a task to run every 10 minutes
let { id } = await this.schedule("*/10 * * * *", "someTask", { message: "hello" });

// schedule a task to run at midnight every Monday
let task = await this.schedule("0 0 * * 1", "someTask", { message: "hello" });

// cancel a scheduled task
this.cancelSchedule(task.id);
```
Calling `await this.schedule` returns a `Schedule`, which includes the task's randomly generated `id`. You can use this `id` to retrieve or cancel the task in the future. It also provides a `type` property that indicates the type of schedule, for example, one of `"scheduled" | "delayed" | "cron"`.
:::note[Maximum scheduled tasks]
Each task is mapped to a row in the Agent's underlying [SQLite database](/durable-objects/api/sql-storage/), and each task can be up to 2 MB in size. The total size of all tasks plus all other state must remain below the maximum database size: `(task_size * tasks) + all_other_state < maximum_database_size` (currently 1 GB per Agent).
:::
### Managing scheduled tasks
You can get, cancel and filter across scheduled tasks within an Agent using the scheduling API:
```ts
// Assume taskId was returned from a previous call to this.schedule

// Get a specific schedule by ID
// Returns undefined if the task does not exist
let task = await this.getSchedule(taskId);

// Get all scheduled tasks
// Returns an array of Schedule objects
let tasks = this.getSchedules();

// Cancel a task by its ID
// Returns true if the task was cancelled, false if it did not exist
await this.cancelSchedule(taskId);

// Filter for specific tasks
// e.g. all tasks starting in the next hour
let tasks = this.getSchedules({
  timeRange: {
    start: new Date(Date.now()),
    end: new Date(Date.now() + 60 * 60 * 1000),
  }
});
```
---
# Using AI Models
URL: https://developers.cloudflare.com/agents/examples/using-ai-models/
import { AnchorHeading, MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Agents can communicate with AI models hosted on any provider, including [Workers AI](/workers-ai/), OpenAI, Anthropic, and Google's Gemini, and use the model routing features in [AI Gateway](/ai-gateway/) to route across providers, evaluate responses, and manage AI provider rate limits.
Because Agents are built on top of [Durable Objects](/durable-objects/), each Agent or chat session is associated with a stateful compute instance. Traditional serverless architectures often present challenges for the persistent connections needed in real-time applications like chat.
A user can disconnect during a long-running response from a modern reasoning model (such as `o3-mini` or DeepSeek R1), or lose conversational context when refreshing the browser. Instead of relying on request-response patterns and managing an external database to track and store conversation state, state can be stored directly within the Agent. If a client disconnects, the Agent can write to its own distributed storage, and catch the client up as soon as it reconnects: even if it's hours or days later.
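As a rough sketch of that catch-up pattern, assuming you keep message history in the Agent's [built-in state](/agents/examples/manage-and-sync-state/) (the `history` field is our own, illustrative state shape, not an SDK built-in):

```ts
import { Agent, Connection } from "agents-sdk";

interface Env {}
// Illustrative state shape: an append-only log of chat messages
type State = { history: string[] };

export class ChatAgent extends Agent<Env, State> {
  async onConnect(connection: Connection) {
    connection.accept();
    // Replay stored messages so a reconnecting client catches up,
    // even if it has been offline for hours or days
    for (const msg of this.state?.history ?? []) {
      connection.send(msg);
    }
  }
}
```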
## Calling AI Models
You can call models from any method within an Agent, including from HTTP requests using the [`onRequest`](/agents/api-reference/sdk/) handler, when a [scheduled task](/agents/examples/schedule-tasks/) runs, when handling a WebSocket message in the [`onMessage`](/agents/examples/websockets/) handler, or from any of your own methods.
Importantly, Agents can call AI models on their own (autonomously) and can handle long-running responses that may take minutes, or longer, to complete in full.
### Long-running model requests {/*long-running-model-requests*/}
Modern [reasoning models](https://platform.openai.com/docs/guides/reasoning) or "thinking" models can take some time to both generate a response _and_ stream the response back to the client.
Instead of buffering the entire response, or risking the client disconnecting, you can stream the response back to the client using the [WebSocket API](/agents/examples/websockets/).
```ts
import { Agent, Connection, ConnectionContext, WSMessage } from "agents-sdk"
import { OpenAI } from "openai"

interface Env {
  OPENAI_API_KEY: string;
  MODEL?: string;
}

export class MyAgent extends Agent<Env> {
  async onConnect(connection: Connection, ctx: ConnectionContext) {
    // Omitted for simplicity: authenticating the user
    connection.accept()
  }

  async onMessage(connection: Connection, message: WSMessage) {
    let msg = JSON.parse(message as string)
    // This can run as long as it needs to, and return as many messages as it needs to!
    await this.queryReasoningModel(connection, msg.prompt)
  }

  async queryReasoningModel(connection: Connection, userPrompt: string) {
    const client = new OpenAI({
      apiKey: this.env.OPENAI_API_KEY,
    });

    try {
      const stream = await client.chat.completions.create({
        model: this.env.MODEL || 'o3-mini',
        messages: [{ role: 'user', content: userPrompt }],
        stream: true,
      });

      // Stream responses back as WebSocket messages
      for await (const chunk of stream) {
        const content = chunk.choices[0]?.delta?.content || '';
        if (content) {
          connection.send(JSON.stringify({ type: 'chunk', content }));
        }
      }

      // Send completion message
      connection.send(JSON.stringify({ type: 'done' }));
    } catch (error) {
      // Serialize the error message: JSON.stringify on a raw Error yields "{}"
      connection.send(JSON.stringify({ type: 'error', error: error instanceof Error ? error.message : String(error) }));
    }
  }
}
```
You can also persist AI model responses back to the [Agent's internal state](/agents/examples/manage-and-sync-state/) by using the `this.setState` method. For example, if you run a [scheduled task](/agents/examples/schedule-tasks/), you can store the output of the task and read it later. Or, if a user disconnects, read the message history back and send it to the user when they reconnect.
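As a small sketch of that pattern (the `lastResponse` field is illustrative, not an SDK built-in):

```ts
// Inside your Agent: persist the latest model output alongside existing state
async storeResponse(text: string) {
  this.setState({
    ...this.state,
    lastResponse: text, // e.g. the output of a scheduled summarization task
  });
}
```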
### Workers AI
#### Hosted models
You can use [any of the models available in Workers AI](/workers-ai/models/) within your Agent by [configuring a binding](/workers-ai/configuration/bindings/).
Workers AI supports streaming responses out-of-the-box by setting `stream: true`, and we strongly recommend streaming to avoid buffering and delaying responses, especially for larger models or reasoning models that require more time to generate a response.
```ts
import { Agent } from "agents-sdk"

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON.",
        stream: true, // Stream a response and don't block the client!
      }
    );

    // Return the stream
    return new Response(response, {
      headers: { "content-type": "text/event-stream" }
    })
  }
}
```
Your wrangler configuration will need an `ai` binding added:
```toml
[ai]
binding = "AI"
```
#### Model routing
You can also use the model routing features in [AI Gateway](/ai-gateway/) directly from an Agent by specifying a [`gateway` configuration](/ai-gateway/providers/workersai/) when calling the AI binding.
:::note
Model routing allows you to route requests to different AI models based on whether they are reachable, whether you have hit a rate limit, and/or whether you've exceeded your cost budget for a specific provider.
:::
```ts
import { Agent } from "agents-sdk"

interface Env {
  AI: Ai;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request) {
    const response = await this.env.AI.run(
      "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b",
      {
        prompt: "Build me a Cloudflare Worker that returns JSON."
      },
      {
        gateway: {
          id: "{gateway_id}", // Specify your AI Gateway ID here
          skipCache: false,
          cacheTtl: 3360,
        },
      },
    );
    return Response.json(response)
  }
}
```
Your wrangler configuration will need an `ai` binding added. This is shared across both Workers AI and AI Gateway.
```toml
[ai]
binding = "AI"
```
Visit the [AI Gateway documentation](/ai-gateway/) to learn how to configure a gateway and retrieve a gateway ID.
### AI SDK
The [AI SDK](https://sdk.vercel.ai/docs/introduction) provides a unified API for using AI models, including for text generation, tool calling, structured responses, image generation, and more.
To use the AI SDK, install the `ai` package and use it within your Agent. The example below shows how to use it to generate text on request, but you can use it from any method within your Agent, including WebSocket handlers, as part of a scheduled task, or even when the Agent is initialized.
```sh
npm install ai @ai-sdk/openai
```
```ts
import { Agent } from "agents-sdk"
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

export class MyAgent extends Agent {
  async onRequest(request: Request): Promise<Response> {
    const { text } = await generateText({
      model: openai("o3-mini"),
      prompt: "Build me an AI agent on Cloudflare Workers",
    });

    return Response.json({ modelResponse: text })
  }
}
```
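The structured-response support mentioned above follows the same pattern. A minimal sketch using the AI SDK's `generateObject`, assuming you've also installed `zod` to define the schema:

```ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Ask the model for output that conforms to a schema
const { object } = await generateObject({
  model: openai("o3-mini"),
  schema: z.object({
    title: z.string(),
    steps: z.array(z.string()),
  }),
  prompt: "Outline the steps to deploy a Cloudflare Worker",
});
// `object` is typed and validated against the schema
```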
### OpenAI compatible endpoints
Agents can call models across any service, including those that support the OpenAI API. For example, you can use the OpenAI SDK to use one of [Google's Gemini models](https://ai.google.dev/gemini-api/docs/openai#node.js) directly from your Agent.
Agents can stream responses back over HTTP using Server Sent Events (SSE) from within an `onRequest` handler, or use the native [WebSockets](/agents/examples/websockets/) API in your Agent to stream responses back to a client, which is especially useful for larger models that can take 30+ seconds to reply.
```ts
import { Agent } from "agents-sdk"
import { OpenAI } from "openai"

interface Env {
  GEMINI_API_KEY: string;
}

export class MyAgent extends Agent<Env> {
  async onRequest(request: Request): Promise<Response> {
    const openai = new OpenAI({
      apiKey: this.env.GEMINI_API_KEY,
      baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
    });

    // Create a TransformStream to handle streaming data
    let { readable, writable } = new TransformStream();
    let writer = writable.getWriter();
    const textEncoder = new TextEncoder();

    // Use this.ctx.waitUntil to run the async function in the background
    // so that it doesn't block the streaming response
    this.ctx.waitUntil(
      (async () => {
        const stream = await openai.chat.completions.create({
          model: "gemini-2.0-flash",
          messages: [{ role: "user", content: "Write me a Cloudflare Worker." }],
          stream: true,
        });

        // loop over the data as it is streamed and write to the writable
        for await (const part of stream) {
          writer.write(
            textEncoder.encode(part.choices[0]?.delta?.content || ""),
          );
        }
        writer.close();
      })(),
    );

    // Return the readable stream back to the client
    return new Response(readable)
  }
}
```
---
# Using WebSockets
URL: https://developers.cloudflare.com/agents/examples/websockets/
import { MetaInfo, Render, Type, TypeScriptExample, WranglerConfig } from "~/components";
Users and clients can connect to an Agent directly over WebSockets, allowing long-running, bi-directional communication with your Agent as it operates.
To enable an Agent to accept WebSockets, define `onConnect` and `onMessage` methods on your Agent.
* `onConnect(connection: Connection, ctx: ConnectionContext)` is called when a client establishes a new WebSocket connection. The original HTTP request, including request headers, cookies, and the URL itself, is available on `ctx.request`.
* `onMessage(connection: Connection, message: WSMessage)` is called for each incoming WebSocket message. Messages are one of `ArrayBuffer | ArrayBufferView | string`, and you can send messages back to a client using `connection.send()`. You can distinguish between client connections by checking `connection.id`, which is unique for each connected client.
Here's an example of an Agent that echoes back any message it receives:
```ts
import { Agent, Connection, ConnectionContext, WSMessage } from "agents-sdk";

export class ChatAgent extends Agent {
  async onConnect(connection: Connection, ctx: ConnectionContext) {
    // Access the request to verify any authentication tokens
    // provided in headers or cookies
    let token = ctx.request.headers.get("Authorization");
    if (!token) {
      await connection.close(4000, "Unauthorized");
      return;
    }

    // Handle auth using your favorite library and/or auth scheme:
    // try {
    //   await jwt.verify(token, env.JWT_SECRET);
    // } catch (error) {
    //   connection.close(4000, 'Invalid Authorization header');
    //   return;
    // }

    // Accept valid connections
    connection.accept()
  }

  async onMessage(connection: Connection, message: WSMessage) {
    // Echo the message back to the client
    // const response = await longRunningAITask(message)
    await connection.send(message)
  }
}
```
## Connecting clients
The Agent framework includes a useful helper package for connecting directly to your Agent (or other Agents) from a client application. Import `agents-sdk/client`, create an instance of `AgentClient` and use it to connect to an instance of your Agent:
```ts
import { AgentClient } from "agents-sdk/client";

const connection = new AgentClient({
  agent: "dialogue-agent",
  name: "insight-seeker",
});

connection.addEventListener("message", (event) => {
  console.log("Received:", event.data);
});

connection.send(
  JSON.stringify({
    type: "inquiry",
    content: "What patterns do you see?",
  })
);
```
## React clients
React-based applications can import `agents-sdk/react` and use the `useAgent` hook to connect to an instance of an Agent directly:
```ts
import { useAgent } from "agents-sdk/react";

function AgentInterface() {
  const connection = useAgent({
    agent: "dialogue-agent",
    name: "insight-seeker",
    onMessage: (message) => {
      console.log("Understanding received:", message.data);
    },
    onOpen: () => console.log("Connection established"),
    onClose: () => console.log("Connection closed"),
  });

  const inquire = () => {
    connection.send(
      JSON.stringify({
        type: "inquiry",
        content: "What insights have you gathered?",
      })
    );
  };

  // Render a minimal UI that triggers the inquiry
  return (
    <button onClick={inquire}>Inquire</button>
  );
}
```
The `useAgent` hook automatically handles the lifecycle of the connection, ensuring that it is properly initialized and cleaned up when the component mounts and unmounts. You can also [combine `useAgent` with `useState`](/agents/examples/manage-and-sync-state/) to automatically synchronize state across all clients connected to your Agent.
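As a rough sketch of that combination, assuming (per the [state management docs](/agents/examples/manage-and-sync-state/)) that `useAgent` accepts an `onStateUpdate` callback and the returned handle exposes `setState`; the counter state shape is illustrative:

```ts
import { useState } from "react";
import { useAgent } from "agents-sdk/react";

// Illustrative state shape: a counter shared across all connected clients
type CounterState = { count: number };

function SyncedCounter() {
  const [state, setState] = useState<CounterState>({ count: 0 });

  const agent = useAgent({
    agent: "counter-agent",
    // Mirror every Agent-side state change into local React state
    onStateUpdate: (newState: CounterState) => setState(newState),
  });

  // Update the Agent's state; the Agent syncs it to every connected client
  const increment = () => agent.setState({ count: state.count + 1 });

  return <button onClick={increment}>Count: {state.count}</button>;
}
```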
## Handling WebSocket events
Define `onError` and `onClose` methods on your Agent to explicitly handle WebSocket client errors and close events. Log errors, clean up state, and/or emit metrics:
```ts
import { Agent, Connection } from "agents-sdk";

export class ChatAgent extends Agent {
  // onConnect and onMessage methods
  // ...

  // WebSocket error and disconnection (close) handling.
  async onError(connection: Connection, error: unknown): Promise<void> {
    console.error(`WS error: ${error}`);
  }

  async onClose(connection: Connection, code: number, reason: string, wasClean: boolean): Promise<void> {
    console.log(`WS closed: ${code} - ${reason} - wasClean: ${wasClean}`);
    connection.close();
  }
}
```
---
# Guides
URL: https://developers.cloudflare.com/agents/guides/
import { DirectoryListing } from "~/components"
---
# Reference
URL: https://developers.cloudflare.com/agents/platform/
import { DirectoryListing } from "~/components";
Build AI Agents on Cloudflare
---
# Limits
URL: https://developers.cloudflare.com/agents/platform/limits/
import { Render } from "~/components"
Limits that apply to authoring, deploying, and running Agents are detailed below.
Many limits are inherited from those applied to Workers scripts and/or Durable Objects, and are detailed in the [Workers limits](/workers/platform/limits/) documentation.
| Feature                                      | Limit                                                                     |
| -------------------------------------------- | ------------------------------------------------------------------------- |
| Max concurrent (running) Agents per account  | Tens of millions+ [^1]                                                    |
| Max definitions per account                  | ~250,000+ [^2]                                                            |
| Max state stored per unique Agent            | 1 GB                                                                      |
| Max compute time per Agent                   | 30 seconds (refreshed per HTTP request / incoming WebSocket message) [^3] |
| Duration (wall clock) per step [^3]          | Unlimited (e.g. waiting on a database call or an LLM response)            |
---
[^1]: Yes, really. You can have tens of millions of Agents running concurrently, as each Agent is mapped to a [unique Durable Object](/durable-objects/what-are-durable-objects/) (actor).
[^2]: You can deploy up to [500 scripts per account](/workers/platform/limits/), but each script (project) can define multiple Agents. Each deployed script can be up to 10 MB on the [Workers Paid Plan](/workers/platform/pricing/#workers).
[^3]: Compute (CPU) time per Agent is limited to 30 seconds, but this is refreshed when an Agent receives a new HTTP request, runs a [scheduled task](/agents/examples/schedule-tasks/), or receives an incoming WebSocket message.
---