Workers binding
Workers provides a serverless execution environment that allows you to create new applications or augment existing ones. Use a Workers binding to search and chat with your AI Search instances from a Cloudflare Worker.
To use AI Search with Workers, you must create an AI Search binding. You create bindings by updating your Wrangler configuration. AI Search provides two types of bindings:
- Namespace binding: ai_search_namespaces
- Instance binding: ai_search
A namespace binding gives you access to all instances within a namespace. You can get, create, list, and delete instances at runtime.
wrangler.jsonc

    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      "compatibility_date": "2026-03-27",
      "ai_search_namespaces": [
        {
          "binding": "AI_SEARCH",
          "namespace": "my-namespace"
        }
      ]
    }

wrangler.toml

    compatibility_date = "2026-03-27"

    [[ai_search_namespaces]]
    binding = "AI_SEARCH"
    namespace = "my-namespace"

| Field | Type | Required | Description |
|---|---|---|---|
| binding | string | Yes | The variable name available on env. For example, "AI_SEARCH" makes it accessible as env.AI_SEARCH. |
| namespace | string | Yes | The namespace to bind to. A default namespace is created automatically for every account. If the namespace does not exist, Wrangler creates it on deploy. |
| remote | boolean | No | Set to true for local development with wrangler dev. |
An instance binding binds directly to a single instance in the default namespace. Use this when you know which instance you need at deploy time.
wrangler.jsonc

    {
      "$schema": "./node_modules/wrangler/config-schema.json",
      "compatibility_date": "2026-03-27",
      "ai_search": [
        {
          "binding": "MY_SEARCH",
          "instance_name": "my-instance"
        }
      ]
    }

wrangler.toml

    compatibility_date = "2026-03-27"

    [[ai_search]]
    binding = "MY_SEARCH"
    instance_name = "my-instance"

| Field | Type | Required | Description |
|---|---|---|---|
| binding | string | Yes | The variable name available on env. For example, "MY_SEARCH" makes it accessible as env.MY_SEARCH. |
| instance_name | string | Yes | The name of the AI Search instance. Must exist in the default namespace at deploy time. |
| remote | boolean | No | Set to true for local development with wrangler dev. |
The following methods are available on both the ai_search_namespaces and ai_search bindings. With the namespace binding, call methods on the handle returned by get(). With the instance binding, call methods directly on the binding (for example, env.MY_SEARCH.search()).
The examples below use the namespace binding.
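Because both kinds of binding expose the same methods, helper code can be written once against a small interface and handed either the handle returned by get() or an instance binding. The following is a minimal sketch under that reading. The `Searchable` and `NamespaceBinding` types, the `firstChunkText` helper, and the stub objects are hypothetical and hand-written for illustration; they are not the official binding types.

```typescript
// Hand-written minimal types (assumptions for this sketch, not official types).
interface SketchChunk {
  id: string;
  score: number;
  text: string;
}

interface SketchSearchResponse {
  search_query: string;
  chunks: SketchChunk[];
}

// Anything that can run search(): the handle returned by get() on a
// namespace binding, or an instance binding itself.
interface Searchable {
  search(request: {
    messages: { role: string; content: string }[];
  }): Promise<SketchSearchResponse>;
}

interface NamespaceBinding {
  get(instanceName: string): Searchable;
}

// Hypothetical helper: returns the text of the first chunk, or null if none matched.
async function firstChunkText(target: Searchable, question: string): Promise<string | null> {
  const results = await target.search({
    messages: [{ role: "user", content: question }],
  });
  const first = results.chunks[0];
  return first ? first.text : null;
}

// Stubs standing in for real bindings so the sketch runs outside a Worker.
const stubInstance: Searchable = {
  async search(request) {
    const last = request.messages[request.messages.length - 1];
    return {
      search_query: last ? last.content : "",
      chunks: [{ id: "chunk-001", score: 0.85, text: "stub chunk text" }],
    };
  },
};
const stubNamespace: NamespaceBinding = { get: () => stubInstance };

// In a Worker, the calls would look like:
//   firstChunkText(env.AI_SEARCH.get("my-instance"), "What is Cloudflare?") // namespace binding
//   firstChunkText(env.MY_SEARCH, "What is Cloudflare?")                    // instance binding
```

Typing the helper against the narrow `Searchable` shape keeps it independent of which binding type the Worker is configured with.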
The search() method finds relevant content chunks in your indexed data source and returns scored chunks with source references.

    const instance = env.AI_SEARCH.get("my-instance");
    const results = await instance.search({
      messages: [{ role: "user", content: "What is Cloudflare?" }],
    });

messages array required
An array of message objects representing the conversation. Each message has a role and content field.
- role string required: The role of the message sender. Valid values: system, developer, user, assistant, tool.
- content string required: The content of the message.
query string optional
A simple text query string. Alternative to messages. Provide either query or messages, not both.
ai_search_options object optional
Configuration options for the search operation.
- retrieval object optional
  - retrieval_type string optional: The type of retrieval to perform. Valid values: vector, keyword, hybrid. Defaults to hybrid.
  - match_threshold number optional: The minimum match score required for a result to be considered a match. Must be between 0 and 1. Defaults to 0.4.
  - max_num_results integer optional: The maximum number of results to return. Must be between 1 and 50. Defaults to 10.
  - filters object optional: Filter search results based on metadata. Supports comparison filters (eq, ne, gt, gte, lt, lte) and compound filters (and, or). For more details, refer to Metadata filtering.
  - context_expansion integer optional: The number of surrounding chunks to include for additional context. Must be between 0 and 3. Defaults to 0.
  - fusion_method string optional: Controls how vector and keyword scores are combined when using hybrid retrieval. Valid values: rrf (Reciprocal Rank Fusion), max (takes the maximum score). Defaults to the instance-level setting.
  - keyword_match_mode string optional: Controls how keyword (BM25) matching selects candidate documents. "and" requires all terms to match; "or" requires any term to match. Defaults to "and".
  - boost_by array optional: Boost results by metadata fields. Maximum 3 items. Each item has:
    - field string required: The metadata field name to boost by (for example, timestamp). Maximum 64 characters.
    - direction string optional: The boost direction. Valid values: asc, desc, exists, not_exists. Defaults to asc for numeric fields and exists for text fields.
  - metadata_only boolean optional: Return only metadata for each chunk without the text content.
  - return_on_failure boolean optional: Whether to return partial results if some processing steps fail. Defaults to true.
- query_rewrite object optional
  - enabled boolean optional: Rewrites the query to improve retrieval accuracy. Defaults to false.
  - model string optional: The model to use for query rewriting.
  - rewrite_prompt string optional: A custom prompt to guide query rewriting.
- reranking object optional
  - enabled boolean optional: Reorders retrieved results based on semantic relevance using a reranking model. Defaults to false.
  - model string optional: The reranking model to use. Valid value: @cf/baai/bge-reranker-base.
  - match_threshold number optional: The minimum score for reranked results. Must be between 0 and 1. Defaults to 0.4.
- cache object optional
  - enabled boolean optional: Override the instance-level cache setting for this request.
  - cache_threshold string optional: The similarity threshold for cache hits. Valid values: super_strict_match, close_enough, flexible_friend, anything_goes.
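Several retrieval options have documented ranges (match_threshold from 0 to 1, max_num_results from 1 to 50, context_expansion from 0 to 3). The following is a minimal sketch of a request builder that checks those ranges before calling search(). `buildSearchRequest`, `checkRange`, and the `RetrievalOptions` type are hypothetical names introduced here, and only a subset of the options is covered. How the service itself treats out-of-range values is not stated in this page, so the sketch simply throws locally.

```typescript
// Subset of the retrieval options listed above. Field names and ranges come
// from the parameter list; the type itself is an assumption of this sketch.
interface RetrievalOptions {
  retrieval_type?: "vector" | "keyword" | "hybrid";
  match_threshold?: number; // 0 to 1
  max_num_results?: number; // integer, 1 to 50
  context_expansion?: number; // integer, 0 to 3
}

// Throws a RangeError when a provided value falls outside [min, max].
function checkRange(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
  integer: boolean,
): void {
  if (value === undefined) return; // option omitted, the default applies
  if (Number.isNaN(value) || value < min || value > max) {
    throw new RangeError(`${name} must be between ${min} and ${max}, got ${value}`);
  }
  if (integer && !Number.isInteger(value)) {
    throw new RangeError(`${name} must be an integer, got ${value}`);
  }
}

// Hypothetical helper: builds the object passed to instance.search().
function buildSearchRequest(question: string, retrieval: RetrievalOptions = {}, rerank = false) {
  checkRange("match_threshold", retrieval.match_threshold, 0, 1, false);
  checkRange("max_num_results", retrieval.max_num_results, 1, 50, true);
  checkRange("context_expansion", retrieval.context_expansion, 0, 3, true);
  return {
    messages: [{ role: "user", content: question }],
    ai_search_options: {
      retrieval,
      reranking: { enabled: rerank },
    },
  };
}

// In a Worker:
//   const results = await env.AI_SEARCH.get("my-instance").search(
//     buildSearchRequest("What is Cloudflare?", { retrieval_type: "hybrid", max_num_results: 5 }, true),
//   );
```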
The response contains the following fields:
| Field | Type | Description |
|---|---|---|
| search_query | string | The query used for the search, which may be rewritten if query rewriting is enabled. |
| chunks | array | An array of matching content chunks. |
| chunks[].id | string | The unique identifier for the chunk. |
| chunks[].type | string | The type of content, typically text. |
| chunks[].score | number | The overall match score between 0 and 1. |
| chunks[].text | string | The text content of the chunk. |
| chunks[].item | object | Information about the source item. |
| chunks[].item.key | string | The file path or URL of the source document. |
| chunks[].item.timestamp | number | Unix timestamp of when the item was last modified. |
| chunks[].item.metadata | object | Custom metadata associated with the source item. |
| chunks[].scoring_details | object | Breakdown of how the chunk was scored. |
| chunks[].scoring_details.vector_score | number | The semantic similarity score (0 to 1). |
| chunks[].scoring_details.keyword_score | number | The keyword (BM25) match score. Present when using hybrid or keyword retrieval. |
| chunks[].scoring_details.keyword_rank | number | The keyword rank position. |
| chunks[].scoring_details.vector_rank | number | The vector rank position. |
| chunks[].scoring_details.reranking_score | number | The reranking score (0 to 1). Present when reranking is enabled. |
| chunks[].scoring_details.fusion_method | string | The fusion method used (rrf or max). Present when using hybrid retrieval. |
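Since several chunks can come from the same source document, a common follow-up step is to collapse the chunks array into one entry per chunks[].item.key. The following is a minimal sketch of that idea. `summarizeSources` and the `ResultChunk` and `SourceSummary` types are hypothetical, and only the fields used here are typed.

```typescript
// Only the response fields this sketch reads; not a complete type.
interface ResultChunk {
  id: string;
  score: number;
  text: string;
  item: { key: string; timestamp?: number };
}

interface SourceSummary {
  key: string; // file path or URL of the source document
  bestScore: number; // highest chunk score seen for this source
  chunkCount: number; // how many chunks came from this source
}

// Hypothetical helper: one summary per source, highest-scoring source first.
function summarizeSources(chunks: ResultChunk[]): SourceSummary[] {
  const byKey = new Map<string, SourceSummary>();
  for (const chunk of chunks) {
    const existing = byKey.get(chunk.item.key);
    if (existing) {
      existing.bestScore = Math.max(existing.bestScore, chunk.score);
      existing.chunkCount += 1;
    } else {
      byKey.set(chunk.item.key, { key: chunk.item.key, bestScore: chunk.score, chunkCount: 1 });
    }
  }
  return Array.from(byKey.values()).sort((a, b) => b.bestScore - a.bestScore);
}

// In a Worker:
//   const results = await env.AI_SEARCH.get("my-instance").search({ query: "What is Cloudflare?" });
//   const sources = summarizeSources(results.chunks);
```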
The chatCompletions() method generates chat completions using your AI Search instance as context. It retrieves relevant content and uses it to generate a response.

    const instance = env.AI_SEARCH.get("my-instance");
    const response = await instance.chatCompletions({
      messages: [
        { role: "system", content: "You are a helpful documentation assistant." },
        { role: "user", content: "What is Cloudflare?" },
      ],
      model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
      ai_search_options: {
        retrieval: {
          max_num_results: 5,
        },
        query_rewrite: {
          enabled: true,
        },
      },
    });

Set stream: true to receive responses as Server-Sent Events (SSE) as they are generated:

    const instance = env.AI_SEARCH.get("my-instance");
    const stream = await instance.chatCompletions({
      messages: [{ role: "user", content: "What is Cloudflare?" }],
      stream: true,
    });

    return new Response(stream, {
      headers: {
        "content-type": "text/event-stream",
        "cache-control": "no-cache",
      },
    });

When stream is enabled, the method returns a ReadableStream of SSE events. Each event contains a JSON object with choices[0].delta.content for incremental text. The stream ends with a data: [DONE] event.
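If the Worker needs the full generated text itself rather than forwarding the stream, the SSE events can be parsed as they arrive. The following is a minimal sketch of such a parser, working on already-decoded text fragments. `SseTextAccumulator` is a hypothetical class, not part of the binding. It assumes events are separated by blank lines as in standard SSE, and it ignores the payload of named events (lines preceded by an event: field) so that only unnamed completion events contribute text.

```typescript
// Accumulates choices[0].delta.content from SSE text fed in arbitrary fragments.
class SseTextAccumulator {
  text = ""; // concatenated incremental content
  done = false; // true once "data: [DONE]" has been seen
  private buffer = ""; // holds an incomplete trailing line between pushes
  private eventName = ""; // name of the event currently being read, if any

  push(fragment: string): void {
    this.buffer += fragment;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line ("" if the fragment ended with "\n").
    this.buffer = lines.pop() ?? "";
    for (const rawLine of lines) {
      const line = rawLine.replace(/\r$/, "");
      if (line === "") {
        this.eventName = ""; // a blank line ends the current event
        continue;
      }
      if (line.startsWith("event:")) {
        this.eventName = line.slice("event:".length).trim();
        continue;
      }
      if (!line.startsWith("data:")) continue;
      const data = line.slice("data:".length).trim();
      if (data === "[DONE]") {
        this.done = true;
        continue;
      }
      if (this.eventName !== "") continue; // skip payloads of named events
      const parsed = JSON.parse(data) as {
        choices?: { delta?: { content?: string } }[];
      };
      this.text += parsed.choices?.[0]?.delta?.content ?? "";
    }
  }
}

// In a Worker, fragments could come from decoding the returned stream, e.g.
// by reading stream.pipeThrough(new TextDecoderStream()) and calling push()
// with each value (an illustration, not a prescribed pattern).
```

Buffering the trailing partial line matters because network reads can split an event in the middle of its JSON payload.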
messages array required
An array of message objects representing the conversation. Each message has a role and content field.
- role string required: The role of the message sender. Valid values: system, developer, user, assistant, tool.
- content string required: The content of the message.
model string optional
The text-generation model used to generate responses. Defaults to the generation model configured in the AI Search instance settings. For a list of supported models, refer to Supported models.
stream boolean optional
Returns a stream of results as they are generated. When enabled, returns a Response object with a readable stream. Defaults to false.
ai_search_options object optional
Configuration options for the search and generation operation.
- retrieval object optional
  - retrieval_type string optional: The type of retrieval to perform. Valid values: vector, keyword, hybrid. Defaults to hybrid.
  - match_threshold number optional: The minimum match score required for a result to be considered a match. Must be between 0 and 1. Defaults to 0.4.
  - max_num_results integer optional: The maximum number of results to return. Must be between 1 and 50. Defaults to 10.
  - filters object optional: Filter search results based on metadata. Supports comparison filters (eq, ne, gt, gte, lt, lte) and compound filters (and, or). For more details, refer to Metadata filtering.
  - context_expansion integer optional: The number of surrounding chunks to include for additional context. Must be between 0 and 3. Defaults to 0.
  - fusion_method string optional: Controls how vector and keyword scores are combined when using hybrid retrieval. Valid values: rrf (Reciprocal Rank Fusion), max (takes the maximum score). Defaults to the instance-level setting.
  - keyword_match_mode string optional: Controls how keyword (BM25) matching selects candidate documents. "and" requires all terms to match; "or" requires any term to match. Defaults to "and".
  - boost_by array optional: Boost results by metadata fields. Maximum 3 items. Each item has:
    - field string required: The metadata field name to boost by (for example, timestamp). Maximum 64 characters.
    - direction string optional: The boost direction. Valid values: asc, desc, exists, not_exists. Defaults to asc for numeric fields and exists for text fields.
  - metadata_only boolean optional: Return only metadata for each chunk without the text content.
  - return_on_failure boolean optional: Whether to return partial results if some processing steps fail. Defaults to true.
- query_rewrite object optional
  - enabled boolean optional: Rewrites the query to improve retrieval accuracy. Defaults to false.
  - model string optional: The model to use for query rewriting.
  - rewrite_prompt string optional: A custom prompt to guide query rewriting.
- reranking object optional
  - enabled boolean optional: Reorders retrieved results based on semantic relevance using a reranking model. Defaults to false.
  - model string optional: The reranking model to use. Valid value: @cf/baai/bge-reranker-base.
  - match_threshold number optional: The minimum score for reranked results. Must be between 0 and 1. Defaults to 0.4.
- cache object optional
  - enabled boolean optional: Override the instance-level cache setting for this request.
  - cache_threshold string optional: The similarity threshold for cache hits. Valid values: super_strict_match, close_enough, flexible_friend, anything_goes.
The response contains the following fields:
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for the completion. |
| object | string | Always chat.completion. |
| created | number | Unix timestamp of when the completion was created. |
| model | string | The model used to generate the response. |
| choices | array | Array of completion choices. |
| choices[].message.role | string | Always assistant. |
| choices[].message.content | string | The generated response text. |
| choices[].finish_reason | string | Why the model stopped generating. Typically stop. |
| usage.prompt_tokens | number | Number of tokens in the prompt. |
| usage.completion_tokens | number | Number of tokens in the generated response. |
| usage.total_tokens | number | Total tokens used. |
| chunks | array | The source chunks used as context. Same format as the search response. |
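A non-streaming response carries both the generated text and the chunks that were used as context, so an answer can be returned together with its sources. The following is a minimal sketch of that pairing. `answerWithSources` and the `ChatResult` type are hypothetical and cover only the fields read here; returning an empty string when choices is empty is a choice made for this sketch.

```typescript
// Only the response fields this sketch reads; not a complete type.
interface ChatResult {
  choices: { message: { role: string; content: string }; finish_reason?: string }[];
  chunks: { score: number; item: { key: string } }[];
}

// Hypothetical helper: pairs the first choice's text with de-duplicated source keys.
function answerWithSources(result: ChatResult): { answer: string; sources: string[] } {
  const answer = result.choices[0]?.message.content ?? "";
  // A Set keeps the first occurrence of each key and preserves insertion order.
  const sources = Array.from(new Set(result.chunks.map((chunk) => chunk.item.key)));
  return { answer, sources };
}

// In a Worker:
//   const response = await env.AI_SEARCH.get("my-instance").chatCompletions({
//     messages: [{ role: "user", content: "What is Cloudflare?" }],
//   });
//   return Response.json(answerWithSources(response));
```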
When stream: true, the method returns a ReadableStream of Server-Sent Events. The retrieved chunks are sent first as a chunks event, followed by the streamed response.

    event: chunks
    data: [{"id":"chunk-001","type":"text","score":0.85,"text":"...","item":{"key":"about-cloudflare.md","timestamp":1775925540000},"scoring_details":{"vector_score":0.85}}]

    data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" document"}}]}

    data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" you provided doesn"}}]}

    data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"'t contain"}}]}

    data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" information"}}]}

    data: [DONE]

The following methods are only available when using the ai_search_namespaces binding. Search and chat across multiple instances in a single call using the namespace handle directly (env.AI_SEARCH).
Pass instance_ids in ai_search_options to specify which instances to query. Results are merged and ranked, and each chunk includes an instance_id field identifying which instance it came from.

    const results = await env.AI_SEARCH.search({
      messages: [{ role: "user", content: "What is Cloudflare?" }],
      ai_search_options: {
        instance_ids: ["product-docs", "customer-abc123"],
      },
    });

Same as instance-level search, with one additional required field:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ai_search_options | object | Yes | Required for namespace-level search. |
| ai_search_options.instance_ids | array | Yes | Instance IDs to search across. Minimum 1, maximum 10. |
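Because instance_ids must contain between 1 and 10 entries, it can help to check the list before making a namespace-level call, especially when the IDs are derived from user input. The following is a minimal sketch; `buildNamespaceSearchRequest` is a hypothetical helper, and it only checks the count, not whether the instances exist.

```typescript
// Hypothetical helper: builds the object passed to env.AI_SEARCH.search()
// for a namespace-level search, enforcing the documented 1 to 10 limit.
function buildNamespaceSearchRequest(question: string, instanceIds: string[]) {
  if (instanceIds.length < 1 || instanceIds.length > 10) {
    throw new RangeError(
      `instance_ids must contain between 1 and 10 entries, got ${instanceIds.length}`,
    );
  }
  return {
    messages: [{ role: "user", content: question }],
    ai_search_options: {
      instance_ids: [...instanceIds], // copy so later mutation does not leak in
    },
  };
}

// In a Worker:
//   const results = await env.AI_SEARCH.search(
//     buildNamespaceSearchRequest("What is Cloudflare?", ["product-docs", "customer-abc123"]),
//   );
```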
Same as instance-level search, with additional fields:
| Field | Type | Description |
|---|---|---|
| chunks[].instance_id | string | The instance this chunk came from. |
| errors | array | Per-instance errors if any instances failed. Each object has instance_id and message. |
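With merged results, the instance_id on each chunk and the errors array make it possible to see which instances contributed and which failed. The following is a minimal sketch that groups chunks per instance and lists failed instance IDs. `splitByInstance` and the types are hypothetical, and the sketch assumes errors may be absent when nothing failed.

```typescript
// Only the namespace-level response fields this sketch reads.
interface NamespaceChunk {
  instance_id: string;
  score: number;
  text: string;
}

interface NamespaceSearchResult {
  chunks: NamespaceChunk[];
  errors?: { instance_id: string; message: string }[]; // assumed optional
}

// Hypothetical helper: groups chunks by instance and collects failed instance IDs.
function splitByInstance(result: NamespaceSearchResult): {
  byInstance: Map<string, NamespaceChunk[]>;
  failedInstances: string[];
} {
  const byInstance = new Map<string, NamespaceChunk[]>();
  for (const chunk of result.chunks) {
    const list = byInstance.get(chunk.instance_id) ?? [];
    list.push(chunk);
    byInstance.set(chunk.instance_id, list);
  }
  const failedInstances = (result.errors ?? []).map((error) => error.instance_id);
  return { byInstance, failedInstances };
}
```

A caller might log `failedInstances` and still return the chunks that did arrive, rather than failing the whole request.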
Generate chat completions using context retrieved from multiple instances.

    const response = await env.AI_SEARCH.chatCompletions({
      messages: [{ role: "user", content: "What is Cloudflare?" }],
      ai_search_options: {
        instance_ids: ["product-docs", "customer-abc123"],
      },
    });

Streaming is supported with stream: true.
Same as instance-level chat completions, with one additional required field:
| Parameter | Type | Required | Description |
|---|---|---|---|
| ai_search_options | object | Yes | Required for namespace-level chat completions. |
| ai_search_options.instance_ids | array | Yes | Instance IDs to search across. Minimum 1, maximum 10. |
Same as instance-level chat completions, with additional fields:
| Field | Type | Description |
|---|---|---|
| chunks[].instance_id | string | The instance this chunk came from. |
| errors | array | Per-instance errors if any instances failed. Each object has instance_id and message. |
Local development is supported by proxying requests to your deployed AI Search instance. Add remote: true to your binding configuration to enable local development with wrangler dev.

    // wrangler.jsonc
    {
      "ai_search": [
        {
          "binding": "MY_SEARCH",
          "instance_name": "my-instance",
          "remote": true,
        },
      ],
    }