Workers binding

Workers provides a serverless execution environment that allows you to create new applications or augment existing ones. Use a Workers binding to search and chat with your AI Search instances from a Cloudflare Worker.

Configure the binding

To use AI Search with Workers, you must create an AI Search binding. You create bindings by updating your Wrangler configuration. AI Search provides two types of bindings:

  • Namespace binding: ai_search_namespaces
  • Instance binding: ai_search

Namespace binding

Access all instances within a namespace. You can get, create, list, and delete instances at runtime.

JSONC
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  "compatibility_date": "2026-03-27",
  "ai_search_namespaces": [
    {
      "binding": "AI_SEARCH",
      "namespace": "my-namespace"
    }
  ]
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| binding | string | Yes | The variable name available on env. For example, "AI_SEARCH" makes it accessible as env.AI_SEARCH. |
| namespace | string | Yes | The namespace to bind to. A default namespace is created automatically for every account. If the namespace does not exist, Wrangler creates it on deploy. |
| remote | boolean | No | Set to true for local development with wrangler dev. |

Instance binding

Bind directly to a single instance in the default namespace. Use this when you know which instance you need at deploy time.

JSONC
{
  "$schema": "./node_modules/wrangler/config-schema.json",
  "compatibility_date": "2026-03-27",
  "ai_search": [
    {
      "binding": "MY_SEARCH",
      "instance_name": "my-instance"
    }
  ]
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| binding | string | Yes | The variable name available on env. For example, "MY_SEARCH" makes it accessible as env.MY_SEARCH. |
| instance_name | string | Yes | The name of the AI Search instance. Must exist in the default namespace at deploy time. |
| remote | boolean | No | Set to true for local development with wrangler dev. |

Instance methods

The following methods are available on both the ai_search_namespaces and ai_search bindings. With the namespace binding, call methods on the handle returned by get(). With the instance binding, call methods directly on the binding (for example, env.MY_SEARCH.search()).

The examples below use the namespace binding.

search()

Search for relevant content chunks from your indexed data source. Returns scored chunks with source references.

TypeScript
const instance = env.AI_SEARCH.get("my-instance");
const results = await instance.search({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
});
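
To show where this call might sit in a complete request handler, here is a minimal sketch. The interfaces are local stand-ins that mirror the shapes described on this page (they are assumptions, not the official Workers type definitions), and the handler name `handleSearch` and the `q` query parameter are hypothetical.

```typescript
// Local stand-in types (assumptions); they only mirror the shapes described on this page.
interface SearchChunk {
  id: string;
  score: number;
  text: string;
  item: { key: string };
}
interface SearchResponse {
  search_query: string;
  chunks: SearchChunk[];
}
interface SearchInstance {
  search(request: {
    messages: { role: string; content: string }[];
  }): Promise<SearchResponse>;
}
interface Env {
  AI_SEARCH: { get(name: string): SearchInstance };
}

// Small JSON response helper.
function json(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json" },
  });
}

// Hypothetical handler: reads ?q=... and returns the matching sources as JSON.
async function handleSearch(request: Request, env: Env): Promise<Response> {
  const q = new URL(request.url).searchParams.get("q")?.trim();
  if (!q) {
    return json({ error: "Missing query parameter: q" }, 400);
  }
  const instance = env.AI_SEARCH.get("my-instance");
  const results = await instance.search({
    messages: [{ role: "user", content: q }],
  });
  return json({
    query: results.search_query,
    sources: results.chunks.map((c) => ({ key: c.item.key, score: c.score })),
  });
}
```

In a Worker, a handler like this would typically be exposed as `export default { fetch: handleSearch }`.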

Parameters

messages array required

An array of message objects representing the conversation. Each message has a role and content field.

  • role string required

    • The role of the message sender. Valid values: system, developer, user, assistant, tool.
  • content string required

    • The content of the message.

query string optional

A simple text query string. Alternative to messages. Provide either query or messages, not both.


ai_search_options object optional

Configuration options for the search operation.

  • retrieval object optional

    • retrieval_type string optional

      • The type of retrieval to perform. Valid values: vector, keyword, hybrid. Defaults to hybrid.
    • match_threshold number optional

      • The minimum match score required for a result to be considered a match. Must be between 0 and 1. Defaults to 0.4.
    • max_num_results integer optional

      • The maximum number of results to return. Must be between 1 and 50. Defaults to 10.
    • filters object optional

      • Filter search results based on metadata. Supports comparison filters (eq, ne, gt, gte, lt, lte) and compound filters (and, or). For more details, refer to Metadata filtering.
    • context_expansion integer optional

      • The number of surrounding chunks to include for additional context. Must be between 0 and 3. Defaults to 0.
    • fusion_method string optional

      • Controls how vector and keyword scores are combined when using hybrid retrieval. Valid values: rrf (Reciprocal Rank Fusion), max (takes the maximum score). Defaults to the instance-level setting.
    • keyword_match_mode string optional

      • Controls how keyword (BM25) matching selects candidate documents. and requires all terms to match. or requires any term to match. Defaults to and.
    • boost_by array optional

      • Boost results by metadata fields. Maximum 3 items. Each item has:
        • field string required - The metadata field name to boost by (for example, timestamp). Maximum 64 characters.
        • direction string optional - The boost direction. Valid values: asc, desc, exists, not_exists. Defaults to asc for numeric fields and exists for text fields.
    • metadata_only boolean optional

      • Return only metadata for each chunk without the text content.
    • return_on_failure boolean optional

      • Whether to return partial results if some processing steps fail. Defaults to true.
  • query_rewrite object optional

    • enabled boolean optional

      • Rewrites the query to improve retrieval accuracy. Defaults to false.
    • model string optional

      • The model to use for query rewriting.
    • rewrite_prompt string optional

      • A custom prompt to guide query rewriting.
  • reranking object optional

    • enabled boolean optional

      • Reorders retrieved results based on semantic relevance using a reranking model. Defaults to false.
    • model string optional

      • The reranking model to use. Valid value: @cf/baai/bge-reranker-base.
    • match_threshold number optional

      • The minimum score for reranked results. Must be between 0 and 1. Defaults to 0.4.
  • cache object optional

    • enabled boolean optional

      • Override the instance-level cache setting for this request.
    • cache_threshold string optional

      • The similarity threshold for cache hits. Valid values: super_strict_match, close_enough, flexible_friend, anything_goes.
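
Several retrieval options have documented ranges, so it can help to check them before sending a request. The following is a minimal sketch of such a check. The `RetrievalOptions` type and the `validateRetrieval` helper are hypothetical; only the field names and limits come from the parameter list above.

```typescript
// Local type for illustration (an assumption, not an official definition).
interface RetrievalOptions {
  retrieval_type?: "vector" | "keyword" | "hybrid";
  match_threshold?: number;
  max_num_results?: number;
  context_expansion?: number;
  boost_by?: {
    field: string;
    direction?: "asc" | "desc" | "exists" | "not_exists";
  }[];
}

// Hypothetical helper: returns a list of problems, empty when all values are in range.
function validateRetrieval(opts: RetrievalOptions): string[] {
  const problems: string[] = [];
  const inRange = (v: number, min: number, max: number) => v >= min && v <= max;
  const intInRange = (v: number, min: number, max: number) =>
    Number.isInteger(v) && inRange(v, min, max);

  if (opts.match_threshold !== undefined && !inRange(opts.match_threshold, 0, 1)) {
    problems.push("match_threshold must be between 0 and 1");
  }
  if (opts.max_num_results !== undefined && !intInRange(opts.max_num_results, 1, 50)) {
    problems.push("max_num_results must be an integer between 1 and 50");
  }
  if (opts.context_expansion !== undefined && !intInRange(opts.context_expansion, 0, 3)) {
    problems.push("context_expansion must be an integer between 0 and 3");
  }
  if (opts.boost_by) {
    if (opts.boost_by.length > 3) {
      problems.push("boost_by accepts at most 3 items");
    }
    for (const boost of opts.boost_by) {
      if (boost.field.length > 64) {
        problems.push("boost_by field names are limited to 64 characters");
      }
    }
  }
  return problems;
}

const retrieval: RetrievalOptions = {
  retrieval_type: "hybrid",
  match_threshold: 0.5,
  max_num_results: 5,
  context_expansion: 1,
  boost_by: [{ field: "timestamp", direction: "desc" }],
};

// Logs an empty array: every value above is within the documented limits.
console.log(validateRetrieval(retrieval));
```

A valid object can then be passed as `ai_search_options: { retrieval }` in a search() call.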

Response

The response contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| search_query | string | The query used for the search, which may be rewritten if query rewriting is enabled. |
| chunks | array | An array of matching content chunks. |
| chunks[].id | string | The unique identifier for the chunk. |
| chunks[].type | string | The type of content, typically text. |
| chunks[].score | number | The overall match score between 0 and 1. |
| chunks[].text | string | The text content of the chunk. |
| chunks[].item | object | Information about the source item. |
| chunks[].item.key | string | The file path or URL of the source document. |
| chunks[].item.timestamp | number | Unix timestamp of when the item was last modified. |
| chunks[].item.metadata | object | Custom metadata associated with the source item. |
| chunks[].scoring_details | object | Breakdown of how the chunk was scored. |
| chunks[].scoring_details.vector_score | number | The semantic similarity score (0 to 1). |
| chunks[].scoring_details.keyword_score | number | The keyword (BM25) match score. Present when using hybrid or keyword retrieval. |
| chunks[].scoring_details.keyword_rank | number | The keyword rank position. |
| chunks[].scoring_details.vector_rank | number | The vector rank position. |
| chunks[].scoring_details.reranking_score | number | The reranking score (0 to 1). Present when reranking is enabled. |
| chunks[].scoring_details.fusion_method | string | The fusion method used (rrf or max). Present when using hybrid retrieval. |
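
Because several chunks can come from the same source document, a caller may want to collapse the result list to one entry per source. The sketch below shows one way to do that using only the chunks[].score and chunks[].item.key fields from the table above. The `Chunk` type and the `bestChunkPerSource` helper are hypothetical.

```typescript
// Local type for illustration; it covers only the fields this sketch reads.
interface Chunk {
  id: string;
  score: number;
  text: string;
  item: { key: string };
}

// Hypothetical helper: keeps the highest-scoring chunk for each source document,
// then orders the survivors from best to worst score.
function bestChunkPerSource(chunks: Chunk[]): Chunk[] {
  const best = new Map<string, Chunk>();
  for (const chunk of chunks) {
    const current = best.get(chunk.item.key);
    if (!current || chunk.score > current.score) {
      best.set(chunk.item.key, chunk);
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```

For example, `bestChunkPerSource(results.chunks)` would yield at most one chunk per chunks[].item.key value.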

chatCompletions()

Generate chat completions using your AI Search instance as context. This method retrieves relevant content and uses it to generate a response.

TypeScript
const instance = env.AI_SEARCH.get("my-instance");
const response = await instance.chatCompletions({
  messages: [
    { role: "system", content: "You are a helpful documentation assistant." },
    { role: "user", content: "What is Cloudflare?" },
  ],
  model: "@cf/meta/llama-3.3-70b-instruct-fp8-fast",
  ai_search_options: {
    retrieval: {
      max_num_results: 5,
    },
    query_rewrite: {
      enabled: true,
    },
  },
});

Stream responses

Set stream: true to receive responses as Server-Sent Events (SSE) as they are generated:

TypeScript
const instance = env.AI_SEARCH.get("my-instance");
const stream = await instance.chatCompletions({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  stream: true,
});
return new Response(stream, {
  headers: {
    "content-type": "text/event-stream",
    "cache-control": "no-cache",
  },
});

When stream is enabled, the method returns a ReadableStream of SSE events. Each event contains a JSON object with choices[0].delta.content for incremental text. The stream ends with a data: [DONE] event.

Parameters

messages array required

An array of message objects representing the conversation. Each message has a role and content field.

  • role string required

    • The role of the message sender. Valid values: system, developer, user, assistant, tool.
  • content string required

    • The content of the message.

model string optional

The text-generation model used to generate responses. Defaults to the generation model configured in the AI Search instance settings. For a list of supported models, refer to Supported models.


stream boolean optional

Returns results as they are generated. When enabled, the method returns a ReadableStream of Server-Sent Events instead of a JSON object. Defaults to false.


ai_search_options object optional

Configuration options for the search and generation operation.

  • retrieval object optional

    • retrieval_type string optional

      • The type of retrieval to perform. Valid values: vector, keyword, hybrid. Defaults to hybrid.
    • match_threshold number optional

      • The minimum match score required for a result to be considered a match. Must be between 0 and 1. Defaults to 0.4.
    • max_num_results integer optional

      • The maximum number of results to return. Must be between 1 and 50. Defaults to 10.
    • filters object optional

      • Filter search results based on metadata. Supports comparison filters (eq, ne, gt, gte, lt, lte) and compound filters (and, or). For more details, refer to Metadata filtering.
    • context_expansion integer optional

      • The number of surrounding chunks to include for additional context. Must be between 0 and 3. Defaults to 0.
    • fusion_method string optional

      • Controls how vector and keyword scores are combined when using hybrid retrieval. Valid values: rrf (Reciprocal Rank Fusion), max (takes the maximum score). Defaults to the instance-level setting.
    • keyword_match_mode string optional

      • Controls how keyword (BM25) matching selects candidate documents. and requires all terms to match. or requires any term to match. Defaults to and.
    • boost_by array optional

      • Boost results by metadata fields. Maximum 3 items. Each item has:
        • field string required - The metadata field name to boost by (for example, timestamp). Maximum 64 characters.
        • direction string optional - The boost direction. Valid values: asc, desc, exists, not_exists. Defaults to asc for numeric fields and exists for text fields.
    • metadata_only boolean optional

      • Return only metadata for each chunk without the text content.
    • return_on_failure boolean optional

      • Whether to return partial results if some processing steps fail. Defaults to true.
  • query_rewrite object optional

    • enabled boolean optional

      • Rewrites the query to improve retrieval accuracy. Defaults to false.
    • model string optional

      • The model to use for query rewriting.
    • rewrite_prompt string optional

      • A custom prompt to guide query rewriting.
  • reranking object optional

    • enabled boolean optional

      • Reorders retrieved results based on semantic relevance using a reranking model. Defaults to false.
    • model string optional

      • The reranking model to use. Valid value: @cf/baai/bge-reranker-base.
    • match_threshold number optional

      • The minimum score for reranked results. Must be between 0 and 1. Defaults to 0.4.
  • cache object optional

    • enabled boolean optional

      • Override the instance-level cache setting for this request.
    • cache_threshold string optional

      • The similarity threshold for cache hits. Valid values: super_strict_match, close_enough, flexible_friend, anything_goes.

Response (non-streaming)

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the completion. |
| object | string | Always chat.completion. |
| created | number | Unix timestamp of when the completion was created. |
| model | string | The model used to generate the response. |
| choices | array | Array of completion choices. |
| choices[].message.role | string | Always assistant. |
| choices[].message.content | string | The generated response text. |
| choices[].finish_reason | string | Why the model stopped generating. Typically stop. |
| usage.prompt_tokens | number | Number of tokens in the prompt. |
| usage.completion_tokens | number | Number of tokens in the generated response. |
| usage.total_tokens | number | Total tokens used. |
| chunks | array | The source chunks used as context. Same format as the search response. |
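
Since the non-streaming response carries both the generated text and the chunks used as context, the two can be combined into an answer with a source list. The following is a minimal sketch under that reading of the table above. The `ChatCompletion` type and the `formatAnswer` helper are hypothetical and cover only the fields they read.

```typescript
// Local type for illustration; only the fields this sketch needs.
interface ChatCompletion {
  choices: { message: { role: string; content: string }; finish_reason: string }[];
  chunks: { item: { key: string } }[];
}

// Hypothetical helper: returns the first choice's text followed by a
// de-duplicated list of source keys, in the order the chunks were returned.
function formatAnswer(response: ChatCompletion): string {
  const answer = response.choices[0]?.message.content ?? "";
  const sources = Array.from(new Set(response.chunks.map((c) => c.item.key)));
  if (sources.length === 0) {
    return answer;
  }
  return answer + "\n\nSources:\n" + sources.map((s) => "- " + s).join("\n");
}
```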

Response (streaming)

When stream: true, the method returns a ReadableStream of Server-Sent Events. The retrieved chunks are sent first as a chunks event, followed by the streamed response.

event: chunks
data: [{"id":"chunk-001","type":"text","score":0.85,"text":"...","item":{"key":"about-cloudflare.md","timestamp":1775925540000},"scoring_details":{"vector_score":0.85}}]
data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" document"}}]}
data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" you provided doesn"}}]}
data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"'t contain"}}]}
data: {"id":"id-1776072781845","created":1776072781,"model":"@cf/meta/llama-3.3-70b-instruct-fp8-fast","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" information"}}]}
data: [DONE]
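
If a Worker needs the full text rather than passing the stream through to the client, the events can be read and accumulated. The sketch below parses the format shown in the sample above: an `event: chunks` line followed by one data line, then delta events, then `data: [DONE]`. It assumes the stream yields UTF-8 bytes, which this page does not state, and the `readChatStream` helper and `StreamResult` type are hypothetical.

```typescript
interface StreamResult {
  chunks: unknown[]; // payload of the "chunks" event
  text: string;      // concatenated choices[0].delta.content values
}

// Hypothetical helper; assumes the stream yields UTF-8 encoded bytes.
async function readChatStream(
  stream: ReadableStream<Uint8Array>,
): Promise<StreamResult> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  const result: StreamResult = { chunks: [], text: "" };
  let buffer = "";
  let eventName = "";

  // Processes one complete line; returns true once the [DONE] marker is seen.
  const handleLine = (line: string): boolean => {
    if (line.startsWith("event:")) {
      eventName = line.slice(6).trim();
      return false;
    }
    if (!line.startsWith("data:")) {
      if (line === "") eventName = ""; // a blank line ends the current event
      return false;
    }
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") return true;
    const parsed = JSON.parse(payload);
    if (eventName === "chunks") {
      result.chunks = Array.isArray(parsed) ? parsed : [];
      eventName = ""; // the sample shows one data line per chunks event
    } else {
      result.text += parsed.choices?.[0]?.delta?.content ?? "";
    }
    return false;
  };

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep any incomplete trailing line for the next read
    for (const line of lines) {
      if (handleLine(line.replace(/\r$/, ""))) {
        await reader.cancel();
        return result;
      }
    }
  }
  if (buffer) handleLine(buffer); // final line without a trailing newline
  return result;
}
```

Lines are buffered across reads because a single read may end in the middle of an event.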

Namespace methods

The following methods are only available when using the ai_search_namespaces binding. Search and chat across multiple instances in a single call using the namespace handle directly (env.AI_SEARCH).

search()

Pass instance_ids in ai_search_options to specify which instances to query. Results are merged and ranked, and each chunk includes an instance_id field identifying which instance it came from.

TypeScript
const results = await env.AI_SEARCH.search({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  ai_search_options: {
    instance_ids: ["product-docs", "customer-abc123"],
  },
});

Parameters

Same as instance-level search, except that ai_search_options is required and must include instance_ids:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| ai_search_options | object | Yes | Required for namespace-level search. |
| ai_search_options.instance_ids | array | Yes | Instance IDs to search across. Minimum 1, maximum 10. |

Response

Same as instance-level search, with additional fields:

| Field | Type | Description |
| --- | --- | --- |
| chunks[].instance_id | string | The instance this chunk came from. |
| errors | array | Per-instance errors if any instances failed. Each object has instance_id and message. |
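
Because a namespace-level search can succeed for some instances and fail for others, it is worth checking errors alongside chunks. The sketch below tallies chunks per instance and attaches any reported error. The types and the `summarizeByInstance` helper are hypothetical, and treating errors as optional is an assumption based on the "if any instances failed" wording.

```typescript
// Local types for illustration; only the fields this sketch reads.
interface NamespaceSearchResponse {
  chunks: { instance_id: string; score: number; text: string }[];
  errors?: { instance_id: string; message: string }[]; // assumed optional
}
interface InstanceSummary {
  chunkCount: number;
  error?: string;
}

// Hypothetical helper: one summary entry per requested instance ID.
function summarizeByInstance(
  requested: string[],
  response: NamespaceSearchResponse,
): Map<string, InstanceSummary> {
  // instance_ids is documented as minimum 1, maximum 10.
  if (requested.length < 1 || requested.length > 10) {
    throw new RangeError("instance_ids must contain between 1 and 10 entries");
  }
  const summary = new Map<string, InstanceSummary>();
  for (const id of requested) {
    summary.set(id, { chunkCount: 0 });
  }
  for (const chunk of response.chunks) {
    const entry = summary.get(chunk.instance_id);
    if (entry) entry.chunkCount += 1;
  }
  for (const err of response.errors ?? []) {
    const entry = summary.get(err.instance_id);
    if (entry) entry.error = err.message;
  }
  return summary;
}
```

An instance with a chunkCount of 0 and no error simply had no matching content, while one with an error did not contribute to the merged results.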

chatCompletions()

Generate chat completions using context retrieved from multiple instances.

TypeScript
const response = await env.AI_SEARCH.chatCompletions({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  ai_search_options: {
    instance_ids: ["product-docs", "customer-abc123"],
  },
});

Streaming is supported with stream: true.

Parameters

Same as instance-level chat completions, except that ai_search_options is required and must include instance_ids:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| ai_search_options | object | Yes | Required for namespace-level chat completions. |
| ai_search_options.instance_ids | array | Yes | Instance IDs to search across. Minimum 1, maximum 10. |

Response

Same as instance-level chat completions, with additional fields:

| Field | Type | Description |
| --- | --- | --- |
| chunks[].instance_id | string | The instance this chunk came from. |
| errors | array | Per-instance errors if any instances failed. Each object has instance_id and message. |

Local development

Local development is supported by proxying requests to your deployed AI Search instance. Add remote: true to your binding configuration to enable local development with wrangler dev.

JSONC
// wrangler.jsonc
{
  "ai_search": [
    {
      "binding": "MY_SEARCH",
      "instance_name": "my-instance",
      "remote": true,
    },
  ],
}