Result filtering
Result filtering controls how many results are returned and the minimum score required. To filter results by metadata attributes like folder or category, refer to Metadata.
The match_threshold sets the minimum vector similarity score that a chunk must meet to be included in the results. Threshold values range from 0 to 1. The threshold filters on the vector similarity score, not the fused score returned in the response.
- A higher threshold means stricter filtering, returning only highly similar matches.
- A lower threshold allows broader matches, increasing recall but possibly reducing precision.
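The trade-off above can be illustrated with a small local sketch. The `ScoredChunk` type, the `applyMatchThreshold` helper, and the sample scores are hypothetical and exist only for illustration. They are not part of the AI Search API, which applies the threshold for you.

```typescript
// Hypothetical shape for illustration only; not the AI Search response type.
interface ScoredChunk {
  id: string;
  vectorScore: number; // vector similarity, assumed to lie in [0, 1]
}

// Keep only chunks whose vector similarity meets the threshold.
// Chunks scoring below the threshold are dropped.
function applyMatchThreshold(
  chunks: ScoredChunk[],
  matchThreshold: number,
): ScoredChunk[] {
  return chunks.filter((chunk) => chunk.vectorScore >= matchThreshold);
}

// Made-up scores for illustration.
const sampleChunks: ScoredChunk[] = [
  { id: "a", vectorScore: 0.91 },
  { id: "b", vectorScore: 0.62 },
  { id: "c", vectorScore: 0.35 },
];

// A strict threshold keeps only the closest match.
const strictMatches = applyMatchThreshold(sampleChunks, 0.8);

// A loose threshold keeps every chunk, including weaker matches.
const broadMatches = applyMatchThreshold(sampleChunks, 0.3);
```

With these sample scores, the strict threshold keeps one chunk and the loose threshold keeps all three.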
The max_num_results setting controls the number of top-matching chunks returned. The maximum allowed value is 50.
- Use a higher value if you want to synthesize across multiple documents. However, providing more input to the model can increase latency and cost.
- Use a lower value if you prefer concise answers with minimal context.
When a query runs, these settings are applied in the following order:
- Your query is embedded using the configured embedding model.
- The search index is queried. For hybrid search, vector and keyword results are fused into a single ranked list.
- Chunks with a vector similarity score below match_threshold are filtered out.
- The filtered results are limited to max_num_results and passed into the generation step as context.
If no results meet the threshold, AI Search will not generate a response.
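The filtering and limiting steps can be sketched locally as follows. This is a simplified model under stated assumptions, not the service's implementation: the `RankedChunk` type, the `selectContext` helper, and all score values are hypothetical, and the input list is assumed to be already ordered by fused score.

```typescript
// Hypothetical shape for illustration only.
interface RankedChunk {
  id: string;
  vectorScore: number; // used for threshold filtering
  fusedScore: number; // used for ranking in hybrid search
}

// Assumes `ranked` is already sorted by fusedScore, highest first.
function selectContext(
  ranked: RankedChunk[],
  matchThreshold: number,
  maxNumResults: number,
): RankedChunk[] {
  return ranked
    // Filter on the vector similarity score, not the fused score.
    .filter((chunk) => chunk.vectorScore >= matchThreshold)
    // Keep at most maxNumResults of the remaining chunks.
    .slice(0, maxNumResults);
}

// Made-up scores for illustration.
const rankedChunks: RankedChunk[] = [
  { id: "doc-1", vectorScore: 0.82, fusedScore: 0.95 },
  { id: "doc-2", vectorScore: 0.3, fusedScore: 0.9 }, // strong keyword hit, weak vector match
  { id: "doc-3", vectorScore: 0.71, fusedScore: 0.6 },
  { id: "doc-4", vectorScore: 0.55, fusedScore: 0.4 },
];

const context = selectContext(rankedChunks, 0.5, 2);

// An empty context models the case where no response is generated.
const shouldGenerate = context.length > 0;
```

In this sketch, `doc-2` ranks second by fused score but is still removed, because its vector similarity falls below the threshold.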
If reranking is enabled, a separate reranking.match_threshold can be configured to filter chunks by their reranking score.
These values can be configured at the instance level or overridden per request:
const instance = env.AI_SEARCH.get("my-instance");
const results = await instance.search({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  ai_search_options: {
    retrieval: {
      match_threshold: 0.5,
      max_num_results: 10,
    },
  },
});
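Because match_threshold ranges from 0 to 1 and max_num_results is capped at 50, it may be useful to check values before sending a request. The helper below is a hypothetical sketch, not part of the AI Search API. It assumes max_num_results must be a whole number of at least 1, which the text above does not state, and it makes no claim about how the service itself handles out-of-range values.

```typescript
// Mirrors the retrieval options used in the request above.
interface RetrievalOptions {
  match_threshold: number;
  max_num_results: number;
}

// Documented upper bound for max_num_results.
const MAX_NUM_RESULTS_LIMIT = 50;

// Hypothetical helper: throws a RangeError if a value is outside the
// documented range, otherwise returns the options unchanged.
function validateRetrievalOptions(options: RetrievalOptions): RetrievalOptions {
  const { match_threshold, max_num_results } = options;

  if (!Number.isFinite(match_threshold) || match_threshold < 0 || match_threshold > 1) {
    throw new RangeError("match_threshold must be between 0 and 1");
  }

  // Assumption: the lower bound of 1 and the integer requirement are not
  // stated in the documentation.
  if (
    !Number.isInteger(max_num_results) ||
    max_num_results < 1 ||
    max_num_results > MAX_NUM_RESULTS_LIMIT
  ) {
    throw new RangeError(
      `max_num_results must be an integer between 1 and ${MAX_NUM_RESULTS_LIMIT}`,
    );
  }

  return options;
}

const checkedOptions = validateRetrievalOptions({
  match_threshold: 0.5,
  max_num_results: 10,
});
```

The validated object can then be passed as the `retrieval` value inside `ai_search_options`.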