Filtering
Metadata filtering narrows down search results based on metadata, so only relevant content is retrieved. The filter is applied before retrieval, so you only query the documents that matter.
Filtering uses the metadata attributes extracted during indexing. To define custom attributes or use the built-in metadata attributes, refer to Metadata attributes.
Here is an example of metadata filtering using the Workers binding:
```js
const instance = env.AI_SEARCH.get("my-instance");

const results = await instance.search({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  ai_search_options: {
    retrieval: {
      filters: {
        folder: "docs/getting-started/",
        timestamp: { $gte: 1735689600 },
      },
    },
  },
});
```

Filters are JSON objects where keys are metadata attribute names and values specify the filter condition.
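The `timestamp` values in these examples look like Unix timestamps in seconds (1735689600 is 2025-01-01T00:00:00Z). Assuming that is the unit the attribute is compared in, a small helper can build the condition from readable dates. The names `toUnixSeconds` and `timestampRange` are hypothetical and are not part of the binding.

```javascript
// Hypothetical helper: convert a Date or ISO 8601 string to Unix seconds.
// Assumes the timestamp attribute is compared in seconds, not milliseconds.
function toUnixSeconds(date) {
  const ms = new Date(date).getTime();
  if (Number.isNaN(ms)) {
    throw new RangeError(`Invalid date: ${date}`);
  }
  return Math.floor(ms / 1000);
}

// Hypothetical helper: build a half-open range [start, end) for a filter.
// When `end` is omitted, only the lower bound is set.
function timestampRange(start, end) {
  const condition = { $gte: toUnixSeconds(start) };
  if (end !== undefined) {
    condition.$lt = toUnixSeconds(end);
  }
  return condition;
}

const filters = {
  folder: "docs/getting-started/",
  timestamp: timestampRange("2025-01-01T00:00:00Z"),
};
// filters.timestamp is { $gte: 1735689600 }
```

The resulting `filters` object can be passed as `ai_search_options.retrieval.filters` in the same way as the literal object in the example above.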
| Operator | Description |
|---|---|
| `$eq` | Equals |
| `$ne` | Not equals |
| `$in` | In (matches any value in array) |
| `$nin` | Not in (excludes values in array) |
| `$lt` | Less than |
| `$lte` | Less than or equal to |
| `$gt` | Greater than |
| `$gte` | Greater than or equal to |
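To make the operator semantics concrete, the following is a minimal local sketch that evaluates a filter object against one document's metadata. It only illustrates the table above and is not how AI Search evaluates filters internally. The function name `matchesFilters` is hypothetical, and the use of JavaScript's built-in comparison for strings and numbers is an assumption.

```javascript
// One comparison function per operator from the table above.
const OPERATORS = {
  $eq: (value, operand) => value === operand,
  $ne: (value, operand) => value !== operand,
  $in: (value, operand) => operand.includes(value),
  $nin: (value, operand) => !operand.includes(value),
  $lt: (value, operand) => value < operand,
  $lte: (value, operand) => value <= operand,
  $gt: (value, operand) => value > operand,
  $gte: (value, operand) => value >= operand,
};

// Hypothetical helper: returns true when `metadata` satisfies every
// condition in `filters`. A bare value is treated as an equality check.
function matchesFilters(metadata, filters) {
  return Object.entries(filters).every(([attribute, condition]) => {
    const value = metadata[attribute];
    const isOperatorObject =
      condition !== null &&
      typeof condition === "object" &&
      !Array.isArray(condition);
    if (!isOperatorObject) {
      return value === condition;
    }
    // Every operator listed for this attribute must hold.
    return Object.entries(condition).every(([operator, operand]) => {
      const compare = OPERATORS[operator];
      if (!compare) {
        throw new Error(`Unsupported operator: ${operator}`);
      }
      return compare(value, operand);
    });
  });
}

const doc = { folder: "docs/guides/", timestamp: 1735700000 };

matchesFilters(doc, { folder: { $in: ["docs/guides/", "docs/tutorials/"] } }); // true
matchesFilters(doc, { timestamp: { $gte: 1735689600, $lt: 1735900000 } }); // true
matchesFilters(doc, { folder: "docs/" }); // false
```

A sketch like this can be useful in unit tests to check that a filter object selects the documents you expect before you send it with a search request.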
When you provide a direct value without an operator, it is treated as an equality check:
```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": "docs/getting-started/"
      }
    }
  }
}
```

This is equivalent to:

```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": { "$eq": "docs/getting-started/" }
      }
    }
  }
}
```

Combine upper and lower bound operators to filter by ranges:

```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "timestamp": { "$gte": 1735689600, "$lt": 1735900000 }
      }
    }
  }
}
```

When you specify multiple keys, all conditions must match:

```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": "docs/getting-started/",
        "timestamp": { "$gte": 1735689600 }
      }
    }
  }
}
```

Match any value in an array:

```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": { "$in": ["docs/guides/", "docs/tutorials/"] }
      }
    }
  }
}
```

Use range queries to filter for all files within a folder and its subfolders.
For example, consider this file structure:
- docs/
  - guide.pdf
  - tutorials/
    - getting-started/
      - intro.pdf
Using { "folder": "docs/" } only matches files directly in that folder (like guide.pdf), not files in subfolders.
To match all files starting with docs/, use a range query:
```json
{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": { "$gte": "docs/", "$lt": "docs0" }
      }
    }
  }
}
```

This works because:

- `$gte` includes all paths starting with `docs/`
- `$lt` with `docs0` excludes paths that do not start with `docs/` (since `0` comes after `/` in ASCII)
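The upper bound does not have to be worked out by hand. As a minimal sketch, the helper below derives it by replacing the last character of the prefix with the next character code. The name `folderPrefixRange` is hypothetical, and the sketch assumes the service orders strings the same way JavaScript compares them (by character code), which holds for the ASCII paths used here.

```javascript
// Hypothetical helper: build a { $gte, $lt } range that matches every
// path starting with `prefix`. Assumes character-code string ordering.
function folderPrefixRange(prefix) {
  if (prefix.length === 0) {
    throw new RangeError("prefix must not be empty");
  }
  const lastCode = prefix.charCodeAt(prefix.length - 1);
  // The smallest string that sorts after every string starting with
  // `prefix` is the prefix with its last character incremented by one.
  const upperBound =
    prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
  return { $gte: prefix, $lt: upperBound };
}

const range = folderPrefixRange("docs/");
// range is { $gte: "docs/", $lt: "docs0" }

// Local check of which folder values fall inside the range.
const inRange = (folder) => folder >= range.$gte && folder < range.$lt;

inRange("docs/"); // true
inRange("docs/tutorials/getting-started/"); // true
inRange("documents/"); // false
```

The sketch does not handle a prefix whose last character is already the highest possible code unit, which is unlikely for folder paths ending in `/`.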