
Filtering

Metadata filtering narrows search results by metadata attributes, so only relevant content is retrieved. Filters are applied before retrieval, so the query runs only against documents that match.

Filtering uses the metadata attributes extracted during indexing. To define custom attributes or use the built-in metadata attributes, refer to Metadata attributes.

Here is an example of metadata filtering using the Workers binding:

TypeScript
const instance = env.AI_SEARCH.get("my-instance");
const results = await instance.search({
  messages: [{ role: "user", content: "What is Cloudflare?" }],
  ai_search_options: {
    retrieval: {
      filters: {
        folder: "docs/getting-started/",
        timestamp: { $gte: 1735689600 },
      },
    },
  },
});

Filter syntax

Filters are JSON objects where keys are metadata attribute names and values specify the filter condition.

Supported operators

Operator   Description
$eq        Equals
$ne        Not equals
$in        In (matches any value in array)
$nin       Not in (excludes values in array)
$lt        Less than
$lte       Less than or equal to
$gt        Greater than
$gte       Greater than or equal to

Implicit $eq

When you provide a direct value without an operator, it is treated as an equality check:

{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": "docs/getting-started/" }
    }
  }
}

This is equivalent to:

{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$eq": "docs/getting-started/" } }
    }
  }
}

Range queries

Combine upper and lower bound operators to filter by ranges:

{
  "ai_search_options": {
    "retrieval": {
      "filters": { "timestamp": { "$gte": 1735689600, "$lt": 1735900000 } }
    }
  }
}
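
The bounds of a timestamp range can be computed from calendar dates instead of hard-coded. The following is a minimal sketch, assuming the timestamp attribute is compared as Unix time in whole seconds, as the 10-digit values in the examples on this page suggest. The toUnixSeconds helper is a hypothetical name for illustration, not part of the AI Search API.

```typescript
// Hypothetical helper: converts a Date to whole Unix seconds.
function toUnixSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

// Range covering January 2025 in UTC: inclusive lower bound, exclusive upper bound.
const from = toUnixSeconds(new Date(Date.UTC(2025, 0, 1))); // 1735689600
const to = toUnixSeconds(new Date(Date.UTC(2025, 1, 1))); // 1738368000

// Assumed to be passed as ai_search_options.retrieval.filters.
const timestampFilters = { timestamp: { $gte: from, $lt: to } };
```

Pairing an inclusive lower bound ($gte) with an exclusive upper bound ($lt) lets adjacent ranges share a boundary value without any file matching both.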

Multiple conditions (implicit AND)

When you specify multiple keys, all conditions must match:

{
  "ai_search_options": {
    "retrieval": {
      "filters": {
        "folder": "docs/getting-started/",
        "timestamp": { "$gte": 1735689600 }
      }
    }
  }
}

$in operator

Match any value in an array:

{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$in": ["docs/guides/", "docs/tutorials/"] } }
    }
  }
}
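
To make the semantics of the filter syntax concrete, the sketch below evaluates a filter object against one document's metadata locally. It is only a model of the rules described on this page (implicit $eq, the eight operators, implicit AND across keys), not how AI Search implements filtering. All type and function names are hypothetical, and the handling of a missing attribute and of mixed value types are assumptions.

```typescript
// Local model of the filter semantics described on this page.
// Illustrative only: the real filtering happens inside AI Search.
type Scalar = string | number;

interface OperatorCondition {
  $eq?: Scalar;
  $ne?: Scalar;
  $in?: Scalar[];
  $nin?: Scalar[];
  $lt?: Scalar;
  $lte?: Scalar;
  $gt?: Scalar;
  $gte?: Scalar;
}

type Filters = Record<string, Scalar | OperatorCondition>;
type Metadata = Record<string, Scalar>;

// Ordered comparison. Assumption: values of different types never
// satisfy a range operator in this sketch.
function ordered(a: Scalar, b: Scalar, test: (sign: number) => boolean): boolean {
  if (typeof a !== typeof b) return false;
  return test(a < b ? -1 : a > b ? 1 : 0);
}

function matchesCondition(value: Scalar, condition: Scalar | OperatorCondition): boolean {
  // A bare value is shorthand for { $eq: value }.
  const c: OperatorCondition =
    typeof condition === "object" ? condition : { $eq: condition };
  if (c.$eq !== undefined && value !== c.$eq) return false;
  if (c.$ne !== undefined && value === c.$ne) return false;
  if (c.$in !== undefined && c.$in.indexOf(value) === -1) return false;
  if (c.$nin !== undefined && c.$nin.indexOf(value) !== -1) return false;
  if (c.$lt !== undefined && !ordered(value, c.$lt, (s) => s < 0)) return false;
  if (c.$lte !== undefined && !ordered(value, c.$lte, (s) => s <= 0)) return false;
  if (c.$gt !== undefined && !ordered(value, c.$gt, (s) => s > 0)) return false;
  if (c.$gte !== undefined && !ordered(value, c.$gte, (s) => s >= 0)) return false;
  return true;
}

// Every key in `filters` must match (implicit AND).
// Assumption: a document missing a filtered attribute does not match.
function matchesFilters(metadata: Metadata, filters: Filters): boolean {
  return Object.keys(filters).every((key) => {
    const value = metadata[key];
    return value !== undefined && matchesCondition(value, filters[key]);
  });
}

const doc: Metadata = { folder: "docs/getting-started/", timestamp: 1735700000 };

// Both conditions hold, so the document matches.
const matchesBoth = matchesFilters(doc, {
  folder: "docs/getting-started/",
  timestamp: { $gte: 1735689600 },
}); // true

// The folder is not in the $in list, so the document is excluded.
const matchesIn = matchesFilters(doc, {
  folder: { $in: ["docs/guides/", "docs/tutorials/"] },
}); // false
```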

"Starts with" filter for folders

Use range queries to filter for all files within a folder and its subfolders.

For example, consider this file structure:

  • docs/
    • guide.pdf
    • tutorials/
      • getting-started/
        • intro.pdf

Using { "folder": "docs/" } only matches files directly in that folder (like guide.pdf), not files in subfolders.

To match all files starting with docs/, use a range query:

{
  "ai_search_options": {
    "retrieval": {
      "filters": { "folder": { "$gte": "docs/", "$lt": "docs0" } }
    }
  }
}

This works because:

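  • $gte with docs/ includes every path that sorts at or after docs/, which covers all paths starting with docs/
  • $lt with docs0 excludes every path that sorts at or after docs0, leaving only paths that start with docs/ (0 is the character immediately after / in ASCII)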
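
The same bounds can be derived for any folder prefix by replacing the last character of the prefix with the next character. The following is a minimal sketch, assuming strings are compared character by character in ASCII order as described above. The prefixRange helper is a hypothetical name for illustration, not part of the AI Search API.

```typescript
// Hypothetical helper: builds the range condition that matches every
// string starting with `prefix`, assuming ASCII-order string comparison.
type PrefixRange = { $gte: string; $lt: string };

function prefixRange(prefix: string): PrefixRange {
  if (prefix.length === 0) {
    throw new Error("prefix must not be empty");
  }
  const lastCode = prefix.charCodeAt(prefix.length - 1);
  // Upper bound: the prefix with its final character replaced by the next one.
  // For "docs/" this turns "/" (code 47) into "0" (code 48).
  const upper = prefix.slice(0, -1) + String.fromCharCode(lastCode + 1);
  return { $gte: prefix, $lt: upper };
}

// Assumed to be passed as ai_search_options.retrieval.filters.
const folderFilters = { folder: prefixRange("docs/") };
// folderFilters.folder is { $gte: "docs/", $lt: "docs0" }
```

With these bounds, docs/tutorials/getting-started/ falls inside the range, while sibling folders such as docs-old/ or docsx/ fall outside it.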