---
title: Metadata
description: Use metadata to filter documents before retrieval and provide context to guide AI responses. This page covers built-in metadata attributes, custom metadata schemas, and filter syntax.
image: https://developers.cloudflare.com/dev-products-preview.png
---

[Skip to content](#%5Ftop) 

Was this helpful?

YesNo

[ Edit page ](https://github.com/cloudflare/cloudflare-docs/edit/production/src/content/docs/ai-search/configuration/indexing/metadata.mdx) [ Report issue ](https://github.com/cloudflare/cloudflare-docs/issues/new/choose) 

Copy page

# Metadata

Use metadata to filter documents before retrieval and provide context to guide AI responses. This page covers built-in metadata attributes, custom metadata schemas, and filter syntax.

## Built-in metadata attributes

AI Search automatically extracts the following metadata attributes from your indexed documents:

| Attribute | Description                                                                                         | Example                                                                 |
| --------- | --------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- |
| filename  | The name of the file.                                                                               | guide.pdf or docs/getting-started/guide.pdf                             |
| folder    | The folder or prefix to the object.                                                                 | For docs/getting-started/guide.pdf, the folder is docs/getting-started/ |
| timestamp | Unix timestamp (milliseconds) when the object was last modified. Comparisons round down to seconds. | 1735689600000 (2025-01-01 00:00:00 UTC)                                 |

## Custom metadata

Custom metadata allows you to define additional fields for filtering search results. You can attach structured metadata to documents and filter queries by attributes such as category, version, or any custom field.

### Supported data types

| Type     | Description                        | Example values               |
| -------- | ---------------------------------- | ---------------------------- |
| text     | String values (max 500 characters) | "documentation", "blog-post" |
| number   | Numeric values (parsed as float)   | 2.5, 100, \-3.14             |
| boolean  | Boolean values                     | true, false, 1, 0, yes, no   |
| datetime | Date and time values               | "2026-01-15T00:00:00Z"       |

### Define a schema

Before custom metadata can be extracted, define a schema in your AI Search configuration using the `custom_metadata` field. The schema specifies which fields to extract and their data types.

TypeScript

```

custom_metadata: [

  { field_name: "category", data_type: "text" },

  { field_name: "version", data_type: "number" },

  { field_name: "is_public", data_type: "boolean" },

];


```

**Schema constraints:**

* Maximum of 5 custom metadata fields per AI Search instance
* Field names are case-insensitive and stored as lowercase
* Field names cannot use reserved names: `timestamp`, `folder`, `filename`
* Text values are truncated to 500 characters
* Changing the schema triggers a full re-index of all documents

### Add custom metadata to documents

How you attach custom metadata depends on your data source:

* **R2 bucket**: Set metadata using S3-compatible custom headers (`x-amz-meta-*`). Refer to [R2 custom metadata](https://developers.cloudflare.com/ai-search/configuration/data-source/r2/#custom-metadata) for examples.
* **Website**: Add `<meta>` tags to your HTML pages. Refer to [Website custom metadata](https://developers.cloudflare.com/ai-search/configuration/data-source/website/#custom-metadata) for details.
* **Built-in storage**: Attach metadata when uploading files through the [Items API](https://developers.cloudflare.com/ai-search/api/items/workers-binding/#upload-with-metadata).

## Metadata filtering

Metadata filtering narrows down search results based on metadata, so only relevant content is retrieved. The filter is applied before retrieval, so you only query the documents that matter.

Note

If you are using the legacy AutoRAG API, refer to [Metadata filter format (legacy)](https://developers.cloudflare.com/ai-search/api/migration/autorag-filter-format/) for the filter syntax.

Here is an example of metadata filtering using the [Workers binding](https://developers.cloudflare.com/ai-search/api/search/workers-binding/):

TypeScript

```

const instance = env.AI_SEARCH.get("my-instance");


const results = await instance.search({

  messages: [{ role: "user", content: "What is Cloudflare?" }],

  ai_search_options: {

    retrieval: {

      filters: {

        folder: "docs/getting-started/",

        timestamp: { $gte: 1735689600 },

      },

    },

  },

});


```

Explain Code

### Filter syntax

Filters are JSON objects where keys are metadata attribute names and values specify the filter condition.

#### Supported operators

| Operator | Description                       |
| -------- | --------------------------------- |
| $eq      | Equals                            |
| $ne      | Not equals                        |
| $in      | In (matches any value in array)   |
| $nin     | Not in (excludes values in array) |
| $lt      | Less than                         |
| $lte     | Less than or equal to             |
| $gt      | Greater than                      |
| $gte     | Greater than or equal to          |

#### Implicit `$eq`

When you provide a direct value without an operator, it is treated as an equality check:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": { "folder": "docs/getting-started/" }

    }

  }

}


```

This is equivalent to:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": { "folder": { "$eq": "docs/getting-started/" } }

    }

  }

}


```

#### Range queries

Combine upper and lower bound operators to filter by ranges:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": { "timestamp": { "$gte": 1735689600, "$lt": 1735900000 } }

    }

  }

}


```

#### Multiple conditions (implicit AND)

When you specify multiple keys, all conditions must match:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": {

        "folder": "docs/getting-started/",

        "timestamp": { "$gte": 1735689600 }

      }

    }

  }

}


```

Explain Code

#### `$in` operator

Match any value in an array:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": { "folder": { "$in": ["docs/guides/", "docs/tutorials/"] } }

    }

  }

}


```

### "Starts with" filter for folders

Use range queries to filter for all files within a folder and its subfolders.

For example, consider this file structure:

* Directorydocs  
   * guide.pdf  
   * Directorytutorials  
         * Directorygetting-started  
                  * intro.pdf

Using `{ "folder": "docs/" }` only matches files directly in that folder (like `guide.pdf`), not files in subfolders.

To match all files starting with `docs/`, use a range query:

```

{

  "ai_search_options": {

    "retrieval": {

      "filters": { "folder": { "$gte": "docs/", "$lt": "docs0" } }

    }

  }

}


```

This works because:

* `$gte` includes all paths starting with `docs/`
* `$lt` with `docs0` excludes paths that do not start with `docs/` (since `0` comes after `/` in ASCII)

## Re-indexing behavior

When you modify the `custom_metadata` schema:

1. New fields are added to the search index.
2. Removed fields are deleted from the search index.
3. A full re-index is triggered for all documents.
4. Existing vectors are updated with the new metadata structure.

## Limitations

| Constraint                | Limit                       |
| ------------------------- | --------------------------- |
| Maximum custom fields     | 5 per AI Search instance    |
| Maximum text value length | 500 characters              |
| Reserved field names      | timestamp, folder, filename |
| Field name matching       | Case-insensitive            |

If file metadata exceeds size limits, the metadata is replaced with an error indicator:

```

{

  "file": { "error": "metadata is too large" }

}


```

To avoid this, keep individual metadata values concise.

```json
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/ai-search/","name":"AI Search"}},{"@type":"ListItem","position":3,"item":{"@id":"/ai-search/configuration/","name":"Configuration"}},{"@type":"ListItem","position":4,"item":{"@id":"/ai-search/configuration/indexing/","name":"Indexing"}},{"@type":"ListItem","position":5,"item":{"@id":"/ai-search/configuration/indexing/metadata/","name":"Metadata"}}]}
```
