Syncing
AI Search automatically indexes your content for search. How indexing works depends on your data source.
For instances connected to a website or R2 bucket, AI Search creates jobs to sync your data source. Jobs run automatically every 6 hours and process new, modified, or deleted files to keep your search index up to date.
You can view job status and history in the Jobs tab of the dashboard or through the Instances API.
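To make "new, modified, or deleted" concrete, the sketch below compares two listings of a data source and sorts the differences into those three groups. It is a conceptual illustration only, not how AI Search implements sync. The `diff_snapshots` function, the `ChangeSet` type, and the idea of mapping each file key to a content fingerprint (such as an ETag) are assumptions made for this example.

```python
from typing import Dict, List, NamedTuple


class ChangeSet(NamedTuple):
    """Files grouped by what a sync pass would need to do with them."""
    new: List[str]
    modified: List[str]
    deleted: List[str]


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> ChangeSet:
    """Compare two {file key: fingerprint} listings of a data source.

    Hypothetical helper: the fingerprint could be an ETag or content hash.
    """
    # Present now but not in the previous listing.
    new = sorted(k for k in current if k not in previous)
    # Present in both listings, but the fingerprint changed.
    modified = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    # Present before but gone from the current listing.
    deleted = sorted(k for k in previous if k not in current)
    return ChangeSet(new, modified, deleted)


if __name__ == "__main__":
    before = {"a.md": "v1", "b.md": "v1", "c.md": "v1"}
    after = {"a.md": "v1", "b.md": "v2", "d.md": "v1"}
    print(diff_snapshots(before, after))
```

Unchanged files (`a.md` in the example) appear in none of the three lists, which is why a sync pass that only handles changes can be much cheaper than a full re-index.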
Files uploaded to built-in storage are indexed immediately, with no sync jobs involved: each file is processed individually as it is uploaded.
| Action | Description |
|---|---|
| Trigger sync | Manually start a sync job to scan your external data source for changes. Can be triggered at most once every 30 seconds. |
| Cancel job | Cancel a running sync job. |
| Pause indexing | Temporarily stop all scheduled sync jobs. |
| Resume indexing | Resume scheduled sync jobs. |
| Sync individual file | Re-index a specific file. |
You can perform these actions from the dashboard, the REST API, or the Workers binding.
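If you trigger syncs from your own tooling, a small client-side guard can help you stay within the 30-second interval. The sketch below is one possible approach, not part of the AI Search API. `SyncThrottle` is a hypothetical class, and the `trigger` callable stands in for whatever REST or binding call you actually use.

```python
import time
from typing import Callable, Optional

# From the table above: a manual sync can be triggered every 30 seconds.
SYNC_COOLDOWN_SECONDS = 30.0


class SyncThrottle:
    """Wraps a trigger function and refuses calls made during the cooldown.

    `trigger` is a placeholder for your own code that starts a sync job.
    """

    def __init__(
        self,
        trigger: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._trigger = trigger
        self._clock = clock
        self._last_triggered: Optional[float] = None

    def seconds_until_allowed(self) -> float:
        """Return how long to wait before the next trigger is allowed."""
        if self._last_triggered is None:
            return 0.0
        elapsed = self._clock() - self._last_triggered
        return max(0.0, SYNC_COOLDOWN_SECONDS - elapsed)

    def try_trigger(self) -> bool:
        """Call the trigger if the cooldown has passed; report whether it ran."""
        if self.seconds_until_allowed() > 0:
            return False
        self._trigger()
        self._last_triggered = self._clock()
        return True


if __name__ == "__main__":
    throttle = SyncThrottle(lambda: print("sync requested"))
    print(throttle.try_trigger())  # first call goes through
    print(throttle.try_trigger())  # immediate second call is refused
```

The clock is injectable so the guard can be tested without waiting. The service presumably enforces its own limit regardless of any client-side guard, so treat this as a convenience rather than a guarantee.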
The total time to index depends on the number and type of files. Factors that affect performance include:
- Total number of files and their sizes
- File formats (for example, images take longer than plain text)
- Latency of Workers AI models used for embedding and image processing
To ensure smooth and reliable indexing:
- Make sure your files are within the size limit and in a supported format so that they are not skipped during indexing.
- For R2-backed instances, keep your service API token valid to prevent indexing failures.
- Regularly clean up outdated or unnecessary content to stay within instance limits.
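The first recommendation above can be checked before you upload anything. The sketch below flags files that would likely be skipped. The values of `MAX_FILE_BYTES` and `SUPPORTED_EXTENSIONS` are placeholders chosen for illustration, not the real AI Search limits, and `skip_reason` is a hypothetical helper. Substitute the limits and formats documented for your instance.

```python
from pathlib import PurePosixPath
from typing import Optional

# Placeholder values for illustration only; replace them with the
# documented size limit and supported formats for your instance.
MAX_FILE_BYTES = 4 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".txt", ".md", ".html", ".pdf"}


def skip_reason(name: str, size_bytes: int) -> Optional[str]:
    """Return why a file would likely be skipped, or None if it looks fine."""
    extension = PurePosixPath(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return f"unsupported format: {extension or '(none)'}"
    if size_bytes > MAX_FILE_BYTES:
        return f"too large: {size_bytes} bytes"
    return None


if __name__ == "__main__":
    candidates = {
        "docs/guide.md": 12_000,
        "assets/archive.zip": 50_000,
        "reports/annual.pdf": 9 * 1024 * 1024,
    }
    for name, size in candidates.items():
        print(name, "->", skip_reason(name, size) or "ok")
```

Running a check like this in a build or upload script surfaces problem files early, instead of leaving you to discover them later in job history.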