Syncing

AI Search automatically indexes your content for search. How indexing works depends on your data source.

External data sources

For instances connected to a website or R2 bucket, AI Search creates jobs to sync your data source. Jobs run automatically every 6 hours and process new, modified, or deleted files to keep your search index up to date.
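The change detection a sync job performs can be pictured as a comparison between what was indexed last time and what the source contains now. The following is a minimal sketch of that idea, not AI Search's actual implementation: the `SyncPlan` type, the `plan_sync` function, and the use of a per-file fingerprint (such as an ETag) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPlan:
    """Files a sync job would need to act on (hypothetical structure)."""
    new: list[str]
    modified: list[str]
    deleted: list[str]


def plan_sync(indexed: dict[str, str], source: dict[str, str]) -> SyncPlan:
    """Compare the last indexed snapshot with the current source listing.

    Both arguments map a file key to a content fingerprint, for example
    an ETag or a hash. The fingerprint scheme is an assumption.
    """
    # Present in the source but never indexed: index for the first time.
    new = sorted(k for k in source if k not in indexed)
    # Present in both but with a different fingerprint: re-index.
    modified = sorted(
        k for k in source if k in indexed and source[k] != indexed[k]
    )
    # Indexed earlier but gone from the source: remove from the index.
    deleted = sorted(k for k in indexed if k not in source)
    return SyncPlan(new=new, modified=modified, deleted=deleted)


# Example snapshots (made-up file names and fingerprints).
indexed = {"a.md": "v1", "b.md": "v1", "c.md": "v1"}
source = {"a.md": "v1", "b.md": "v2", "d.md": "v1"}
plan = plan_sync(indexed, source)
print(plan)
```

In this example `a.md` is unchanged and needs no work, which is why a sync that finds few changes can finish quickly even on a large data source.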

You can view job status and history on the Jobs tab of the dashboard or through the Instances API.

Built-in storage

Files uploaded to built-in storage are indexed immediately. There are no sync jobs; each file is processed individually as it is uploaded.

Controls

Action                 Description
Trigger sync           Manually start a sync job to scan your external data source for changes. Can be triggered at most once every 30 seconds.
Cancel job             Cancel a running sync job.
Pause indexing         Temporarily stop all scheduled sync jobs.
Resume indexing        Resume scheduled sync jobs.
Sync individual file   Re-index a specific file.

You can perform these actions from the dashboard, the REST API, or the Workers binding.
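Because a manual sync can be triggered at most once every 30 seconds, a script that requests syncs may want to enforce that spacing on the client side rather than send requests that will be rejected. The sketch below shows one way to do this. The `SyncTrigger` class and the `start_sync` callable are hypothetical: `start_sync` stands in for whatever REST or Workers binding call you use, and no real AI Search API signature is assumed here.

```python
import time
from typing import Callable, Optional

# Minimum spacing between manual syncs, taken from the Controls table.
MIN_INTERVAL_SECONDS = 30.0


class SyncTrigger:
    """Client-side cooldown around a caller-supplied sync function."""

    def __init__(
        self,
        start_sync: Callable[[], None],  # hypothetical: wraps your API call
        clock: Callable[[], float] = time.monotonic,
        min_interval: float = MIN_INTERVAL_SECONDS,
    ) -> None:
        self._start_sync = start_sync
        self._clock = clock
        self._min_interval = min_interval
        self._last: Optional[float] = None  # time of the last trigger

    def seconds_until_allowed(self) -> float:
        """Return how long to wait before the next trigger is allowed."""
        if self._last is None:
            return 0.0
        elapsed = self._clock() - self._last
        return max(0.0, self._min_interval - elapsed)

    def try_trigger(self) -> bool:
        """Start a sync if the cooldown has passed; report whether it ran."""
        if self.seconds_until_allowed() > 0:
            return False
        self._start_sync()
        self._last = self._clock()
        return True


# Usage with a fake clock so the behavior is deterministic.
calls: list[float] = []
now = [0.0]
trigger = SyncTrigger(
    start_sync=lambda: calls.append(now[0]),
    clock=lambda: now[0],
)
first = trigger.try_trigger()   # no earlier trigger, so it runs
now[0] = 10.0
second = trigger.try_trigger()  # only 10 seconds elapsed, so it is skipped
now[0] = 30.0
third = trigger.try_trigger()   # 30 seconds elapsed, so it runs again
```

Injecting the clock keeps the helper testable without sleeping. In real use, leave `clock` at its default and pass a `start_sync` function that performs the actual request.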

Performance

The total time to index depends on the number and type of files. Factors that affect performance include:

  • Total number of files and their sizes
  • File formats (for example, images take longer than plain text)
  • Latency of Workers AI models used for embedding and image processing

Best practices

To ensure smooth and reliable indexing: