Cloudflare Docs

Configuration

When creating an AutoRAG instance, you can use a set of configuration options to customize how your RAG pipeline ingests data, processes it, and generates responses. Some settings can be updated after the instance is created, while others are fixed at creation time.
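
The split between editable and fixed settings can be pictured with a small sketch. This is an illustration only, not the AutoRAG API: the `RagConfig` class, its field names, the `apply_update` helper, and the example values are all hypothetical. It assumes that the data source, embedding model, and instance name are the fixed settings, and it covers only a few of the available options.

```python
from dataclasses import dataclass, fields, replace

# Hypothetical field names chosen for this sketch; they are not
# identifiers from any real AutoRAG API.
FIXED_AFTER_CREATION = {"name", "data_source", "embedding_model"}


@dataclass(frozen=True)
class RagConfig:
    name: str
    data_source: str
    embedding_model: str
    chunk_size: int      # tokens per chunk
    chunk_overlap: int   # overlapping tokens between chunks


def apply_update(config: RagConfig, **changes) -> RagConfig:
    """Return a new config with `changes` applied, rejecting edits to fixed settings."""
    known = {f.name for f in fields(config)}
    unknown = set(changes) - known
    if unknown:
        raise KeyError(f"unknown options: {sorted(unknown)}")

    # A fixed setting may be passed again only if its value is unchanged.
    blocked = sorted(
        key for key, value in changes.items()
        if key in FIXED_AFTER_CREATION and getattr(config, key) != value
    )
    if blocked:
        raise ValueError(f"cannot change after creation: {blocked}")

    return replace(config, **changes)


# Placeholder values, not recommended defaults.
config = RagConfig(
    name="docs-search",
    data_source="example-bucket",
    embedding_model="example-embedding-model",
    chunk_size=256,
    chunk_overlap=32,
)
resized = apply_update(config, chunk_size=512)
```

In this sketch, calling `apply_update(config, embedding_model="another-model")` raises a `ValueError`, which mirrors the idea that such a change requires creating a new instance instead.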

The table below lists all available configuration options:

| Configuration | Editable after creation | Description |
| --- | --- | --- |
| Data source | No | The source where your knowledge base is stored |
| Chunk size | Yes | Number of tokens per chunk |
| Chunk overlap | Yes | Number of overlapping tokens between chunks |
| Embedding model | No | Model used to generate vector embeddings |
| Query rewrite | Yes | Enable or disable query rewriting before retrieval |
| Query rewrite model | Yes | Model used for query rewriting |
| Query rewrite system prompt | Yes | Custom system prompt to guide query rewriting behavior |
| Match threshold | Yes | Minimum similarity score required for a vector match |
| Maximum number of results | Yes | Maximum number of vector matches returned (top_k) |
| Generation model | Yes | Model used to generate the final response |
| Generation system prompt | Yes | Custom system prompt to guide response generation |
| Similarity caching | Yes | Enable or disable caching of responses for similar (not just exact) prompts |
| Similarity caching threshold | Yes | Controls how similar a new prompt must be to a previous one to reuse its cached response |
| AI Gateway | Yes | AI Gateway for monitoring and controlling model usage |
| AutoRAG name | No | Name of your AutoRAG instance |
| Service API token | Yes | API token granted to AutoRAG to give it permission to configure resources on your account |
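
To show how the match threshold and the maximum number of results interact, here is a minimal sketch of the selection step. It is not AutoRAG's implementation: the `Match` type and `select_matches` function are hypothetical, and it assumes that a higher score means a closer match and that the threshold is inclusive.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    chunk_id: str   # hypothetical identifier for a stored chunk
    score: float    # similarity score; assumed higher means more similar


def select_matches(
    matches: List[Match], match_threshold: float, max_results: int
) -> List[Match]:
    """Keep matches at or above the threshold, best first, capped at max_results."""
    # Assumption: the threshold is inclusive (score >= threshold).
    kept = [m for m in matches if m.score >= match_threshold]
    kept.sort(key=lambda m: m.score, reverse=True)
    return kept[:max_results]


# Made-up scores for illustration only.
candidates = [
    Match("a", 0.82),
    Match("b", 0.41),
    Match("c", 0.67),
    Match("d", 0.55),
]
selected = select_matches(candidates, match_threshold=0.5, max_results=2)
```

Under these assumptions, raising the threshold trades recall for precision, while the result cap bounds how much retrieved context is passed to the generation model.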