Configuration
When creating an AutoRAG instance, you can customize how your RAG pipeline ingests, processes, and responds to data using a set of configuration options. Some settings can be updated after the instance is created, while others are fixed at creation time.
The table below lists all available configuration options:
Configuration | Editable after creation | Description |
---|---|---|
Data source | no | The source where your knowledge base is stored |
Chunk size | yes | Number of tokens per chunk |
Chunk overlap | yes | Number of overlapping tokens between chunks |
Embedding model | no | Model used to generate vector embeddings |
Query rewrite | yes | Enable or disable query rewriting before retrieval |
Query rewrite model | yes | Model used for query rewriting |
Query rewrite system prompt | yes | Custom system prompt to guide query rewriting behavior |
Match threshold | yes | Minimum similarity score required for a vector match |
Maximum number of results | yes | Maximum number of vector matches returned (top_k) |
Generation model | yes | Model used to generate the final response |
Generation system prompt | yes | Custom system prompt to guide response generation |
Similarity caching | yes | Enable or disable caching of responses for similar (not just exact) prompts |
Similarity caching threshold | yes | Controls how similar a new prompt must be to a previous one to reuse its cached response |
AI Gateway | yes | AI Gateway for monitoring and controlling model usage |
AutoRAG name | no | Name of your AutoRAG instance |
Service API token | yes | API token that grants AutoRAG permission to configure resources on your account |
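
To make the chunking and retrieval settings more concrete, the following Python sketch shows one plausible way that chunk size, chunk overlap, match threshold, and maximum number of results could interact. It is an illustration only, not AutoRAG's actual implementation: the function names `chunk_tokens` and `select_matches`, the whitespace tokenization, and the score values are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def chunk_tokens(tokens: Sequence[str], chunk_size: int, chunk_overlap: int) -> List[List[str]]:
    """Split tokens into windows of `chunk_size` that share `chunk_overlap` tokens.

    Hypothetical helper: illustrates the idea behind the "Chunk size" and
    "Chunk overlap" settings, not how AutoRAG chunks documents internally.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


def select_matches(
    scored: Sequence[Tuple[str, float]], match_threshold: float, top_k: int
) -> List[Tuple[str, float]]:
    """Keep matches scoring at least `match_threshold`, then return the best `top_k`.

    Hypothetical helper: illustrates "Match threshold" and
    "Maximum number of results" on (chunk_id, similarity_score) pairs.
    """
    kept = [match for match in scored if match[1] >= match_threshold]
    kept.sort(key=lambda match: match[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    # Assumption: whitespace-separated words stand in for model tokens.
    words = "a b c d e f g h i j".split()
    for chunk in chunk_tokens(words, chunk_size=4, chunk_overlap=1):
        print(chunk)

    # Made-up similarity scores for four chunks.
    candidates = [("chunk-1", 0.9), ("chunk-2", 0.3), ("chunk-3", 0.7), ("chunk-4", 0.5)]
    print(select_matches(candidates, match_threshold=0.5, top_k=2))
```

In this sketch, a larger overlap repeats more context across neighboring chunks at the cost of producing more chunks, while a higher match threshold or a lower maximum number of results narrows what reaches the generation model.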