Configuration

When creating an AutoRAG instance, you can customize how your RAG pipeline ingests, processes, and responds to data using a set of configuration options. Some settings can be updated after the instance is created, while others are fixed at creation time.

The table below lists all available configuration options:

| Configuration | Editable after creation | Description |
| --- | --- | --- |
| Data source | no | The source where your knowledge base is stored |
| Chunk size | yes | Number of tokens per chunk |
| Chunk overlap | yes | Number of overlapping tokens between chunks |
| Embedding model | no | Model used to generate vector embeddings |
| Query rewrite | yes | Enable or disable query rewriting before retrieval |
| Query rewrite model | yes | Model used for query rewriting |
| Query rewrite system prompt | yes | Custom system prompt to guide query rewriting behavior |
| Match threshold | yes | Minimum similarity score required for a vector match |
| Maximum number of results | yes | Maximum number of vector matches returned (`top_k`) |
| Generation model | yes | Model used to generate the final response |
| Generation system prompt | yes | Custom system prompt to guide response generation |
| Similarity caching | yes | Enable or disable caching of responses for similar (not just exact) prompts |
| Similarity caching threshold | yes | Controls how similar a new prompt must be to a previous one to reuse its cached response |
| AI Gateway | yes | AI Gateway for monitoring and controlling model usage |
| AutoRAG name | no | Name of your AutoRAG instance |
| Service API token | yes | API token granted to AutoRAG to give it permission to configure resources on your account |
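
The editable/fixed split in the table above can be checked client-side before an update is attempted. The following is a minimal sketch, not AutoRAG's API: the snake_case keys and the `validate_update` helper are hypothetical names chosen for illustration, and the example values are made up. Only the three fixed settings (data source, embedding model, AutoRAG name) come from the table.

```python
# Hypothetical snake_case keys for the settings marked "no" in the table.
# These names are illustrative and are not taken from the AutoRAG API.
FIXED_AFTER_CREATION = frozenset({"data_source", "embedding_model", "autorag_name"})


def validate_update(changes: dict) -> list[str]:
    """Return the keys in `changes` that cannot be edited after creation."""
    return sorted(key for key in changes if key in FIXED_AFTER_CREATION)


# Made-up update: chunk size is editable, the embedding model is not.
rejected = validate_update({"chunk_size": 512, "embedding_model": "some-model"})
print(rejected)  # ['embedding_model']
```

An empty list means every requested change targets a setting the table marks as editable.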
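
Match threshold and maximum number of results work together at retrieval time: per their descriptions, a match needs a minimum similarity score, and at most `top_k` matches are returned. The sketch below illustrates that interaction on made-up scores. The `select_matches` function, the inclusive comparison, and the order of operations (filter, then truncate) are assumptions for illustration, not AutoRAG's documented implementation.

```python
def select_matches(scored, match_threshold, top_k):
    """Keep matches scoring at least `match_threshold`, best first, capped at `top_k`.

    `scored` is a list of (chunk_id, similarity_score) pairs.
    Treating the threshold as inclusive (>=) is an assumption.
    """
    kept = [m for m in scored if m[1] >= match_threshold]
    kept.sort(key=lambda m: m[1], reverse=True)
    return kept[:top_k]


# Made-up scores for illustration only.
candidates = [("a", 0.91), ("b", 0.42), ("c", 0.77), ("d", 0.65)]
print(select_matches(candidates, match_threshold=0.6, top_k=2))
# [('a', 0.91), ('c', 0.77)]
```

Under these assumptions, raising the threshold can return fewer than `top_k` results, while raising `top_k` never admits a match below the threshold.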
