Query rewriting
Query rewriting is an optional step in the AutoRAG pipeline that improves retrieval quality by transforming the original user query into a more effective search query.
Instead of embedding the raw user input directly, AutoRAG can use a large language model (LLM) to rewrite the query based on a system prompt. The rewritten query is then used to perform the vector search.
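Because rewriting is optional, a caller typically opts in per request. The sketch below only builds a request body and sends nothing; the field names `query` and `rewrite_query` are assumptions made for illustration and should be checked against the AutoRAG API reference before use.

```python
import json


def build_search_request(query: str, rewrite_query: bool = False) -> str:
    """Serialize a search request body. Field names are assumed, not confirmed."""
    body = {
        "query": query,
        # Assumed flag: when true, an LLM rewrites the query before it is embedded.
        "rewrite_query": rewrite_query,
    }
    return json.dumps(body)


payload = build_search_request(
    "how do i make this work when my api call keeps failing?",
    rewrite_query=True,
)
print(payload)
```

Leaving the flag at its default in this sketch corresponds to embedding the raw user input directly.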
The wording of a user’s question may not match how your documents are written. Query rewriting helps bridge this gap by:
- Rephrasing informal or vague queries into precise, information-dense terms
- Adding synonyms or related keywords
- Removing filler words or irrelevant details
- Incorporating domain-specific terminology
This leads to more relevant vector matches, which improves the accuracy of the final generated response.
Original query: how do i make this work when my api call keeps failing?
Rewritten query: API call failure troubleshooting authentication headers rate limiting network timeout 500 error
In this example, the original query is conversational and vague. The rewritten version extracts the core problem (API call failure) and expands it with relevant technical terms and likely causes. These terms are much more likely to appear in documentation or logs, improving semantic matching during vector search.
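The effect can be illustrated with a deliberately crude measure. The sketch below counts how many of a query's distinct tokens also occur in a documentation snippet. This token overlap is only a rough stand-in for embedding similarity (real vector search compares dense vectors, not words), and the snippet `doc` is hypothetical, written for this illustration.

```python
import re


def tokens(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, document: str) -> float:
    """Fraction of the query's distinct tokens that also occur in the document."""
    q = tokens(query)
    return len(q & tokens(document)) / len(q) if q else 0.0


original = "how do i make this work when my api call keeps failing?"
rewritten = (
    "API call failure troubleshooting authentication headers "
    "rate limiting network timeout 500 error"
)

# Hypothetical documentation snippet, invented for this illustration.
doc = (
    "Troubleshooting: an API call can return a 500 error after a network "
    "timeout, when rate limiting applies, or when authentication headers "
    "are missing."
)

print(overlap(original, doc), overlap(rewritten, doc))
```

Against this snippet, the original query shares 3 of its 12 distinct tokens, while the rewritten query shares 11 of 12.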
If query rewriting is enabled, AutoRAG performs the following steps in order:
- Sends the original user query and the query rewrite system prompt to the configured LLM
- Receives the rewritten query from the model
- Embeds the rewritten query using the selected embedding model
- Performs vector search against your AutoRAG’s Vectorize index using the resulting embedding
For details on how to guide model behavior during this step, see the system prompt documentation.
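The steps above can be sketched as a small function. This is a minimal illustration, not AutoRAG's implementation: the LLM, the embedding model, and the index are passed in as plain callables, and every name here (`search_with_rewrite`, `fake_llm`, `fake_embed`, `fake_search`) is hypothetical. The fallback to the original query on an empty rewrite is likewise an assumption, not documented behavior.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Match = Tuple[str, float]  # (document id, similarity score)


def search_with_rewrite(
    user_query: str,
    rewrite_system_prompt: str,
    llm: Callable[[str, str], str],
    embed: Callable[[str], Vector],
    vector_search: Callable[[Vector, int], List[Match]],
    top_k: int = 5,
) -> Tuple[str, List[Match]]:
    # Steps 1 and 2: send the system prompt and original query to the LLM
    # and receive the rewritten query.
    rewritten = llm(rewrite_system_prompt, user_query).strip()
    # Assumption: fall back to the original query if the model returns nothing.
    if not rewritten:
        rewritten = user_query
    # Step 3: embed the rewritten query.
    vector = embed(rewritten)
    # Step 4: run the vector search with that embedding.
    return rewritten, vector_search(vector, top_k)


# Offline stand-ins so the sketch runs without any external service.
def fake_llm(system_prompt: str, query: str) -> str:
    return "API call failure troubleshooting"


def fake_embed(text: str) -> Vector:
    return [float(len(text)), float(text.count(" "))]


def fake_search(vector: Vector, top_k: int) -> List[Match]:
    return [("doc-1", 0.9), ("doc-2", 0.7)][:top_k]


rewritten, matches = search_with_rewrite(
    "how do i make this work when my api call keeps failing?",
    "Rewrite the query for search.",
    fake_llm,
    fake_embed,
    fake_search,
    top_k=1,
)
print(rewritten, matches)
```

Passing the model calls in as parameters keeps the order of the steps visible and lets the flow be exercised with stubs.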
- 2025 Cloudflare, Inc.