Date: 2008-08-03

Chunking is the process of splitting large data into smaller segments before embedding them for search. AutoRAG uses recursive chunking, which breaks your content at natural boundaries (like paragraphs or sentences) and then splits any chunks that are still too large.

What is recursive chunking

Recursive chunking tries to keep chunks meaningful by:

Splitting at natural boundaries: like paragraphs, then sentences.

Checking the size: if a chunk is too long (based on token count), it's split again into smaller parts.

This way, chunks are easy to embed and retrieve, without cutting off thoughts mid-sentence.
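The two steps above can be sketched in a few lines of Python. This is only an illustration of the general technique, not AutoRAG's actual implementation: the function names, the regular expressions used to find boundaries, and the use of whitespace-separated words as a stand-in for model tokens are all assumptions made for the example.

```python
import re

# Assumption: boundaries are tried from coarsest to finest.
# Each entry is (regex used to split, string used to rejoin pieces).
SPLITTERS = [
    (r"\n\s*\n", "\n\n"),        # paragraphs
    (r"(?<=[.!?])\s+", " "),     # sentences (keeps the end punctuation)
    (r"\s+", " "),               # individual words, as a last resort
]


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # A real pipeline would use the embedding model's tokenizer.
    return len(text.split())


def recursive_chunk(text: str, max_tokens: int, level: int = 0) -> list[str]:
    """Split text at natural boundaries, recursing only into oversized pieces."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    text = text.strip()
    if not text:
        return []
    # Small enough already, or no finer boundary left to try.
    if count_tokens(text) <= max_tokens or level >= len(SPLITTERS):
        return [text]

    pattern, joiner = SPLITTERS[level]
    pieces = [p for p in re.split(pattern, text) if p.strip()]

    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + joiner + piece
        if count_tokens(candidate) <= max_tokens:
            # Greedily pack neighbouring pieces into one chunk.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if count_tokens(piece) <= max_tokens:
            current = piece
        else:
            # The piece alone is too long: split it at the next finer boundary.
            chunks.extend(recursive_chunk(piece, max_tokens, level + 1))
    if current:
        chunks.append(current)
    return chunks
```

Calling `recursive_chunk(document_text, 256)` returns a list of strings, each within the limit under the word-count approximation. Words are split only when a single sentence exceeds the limit, which is the "without cutting off thoughts mid-sentence" behavior described above.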

Chunking controls

AutoRAG exposes two parameters to help you control chunking behavior:

Chunk size: The number of tokens per chunk. Minimum: 64. Maximum: 512.

Chunk overlap: The percentage of overlapping tokens between adjacent chunks. Minimum: 0%. Maximum: 30%.

These settings apply during the indexing step, before your data is embedded and stored in Vectorize.
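As a rough illustration of these limits, the sketch below validates a pair of settings before they are used. The `ChunkingSettings` class and its field names are hypothetical and are not part of any AutoRAG API; only the ranges (64 to 512 tokens, 0% to 30% overlap) come from the text above. Rounding the overlap down to a whole number of tokens is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingSettings:
    """Hypothetical container for the two chunking parameters."""

    chunk_size: int             # tokens per chunk
    chunk_overlap_percent: int  # percentage of tokens shared with the next chunk

    def __post_init__(self) -> None:
        # Ranges taken from the documented minimum and maximum values.
        if not 64 <= self.chunk_size <= 512:
            raise ValueError("chunk_size must be between 64 and 512 tokens")
        if not 0 <= self.chunk_overlap_percent <= 30:
            raise ValueError("chunk_overlap_percent must be between 0 and 30")

    @property
    def overlap_tokens(self) -> int:
        # Assumption: the percentage is rounded down to whole tokens.
        return self.chunk_size * self.chunk_overlap_percent // 100
```

For example, `ChunkingSettings(chunk_size=256, chunk_overlap_percent=10).overlap_tokens` evaluates to 25 under the rounding assumption, while a chunk size of 32 or an overlap of 40 raises `ValueError`.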

Choosing chunk size and overlap

Chunking affects both how your content is retrieved and how much context is passed into the generation model. An external chunk visualizer tool can help you see how different chunk settings would split your content.

For chunk size, consider how:

Smaller chunks create more precise vector matches, but may split relevant ideas across multiple chunks.

Larger chunks retain more context, but may dilute relevance and reduce retrieval precision.

For chunk overlap, consider how:

More overlap helps preserve continuity across boundaries, especially in flowing or narrative content.

Less overlap reduces indexing time and cost, but can miss context if key terms are split between chunks.
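To get a feel for the indexing cost side of this trade-off, the sketch below estimates how many chunks a document produces for a given chunk size and overlap. It assumes a simple fixed-size sliding window and ignores natural boundaries, so real chunk counts will differ; the function name and the rounding of the overlap are assumptions made for the example.

```python
def estimate_chunk_count(total_tokens: int, chunk_size: int, overlap_percent: int) -> int:
    """Estimate chunks produced by a fixed-size sliding window (an approximation)."""
    if total_tokens <= 0:
        return 0
    if total_tokens <= chunk_size:
        return 1
    # Assumption: overlap is rounded down to whole tokens.
    overlap_tokens = chunk_size * overlap_percent // 100
    # Each new chunk advances by the part that is not shared with the previous one.
    stride = chunk_size - overlap_tokens
    remaining = total_tokens - chunk_size
    # Ceiling division: one chunk for the first window, plus one per stride needed.
    return 1 + (remaining + stride - 1) // stride
```

Under these assumptions, a 10,000-token document with 256-token chunks yields 40 chunks at 0% overlap and 56 chunks at 30% overlap, so the higher overlap means about 40% more embeddings to compute and store.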

Additional considerations: