Chunking was introduced to handle the large contexts, or large pieces of information, being fed to LLMs; sending whole documents kept hitting API rate limits. But when you split the original document into chunks, each chunk loses context from the original document.
Example :
Original Document : Company Financial Report Q3 2024
1. Revenue increased 15% YoY to $2.5B
2. Cost of goods sold remained stable at 40% of revenue
3. Operating expenses decreased by 5% due to efficiency improvements
Traditional Chunking -
- Chunk 1 - Revenue increased 15% year on year to $2.5B
- Chunk 2 - Cost of goods sold remained stable at 40% of revenue
- Chunk 3 - Operating expenses decreased 5% due to efficiency improvements
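The traditional approach above can be sketched as a naive fixed-size splitter; the word limit and variable names here are illustrative, not from the original:

```python
# Minimal sketch of traditional fixed-size chunking: split text into
# chunks of roughly `max_words` words, with no added context.

def chunk_text(text: str, max_words: int = 10) -> list[str]:
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

report = (
    "Revenue increased 15% year on year to $2.5B. "
    "Cost of goods sold remained stable at 40% of revenue. "
    "Operating expenses decreased 5% due to efficiency improvements."
)
chunks = chunk_text(report, max_words=10)
# Note: no chunk carries the document title "Company Financial Report Q3 2024".
```

Notice that the splitter has no idea the document is a Q3 2024 report, which is exactly the problem described below.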
Problems with traditional chunking
Query - "What were the Q3 2024 operating expenses?"
Chunk 3 doesn't mention Q3 2024, so it might not be retrieved, or might be confused with other quarters' reports.
Anthropic Solution : Add Context to each chunk
<document>
{whole document}
</document>
Here is the chunk we want to situate within the whole document :
<chunk>
{chunk content}
</chunk>
Please give a short succinct context (50-100 tokens) to situate this chunk within the overall document for retrieval purposes
Answer only with the succinct context and nothing else
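Filling in this prompt per chunk is plain string templating; a minimal sketch is below. The actual LLM call (e.g. via the Anthropic API) is omitted, and the function and constant names are assumptions, not from the original:

```python
# Sketch: build the contextualizing prompt for each chunk.
# One call per chunk would then be sent to an LLM to get the
# 50-100 token context string.

CONTEXT_PROMPT = """<document>
{document}
</document>
Here is the chunk we want to situate within the whole document :
<chunk>
{chunk}
</chunk>
Please give a short succinct context (50-100 tokens) to situate this chunk \
within the overall document for retrieval purposes
Answer only with the succinct context and nothing else"""

def build_context_prompt(document: str, chunk: str) -> str:
    return CONTEXT_PROMPT.format(document=document, chunk=chunk)

prompt = build_context_prompt(
    document="Company Financial Report Q3 2024 ...",
    chunk="Operating expenses decreased 5% due to efficiency improvements",
)
```

In practice you would prepend the returned context to the chunk text before embedding it.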
Generated contextual chunk :
ORIGINAL CHUNK - "Operating expenses decreased 5% due to efficiency improvements"
ENHANCED CHUNK - "This chunk is from the Company Financial Report Q3 2024: operating expenses decreased 5% due to efficiency improvements"
Now retrieval works
QUERY - "What were the Q3 2024 operating expenses?"
- Traditional chunk : might be missed or ranked low
- Contextual chunk : high relevance (contains "Q3 2024" and "operating expenses")
COST BENEFIT ANALYSIS:
- Document : 10,000 tokens (the tokenized length of the document)
- Chunks : 40 chunks of 250 tokens each
CONTEXT GENERATION
- Input : 10,000 (document) + 250 (chunk) = 10,250 tokens per chunk
- Output : 75 tokens context per chunk
- Total Input : 40 x 10,250 = 410,000 tokens
- Total Output : 40 x 75 = 3,000 tokens
At $0.01/1K input tokens and $0.03/1K output tokens :
- Input cost : 410 x $0.01 = $4.10
- Output cost : 3 x $0.03 = $0.09
Total : $4.19 per document (one-time indexing cost)
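The arithmetic above can be checked in a few lines; the prices are the assumed per-1K-token rates from the text:

```python
# Reproducing the back-of-the-envelope indexing cost.
DOC_TOKENS = 10_000       # tokens in the whole document
CHUNKS = 40               # number of chunks
CHUNK_TOKENS = 250        # tokens per chunk
CONTEXT_TOKENS = 75       # generated context tokens per chunk
INPUT_PRICE = 0.01 / 1_000   # $ per input token (assumed rate)
OUTPUT_PRICE = 0.03 / 1_000  # $ per output token (assumed rate)

# Each chunk's context call sees the whole document plus the chunk.
total_input = CHUNKS * (DOC_TOKENS + CHUNK_TOKENS)   # 410,000 tokens
total_output = CHUNKS * CONTEXT_TOKENS               # 3,000 tokens
total_cost = total_input * INPUT_PRICE + total_output * OUTPUT_PRICE
print(f"${total_cost:.2f}")  # prints $4.19
```

Note that re-sending the full document for every chunk is what dominates the cost; prompt caching can cut this substantially.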
Benefits : improved retrieval accuracy, with typical gains of 20-25%
Before : 67-70% context precision; After : 85-90% context precision
That amounts to roughly a 30-40% reduction in retrieval errors
Real-world example : customer support at a company
- 10,000 queries coming per day
- 30% fewer wrong contexts -> 3,000 fewer escalations
- cost per escalation for example is $5
- Daily savings : $15,000
- Monthly savings : $450,000
ROI : a $4.19 one-time indexing cost vs. $450K in monthly savings translates into an astronomical ROI, well over 100,000%
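The savings arithmetic, using the same illustrative figures (a 30-day month is assumed):

```python
# Sanity-checking the support-savings estimate.
daily_queries = 10_000
error_reduction = 0.30        # 30% fewer wrong contexts retrieved
cost_per_escalation = 5.0     # $ per escalated ticket (illustrative)

fewer_escalations = daily_queries * error_reduction      # 3,000 per day
daily_savings = fewer_escalations * cost_per_escalation  # $15,000
monthly_savings = daily_savings * 30                     # $450,000
```

As the note below says, these figures are illustrative; the point is that a one-time cost of a few dollars per document is dwarfed by recurring savings at this scale.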
*NOTE : all the numbers are for reference purposes. When you implement these changes in your own application, the ROI and returns will vary, but contextual chunking should still deliver much-needed optimizations in customer satisfaction, performance, and cost.*

