How Query rewriting worked for us in improving accuracy and bringing cost savings in followup token consumption in queries

The problem current sytems and AI products are facing is - "User queries are often ambiguous or poorly phrased", as product managers you will face this situation where business and technology leadership will be looking for answers around :

How do we reduce user frustration specially in B2B space where users are impatient and want accurate solutions ?
How do we reduce the cost of exploring the right solution ?
How do we improve the overall customer experience in this conversational AI space ?

To address this, one way to look at is Query rewriting, example conversation :

YAML
USER - "When was the last time Rahul bought something from our website ? 
AI - "Rahul last bought a fruity fedora hat on Jan 3, 2030.
USER - How about emily ?
---
Challenge : "How about Emily ? Lacks context . The System needs to understand this

Solution : Query rewriting with context

Here what the prompt to model is : Given this conversation history, rewrite the last user input to be a standalone question.

SQL

Conversation
---

USER - When was the last time Rahul bought something from our website ?
AI - Rahul last bought a fruity fedora hat on Jan 3, 2030.
User - How about emily ?

REWRITTEN QUERY - Model Output - When was the last time Emily bought something
 from our website?

Now the Retreival system can properly search

MULTI-STEP Query Decomposition

YAML
Complex Query ; Compare the return policy for electronics vs clothing and tell 
from customer perspective ?
---
Decomposition in 3 queries 
1. What is the return policy for electronics ?
2. What is the return policy for clothing ?
3. Compare these policies from a customer perspective ?

Now the math behind the token expansion

Original Query for example took 12-20 tokens
Rewritten query took -25-40 tokens

Increase is more than 100%

COST FACTOR

Original - 12 input tokens x $ 0.01/1K tokens = $0.00012

Rewritten - 25 input tokens x $0.01/1K tokens = $0.00025 (Additional cost of $0.00013 per query

For a 1M queries per month - cost increase is $130 (1,000,000 x $0.00013)

But this improves retreival accuracy from 60-65% to 75-80%. (MEANS FEWER FOLLOW UPS)

Net Cost Savings comes from the reduced query volumes

How Query rewriting worked for us in improving accuracy and bringing cost savings in followup token consumption in queries

Now the math behind the token expansion

Be the first to comment

Leave a comment

What metrics should product manager or engineer should focus on, for their RAG performance

Contextual Retrieval - Anthropic's Approach

Be the first to comment

Leave a comment

What metrics should product manager or engineer should focus on, for their RAG performance

Contextual Retrieval - Anthropic's Approach

Now the math behind the token expansion#

Be the first to comment

Leave a comment

More to read

What metrics should product manager or engineer should focus on, for their RAG performance

Contextual Retrieval - Anthropic's Approach

Be the first to comment

Leave a comment

More to read

What metrics should product manager or engineer should focus on, for their RAG performance

Contextual Retrieval - Anthropic's Approach

Now the math behind the token expansion