All posts
Featured

How Query rewriting worked for us in improving accuracy and bringing cost savings in followup token consumption in queries

Query rewriting brings token expansion but eventually brings accuracy and savings in overall queries made to the system

2 min read7 views0 comments
How Query rewriting worked for us in improving accuracy and bringing cost savings in followup token consumption in queries

The problem current sytems and AI products are facing is - "User queries are often ambiguous or poorly phrased", as product managers you will face this situation where business and technology leadership will be looking for answers around :

  • How do we reduce user frustration specially in B2B space where users are impatient and want accurate solutions ?
  • How do we reduce the cost of exploring the right solution ?
  • How do we improve the overall customer experience in this conversational AI space ?

To address this, one way to look at is Query rewriting, example conversation :

YAML
USER - "When was the last time Rahul bought something from our website ? AI - "Rahul last bought a fruity fedora hat on Jan 3, 2030. USER - How about emily ? --- Challenge : "How about Emily ? Lacks context . The System needs to understand this

Solution : Query rewriting with context

Here what the prompt to model is : Given this conversation history, rewrite the last user input to be a standalone question.

SQL
Conversation --- USER - When was the last time Rahul bought something from our website ? AI - Rahul last bought a fruity fedora hat on Jan 3, 2030. User - How about emily ? REWRITTEN QUERY - Model Output - When was the last time Emily bought something from our website?

Now the Retreival system can properly search

MULTI-STEP Query Decomposition

YAML
Complex Query ; Compare the return policy for electronics vs clothing and tell from customer perspective ? --- Decomposition in 3 queries 1. What is the return policy for electronics ? 2. What is the return policy for clothing ? 3. Compare these policies from a customer perspective ?

Now the math behind the token expansion

  1. Original Query for example took 12-20 tokens
  2. Rewritten query took -25-40 tokens

Increase is more than 100%

COST FACTOR

Original - 12 input tokens x $ 0.01/1K tokens = $0.00012

Rewritten - 25 input tokens x $0.01/1K tokens = $0.00025 (Additional cost of $0.00013 per query

For a 1M queries per month - cost increase is $130 (1,000,000 x $0.00013)

But this improves retreival accuracy from 60-65% to 75-80%. (MEANS FEWER FOLLOW UPS)

Net Cost Savings comes from the reduced query volumes