How does a product manager go about making RAG architecture decisions?

Let's take an example: you are tasked with building a clinical decision support system for doctors. Doctors will ask questions like "What's the recommended treatment for a 45-year-old diabetic patient with kidney disease?"

Your database has 500k medical papers
Treatment guidelines are present
Case studies are present

How would you architect the RAG system and what will be your key tradeoffs from a PM standpoint?

In the medical industry, accuracy is paramount because the stakes are high, so the target should be around 99% accuracy. Latency is something we can compromise on as long as accuracy stays intact, so let's say responses in under 3 seconds. Results should include citations and compliance-related information alongside the recommendations and answers. This builds trust.

For searching the documents, guidelines and case studies :

Hybrid search (BM25 + embeddings)
Why? --> Medical terms need exact matches, but symptom search can be semantic, so hybrid search works best
Specialized medical embeddings (example: BioBERT), because domain-specific training is crucial for accuracy
Metadata filtering first. Why? --> Because 500k documents is a huge dataset, we can filter by specialty, recency and evidence level. This narrows the search path and avoids spending compute and tokens on irrelevant documents for every query.
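To make the "filter first, then hybrid search" idea concrete, here is a minimal sketch. It assumes the BM25 and embedding retrievers already exist and each return a ranked list of document ids; the two lists are combined with reciprocal rank fusion, which is one common way to merge rankings and not necessarily the only choice. The `Doc` fields, the `metadata_filter` helper and the sample ids are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    specialty: str
    year: int
    evidence_level: int  # assumption: 1 = strongest evidence, larger = weaker


def metadata_filter(docs, specialty, min_year, max_evidence_level):
    """Narrow the candidate set before any scoring happens."""
    return [
        d for d in docs
        if d.specialty == specialty
        and d.year >= min_year
        and d.evidence_level <= max_evidence_level
    ]


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into a single ranking.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    Ties are broken by doc id so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = [
    Doc("p1", "nephrology", 2022, 1),
    Doc("p2", "nephrology", 2009, 2),   # too old, filtered out
    Doc("p3", "cardiology", 2023, 1),   # wrong specialty, filtered out
    Doc("p4", "nephrology", 2021, 2),
    Doc("p5", "nephrology", 2019, 1),
]
candidates = metadata_filter(docs, "nephrology", min_year=2015, max_evidence_level=2)
allowed = {d.doc_id for d in candidates}

# Hypothetical retriever outputs, already restricted to the filtered candidates.
bm25_ranking = ["p4", "p1", "p5"]       # exact keyword matches
embedding_ranking = ["p1", "p5", "p4"]  # semantic matches
fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

A document that ranks well in both lists rises to the top, while one that only a single retriever likes is still kept as a candidate.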

Chunking Strategy :

Hierarchical chunking that preserves paper structure. Why? --> Because context matters in the medical domain (methodology, patient cohort, conclusions)
Keep the abstract and relevant sections together. Why? --> Because conclusions need the context of the study design in the medical domain
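A rough sketch of what this chunking could look like, assuming each paper has already been parsed into a dict of section name to text. The `Chunk` fields, the `chunk_paper` helper and the word-based splitting are illustrative assumptions; a real pipeline would more likely split on tokens or sentences.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    paper_id: str
    section: str
    text: str    # what gets embedded: abstract + a piece of the section
    parent: str  # id of the parent section in the paper hierarchy


def chunk_paper(paper_id, sections, max_words=120):
    """Split each section into word-bounded pieces, prefixing the abstract.

    Every chunk keeps its section name and the paper abstract, so a
    conclusion is never retrieved without the study context around it.
    """
    abstract = sections.get("abstract", "")
    chunks = []
    for name, body in sections.items():
        if name == "abstract":
            continue
        words = body.split()
        for start in range(0, len(words), max_words):
            piece = " ".join(words[start:start + max_words])
            chunks.append(Chunk(
                paper_id=paper_id,
                section=name,
                text=f"[abstract] {abstract}\n[{name}] {piece}",
                parent=f"{paper_id}/{name}",
            ))
    return chunks


paper = {
    "abstract": "Trial of drug X in diabetic kidney disease.",
    "methods": "Randomized double blind cohort of adults",
    "conclusions": "Drug X slowed decline",
}
chunks = chunk_paper("p1", paper, max_words=3)
for c in chunks:
    print(c.parent, "|", c.text.splitlines()[1])
```

The `parent` id lets the retriever pull in the full section (or sibling chunks) when a small chunk matches.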

Safety Mechanism :

Confidence scoring: multiple-evidence requirement (at least 3 sources)
Conflicting evidence detection
Uncertainty flagging
Human in the loop (high-stakes decisions require physician review; anomaly detection triggers review)
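These checks can be combined into a simple gate that runs before an answer is shown. The sketch below is illustrative only: it assumes an upstream step has already labeled each retrieved source as supporting or contradicting the recommendation, and the `Evidence` fields, the `safety_gate` name and the 0.5 relevance threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source_id: str
    supports: bool    # True if the source supports the recommendation
    relevance: float  # retriever score, assumed to lie in [0, 1]


def safety_gate(evidence, min_sources=3, min_relevance=0.5, high_stakes=False):
    """Return flags and whether a physician must review the answer."""
    strong = [e for e in evidence if e.relevance >= min_relevance]
    supporting = {e.source_id for e in strong if e.supports}
    opposing = {e.source_id for e in strong if not e.supports}

    flags = []
    if len(supporting) < min_sources:
        flags.append("insufficient_evidence")  # uncertainty flag
    if supporting and opposing:
        flags.append("conflicting_evidence")

    return {
        "flags": flags,
        "needs_physician_review": high_stakes or bool(flags),
        "supporting_sources": sorted(supporting),
    }


evidence = [
    Evidence("guideline-1", True, 0.9),
    Evidence("paper-7", True, 0.8),
    Evidence("paper-9", True, 0.7),
    Evidence("case-3", False, 0.6),
]
result = safety_gate(evidence)
print(result)
```

Here three sources support the recommendation, but one relevant case study contradicts it, so the answer is flagged and routed to a physician.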

Evaluation Strategy :

Offline metrics: medical expert evaluation by a panel of physicians, citation accuracy checks, comparison to treatment guidelines
Online metrics: physician feedback loop, outcome tracking (where permitted), false positive / negative rate
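Two of these metrics are easy to compute once reviews are logged. The sketch below assumes a hypothetical log format: each answer records the sources it cited and the subset that expert reviewers confirmed, and each feedback pair records whether the system made a recommendation and whether the physician judged it warranted.

```python
def citation_accuracy(answers):
    """Fraction of cited sources that expert reviewers confirmed as supporting."""
    cited = sum(len(set(a["cited"])) for a in answers)
    confirmed = sum(len(set(a["cited"]) & set(a["confirmed"])) for a in answers)
    return confirmed / cited if cited else 0.0


def error_rates(pairs):
    """pairs: (system_recommended, physician_says_warranted) booleans.

    Returns (false_positive_rate, false_negative_rate).
    """
    fp = sum(1 for pred, truth in pairs if pred and not truth)
    fn = sum(1 for pred, truth in pairs if not pred and truth)
    negatives = sum(1 for _, truth in pairs if not truth)
    positives = sum(1 for _, truth in pairs if truth)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical review data, for illustration only.
answers = [
    {"cited": ["p1", "p2"], "confirmed": ["p1", "p2"]},
    {"cited": ["p3", "p4"], "confirmed": ["p3"]},
]
pairs = [(True, True), (True, False), (False, True), (False, False)]

accuracy = citation_accuracy(answers)
fpr, fnr = error_rates(pairs)
print(accuracy, fpr, fnr)
```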

Key Trade-Offs

Accuracy vs Latency: prioritize accuracy. MITIGATION FOR LATENCY? --> Pre-compute answers for common queries, progressive loading, etc.
Recency vs Evidence quality: weight by evidence level + recency. MITIGATION --> Mark "Emerging research" vs "Established"
Cost vs Coverage: start with high-evidence papers (let's say 100k), expand based on usage. MITIGATION FOR COVERAGE? --> Track coverage gaps, prioritize indexing accordingly.
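As one way to picture the recency vs evidence quality trade-off, the sketch below blends an evidence-level score with an exponentially decaying recency score. Everything numeric here is an assumption for illustration: the 1-to-4 evidence scale (1 = strongest), the 70/30 weighting, the 5-year half-life, and the rule that levels 1 and 2 count as "Established".

```python
def evidence_score(evidence_level, year, current_year,
                   half_life_years=5.0, evidence_weight=0.7):
    """Blend evidence quality and recency into one ranking weight in (0, 1]."""
    quality = 1.0 / evidence_level                 # level 1 -> 1.0, level 4 -> 0.25
    age = max(0, current_year - year)
    recency = 0.5 ** (age / half_life_years)       # halves every half_life_years
    return evidence_weight * quality + (1.0 - evidence_weight) * recency


def evidence_label(evidence_level, established_max_level=2):
    """Label shown next to a citation in the answer."""
    if evidence_level <= established_max_level:
        return "Established"
    return "Emerging research"


# An older guideline vs a brand-new case report (hypothetical documents).
guideline = evidence_score(evidence_level=1, year=2015, current_year=2024)
case_report = evidence_score(evidence_level=4, year=2024, current_year=2024)
print(round(guideline, 3), evidence_label(1))
print(round(case_report, 3), evidence_label(4))
```

With these weights the older guideline still outranks the new case report, and the case report is shown with an "Emerging research" label instead of being dropped.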

Success Metrics

Clinical Metrics: diagnostic accuracy improvements, treatment plan adherence, patient outcome correlations
Operations Metrics: physician adoption rate, time saved per case, error rate reduction
Business Metrics: cost per query, ROI vs hiring more specialists
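Cost per query is simple arithmetic once token counts and prices are known. The sketch below uses placeholder numbers only: the token counts, per-million-token prices, retrieval cost and infrastructure cost are hypothetical and should be replaced with figures from your own provider and usage logs.

```python
def cost_per_query(input_tokens, output_tokens,
                   price_in_per_million, price_out_per_million,
                   retrieval_cost=0.0):
    """LLM token cost plus a flat retrieval cost, in the same currency unit."""
    llm_cost = (input_tokens * price_in_per_million
                + output_tokens * price_out_per_million) / 1_000_000
    return llm_cost + retrieval_cost


def monthly_cost(queries_per_month, per_query, fixed_infra_cost):
    """Variable cost across all queries plus fixed infrastructure cost."""
    return queries_per_month * per_query + fixed_infra_cost


# Placeholder inputs, not real prices.
per_query = cost_per_query(
    input_tokens=4000, output_tokens=500,
    price_in_per_million=3.0, price_out_per_million=15.0,
    retrieval_cost=0.001,
)
total = monthly_cost(queries_per_month=10_000, per_query=per_query,
                     fixed_infra_cost=2000.0)
print(round(per_query, 4), round(total, 2))
```

Comparing the monthly total against the loaded cost of additional specialist hours gives a first rough ROI figure.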

Essentially, I organize my mental model for RAG architecture decisions into these 6 buckets, and it can be extended to any domain or industry. There are a lot of concepts and processes under RAG that my other blogs will cover in depth. I try to keep the learning simple so it is easy to retain and faster to apply in the real world :)
