How would you handle out-of-scope queries in the RAG system?
#gen-ai #system-design #rag
Answer
Handling Out-of-Scope Queries in RAG
Out-of-scope queries fall into two categories: topic drift (the user asks about something unrelated to the assistant's domain) and knowledge gaps (the question is on-topic, but the answer isn't in the knowledge base). Each needs different handling.
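One way to make the distinction concrete is to combine two signals that the detection strategies below compute: how close the query is to the declared scope, and how well the best retrieved chunk matches it. The sketch is illustrative only. The `categorize` function, the `QueryKind` labels, and both default thresholds are assumptions, not part of any library.

```python
from enum import Enum


class QueryKind(Enum):
    IN_SCOPE = "in_scope"
    TOPIC_DRIFT = "topic_drift"      # unrelated to the assistant's domain
    KNOWLEDGE_GAP = "knowledge_gap"  # on-topic, but the KB has no answer


def categorize(scope_sim: float, retrieval_score: float,
               scope_threshold: float = 0.35,
               retrieval_threshold: float = 0.6) -> QueryKind:
    """Hypothetical helper. Both scores are assumed to be similarities,
    where a higher value means a closer match."""
    if scope_sim < scope_threshold:
        return QueryKind.TOPIC_DRIFT
    if retrieval_score < retrieval_threshold:
        return QueryKind.KNOWLEDGE_GAP
    return QueryKind.IN_SCOPE
```

Checking scope first means an unrelated query is declined cheaply, before any retrieval or generation cost is paid.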
Detection Strategies
1. Semantic Similarity Threshold
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pre-embed short descriptions of the topics the assistant covers
SCOPE_SENTENCES = [
    "product pricing and plans",
    "account management and billing",
    "technical troubleshooting and bugs",
    "shipping and delivery",
    "returns and refunds policy",
]
scope_embeddings = model.encode(SCOPE_SENTENCES, normalize_embeddings=True)


def is_in_scope(query: str, threshold: float = 0.35) -> bool:
    query_emb = model.encode([query], normalize_embeddings=True)[0]
    # Embeddings are unit-normalized, so the dot product is the cosine similarity
    similarities = scope_embeddings @ query_emb
    max_sim = float(similarities.max())
    return max_sim >= threshold


# Test (expected results; the exact scores depend on the model and threshold)
print(is_in_scope("How do I cancel my subscription?"))  # expected: True
print(is_in_scope("What is the capital of France?"))    # expected: False
```
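The 0.35 default above is best treated as a starting point rather than a universal constant. A minimal sketch of how it might be calibrated, assuming you have a small hand-labelled sample of queries with their maximum scope similarity: try each observed similarity as a candidate threshold and keep the one that classifies the most queries correctly. The `pick_threshold` function and the sample numbers are made up for illustration.

```python
def pick_threshold(scored: list[tuple[float, bool]]) -> float:
    """Return the candidate threshold with the highest accuracy.

    scored: (max scope similarity, query is truly in scope) pairs from a
    hand-labelled, non-empty sample. Ties go to the lowest threshold.
    """
    candidates = sorted({sim for sim, _ in scored})
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        # A query is predicted in-scope when its similarity reaches t
        correct = sum((sim >= t) == label for sim, label in scored)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


# Made-up similarities, for illustration only
sample = [(0.62, True), (0.48, True), (0.41, True),
          (0.30, False), (0.22, False), (0.44, False)]
print(pick_threshold(sample))
```

Accuracy is the simplest criterion. If wrongly declining an in-scope query costs more than letting an off-topic one through, a weighted score or a recall target may suit better.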
2. LLM-Based Classifier (More Accurate)
```python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def classify_query(query: str, scope_description: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f'''Classify if this query is in-scope for a customer support chatbot.

Scope: {scope_description}
Query: "{query}"

Respond with JSON: {{"in_scope": true/false, "reason": "brief explanation", "confidence": 0.0-1.0}}'''
        }],
        # JSON mode: the prompt itself must mention JSON, as it does above
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)
```
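Even in JSON mode, the reply can omit a field or report a confidence too low to act on, so it may be worth validating the result before routing on it. The sketch below is one possible post-processing step. `parse_classification`, the `needs_review` flag, and the 0.7 cutoff are assumptions introduced here, not part of the OpenAI API.

```python
import json


def parse_classification(raw: str, min_confidence: float = 0.7) -> dict:
    """Hypothetical validator for the classifier's JSON reply.

    Malformed or low-confidence replies are flagged needs_review instead
    of being trusted as a firm in-scope / out-of-scope decision.
    """
    fallback = {"in_scope": False, "reason": "unparseable reply",
                "confidence": 0.0, "needs_review": True}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict) or not isinstance(data.get("in_scope"), bool):
        return fallback
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    if not 0.0 <= confidence <= 1.0:
        confidence = 0.0  # out of range or NaN: treat as no confidence
    return {
        "in_scope": data["in_scope"],
        "reason": str(data.get("reason", "")),
        "confidence": confidence,
        "needs_review": confidence < min_confidence,
    }
```

A caller could send `needs_review` cases to a clarifying question or a human handoff rather than a flat refusal.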
3. Retrieval Score Threshold
```python
def rag_with_scope_check(query: str, vectorstore, min_score: float = 0.6) -> dict:
    # If the best retrieved chunk has low similarity, the query is likely out-of-scope.
    # Assumes results come back best match first and that the score is a
    # similarity (higher is better); check what your vector store returns.
    docs_with_scores = vectorstore.similarity_search_with_score(query, k=3)

    if not docs_with_scores or docs_with_scores[0][1] < min_score:
        return {
            "answer": "I don't have information about that topic. "
                      "Please contact our support team at support@acme.com.",
            "source": "fallback",
            "in_scope": False,
        }

    # Proceed with RAG generation
    context = "\n\n".join(doc.page_content for doc, _ in docs_with_scores)
    answer = generate_answer(query, context)  # generation helper, defined elsewhere
    return {"answer": answer, "source": "rag", "in_scope": True}
```
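The `< min_score` comparison above only makes sense when the score is a similarity. Some vector stores report a distance instead, where a smaller value means a closer match, and then the score has to be converted or the test flipped. For unit-normalized embeddings the conversions are simple identities. The helper names below are hypothetical, and which metric a given store returns is something to confirm in its documentation.

```python
def cosine_from_cosine_distance(distance: float) -> float:
    # Cosine distance is defined as 1 - cosine similarity
    return 1.0 - distance


def cosine_from_squared_l2(squared_distance: float) -> float:
    # For unit vectors a and b: ||a - b||^2 = 2 - 2 * (a . b)
    return 1.0 - squared_distance / 2.0


def passes_threshold(score: float, min_similarity: float,
                     metric: str = "similarity") -> bool:
    """metric is one of 'similarity', 'cosine_distance', 'squared_l2'."""
    if metric == "cosine_distance":
        score = cosine_from_cosine_distance(score)
    elif metric == "squared_l2":
        score = cosine_from_squared_l2(score)
    elif metric != "similarity":
        raise ValueError(f"unknown metric: {metric}")
    return score >= min_similarity
```

With a helper like this, a single `min_score` can keep the same meaning even if the underlying store changes.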
Fallback Response Strategies
| Situation | Response Strategy |
|---|---|
| Off-topic query | Politely decline + redirect to scope |
| Answer not in KB | Acknowledge + offer human handoff |
| Ambiguous query | Ask clarifying question |
| Sensitive topic | Escalate to human agent |
| Repeat failure | Create support ticket automatically |
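The table can be expressed as a small dispatch function. This is a sketch under assumptions: the `Situation` enum, the strategy identifiers, and the `max_failures` cutoff are invented here, and the rule that sensitive topics always escalate (even after repeated failures) is one reasonable reading of the table, not something it states.

```python
from enum import Enum, auto


class Situation(Enum):
    OFF_TOPIC = auto()
    NOT_IN_KB = auto()
    AMBIGUOUS = auto()
    SENSITIVE = auto()


# Strategy labels mirror the table rows; the identifiers are made up
STRATEGY = {
    Situation.OFF_TOPIC: "decline_and_redirect",
    Situation.NOT_IN_KB: "acknowledge_and_offer_handoff",
    Situation.AMBIGUOUS: "ask_clarifying_question",
    Situation.SENSITIVE: "escalate_to_human",
}


def choose_strategy(situation: Situation, prior_failures: int = 0,
                    max_failures: int = 2) -> str:
    """Sensitive topics always escalate; repeated failures open a ticket."""
    if situation is Situation.SENSITIVE:
        return STRATEGY[situation]
    if prior_failures >= max_failures:
        return "create_support_ticket"
    return STRATEGY[situation]
```

For example, `choose_strategy(Situation.AMBIGUOUS)` yields the clarifying-question strategy, while the same situation after two earlier failures in the conversation yields a ticket.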
Graceful Degradation Flow
```python
# is_harmful, create_ticket, and vectorstore are assumed to be defined elsewhere
def handle_query(query: str, user_id: str) -> str:
    # Step 1: Safety filter
    if is_harmful(query):
        return "I cannot help with that request."

    # Step 2: Scope check
    if not is_in_scope(query):
        return ("That's outside my area of expertise. I can help with "
                "billing, account management, and product questions. "
                "For other inquiries, please visit our help center.")

    # Step 3: RAG retrieval
    result = rag_with_scope_check(query, vectorstore)

    # Step 4: Low-confidence fallback
    if not result["in_scope"]:
        create_ticket(user_id, query)  # Auto-create support ticket

    # Either the generated answer or the fallback message
    return result["answer"]
```
Production insight: Track out-of-scope query patterns, because they reveal gaps in your knowledge base. If a meaningful share of users (say, 20%) ask about a topic that isn't covered, add it to your docs.
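A minimal sketch of that tracking, assuming each out-of-scope query has already been given a topic label (for instance from the classifier's reason field or a manual tag). The `OutOfScopeTracker` class is hypothetical and keeps counts in memory; a production system would more likely log to a database or analytics pipeline.

```python
from collections import Counter
from typing import Optional


class OutOfScopeTracker:
    """Hypothetical in-memory tracker of out-of-scope topics."""

    def __init__(self) -> None:
        self.total_queries = 0
        self.out_of_scope_topics = Counter()

    def record(self, topic: Optional[str] = None) -> None:
        """Call once per query; pass a topic only for out-of-scope queries."""
        self.total_queries += 1
        if topic is not None:
            self.out_of_scope_topics[topic] += 1

    def gaps(self, min_share: float = 0.2) -> list[tuple[str, float]]:
        """Topics whose share of all queries is at least min_share."""
        if self.total_queries == 0:
            return []
        return [(topic, count / self.total_queries)
                for topic, count in self.out_of_scope_topics.most_common()
                if count / self.total_queries >= min_share]
```

Reviewing `gaps()` on a regular schedule turns declined queries into a prioritized list of documentation to write.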