Concept #36 | Hard | system-design

How would you handle out-of-scope queries in the RAG system?

#gen-ai #system-design #rag

Answer

Handling Out-of-Scope Queries in RAG

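Out-of-scope queries fall into two categories: topic drift (the user asks about something unrelated to the system's domain) and knowledge gaps (the question is on-topic, but the answer isn't in the knowledge base). The two call for different handling: decline and redirect for topic drift, acknowledge and offer a human handoff for a knowledge gap.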
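To make the two categories concrete, here is a minimal sketch that combines two signals into one of three outcomes. The names `QueryStatus` and `categorize` are hypothetical, both scores are assumed to be similarities where higher means closer, and the default thresholds are placeholders that would need tuning on real traffic.

```python
from enum import Enum


class QueryStatus(Enum):
    IN_SCOPE = "in_scope"
    TOPIC_DRIFT = "topic_drift"      # unrelated to the assistant's domain
    KNOWLEDGE_GAP = "knowledge_gap"  # on-topic, but nothing relevant was retrieved


def categorize(scope_similarity: float,
               best_retrieval_score: float,
               scope_threshold: float = 0.35,
               retrieval_threshold: float = 0.6) -> QueryStatus:
    """Map two similarity signals to a query status (illustrative thresholds)."""
    # Far from every scope description: the user has drifted off-topic.
    if scope_similarity < scope_threshold:
        return QueryStatus.TOPIC_DRIFT
    # On-topic, but the best retrieved chunk is a weak match: likely a KB gap.
    if best_retrieval_score < retrieval_threshold:
        return QueryStatus.KNOWLEDGE_GAP
    return QueryStatus.IN_SCOPE
```

Checking scope first means an off-topic query never reaches retrieval, which saves a vector search on queries the system would decline anyway.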

Detection Strategies

1. Semantic Similarity Threshold

python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pre-embed scope-defining sentences
SCOPE_SENTENCES = [
    "product pricing and plans",
    "account management and billing",
    "technical troubleshooting and bugs",
    "shipping and delivery",
    "returns and refunds policy",
]
scope_embeddings = model.encode(SCOPE_SENTENCES, normalize_embeddings=True)

def is_in_scope(query: str, threshold: float = 0.35) -> bool:
    query_emb = model.encode([query], normalize_embeddings=True)[0]
    similarities = scope_embeddings @ query_emb
    max_sim = float(similarities.max())
    return max_sim >= threshold

# Quick check (expected results; exact scores depend on the embedding model)
print(is_in_scope("How do I cancel my subscription?"))  # expected: True
print(is_in_scope("What is the capital of France?"))    # expected: False
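The 0.35 default is only a starting point. One way to choose a cutoff is to score a small labeled set of queries and pick the value that classifies them best. The sketch below assumes you have already computed each query's maximum scope similarity; `pick_threshold` is a hypothetical helper and the scores are made up for illustration.

```python
def pick_threshold(scored_examples: list[tuple[float, bool]]) -> float:
    """Return the cutoff that maximizes accuracy on labeled examples.

    scored_examples holds (max_similarity, is_in_scope) pairs. A query is
    predicted in-scope when its similarity is >= the cutoff.
    """
    if not scored_examples:
        raise ValueError("need at least one labeled example")
    candidates = sorted({score for score, _ in scored_examples})
    best_cutoff, best_correct = candidates[0], -1
    for cutoff in candidates:
        correct = sum((score >= cutoff) == label
                      for score, label in scored_examples)
        # Strict '>' keeps the lowest cutoff among ties.
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


# Made-up similarity scores, for illustration only.
labeled = [
    (0.62, True), (0.48, True), (0.41, True),
    (0.30, False), (0.22, False), (0.12, False),
]
threshold = pick_threshold(labeled)
print(threshold)  # 0.41
```

Accuracy is the simplest objective. If wrongly refusing an in-scope query costs more than answering an off-topic one, a recall-weighted metric would be a better fit.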

2. LLM-Based Classifier (More Accurate, Higher Latency and Cost)

python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_query(query: str, scope_description: str) -> dict:
    """Ask the model whether the query is in scope and return its JSON verdict."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f'''Classify if this query is in-scope for a customer support chatbot.
Scope: {scope_description}
Query: "{query}"
Respond with JSON: {{"in_scope": true/false, "reason": "brief explanation", "confidence": 0.0-1.0}}'''
        }],
        response_format={"type": "json_object"},  # JSON mode: the reply is a JSON object
        temperature=0,  # reduce run-to-run variation
    )
    return json.loads(response.choices[0].message.content)
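JSON mode yields a JSON object, but it does not guarantee that the expected keys are present or correctly typed. A minimal validation sketch follows; `parse_classification` and the 0.7 confidence floor are assumptions for illustration. It fails closed, treating anything malformed as out of scope.

```python
import json


def parse_classification(raw: str, min_confidence: float = 0.7) -> dict:
    """Parse the classifier's JSON reply, failing closed on malformed output."""
    fallback = {"in_scope": False,
                "reason": "unparseable classifier output",
                "confidence": 0.0}
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return fallback
    if not isinstance(data, dict) or not isinstance(data.get("in_scope"), bool):
        return fallback
    confidence = data.get("confidence", 0.0)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return fallback
    confidence = max(0.0, min(1.0, float(confidence)))
    # A low-confidence "in scope" verdict is treated as out of scope.
    in_scope = data["in_scope"] and confidence >= min_confidence
    return {"in_scope": in_scope,
            "reason": str(data.get("reason", "")),
            "confidence": confidence}
```

Failing closed favors a polite refusal over a possibly wrong answer. A system that prefers to attempt an answer could flip the fallback to `in_scope: True` and rely on the retrieval check instead.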

3. Retrieval Score Threshold

python
def rag_with_scope_check(query: str, vectorstore, min_score: float = 0.6) -> dict:
    # Assumes a LangChain-style vector store. similarity_search_with_score may
    # return a distance (lower = closer) depending on the backend, so use the
    # normalized relevance scores instead (0 to 1, higher = more similar).
    docs_with_scores = vectorstore.similarity_search_with_relevance_scores(query, k=3)

    # If even the best retrieved chunk has low relevance, the query is likely out-of-scope
    best_score = max((score for _, score in docs_with_scores), default=0.0)
    if best_score < min_score:
        return {
            "answer": "I don't have information about that topic. "
                      "Please contact our support team at support@acme.com.",
            "source": "fallback",
            "in_scope": False,
        }

    # Proceed with RAG generation (generate_answer is assumed to be defined elsewhere)
    context = "\n\n".join(doc.page_content for doc, _ in docs_with_scores)
    answer = generate_answer(query, context)
    return {"answer": answer, "source": "rag", "in_scope": True}

Fallback Response Strategies

Situation | Response Strategy
Off-topic query | Politely decline + redirect to scope
Answer not in KB | Acknowledge + offer human handoff
Ambiguous query | Ask clarifying question
Sensitive topic | Escalate to human agent
Repeat failure | Create support ticket automatically
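The table above can be sketched as a lookup plus a per-user failure counter. All names here (`FALLBACKS`, `choose_fallback`, `record_success`, the strategy labels) are hypothetical. The sketch assumes every fallback counts as a failed turn and that two failures in a row trigger a ticket.

```python
from collections import defaultdict

# Situation -> strategy label, mirroring the table.
FALLBACKS = {
    "off_topic": "decline_and_redirect",
    "not_in_kb": "offer_human_handoff",
    "ambiguous": "ask_clarifying_question",
    "sensitive": "escalate_to_human",
    "repeat_failure": "create_support_ticket",
}

# In-memory counter of consecutive failed turns per user (illustrative only).
failure_counts: dict[str, int] = defaultdict(int)


def choose_fallback(user_id: str, situation: str, repeat_limit: int = 2) -> str:
    """Pick a fallback strategy, switching to a ticket after repeated failures."""
    failure_counts[user_id] += 1
    # Sensitive topics always go to a human, regardless of the failure count.
    if situation != "sensitive" and failure_counts[user_id] >= repeat_limit:
        return FALLBACKS["repeat_failure"]
    return FALLBACKS[situation]


def record_success(user_id: str) -> None:
    """Reset the counter once the user gets a real answer."""
    failure_counts.pop(user_id, None)
```

In a real deployment the counter would live in session storage rather than process memory, so that it survives restarts and works across workers.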

Graceful Degradation Flow

python
def handle_query(query: str, user_id: str) -> str:
    # is_harmful, create_ticket, and vectorstore are assumed to be defined elsewhere.

    # Step 1: Safety filter
    if is_harmful(query):
        return "I cannot help with that request."

    # Step 2: Scope check (topic drift)
    if not is_in_scope(query):
        return ("That's outside my area of expertise. I can help with "
                "billing, account management, and product questions. "
                "For other inquiries, please visit our help center.")

    # Step 3: RAG retrieval
    result = rag_with_scope_check(query, vectorstore)

    # Step 4: Low-confidence fallback (knowledge gap)
    if not result["in_scope"]:
        create_ticket(user_id, query)  # Auto-create support ticket

    # Either the generated answer or the fallback message from Step 3
    return result["answer"]

Production insight: Track out-of-scope query patterns, because they reveal gaps in your knowledge base. If 20% of users ask about a topic that isn't covered, add it to your docs.
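A minimal sketch of that tracking, assuming each out-of-scope query has already been tagged with a topic label (for example by the classifier's `reason` field or by clustering). `ScopeGapTracker` is a hypothetical name, and the 20% rule is applied here to the share of queries rather than of users.

```python
from collections import Counter
from typing import Optional


class ScopeGapTracker:
    """Counts out-of-scope queries by topic and flags frequent ones."""

    def __init__(self, alert_share: float = 0.20):
        self.alert_share = alert_share
        self.total_queries = 0
        self.out_of_scope_topics: Counter = Counter()

    def record(self, in_scope: bool, topic: Optional[str] = None) -> None:
        """Log one query; only out-of-scope queries are counted by topic."""
        self.total_queries += 1
        if not in_scope:
            self.out_of_scope_topics[topic or "unknown"] += 1

    def gaps(self) -> list[str]:
        """Topics whose share of all queries meets the alert threshold."""
        if self.total_queries == 0:
            return []
        return sorted(
            topic for topic, count in self.out_of_scope_topics.items()
            if count / self.total_queries >= self.alert_share
        )


tracker = ScopeGapTracker()
for _ in range(6):
    tracker.record(in_scope=True)
for _ in range(3):
    tracker.record(in_scope=False, topic="warranty")
tracker.record(in_scope=False, topic="careers")
print(tracker.gaps())  # ['warranty']
```

Here "warranty" accounts for 3 of 10 queries and is flagged, while "careers" at 1 of 10 is not.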