How would you handle out-of-scope queries in the RAG system?

Question

Accepted Answer

## Handling Out-of-Scope Queries in RAG

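Out-of-scope queries fall into two categories: **topic drift** (the user asks about something unrelated to the system's purpose) and **knowledge gaps** (the question is on-topic, but the answer isn't in the knowledge base). The two call for different handling: off-topic queries are politely declined and redirected, while knowledge gaps are acknowledged and handed off to a human.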

### Detection Strategies

**1. Semantic Similarity Threshold**

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

# Pre-embed scope-defining sentences
SCOPE_SENTENCES = [
    "product pricing and plans",
    "account management and billing",
    "technical troubleshooting and bugs",
    "shipping and delivery",
    "returns and refunds policy",
]
scope_embeddings = model.encode(SCOPE_SENTENCES, normalize_embeddings=True)

def is_in_scope(query: str, threshold: float = 0.35) -> bool:
    query_emb = model.encode([query], normalize_embeddings=True)[0]
    similarities = scope_embeddings @ query_emb
    max_sim = float(similarities.max())
    return max_sim >= threshold

# Quick check (expected results; actual values depend on the model and threshold)
print(is_in_scope("How do I cancel my subscription?"))  # expected: True
print(is_in_scope("What is the capital of France?"))    # expected: False
```
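
The `0.35` default above is only a starting point, since a good cutoff depends on the embedding model and on how the scope sentences are worded. One way to choose it is to score a small hand-labeled set of queries and pick the cutoff that classifies them best. The sketch below assumes you already have each query's max scope similarity; `pick_threshold` and the sample scores are hypothetical, for illustration only.

```python
def pick_threshold(scores, labels, candidates=None):
    """Return (threshold, accuracy) for the candidate cutoff with the best accuracy.

    scores: max scope similarity for each labeled query
    labels: True if that query is actually in scope
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    if candidates is None:
        candidates = [round(0.05 * i, 2) for i in range(1, 20)]  # 0.05 .. 0.95

    best_threshold, best_accuracy = candidates[0], -1.0
    for t in candidates:
        # A query is predicted in scope when its score reaches the cutoff
        correct = sum((s >= t) == y for s, y in zip(scores, labels))
        accuracy = correct / len(scores)
        if accuracy > best_accuracy:  # ties keep the lowest cutoff
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Hypothetical similarity scores, not measured from a real model
scores = [0.62, 0.48, 0.41, 0.22, 0.18, 0.30]
labels = [True, True, True, False, False, False]
threshold, accuracy = pick_threshold(scores, labels)
```

Accuracy is the simplest criterion. If wrongly rejecting an in-scope query costs more than letting an off-topic one through, optimize recall on the in-scope class instead.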

**2. LLM-Based Classifier (Usually More Accurate, but Slower and Costlier)**

```python
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_query(query: str, scope_description: str) -> dict:
    # JSON mode plus temperature=0 keeps the output parseable and stable
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f'''Classify if this query is in-scope for a customer support chatbot.
Scope: {scope_description}
Query: "{query}"
Respond with JSON: {{"in_scope": true/false, "reason": "brief explanation", "confidence": 0.0-1.0}}'''
        }],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)
```
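
The classifier's reply still has to be treated as untrusted input: the JSON may be malformed, a key may be missing, or the confidence may be too low to act on. The sketch below shows one way to turn the raw reply into a routing decision. `parse_classification`, the `"unsure"` outcome, and the `0.7` cutoff are assumptions for illustration, not part of the OpenAI API.

```python
import json

def parse_classification(raw: str, min_confidence: float = 0.7) -> str:
    """Map the classifier's raw JSON string to 'in_scope', 'out_of_scope', or 'unsure'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "unsure"  # malformed reply

    # Require a real boolean so a string like "false" is not mistaken for True
    if not isinstance(data, dict) or not isinstance(data.get("in_scope"), bool):
        return "unsure"

    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        return "unsure"

    if confidence < min_confidence:
        return "unsure"  # e.g. route to a clarifying question
    return "in_scope" if data["in_scope"] else "out_of_scope"
```

An `"unsure"` result fits the "ambiguous query" case: ask a clarifying question rather than declining outright.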

**3. Retrieval Score Threshold**

```python
def rag_with_scope_check(query: str, vectorstore, min_score: float = 0.6) -> dict:
    # If the best retrieved chunk has low similarity, the query is likely out-of-scope.
    # Assumes a higher score means a closer match; some vector stores return a
    # distance instead (lower is better), in which case flip the comparison.
    docs_with_scores = vectorstore.similarity_search_with_score(query, k=3)

    if not docs_with_scores or docs_with_scores[0][1] < min_score:
        return {
            "answer": "I don't have information about that topic. "
                      "Please contact our support team at support@acme.com.",
            "source": "fallback",
            "in_scope": False,
        }

    # Proceed with RAG generation; generate_answer is assumed to be defined elsewhere
    context = "\n\n".join(doc.page_content for doc, _ in docs_with_scores)
    answer = generate_answer(query, context)
    return {"answer": answer, "source": "rag", "in_scope": True}
```

### Fallback Response Strategies

| Situation | Response Strategy |
|-----------|------------------|
| **Off-topic query** | Politely decline + redirect to scope |
| **Answer not in KB** | Acknowledge + offer human handoff |
| **Ambiguous query** | Ask clarifying question |
| **Sensitive topic** | Escalate to human agent |
| **Repeat failure** | Create support ticket automatically |
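
The table above can be encoded as a lookup so that every situation has exactly one fallback. This is a minimal sketch: the `Situation` enum, the action names, and the message wording are hypothetical placeholders to adapt to your own product.

```python
from enum import Enum

class Situation(Enum):
    OFF_TOPIC = "off_topic"
    NOT_IN_KB = "not_in_kb"
    AMBIGUOUS = "ambiguous"
    SENSITIVE = "sensitive"
    REPEAT_FAILURE = "repeat_failure"

# One (action, message) pair per row of the table; wording is a placeholder
FALLBACKS = {
    Situation.OFF_TOPIC: (
        "decline_and_redirect",
        "That's outside what I can help with. I can answer billing, account, and product questions.",
    ),
    Situation.NOT_IN_KB: (
        "offer_human_handoff",
        "I don't have that information. Would you like me to connect you with a support agent?",
    ),
    Situation.AMBIGUOUS: (
        "ask_clarification",
        "Could you tell me a bit more about what you need?",
    ),
    Situation.SENSITIVE: (
        "escalate_to_human",
        "I'm passing this to a human agent who can help.",
    ),
    Situation.REPEAT_FAILURE: (
        "create_ticket",
        "I've opened a support ticket so our team can follow up.",
    ),
}

def fallback_for(situation: Situation) -> dict:
    """Return the action to take and the message to show for a situation."""
    action, message = FALLBACKS[situation]
    return {"action": action, "message": message}
```

Keeping the mapping in data rather than in `if` branches makes it easy to review the wording and to check that no situation is left without a response.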

### Graceful Degradation Flow

```python
# is_harmful, create_ticket, and vectorstore are assumed to be defined elsewhere
def handle_query(query: str, user_id: str) -> str:
    # Step 1: Safety filter
    if is_harmful(query):
        return "I cannot help with that request."

    # Step 2: Scope check
    if not is_in_scope(query):
        return ("That's outside my area of expertise. I can help with "
                "billing, account management, and product questions. "
                "For other inquiries, please visit our help center.")

    # Step 3: RAG retrieval
    result = rag_with_scope_check(query, vectorstore)

    # Step 4: Low-confidence fallback
    if not result["in_scope"]:
        create_ticket(user_id, query)  # Auto-create support ticket
        return result["answer"]

    return result["answer"]
```

> **Production insight:** Track out-of-scope query patterns, because they reveal gaps in your knowledge base. If a large share of users (say, 20%) ask about a topic that isn't covered, add it to your docs.
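
A minimal sketch of that tracking, assuming each out-of-scope query has already been given a topic label (for example from the classifier's `reason` field or from manual tagging). `OutOfScopeTracker` is a hypothetical helper, and it uses the share of queries as a proxy for the share of users.

```python
from collections import Counter
from typing import List, Optional, Tuple

class OutOfScopeTracker:
    """Counts out-of-scope queries per topic to surface knowledge-base gaps."""

    def __init__(self) -> None:
        self.total_queries = 0
        self.topic_counts: Counter = Counter()

    def record(self, in_scope: bool, topic: Optional[str] = None) -> None:
        """Log one query; only out-of-scope queries are counted per topic."""
        self.total_queries += 1
        if not in_scope:
            self.topic_counts[topic or "unlabeled"] += 1

    def gaps(self, min_share: float = 0.2) -> List[Tuple[str, float]]:
        """Topics whose share of all queries is at least min_share, most frequent first."""
        if self.total_queries == 0:
            return []
        return [
            (topic, count / self.total_queries)
            for topic, count in self.topic_counts.most_common()
            if count / self.total_queries >= min_share
        ]


# Usage with made-up traffic: 3 of 10 queries ask about an uncovered topic
tracker = OutOfScopeTracker()
for _ in range(6):
    tracker.record(in_scope=True)
for _ in range(3):
    tracker.record(in_scope=False, topic="warranty")
tracker.record(in_scope=False, topic="careers")
candidate_topics = tracker.gaps(min_share=0.2)
```

In production the counts would normally live in your analytics store rather than in memory, but the same threshold logic applies.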


