What is the difference between Basic RAG and Agentic RAG?
#gen-ai #rag #agents
Answer
Basic RAG vs Agentic RAG
Basic RAG runs a fixed retrieve-then-generate pipeline. Agentic RAG hands control of retrieval to an LLM agent, which decides when, where, and how often to search.
Basic RAG
A fixed, linear pipeline — always retrieves, always generates:
```text
Query → Embed → Search → Retrieve top-k → Generate
```
```python
def basic_rag(question: str) -> str:
    # `vector_db` and `llm` are assumed to be initialized elsewhere.
    results = vector_db.similarity_search(question, k=4)
    context = "\n".join([r.page_content for r in results])
    return llm.invoke(f"Context: {context}\n\nQuestion: {question}")
```
Limitations:
- Single retrieval step — misses multi-hop questions
- Same process for all queries — no decision-making
- Can't choose between data sources
- No self-correction if retrieval quality is poor
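
The multi-hop limitation can be seen in a toy setting. The sketch below is not a real embedding pipeline: it assumes a three-sentence hypothetical corpus and uses word counts with cosine similarity as a stand-in for embeddings. The names `DOCS`, `embed`, `cosine`, and `retrieve` are illustrative only.

```python
import math
import re
from collections import Counter

# Toy corpus: hypothetical sentences used only for this illustration.
DOCS = [
    "Alice manages the Payments team.",
    "The Payments team is based in Berlin.",
    "Bob manages the Search team.",
]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (not a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 1) -> list:
    """Embed the query, score every document, return the top-k."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


# Hop 1: the only retrieval Basic RAG would perform.
hop1 = retrieve("Where does Alice's team work?", k=1)

# Hop 2: a follow-up query written by hand here; in an agentic setup the
# LLM would write it after reading the result of hop 1.
hop2 = retrieve("Where is the Payments team based?", k=1)
```

With this corpus, `hop1` returns the sentence that names Alice's team but not its location. The location only surfaces in `hop2`, whose query depends on what the first hop found.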
Agentic RAG
The LLM decides when, what, and how many times to retrieve:
```python
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool

# `llm`, `vector_db`, `web_search`, `sql_query`, and `react_prompt`
# are assumed to be defined elsewhere.

# Retrieval is just one tool the agent can choose
tools = [
    Tool(
        name="search_knowledge_base",
        description="Search internal docs for company-specific info",
        func=lambda q: vector_db.similarity_search(q, k=3),
    ),
    Tool(
        name="search_web",
        description="Search internet for current public information",
        func=web_search,
    ),
    Tool(
        name="query_database",
        description="Get specific records from SQL database",
        func=sql_query,
    ),
]

# Agent decides: which tool? how many times? stop when confident
agent = create_react_agent(llm, tools, react_prompt)
executor = AgentExecutor(agent=agent, tools=tools, max_iterations=10)

result = executor.invoke({"input": "Compare our Q3 revenue to industry average"})
```
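To show roughly what such an executor does, here is a framework-free sketch of the decide, act, observe loop. It is an approximation and not LangChain's implementation: `scripted_policy` is a hypothetical stand-in for the LLM's decisions, and the two tools return placeholder strings instead of real search results.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-in tools. Real ones would query a vector DB or the web.
def search_knowledge_base(query: str) -> str:
    return f"[kb] {query}"


def search_web(query: str) -> str:
    return f"[web] {query}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "search_knowledge_base": search_knowledge_base,
    "search_web": search_web,
}

Policy = Callable[[str, List[str]], Tuple[str, str]]


def scripted_policy(question: str, observations: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: picks the next (action, argument) pair."""
    if len(observations) == 0:
        return "search_knowledge_base", "our Q3 revenue"
    if len(observations) == 1:
        return "search_web", "industry average Q3 revenue"
    return "finish", " | ".join(observations)


def run_agent(question: str, policy: Policy, max_iterations: int = 10) -> str:
    """Call tools chosen by the policy until it finishes or the cap is hit."""
    observations: List[str] = []
    for _ in range(max_iterations):
        action, argument = policy(question, observations)
        if action == "finish":
            return argument
        observations.append(TOOLS[action](argument))
    return "stopped: iteration limit reached"


answer = run_agent("Compare our Q3 revenue to industry average", scripted_policy)
```

The iteration cap plays the same role as `max_iterations` above: it bounds latency and cost if the policy never decides to finish.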
Comparison
| Dimension | Basic RAG | Agentic RAG |
|---|---|---|
| Control | Fixed pipeline | LLM decides |
| Retrievals | Exactly one | 0, 1, or many |
| Multi-hop | Weak (single retrieval pass) | Strong (can chain retrievals) |
| Tool selection | No | Yes |
| Self-correction | No | Yes (re-retrieve on poor results) |
| Latency | Lower | Higher |
| Cost | Lower | Higher |
| Complexity | Low | High |
Agentic RAG Patterns
| Pattern | Description |
|---|---|
| Adaptive RAG | Agent decides if retrieval is even needed |
| Iterative RAG | Retrieve → assess → retrieve again if needed |
| Multi-source RAG | Agent chooses between vector DB, web, SQL |
| Self-RAG | Model emits special reflection tokens to decide when to retrieve and to critique its own output |
| Corrective RAG | Evaluate retrieved docs; search web if quality is poor |
| Query decomposition | Break complex query into sub-queries |
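
As one illustration of the patterns above, query decomposition can be sketched as: split the question, retrieve per sub-query, then merge the results without duplicates. This is a minimal sketch under stated assumptions: `toy_decompose` stands in for an LLM call, `toy_retrieve` stands in for a vector search, and `CORPUS` is a hypothetical three-document store.

```python
from typing import Callable, List

# Hypothetical document store keyed by topic keyword.
CORPUS = {
    "pricing": "Doc A: pricing tiers",
    "refund": "Doc B: refund policy",
    "shipping": "Doc C: shipping times",
}


def toy_decompose(question: str) -> List[str]:
    """Stand-in for an LLM call: split the question on ' and '."""
    return [part.strip(" ?") for part in question.split(" and ")]


def toy_retrieve(query: str, k: int) -> List[str]:
    """Stand-in for vector search: keyword match against CORPUS keys."""
    hits = [doc for key, doc in CORPUS.items() if key in query.lower()]
    return hits[:k]


def decompose_and_retrieve(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    k: int = 2,
) -> List[str]:
    """Retrieve for each sub-query and merge results, preserving order."""
    seen = set()
    merged: List[str] = []
    for sub_query in decompose(question):
        for doc in retrieve(sub_query, k):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


docs = decompose_and_retrieve(
    "What is the refund policy and what are the shipping times?",
    toy_decompose,
    toy_retrieve,
)
```

The merged list would then be passed to the generator as context, exactly as in Basic RAG.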
Adaptive RAG Decision Logic
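```python
def adaptive_rag(question: str) -> str:
    # Assumes `llm.invoke` returns a plain string; chat models return a
    # message object, in which case read its `.content` first.
    # First: does this question need retrieval?
    decision = llm.invoke(
        f"Does answering '{question}' require looking up specific documents? "
        "Answer YES or NO only."
    )

    if decision.strip().upper().startswith("YES"):
        # Retrieve and assess quality
        chunks = vector_db.similarity_search(question, k=5)
        quality = llm.invoke(
            f"Are these chunks relevant to '{question}'? "
            f"Answer YES or NO only.\n{chunks}"
        )
        # Check the leading token: a substring test for "NO" would also
        # match words such as "KNOWN".
        if quality.strip().upper().startswith("NO"):
            chunks = web_search(question)  # Fallback to web
        return llm.invoke(f"Context: {chunks}\nQuestion: {question}")
    else:
        return llm.invoke(question)  # Answer from training knowledge
```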
When to Use Each
| Use Basic RAG | Use Agentic RAG |
|---|---|
| Simple Q&A, FAQ bot | Complex research tasks |
| Single knowledge source | Multiple data sources |
| Consistent, predictable queries | Diverse, open-ended queries |
| Low latency requirement | High accuracy requirement |
| Small team, prototype | Production enterprise |