Answer
What is Graph RAG?
Graph RAG (Graph-based Retrieval-Augmented Generation) is an advanced RAG approach that uses a knowledge graph instead of (or in addition to) a vector database to organize and retrieve information, enabling multi-hop reasoning and better understanding of relationships between entities.
Why Graph RAG?
Standard RAG retrieves text chunks by embedding similarity. It works well for direct questions but struggles with:
- Multi-hop questions: "Who is the CEO of the company that acquired OpenAI's competitor?"
- Relational questions: "What are all the products from companies that Apple acquired?"
- Structural queries: "How does entity A connect to entity B?"
Graph RAG solves this by storing information as a knowledge graph where entities are nodes and relationships are edges.
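As a rough illustration of why a node-and-edge representation helps with relational questions like the ones above, here is a minimal sketch that stores facts as (subject, relation, object) triples and answers a two-hop query by chaining lookups. All entity names (`BigCo`, `StartupA`, `WidgetOne`, and so on) and the helpers `build_index` and `follow` are made-up placeholders, not real data or part of any library.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object); every name is a placeholder.
TRIPLES = [
    ("BigCo", "acquired", "StartupA"),
    ("BigCo", "acquired", "StartupB"),
    ("StartupA", "makes", "WidgetOne"),
    ("StartupB", "makes", "WidgetTwo"),
    ("OtherCo", "makes", "Gadget"),
]


def build_index(triples):
    """Index objects by (subject, relation) so each hop is a dictionary lookup."""
    index = defaultdict(list)
    for subj, rel, obj in triples:
        index[(subj, rel)].append(obj)
    return index


def follow(index, start_nodes, relation):
    """One hop: every object reachable from start_nodes via `relation`."""
    return [obj for node in start_nodes for obj in index.get((node, relation), [])]


index = build_index(TRIPLES)

# "What are all the products from companies that BigCo acquired?"
acquired = follow(index, ["BigCo"], "acquired")   # hop 1
products = follow(index, acquired, "makes")       # hop 2
print(products)
```

A similarity search would need a single chunk that mentions both the acquisition and the product; the graph only needs each fact to be stored once, somewhere.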
Knowledge Graph Structure
```text
Standard Vector RAG:
  Documents → Chunks → Embeddings → Vector DB
  Query → Similar chunks → LLM answers

Graph RAG:
  Documents → Entity extraction → Knowledge Graph
  Query → Graph traversal + vector search → LLM answers
```
```text
Example Knowledge Graph:

[Anthropic]    --founded_by-->    [Dario Amodei]
[Anthropic]    --created-->       [Claude]
[Claude]       --is_a-->          [LLM]
[Dario Amodei] --previously_at--> [OpenAI]
[OpenAI]       --created-->       [GPT-4]
[GPT-4]        --is_a-->          [LLM]
```
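A structural query such as "How does entity A connect to entity B?" becomes a path search over these edges. The sketch below encodes the example graph above as triples and runs a breadth-first search along edge direction. The function name `find_path` is illustrative only; a graph library or graph database would normally provide this.

```python
from collections import deque

# The example knowledge graph from above, as (subject, relation, object) triples.
EDGES = [
    ("Anthropic", "founded_by", "Dario Amodei"),
    ("Anthropic", "created", "Claude"),
    ("Claude", "is_a", "LLM"),
    ("Dario Amodei", "previously_at", "OpenAI"),
    ("OpenAI", "created", "GPT-4"),
    ("GPT-4", "is_a", "LLM"),
]


def find_path(edges, source, target):
    """Breadth-first search following edge direction.

    Returns the shortest chain of triples from source to target,
    an empty list if source == target, or None if no path exists.
    """
    adjacency = {}
    for subj, rel, obj in edges:
        adjacency.setdefault(subj, []).append((rel, obj))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        for rel, nxt in adjacency.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


# "How does Anthropic connect to OpenAI?"
path = find_path(EDGES, "Anthropic", "OpenAI")
for subj, rel, obj in path:
    print(f"[{subj}] --{rel}--> [{obj}]")
# [Anthropic] --founded_by--> [Dario Amodei]
# [Dario Amodei] --previously_at--> [OpenAI]
```

The returned chain of triples can be passed to the LLM as context, which gives it the intermediate hop (Dario Amodei) that no single edge contains.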
Microsoft GraphRAG (Popular Implementation)
Microsoft released GraphRAG as an open-source library:
```bash
pip install graphrag
```
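```python
# GraphRAG workflow (schematic). Module paths and parameters follow an early
# GraphRAG release and have changed between versions; check the documentation
# for the version you have installed before running this.
from graphrag.index import create_pipeline_config
from graphrag.index.run import run_pipeline_with_config

# Used when constructing the search engines (construction omitted here)
from graphrag.query.indexer_adapters import read_indexer_entities
from graphrag.query.llm.oai.chat_openai import ChatOpenAI


async def build_index():
    # 1. Indexing: extract entities and relationships from documents
    await run_pipeline_with_config(
        config=create_pipeline_config(
            input_dir="./documents",
            output_dir="./graphrag_output",
            llm_model="gpt-4o",
        )
    )


async def run_queries(local_search, global_search):
    # 2. Query: `local_search` and `global_search` are search-engine objects
    #    built from the index output. Local search is entity-centric; global
    #    search works over summaries of entity communities.

    # Local search (specific entity questions)
    local_result = await local_search.asearch(
        "What are Anthropic's main AI safety initiatives?"
    )

    # Global search (broad thematic questions)
    global_result = await global_search.asearch(
        "What are the major themes in AI safety research?"
    )
    return local_result, global_result
```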
Building a Simple Graph RAG
```python
import json

import chromadb
import networkx as nx
from anthropic import Anthropic

MODEL = "claude-opus-4-6"
client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment


class SimpleGraphRAG:
    def __init__(self):
        self.graph = nx.DiGraph()
        # In-memory Chroma collection; uses Chroma's default embedding function
        self.vector_db = chromadb.Client().get_or_create_collection("docs")

    def add_document(self, doc_id: str, text: str):
        # 1. Extract entities and relationships using the LLM
        extraction = client.messages.create(
            model=MODEL,
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": (
                    "Extract entities and relationships from this text. "
                    "Reply with valid JSON only, in the form "
                    '{"entities": ["..."], '
                    '"relationships": [["entity1", "relation", "entity2"]]}'
                    f"\n\n{text}"
                ),
            }],
        )
        # Raises json.JSONDecodeError if the reply is not bare JSON
        data = json.loads(extraction.content[0].text)

        # 2. Add to knowledge graph
        for entity in data.get("entities", []):
            self.graph.add_node(entity)
        for rel in data.get("relationships", []):
            if len(rel) == 3:
                self.graph.add_edge(rel[0], rel[2], relation=rel[1])

        # 3. Also add to vector store for similarity search
        self.vector_db.add(documents=[text], ids=[doc_id])

    def query(self, question: str) -> str:
        # 1. Vector search for relevant chunks
        results = self.vector_db.query(query_texts=[question], n_results=3)

        # 2. Find graph entities mentioned in the question (naive substring match)
        question_lower = question.lower()
        entities_in_question = [
            node for node in self.graph.nodes
            if str(node).lower() in question_lower
        ]

        # 3. Collect outgoing edges (one hop) for each matched entity
        graph_context = []
        for entity in entities_in_question:
            for _, neighbor, edge_data in self.graph.out_edges(entity, data=True):
                relation = edge_data.get("relation", "relates_to")
                graph_context.append(f"{entity} --{relation}--> {neighbor}")

        # 4. Combine vector + graph context
        context = "\n".join(results["documents"][0] + graph_context)
        response = client.messages.create(
            model=MODEL,
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            }],
        )
        return response.content[0].text
```
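The `json.loads` call in `add_document` assumes the model replies with bare JSON. In practice a reply may be wrapped in a Markdown code fence or preceded by a sentence of commentary. The following is a minimal sketch of a more tolerant parser, using only the standard library; `parse_extraction` is a hypothetical helper name, and the sample reply is invented for illustration.

```python
import json
import re


def parse_extraction(raw: str) -> dict:
    """Parse an LLM extraction reply into {"entities": [...], "relationships": [...]}.

    Tolerates code fences and prose around the JSON object, and drops
    malformed entries instead of raising.
    """
    empty = {"entities": [], "relationships": []}

    # Take the span from the first "{" to the last "}", ignoring anything around it.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return empty
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty

    raw_entities = data.get("entities")
    raw_relationships = data.get("relationships")
    if not isinstance(raw_entities, list):
        raw_entities = []
    if not isinstance(raw_relationships, list):
        raw_relationships = []

    entities = [e for e in raw_entities if isinstance(e, str)]
    relationships = [
        r for r in raw_relationships
        if isinstance(r, list) and len(r) == 3 and all(isinstance(x, str) for x in r)
    ]
    return {"entities": entities, "relationships": relationships}


# Invented sample reply: commentary, a fenced block, and one malformed relationship.
fence = "`" * 3
payload = (
    '{"entities": ["Anthropic", "Claude"], '
    '"relationships": [["Anthropic", "created", "Claude"], ["bad"]]}'
)
reply = "Here is the result:\n" + fence + "json\n" + payload + "\n" + fence
print(parse_extraction(reply))
```

Swapping this in for the bare `json.loads` call would let a single malformed reply be skipped rather than aborting ingestion, at the cost of silently ignoring documents whose extraction failed; logging the failures is probably worthwhile.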
Graph RAG vs Standard RAG
| Aspect | Standard RAG | Graph RAG |
|---|---|---|
| Structure | Flat chunks | Entities + relationships |
| Multi-hop | Poor | Excellent |
| Relationship queries | Poor | Excellent |
| Setup complexity | Low | High |
| Cost | Lower | Higher (entity extraction) |
| Best for | Direct fact lookup | Complex reasoning, analysis |