What is Graph RAG?

Question

What is Graph RAG?

Accepted Answer

## What is Graph RAG?

**Graph RAG** (Graph-based Retrieval-Augmented Generation) is an advanced RAG approach that uses a **knowledge graph** instead of (or in addition to) a vector database to organize and retrieve information, enabling multi-hop reasoning and better understanding of relationships between entities.

### Why Graph RAG?

Standard RAG retrieves text chunks by embedding similarity. It works well for direct questions but struggles with:
- Multi-hop questions: "Who is the CEO of the company that acquired OpenAI's competitor?"
- Relational questions: "What are all the products from companies that Apple acquired?"
- Structural queries: "How does entity A connect to entity B?"

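Graph RAG addresses this by storing information as a knowledge graph, where entities are nodes and relationships are edges, so a query can follow edges across several facts instead of relying on a single similarity match.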
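To make the structural query "How does entity A connect to entity B?" concrete, here is a minimal sketch of answering it with a breadth-first search over edges. The graph contents, the relation names, and the `find_path` helper are all hypothetical placeholders, not part of any Graph RAG library.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relation, neighbor) pairs.
GRAPH = {
    "A": [("partners_with", "C")],
    "C": [("acquired", "D")],
    "D": [("employs", "B")],
    "B": [],
}


def find_path(graph, start, goal):
    """Breadth-first search: shortest chain of edges from start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


# Each element of the result is one hop: (source, relation, target).
path = find_path(GRAPH, "A", "B")
for source, relation, target in path:
    print(f"{source} --{relation}--> {target}")
```

A similarity search would need a single chunk that mentions both A and B; the traversal only needs each individual hop to have been extracted somewhere in the corpus.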

### Knowledge Graph Structure

```
Standard Vector RAG:
  Documents → Chunks → Embeddings → Vector DB
  Query → Similar chunks → LLM answers

Graph RAG:
  Documents → Entity extraction → Knowledge Graph
  Query → Graph traversal + vector search → LLM answers
```

```
Example Knowledge Graph:
  [Anthropic] --founded_by--> [Dario Amodei]
  [Anthropic] --created--> [Claude]
  [Claude] --is_a--> [LLM]
  [Dario Amodei] --previously_at--> [OpenAI]
  [OpenAI] --created--> [GPT-4]
  [GPT-4] --is_a--> [LLM]
```
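As a rough illustration of multi-hop retrieval, the example graph above can be encoded as (subject, relation, object) triples and queried by chaining lookups. The `objects` and `llms_from_founders_previous_employer` helpers are made up for this sketch; a real system would typically use a graph database or a graph library instead of a list scan.

```python
# The example knowledge graph, encoded as (subject, relation, object) triples.
TRIPLES = [
    ("Anthropic", "founded_by", "Dario Amodei"),
    ("Anthropic", "created", "Claude"),
    ("Claude", "is_a", "LLM"),
    ("Dario Amodei", "previously_at", "OpenAI"),
    ("OpenAI", "created", "GPT-4"),
    ("GPT-4", "is_a", "LLM"),
]


def objects(subject, relation):
    """All objects o such that (subject, relation, o) is in the graph."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def llms_from_founders_previous_employer(org):
    """Chain hops: org -> founder -> previous employer -> products that are LLMs."""
    results = []
    for founder in objects(org, "founded_by"):
        for employer in objects(founder, "previously_at"):
            for product in objects(employer, "created"):
                if "LLM" in objects(product, "is_a"):
                    results.append(product)
    return results


answer = llms_from_founders_previous_employer("Anthropic")
print(answer)  # ['GPT-4']
```

No single triple answers the question; the result only appears once the lookups are chained, which is the case that chunk-level similarity search handles poorly.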

### Microsoft GraphRAG (Popular Implementation)

Microsoft released **GraphRAG** as an open-source library:

```bash
pip install graphrag
```

The library's Python API has changed across releases, so the sketch below uses the command-line workflow from the project's getting-started guide instead. The `./ragtest` directory name is only an example, and commands or flags may differ in the version you install.

```bash
# 1. Indexing: extract entities and relationships, detect communities,
#    and write community summaries
graphrag init --root ./ragtest     # creates settings.yaml and .env
# Copy your .txt documents into ./ragtest/input, then set the API key
# in ./ragtest/.env and the model in ./ragtest/settings.yaml
graphrag index --root ./ragtest

# 2. Query
# Local search (specific entity questions): combines an entity's graph
# neighborhood with the related text chunks
graphrag query --root ./ragtest --method local \
  --query "What are Anthropic's main AI safety initiatives?"

# Global search (broad thematic questions): aggregates answers over the
# community summaries
graphrag query --root ./ragtest --method global \
  --query "What are the major themes in AI safety research?"
```

### Building a Simple Graph RAG

```python
import json

import chromadb
import networkx as nx
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment


class SimpleGraphRAG:
    def __init__(self):
        self.graph = nx.DiGraph()
        self.vector_db = chromadb.Client().create_collection("docs")

    def add_document(self, doc_id: str, text: str):
        # 1. Extract entities and relationships using the LLM
        extraction = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=512,
            messages=[{"role": "user", "content":
                "Extract entities and relationships from this text. "
                "Reply with JSON only, in the format: "
                '{"entities": [], "relationships": [["entity1", "relation", "entity2"]]}'
                f"\n\n{text}"}]
        )
        # Assumes the reply is bare JSON; json.loads raises otherwise
        data = json.loads(extraction.content[0].text)

        # 2. Add to knowledge graph
        for entity in data.get("entities", []):
            self.graph.add_node(entity)
        for rel in data.get("relationships", []):
            if len(rel) == 3:
                self.graph.add_edge(rel[0], rel[2], relation=rel[1])

        # 3. Also add to vector store for similarity search
        self.vector_db.add(documents=[text], ids=[doc_id])

    def query(self, question: str) -> str:
        # 1. Vector search for relevant chunks
        results = self.vector_db.query(query_texts=[question], n_results=3)

        # 2. Find graph entities mentioned in the question (naive substring match)
        entities_in_question = [
            node for node in self.graph.nodes
            if node.lower() in question.lower()
        ]

        # 3. Traverse graph for related entities (one hop, outgoing edges only)
        graph_context = []
        for entity in entities_in_question:
            for neighbor in self.graph.neighbors(entity):
                edge_data = self.graph.edges[entity, neighbor]
                graph_context.append(
                    f"{entity} --{edge_data.get('relation', 'relates_to')}--> {neighbor}"
                )

        # 4. Combine vector + graph context
        context = "\n".join(results["documents"][0] + graph_context)

        response = client.messages.create(
            model="claude-opus-4-6",
            max_tokens=512,
            messages=[{"role": "user", "content":
                f"Context:\n{context}\n\nQuestion: {question}"}]
        )
        return response.content[0].text
```
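The `add_document` step above passes the model's reply straight to `json.loads`, which fails if the reply is wrapped in a code fence or contains extra prose. One possible hardening, sketched here with a hypothetical `parse_extraction` helper that uses only the standard library, is to pull out the JSON object and discard malformed triples before they reach the graph:

```python
import json
import re


def parse_extraction(raw: str) -> dict:
    """Parse an LLM reply into {"entities": [...], "relationships": [[s, r, o], ...]}.

    Tolerates surrounding prose or a Markdown code fence, and drops
    relationships that are not three-string lists. Returns empty lists
    when no valid JSON object is found.
    """
    empty = {"entities": [], "relationships": []}
    # Take the span from the first "{" to the last "}", ignoring anything around it.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        return empty
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty

    raw_entities = data.get("entities")
    raw_rels = data.get("relationships")
    entities = (
        [e for e in raw_entities if isinstance(e, str)]
        if isinstance(raw_entities, list) else []
    )
    relationships = (
        [
            rel for rel in raw_rels
            if isinstance(rel, list)
            and len(rel) == 3
            and all(isinstance(part, str) for part in rel)
        ]
        if isinstance(raw_rels, list) else []
    )
    return {"entities": entities, "relationships": relationships}
```

With this in place, `data = parse_extraction(extraction.content[0].text)` could stand in for the direct `json.loads` call, so a badly formatted reply adds nothing to the graph instead of raising.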

### Graph RAG vs Standard RAG

| Aspect | Standard RAG | Graph RAG |
|--------|--------------|-----------|
| **Structure** | Flat chunks | Entities + relationships |
| **Multi-hop questions** | Weak (one similarity lookup) | Strong (follows edges across facts) |
| **Relationship queries** | Weak | Strong |
| **Setup complexity** | Low | High |
| **Cost** | Lower | Higher (LLM calls for entity extraction) |
| **Best for** | Direct fact lookup | Complex reasoning, analysis |
