What is a vector database in AI agents, and what is its use?

Question

Accepted Answer

## Vector Databases in AI Agents

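A **vector database** stores data as numerical vectors (embeddings) and retrieves entries by vector similarity, typically with a nearest-neighbor search. This lets an AI agent find semantically related content in a large knowledge base even when the query shares no exact keywords with it.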

### Why Vector Databases?

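LLMs have two hard limitations:
1. **Context window**: a model can't load millions of documents into a single prompt
2. **Knowledge cutoff**: a model doesn't know anything published after its training data was collected, or anything private to your organization

A vector database mitigates both by retrieving only the few passages relevant to the current question and supplying them to the model at query time: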

```
Question: "What's our refund policy?"
       ↓
Convert to vector embedding
       ↓
Search vector DB for semantically similar content
       ↓
Retrieve: "Refund within 30 days for all products..."
       ↓
Feed to LLM → Generate accurate, grounded answer
```
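
The flow above can be sketched end to end in a few lines. This is a minimal illustration, not a production pipeline: `embed` is a hypothetical bag-of-words stand-in for a real embedding model (it matches on shared words, not meaning), `retrieve` and `build_prompt` are names invented for this example, and the final LLM call is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, context):
    """Assemble the grounded prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {question}"

chunks = [
    "Refund within 30 days for all products.",
    "Shipping takes 3-5 business days.",
]
question = "What's our refund policy?"
context = retrieve(question, chunks, k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In a real agent, `embed` would call an embedding model and `retrieve` would query the vector database instead of sorting a Python list.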

### How Vector Search Works

```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

# Documents to index
documents = [
    "Refunds accepted within 30 days of purchase",
    "Shipping takes 3-5 business days",
    "Customer support: help@company.com"
]

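# Convert to unit-length embeddings so that the dot product equals cosine similarity
embeddings = model.encode(documents, normalize_embeddings=True)  # shape: (3, 384)

# Query
query = "Can I return my order?"
query_vec = model.encode([query], normalize_embeddings=True)  # shape: (1, 384)

# Cosine similarity search (brute force over every document)
similarities = np.dot(embeddings, query_vec.T).flatten()  # shape: (3,)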
best_idx = similarities.argmax()
print(f"Most relevant: {documents[best_idx]}")
# → "Refunds accepted within 30 days of purchase"
```
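
A plain dot product only ranks by cosine similarity when every vector has unit length. The small sketch below uses hand-picked 2-D vectors (purely illustrative, not real embeddings) to show how the two metrics can disagree when magnitudes differ.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Dividing by both norms removes the effect of vector length
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]
docs = {
    "aligned_short": [1.0, 0.0],  # same direction as the query, length 1
    "off_axis_long": [3.0, 4.0],  # different direction, length 5
}

by_dot = max(docs, key=lambda name: dot(docs[name], query))
by_cosine = max(docs, key=lambda name: cosine(docs[name], query))

print(by_dot)     # the longer vector wins on raw dot product (3.0 vs 1.0)
print(by_cosine)  # the aligned vector wins on cosine (1.0 vs 0.6)
```

This is why many setups either normalize embeddings before indexing or configure the index to use the cosine metric explicitly.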

### Popular Vector Databases

| Database | Type | Best For |
|---------|------|---------|
| **Pinecone** | Managed cloud | Production at scale |
| **Chroma** | Local/embedded | Development, prototypes |
| **Weaviate** | Self-hosted/cloud | Hybrid search |
| **Qdrant** | Self-hosted/cloud | High performance |
| **pgvector** | PostgreSQL extension | Existing Postgres users |
| **FAISS** | Library (Meta), not a full database | In-process similarity search at large scale |
| **Milvus** | Self-hosted/cloud | Billion-scale vectors |

### Quick Chroma Example

```python
import chromadb

client = chromadb.Client()  # in-memory client; data is lost when the process exits
collection = client.create_collection("docs")

# Add documents (Chroma embeds them with its default embedding function)
collection.add(
    documents=["Refunds: 30 days", "Shipping: 3-5 days", "Support: help@co.com"],
    ids=["d1", "d2", "d3"]
)

# Semantic search
results = collection.query(
    query_texts=["How do I return something?"],
    n_results=2
)
print(results['documents'])
# → one inner list per query, e.g. [["Refunds: 30 days", "Shipping: 3-5 days"]] (the second hit may vary)
```

### Roles in AI Agent Systems

| Agent Task | Vector DB Role |
|-----------|----------------|
| RAG Q&A | Knowledge base retrieval |
| Long-term memory | Store past conversations, retrieve relevant ones |
| Semantic routing | Route query to correct agent/tool |
| Deduplication | Find near-duplicate documents |
| Personalization | Retrieve user-specific past interactions |
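
As one illustration of the roles above, semantic routing can be sketched as a nearest-neighbor lookup over route descriptions. Everything here is an assumption made for the example: `embed` is a toy fixed-vocabulary stand-in for a real embedding model, and the route names, the `route` function, and the `min_score` cutoff of 0.3 are hypothetical.

```python
import math

# Toy vocabulary; a real system would use a learned embedding model instead
VOCAB = ["refund", "return", "order", "invoice", "weather", "forecast", "rain"]

def embed(text):
    """Count how often each vocabulary term appears in the text."""
    words = [w.strip("?.,!") for w in text.lower().split()]
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Each route is represented by the embedding of a short description
ROUTES = {
    "billing_agent": embed("refund return order invoice"),
    "weather_tool": embed("weather forecast rain"),
}

def route(query, min_score=0.3):
    """Pick the closest route, or fall back when nothing is similar enough."""
    q = embed(query)
    name, score = max(((n, cosine(q, v)) for n, v in ROUTES.items()), key=lambda p: p[1])
    return name if score >= min_score else "fallback"

print(route("Can I return my order?"))
print(route("Will it rain tomorrow?"))
print(route("Tell me a joke"))
```

The fallback branch matters in practice: without a minimum score, every query gets sent to some route even when none of them fits.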

### Key Parameters

| Parameter | Description | Typical Value |
|-----------|-------------|--------------|
| **Dimensions** | Size of embedding vectors (fixed by the embedding model) | 384, 768, 1536 |
| **Metric** | Similarity measurement | cosine, dot product, L2 |
| **top-k** | Number of results to retrieve | 3-10 |
| **Score threshold** | Minimum similarity to include | Model- and metric-dependent; 0.7+ is a common starting point for cosine |
| **Index type** | HNSW (fast, approximate) vs flat (exact, brute force) | HNSW for production; flat for small collections |
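
To make `top-k` and the score threshold concrete, here is a small sketch of a flat (exact) search that applies both. The 3-D vectors are hand-written placeholders rather than real embeddings, and `search` is a hypothetical helper, not the API of any particular database.

```python
import heapq
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query_vec, index, top_k=3, min_score=0.7):
    """Flat search: score every vector, keep the top_k, then drop weak matches."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in index.items())
    best = heapq.nlargest(top_k, scored)
    return [(doc_id, round(score, 3)) for score, doc_id in best if score >= min_score]

index = {
    "refunds": [0.9, 0.1, 0.0],
    "returns": [0.8, 0.3, 0.1],
    "shipping": [0.1, 0.9, 0.2],
    "support": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

hits = search(query, index, top_k=3, min_score=0.7)
print(hits)
```

With these values, `top_k=3` selects three candidates but the threshold removes the weakly related `shipping` entry, so only two results are returned. An HNSW index would replace the exhaustive scoring loop with an approximate graph search while keeping the same two parameters.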
