What is a vector database in AI agents, and what is its use?
#gen-ai #vector-db
Answer
Vector Databases in AI Agents
A vector database stores data as numerical vectors (embeddings) and retrieves entries by semantic similarity rather than exact keyword match, which lets AI agents search large knowledge bases efficiently.
Why Vector Databases?
LLMs have two hard limitations:
- Context window — can't load millions of documents at once
- Knowledge cutoff — don't know recent information
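Vector databases address both: the agent retrieves only the most relevant passages at query time, so the prompt stays small and can include documents newer than the model's training data. The typical flow: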
```text
Question: "What's our refund policy?"
    ↓
Convert to vector embedding
    ↓
Search vector DB for semantically similar content
    ↓
Retrieve: "Refund within 30 days for all products..."
    ↓
Feed to LLM → Generate accurate, grounded answer
```
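The flow above can be sketched end to end in a few lines. This is a minimal illustration, not a production pipeline: `toy_embed` is a hypothetical stand-in for a real embedding model (it only counts words, so it matches shared vocabulary rather than meaning), and `retrieve` and `build_prompt` are names invented for this example. The final LLM call is left out; the sketch stops at the grounded prompt.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Rank documents by similarity to the question and return the top k."""
    q = toy_embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Place the retrieved passages in the prompt so the LLM answers from them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


documents = [
    "Refund within 30 days for all products",
    "Shipping takes 3-5 business days",
    "Customer support is available by email",
]
question = "What is the refund policy for products?"
passages = retrieve(question, documents, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

In a real agent, `toy_embed` would be replaced by an embedding model and `retrieve` by a vector database query; the prompt-building step stays essentially the same.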
How Vector Search Works
```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

# Documents to index
documents = [
    "Refunds accepted within 30 days of purchase",
    "Shipping takes 3-5 business days",
    "Customer support: help@company.com"
]

# Convert to unit-length embeddings so the dot product below equals cosine similarity
embeddings = model.encode(documents, normalize_embeddings=True)  # shape: (3, 384)

# Query
query = "Can I return my order?"
query_vec = model.encode([query], normalize_embeddings=True)  # shape: (1, 384)

# Cosine similarity search (dot product of normalized vectors)
similarities = np.dot(embeddings, query_vec.T).flatten()
best_idx = similarities.argmax()
print(f"Most relevant: {documents[best_idx]}")
# → "Refunds accepted within 30 days of purchase"
```
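One detail worth checking in any similarity search is whether the vectors are normalized: a raw dot product only equals cosine similarity when both vectors have unit length. The small worked example below uses hand-picked 2-D vectors (made up for illustration, not real embeddings) to show how the two measures can rank the same documents differently.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a):
    n = norm(a)
    return [x / n for x in a]


query = [1.0, 0.0]
doc_aligned = [2.0, 0.0]  # same direction as the query, length 2
doc_long = [3.0, 4.0]     # different direction, length 5

# Raw dot product rewards vector length, so the longer vector wins
raw = [dot(query, doc_aligned), dot(query, doc_long)]

# Cosine similarity only looks at direction, so the aligned vector wins
cos = [cosine(query, doc_aligned), cosine(query, doc_long)]

# After normalizing to unit length, the dot product matches cosine similarity
unit = [dot(normalize(query), normalize(d)) for d in (doc_aligned, doc_long)]

print("raw dot:", raw)
print("cosine: ", cos)
print("unit dot:", unit)
```

This is why many pipelines normalize embeddings once at indexing time and then use the cheaper dot product at query time.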
Popular Vector Databases
| Database | Type | Best For |
|---|---|---|
| Pinecone | Managed cloud | Production at scale |
| Chroma | Local/embedded | Development, prototypes |
| Weaviate | Self-hosted/cloud | Hybrid search |
| Qdrant | Self-hosted/cloud | High performance |
| pgvector | PostgreSQL extension | Existing Postgres users |
| FAISS | Library (Meta) | Large-scale local |
| Milvus | Self-hosted | Billion-scale vectors |
Quick Chroma Example
```python
import chromadb

client = chromadb.Client()  # in-memory client
collection = client.create_collection("docs")

# Add documents (Chroma embeds them with its default embedding function)
collection.add(
    documents=["Refunds: 30 days", "Shipping: 3-5 days", "Support: help@co.com"],
    ids=["d1", "d2", "d3"]
)

# Semantic search
results = collection.query(
    query_texts=["How do I return something?"],
    n_results=2
)
print(results['documents'])
# e.g. [["Refunds: 30 days", "Shipping: 3-5 days"]] (the second hit depends on the embedding model)
```
Roles in AI Agent Systems
| Agent Task | Vector DB Role |
|---|---|
| RAG Q&A | Knowledge base retrieval |
| Long-term memory | Store past conversations, retrieve relevant ones |
| Semantic routing | Route query to correct agent/tool |
| Deduplication | Find near-duplicate documents |
| Personalization | Retrieve user-specific past interactions |
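As one concrete case from the table, semantic routing can be sketched as "embed the query, compare it with a short description of each agent, and pick the closest one". Everything named here is hypothetical: `toy_embed` is a word-count stand-in for a real embedding model, and the route names, descriptions, `min_score`, and `general_agent` fallback are invented for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical routes: agent name -> description of what it handles
ROUTES = {
    "billing_agent": "refund payment invoice charge billing",
    "shipping_agent": "shipping delivery tracking package order status",
    "support_agent": "account login password technical support",
}


def route(query, routes=ROUTES, min_score=0.2, fallback="general_agent"):
    """Return the best-matching route, or the fallback if nothing is similar enough."""
    q = toy_embed(query)
    name, score = max(
        ((n, cosine(q, toy_embed(desc))) for n, desc in routes.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= min_score else fallback


print(route("I need a refund for a double charge"))
print(route("Where is my package delivery"))
print(route("Tell me a joke"))
```

With a vector database, the route descriptions would be stored as embeddings in a small collection and `route` would become a top-1 query with a score threshold.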
Key Parameters
| Parameter | Description | Typical Value |
|---|---|---|
| Dimensions | Size of embedding vectors | 384, 768, 1536 |
| Metric | Similarity measurement | cosine, dot product, L2 |
| top-k | Number of results to retrieve | 3-10 |
| Score threshold | Minimum similarity to include | 0.7+ (cosine; tune per embedding model) |
| Index type | HNSW (fast, approx) vs flat (exact) | HNSW for production |
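To see how top-k and the score threshold interact, here is a minimal sketch of a flat (exact) search: it scores every stored vector, keeps the `top_k` best, and then drops anything below `score_threshold`. The function name `flat_search` and the 3-D vectors are made up for illustration; an HNSW index would return approximately the same neighbors without scoring every vector.

```python
import heapq
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flat_search(query_vec, index, top_k=3, score_threshold=0.7):
    """Exact search: score every vector, keep the top_k best, then apply the threshold."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in index.items())
    best = heapq.nlargest(top_k, scored)  # sorted from highest to lowest score
    return [(doc_id, score) for score, doc_id in best if score >= score_threshold]


# Toy index: document id -> 3-dimensional vector (real embeddings have hundreds of dimensions)
index = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.8, 0.6, 0.0],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.0, 0.0, 1.0],
}
query_vec = [0.9, 0.1, 0.0]

hits = flat_search(query_vec, index, top_k=3, score_threshold=0.7)
for doc_id, score in hits:
    print(doc_id, round(score, 3))
```

Here `top_k=3` admits three candidates, but the threshold removes the weakest one, so fewer than k results can come back. That is usually the desired behavior for RAG, where an irrelevant passage in the prompt can hurt more than a missing one.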