Concept #84 | Medium | extended-ai-concepts

What is a vector database in AI agents, and what is its use?

#gen-ai #vector-db

Answer

Vector Databases in AI Agents

A vector database stores data as numerical vectors (embeddings) and retrieves entries by semantic similarity rather than exact keyword match. This lets AI agents search large knowledge bases efficiently.

Why Vector Databases?

LLMs have two hard limitations:

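  1. Context window: they cannot load millions of documents into a single prompt
  2. Knowledge cutoff: they do not know about information published after training

Vector databases address both. The agent retrieves only the passages relevant to the current question, which keeps the prompt small, and the index can be updated with new documents without retraining the model: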

text
Question: "What's our refund policy?"
Convert to vector embedding
Search vector DB for semantically similar content
Retrieve: "Refund within 30 days for all products..."
Feed to LLM → Generate accurate, grounded answer
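
The flow above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not a production pipeline: `embed` is a hypothetical stand-in that counts words instead of calling a real embedding model, a plain Python list plays the role of the vector database, the second and third documents are made-up examples, and the final prompt is only built, not sent to an LLM.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# "Vector DB": each entry keeps the text next to its vector
knowledge_base = [
    "Refund within 30 days for all products",
    "Shipping takes 3-5 business days",
    "Support is available by email",
]
index = [(doc, embed(doc)) for doc in knowledge_base]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k stored documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question: str) -> str:
    """Assemble a grounded prompt from the retrieved context."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("What's our refund policy?")
print(prompt)
```

Because the toy `embed` only matches shared words, it finds the refund document here but would miss a paraphrase such as "Can I get my money back?". Closing that gap is what a real embedding model is for.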

How Vector Search Works

python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')

# Documents to index
documents = [
    "Refunds accepted within 30 days of purchase",
    "Shipping takes 3-5 business days",
    "Customer support: help@company.com"
]

# Convert to embeddings (unit length, so the dot product equals cosine similarity)
embeddings = model.encode(documents, normalize_embeddings=True)  # shape: (3, 384)

# Query
query = "Can I return my order?"
query_vec = model.encode([query], normalize_embeddings=True)  # shape: (1, 384)

# Cosine similarity search: one score per document
similarities = np.dot(embeddings, query_vec.T).flatten()
best_idx = similarities.argmax()
print(f"Most relevant: {documents[best_idx]}")
# Expected: Most relevant: Refunds accepted within 30 days of purchase
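
A dot product equals cosine similarity only when the vectors have unit length. The sketch below uses hand-picked 2-D vectors (illustrative values, not real embeddings) to show how the two measures can rank results differently on unnormalized vectors.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product divided by both vector lengths: compares direction only."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points exactly where the query points
long_but_off = [3.0, 4.0]     # larger magnitude, different direction

# Raw dot product rewards magnitude: the off-direction vector scores higher
print(dot(query, same_direction), dot(query, long_but_off))        # 2.0 3.0
# Cosine compares direction only: the aligned vector scores higher
print(cosine(query, same_direction), cosine(query, long_but_off))  # 1.0 0.6
```

This is why embeddings are usually normalized before a dot-product search, or why the index is configured with a cosine metric.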

Popular Vector Databases

| Database | Type | Best For |
| --- | --- | --- |
| Pinecone | Managed cloud | Production at scale |
| Chroma | Local/embedded | Development, prototypes |
| Weaviate | Self-hosted/cloud | Hybrid search |
| Qdrant | Self-hosted/cloud | High performance |
| pgvector | PostgreSQL extension | Existing Postgres users |
| FAISS | Library (Meta) | Large-scale local |
| Milvus | Self-hosted | Billion-scale vectors |

Quick Chroma Example

python
import chromadb

client = chromadb.Client()
collection = client.create_collection("docs")

# Add documents (Chroma handles embedding automatically)
collection.add(
    documents=["Refunds: 30 days", "Shipping: 3-5 days", "Support: help@co.com"],
    ids=["d1", "d2", "d3"]
)

# Semantic search
results = collection.query(
    query_texts=["How do I return something?"],
    n_results=2
)
print(results['documents'])
# → [['Refunds: 30 days', ...]] (best match first; the second hit depends on the embedding model)

Roles in AI Agent Systems

| Agent Task | Vector DB Role |
| --- | --- |
| RAG Q&A | Knowledge base retrieval |
| Long-term memory | Store past conversations, retrieve relevant ones |
| Semantic routing | Route query to correct agent/tool |
| Deduplication | Find near-duplicate documents |
| Personalization | Retrieve user-specific past interactions |
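
As one illustration of these roles, semantic routing can be sketched as a nearest-neighbour lookup over route descriptions. Everything named here is hypothetical: the agent names, the descriptions, the `min_score` cutoff, and the word-count `embed` function, which stands in for a real embedding model.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model: word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical agents, each described by vocabulary it should handle
routes = {
    "billing_agent": "refund invoice payment charge billing",
    "shipping_agent": "delivery shipping tracking package order status",
}
route_vectors = {name: embed(desc) for name, desc in routes.items()}

def route(query, min_score=0.1):
    """Pick the route most similar to the query, or fall back if nothing is close."""
    q = embed(query)
    name, score = max(
        ((n, cosine(q, v)) for n, v in route_vectors.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= min_score else "fallback_agent"

print(route("I need a refund for a double charge"))  # billing_agent
print(route("Where is my package?"))                  # shipping_agent
print(route("Tell me a joke"))                        # fallback_agent
```

The fallback branch matters in practice: without a minimum score, every query is forced onto some route even when none is a reasonable match.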

Key Parameters

| Parameter | Description | Typical Value |
| --- | --- | --- |
| Dimensions | Size of embedding vectors | 384, 768, 1536 |
| Metric | Similarity measurement | cosine, dot product, L2 |
| top-k | Number of results to retrieve | 3-10 |
| Score threshold | Minimum similarity to include | 0.7+ for cosine similarity (tune per embedding model) |
| Index type | HNSW (fast, approximate) vs flat (exact) | HNSW for production |
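
The interaction between top-k and the score threshold can be seen in a small exact ("flat") search. This is a sketch with hand-picked 3-dimensional vectors and hypothetical document ids. Real embeddings have hundreds of dimensions, and an HNSW index would approximate the same ranking without scoring every vector.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query, vectors, top_k=3, score_threshold=0.7):
    """Flat search: score every vector, keep the best top_k that clear the threshold."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(doc_id, round(s, 3)) for doc_id, s in scored[:top_k] if s >= score_threshold]

# Illustrative vectors, not real embeddings
vectors = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.8, 0.6, 0.0],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.6, 0.0, 0.8],
}
query = [1.0, 0.0, 0.0]

print(search(query, vectors, top_k=3, score_threshold=0.7))
# [('d1', 1.0), ('d2', 0.8)]
```

Here top-k alone would return three results, but the threshold drops `d4` (score 0.6), so fewer than k results can come back when few documents are genuinely similar.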