Explain embeddings. How would you choose an embedding model?
#gen-ai #embeddings #vector-db
Answer
Embeddings & Choosing an Embedding Model
An embedding is a dense numerical vector that represents the semantic meaning of text. Texts with similar meanings map to nearby vectors, which is what makes semantic search, clustering, and RAG possible.
How Embeddings Work
```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "How do I reset my password?",
    "I forgot my login credentials",
    "The weather is sunny today",
]

embeddings = model.encode(texts)
print(embeddings.shape)  # (3, 384): three 384-dimensional vectors

# Similar meaning -> similar vectors
sim_matrix = cosine_similarity(embeddings)
print(sim_matrix[0, 1])  # ~0.85: "password reset" and "login credentials" are similar
print(sim_matrix[0, 2])  # ~0.12: "password" and "weather" are unrelated
```
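For intuition, the similarity score used above can be computed by hand. The sketch below is a minimal illustration: the three-dimensional vectors are invented toy values (real embeddings have hundreds of dimensions), and `cosine` and `rank` are hypothetical helper names, not part of any library.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query, docs):
    """Return document names ordered from most to least similar to the query."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)

# Toy vectors, invented for illustration only
password_reset = [0.9, 0.1, 0.0]
docs = {
    "weather": [0.0, 0.1, 0.9],
    "login_help": [0.8, 0.2, 0.1],
}

ranking = rank(password_reset, docs)
print(ranking)  # ['login_help', 'weather']
```

Semantic search is essentially this ranking step applied to a large corpus, usually with a vector index instead of a full sort.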
Popular Embedding Models
| Model | Dimensions | Speed | Quality | Cost | Best For |
|---|---|---|---|---|---|
| `text-embedding-3-small` | 1536 | Fast | Good | Low ($0.02/1M tokens) | General RAG |
| `text-embedding-3-large` | 3072 | Fast | Best | Medium ($0.13/1M tokens) | High accuracy |
| `text-embedding-ada-002` | 1536 | Fast | Good | Low | Legacy OpenAI |
| `all-MiniLM-L6-v2` | 384 | Very fast | Moderate | Free (local) | Low-latency apps |
| `BAAI/bge-large-en-v1.5` | 1024 | Moderate | Excellent | Free (local) | Best open-source |
| text | 1024 | Moderate | Excellent | Free (local) | Multilingual |
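The Dimensions column also drives storage cost. As a rough sketch, assuming vectors are stored as uncompressed float32 values (4 bytes each) and ignoring index overhead, the raw size of a vector store can be estimated as follows; `index_size_mb` is a hypothetical helper name.

```python
def index_size_mb(num_vectors, dimensions, bytes_per_value=4):
    """Raw size in MB of a flat float32 vector store, ignoring index overhead."""
    return num_vectors * dimensions * bytes_per_value / 1_000_000

# Dimensions taken from the table above
for dims in (384, 1024, 1536, 3072):
    size = index_size_mb(1_000_000, dims)
    print(f"{dims:>4} dims: {size:,.0f} MB per million vectors")
```

Under these assumptions, one million 384-dimensional vectors take about 1.5 GB, while 3072-dimensional vectors take about 12.3 GB, so dimension count is worth weighing alongside quality.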
How to Choose an Embedding Model
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# 1. Evaluate on your specific domain data
queries = {"q1": "What is the refund policy?"}            # query id -> query text
corpus = {"d1": "Refunds are accepted within 30 days."}   # doc id -> doc text
relevant_docs = {"q1": {"d1"}}                            # query id -> relevant doc ids

for model_name in ["all-MiniLM-L6-v2", "BAAI/bge-large-en-v1.5"]:
    model = SentenceTransformer(model_name)
    evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs)
    # The evaluator returns a dict of metrics in sentence-transformers v3 and later
    results = evaluator(model)
    print(f"{model_name}: NDCG@10 = {results['cosine_ndcg@10']:.4f}")
```
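To make the NDCG@10 figure easier to interpret, here is a simplified sketch of the metric with binary relevance (a document is either relevant or not). It is an illustration of the idea, not the library's implementation; `ndcg_at_k` and the document ids `d3` and `d7` are hypothetical.

```python
import math

def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary relevance: hits near the top of the ranking count more."""
    dcg = sum(
        1.0 / math.log2(position + 2)  # position 0 gets a discount of log2(2) = 1
        for position, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    # Ideal ranking: every relevant document placed at the top
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

best = ndcg_at_k(["d1", "d7", "d3"], {"d1"})   # relevant doc ranked first
worse = ndcg_at_k(["d7", "d3", "d1"], {"d1"})  # relevant doc ranked third
print(best, worse)  # 1.0 0.5
```

A model that ranks the relevant passage lower gets a lower score, which is why a realistic set of queries and labeled documents matters more than a single toy pair.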
Choosing Based on Requirements
| Requirement | Recommended Model |
|---|---|
| Lowest cost, OpenAI API | `text-embedding-3-small` |
| Highest quality, OpenAI API | `text-embedding-3-large` |
| Free, best quality | `BAAI/bge-large-en-v1.5` |
| Fastest local inference | `all-MiniLM-L6-v2` |
| Multilingual documents | text |
| Code search | text |
| Long documents (>512 tokens) | text |
Matryoshka Embeddings (OpenAI text-embedding-3)
OpenAI's `text-embedding-3` models accept a `dimensions` parameter, so the API can return a shortened vector directly:

```python
from openai import OpenAI

client = OpenAI()

# Full 1536 dimensions
full_emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="What is RAG?",
).data[0].embedding

# Reduce to 512 dimensions (faster search, lower storage)
small_emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="What is RAG?",
    dimensions=512,
).data[0].embedding
```
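Matryoshka-style models concentrate the most useful information in the leading dimensions, so an already-stored vector can also be shortened locally by keeping a prefix and re-normalizing it. The sketch below shows that idea on a toy vector; `truncate_and_normalize` is a hypothetical helper, and whether its output matches what the API returns for the same `dimensions` value is an assumption worth checking on your own data.

```python
import math

def truncate_and_normalize(embedding, dimensions):
    """Keep the first `dimensions` values and rescale the result to unit length."""
    shortened = list(embedding[:dimensions])
    norm = math.sqrt(sum(x * x for x in shortened))
    if norm == 0:
        return shortened  # nothing to rescale for an all-zero prefix
    return [x / norm for x in shortened]

# Toy unit-length vector, invented for illustration
full = [0.6, 0.0, 0.8, 0.0]
short = truncate_and_normalize(full, 2)
print(short)
```

Re-normalizing matters because cosine and dot-product search assume comparable vector lengths; a raw prefix of a unit vector is generally shorter than unit length.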
Decision framework: Start with `text-embedding-3-small` for OpenAI-based apps (best cost-quality ratio). If you need local/private embeddings, use `BAAI/bge-large-en-v1.5`. Always evaluate on a sample of your domain data; benchmark results don't always reflect real-world performance.