Concept #146 | Medium | extended-ai-concepts

What is Semantic Search?

#gen-ai #semantic-search #embeddings #vector-search

Answer

Semantic search finds results based on meaning and intent, not just exact keyword matches. It uses vector embeddings to represent text as high-dimensional vectors, then measures similarity between vectors to surface conceptually related content — even when different words are used.
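
As a rough illustration of "measuring similarity between vectors", the sketch below computes cosine similarity with NumPy. The three vectors are hand-made stand-ins chosen for illustration, not the output of any embedding model, and `cosine_similarity` is a helper name introduced here.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: values near 1.0 mean same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made 3-dimensional "embeddings" (illustrative values, not model output)
car = np.array([0.9, 0.1, 0.0])
automobile = np.array([0.8, 0.2, 0.1])
banana = np.array([0.0, 0.1, 0.9])

sim_synonym = cosine_similarity(car, automobile)
sim_unrelated = cosine_similarity(car, banana)
print(f"car vs automobile: {sim_synonym:.2f}")
print(f"car vs banana:     {sim_unrelated:.2f}")
```

With these toy values the first score lands close to 1 and the second close to 0, which is the pattern a good embedding model aims to produce for synonyms versus unrelated terms.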

How It Works

  1. Embed the query and all documents using an embedding model (e.g., text-embedding-3-small)
  2. Index document vectors in a vector store (FAISS, Pinecone, Chroma)
  3. Search by computing cosine similarity or dot product between query and stored vectors
  4. Return the top-K most similar documents
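
The steps above can be sketched without any model or vector store, using brute-force scoring in NumPy. This is a minimal sketch under stated assumptions: the vectors are made-up stand-ins for real embeddings, and `normalize` and `top_k` are hypothetical helper names. A real vector store would typically replace the full scan with an index.

```python
import numpy as np

def normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / norms

def top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int) -> list[tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar documents."""
    scores = normalize(doc_vecs) @ normalize(query_vec[None, :])[0]
    best = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in best]

# Steps 1-2: stand-in "embeddings" (made-up values, one row per document)
doc_vecs = np.array([
    [0.9, 0.1, 0.0],  # doc 0
    [0.1, 0.9, 0.1],  # doc 1
    [0.7, 0.3, 0.1],  # doc 2
])
query_vec = np.array([1.0, 0.2, 0.0])

# Steps 3-4: score every document and keep the top 2
results = top_k(query_vec, doc_vecs, k=2)
print(results)
```

Normalizing first is a design choice: once all vectors have unit length, cosine similarity and dot product give the same ranking.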

Semantic Search vs Keyword Search

| Feature | Keyword Search (BM25/TF-IDF) | Semantic Search |
|---|---|---|
| Matching | Exact word match | Meaning / intent |
| "Car" vs "Automobile" | No match | Match |
| Handles synonyms | No | Yes |
| Speed | Faster | Slightly slower |
| Relevance | Lexical | Conceptual |
| Infrastructure | Elasticsearch, Solr | Vector DB (Pinecone, FAISS) |
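
To make the "Car" vs "Automobile" row concrete, here is a toy lexical matcher. It is a deliberately simplified stand-in for keyword search (plain token overlap, not BM25 or TF-IDF scoring), and `keyword_match` is a hypothetical helper name.

```python
def keyword_match(query: str, document: str) -> bool:
    """True if the query and document share at least one lowercase token."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    return bool(query_terms & doc_terms)

documents = ["automobile maintenance guide", "car maintenance guide"]
query = "car"

# Purely lexical matching misses the synonym in the first document
matches = [keyword_match(query, doc) for doc in documents]
for doc, matched in zip(documents, matches):
    print(f"{doc!r}: {'match' if matched else 'no match'}")
```

A semantic approach would be expected to retrieve both documents, because "car" and "automobile" sit close together in embedding space.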

Code Example

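```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

# Document corpus
docs = [
    "How to train a neural network",
    "Best practices for LLM fine-tuning",
    "Introduction to gradient descent",
]

# Embed documents as unit-length float32 vectors,
# so that inner product equals cosine similarity
doc_embeddings = model.encode(docs, normalize_embeddings=True)
doc_embeddings = np.asarray(doc_embeddings, dtype="float32")

# Exact (brute-force) inner-product index
index = faiss.IndexFlatIP(doc_embeddings.shape[1])
index.add(doc_embeddings)

# Query by meaning, not keywords
query = "How do I teach a model new skills?"
query_vec = np.asarray(model.encode([query], normalize_embeddings=True), dtype="float32")

# Retrieve the top-2 documents; a higher score means more similar
scores, indices = index.search(query_vec, 2)
for score, i in zip(scores[0], indices[0]):
    print(f"{score:.3f}  {docs[i]}")
# The query shares no content words with the corpus. The training and fine-tuning
# documents are the likely top hits, but exact order and scores depend on the model.
```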

Use Cases

  • RAG pipelines — retrieve relevant documents for LLM context grounding
  • Enterprise search — search internal knowledge bases by meaning
  • Recommendation systems — find similar products, articles, or users
  • Duplicate detection — identify near-duplicate content semantically

Hybrid Search (Best of Both Worlds)

Combine semantic + keyword search using Reciprocal Rank Fusion (RRF):

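```python
# LangChain hybrid retriever example (import paths vary across LangChain versions)
from langchain.retrievers import EnsembleRetriever

# Assumes bm25_retriever (keyword) and faiss_retriever (semantic)
# were built beforehand over the same set of documents.
retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever],
    weights=[0.4, 0.6],  # relative weights in rank fusion: 0.4 keyword, 0.6 semantic
)
```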

Best Practice: Consider hybrid search in production RAG systems. Semantic retrieval handles meaning, while keyword retrieval handles exact matches (product IDs, names, codes) that embeddings can miss.
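
Reciprocal Rank Fusion itself is small enough to sketch in plain Python. Each document earns `weight / (k + rank)` from every ranked list it appears in, and the totals decide the fused order. This is a minimal sketch, not any particular library's implementation: the function name, the document IDs, and the choice of `k = 60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse ranked lists of document IDs into one list, best first."""
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked documents (small rank) contribute more
            scores[doc_id] += weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword retriever and a semantic retriever
keyword_ranking = ["doc_sku", "doc_a", "doc_b"]
semantic_ranking = ["doc_a", "doc_c", "doc_sku"]

fused = reciprocal_rank_fusion(
    [keyword_ranking, semantic_ranking],
    weights=[0.4, 0.6],  # same 0.4 keyword / 0.6 semantic split as above
)
print(fused)
```

Because RRF works on ranks rather than raw scores, it can combine BM25 scores and cosine similarities without first putting them on a common scale.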

Popular tools: FAISS, Pinecone, Weaviate, Chroma, Qdrant