
Vector Databases (Pinecone, Weaviate, ChromaDB, Qdrant)

Where AI Stores Its Memory - The Database Built for Embeddings

Learn how vector databases store, index, and search through millions of embeddings in milliseconds. Understand Pinecone, Weaviate, ChromaDB, and Qdrant - the infrastructure backbone of every RAG system and semantic search engine.

What is a Vector Database?

A Database Designed for AI - Not Your Grandfather's SQL

The Spotify Analogy

When Spotify recommends songs, it does not search by song title or artist name (that would be keyword search). Instead, every song is represented as a vector - a list of numbers capturing its vibe, tempo, mood, genre. When you play a sad Hindi song, Spotify finds other songs whose vectors are close to it in this numerical space. A vector database is the engine that makes this "find similar things" search blazing fast, even across millions of items.

Why Not Just Use PostgreSQL?

Traditional databases are built for exact matches: "Find user where email = tanuj@gmail.com." Vector databases are built for nearest neighbor search: "Find the 10 documents most similar to this query embedding." This is fundamentally different:

Regular DB: Exact match on structured data using B-tree indexes
Vector DB: Approximate similarity search on high-dimensional vectors using ANN indexes (HNSW, IVF)
Scale: Comparing a 768-dim vector against 10 million vectors using brute force takes seconds. With vector DB indexes, it takes milliseconds.

Core Operations of a Vector DB

Insert/Upsert: Store vectors with metadata (document text, source URL, timestamps)
Search (KNN/ANN): Given a query vector, find the K nearest neighbors
Filter: Combine vector search with metadata filters ("similar documents from last 30 days only")
Delete: Remove vectors by ID or metadata filter
Update: Modify vectors or their metadata
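
To make these five operations concrete, here is a minimal sketch of a toy in-memory store. The class name TinyVectorStore and its method names are made up for illustration and do not mirror any real product's API; it also uses exact (brute-force) search rather than an ANN index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical toy store: exact search, no ANN index, no persistence."""

    def __init__(self):
        self._rows = {}  # id -> (vector, metadata)

    def upsert(self, id_, vector, metadata):
        # Insert a new row, or overwrite the row that already has this ID.
        self._rows[id_] = (vector, metadata)

    def update_metadata(self, id_, **changes):
        vector, metadata = self._rows[id_]
        self._rows[id_] = (vector, {**metadata, **changes})

    def delete(self, id_):
        self._rows.pop(id_, None)

    def search(self, query, k=3, where=None):
        # Apply the metadata filter first, then rank survivors by similarity.
        scored = [
            (cosine(query, vec), id_)
            for id_, (vec, meta) in self._rows.items()
            if where is None or where(meta)
        ]
        scored.sort(reverse=True)
        return [id_ for _, id_ in scored[:k]]

store = TinyVectorStore()
store.upsert("doc1", [1.0, 0.0], {"source": "faq", "year": 2024})
store.upsert("doc2", [0.9, 0.1], {"source": "blog", "year": 2021})
store.upsert("doc3", [0.0, 1.0], {"source": "faq", "year": 2024})

def is_recent(meta):
    return meta["year"] >= 2023

top_all = store.search([1.0, 0.0], k=2)                      # search
top_recent = store.search([1.0, 0.0], k=2, where=is_recent)  # search + filter
store.update_metadata("doc2", year=2025)                     # update
store.delete("doc3")                                         # delete
top_after = store.search([1.0, 0.0], k=2, where=is_recent)
print(top_all, top_recent, top_after)
```

A real vector database exposes the same ideas behind a network API and replaces the linear scan in search with an index.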

Note: Vector databases are to AI what SQL databases are to web apps - the essential storage layer. Most RAG systems, semantic search engines, and recommendation systems rely on one, or on a vector index inside an existing database.

How Vector Search Actually Works - ANN Indexes

The Algorithms Behind Millisecond Search

The Problem: Brute Force is Too Slow

If you have 10 million vectors with 768 dimensions each, comparing your query against every single vector means 10 million distance calculations over 768-dimensional vectors. That is roughly 7.68 billion multiply-add operations per query (10,000,000 x 768). That is far too slow and expensive for real-time search under load. Vector databases solve this using Approximate Nearest Neighbor (ANN) algorithms, which trade a small amount of accuracy for a large speedup.
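
For reference, brute-force (exact) search can be sketched in a few lines. This is an illustrative version on a small random dataset, not how a vector database implements it; the function names are chosen for this example only.

```python
import heapq
import math
import random

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector, keep the k best as (index, score)."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return [(i, s) for s, i in heapq.nlargest(k, scored)]

random.seed(0)
dim, n = 16, 2000  # tiny compared with 768 dims x 10M vectors
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
query = vectors[42]

top = brute_force_top_k(query, vectors, k=5)
print(top[0][0])  # 42: a vector is most similar to itself

# Cost of one query at the scale discussed in the text.
multiply_adds = 10_000_000 * 768
print(f"{multiply_adds:,} multiply-adds per query")
```

The work grows linearly with the number of stored vectors, which is exactly what ANN indexes are designed to avoid.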

HNSW (Hierarchical Navigable Small World) - Most Popular

Think of it like a multi-level city map. Level 3 has major highways (few nodes, long-distance connections). Level 2 has state roads. Level 1 has city streets. Level 0 has every single address. To find a destination, start at the top level (highway), jump to the right region, then descend through levels until you reach the exact neighborhood. Instead of checking all 10M vectors, HNSW typically computes distances to only a few hundred of them, with the exact number depending on the index parameters.

Speed: Typically a millisecond or less per query for millions of vectors held in RAM
Accuracy: Typically 95-99% recall with sensible tuning (finds 95-99 of the true top-100 nearest neighbors)
Tradeoff: Uses more memory (stores graph structure in RAM)
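
The "descend through the levels" idea can be sketched as below. This is a simplified toy, not the real HNSW algorithm: the graph is built by brute force instead of incremental insertion, and the names build_layers and hnsw_like_search, plus the defaults m=8 and ef=32, are assumptions made for this example.

```python
import heapq
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_layers(points, m=8, level_prob=0.25, seed=0):
    """Toy build: on every layer, link each node to its m nearest peers."""
    rng = random.Random(seed)
    top_level = []
    for _ in points:
        level = 0
        while rng.random() < level_prob:  # higher layers get rarer
            level += 1
        top_level.append(level)
    layers = []
    for lvl in range(max(top_level) + 1):
        members = [i for i, top in enumerate(top_level) if top >= lvl]
        layers.append({
            i: sorted((j for j in members if j != i),
                      key=lambda j: dist(points[i], points[j]))[:m]
            for i in members
        })
    return layers  # layers[0] holds every point, layers[-1] only a few

def hnsw_like_search(points, layers, query, k=5, ef=32):
    entry = next(iter(layers[-1]))
    # Upper layers: greedily hop to any neighbor that is closer to the query.
    for graph in reversed(layers[1:]):
        improved = True
        while improved:
            improved = False
            for nb in graph[entry]:
                if dist(points[nb], query) < dist(points[entry], query):
                    entry, improved = nb, True
    # Layer 0: best-first search that keeps the ef closest nodes seen so far.
    d0 = dist(points[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap of nodes still to expand
    best = [(-d0, entry)]       # max-heap (negated distances) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # nothing left that could improve the result set
        for nb in layers[0][node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(points[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    nearest = sorted((-neg_d, i) for neg_d, i in best)[:k]
    return [i for _, i in nearest], len(visited)

rng = random.Random(1)
points = [[rng.random() for _ in range(8)] for _ in range(500)]
layers = build_layers(points)
query = [rng.random() for _ in range(8)]

ids, checked = hnsw_like_search(points, layers, query)
exact = sorted(range(len(points)), key=lambda i: dist(points[i], query))[:5]
print(f"checked {checked} of {len(points)} vectors")
print(f"overlap with exact top-5: {len(set(ids) & set(exact))}/5")
```

Raising ef makes the search look at more candidates, which usually improves recall at the cost of latency; that is the speed-accuracy dial the bullets above describe.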

IVF (Inverted File Index) - The Clustering Approach

Imagine organizing a library by topic shelves. First, cluster all books into 100 topic shelves. When someone asks for a book about cooking, go to the "food" shelf and search only there instead of the entire library. IVF clusters vectors into groups (Voronoi cells), then only searches the nearest clusters.

Speed: Fast, especially with quantization
Accuracy: Depends on number of clusters probed
Tradeoff: Less memory than HNSW, but needs retraining when data distribution changes
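
The library-shelf idea can be sketched as follows, assuming plain k-means for the clustering step. The function names and the nprobe parameter name are chosen for this example; real IVF implementations add optimizations (and often quantization) that are left out here.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_centroids(vectors, n_clusters, iters=10, seed=0):
    """Plain k-means (Lloyd's algorithm) to place the cluster centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_clusters)
    for _ in range(iters):
        buckets = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(n_clusters), key=lambda c: dist(v, centroids[c]))
            buckets[nearest].append(v)
        # New centroid = mean of its bucket; keep the old one if the bucket is empty.
        centroids = [
            [sum(col) / len(b) for col in zip(*b)] if b else centroids[c]
            for c, b in enumerate(buckets)
        ]
    return centroids

def build_ivf(vectors, centroids):
    """Inverted lists: cluster id -> ids of the vectors assigned to it."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(lists, key=lambda c: dist(v, centroids[c]))
        lists[nearest].append(i)
    return lists

def ivf_search(query, vectors, centroids, lists, k=5, nprobe=2):
    # Pick the nprobe clusters whose centroids are closest to the query...
    probe = sorted(lists, key=lambda c: dist(query, centroids[c]))[:nprobe]
    # ...and scan only the vectors stored in those clusters.
    candidates = [i for c in probe for i in lists[c]]
    candidates.sort(key=lambda i: dist(query, vectors[i]))
    return candidates[:k], len(candidates)

rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(300)]
centroids = train_centroids(vectors, n_clusters=10)
lists = build_ivf(vectors, centroids)
query = [rng.gauss(0, 1) for _ in range(4)]

ids, scanned = ivf_search(query, vectors, centroids, lists, k=5, nprobe=2)
print(f"scanned {scanned} of {len(vectors)} vectors")
```

Probing more clusters scans more vectors and raises recall; probing all of them degenerates into brute-force search.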

Product Quantization (PQ) - Compression

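Shrinks vectors to use much less memory. A 768-dim float32 vector uses 3 KB (768 x 4 bytes). With PQ, it can be compressed to 96 bytes - 32x smaller - for example by splitting it into 96 sub-vectors of 8 dimensions each and storing a 1-byte codebook ID for each sub-vector. Some accuracy is lost, but for large datasets the memory savings are essential.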
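
The arithmetic behind those numbers can be checked with a short sketch. It assumes one common PQ configuration (96 sub-vectors, 8-bit codes); the helper names are made up for this example, and the 10 million vector count is borrowed from the earlier brute-force discussion.

```python
def raw_bytes(dim, bytes_per_value=4):
    """Size of one uncompressed vector (float32 = 4 bytes per dimension)."""
    return dim * bytes_per_value

def pq_bytes(dim, n_subvectors, bits_per_code=8):
    """Size of one PQ-encoded vector: one code per sub-vector."""
    if dim % n_subvectors:
        raise ValueError("dim must be divisible by n_subvectors")
    return n_subvectors * bits_per_code // 8

dim, n_vectors = 768, 10_000_000
raw = raw_bytes(dim)
compressed = pq_bytes(dim, n_subvectors=96)
print(raw, compressed, raw // compressed)  # 3072 96 32

# Totals for 10 million vectors, in GB (10^9 bytes).
print(n_vectors * raw / 1e9, n_vectors * compressed / 1e9)  # 30.72 0.96

# The codebooks are a fixed overhead: 96 subspaces x 256 centroids x 8 dims x 4 bytes.
codebook_bytes = 96 * 256 * (dim // 96) * 4
print(codebook_bytes)  # 786432
```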

Note: HNSW is the most widely used algorithm in production vector databases. It offers the best speed-accuracy tradeoff for most use cases at the cost of higher memory usage.

The Big Four - Pinecone, Weaviate, ChromaDB, Qdrant

Choosing Your Vector Database

Pinecone - The Managed King

Type: Fully managed cloud service (no self-hosting option)
Best For: Teams that want zero infrastructure headache
Strengths: Dead simple API, automatic scaling, serverless tier, great docs
Weaknesses: Vendor lock-in, no self-hosting, can get expensive at scale
Think of it as: Zomato cloud kitchen - you just place orders, they handle everything

Weaviate - The Feature-Rich Hybrid

Type: Open-source + managed cloud option
Best For: Teams needing hybrid search (vector + keyword) and built-in ML modules
Strengths: GraphQL API, built-in vectorizers, hybrid search, multi-tenancy
Weaknesses: Heavier resource footprint, steeper learning curve
Think of it as: Full-service restaurant - does everything, but more complex to manage

ChromaDB - The Developer Favorite

Type: Open-source, embeddable (runs in your process)
Best For: Prototyping, small-medium datasets, local development
Strengths: Simplest API, in-memory mode, Python-native, great for notebooks
Weaknesses: Not designed for massive scale, limited production features
Think of it as: Street food stall - fast, simple, perfect for quick meals but not a banquet

Qdrant - The Performance Champion

Type: Open-source (Rust-based) + managed cloud
Best For: Teams needing high performance and advanced filtering
Strengths: Written in Rust (fast), excellent filtering, payload indexes, quantization built-in
Weaknesses: Smaller community than Pinecone/Weaviate
Think of it as: German-engineered sports car - pure performance focus

Note: ChromaDB for prototyping, Pinecone for managed simplicity, Qdrant for performance, Weaviate for feature-richness. There is no single best - it depends on your specific needs.

Vector DB in a RAG Pipeline

How Vector Databases Power Real AI Applications

The RAG Pipeline Flow

INGESTION PHASE (once):
[Documents] --> [Chunking] --> [Embedding Model] --> [Vector DB]
  1000 PDFs    500-word       all-mpnet-base-v2   Qdrant
               chunks          768-dim vectors     with metadata

QUERY PHASE (every request):
[User Query] --> [Embed Query] --> [Vector DB Search] --> [Top 5 chunks]
 "What is         768-dim          cosine similarity    Most relevant
  GST rate?"      vector           + metadata filter     document pieces

[Top 5 chunks] --> [LLM Prompt] --> [Generated Answer]
   context          "Based on        "The GST rate
   documents         context..."      for laptops is 18%"
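
The two phases above can be sketched end to end in plain Python. Everything here is a stand-in: embed uses normalized word counts instead of a real embedding model (so it only matches shared words, not meaning), the index is a list instead of a vector DB, and the final LLM call is left out. All function names are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: normalized word counts (sparse)."""
    counts = Counter(w.strip("?.,!").lower() for w in text.split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}

def similarity(a, b):
    """Dot product of two normalized sparse vectors, i.e. cosine similarity."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())

def chunk(text, max_words=8):
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def ingest(documents):
    """Ingestion phase: chunk, embed, store each chunk with its metadata."""
    index = []
    for source, text in documents.items():
        for piece in chunk(text):
            index.append({"vector": embed(piece), "text": piece, "source": source})
    return index

def retrieve(index, query, top_k=2):
    """Query phase: embed the query and return the most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda row: similarity(q, row["vector"]), reverse=True)
    return ranked[:top_k]

def build_prompt(query, chunks):
    context = "\n".join(f"- {c['text']} (source: {c['source']})" for c in chunks)
    return ("Based on the context below, answer the question.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

documents = {
    "gst.txt": "The GST rate for laptops is 18 percent. "
               "Mobile phones are also taxed at 18 percent under GST.",
    "cricket.txt": "Cricket match tickets sold out within minutes of release.",
}
index = ingest(documents)
query = "What is the GST rate for laptops?"
top = retrieve(index, query)
prompt = build_prompt(query, top)
print(prompt)  # this string would be sent to the LLM
```

Swapping in a real embedding model and a real vector database changes embed, ingest, and retrieve, but the overall flow stays the same.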

Metadata Filtering - The Secret Weapon

Vector similarity alone is not enough. You often need to combine it with metadata filters:

Time filter: "Similar documents from last 7 days only" - for news or time-sensitive data
Category filter: "Similar products in Electronics category only" - for Flipkart-style search
Access control: "Documents this user has permission to see" - for enterprise apps
Language filter: "Hindi documents only" - for multilingual systems
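
A minimal sketch of how such filters combine with similarity ranking is shown below. The record fields (published, lang, allowed) and the function filtered_search are invented for this example; real vector databases express the same conditions in their own filter syntax.

```python
from datetime import datetime, timedelta

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def filtered_search(query, records, k, now, max_age_days=None, lang=None, user_groups=None):
    """Keep only records that pass every active filter, then rank by similarity."""
    def keep(r):
        if max_age_days is not None and now - r["published"] > timedelta(days=max_age_days):
            return False
        if lang is not None and r["lang"] != lang:
            return False
        if user_groups is not None and not (r["allowed"] & user_groups):
            return False
        return True

    hits = [r for r in records if keep(r)]
    hits.sort(key=lambda r: dot(query, r["vector"]), reverse=True)
    return [r["id"] for r in hits[:k]]

records = [
    {"id": "a", "vector": [1.0, 0.0], "published": datetime(2025, 1, 30),
     "lang": "hi", "allowed": {"staff"}},
    {"id": "b", "vector": [0.9, 0.1], "published": datetime(2024, 11, 1),
     "lang": "hi", "allowed": {"staff", "public"}},
    {"id": "c", "vector": [0.8, 0.2], "published": datetime(2025, 1, 28),
     "lang": "en", "allowed": {"public"}},
    {"id": "d", "vector": [0.1, 0.9], "published": datetime(2025, 1, 29),
     "lang": "hi", "allowed": {"public"}},
]
now = datetime(2025, 1, 31)
query = [1.0, 0.0]

unfiltered = filtered_search(query, records, 2, now)
last_week = filtered_search(query, records, 2, now, max_age_days=7)
hindi_public = filtered_search(query, records, 2, now, lang="hi", user_groups={"public"})
print(unfiltered, last_week, hindi_public)
```

Note how the access-control filter removes the most similar record ("a") entirely, which is the behavior you want: a user should never see a chunk they are not allowed to read, however relevant it is.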

Hybrid Search (Vector + Keyword)

Sometimes pure vector search misses exact terms. If someone searches for "error code ERR_2847", vector search might return documents about generic errors. Hybrid search combines:

Vector search: Finds semantically similar documents (understands meaning)
Keyword search (BM25): Finds documents with exact terms (catches specific codes, names)
Fusion: Combine both result sets using Reciprocal Rank Fusion (RRF) for best results
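
Reciprocal Rank Fusion itself is short enough to sketch. Each document scores 1 / (k + rank) in every list it appears in, and the scores are summed; k = 60 is the commonly used constant. The document IDs below are made up for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Hypothetical results for the query "error code ERR_2847".
vector_hits = ["generic_errors", "troubleshooting", "err_2847_fix"]
keyword_hits = ["err_2847_fix", "release_notes", "troubleshooting"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
# ['err_2847_fix', 'troubleshooting', 'generic_errors', 'release_notes']
```

The exact-match document wins because it appears in both lists, even though vector search alone ranked it third. RRF only needs ranks, so the two searches can use completely different scoring scales.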

Note: Metadata filtering and hybrid search are what separate production-quality RAG from demo-quality RAG. Always include metadata with your vectors and consider hybrid search for technical content.

Common Vector DB Mistakes

Pitfalls That Can Sink Your AI Application

Mistake 1: Not Storing Enough Metadata

Many teams store only the vector and document text. Then they realize they need to filter by date, category, source, or user access level - but the metadata is not there. Fix: Store rich metadata from day one. It is much easier to add it upfront than to re-index millions of vectors later.

Mistake 2: Wrong Distance Metric

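Cosine similarity, dot product, and Euclidean distance can rank the same results differently. Many embedding models are trained for cosine similarity. Dot product matches cosine only when vectors are normalized to unit length; on unnormalized vectors, longer vectors score higher regardless of direction, which skews the rankings. Fix: Check your embedding model documentation for the recommended distance metric and configure your index to match.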
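
A tiny example shows how the metrics can disagree. The two document vectors are contrived for illustration: one points almost the same way as the query but is short, the other points elsewhere but is long.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(a):
    n = norm(a)
    return [x / n for x in a]

query = [1.0, 1.0]
docs = {
    "aligned_small": [0.9, 1.1],   # nearly the same direction as the query
    "off_axis_large": [5.0, 0.0],  # different direction, much larger magnitude
}

by_dot = sorted(docs, key=lambda d: dot(query, docs[d]), reverse=True)
by_cosine = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
by_euclidean = sorted(docs, key=lambda d: euclidean(query, docs[d]))
by_dot_normalized = sorted(
    docs, key=lambda d: dot(normalize(query), normalize(docs[d])), reverse=True
)

print(by_dot)             # ['off_axis_large', 'aligned_small']
print(by_cosine)          # ['aligned_small', 'off_axis_large']
print(by_euclidean)       # ['aligned_small', 'off_axis_large']
print(by_dot_normalized)  # ['aligned_small', 'off_axis_large']
```

Once the vectors are normalized, dot product and cosine agree, which is why many setups normalize embeddings before storing them.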

Mistake 3: Over-Engineering for Scale You Do Not Have

Deploying a distributed Qdrant cluster for 50,000 vectors is like renting a godown to store 10 boxes. ChromaDB or even pgvector can handle small datasets perfectly. Fix: Start simple. ChromaDB for under 100K vectors, dedicated vector DB for millions.

Mistake 4: Ignoring Index Tuning

Default HNSW parameters (ef_construction, M) are a compromise. For your specific data, tuning these can improve recall by 5-10% or reduce latency by as much as 50%, though the gains vary by dataset. Fix: Benchmark different parameter values on your data and tune for your quality-speed requirements.
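
One way to run such a benchmark is to compare each index configuration against exact search and measure recall and latency together. The sketch below assumes a search function with the shape search_fn(query, k) returning IDs; coarse_search is a deliberately crude stand-in for a loosely tuned ANN index, not a real one.

```python
import time

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search found."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

def benchmark(search_fn, queries, ground_truth, k):
    """Return (mean recall, mean latency in ms) for one search configuration."""
    start = time.perf_counter()
    results = [search_fn(q, k) for q in queries]
    latency_ms = (time.perf_counter() - start) * 1000 / len(queries)
    recall = sum(recall_at_k(r, gt) for r, gt in zip(results, ground_truth)) / len(queries)
    return recall, latency_ms

# Toy 1-D dataset so the example stays self-contained.
data = [float(i) for i in range(1000)]

def exact_search(q, k):
    return sorted(range(len(data)), key=lambda i: abs(data[i] - q))[:k]

def coarse_search(q, k, step=2):
    # Stand-in for an under-tuned index: it only ever looks at every step-th item.
    return sorted(range(0, len(data), step), key=lambda i: abs(data[i] - q))[:k]

queries = [10.2, 500.7, 998.1]
truth = [exact_search(q, 10) for q in queries]

for name, fn in [("exact", exact_search), ("coarse", coarse_search)]:
    recall, ms = benchmark(fn, queries, truth, k=10)
    print(f"{name}: recall@10={recall:.2f}, latency={ms:.3f} ms/query")
```

With a real index you would repeat the loop over several parameter settings and pick the cheapest one that still meets your recall target.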

Mistake 5: No Backup or Versioning Strategy

Vector databases can corrupt or lose data like any database. If you cannot re-embed everything quickly (it is expensive and slow), you need backups. Fix: Always keep the original documents, record which embedding model version produced the vectors, and take regular snapshots of the vector DB.

Note: The most painful mistake is not storing metadata upfront. Re-indexing millions of vectors because you forgot to add a timestamp field can take days and cost real money.

Interview Questions

Q: Why do we need a specialized vector database instead of using PostgreSQL?

Traditional databases like PostgreSQL use B-tree indexes optimized for exact matches. Vector databases use ANN (Approximate Nearest Neighbor) indexes like HNSW that are designed for high-dimensional similarity search. Comparing a 768-dim query against 10M vectors with brute force takes seconds; with HNSW it takes milliseconds. PostgreSQL has the pgvector extension, which is a reasonable choice for smaller datasets, but purpose-built vector DBs are designed to handle many millions of vectors with better performance, filtering, and scaling.

Q: Explain how HNSW index works.

HNSW builds a multi-level graph. Top levels have few nodes with long-range connections (like highways). Lower levels have more nodes with short-range connections (like local streets). To find nearest neighbors, start at the top level, greedily navigate to the closest node, then descend to the next level and repeat. This narrows the search from millions of comparisons to a few hundred, which is how HNSW typically reaches millisecond-level search with 95-99% recall.

Q: When would you use ChromaDB vs Pinecone vs Qdrant?

ChromaDB: Prototyping, local development, small datasets (under 100K vectors), Python notebooks. Pinecone: Production apps where you want zero infrastructure management and auto-scaling, and are okay with vendor lock-in and higher costs. Qdrant: Production apps that need high performance and advanced filtering, where you want open source with a self-hosting option. Qdrant (Rust-based) often does well on raw performance.

Q: What is hybrid search and why is it important?

Hybrid search combines vector (semantic) search with keyword (BM25) search. Vector search understands meaning but may miss exact terms like error codes or product IDs. Keyword search finds exact matches but misses semantic understanding. Combining them with Reciprocal Rank Fusion gives the best of both worlds. This is critical for technical documentation, e-commerce search, and any system where both meaning and exact terms matter.
