Vector Database
The infrastructure layer that makes semantic search and RAG possible. Every vendor has one now. Most enterprises don't need a standalone one.
The Technical Definition
A vector database stores and retrieves data based on semantic similarity rather than exact keyword matches. When you convert text (or images, or other data) into vectors—long arrays of numbers representing meaning—you can find similar items by calculating distance between vectors in that numerical space.
Traditional databases answer “Does this data match this query?” Vector databases answer “What data is semantically closest to this query?” It’s the infrastructure layer that makes semantic search, RAG (retrieval-augmented generation), and recommendation systems possible.
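To make "closest in numerical space" concrete, here is a minimal sketch of a brute-force similarity search. The three-number vectors are hand-made stand-ins for real embeddings (an assumption for illustration; an embedding model would produce hundreds or thousands of dimensions), and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, items, k=2):
    """Brute-force search: score every stored vector, return the k closest labels."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]

# Hand-made 3-dimensional "embeddings" (illustrative values, not model output).
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "return an item": [0.8, 0.3, 0.1],
    "office holiday schedule": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of a question like "how do I get my money back?"
query_vector = [1.0, 0.1, 0.0]
print(nearest(query_vector, DOCS))
```

The query shares no keywords with the stored labels, yet the two refund-related entries rank first because their vectors point in nearly the same direction. A vector database does the same job, typically with an index so it does not have to score every stored vector.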
What This Actually Means for Your Business
In 2024-2025, vector databases are experiencing peak hype. Every startup launched a vector database. Every established database (Postgres, MongoDB, Elasticsearch) added vector capabilities. The narrative is: you need a vector database for your AI strategy.
Here’s the operational truth: most enterprises don’t need a standalone vector database. They need vector capabilities. Very different thing.
Your actual scenario: you have documents (contracts, knowledge bases, RFP responses). You want to search them semantically. You want to feed relevant context to an LLM. You need vectors. But do you need a purpose-built vector database, or do you need vectors inside the database you already own?
If you’re using Postgres, you can use pgvector—a Postgres extension. Your data stays in one place. No new tool. No separate infrastructure. You trade some query performance optimization for operational simplicity. For most enterprises, that’s the right call.
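As a rough sketch of what "vectors inside the database you already own" looks like with pgvector: the table name `documents`, its columns, and the 3-dimensional embedding size below are all assumptions chosen for illustration. The SQL uses pgvector's `vector` column type and its `<=>` cosine-distance operator; the `%s` placeholders assume a driver such as psycopg, which is not shown here.

```python
# Hypothetical schema: a "documents" table with a tiny 3-dimensional embedding
# column. Real embedding models produce far more dimensions.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(3)
);
"""

# "<=>" is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, content
FROM documents
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

def to_pgvector_literal(values):
    """Format a Python list in the text form pgvector accepts, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

def search_params(query_embedding, limit=5):
    """Build the parameter tuple to pass alongside SEARCH_SQL to a driver's execute()."""
    return (to_pgvector_literal(query_embedding), limit)

print(search_params([0.1, 0.2, 0.3]))
```

The point of the sketch is the shape of the solution: one extra column and one `ORDER BY` clause in a database you already back up, monitor, and secure.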
If you’re at scale (millions of documents, millions of daily queries, sub-100ms latency requirements, very specific similarity metrics), then a specialized vector database (Pinecone, Weaviate, Milvus) makes sense. You get performance optimization and tooling built for vector workloads. You pay in operational complexity: another data store, another sync problem, another thing that can break.
The trap: teams see “vector database” in the tech stack of a well-funded startup and buy the same tool, even though the problem isn’t the same. The startup has 50M vectors and needs 50ms latency. You have 100K vectors and need 500ms latency. Postgres with vectors is faster to ship and simpler to operate.
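A back-of-envelope calculation shows how different those two problems are. The sketch below assumes 1,536-dimensional embeddings stored as 4-byte floats (both are assumptions; your model and storage format may differ) and counts raw vector data only, ignoring index overhead.

```python
BYTES_PER_FLOAT32 = 4
DIMENSIONS = 1536  # assumed embedding size; depends on the model you use

def raw_vector_bytes(num_vectors, dimensions):
    """Raw storage for the vectors alone, excluding indexes and metadata."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

for label, count in [("100K vectors", 100_000), ("50M vectors", 50_000_000)]:
    gigabytes = raw_vector_bytes(count, DIMENSIONS) / 1e9
    print(f"{label}: {gigabytes:.1f} GB of raw float32 data")
```

Under these assumptions, 100K vectors come to about 0.6 GB and 50M vectors to about 307 GB. The smaller workload is an ordinary table; the larger one is a capacity-planning exercise.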
Reality Check
What vendors say: “You need a dedicated vector database to build production AI applications at scale.”
What that means in practice: You need vector capabilities. Whether those live in a standalone database or inside your existing data infrastructure depends on scale and performance requirements. Most enterprises keep vectors inside their existing database for 18-24 months, then evaluate standalone databases once measured latency or cost becomes a real problem. That’s a sensible migration path.
What Operators Actually Do
The pattern at companies shipping AI fast: start with your existing database. Postgres + pgvector, Elasticsearch with dense vector fields, MongoDB with Atlas Vector Search: whatever you already own. Get your RAG system working. Measure actual latency and cost.
When metrics get painful (queries are slow, cost is high), evaluate specialized databases. At that point you have real data: “We’re running 100K queries/day on 2M vectors. Current latency is 800ms. Cost is $X/month.” Now you can build a business case.
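Getting those numbers does not require special tooling. Here is a minimal sketch of a latency measurement, assuming you can wrap your retrieval call in a function; `fake_search` is a hypothetical placeholder for that call, and the percentile uses the simple nearest-rank method.

```python
import math
import time

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def measure_latency_ms(search_fn, queries):
    """Time each call to search_fn and return per-query latencies in milliseconds."""
    latencies = []
    for query in queries:
        start = time.perf_counter()
        search_fn(query)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

def fake_search(query):
    # Hypothetical placeholder; replace with your real retrieval call.
    return sorted(query)

latencies = measure_latency_ms(fake_search, ["contract renewal", "rfp pricing"] * 50)
print(f"p50={percentile(latencies, 50):.3f} ms  p95={percentile(latencies, 95):.3f} ms")
```

Report p95 alongside the median: an average can look fine while one query in twenty is slow enough for users to notice.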
The teams that buy vector databases first and ask “what do we use it for” later end up with expensive infrastructure powering small use cases. The teams that get the core use case working first on existing infrastructure can make smart infrastructure decisions later.
One more pattern: when teams do move to a standalone vector database, they often keep Postgres or MongoDB alongside it. The vector database stores vectors and serves similarity queries. The operational database stores everything else (metadata, audit logs, permissions, source data). No single database is best at everything. Most mature systems use multiple databases strategically.
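The two-store split can be sketched in a few lines. Both stores below are plain dictionaries standing in for real systems, and the document ids, titles, and roles are invented for illustration.

```python
import math

# Stand-in for the vector database: document id -> embedding (illustrative values).
VECTOR_INDEX = {
    "doc-1": [0.9, 0.1],
    "doc-2": [0.8, 0.2],
    "doc-3": [0.1, 0.9],
}

# Stand-in for the operational database: metadata and permissions per document.
METADATA = {
    "doc-1": {"title": "Master services agreement", "allowed_roles": {"legal"}},
    "doc-2": {"title": "RFP response template", "allowed_roles": {"legal", "sales"}},
    "doc-3": {"title": "Holiday schedule", "allowed_roles": {"legal", "sales"}},
}

def similar_ids(query, k):
    """Vector-store side: rank document ids by Euclidean distance to the query."""
    ranked = sorted(VECTOR_INDEX, key=lambda doc_id: math.dist(query, VECTOR_INDEX[doc_id]))
    return ranked[:k]

def retrieve(query, role, k=2):
    """Join similarity results with metadata, dropping documents the role cannot see."""
    results = []
    for doc_id in similar_ids(query, k):
        record = METADATA[doc_id]
        if role in record["allowed_roles"]:
            results.append((doc_id, record["title"]))
    return results

print(retrieve([1.0, 0.0], role="sales"))
```

Note the sync problem hiding in this sketch: every id in the vector store must also exist in the operational store, and filtering permissions after the similarity search can return fewer than `k` results.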
The Questions to Ask
- Can we solve this problem with vector capabilities in the database we already own? What’s the actual performance requirement that would force us to a specialized tool? (If you’re honest about this, you probably don’t need a new database yet.)
- If we buy a vector database, what breaks if we lose it? Can we rebuild the vectors, or does data become inaccessible? (This determines whether the tool is critical infrastructure or convenience infrastructure. Different governance for each.)
- What’s our actual retrieval latency requirement, and how do we know we’re hitting it? Is 200ms acceptable, or do we need 20ms? (Most teams discover they don’t actually know. Measure first. Buy the specialized tool if measurement proves you need it.)