What is a vector database (and embeddings)?

By Alex Grygoriev

Definition

A vector database stores data as embeddings — lists of numbers that capture meaning — and finds results by similarity rather than exact keywords. It lets AI retrieve the most semantically relevant text, images or records, which is what makes retrieval-augmented generation (RAG) and semantic search work.

A vector database is the retrieval layer behind many AI applications that answer from their own data. It's what lets a system find "the passage that means roughly this", even when none of the exact words match.

What are embeddings?

An embedding is a piece of text (or an image) converted by an AI model into a list of numbers, called a vector, that represents its meaning. Texts with similar meaning end up close together in this numeric space. "Cancel my subscription" and "I want to end my plan" land near each other even though they share no meaningful keywords.
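To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to measure how near two embeddings are. The three-number vectors below are invented for illustration and do not come from any real model; real embeddings usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, made up for this example only.
cancel_subscription = [0.9, 0.1, 0.2]  # "Cancel my subscription"
end_my_plan = [0.8, 0.2, 0.3]          # "I want to end my plan"
weather_today = [0.1, 0.9, 0.1]        # "What's the weather today?"

sim_related = cosine_similarity(cancel_subscription, end_my_plan)
sim_unrelated = cosine_similarity(cancel_subscription, weather_today)

# The two cancellation phrases score much higher than the unrelated pair.
print(round(sim_related, 2), round(sim_unrelated, 2))
```

A score near 1 means the vectors point in almost the same direction; a score near 0 means they are unrelated.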

How a vector database works

  • Store: each chunk of your content is embedded and saved as a vector.
  • Query: the incoming question is embedded the same way.
  • Search: the database returns the vectors nearest to the query, meaning the most semantically similar content. An approximate nearest-neighbor index typically keeps this fast, often within milliseconds, even across millions of records.
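The three steps above can be sketched as a tiny in-memory store. This is an illustration under loud assumptions: `toy_embed`, `VOCAB`, and `TinyVectorStore` are hypothetical names, and `toy_embed` only counts a few known words, so it stands in for a real embedding model without capturing meaning. The search also scans every row, whereas a production database would normally use an approximate nearest-neighbor index.

```python
import math

# Hypothetical fixed vocabulary; a real model needs no such list.
VOCAB = ["cancel", "subscription", "refund", "shipping", "delivery", "invoice"]


def toy_embed(text):
    """Stand-in for an embedding model: counts vocabulary words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    def __init__(self):
        self.rows = []  # list of (text, vector) pairs

    def add(self, text):
        # Store: embed the chunk and keep the vector next to the text.
        self.rows.append((text, toy_embed(text)))

    def search(self, question, k=2):
        # Query: embed the question the same way as the stored chunks.
        query_vector = toy_embed(question)
        # Search: score every row (brute force) and return the top k.
        scored = [(cosine_similarity(query_vector, vec), text)
                  for text, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
for chunk in ["how to cancel a subscription",
              "shipping and delivery times",
              "download an invoice"]:
    store.add(chunk)

results = store.search("cancel my subscription", k=1)
print(results[0][1])  # the cancellation chunk ranks first
```

Swapping `toy_embed` for a call to a real embedding model is the only change needed for the ranking to reflect meaning rather than word overlap.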

Why hybrid search wins

Pure vector search is great at meaning but can miss exact terms (a product code, a name). Pure keyword search has the opposite weakness: it catches exact terms but misses paraphrases. The strongest systems use hybrid search, which combines vector similarity with keyword matching (e.g. BM25) and fuses the two result lists, to get both precision and recall. Retrieval of this quality is what makes RAG reliable.
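One common way to fuse the two result lists is reciprocal rank fusion (RRF); the text above does not name a specific method, so treat this as one option rather than the required one. The sketch assumes each search has already returned a ranked list of document IDs. The IDs are hypothetical, and `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs; each hit adds 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results for one query from the two retrievers.
vector_hits = ["doc_plans", "doc_cancel", "doc_sku"]     # ranked by similarity
keyword_hits = ["doc_sku", "doc_cancel", "doc_billing"]  # ranked by BM25

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that appear in both lists rise to the top, while a document found by only one retriever still stays in the results. RRF uses only ranks, so the two retrievers' raw scores never need to be put on a common scale.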

Where it's used

Semantic search, "chat with your documents", recommendation, deduplication, and the retrieval step of many AI agents. One popular option is pgvector, a PostgreSQL extension that keeps vectors right next to your relational data.

Summary

A vector database stores meaning as numbers and searches by similarity. Together with embeddings and hybrid search, it's the retrieval engine that lets AI answer accurately from your own knowledge.

Frequently asked questions

What is the difference between a vector database and a normal database?

A normal database is built to find exact matches (this ID, this keyword). A vector database finds the items closest in meaning and ranks results by semantic similarity, which is ideal for natural-language search and AI retrieval.

Do I need a separate vector database?

Not always. Extensions like pgvector add vector search to PostgreSQL, so you can keep embeddings beside your existing relational data instead of running a separate system — often simpler to operate.
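As a rough sketch of what that looks like, the snippet below prepares pgvector SQL from Python. The table name `chunks`, its columns, and the helper `to_pgvector_literal` are hypothetical, and the `%s` placeholder assumes a driver such as psycopg. The snippet only builds the statements and the query parameter, so running them still requires a PostgreSQL server with the pgvector extension installed.

```python
def to_pgvector_literal(vector):
    """Format a list of numbers in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"


# Hypothetical schema: the embedding sits beside ordinary relational columns.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    body text NOT NULL,
    embedding vector(3)  -- dimension must match your embedding model
);
"""

# <=> is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = """
SELECT id, body
FROM chunks
ORDER BY embedding <=> %s::vector
LIMIT 5;
"""

# Parameter to pass alongside SEARCH_SQL when executing the query.
query_param = to_pgvector_literal([0.9, 0.1, 0.2])
print(query_param)
```

Because the embedding is just another column, the same query can also filter or join on regular relational fields.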