Integrating semantic search capabilities directly into a standard PostgreSQL relational database is a game-changer for full-stack developers. Instead of deploying separate, costly vector database nodes (like Pinecone or Milvus), we can harness the native power of pgvector inside PostgreSQL. In this post, I'll walk through how to configure vector capabilities, execute similarity queries, and optimize your database for custom semantic AI assistants.
🛠️ Activating the pgvector Extension
The first step in any PostgreSQL environment is registering the vector extension. Since my portfolio AI backend connects directly to a self-hosted PostgreSQL database, enabling support is as simple as enabling the extension:
-- Enable the vector extension to support high-dimensional data types
CREATE EXTENSION IF NOT EXISTS vector;
Once enabled, database tables gain access to a new VECTOR column type. You must declare the exact dimensions of your vector embeddings. For example, since my portfolio's local embedding model all-MiniLM-L6-v2 encodes text to 384 dimensions, the schema is defined as:
CREATE TABLE IF NOT EXISTS portfolio_embeddings (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB,
embedding VECTOR(384), -- Fixed-dimension semantic vector
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
🔍 Understanding Vector Distance Operations
To search for the most semantically relevant facts when a user asks a question, PostgreSQL needs to calculate the distance between the query's embedding vector and the stored vectors.
PostgreSQL's pgvector extension supports three main operators for measuring vector similarity:
<->(Euclidean/L2 Distance): Best for normalized coordinates, but sensitive to text chunk lengths.<#>(Negative Inner Product): Extremely fast, but vectors must be pre-normalized.<=>(Cosine Distance): The gold standard for text semantics, measuring the angular difference between vector directions rather than absolute length.
For our AI assistant, we query using Cosine Distance (<=>), returning the smallest distances (closest matches) up to our limit:
-- Retrieve the 3 closest matches based on angular similarity
SELECT content, (1 - (embedding <=> $1)) as similarity
FROM portfolio_embeddings
ORDER BY embedding <=> $1
LIMIT 3;
⚡ Querying from Python using asyncpg
To connect our FastAPI backend to the PostgreSQL database efficiently, we use the high-performance async database driver asyncpg. To map Python lists directly to PostgreSQL vector types, we serialize the floats and cast them appropriately:
import json
def list_to_vector_literal(v: list[float]) -> str:
# Formats a Python list of floats as a SQL vector literal like '[0.1, 0.2, 0.3]'
return "[" + ",".join(str(x) for x in v) + "]"
Then we run the asynchronous search query:
async def search_vectors(query_embedding: list[float]) -> list[dict]:
vector_str = list_to_vector_literal(query_embedding)
rows = await conn.fetch(
"""
SELECT content, (1 - (embedding <=> $1::vector)) as similarity
FROM portfolio_embeddings
ORDER BY embedding <=> $1::vector
LIMIT 3
""",
vector_str
)
return [dict(r) for r in rows]
This is the exact, high-performance semantic retrieval system running under the hood of my portfolio AI, ensuring sub-10ms response times for the chatbot!
🚀 Conclusion
By pairing PostgreSQL with semantic queries via pgvector, you keep your stack lean, reduce maintenance surface area, and leverage robust relational queries side-by-side with high-performance vector operations. It is a fantastic pattern that proves PostgreSQL remains the most versatile database tool in modern development!