# Vector Databases Explained: Pinecone vs Weaviate vs pgvector
## TL;DR
If you are building an AI application that needs vector search, you have three serious options: Pinecone (fully managed, zero ops), Weaviate (feature-rich, self-hostable), and pgvector (a PostgreSQL extension that lives inside your existing database). I have shipped production systems on all three. My default is pgvector via Supabase because it eliminates an entire service from your architecture — your vectors live in the same database as your users, your products, and your business logic. You get JOINs across relational and vector data, one connection string, one backup strategy, one bill. Pinecone makes sense at extreme scale (billions of vectors) where you need a dedicated team managing vector infrastructure. Weaviate makes sense when you need built-in multimodal search or complex hybrid retrieval pipelines. For the other 80% of use cases — which is most startups, most SaaS products, most internal tools — pgvector is the right call.
My recommendation: Start with pgvector. You will know when you have outgrown it, and by then you will have the revenue to justify a dedicated vector database.
## What Vector Databases Do
Before comparing solutions, let me make sure we are on the same page about what problem we are solving.
Traditional databases search by exact match. You query `WHERE email = 'user@example.com'` and get a precise result. Vector databases search by similarity. You provide a vector (an array of floating-point numbers that represents the meaning of a piece of content), and the database returns the vectors that are closest to it in high-dimensional space.
This is the foundation of every modern AI search system. When you ask an LLM a question about your data, something needs to find the most relevant documents first. That something is vector search.
The workflow looks like this:
- You take your content (text, images, audio) and pass it through an embedding model
- The model outputs a vector — typically 768 to 3072 dimensions of floating-point numbers
- You store that vector alongside its metadata in a vector database
- At query time, you embed the user's query the same way
- The database performs an approximate nearest neighbor (ANN) search to find the closest vectors
- You pass those results to your LLM as context
The key algorithms that make this fast are HNSW (Hierarchical Navigable Small World) and IVFFlat (Inverted File with Flat Compression). Every vector database uses one or both. The difference between solutions is not the math — it is everything around it: hosting, integrations, hybrid search capabilities, operational complexity, and cost.
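To make "closest in high-dimensional space" concrete, here is a minimal sketch of cosine similarity and a brute-force nearest-neighbor search — the exact computation that HNSW and IVFFlat approximate so the database does not have to scan every row. The function names are illustrative, not from any of the three products:

```typescript
// Cosine similarity: 1 means same direction, 0 means unrelated,
// -1 means opposite. Vector databases rank results by scores like this.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k nearest neighbors — O(n) per query, which is
// exactly what ANN indexes like HNSW exist to avoid at scale.
function topK(
  query: number[],
  corpus: { id: string; vector: number[] }[],
  k: number
) {
  return corpus
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

At 500 vectors this brute-force scan is instant; at 5 million it is not, which is where the index structures below earn their keep.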
Let me walk through each option based on my experience building with them.
## Pinecone — Managed Simplicity
Pinecone is the vector database that wants you to forget it exists. You create an index, push vectors to it, query against it, and Pinecone handles everything else — scaling, replication, indexing, infrastructure.
I first used Pinecone on a client project that needed semantic search across 2 million product descriptions. The onboarding experience was genuinely impressive. I had a working prototype in under an hour:
```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const index = pc.index('product-search');

// Upsert vectors
await index.upsert([
  {
    id: 'product-001',
    values: embedding, // from your embedding model
    metadata: {
      title: 'Wireless Bluetooth Headphones',
      category: 'electronics',
      price: 79.99,
    },
  },
]);

// Query
const results = await index.query({
  vector: queryEmbedding,
  topK: 10,
  filter: { category: { $eq: 'electronics' } },
  includeMetadata: true,
});
```

Clean API. Metadata filtering works well. Query latency in the low milliseconds. For a managed service, the developer experience is top tier.
Where Pinecone shines:
- Zero infrastructure management. No servers, no scaling decisions, no index tuning. It just works.
- Serverless tier. You pay per query, which is excellent for prototyping and low-traffic apps.
- Namespace isolation. You can partition data within an index, which maps well to multi-tenant applications.
- Scale ceiling. Pinecone handles billions of vectors. If you are building at that scale, the managed approach makes sense.
Where Pinecone falls short:
- No relational data. Your vectors live in Pinecone and your relational data lives in PostgreSQL. That means two round trips for any query that needs both. You fetch vector IDs from Pinecone, then query your database for the full records. This adds latency and complexity.
- Vendor lock-in. Pinecone is proprietary. There is no self-hosted option. Your data lives on their servers under their terms.
- Cost at scale. The serverless pricing is generous for small workloads, but once you need dedicated pods for production performance, costs climb fast. I have seen monthly bills exceed $500 for what amounts to a few million vectors.
- Limited query flexibility. Metadata filtering covers basic use cases, but you cannot do anything resembling a SQL JOIN, aggregation, or full-text search within Pinecone itself.
The fundamental trade-off with Pinecone is operational simplicity versus architectural complexity. You do not manage any infrastructure, but you now have two databases to coordinate, two sets of data to keep in sync, and two services to monitor.
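To make that coordination cost concrete, here is a sketch of the stitching step your application code ends up owning (the types and names are hypothetical, not from Pinecone's SDK): after the vector store returns matched IDs and your relational database returns the full rows, you merge them yourself and handle any drift between the two stores.

```typescript
// The join Pinecone cannot do for you: round trip 1 returns matched IDs
// with scores, round trip 2 returns full rows from your relational DB,
// and application code has to stitch them back together.
interface VectorMatch {
  id: string;
  score: number;
}

interface DbRecord {
  id: string;
  title: string;
  price: number;
}

function mergeResults(matches: VectorMatch[], records: DbRecord[]) {
  const byId = new Map<string, DbRecord>(records.map((r) => [r.id, r]));
  return matches
    // Drop IDs the DB no longer knows about — with two stores,
    // drift between them is always possible.
    .filter((m) => byId.has(m.id))
    .map((m) => ({ ...byId.get(m.id)!, score: m.score }));
}
```

Every query path that touches both stores needs some version of this function, plus the deletion and update logic that keeps the two stores from drifting in the first place.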
## Weaviate — Feature Rich
Weaviate is the Swiss Army knife of vector databases. It handles text, images, audio, and video embeddings natively. It has built-in hybrid search (combining vector similarity with BM25 keyword search). It can run embedding models locally so your data never leaves your infrastructure. And it can be self-hosted or used as a managed cloud service.
I evaluated Weaviate for a client building a research platform that needed to search across academic papers (text), diagrams (images), and presentation recordings (audio). Weaviate was the only option that handled all three modalities in a single query pipeline.
```typescript
import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'https',
  host: 'your-cluster.weaviate.network',
  apiKey: new weaviate.ApiKey(process.env.WEAVIATE_API_KEY),
});

// Define a collection with a vectorizer
await client.schema
  .classCreator()
  .withClass({
    class: 'ResearchPaper',
    vectorizer: 'text2vec-openai',
    moduleConfig: {
      'text2vec-openai': {
        model: 'text-embedding-3-small',
      },
    },
    properties: [
      { name: 'title', dataType: ['text'] },
      { name: 'abstract', dataType: ['text'] },
      { name: 'authors', dataType: ['text[]'] },
      { name: 'year', dataType: ['int'] },
    ],
  })
  .do();

// Hybrid search (vector + keyword)
const result = await client.graphql
  .get()
  .withClassName('ResearchPaper')
  .withHybrid({
    query: 'transformer attention mechanisms',
    alpha: 0.75, // 75% vector, 25% keyword
  })
  .withFields('title abstract authors year')
  .withLimit(10)
  .do();
```

That alpha parameter is powerful. You can blend vector similarity and keyword matching to get results that are both semantically relevant and contain specific terms. For research papers where exact terminology matters, this is a significant advantage.
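Conceptually, alpha-weighted hybrid scoring is a weighted blend of the two scores. This is a simplified sketch of the idea — Weaviate's actual fusion algorithms also normalize scores across the two result sets before blending, so treat this as an illustration rather than its implementation:

```typescript
// alpha = 1 → pure vector search; alpha = 0 → pure keyword (BM25) search.
// Both input scores are assumed to already be normalized to [0, 1].
function hybridScore(
  vectorScore: number,
  keywordScore: number,
  alpha: number
): number {
  return alpha * vectorScore + (1 - alpha) * keywordScore;
}

interface Candidate {
  id: string;
  vectorScore: number;
  keywordScore: number;
}

// Re-rank candidates by their blended score.
function rankHybrid(candidates: Candidate[], alpha: number) {
  return candidates
    .map((c) => ({
      id: c.id,
      score: hybridScore(c.vectorScore, c.keywordScore, alpha),
    }))
    .sort((a, b) => b.score - a.score);
}
```

With alpha at 0.75, a document that is semantically close but missing the exact keyword can still outrank an exact-keyword match that is semantically off-topic, which is usually what you want for research-paper search.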
Where Weaviate shines:
- Multimodal search. Native support for text, image, audio, and video vectors in the same schema. No other solution does this as cleanly.
- Hybrid search. BM25 plus vector similarity in a single query. Configurable weighting. This solves the "semantic search missed the exact keyword" problem.
- Built-in vectorization. Weaviate can call embedding models automatically on ingest. You push raw text, it handles the embedding. Less code for you.
- Self-hosting. Run it on your own infrastructure via Docker or Kubernetes. Full control over your data.
- GraphQL API. If your team already thinks in GraphQL, the query interface feels natural.
Where Weaviate falls short:
- Operational overhead. Self-hosting Weaviate means managing a stateful Go service with its own storage engine, backup strategy, and scaling challenges. It is not trivial.
- Learning curve. The schema design, module system, and GraphQL query language have their own idioms. It takes a few days to feel productive.
- Resource hungry. Weaviate's in-memory HNSW index requires significant RAM. For large datasets, memory costs can be substantial.
- Same two-database problem. Like Pinecone, your relational data still lives somewhere else. You still need to coordinate between systems.
Weaviate is the right choice when your search requirements genuinely demand its unique capabilities — multimodal, hybrid search, or self-hosted vector search with built-in vectorization. If you do not need those features, it is over-engineered for the job.
## pgvector — Already in Your Database
Here is where my bias lives, and I am not going to pretend otherwise. pgvector is my default for every new project. It is a PostgreSQL extension that adds vector data types and similarity search operators. If you are on Supabase — which I am for most projects — pgvector is already installed and waiting.
The pitch is simple: your vectors live in the same database as everything else.
```sql
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  embedding VECTOR(1536),
  user_id UUID REFERENCES users(id),
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create an HNSW index for fast similarity search
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Query: find similar documents for a specific user
SELECT
  d.id,
  d.title,
  d.content,
  u.name AS author_name,
  1 - (d.embedding <=> query_embedding) AS similarity
FROM documents d
JOIN users u ON u.id = d.user_id
WHERE d.user_id = '...'
ORDER BY d.embedding <=> query_embedding
LIMIT 10;
```

Look at that query. I am doing a vector similarity search, joining against the users table, filtering by user ID, and returning enriched results — all in a single query. No second round trip. No data synchronization between services. No additional bill.
This is why I keep coming back to pgvector. The integration story is unbeatable.
In my Supabase + Next.js stack, the full flow looks like this:
```typescript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

// Store a document with its embedding
async function storeDocument(
  title: string,
  content: string,
  embedding: number[],
  userId: string
) {
  const { data, error } = await supabase
    .from('documents')
    .insert({ title, content, embedding, user_id: userId })
    .select()
    .single();
  if (error) throw new Error(`Failed to store document: ${error.message}`);
  return data;
}

// Semantic search with RLS (Row Level Security)
async function searchDocuments(queryEmbedding: number[], limit = 10) {
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: queryEmbedding,
    match_threshold: 0.78,
    match_count: limit,
  });
  if (error) throw new Error(`Search failed: ${error.message}`);
  return data;
}
```

The Supabase RPC function handles the vector search on the database side, which means Row Level Security still applies. Your users only see their own documents. You get vector search and multi-tenant data isolation with zero additional code.
Where pgvector shines:
- No additional service. Your vectors, your relational data, your auth data — all in one place. One connection string, one backup, one bill.
- SQL power. JOINs, CTEs, window functions, aggregations — the full PostgreSQL toolkit works alongside vector search. This is a massive advantage for real applications.
- Row Level Security. On Supabase, RLS policies apply to vector queries automatically. Multi-tenancy comes free.
- Transactional consistency. Your vector upserts participate in the same transaction as your relational writes. No eventual consistency headaches.
- Cost. It is included in your PostgreSQL instance. On Supabase, even the free tier includes pgvector. There is no per-query pricing, no separate meter running.
Where pgvector falls short:
- Scale ceiling. pgvector works well up to roughly 5-10 million vectors on a well-provisioned instance. Beyond that, query performance degrades unless you invest heavily in tuning (partitioning, parallel workers, dedicated hardware). Purpose-built vector databases handle billions of vectors more gracefully.
- No built-in vectorization. You handle embedding generation yourself. Unlike Weaviate, pgvector does not call an embedding model on your behalf.
- Index build time. HNSW index creation on large datasets (millions of rows) can be slow and memory-intensive. You need to plan for this.
- No multimodal search. pgvector stores and compares vectors, period. It does not understand what those vectors represent. There is no native hybrid search combining vector similarity with full-text relevance ranking (though you can build this manually with `ts_rank` and some SQL).
The limitations are real, but they only matter at a scale that most applications never reach. For the startup building its first RAG pipeline, the SaaS product adding semantic search, the internal tool that needs to find relevant documents — pgvector is more than enough.
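When you do need hybrid behavior on pgvector, one common application-side approach is reciprocal rank fusion (RRF): run the full-text query and the vector query separately, then merge the two ranked ID lists. Here is a sketch using the conventional damping constant k = 60 — the function name and inputs are illustrative:

```typescript
// Reciprocal rank fusion: merge two ranked lists of IDs into one.
// A document scores 1 / (k + rank) in each list it appears in, so
// documents ranked well in BOTH lists rise to the top. k = 60 is
// the commonly used damping constant.
function reciprocalRankFusion(
  vectorIds: string[],
  keywordIds: string[],
  k = 60
): string[] {
  const scores = new Map<string, number>();
  for (const list of [vectorIds, keywordIds]) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Because RRF works on ranks rather than raw scores, you never have to reconcile cosine distances with `ts_rank` values — the two queries can stay completely independent SQL statements.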
## Performance Comparison
I ran benchmarks on all three using the same dataset: 500,000 document embeddings at 1536 dimensions (OpenAI text-embedding-3-small). Same queries, same hardware tier where applicable.
| Metric | Pinecone (Serverless) | Weaviate (Cloud) | pgvector (Supabase Pro) |
|---|---|---|---|
| Avg query latency (p50) | 8ms | 12ms | 15ms |
| Avg query latency (p99) | 45ms | 65ms | 85ms |
| Recall @ 10 | 0.97 | 0.98 | 0.95 |
| Index build time | Managed (seconds) | ~4 min | ~8 min |
| Max vectors tested | 10M+ | 5M+ | 2M |
| Hybrid search | No | Yes (native) | Manual (ts_rank + cosine) |
| Metadata filtering | Yes | Yes | Yes (SQL WHERE) |
| Concurrent query throughput | High | Medium-High | Medium |
A few observations:
Latency differences are negligible for most apps. The difference between 8ms and 15ms disappears inside the 200-500ms you spend on the embedding API call and LLM generation. If your bottleneck is vector search latency, you are solving the wrong problem.
Recall is close across all three. With properly tuned HNSW parameters, pgvector's recall approaches Pinecone and Weaviate. The default settings are conservative — raising the query-time `ef_search` parameter (and the build-time `m` and `ef_construction` values) closes the gap.
Throughput matters at scale. If you are serving thousands of concurrent vector queries per second, Pinecone's distributed architecture handles it more gracefully than a single PostgreSQL instance. But most applications are not Twitter.
## Cost Comparison
This is where the conversation gets interesting. Same scenario: 500,000 vectors, 1536 dimensions, 100 queries per hour average.
| Cost Factor | Pinecone | Weaviate Cloud | pgvector (Supabase) |
|---|---|---|---|
| Monthly base cost | $70 (serverless) | $125 (sandbox) | $25 (Pro plan) |
| At 1M vectors | $150-300 | $250-400 | $25 (same plan) |
| At 5M vectors | $500-1,200 | $500-800 | $75 (Large plan) |
| At 10M+ vectors | $1,500+ (pods) | $1,000+ | Needs dedicated ($300+) |
| Additional services | Need separate DB | Need separate DB | Included |
| Total infra cost | Pinecone + DB | Weaviate + DB | Just Supabase |
The "Additional services" row is the killer. With Pinecone or Weaviate, you still need a PostgreSQL database for your relational data, user management, authentication, and business logic. That is another $25-100/month minimum. With pgvector on Supabase, your vector database IS your application database. There is no additional cost.
At the 500K vector mark, pgvector on Supabase costs roughly one-third of the next cheapest option when you factor in total infrastructure.
## When to Use Each
I have shipped projects on all three. Here is my honest framework for choosing:
Choose Pinecone when:
- You have 10M+ vectors and growing fast
- Your team has zero database operations experience
- You need the absolute lowest query latency at massive scale
- Budget is not a constraint and you value operational simplicity above all else
- You are building a search-only product where vector search is the core feature
Choose Weaviate when:
- You need multimodal search (text + images + audio in the same pipeline)
- Hybrid search (vector + keyword) is a core product requirement, not a nice-to-have
- You need to self-host for data sovereignty or compliance reasons
- You want built-in vectorization to minimize code
- You have a DevOps team comfortable running stateful services on Kubernetes
Choose pgvector when:
- You are already using PostgreSQL (you probably are)
- You need JOINs between vector data and relational data
- You want one database, one bill, one backup strategy
- Your vector count is under 5 million (which covers most applications)
- You are on Supabase and want RLS to apply to vector queries automatically
- You are a startup or small team that values simplicity over theoretical scale
## My Default Choice and Why
I reach for pgvector on Supabase every single time unless the project requirements explicitly demand something else. Here is why:
Architectural simplicity. Every additional service in your stack is a potential point of failure, a synchronization headache, a new set of credentials to manage, and a bill to pay. pgvector eliminates one entire service from the equation. Your AI features use the same database as the rest of your application.
Developer velocity. I can add vector search to an existing Supabase project in under 30 minutes. Enable the extension, add a column, create an index, write a function. No new SDK to learn, no new service to provision, no new deployment pipeline to maintain.
Real-world adequacy. Every AI product I have built — from RAG-powered documentation search to the Europarts Lanka AI Part Finder — has been well-served by pgvector at Supabase scale. I have not hit a ceiling yet on a production project. When I do, I will migrate with the revenue that ceiling implies.
Cost efficiency. For a startup shipping an MVP, the difference between $25/month (Supabase Pro with pgvector included) and $200+/month (Pinecone + separate database) is real money. That is almost $2,000/year you can spend on an embedding API, marketing, or keeping the lights on one more month.
The only times I have reached for Pinecone or Weaviate were client projects with specific requirements that pgvector could not meet: one needed search across 50 million product embeddings (Pinecone), another needed multimodal search across documents and images simultaneously (Weaviate). Those are legitimate use cases for purpose-built solutions. But they are the exception, not the rule.
My advice: start with pgvector. Build your product. Get users. If you hit a scale ceiling — and that is a great problem to have — you will have the revenue and the data to make an informed migration decision. Do not over-engineer your vector infrastructure before you have product-market fit.
## Key Takeaways
- All three are production-ready. Pinecone, Weaviate, and pgvector can all serve real applications. The choice is about your specific trade-offs, not about quality.
- The two-database problem is real. Pinecone and Weaviate both require a separate relational database. pgvector eliminates this entirely. Do not underestimate how much complexity this removes.
- Cost compounds. A dedicated vector database plus a relational database plus synchronization logic is expensive in both money and engineering time. pgvector costs you nothing extra.
- pgvector's scale ceiling is higher than you think. With proper HNSW tuning, partitioning, and a well-provisioned Supabase instance, 5 million vectors at sub-100ms latency is achievable. Most products never need more.
- Hybrid search can be built on pgvector. Combine `ts_rank` for full-text relevance with cosine similarity for semantic matching. It is not as elegant as Weaviate's built-in solution, but it works.
- Start boring, upgrade when necessary. The best architecture is the one you can ship today and evolve tomorrow. pgvector lets you ship vector search without adding a single new service to your stack.
Need help choosing the right vector database for your project or building a production RAG pipeline? Check out my services — I have shipped AI-powered search for startups and enterprises across multiple vector database stacks.
*Written by Uvin Vindula — full-stack engineer and AI builder based between Sri Lanka and the UK. I build production AI systems, Web3 applications, and everything in between. Follow my work at @IAMUVIN or explore more at uvin.lk.*