HNSW Vector Index on unified_rag_content.embedding

Decision

Add an HNSW ANN (Approximate Nearest Neighbor) index to public.unified_rag_content(embedding) using vector_cosine_ops.

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_unified_rag_content_embedding
  ON public.unified_rag_content
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

Context

The unified_rag_content table holds ~327K rows and is the backing store for both the SermonWise AI RAG pipeline and the ChurchWiseAI chatbot RAG retrieval path. The DBA-audit subagent (2026-05-05) confirmed that no ANN vector index existed — every cosine similarity search without a pericope ILIKE pre-filter was executing an O(N) exact KNN scan across all 327K embedding vectors.

At this scale, exact KNN scans take approximately 600–700ms p90. This is a silent performance drain: queries "work" but are an order of magnitude slower than they need to be, and the cost compounds as the table grows.

Why HNSW over IVFFlat

Criterion	HNSW	IVFFlat
Training step required	No	Yes (`CREATE INDEX` requires a pre-built centroid list via `VACUUM ANALYZE`)
Recall @ ef_search=40	~99%	~95–97%
Sensitivity to bulk inserts	Low (inserts re-wire graph incrementally)	High (bulk inserts degrade recall until VACUUM+ANALYZE re-clusters)
Build time for ~327K rows	~5–15 min	Similar
Memory at query time	Higher (`ef_search` parameter trades RAM for recall)	Lower

HNSW is the better choice for this workload because:

No training step — index can be created live with CONCURRENTLY
Better recall at the same query latency budget
unified_rag_content receives continuous incremental inserts (new church KB entries, sermon illustrations); IVFFlat recall degrades between VACUUM cycles, HNSW does not
pgvector HNSW support is stable as of pgvector ≥ 0.5.0

Build Parameters

Parameter	Value	Rationale
`m`	16	Connections per node. Standard for 300K-1M vectors. Higher = better recall, more RAM
`ef_construction`	64	Build-time search depth. Standard for ~300K vectors. Higher = better graph quality, slower build
`ef_search`	(default 40)	Can be tuned per-query via `SET hnsw.ef_search = N` if recall vs. latency tradeoff needs adjustment

Expected Impact

Metric	Before	After
RAG query p90 (unfiltered cosine search)	~650ms	<50ms
RAG query p90 (with pericope pre-filter)	Already using btree on pericope; unchanged
Table write throughput	Unchanged (index maintained incrementally)
Table read throughput during build	Unchanged (CONCURRENTLY guarantee)

Application Notes

Critical: CREATE INDEX CONCURRENTLY cannot run inside a transaction. The Supabase SQL Editor wraps all statements in a transaction by default — do NOT use the SQL Editor unless you uncheck "Wrap in transaction."

Preferred application method:

psql "postgresql://postgres.wrwkszmobuhvcfjipasi:<DB_PASSWORD>@aws-0-<region>.pooler.supabase.com:5432/postgres" \
  -f churchwiseai-web/migrations/2026-05-05_unified_rag_hnsw_index.sql

The DB password is available in the Supabase dashboard under Project Settings → Database → Connection string.

Alternative: Supabase dashboard → SQL Editor → uncheck "Wrap in transaction" checkbox → paste the CREATE INDEX CONCURRENTLY ... statement only.

Monitor build progress:

SELECT * FROM pg_stat_progress_create_index;

Post-Build Verification

After the index appears in pg_indexes, verify the query planner uses it:

-- Get a real probe vector first:
SELECT embedding FROM unified_rag_content WHERE embedding IS NOT NULL LIMIT 1;

-- Then run EXPLAIN ANALYZE with that vector:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, embedding <=> '<probe-vector>'::vector AS dist
FROM unified_rag_content
ORDER BY dist
LIMIT 6;
-- Expected: "Index Scan using idx_unified_rag_content_embedding"
-- Expected runtime: <50ms

Migration File

churchwiseai-web/migrations/2026-05-05_unified_rag_hnsw_index.sql

Decision​

Context​

Why HNSW over IVFFlat​

Build Parameters​

Expected Impact​

Application Notes​

Post-Build Verification​

Migration File​