Chatbot Unified Endpoint

Summary

All church chatbot traffic flows through a single production endpoint: POST /api/chatbot/stream (Vercel AI SDK 6, streaming SSE). The legacy /api/chatbot/chat endpoint was deleted on 2026-04-09 after it caused $10–30/day in surprise Sonnet escalation bills during QA runs; its Sonnet fallback paths were silently triggered far more often than intended. The unified endpoint uses Claude Haiku 4.5 as the primary LLM (direct Anthropic provider, no gateway — gateway was removed to eliminate markup costs), with OpenAI gpt-4o-mini as an automatic failover. At request time it queries product_knowledge for current pricing and feature facts, and unified_rag_content (327K rows, vector search) for theological content. Agent types and available tools are gated by plan tier (Starter = 2 agents, Pro/Suite = 4 agents). A secondary route at /api/chatbot/unified handles support and demo routing, then delegates church traffic to /stream.
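
The primary/failover arrangement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `src/lib/llm-provider.ts`: the `Provider` type, `completeWithFailover`, and the two stub providers are hypothetical names, real calls would go through the Vercel AI SDK rather than plain functions, and the sketch treats any thrown error as "provider unavailable". The 503 on double failure follows the behavior this document states.

```typescript
// Hypothetical provider shape: any async function from prompt to reply text.
type Provider = { name: string; complete: (prompt: string) => Promise<string> };

type FailoverResult =
  | { status: 200; provider: string; text: string }
  | { status: 503; error: string };

// Try providers in order; the first success wins.
// Assumption: any thrown error counts as "provider unavailable".
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<FailoverResult> {
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { status: 200, provider: provider.name, text };
    } catch {
      // Fall through to the next provider in the list.
    }
  }
  return { status: 503, error: "All LLM providers unavailable" };
}

// Stubs standing in for Claude Haiku 4.5 (primary) and gpt-4o-mini (failover).
const anthropicDown: Provider = {
  name: "anthropic",
  complete: async () => {
    throw new Error("upstream outage");
  },
};
const openaiUp: Provider = {
  name: "openai",
  complete: async (prompt) => `echo: ${prompt}`,
};
```

Calling `completeWithFailover([anthropicDown, openaiUp], "hi")` resolves with the OpenAI stub's reply, while passing only failing providers resolves with the 503 variant. Keeping the provider order in a plain array makes the "failover, not primary" rule explicit at the call site.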

Flow

Phase tracker

Phase 1 — Tier-gated unified endpoint: one POST handler, all church traffic, tool gating by plan (merged ~2026-03-15)
Phase 2 — HEAR protocol + agent type specialization: Starter=2 agents, Pro/Suite=4 agents, persona templates, domain rulesets (merged 2026-04-01)
Phase 3 — Direct-provider migration: legacy /api/chatbot/chat deleted, AI gateway removed, direct Anthropic + OpenAI failover, Sonnet escalation cost eliminated (merged 2026-04-09)
Phase 4 — Tool deferral pattern refinement: smarter tool-use round management, deferred-tool eligibility rules (in progress)

Code files

File	Role
`src/app/api/chatbot/stream/route.ts`	PRODUCTION endpoint — 16-stage pipeline, all chatbot types
`src/app/api/chatbot/unified/route.ts`	Support/demo router — delegates church traffic to `/stream`
`src/lib/agent-type-config.ts`	Source of truth for agent counts and tier gating (`TIER_AGENTS`)
`src/lib/agent-prompts.ts`	Per-agent system prompt construction
`src/lib/chatbot-tools.ts`	Tool definitions and execution handlers
`src/lib/tool-config.ts`	`TOOL_AGENT_MAP` — which tools each agent type can use
`src/lib/tool-deferral.ts`	Deferred tool eligibility rules (Phase 4)
`src/lib/tier-config.ts`	`normalizePlanTier()` — raw plan key → starter/pro/suite
`src/lib/tier-features.ts`	Per-tier feature flags and limits
`src/lib/rag.ts`	Vector search against `unified_rag_content` and `church_knowledge_base`
`src/lib/semantic-cache.ts`	Semantic response cache (reduces repeat LLM calls)
`src/lib/faq-matcher.ts`	Exact + fuzzy FAQ matching against `chatbot_faqs`
`src/lib/llm-provider.ts`	Provider abstraction — Anthropic primary, OpenAI fallback
`src/lib/response-cascade.ts`	HEAR protocol response structure
`src/lib/escalation.ts`	`shouldEscalate()` — crisis/complexity/emotional depth detection
`src/lib/pastoral-care-library.ts`	Pre-built care response library (tradition-calibrated)
`src/lib/persona-templates.ts`	Chatbot persona system prompts per agent type
`src/lib/lens-vocabulary.ts`	Tradition-specific word preference lists
`src/lib/moderation.ts`	`checkRestriction()` — active cooldown/block lookup
`src/lib/content-moderation.ts`	Content violation detection
`src/lib/chatbot-provision.ts`	Chatbot provisioning helpers
`src/lib/chatbot-stream-client.ts`	Client-side streaming interface
`src/lib/rate-limit.ts`	IP-based rate limiter (30 req/60s)
`src/lib/usage-tracking.ts`	Per-message token and cost tracking
`src/lib/latency-tracker.ts`	Pipeline stage latency instrumentation
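
As an illustration of the 30 req/60s limit listed for `src/lib/rate-limit.ts`, here is a minimal fixed-window sketch. The class name `FixedWindowRateLimiter`, the in-memory `Map`, and the choice of a fixed window are all assumptions; the real limiter may use a sliding window or an external store.

```typescript
// Fixed-window limiter sketch: at most `limit` requests per `windowMs` per IP.
// Assumption: in-memory state, which would not be shared across server instances.
class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit = 30, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const current = this.windows.get(ip);
    if (!current || now - current.start >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

A handler would call `limiter.allow(clientIp)` before any pipeline work and reject the request when it returns `false`. Injecting `now` keeps the window logic testable without real timers.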

Tests

No critical_path: true entry exists in knowledge/tests/registry.yaml for the chatbot endpoint yet.

TODO: Create a chatbot-live-response critical-path entry in registry.yaml with a Playwright spec that sends a message to a demo church and verifies a streaming response is received. This is a Phase 4 deliverable. Until then, the chatbot pipeline is NOT protected by the CI gate.

Existing coverage:

knowledge/tests/personas/ — persona-based prompt tests (exhaustive; see persona-test-prompts.md)
knowledge/tests/results/ — historical persona test runs
src/lib/__tests__/prompt-behavior-rules.test.ts — HEAR protocol compliance unit tests
src/app/api/test/chatbot/route.ts — internal test harness for chatbot QA

Decisions

2026-04-09-chatbot-chat-legacy-deleted — /api/chatbot/chat deleted after Sonnet escalation paths burned $10–30/day. All chatbot work now MUST target /api/chatbot/stream. See memory feedback_chatbot_stream_is_production.md.
2026-04-01-chatbot-agent-architecture — Chatbot formalized at 4 marketing agent types (Care, Coordinator, Discipleship, Stewardship) for Pro/Suite; Starter gets 2 (Care + Coordinator). Voice deliberately stays at 2 (phone calls can't route to 4 without menu systems). See memory project_agent_architecture.md.
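
The tier gating recorded in these decisions can be sketched as a lookup keyed by normalized tier. `TIER_AGENTS` and `normalizePlanTier()` are names from this document, but the bodies below are assumptions: the suffix-based normalization rule, the lowercase agent identifiers, the `isAgentAllowed` helper, and the fail-closed handling of unknown plan keys are all hypothetical.

```typescript
type Tier = "starter" | "pro" | "suite";
type AgentType = "care" | "coordinator" | "discipleship" | "stewardship";

// Mirrors the counts stated in this document: Starter = 2, Pro/Suite = 4.
const TIER_AGENTS: Record<Tier, readonly AgentType[]> = {
  starter: ["care", "coordinator"],
  pro: ["care", "coordinator", "discipleship", "stewardship"],
  suite: ["care", "coordinator", "discipleship", "stewardship"],
};

// Assumption: raw plan keys end in "_<tier>" (e.g. "cwa_starter", "ps_starter").
// The real normalizePlanTier() may use an explicit lookup table instead.
function normalizePlanTier(rawPlanKey: string): Tier | null {
  const suffix = rawPlanKey.trim().toLowerCase().split("_").pop();
  return suffix === "starter" || suffix === "pro" || suffix === "suite"
    ? suffix
    : null;
}

// Gate an agent by tier. Unknown plan keys get no agents (fail closed, an assumption).
function isAgentAllowed(rawPlanKey: string, agent: AgentType): boolean {
  const tier = normalizePlanTier(rawPlanKey);
  return tier !== null && TIER_AGENTS[tier].includes(agent);
}
```

With this sketch, `isAgentAllowed("cwa_starter", "discipleship")` is `false` and `isAgentAllowed("ps_starter", "care")` is `true`. Comparing the raw key directly (for example `plan === "starter"`) would be `false` for both keys, which is the silent failure that normalization is meant to prevent.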

Gotchas

/api/chatbot/stream is the ONLY production chatbot endpoint. There is no other. /api/chatbot/chat was deleted 2026-04-09. Never optimize, modify, or reference /chat again. Any agent that touches chatbot code MUST verify it is editing stream/route.ts.
Chat has 4 agents, Voice has 2 — never confuse them. agent-type-config.ts is the source of truth (TIER_AGENTS). Starter=2 (Care + Coordinator), Pro/Suite=4 (+ Discipleship + Stewardship). Discipleship and Stewardship are REAL chatbot-only agents with full persona mappings, tier gating, and per-church admin toggles. They are NOT in the voice agent Python code because voice doesn't use them. An April 2026 QA agent incorrectly removed them from the pricing page calling them "phantom" — reverted same day.
Always call normalizePlanTier() before comparing plan tiers. Raw plan keys are globally unique strings (cwa_starter, ps_starter, etc.). Direct string comparison breaks silently. The normalizer maps all variants to starter, pro, or suite. Five components broke when this was missed after the plan key rename.
The voice-prefix tables are shared between chat and voice. voice_prayer_requests, voice_callback_requests, and voice_visitor_contacts store data for BOTH channels despite the voice_ prefix (legacy naming). Each row has a source column: 'voice' (voice agent), 'chat' (CWA chatbot), 'pewsearch' (PewSearch chatbot). Always write source='chat' from chatbot tools. Always filter by source when querying for channel-specific analytics.
Agents are a caring bridge to a real person — not the destination. Hard boundaries for all agents: do NOT pray with people, do NOT offer solutions, do NOT take confessions, do NOT counsel, do NOT quote scripture prescriptively, do NOT give theological opinions on contested topics (LGBTQ+, divorce, end-of-life), do NOT give medical advice. DO: listen, empathize, bridge to humans, log for church follow-up. Founder is CPE-trained and a Stephen Leader — the Stephen Ministry model is the reference for all care boundaries.
NEVER produce spiritual harm. No theological judgments about LGBTQ+ identities, no attributing conditions to demons, no sinfulness framing. Always bridge to humans. This is the founder's deepest conviction and a P0 absolute rule.
No "luck/lucky" language anywhere. Use "blessed," "called," or "guided" instead. God's providence, not chance. Applies to all chatbot responses, system prompts, and any generated copy.
Crisis safety net is non-negotiable and runs on ALL chatbot types. After the LLM returns a response, a deterministic regex check runs on every message. If self-harm patterns are detected and the response is missing any of [988, 741741, 911], the safety net auto-appends the full crisis resource block. This cannot be disabled. It runs on basic, pro_website, and full chatbot responses alike.
PewSearch auto-provisioned chatbots get a restricted "basic" chatbot. Detection is via organization_settings.chatbot_source = 'pewsearch_auto_provision'. These only get 1 tool (prayer request submission) and a Q&A receptionist prompt. They upsell ChurchWiseAI after prayer interactions.
OpenAI is a failover, not a primary. Claude Haiku 4.5 via direct Anthropic provider is primary. OpenAI gpt-4o-mini activates only on Anthropic downtime. The AI gateway was removed in Phase 3 to eliminate markup costs. If both providers are unavailable, the endpoint returns 503.
unified_rag_content has 327K irreplaceable records. Never bulk delete. Always paginate vector search results, because Supabase returns at most 1,000 rows per request by default.
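
The crisis safety net described in the gotchas above is a deterministic post-processing step, which can be sketched as below. The regex patterns, the wording of the crisis block, and the function name `applyCrisisSafetyNet` are placeholders for illustration; the production pattern list and resource text are not shown in this document. The sketch also assumes the self-harm check runs against the user's message.

```typescript
// Placeholder patterns for illustration only; the production list is not shown here.
const SELF_HARM_PATTERNS: RegExp[] = [
  /\bkill(ing)? myself\b/i,
  /\bsuicid(e|al)\b/i,
  /\bend my life\b/i,
];

// The three resources this document says every flagged response must contain.
const REQUIRED_RESOURCES = ["988", "741741", "911"];

// Hypothetical wording; the real crisis resource block may differ.
const CRISIS_BLOCK = [
  "If you are in immediate danger, call 911.",
  "You can call or text 988 (Suicide & Crisis Lifeline) at any time.",
  "You can also text HOME to 741741 (Crisis Text Line).",
].join("\n");

// Deterministic post-LLM check: no model call, just regex and substring tests.
function applyCrisisSafetyNet(userMessage: string, llmResponse: string): string {
  const flagged = SELF_HARM_PATTERNS.some((p) => p.test(userMessage));
  if (!flagged) return llmResponse;
  // Append the full block if ANY required resource is missing.
  const complete = REQUIRED_RESOURCES.every((r) => llmResponse.includes(r));
  return complete ? llmResponse : `${llmResponse}\n\n${CRISIS_BLOCK}`;
}
```

Because the function is pure and has no disable flag, it can run unconditionally as the last step for every chatbot type. A response that mentions only 988 still gets the full block appended, matching the "missing any of" rule.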

Recent activity

2026-04-09 — /api/chatbot/chat deleted; Phase 3 (direct-provider migration) complete.
2026-04-14 — This document created as canonical process record.
