Chatbot Unified Endpoint
Summary
All church chatbot traffic flows through a single production endpoint: POST /api/chatbot/stream
(Vercel AI SDK 6, streaming SSE). The legacy /api/chatbot/chat endpoint was deleted on
2026-04-09 after it caused $10–30/day in surprise Sonnet escalation bills during QA runs; its
Sonnet fallback paths were silently triggered far more often than intended. The unified endpoint
uses Claude Haiku 4.5 as the primary LLM (direct Anthropic provider, no gateway — gateway was
removed to eliminate markup costs), with OpenAI gpt-4o-mini as an automatic failover. At
request time it queries product_knowledge for current pricing and feature facts, and
unified_rag_content (327K rows, vector search) for theological content. Agent types and
available tools are gated by plan tier (Starter = 2 agents, Pro/Suite = 4 agents). A secondary
route at /api/chatbot/unified handles support and demo routing, then delegates church traffic
to /stream.
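The primary/failover arrangement described above can be sketched roughly as follows. This is a minimal illustration and not the contents of src/lib/llm-provider.ts: the `Provider` type, `completeWithFailover`, and `AllProvidersDownError` are hypothetical names, and the real endpoint streams tokens rather than returning a whole string. The 503 status for a double outage follows the behavior recorded under Gotchas.

```typescript
// Hypothetical provider shape: a label plus an async completion call.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Raised when every provider in the chain has failed; carries the HTTP
// status the route handler would return.
class AllProvidersDownError extends Error {
  status = 503;
  causes: unknown[];
  constructor(causes: unknown[]) {
    super("All LLM providers unavailable");
    this.name = "AllProvidersDownError";
    this.causes = causes;
  }
}

// Try providers in order (primary first) and return the first success.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const causes: unknown[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.complete(prompt) };
    } catch (err) {
      causes.push(err); // remember the failure and fall through to the next provider
    }
  }
  throw new AllProvidersDownError(causes);
}
```

Under these assumptions the caller passes the Anthropic provider first and the OpenAI provider second, so the fallback is only reached when the primary call throws.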
Flow
Phase tracker
- Phase 1 — Tier-gated unified endpoint: one POST handler, all church traffic, tool gating by plan (merged ~2026-03-15)
- Phase 2 — HEAR protocol + agent type specialization: Starter=2 agents, Pro/Suite=4 agents, persona templates, domain rulesets (merged 2026-04-01)
- Phase 3 — Direct-provider migration: legacy /api/chatbot/chat deleted, AI gateway removed, direct Anthropic + OpenAI failover, Sonnet escalation cost eliminated (merged 2026-04-09)
- Phase 4 — Tool deferral pattern refinement: smarter tool-use round management, deferred-tool eligibility rules (in progress)
Code files
| File | Role |
|---|---|
| src/app/api/chatbot/stream/route.ts | PRODUCTION endpoint — 16-stage pipeline, all chatbot types |
| src/app/api/chatbot/unified/route.ts | Support/demo router — delegates church traffic to /stream |
| src/lib/agent-type-config.ts | Source of truth for agent counts and tier gating (TIER_AGENTS) |
| src/lib/agent-prompts.ts | Per-agent system prompt construction |
| src/lib/chatbot-tools.ts | Tool definitions and execution handlers |
| src/lib/tool-config.ts | TOOL_AGENT_MAP — which tools each agent type can use |
| src/lib/tool-deferral.ts | Deferred tool eligibility rules (Phase 4) |
| src/lib/tier-config.ts | normalizePlanTier() — raw plan key → starter/pro/suite |
| src/lib/tier-features.ts | Per-tier feature flags and limits |
| src/lib/rag.ts | Vector search against unified_rag_content and church_knowledge_base |
| src/lib/semantic-cache.ts | Semantic response cache (reduces repeat LLM calls) |
| src/lib/faq-matcher.ts | Exact + fuzzy FAQ matching against chatbot_faqs |
| src/lib/llm-provider.ts | Provider abstraction — Anthropic primary, OpenAI fallback |
| src/lib/response-cascade.ts | HEAR protocol response structure |
| src/lib/escalation.ts | shouldEscalate() — crisis/complexity/emotional depth detection |
| src/lib/pastoral-care-library.ts | Pre-built care response library (tradition-calibrated) |
| src/lib/persona-templates.ts | Chatbot persona system prompts per agent type |
| src/lib/lens-vocabulary.ts | Tradition-specific word preference lists |
| src/lib/moderation.ts | checkRestriction() — active cooldown/block lookup |
| src/lib/content-moderation.ts | Content violation detection |
| src/lib/chatbot-provision.ts | Chatbot provisioning helpers |
| src/lib/chatbot-stream-client.ts | Client-side streaming interface |
| src/lib/rate-limit.ts | IP-based rate limiter (30 req/60s) |
| src/lib/usage-tracking.ts | Per-message token and cost tracking |
| src/lib/latency-tracker.ts | Pipeline stage latency instrumentation |
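To make the "30 req/60s" limit listed for src/lib/rate-limit.ts concrete, here is a minimal sliding-window sketch. It is an assumption-laden illustration: the document does not say whether the real limiter uses a sliding or fixed window, nor where it keeps state, and `allowRequest` and the in-memory `hits` map are hypothetical.

```typescript
// Limits taken from the table entry for rate-limit.ts.
const MAX_REQUESTS = 30;
const WINDOW_MS = 60_000;

// ASSUMPTION: in-memory store mapping IP -> timestamps of accepted requests.
// A deployed limiter would more likely use a shared store.
const hits = new Map<string, number[]>();

// Returns true if the request is allowed, false if the IP is over its limit.
// `now` is injectable so the behaviour can be exercised without real waiting.
function allowRequest(ip: string, now: number = Date.now()): boolean {
  // Keep only timestamps that are still inside the 60-second window.
  const recent = (hits.get(ip) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= MAX_REQUESTS) {
    hits.set(ip, recent);
    return false; // rejected requests are not recorded
  }
  recent.push(now);
  hits.set(ip, recent);
  return true;
}
```

A sliding window avoids the burst at window boundaries that a fixed-window counter permits, at the cost of storing up to 30 timestamps per IP.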
Tests
No critical_path: true entry exists in knowledge/tests/registry.yaml for the chatbot endpoint yet.
TODO: Create a chatbot-live-response critical-path entry in registry.yaml with a Playwright spec that sends a message to a demo church and verifies a streaming response is received. This is a Phase 4 deliverable. Until then, the chatbot pipeline is NOT protected by the CI gate.
Existing coverage:
- knowledge/tests/personas/ — persona-based prompt tests (exhaustive; see persona-test-prompts.md)
- knowledge/tests/results/ — historical persona test runs
- src/lib/__tests__/prompt-behavior-rules.test.ts — HEAR protocol compliance unit tests
- src/app/api/test/chatbot/route.ts — internal test harness for chatbot QA
Decisions
- 2026-04-09-chatbot-chat-legacy-deleted — /api/chatbot/chat deleted after Sonnet escalation paths burned $10–30/day. All chatbot work now MUST target /api/chatbot/stream. See memory feedback_chatbot_stream_is_production.md.
- 2026-04-01-chatbot-agent-architecture — Chatbot formalized at 4 marketing agent types (Care, Coordinator, Discipleship, Stewardship) for Pro/Suite; Starter gets 2 (Care + Coordinator). Voice deliberately stays at 2 (phone calls can't route to 4 without menu systems). See memory project_agent_architecture.md.
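The tier gating recorded in the agent-architecture decision can be sketched as a lookup keyed by normalized tier. This is a hedged illustration, not the contents of agent-type-config.ts or tier-config.ts: the lowercase agent identifiers, the suffix-based normalizer, the fallback to the most restricted tier, and `isAgentEnabled` are all assumptions. Only the names TIER_AGENTS and normalizePlanTier(), the agent counts, and the cwa_starter / ps_starter examples come from this document.

```typescript
type Tier = "starter" | "pro" | "suite";
type AgentType = "care" | "coordinator" | "discipleship" | "stewardship";

// Mirrors the recorded decision: Starter = 2 agents, Pro/Suite = 4 agents.
const TIER_AGENTS: Record<Tier, AgentType[]> = {
  starter: ["care", "coordinator"],
  pro: ["care", "coordinator", "discipleship", "stewardship"],
  suite: ["care", "coordinator", "discipleship", "stewardship"],
};

// ASSUMPTION: raw plan keys end in the tier name (e.g. "cwa_starter").
// The real normalizePlanTier() may use an explicit lookup table instead.
function normalizePlanTier(rawPlanKey: string): Tier {
  const key = rawPlanKey.trim().toLowerCase();
  if (key.endsWith("suite")) return "suite";
  if (key.endsWith("pro")) return "pro";
  return "starter"; // assumed fallback: unknown keys get the most restricted tier
}

// Gate an agent by plan: always normalize first, never compare raw keys.
function isAgentEnabled(rawPlanKey: string, agent: AgentType): boolean {
  return TIER_AGENTS[normalizePlanTier(rawPlanKey)].includes(agent);
}
```

With this sketch, `isAgentEnabled("cwa_starter", "discipleship")` evaluates to `false`, while the same call with a Pro or Suite plan key evaluates to `true`.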
Gotchas
- /api/chatbot/stream is the ONLY production chatbot endpoint. There is no other. /api/chatbot/chat was deleted 2026-04-09. Never optimize, modify, or reference /chat again. Any agent that touches chatbot code MUST verify it is editing stream/route.ts.
- Chat has 4 agents, Voice has 2 — never confuse them. agent-type-config.ts is the source of truth (TIER_AGENTS). Starter=2 (Care + Coordinator), Pro/Suite=4 (+ Discipleship + Stewardship). Discipleship and Stewardship are REAL chatbot-only agents with full persona mappings, tier gating, and per-church admin toggles. They are NOT in the voice agent Python code because voice doesn't use them. An April 2026 QA agent incorrectly removed them from the pricing page calling them "phantom" — reverted same day.
- Always call normalizePlanTier() before comparing plan tiers. Raw plan keys are globally unique strings (cwa_starter, ps_starter, etc.). Direct string comparison breaks silently. The normalizer maps all variants to starter, pro, or suite. Five components broke when this was missed after the plan key rename.
- The voice-prefix tables are shared between chat and voice. voice_prayer_requests, voice_callback_requests, and voice_visitor_contacts store data for BOTH channels despite the voice_ prefix (legacy naming). Each row has a source column: 'voice' (voice agent), 'chat' (CWA chatbot), 'pewsearch' (PewSearch chatbot). Always write source='chat' from chatbot tools. Always filter by source when querying for channel-specific analytics.
- Agents are a caring bridge to a real person — not the destination. Hard boundaries for all agents: do NOT pray with people, do NOT offer solutions, do NOT take confessions, do NOT counsel, do NOT quote scripture prescriptively, do NOT give theological opinions on contested topics (LGBTQ+, divorce, end-of-life), do NOT give medical advice. DO: listen, empathize, bridge to humans, log for church follow-up. Founder is CPE-trained and a Stephen Leader — the Stephen Ministry model is the reference for all care boundaries.
- NEVER produce spiritual harm. No theological judgments about LGBTQ+ identities, no attributing conditions to demons, no sinfulness framing. Always bridge to humans. This is the founder's deepest conviction and a P0 absolute rule.
- No "luck/lucky" language anywhere. Use "blessed," "called," or "guided" instead. God's providence, not chance. Applies to all chatbot responses, system prompts, and any generated copy.
- Crisis safety net is non-negotiable and runs on ALL chatbot types. After the LLM returns a response, a deterministic regex check runs on every message. If self-harm patterns are detected and the response is missing any of [988, 741741, 911], the safety net auto-appends the full crisis resource block. This cannot be disabled. It runs on basic, pro_website, and full chatbot responses alike.
- PewSearch auto-provisioned chatbots get a restricted "basic" chatbot. Detection is via organization_settings.chatbot_source = 'pewsearch_auto_provision'. These get only 1 tool (prayer request submission) and a Q&A receptionist prompt. They upsell ChurchWiseAI after prayer interactions.
- OpenAI is a failover, not a primary. Claude Haiku 4.5 via direct Anthropic provider is primary. OpenAI gpt-4o-mini activates only on Anthropic downtime. The AI gateway was removed in Phase 3 to eliminate markup costs. If both providers are unavailable, the endpoint returns 503.
- unified_rag_content has 327K irreplaceable records. Never bulk delete. Always paginate vector search results — Supabase default limit is 1000 rows.
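The crisis safety net described in the list above is a deterministic post-processing step, which can be sketched as below. This is only an illustration under stated assumptions: the pattern list and the wording of the crisis block are placeholders (the production versions are not shown in this document), the check is assumed to run against the user's message, "missing any of" is read as "does not contain all three", and `applyCrisisSafetyNet` is a hypothetical name.

```typescript
// ASSUMPTION: illustrative patterns only, not the production list.
const SELF_HARM_PATTERNS: RegExp[] = [
  /\bkill(ing)? myself\b/i,
  /\bsuicid(e|al)\b/i,
  /\bend(ing)? my life\b/i,
  /\bhurt(ing)? myself\b/i,
];

// The three resources the response must mention, per the rule above.
const REQUIRED_RESOURCES = ["988", "741741", "911"];

// ASSUMPTION: placeholder wording for the crisis resource block.
const CRISIS_BLOCK =
  "\n\nIf you are in crisis, help is available right now:\n" +
  "- Call or text 988 (Suicide & Crisis Lifeline)\n" +
  "- Text HOME to 741741 (Crisis Text Line)\n" +
  "- Call 911 if you are in immediate danger";

// Runs after the LLM call; appends the block when a flagged message
// received a response that lacks any required resource.
function applyCrisisSafetyNet(userMessage: string, llmResponse: string): string {
  const flagged = SELF_HARM_PATTERNS.some((re) => re.test(userMessage));
  if (!flagged) return llmResponse;
  const hasAll = REQUIRED_RESOURCES.every((r) => llmResponse.includes(r));
  return hasAll ? llmResponse : llmResponse + CRISIS_BLOCK;
}
```

Because the check is plain string matching rather than another model call, it behaves the same regardless of which provider or chatbot type produced the response.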
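The shared-table rule from the list above (write source='chat', filter by source for analytics) might look like this in application code. The row shape is a hypothetical subset: only the `source` column and its three values are confirmed by this document, and `church_id`, `request_text`, and both helper functions are assumptions made for illustration.

```typescript
// Channel values stored in the source column.
type Source = "voice" | "chat" | "pewsearch";

// ASSUMPTION: hypothetical subset of voice_prayer_requests columns.
interface PrayerRequestRow {
  church_id: string;
  request_text: string;
  source: Source;
}

// Chatbot tools always stamp the row with source = "chat".
function buildChatPrayerRequest(churchId: string, text: string): PrayerRequestRow {
  return { church_id: churchId, request_text: text, source: "chat" };
}

// Channel-specific analytics: keep only rows from one channel.
function filterBySource(rows: PrayerRequestRow[], source: Source): PrayerRequestRow[] {
  return rows.filter((row) => row.source === source);
}
```

Typing `source` as a union rather than a free string means a misspelled channel name fails at compile time instead of silently skewing analytics.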
Recent activity
- 2026-04-09 — /api/chatbot/chat deleted; Phase 3 (direct-provider migration) complete.
- 2026-04-14 — This document created as canonical process record.