Chatbot Architecture

Request Pipeline

Every chatbot message flows through a single POST /api/chatbot/stream handler. The pipeline has 16 stages, several of which can short-circuit and return early (for example, a rate-limit rejection at step 1 or a canned FAQ answer at step 6).
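As a rough illustration of the short-circuit idea, the sketch below runs stages in order and stops at the first one that produces a response. The `Stage`, `StageResult`, `runPipeline`, `validateRequest`, and `respond` names are hypothetical, the error wording is a placeholder, and the stages are synchronous only to keep the example small; the real handler's internals are not shown in this document.

```typescript
// Hypothetical shapes: the real handler's stage signatures are not documented here.
type StageResult<C> =
  | { done: true; status: number; body: unknown } // short-circuit and return now
  | { done: false; ctx: C };                      // continue with the (possibly enriched) context

type Stage<C> = (ctx: C) => StageResult<C>;

interface HttpResult {
  status: number;
  body: unknown;
}

// Runs stages in order and stops at the first one that reports done.
function runPipeline<C>(initial: C, stages: Stage<C>[]): HttpResult {
  let ctx = initial;
  for (const stage of stages) {
    const result = stage(ctx);
    if (result.done) {
      return { status: result.status, body: result.body };
    }
    ctx = result.ctx;
  }
  throw new Error("pipeline finished without a terminal stage");
}

interface ChatRequest {
  message: string;
  churchId: string;
  sessionId: string;
}

// Step 2 (partial): reject oversized messages before any expensive work.
const validateRequest: Stage<ChatRequest> = (ctx) =>
  ctx.message.length > 2000
    ? { done: true, status: 400, body: { error: "message too long" } } // placeholder wording
    : { done: false, ctx };

// Stand-in for the final stage, which always returns a response.
const respond: Stage<ChatRequest> = (ctx) => ({
  done: true,
  status: 200,
  body: { response: `echo: ${ctx.message}` },
});
```

Calling `runPipeline(request, [validateRequest, respond])` returns the 400 result without ever reaching `respond` when the message exceeds 2000 characters.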

Pipeline Pseudocode

POST /api/chatbot/stream

INPUTS: { message, churchId, sessionId, history?, agentType?, lensOverride?, lensNameOverride? }

 1. RATE LIMIT CHECK
    limiter(ip) — 30 requests per 60 seconds per IP
    → 429 if exceeded

 2. VALIDATE REQUEST
    - message, churchId, sessionId are required
    - message.length <= 2000 characters
    - At least one LLM provider key (ANTHROPIC_API_KEY or OPENAI_API_KEY) must be set
    → 400 / 503 on failure

 3. ORIGIN VALIDATION
    - Extract origin from Origin header or Referer
    - Check against trusted origins whitelist:
        churchwiseai.com, sermonwise.ai, sharewiseai.com
        + localhost:3000/3001/3002 in development
    - Log suspicious origins (non-trusted, non-custom-domain)
    - Log requests with no origin at all
    - NOTE: Does not hard-block unknown origins (churches embed on arbitrary domains)

 4. CHURCH VALIDATION (early gate)
    - Query premium_churches: churchId must exist, chatbot_enabled=true, status IN ('active', 'preview')
    - This runs BEFORE any expensive work (RAG, LLM calls) to prevent abuse
    → 404 if church not found or chatbot disabled

 5. MODERATION CHECK
    checkRestriction(churchId, sessionId)
    - Checks moderation_restrictions table for active cooldown/temp_block/permanent_block
    → Returns restriction message if restricted (200 with restricted=true)

 6. FAQ MATCHING (fast path)
    matchFAQ(message, churchId, agentType)
    - Exact match with exact_response=true → return immediately (zero LLM cost, does not count toward usage limits)
    - Fuzzy match → save as faqPreferredContext for LLM injection
    - Failure is non-fatal (logged, pipeline continues)

 7. LOAD CHURCH DATA
    - Query churches table: name, denomination, address, phone, website, working_hours
    - Query premium_churches: plan, custom_name, custom_hours, custom_staff, custom_ministries,
      what_to_expect, cap_info, chatbot_enabled
    → 404 if church not found; 403 if chatbot not enabled

 8. CHECK USAGE LIMITS
    normalizePlanTier(plan) → starter | pro | suite
    checkUsageLimit(churchId, tier)
    - Monthly LLM message limits: starter=200, pro=1000, suite=5000
    - Canned responses do NOT count
    → Returns limitReached message if over limit (200 with limitReached=true)

 9. LOAD CONFIG
    - Query church_voice_agents: pastor_name, callback settings, Cal.com keys,
      sermon topic/series/verse, announcements, giving config
    - Query organization_settings: agent_tool_config, agent_config, chatbot_config
    - Detect chatbot source:
        'pewsearch_auto_provision' → basic chatbot (Q&A only)
        plan='pro_website' and not basic → Pro Website chatbot (restricted)
        else → full chatbot (agentic, all tools)

10. RESOLVE THEOLOGICAL LENS
    Priority chain:
    (1) Client-side override (lensOverride + lensNameOverride, validated: ID 1-17, name <= 50 chars)
    (2) Church setting in church_theological_lenses table
    (3) Denomination auto-detect via DENOMINATION_TO_LENS mapping
    (4) Default: Christocentric (lens ID 10)

    Then fetch:
    - Doctrinal rules from theological_contradictions (must-include/must-exclude terms per doctrine)
    - Church-specific doctrinal overrides from organization_settings
    - Lens vocabulary from lens_knowledge (preferred/avoided terminology)

11. RAG RETRIEVAL (parallel)
    generateEmbedding(message) — OpenAI text-embedding-3-small (1536 dims)
    Promise.all([
      searchRAG(embedding, lensIds=[lensId], matchCount=8),   // theological content
      searchChurchKnowledge(churchId, embedding),              // church-specific KB
    ])
    - Also loads product_knowledge table (shared billing/feature info)
    - Failure is non-fatal (logged, pipeline continues without RAG)

12. BUILD CONTEXT BLOCKS
    Assembles these blocks into the system prompt:
    - Church facts (name, address, denomination, phone, website, hours, staff, ministries, what to expect)
    - Church knowledge base (church-specific FAQs and documents — highest priority)
    - Theological RAG (curated content from unified_rag_content)
    - Product knowledge (PewSearch/ChurchWiseAI feature info)
    - FAQ preferred context (fuzzy-matched FAQ, if any)
    - Doctrinal rules (must-include/must-exclude per doctrine)
    - Lens vocabulary (tradition-specific word choices)
    - Critical local resources (pre-fetched emergency numbers, CAP info)
    - Contact info (church phone, website)

13. ROUTE TO CHATBOT TYPE
    Three branches diverge here based on chatbot source detection from step 9:

    BASIC CHATBOT (PewSearch auto-provision):
    - Simple Q&A receptionist prompt, 1 tool (prayer request only)
    - Max 300 tokens, temperature 0.3
    - Scope: church facts only, redirect off-topic, upsell ChurchWiseAI after prayer

    PRO WEBSITE CHATBOT:
    - Richer than basic, still restricted vs full
    - 1 tool (prayer request only), theology-aware
    - Upsells advanced tools without saying "upgrade" or "pay more"
    - Max 400 tokens, temperature 0.3, up to 2 tool-use rounds

    FULL CHATBOT:
    - Full HEAR protocol, all enabled tools filtered by agent type
    - Agent specialization layer (persona, domain ruleset, personality, handoff rules)
    - Dynamic tool filtering, temperature/max_tokens from agent config
    - Up to 3 tool-use rounds

14. FULL CHATBOT: LLM TOOL-USE LOOP
    (See "Two-Call LLM Pattern" below for detail)

    resolve agent config (personality overrides, handoff rules)
    build agent system prompt (base + persona + domain ruleset + personality + tool instructions + safety)
    filter tools for agent type (TOOL_AGENT_MAP)
    determine escalation (shouldEscalate — checks for crisis, complexity, emotional depth)

    for round = 0 to MAX_ROUNDS (3):
      callLLM(systemPrompt, messages, tools=(if round < MAX_ROUNDS), temperature, maxTokens, escalate)
      if response.toolCalls.length > 0 and round < MAX_ROUNDS:
        for each tool call:
          executeTool(name, args, context) → string result
          log to tool_invocations (fire-and-forget)
        append assistant message with tool_use blocks
        append user message with tool_result blocks
        continue
      if response.text:
        finalText = response.text
        break
      // Empty text fallback chain:
      retry with clean messages (no tool artifacts)
      retry with escalated model (Sonnet)
      crisis-specific or conversational fallback text

15. CRISIS SAFETY NET (post-processing)
    NON-NEGOTIABLE — runs on ALL chatbot types (basic, pro_website, full)
    if message matches self-harm/crisis regex patterns:
      if response missing any of [988, 741741, 911]:
        auto-append full crisis resource block
      if LLM did not call flag_safety_concern:
        auto-execute flag_safety_concern(level='urgent') as system_safety_net
      log violation and auto-escalate (moderation system)

16. TRACK USAGE + RETURN RESPONSE
    trackUsage(churchId, sessionId, responseSource, tokens, model)
    upsert chatbot_conversations (fire-and-forget)
    increment_conversation_counts via RPC (fire-and-forget)
    log to response_reviews for admin training review (fire-and-forget)
    return JSON response
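
The lens priority chain in step 10 can be sketched as a single function. This is a minimal illustration under stated assumptions: `LensInputs`, `churchLensId`, and `resolveLensId` are hypothetical names, the denomination mapping is passed in because the contents of DENOMINATION_TO_LENS are not listed here, and the sketch assumes an override is honored only when both the ID and the name are present and valid.

```typescript
const DEFAULT_LENS_ID = 10; // Christocentric, per step 10

interface LensInputs {
  lensOverride?: number;
  lensNameOverride?: string;
  churchLensId?: number | null; // hypothetical: value read from church_theological_lenses
  denomination?: string | null;
}

function resolveLensId(
  input: LensInputs,
  denominationToLens: Record<string, number>, // stand-in for DENOMINATION_TO_LENS
): number {
  const { lensOverride, lensNameOverride } = input;

  // (1) Client-side override, validated: ID 1-17, name <= 50 chars.
  if (
    lensOverride !== undefined &&
    lensNameOverride !== undefined &&
    Number.isInteger(lensOverride) &&
    lensOverride >= 1 &&
    lensOverride <= 17 &&
    lensNameOverride.length <= 50
  ) {
    return lensOverride;
  }

  // (2) Church setting.
  if (input.churchLensId != null) {
    return input.churchLensId;
  }

  // (3) Denomination auto-detect, then (4) the default lens.
  const mapped =
    input.denomination != null ? denominationToLens[input.denomination] : undefined;
  return mapped ?? DEFAULT_LENS_ID;
}
```

An invalid override (for example, ID 99) is ignored rather than rejected in this sketch, so resolution falls through to the church setting; whether the real handler rejects such requests instead is not stated above.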

Two-Call LLM Pattern

The chatbot uses a multi-round tool-use loop with up to 3 tool-enabled rounds for the full chatbot and 2 for Pro Website. For the full chatbot, the rounds are:

Round 0: LLM call WITH tool definitions
  → Model returns tool_use blocks (e.g., submit_prayer_request, get_church_directions)
  → Execute tools, embed results as tool_result blocks in message history
Round 1: LLM call WITH tool definitions (may call more tools)
  → Execute additional tools if requested
Round 2: LLM call WITH tool definitions (final tool round)
  → Last chance for tool calls
Round 3: LLM call WITHOUT tool definitions (forced text)
  → Model must produce text response using all accumulated tool results

Each round preserves the full conversation state. The assistant message includes both text fragments and tool_use blocks. Tool results come back as tool_result blocks referencing the tool_use_id from the preceding assistant message. This structured format is required by Anthropic's API.

The loop exits early when the model returns text without requesting tools. Most conversations complete in 1-2 rounds.
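
The loop's control flow can be sketched as follows. This is a simplified illustration, not the production code: `ChatMessage`, `CallLLM`, `ExecuteTool`, and `runToolLoop` are hypothetical names, the message shape is reduced to the fields the loop needs, and the calls are synchronous for brevity where a real handler would await them.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ToolResult {
  toolUseId: string; // references the tool_use id from the preceding assistant message
  result: string;
}

// Reduced message shape (assumption): text, tool_use blocks, or tool_result blocks.
interface ChatMessage {
  role: "user" | "assistant";
  text?: string;
  toolUse?: ToolCall[];
  toolResults?: ToolResult[];
}

interface LLMResponse {
  text: string;
  toolCalls: ToolCall[];
}

type CallLLM = (messages: ChatMessage[], toolsEnabled: boolean) => LLMResponse;
type ExecuteTool = (call: ToolCall) => string;

const MAX_TOOL_ROUNDS = 3;

function runToolLoop(initial: ChatMessage[], callLLM: CallLLM, executeTool: ExecuteTool): string {
  const history: ChatMessage[] = [...initial];
  for (let round = 0; round <= MAX_TOOL_ROUNDS; round++) {
    // The final round omits tool definitions, which forces a text answer.
    const toolsEnabled = round < MAX_TOOL_ROUNDS;
    const response = callLLM(history, toolsEnabled);

    if (toolsEnabled && response.toolCalls.length > 0) {
      const toolResults = response.toolCalls.map((call) => ({
        toolUseId: call.id,
        result: executeTool(call),
      }));
      history.push({ role: "assistant", text: response.text, toolUse: response.toolCalls });
      history.push({ role: "user", toolResults });
      continue;
    }

    // Text without tool requests ends the loop; an empty string is left to the caller's recovery path.
    return response.text;
  }
  return ""; // not reached in practice: the last round always returns
}
```

A model that requests tools on every tool-enabled round produces four LLM calls in total: three with tools and one without.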

Empty Text Recovery

If the LLM returns empty text (a known edge case when Haiku processes tool_result artifacts without tool definitions), the system follows a three-step recovery:

1. Retry with clean messages: strips tool_use/tool_result artifacts and re-sends the original history plus the current message
2. Escalate to Sonnet: the more capable model handles short follow-ups better
3. Hardcoded fallback: crisis-specific text (with mandatory resources) if a crisis was detected, otherwise a conversational prompt to continue
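
The recovery chain above could look roughly like this. All names (`HistoryMessage`, `stripToolArtifacts`, `RecoveryOptions`, `recoverEmptyText`) are hypothetical, the two retries are passed in as synchronous callbacks for brevity, and the fallback strings are supplied by the caller because their actual wording is not given here.

```typescript
// Reduced message shape (assumption): tool blocks are modeled as optional arrays.
interface HistoryMessage {
  role: "user" | "assistant";
  text?: string;
  toolUse?: unknown[];
  toolResults?: unknown[];
}

// Helper for the first retry: drop messages that carry tool_use or tool_result blocks.
function stripToolArtifacts(history: HistoryMessage[]): HistoryMessage[] {
  return history.filter((m) => m.toolUse === undefined && m.toolResults === undefined);
}

interface RecoveryOptions {
  retryClean: () => string;       // re-send with stripped history
  retryEscalated: () => string;   // re-send using the escalated model
  crisisDetected: boolean;
  crisisFallback: string;         // must include the mandatory crisis resources
  conversationalFallback: string;
}

// Tries each retry in order and returns the first non-empty text,
// falling back to hardcoded text when both come back empty.
function recoverEmptyText(opts: RecoveryOptions): string {
  for (const attempt of [opts.retryClean, opts.retryEscalated]) {
    const text = attempt().trim();
    if (text.length > 0) {
      return text;
    }
  }
  return opts.crisisDetected ? opts.crisisFallback : opts.conversationalFallback;
}
```

Because the retries are evaluated lazily, the escalated model is only called when the clean retry also returns empty text.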

Error Handling

Any error that is not handled within a stage is caught at the top level of the POST handler:

- Errors are logged to console and reported via reportError() to the ops-reporter system
- A generic user-facing message is returned: "Sorry, something went wrong. Please try again." with HTTP 500
- Non-fatal failures (RAG, FAQ matching, product knowledge, doctrinal rules, lens vocabulary, tool logging, conversation tracking) are caught individually and logged as warnings without aborting the pipeline
- Fire-and-forget operations (usage tracking, conversation upserts, response review logging, tool invocation logging) use .catch() to prevent unhandled rejections
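
The two layers of error handling described here can be sketched as a pair of wrappers. This is an illustrative outline only: `nonFatal`, `handleRequest`, and `HandlerResult` are hypothetical names, the `reportError` parameter stands in for the real reportError() whose signature is not shown, and the functions are synchronous for brevity.

```typescript
const GENERIC_ERROR = "Sorry, something went wrong. Please try again.";

// Inner layer: run an optional step, log a warning on failure, and continue with a fallback value.
function nonFatal<T>(
  label: string,
  fn: () => T,
  fallback: T,
  warn: (msg: string) => void,
): T {
  try {
    return fn();
  } catch (err) {
    warn(`${label} failed: ${err instanceof Error ? err.message : String(err)}`);
    return fallback;
  }
}

interface HandlerResult {
  status: number;
  body: unknown;
}

// Outer layer: anything that escapes the pipeline becomes a generic HTTP 500.
function handleRequest(
  run: () => HandlerResult,
  reportError: (err: unknown) => void, // stand-in for the ops-reporter hook
): HandlerResult {
  try {
    return run();
  } catch (err) {
    reportError(err);
    return { status: 500, body: { error: GENERIC_ERROR } };
  }
}
```

For example, wrapping RAG retrieval as `nonFatal("RAG", search, [], console.warn)` lets the pipeline continue with an empty result set, while a failure in a required step still surfaces as the generic 500.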

Response Format

Standard Response

{
  "response": "The chatbot's text response",
  "model": "claude-haiku-4-5-20251001",
  "provider": "anthropic",
  "rag": {
    "theological_hits": 5,
    "church_kb_hits": 2,
    "embedding_generated": true,
    "faq_matched": false
  }
}

FAQ Short-Circuit Response

{
  "response": "The canned FAQ answer",
  "source": "canned",
  "cannedResponseId": "uuid"
}

Basic / Pro Website Response

{
  "response": "Response text",
  "source": "basic_chatbot" | "pro_website",
  "upgradeUrl": "https://churchwiseai.com/pricing",
  "upgradeMessage": "Unlock All 33 Ministry Tools"
}

Moderation Restricted Response

{
  "response": "You've been temporarily restricted...",
  "restricted": true,
  "restriction_type": "cooldown" | "temp_block" | "permanent_block",
  "expires_at": "2026-03-25T12:00:00Z"
}

Usage Limit Response

{
  "response": "Monthly conversation limit reached...",
  "limitReached": true,
  "limit": 200,
  "used": 200
}

Error Response

{
  "error": "Sorry, something went wrong. Please try again."
}
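
For client code, the response shapes above can be modeled as a TypeScript union. The interface names and the `responseKind` helper are hypothetical, and the field types are inferred from the sample payloads (for instance, `expires_at` is typed as a string because that is how it appears in the example; whether it can be null for a permanent block is not stated).

```typescript
interface StandardResponse {
  response: string;
  model: string;
  provider: string;
  rag: {
    theological_hits: number;
    church_kb_hits: number;
    embedding_generated: boolean;
    faq_matched: boolean;
  };
}

interface CannedResponse {
  response: string;
  source: "canned";
  cannedResponseId: string;
}

interface TieredResponse {
  response: string;
  source: "basic_chatbot" | "pro_website";
  upgradeUrl: string;
  upgradeMessage: string;
}

interface RestrictedResponse {
  response: string;
  restricted: true;
  restriction_type: "cooldown" | "temp_block" | "permanent_block";
  expires_at: string;
}

interface LimitResponse {
  response: string;
  limitReached: true;
  limit: number;
  used: number;
}

interface ErrorResponse {
  error: string;
}

type ChatbotResponse =
  | StandardResponse
  | CannedResponse
  | TieredResponse
  | RestrictedResponse
  | LimitResponse
  | ErrorResponse;

// Classifies a payload by the field that is unique to each shape.
function responseKind(r: ChatbotResponse): string {
  if ("error" in r) return "error";
  if ("restricted" in r) return "restricted";
  if ("limitReached" in r) return "limit_reached";
  if ("source" in r) return r.source;
  return "llm";
}
```

A widget could switch on `responseKind(payload)` to decide whether to render an upgrade link, a restriction notice, or a plain answer.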

Key Design Decisions

Why One Endpoint, Not Separate Routes

All chatbot types (basic, pro_website, full) share the same validation, moderation, church loading, and crisis safety net logic. Splitting into separate routes would duplicate safety-critical code. The internal branching happens at step 13 after all shared checks are complete.

Why Fire-and-Forget for Tracking

Usage tracking, conversation logging, response reviews, and tool invocations are non-critical for the user experience. Making the user wait for database writes would add 50-200ms latency per message. These operations use Promise.resolve().catch() to execute asynchronously without blocking the response.
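
A minimal sketch of the fire-and-forget pattern follows. The `fireAndForget` helper is a hypothetical name, not the function used in the codebase; it only illustrates how a write can be started without being awaited while still guarding against unhandled rejections.

```typescript
// Starts a task without awaiting it. Failures go to onError instead of
// surfacing as unhandled promise rejections.
function fireAndForget(
  task: () => Promise<unknown>,
  onError: (err: unknown) => void,
): void {
  // Promise.resolve().then(task) also turns a synchronous throw inside
  // task into a rejection, so the single .catch() covers both cases.
  void Promise.resolve().then(task).catch(onError);
}
```

The handler can call `fireAndForget(() => saveRow(), logWarning)` (both callbacks hypothetical) and return its JSON response immediately; the write completes, or fails quietly, in the background.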

Why Regex Crisis Detection, Not Just LLM

LLMs occasionally omit crisis resources despite explicit system prompt instructions. The regex-based safety net is a deterministic backstop that cannot be prompt-injected, cannot hallucinate away resources, and cannot be defeated by model behavior changes. It runs on every response to every chatbot type. This is non-negotiable.
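
The backstop can be sketched as a pure post-processing function. Everything below is illustrative: the regex patterns and the wording of the resource block are placeholders (the production patterns and text are not listed in this document), and `applyCrisisSafetyNet` and `SafetyNetResult` are hypothetical names. Only the three required resources (988, 741741, 911) come from the pipeline description.

```typescript
// Placeholder patterns: the real list is not documented here.
const CRISIS_PATTERNS: RegExp[] = [
  /\bkill myself\b/i,
  /\bsuicid(e|al)\b/i,
  /\bend my life\b/i,
];

const REQUIRED_RESOURCES = ["988", "741741", "911"];

// Placeholder wording for the appended block.
const CRISIS_BLOCK =
  "If you are in immediate danger, call 911. You can also call or text 988 " +
  "(Suicide & Crisis Lifeline) or text HOME to 741741 (Crisis Text Line).";

interface SafetyNetResult {
  text: string;
  resourcesAppended: boolean;
  autoFlag: boolean; // true when the system should call flag_safety_concern itself
}

function applyCrisisSafetyNet(
  userMessage: string,
  responseText: string,
  llmFlaggedConcern: boolean,
): SafetyNetResult {
  const isCrisis = CRISIS_PATTERNS.some((pattern) => pattern.test(userMessage));
  if (!isCrisis) {
    return { text: responseText, resourcesAppended: false, autoFlag: false };
  }

  // Append the full block if any one of the required resources is missing.
  const missing = REQUIRED_RESOURCES.some((r) => !responseText.includes(r));
  return {
    text: missing ? `${responseText}\n\n${CRISIS_BLOCK}` : responseText,
    resourcesAppended: missing,
    autoFlag: !llmFlaggedConcern,
  };
}
```

Because the function depends only on the user message and the final response text, it behaves the same regardless of which chatbot type or model produced the response.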

Why Anthropic Primary, Not OpenAI

Claude Haiku 4.5 provides better tool-use reliability, more consistent persona adherence, and superior empathetic language for pastoral care contexts compared to gpt-4o-mini at a comparable price point. OpenAI serves as an automatic fallback if Anthropic experiences downtime. Provider failover triggers an email alert (rate-limited to 1 per 15 minutes).
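
One way to picture the failover and its alert throttle is shown below. This is a sketch under assumptions: `AlertThrottle`, `callWithFailover`, and the `sendAlert` callback are hypothetical names, the provider calls are synchronous callbacks for brevity, and the clock is passed in as a parameter so the throttle is easy to test.

```typescript
const ALERT_INTERVAL_MS = 15 * 60 * 1000; // at most one alert per 15 minutes

class AlertThrottle {
  private lastSentAt: number | null = null;
  private readonly intervalMs: number;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Returns true (and records the time) only if the interval has elapsed.
  tryAcquire(nowMs: number): boolean {
    if (this.lastSentAt !== null && nowMs - this.lastSentAt < this.intervalMs) {
      return false;
    }
    this.lastSentAt = nowMs;
    return true;
  }
}

interface FailoverResult<T> {
  result: T;
  provider: "anthropic" | "openai";
}

function callWithFailover<T>(
  primary: () => T,   // Anthropic call
  fallback: () => T,  // OpenAI call
  throttle: AlertThrottle,
  sendAlert: (reason: string) => void,
  nowMs: number,
): FailoverResult<T> {
  try {
    return { result: primary(), provider: "anthropic" };
  } catch (err) {
    if (throttle.tryAcquire(nowMs)) {
      sendAlert(err instanceof Error ? err.message : String(err));
    }
    return { result: fallback(), provider: "openai" };
  }
}
```

Every failed primary call still falls back to the secondary provider; only the alert is rate-limited, so a sustained outage produces one email per window rather than one per message.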

Database Dependencies

Table	Access Pattern	Purpose
`premium_churches`	Read	Chatbot enablement, plan tier, church config, CAP info
`churches`	Read	Church facts (name, address, denomination)
`church_voice_agents`	Read	Pastor name, Cal.com config, sermon info, giving config
`organization_settings`	Read	Tool config, agent config, chatbot source, doctrinal overrides
`church_theological_lenses`	Read	Church's theological lens assignment
`sai_theological_lenses`	Read	Lens name lookup
`theological_contradictions`	Read	Doctrinal must-include/must-exclude rules
`lens_knowledge`	Read	Tradition-specific vocabulary
`unified_rag_content`	Read (vector search)	Theological RAG content
`church_knowledge_base`	Read (vector search)	Church-specific knowledge
`product_knowledge`	Read	Shared product/billing info
`church_local_resources`	Read	Critical local emergency resources
`chatbot_conversations`	Read/Write	Conversation tracking, message counts
`chatbot_usage`	Write	Per-message token/cost tracking
`tool_invocations`	Write	Tool execution audit log
`response_reviews`	Write	LLM response logging for admin review
`moderation_violations`	Read/Write	Violation logging and restriction checks
`moderation_restrictions`	Read	Active restriction lookup
`chatbot_faqs`	Read	FAQ matching
