Voice Agent Architecture

Multi-Tenant Design

The voice agent is a single deployed instance that serves every church customer. There is no per-church container, no per-church deployment, and no per-church scaling decision. When a call arrives, the agent dynamically loads that church's configuration from Supabase and builds a church-specific agent on the fly.

This design was chosen because:

Solo founder -- no DevOps capacity for per-tenant infrastructure
LiveKit Cloud handles the SIP gateway and room management; Railway hosts the Python agent worker
Per-church customization lives entirely in database configuration, not code

Call Routing: SIP Trunk Phone Number

Every inbound call arrives via the LiveKit Cloud SIP gateway. The Twilio SIP trunk forwards the call to LiveKit Cloud, which dispatches a job to the Railway agent worker. The agent worker reads the dialed number from JobContext.room.sip (sip.trunkPhoneNumber) to determine which church (if any) the call belongs to. The routing follows a three-tier resolution:

Tier 1: `resolve_route(to_number)`

Maps the inbound Twilio To number to an (agent_type, church_id) tuple. There are four cases, covering three agent types:

Resolution	agent_type	church_id	What happens next
Toll-free number matches `TOLL_FREE_NUMBER`	`"sales"`	`None`	Build Sales Agent
Number is in `DEMO_NUMBERS` set	`"demo_router"`	`None`	Build Demo Router Agent
Number is in `PHONE_REGISTRY` dict	`"church"`	UUID or `None`	Build Church Coordinator
Number not found anywhere	`"church"`	`None`	Fall through to DB lookup
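The table above can be sketched as a single lookup function. This is a minimal illustration, not the code in session.py: the phone numbers and the church ID are made-up placeholders, and the real function may differ in details.

```python
from typing import Optional, Tuple

# Placeholder values for illustration only; the real numbers are redacted.
TOLL_FREE_NUMBER = "+18005550100"
DEMO_NUMBERS = {"+15555550101", "+15555550102"}
PHONE_REGISTRY = {"+15555550103": "church-uuid-1", "+15555550104": None}


def resolve_route(to_number: str) -> Tuple[str, Optional[str]]:
    """Map the dialed number to an (agent_type, church_id) tuple."""
    if to_number == TOLL_FREE_NUMBER:
        return ("sales", None)
    if to_number in DEMO_NUMBERS:
        return ("demo_router", None)
    if to_number in PHONE_REGISTRY:
        # May be a church UUID, or None for an unassigned number.
        return ("church", PHONE_REGISTRY[to_number])
    # Unknown number: church_id stays None so the caller can try the DB lookup.
    return ("church", None)
```

Checking the toll-free and demo numbers before the registry keeps the priority order explicit even if a number appears in more than one structure.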

Tier 2: Agent Type Determines Builder

Based on the agent_type from Tier 1, get_agent() calls the appropriate builder:

agent_type == "sales"        --> build_sales_agent()
agent_type == "demo_router"  --> build_demo_router_agent()
agent_type == "church"       --> build_coordinator_agent()

Tier 3: Per-Church Data Loading

For church calls, the system loads a complete church configuration object before building the agent. This happens in parallel where possible:

await asyncio.gather(
    fetch_session_rag(supabase, church_id, denomination, church_name),
    load_product_knowledge(supabase),
    load_inline_faqs(supabase, church_id),
    load_repeat_caller_history(supabase, caller_phone, church_id),
)

The assembled context includes:

RAG context -- church-specific knowledge base hits + theological content from unified_rag_content
Product knowledge -- runtime FAQ pairs from product_knowledge table (shared across all calls)
Inline FAQs -- church-specific Q&A pairs from church_knowledge_base (injected directly, separate from RAG vector search)
Repeat caller history -- privacy-gated summaries of the caller's last 5 calls within 90 days
Datetime context -- current date/time in the church's timezone with relative references ("This Sunday means March 30, 2026")

All context blocks are concatenated and injected into the agent's system prompt via agent.history.add_entry(rag_context, role="system").
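As a rough sketch of the load-then-concatenate step, the example below runs four stand-in loaders in parallel and joins the non-empty results. The load_block helper and the block labels are hypothetical; the real loaders query Supabase and are not shown here.

```python
import asyncio


async def load_block(name: str, fail: bool = False) -> str:
    """Hypothetical stand-in for one loader; returns "" on failure."""
    try:
        if fail:
            raise RuntimeError("simulated Supabase error")
        return f"[{name}] ..."
    except Exception:
        return ""


async def build_context() -> str:
    # All four loaders run concurrently; results come back in call order.
    blocks = await asyncio.gather(
        load_block("RAG"),
        load_block("PRODUCT KNOWLEDGE"),
        load_block("FAQS", fail=True),
        load_block("CALLER HISTORY"),
    )
    # Drop empty blocks so a failed loader leaves no gap in the prompt.
    return "\n\n".join(block for block in blocks if block)


context = asyncio.run(build_context())
```

Because each loader returns an empty string instead of raising, one failed query removes only its own block from the assembled context.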

Phone Registry

PHONE_REGISTRY is a static Python dict in session.py that maps every Twilio number the platform owns to a church ID (or None for unassigned/sales numbers).

# Simplified structure (actual phone numbers redacted):
PHONE_REGISTRY = {
    "+1XXXXXXXXXX": None,           # Toll-free -> Sales
    "+1XXXXXXXXXX": None,           # Spare, unassigned
    "+1XXXXXXXXXX": None,           # Cartesia agent number
}

DEMO_NUMBERS = {
    "+1XXXXXXXXXX",   # US demo line
    "+1XXXXXXXXXX",   # CA demo line
    "+1XXXXXXXXXX",   # Cartesia agent demo line
}

Resolution priority:

1. Toll-free match (TOLL_FREE_NUMBER constant) --> Sales Agent
2. Demo number match (DEMO_NUMBERS set) --> Demo Router Agent
3. Registry match (PHONE_REGISTRY dict) --> Church Agent with known church_id
4. DB lookup (lookup_church_by_phone()) --> queries church_voice_agents.twilio_phone_number
5. Fallback --> Sales Agent (caller always reaches someone)

The DB lookup at step 4 supports churches whose numbers were provisioned after the last code deploy. The result is cached for 5 minutes.

Caching Strategy

The voice agent uses a simple in-memory TTL cache (session._cache) based on time.monotonic(). No external cache (Redis, Memcached) is used -- the Railway agent worker runs a single process per container, so an in-process dict is sufficient.

Cache Key Pattern	TTL	What It Caches
`pk:all`	15 minutes	Formatted product knowledge FAQ block
`phone:{to_number}`	5 minutes	Church ID resolved from DB phone lookup
`faq:{church_id}`	5 minutes	Formatted church-specific FAQ pairs
`church:{church_id}`	5 minutes	Complete church configuration dict

The cache also supports a stale-while-revalidate pattern via cache_get_stale(). If Supabase fails during a church data reload, the expired cached value is served rather than dropping the call. This ensures degraded-but-functional behavior during database outages.

Note: The Railway agent worker runs a single process per container. If Railway scales to multiple workers, each worker maintains its own independent in-memory cache — there is no shared cache across workers.
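A minimal sketch of a monotonic-clock TTL cache with a stale read path is shown below. The cache_set and cache_get names and all signatures are assumptions for illustration; only cache_get_stale is named in this document, and its real signature may differ.

```python
import time
from typing import Any, Dict, Optional, Tuple

# key -> (expiry timestamp from time.monotonic(), cached value)
_cache: Dict[str, Tuple[float, Any]] = {}


def cache_set(key: str, value: Any, ttl_seconds: float) -> None:
    """Store a value that stays fresh for ttl_seconds."""
    _cache[key] = (time.monotonic() + ttl_seconds, value)


def cache_get(key: str) -> Optional[Any]:
    """Return the value only while it is still fresh."""
    entry = _cache.get(key)
    if entry is None or time.monotonic() >= entry[0]:
        return None
    return entry[1]


def cache_get_stale(key: str) -> Optional[Any]:
    """Return the value even if it has expired (used when a reload fails)."""
    entry = _cache.get(key)
    return entry[1] if entry is not None else None
```

Expired entries are kept rather than evicted on read, which is what lets the stale path serve them during a database outage. time.monotonic() is used instead of wall-clock time so that system clock adjustments cannot shorten or extend a TTL.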

Non-Fatal Error Handling

Every Supabase call in the voice agent is wrapped in try/except. The design principle is: a database error should never drop a call. Specific fallbacks:

Failure	Fallback Behavior
`load_church_data()` Supabase error	Serve stale cache if available; otherwise return `None` (routes to Sales)
`lookup_church_by_phone()` error	Return `None` (routes to Sales)
`insert_call_log()` error	Log warning, call proceeds without a log record
`increment_call_count()` error	Log warning, church gets a free call rather than a dropped call
`load_product_knowledge()` error	Return empty string, agent proceeds without product knowledge
`load_inline_faqs()` error	Return empty string, agent proceeds without FAQ context
`load_repeat_caller_history()` error	Return empty string, agent proceeds without caller history
`update_call_log_end()` error	Log error, call has already completed
Unknown inbound number, no DB match	Route to Sales Agent
Church over call limit	Return `None`, route to Sales Agent
Care Agent build fails	Log warning, Coordinator handles all topics including pastoral
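One way to express the "log and fall back" rule from the table is a small decorator. This is a sketch only: the non_fatal decorator and the load_inline_faqs_demo function are hypothetical, and the actual code may use plain try/except blocks in each function.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("voice_agent")


def non_fatal(fallback):
    """Decorator: log any exception and return `fallback` instead of raising."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception:
                logger.warning("%s failed; using fallback", func.__name__, exc_info=True)
                return fallback
        return wrapper
    return decorator


@non_fatal(fallback="")
async def load_inline_faqs_demo(church_id: str) -> str:
    # Simulates the database being unreachable.
    raise ConnectionError("simulated Supabase outage")


result = asyncio.run(load_inline_faqs_demo("church-uuid-1"))
```

Here the call resolves to an empty string, so the agent would proceed without FAQ context rather than failing the call.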

TurnProcessor Pipeline

The TurnProcessor wraps the LLM agent and intercepts every event before and after it reaches the LLM. It is the core per-turn processing pipeline.

For each UserTextSent event (transcribed caller speech), the pipeline runs:

1. Cancel pending farewell timer (caller spoke again)
2. "Are you there?" reassurance
   - If session.is_processing AND caller said "are you there?" / "hello" / etc.
   - Yield "Yes, I'm here! Just one more moment."
   - Return (do not forward to LLM, do not cancel pending work)

3. Moderation checks (BEFORE noise filtering -- safety must never be silently dropped)
   a. Threat detection (check_threat)
      - Hardcoded response: "This call is being recorded..."
      - Immediately end call
      - Fire-and-forget: email + SMS alerts to church + support
   b. Crisis detection (check_crisis)
      - Inject 988 Lifeline directive into LLM context
      - Set session.crisis_detected = True
      - Do NOT end call -- let caller decide
   c. Abuse detection (check_abuse)
      - 1st offense: inject "caller used inappropriate language" context
      - 2nd offense: hardcoded "I'm going to end this call now" + end call

4. Noise filtering (AFTER moderation -- only if moderation didn't fire)
   - Pure noise ("um", "uh", "hmm"): silently dropped
   - Pure backchannels ("uh huh", "mm hmm"): silently dropped
   - Context-dependent ("okay", "yeah", "sure"): dropped if agent didn't ask a question
   - Floor takes ("wait", "stop", "actually"): always pass through
   - Farewell/gratitude ("thanks", "bye"): always pass through

5. Per-turn RAG (500ms hard timeout, skipped on moderation events)
   - Generate embedding of caller message
   - Search church knowledge base (stricter threshold than session init)
   - Skip if caller message < 10 characters
   - If Supabase is slow, RAG is dropped for this turn

6. Combine contexts and delegate to LlmAgent.process()
   - Inject tool filler phrase before first tool call ("Let me check on that.")
   - Skip filler for end_call and demo_agent tools

7. Auto-hangup on mutual farewell
   - Both agent AND caller must have said goodbye
   - 4-second grace period (let farewell audio finish)
   - Cancelled if caller speaks again during grace period
   - DISABLED during crisis mode -- caller controls when to end

Non-text events (CallStarted, etc.) pass through directly to the agent. CallEnded triggers call log finalization and async classification.
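The noise-filtering rules in step 4 can be sketched as a pure function. The word lists below contain only the examples quoted above, and the should_drop name and its normalization rule are assumptions; the real filter is likely more extensive.

```python
# Word lists limited to the examples given in the pipeline description.
NOISE = {"um", "uh", "hmm"}
BACKCHANNELS = {"uh huh", "mm hmm"}
CONTEXT_DEPENDENT = {"okay", "yeah", "sure"}


def should_drop(utterance: str, agent_asked_question: bool) -> bool:
    """Return True if the utterance should be dropped before reaching the LLM."""
    text = utterance.lower().strip(" .,!?")
    if text in NOISE or text in BACKCHANNELS:
        return True
    if text in CONTEXT_DEPENDENT:
        # "okay" is an answer if the agent just asked something, noise otherwise.
        return not agent_asked_question
    # Floor takes, farewells, and all other speech pass through.
    return False
```

Matching on the whole normalized utterance, rather than on substrings, means "okay, what time is the service?" is never mistaken for a bare "okay".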

LLM Configuration

Each agent type has specific LLM settings:

Setting	Coordinator	Care	Sales/Demo
Model	`gemini/gemini-2.5-flash`	`anthropic/claude-haiku-4-5-20251001`	`gemini/gemini-2.5-flash`
Fallbacks	`[claude-haiku-4-5]`	`[gemini-2.5-flash]`	`[claude-haiku-4-5]`
Temperature	0.7	0.4 (more controlled for sensitive topics)	0.7 (0.5 for Demo Router)
Timeout	15 seconds	15 seconds	15 seconds
Retries	1	1	1
Max tool iterations	5	5	5 (3 for Demo Router)

The cross-fallback pattern (Coordinator uses Haiku as fallback, Care uses Gemini as fallback) ensures that if either provider has an outage, calls still complete.
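The fallback behavior can be pictured as trying each model in order until one answers. How the agent SDK implements fallbacks internally is not shown in this document, so the complete_with_fallbacks helper and the fake_call provider below are purely illustrative.

```python
from typing import Callable, Optional, Sequence


def complete_with_fallbacks(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> str:
    """Try each model in order and return the first successful reply."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all configured models failed") from last_error


def fake_call(model: str, prompt: str) -> str:
    # Simulates an outage at one provider only.
    if model.startswith("gemini/"):
        raise TimeoutError("simulated provider outage")
    return f"{model}: ok"


reply = complete_with_fallbacks(
    "hi",
    ["gemini/gemini-2.5-flash", "anthropic/claude-haiku-4-5-20251001"],
    fake_call,
)
```

Because the primary and the fallback come from different providers, a single-provider outage is absorbed by the second entry in the list.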

Session State

Each call maintains a session dict that persists for the duration of the call:

session = {
    "abuse_count": 0,           # Incremented on each abuse detection
    "church_id": "uuid",        # Resolved church or SALES_SENTINEL
    "call_id": "call_sid",      # Twilio/Cartesia call SID
    "caller_phone": "+1...",    # Caller's phone number
    "church_data": {...},       # Full church config dict (or {} for sales)
    "is_processing": False,     # True while LLM/tool call is in-flight
    "crisis_detected": False,   # True after crisis pattern match (disables auto-hangup)
    "start_time": 1234.5,       # time.monotonic() at call start
    "duration": 0,              # Computed each turn: monotonic() - start_time
    "farewell_pending": False,  # True during farewell grace period
    "tool_results": None,       # Legacy field, no longer written to DB
}

The TurnProcessor also injects session data into the TurnEnv object that tools receive, including supabase, church_id, church_data, caller_phone, caller_email, PCO credentials, Cal.com credentials, and church timezone.

Call Log Lifecycle

Insert (insert_call_log) -- called immediately when get_agent() resolves the call. Creates a voice_call_logs row with status: "in_progress", the call SID, church ID, caller number, and called number.
Increment (increment_call_count) -- increments calls_this_month on church_voice_agents for call-limit tracking. Non-fatal.
Update at end (update_call_log_end) -- triggered by CallEnded event. Writes: status: "completed", duration_seconds, transcript (JSONB array of all events), and summary (initially empty, populated by classification).
Async classification (_generate_call_classification) -- after the transcript is saved, Gemini Flash parses the conversation into 7 structured fields:

Field	DB Column	Example Values
Summary	`summary`	"Caller asked about Sunday service times and was given 9:00 AM and 11:00 AM options."
Sentiment	`caller_sentiment`	-1.0 to 1.0 (float)
Topics	`call_topics`	`["service_times", "directions"]` (JSONB array)
Category	`category`	`service_info`, `prayer_request`, `visitor`, `crisis`, `pastoral_care`, etc.
Urgency	`urgency`	`low`, `normal`, `urgent`, `pastoral_emergency`
Follow-up needed	`follow_up_needed`	`true` / `false`
Suggested assignee	`suggested_assignee`	`pastor`, `office_admin`, `prayer_team`, `care_team`, `none`

Classification is fire-and-forget. If it fails, the call log still has the transcript and duration. The admin dashboard uses these fields for filtering, triage, and assignment.
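Since the classification output comes from an LLM, it is worth validating before it is written to the call log. The sketch below assumes the model replies with a JSON object keyed by the DB column names; the parse_classification helper and the default values ("normal", "none") are assumptions, not documented behavior.

```python
import json
from typing import Any, Dict, Optional

URGENCY_VALUES = {"low", "normal", "urgent", "pastoral_emergency"}
ASSIGNEE_VALUES = {"pastor", "office_admin", "prayer_team", "care_team", "none"}


def parse_classification(raw: str) -> Optional[Dict[str, Any]]:
    """Validate the model's JSON reply; return None if it is unusable."""
    try:
        data = json.loads(raw)
        sentiment = float(data.get("caller_sentiment", 0.0))
    except (ValueError, TypeError, AttributeError):
        return None
    topics = data.get("call_topics")
    urgency = data.get("urgency")
    assignee = data.get("suggested_assignee")
    return {
        "summary": str(data.get("summary", "")),
        # Clamp to the documented -1.0 to 1.0 range.
        "caller_sentiment": max(-1.0, min(1.0, sentiment)),
        "call_topics": topics if isinstance(topics, list) else [],
        "category": str(data.get("category", "")),
        "urgency": urgency if isinstance(urgency, str) and urgency in URGENCY_VALUES else "normal",
        "follow_up_needed": bool(data.get("follow_up_needed", False)),
        "suggested_assignee": assignee if isinstance(assignee, str) and assignee in ASSIGNEE_VALUES else "none",
    }
```

Returning None on malformed output fits the fire-and-forget design: the caller can skip the update and leave the transcript and duration untouched.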

Agent Builder Pattern

All agents are built via builder functions that return a configured LlmAgent. The pattern:

def build_coordinator_agent(church: dict, rag_context: str = "") -> LlmAgent:
    # 1. Assemble tool list based on church config and feature flags
    tools = [send_sms_link, end_call]
    if church.get("address"):
        tools.append(send_directions_link)
    if church.get("giving_enabled") and church.get("giving_url"):
        tools.append(send_giving_link)
    # ... more conditional tools ...

    # 2. Add tier-gated handoffs (Care Agent)
    tier = church.get("plan", "starter")
    if "care" in TIER_AGENTS[tier]:
        care_agent = build_care_agent(church, rag_context)
        tools.append(agent_as_handoff(care_agent, name="transfer_to_care", ...))

    # 3. Build LlmAgent with church-specific prompt
    agent = LlmAgent(
        model=_MODEL,
        api_key=_api_key(),
        tools=tools,
        config=LlmConfig(
            system_prompt=build_coordinator_prompt(church),
            introduction=f"Thank you for calling {church_name}...",
            temperature=0.7,
            fallbacks=_FALLBACKS,
        ),
    )

    # 4. Inject RAG context as system message
    if rag_context:
        agent.history.add_entry(rag_context, role="system")

    return agent

Feature flags from the church config dict control which tools are available:

visitor_intake_enabled -- capture_visitor_contact tool
address -- send_directions_link tool
events -- register_for_event tool
cal_enabled + cal_event_type_id -- check_availability + book_appointment tools
pco_enabled + pco_app_id + pco_secret -- Planning Center tools (service times, events, staff)
giving_enabled + (giving_url or etransfer_email) -- send_giving_link tool

Sales and Demo Agent Architecture

Beyond church calls, the voice agent serves two additional agent types:

Sales Agent

Handles the toll-free line. Has access to:

search_churches -- look up churches in the PewSearch directory
schedule_demo -- book a demo call
capture_support -- log support requests
send_sms_link -- send URLs via SMS
Demo handoffs -- one per configured demo church, with voice swapping

Voice gender is randomly assigned 50/50 per call (Carson male / Brooke female). Demo handoffs use the opposite gender voice via UpdateCallConfig so the caller hears a distinct voice change.

Demo Router Agent

Handles demo phone lines. A lightweight router that greets the caller, offers a choice of demo churches (Protestant or Catholic), and transfers to the selected Demo Agent. No sales knowledge, no product info. If the caller asks about pricing, it directs them to the toll-free number.

Demo Agent

A leaf agent (no handoffs) that role-plays as a specific church's receptionist. All tools are no-op mocks (acknowledged but not persisted to DB) except send_directions_link, which sends a real SMS so the prospect can see the feature in action. Church facts are loaded from the database, with a hardcoded fallback (DEMO_CHURCH_FALLBACK_FACTS) if the DB load fails.

RAG Architecture

The voice agent uses two tiers of RAG:

Session-Init RAG (`fetch_session_rag`)

Runs once when the call starts. Generates an embedding from a broad seed query ("Tell me about {church_name}, services, events, programs, ministries") and searches two sources in parallel:

Church knowledge base (search_church_knowledge RPC) -- church-specific FAQs and uploaded document chunks. 8 results, 0.35 similarity threshold.
Unified RAG content (search_unified_rag_content RPC) -- theological content filtered by the church's denomination (mapped to a theological lens ID). 5 results, 0.35 threshold.

Results are formatted into labeled blocks and injected into the agent's system prompt.

Per-Turn RAG (`fetch_turn_rag`)

Runs on every caller utterance (unless moderation fired or message is < 10 characters). Searches church knowledge base only (not theological) with a stricter 0.4 threshold and a hard 500ms timeout. If Supabase is slow, RAG is silently skipped for that turn -- the call never blocks waiting for search results.
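The length check and hard timeout can be sketched with asyncio.wait_for. The fetch_turn_rag_sketch function, the injected search callable, and the two demo searches are hypothetical stand-ins; the real fetch_turn_rag also generates an embedding and calls the Supabase RPC.

```python
import asyncio
from typing import Awaitable, Callable

MIN_QUERY_CHARS = 10
RAG_TIMEOUT_SECONDS = 0.5


async def fetch_turn_rag_sketch(
    message: str,
    search: Callable[[str], Awaitable[str]],
) -> str:
    """Return search results, or "" if the message is short or the search is slow."""
    if len(message.strip()) < MIN_QUERY_CHARS:
        return ""
    try:
        return await asyncio.wait_for(search(message), timeout=RAG_TIMEOUT_SECONDS)
    except asyncio.TimeoutError:
        # The pending search is cancelled; the turn proceeds without RAG.
        return ""


async def slow_search(query: str) -> str:
    await asyncio.sleep(1.0)  # Simulates a slow database.
    return "late hit"


async def fast_search(query: str) -> str:
    return "Service times: 9:00 AM and 11:00 AM"
```

asyncio.wait_for cancels the underlying task when the timeout expires, so a slow query does not keep running in the background after the turn has moved on.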

Church Data Loading

load_church_data() in supabase_church.py assembles a complete church configuration by joining three Supabase queries:

church_voice_agents joined with churches and premium_churches -- identity, address, denomination, voice config, feature toggles, integrations, pastor info, weekly content, custom hours/staff/ministries/events
organization_settings -- agent configuration (personality, enabled agents)
premium_churches (standalone query) -- plan tier and call-limit fields

The result is cached for 5 minutes with stale-serve fallback on error. Call-limit enforcement happens at load time: if calls_this_month >= calls_limit, the function returns None and the call routes to the Sales Agent.

Deployment

The voice agent deploys via git push to the main branch. Railway auto-deploys the Python agent worker from GitHub. LiveKit Cloud connects to the agent worker automatically via the agent worker's WebSocket connection — no separate cartesia connect or cartesia deploy step is needed.

# Push to main — Railway auto-deploys
git push origin main

Environment variables (Supabase keys, API keys, LiveKit credentials, Twilio credentials) are configured in the Railway service environment, not in .env files shipped with the code. See runbooks/deployment/deploy-voice-agent.md for the full deploy procedure.

Legacy Code

The churchwiseai-web/voice-agent-line/ directory is the legacy Cartesia LINE SDK implementation. It has been replaced by churchwiseai-web/voice-agent-livekit/. Do not modify voice-agent-line/ — it exists as reference code only. The even older Node.js agent at churchwiseai-web/voice-agent/ is also fully legacy and must not be modified.
