Voice Agent Call Lifecycle
This document describes what happens from the moment a phone rings to the moment the call ends. It is written as pseudocode that a non-developer can read.
Phase 1: Call Arrives
1. Phone rings
Someone calls a church phone number (or the toll-free sales line).
2. Twilio receives the call
Twilio is the telephony provider. It knows which number was called
and who is calling (caller ID).
3. Twilio forwards the call via SIP trunk to LiveKit Cloud
The Twilio number is configured with a SIP trunk that forwards all
calls to the LiveKit Cloud SIP gateway (project: cwa-voice-9x077mph).
LiveKit Cloud dispatches a job to the Railway agent worker and manages
the audio room.
4. The LiveKit agent worker receives the job via JobContext
The entry point in main.py receives a JobContext object containing:
- room.sip.trunkPhoneNumber = the number that was called
- room.sip.from_ = the caller's phone number
- room.name = a unique LiveKit room identifier for this call
Phase 2: Route the Call
5. resolve_route(to_number) determines the agent type
The system checks the called number against three sources (a-c); if none match, it falls through to a database lookup (d):
a. TOLL_FREE_NUMBER (+18886030316)
Result: ("sales", None) -- route to the Sales Agent
b. DEMO_NUMBERS (set of demo line numbers)
Result: ("demo_router", None) -- route to the Demo Router
(lets callers choose which demo church to experience)
c. PHONE_REGISTRY (static map of Twilio numbers to church IDs)
Result: ("church", church_id) -- route to a specific church
d. Not found in any list
Result: ("church", None) -- needs a database lookup
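The routing rules above can be sketched as a small function. This is an illustration, not the code in main.py: the demo number and registry entry are made-up placeholders, and the sketch assumes the checks run in the order a, b, c, d.

```python
from typing import Optional, Tuple

TOLL_FREE_NUMBER = "+18886030316"

# Hypothetical placeholder values; the real sets live in the agent's config.
DEMO_NUMBERS = {"+15550100001"}
PHONE_REGISTRY = {"+15550100002": "church-123"}


def resolve_route(to_number: str) -> Tuple[str, Optional[str]]:
    """Return (agent_type, church_id) for the number that was called."""
    if to_number == TOLL_FREE_NUMBER:
        return ("sales", None)
    if to_number in DEMO_NUMBERS:
        return ("demo_router", None)
    # A registry hit yields the church ID. A miss yields None, which
    # tells the caller that a database lookup is still needed.
    return ("church", PHONE_REGISTRY.get(to_number))
```

A result of ("church", None) is not an error. It is the signal that Phase 3 must look the number up in the database.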
Phase 3: Load Church Data (Church Calls Only)
6. If church_id is unknown, look it up in the database
Query church_voice_agents where twilio_phone_number matches.
Result cached for 5 minutes. If still not found, fall back to Sales Agent.
7. Load church data from Supabase (load_church_data)
Three parallel queries:
a. church_voice_agents joined with churches table
(name, address, denomination, pastor name, voice config,
feature toggles, giving settings, integration keys)
b. organization_settings (chatbot agent config)
c. premium_churches (plan tier, call limits)
These are assembled into a single "church" dict with ~50 fields.
8. Check call limit
Compare calls_this_month against calls_limit.
If the church has used all their calls this month, reject the call
(return None, which causes a fallback to the Sales Agent).
9. Insert call log -- initial database row
Write to voice_call_logs with:
- call_id (from LiveKit SIP attributes)
- church_id
- from_number (caller phone)
- to_number (church phone)
- status: "in_progress"
This creates the record that will be updated throughout the call.
10. Increment call count
Bump calls_this_month on church_voice_agents by 1.
Non-fatal: if this fails, the church gets a free call rather than
a dropped call.
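Steps 8 and 10 might look roughly like the sketch below. The `write` callback is a hypothetical stand-in for the Supabase update, and the "church_id" key in the church dict is assumed for illustration. The point is the asymmetry: exceeding the limit rejects the call, but a failed counter write does not.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("call_setup")


def check_call_limit(church: dict) -> Optional[dict]:
    """Return the church dict if it has calls left this month, else None."""
    if church["calls_this_month"] >= church["calls_limit"]:
        return None  # the caller falls back to the Sales Agent
    return church


def increment_call_count(church: dict, write: Callable[[str, int], None]) -> bool:
    """Bump calls_this_month; swallow write errors so the call proceeds."""
    new_count = church["calls_this_month"] + 1
    try:
        write(church["church_id"], new_count)  # hypothetical database update
    except Exception:
        # Non-fatal: the church gets a free call instead of a dropped one.
        logger.warning("call count increment failed", exc_info=True)
        return False
    church["calls_this_month"] = new_count
    return True
```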
Phase 4: Load Context (Parallel)
11. Load contextual data in parallel (all run at the same time):
a. Session RAG (fetch_session_rag)
Generate an embedding for a broad seed query:
"Tell me about [Church Name], services, events, programs, ministries"
Search two sources in parallel:
- Church knowledge base (FAQs + uploaded documents) -- up to 8 results
- Unified theological content (filtered by denomination lens) -- up to 5 results
Format results into prompt blocks.
b. Product knowledge (load_product_knowledge)
Load all active rows from product_knowledge table, ordered by priority.
Formatted as Q&A pairs. Cached for 15 minutes.
c. Inline FAQs (load_inline_faqs)
Load church-specific FAQ pairs from church_knowledge_base table.
These are SEPARATE from RAG vector search -- injected verbatim.
Cached for 5 minutes.
d. Repeat caller history (load_repeat_caller_history)
Query the last 5 calls within 90 days from the same phone number
to the same church. Extract summaries.
Result is privacy-gated: agent is told "Do NOT mention these
unless the caller brings them up first."
e. Datetime context (build_datetime_context)
Current date, time, timezone for the church.
Includes relative references: "This Sunday means March 29, 2026."
Helps the agent correctly interpret "this Sunday" or "next week."
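As a rough illustration of the datetime block, the sketch below resolves "this Sunday" from a timezone-aware timestamp. The wording of the output and the rule that "this Sunday" means today when the call happens on a Sunday are assumptions, not confirmed behavior of build_datetime_context.

```python
from datetime import datetime, timedelta


def build_datetime_context(now: datetime) -> str:
    """Describe the current moment and resolve 'this Sunday' explicitly.

    `now` should be timezone-aware and already in the church's timezone.
    """
    # datetime.weekday(): Monday is 0 and Sunday is 6.
    days_until_sunday = (6 - now.weekday()) % 7
    this_sunday = now + timedelta(days=days_until_sunday)
    return (
        f"Current date and time: {now:%A, %B} {now.day}, {now.year} "
        f"at {now:%I:%M %p} ({now.tzname()}). "
        f"This Sunday means {this_sunday:%B} {this_sunday.day}, {this_sunday.year}."
    )
```

A caller would typically pass something like datetime.now(ZoneInfo(church_timezone)), so that the date matches what the person on the phone sees on their own calendar.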
12. Combine all context blocks
Join RAG, product knowledge, FAQs, repeat history, and datetime
into one large context string, separated by double newlines.
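Steps 11 and 12 amount to a concurrent fan-out followed by a join. The sketch below uses asyncio.gather for this. The loader callables are stand-ins for fetch_session_rag, load_product_knowledge, and the others. Dropping empty or failed blocks is an assumption that follows the "non-fatal everything" principle, not something this document states about the real code.

```python
import asyncio
from typing import Awaitable, Callable, List

Loader = Callable[[], Awaitable[str]]


async def load_call_context(loaders: List[Loader]) -> str:
    """Run every loader concurrently and join the non-empty blocks."""
    results = await asyncio.gather(
        *(load() for load in loaders), return_exceptions=True
    )
    # With return_exceptions=True a failed loader shows up as an exception
    # object in the results; it is skipped here rather than failing setup.
    blocks = [r for r in results if isinstance(r, str) and r.strip()]
    return "\n\n".join(blocks)
```

Because gather returns results in the order the loaders were passed, the blocks appear in the prompt in a stable order no matter which query finishes first.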
Phase 5: Build the Agent
13. Build Coordinator Agent
The Coordinator is the front-door agent for all church calls.
It receives:
- The church dict (~50 config fields)
- The combined context string from step 12 (RAG, product knowledge, FAQs, repeat history, datetime)
It is configured with:
- LLM: Gemini 2.5 Flash (COORDINATOR_MODEL)
- Tools: send_sms_link, end_call, plus feature-gated tools
(capture_visitor_contact, send_directions_link, register_for_event,
check_availability, book_appointment, PCO tools, send_giving_link)
- transfer_to_care: handoff tool that routes to the Care Agent
(Care Agent uses Claude Haiku 4.5 for better empathy)
- Introduction greeting:
"Thank you for calling [Church Name]. I'm an AI assistant
and this call may be recorded. How can I help you today?"
14. Attach call context and start session
Call metadata (call_id, caller_phone, church_data, supabase) is attached
to the agent as _call_context so tools can access it.
Per-turn moderation, noise filtering, RAG injection, and farewell detection
run via on_user_turn_completed() callback on each Agent class.
The agent is now ready. See voice-turn-processing.md for the pipeline.
Phase 6: Conversation Loop (Per Turn)
15. For each turn of the conversation:
a. Caller speaks
Audio goes to Deepgram STT (speech-to-text via livekit-plugins-deepgram).
Result: a text string of what the caller said.
b. Cancel any pending farewell timer
If the caller speaks again after a goodbye, the auto-hangup
is cancelled.
c. "Are you there?" reassurance check
If the agent is currently processing (LLM or tool call in-flight)
and the caller says "are you there?" or "hello?":
-> Reply immediately: "Yes, I'm here! Just one more moment."
-> Do NOT cancel the pending work. Do NOT forward to LLM.
d. Moderation checks (BEFORE noise filtering -- safety first)
THREAT CHECK:
If the caller makes violent threats against others
(e.g., "I'm going to shoot up the church"):
-> Hardcoded response (bypasses LLM entirely):
"I need to stop you right there. This call is being
recorded and logged. I'm ending this call now.
If you or someone else is in danger, please call
nine one one."
-> Log the violation to moderation_violations table
-> Send alert emails and SMS to church + support
-> Wait 4 seconds for the message to play, then hang up
-> RETURN (call is over)
CRISIS CHECK:
If the caller expresses suicidal ideation, self-harm, or
coded crisis language (e.g., "I just can't do this anymore",
"I'm tired of living", "ready to meet my maker"):
-> Log the violation
-> Send crisis alert emails and SMS
-> Inject context directive into the LLM:
"CRITICAL: Caller may be in crisis. Provide the 988
Suicide and Crisis Lifeline immediately."
-> Set crisis_detected = true (disables auto-hangup)
-> Continue to LLM (agent handles with injected directive)
ABUSE CHECK:
If the caller uses profanity or hostile language:
First offense: inject "The caller used inappropriate language.
Respond calmly and redirect." into LLM context
Second offense: hardcoded response:
"I'm going to end this call now. Have a good day."
-> Wait 2 seconds, then hang up
-> RETURN (call is over)
e. Noise filtering (AFTER moderation -- only if moderation did not fire)
Check if the utterance is pure noise ("um", "uh huh", "mmm"):
-> Drop silently. No LLM call.
Check if it's context-dependent ("okay", "yeah", "sure"):
-> If agent asked a question: pass through (it's a valid answer)
-> If agent did NOT ask a question: drop silently
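The noise filter can be sketched as two small predicates. The phrase sets below contain only the examples named in this document and the real lists are presumably longer. The normalization step (lowercasing, stripping punctuation) is an assumption.

```python
PURE_NOISE = {"um", "uh huh", "mmm"}
CONTEXT_DEPENDENT = {"okay", "yeah", "sure"}


def asked_question(agent_response: str) -> bool:
    """Step j: remember whether the agent's last response asked something."""
    return "?" in agent_response


def is_droppable_noise(utterance: str, agent_asked_question: bool) -> bool:
    """Return True if the utterance should be dropped without an LLM call."""
    text = utterance.lower().strip(" .,!?")
    if text in PURE_NOISE:
        return True
    if text in CONTEXT_DEPENDENT:
        # "yeah" is a real answer only when the agent just asked a question.
        return not agent_asked_question
    return False
```

This function would only be reached when moderation did not fire, which is why it contains no safety logic of its own.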
f. Per-turn RAG (500ms hard timeout)
If the caller said something meaningful (10+ characters) and
moderation did not fire:
-> Generate embedding for the caller's message
-> Search church knowledge base (stricter threshold: 0.4)
-> If Supabase is slow (>500ms), skip RAG for this turn
(the call continues without extra context)
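One way to enforce the hard timeout is asyncio.wait_for, as in the sketch below. The `search` callable is a hypothetical stand-in for the embedding and knowledge-base lookup, and the moderation gate is left out for brevity. Whether the real code uses wait_for or another mechanism is not stated here.

```python
import asyncio
from typing import Awaitable, Callable, Optional

RAG_TIMEOUT_SECONDS = 0.5
MIN_QUERY_CHARS = 10


async def per_turn_rag(
    message: str,
    search: Callable[[str], Awaitable[str]],
    timeout: float = RAG_TIMEOUT_SECONDS,
) -> Optional[str]:
    """Return extra context for this turn, or None if skipped or too slow."""
    if len(message.strip()) < MIN_QUERY_CHARS:
        return None  # too short to be a meaningful query
    try:
        return await asyncio.wait_for(search(message), timeout=timeout)
    except asyncio.TimeoutError:
        # Slow lookup: the turn proceeds without extra context.
        return None
```

Returning None instead of raising keeps the decision local: the caller simply merges whatever context it received.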
g. Combine contexts
Merge any moderation directive + per-turn RAG into a combined
context string.
h. Process through LLM
Send the caller's message + combined context to the LlmAgent.
The LLM generates a response, potentially calling tools.
If a tool is called (e.g., submit_prayer_request):
-> Send a filler phrase first: "One moment." or
"Let me check on that." (chosen at random so callers do not hear the same phrase every time)
-> Execute the tool
-> LLM processes the tool result and generates a response
-> Tools marked as "background" (prayer, callback) survive
barge-in and yield intermediate messages.
i. Response plays as audio
The LLM's text response goes to Cartesia Sonic TTS
(text-to-speech via livekit-plugins-cartesia) and plays as
natural-sounding audio to the caller.
j. Track the agent's response
Record whether the agent asked a question (contains "?")
for noise filtering on the next turn.
k. Mutual farewell detection (disabled during crisis)
After the LLM responds, check if BOTH the agent and the
caller said goodbye:
Agent farewell phrases: "take care", "have a blessed",
"god bless", "goodbye", etc.
Caller farewell phrases: "bye", "good night", "that's all",
"thank you" (short), etc.
If mutual farewell detected:
-> Set farewell_pending = true
-> Wait 4 seconds (grace period for farewell audio to finish)
-> If caller hasn't spoken again: auto-hangup
-> If caller spoke: cancel the hangup (they had more to say)
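The farewell logic combines a phrase check with a cancellable timer. The sketch below is a simplified illustration: it uses plain substring matching on the example phrases listed above, omits the short "thank you" rule, and takes a hypothetical `hang_up` coroutine in place of the real hangup call.

```python
import asyncio
from typing import Awaitable, Callable, Optional

AGENT_FAREWELLS = ("take care", "have a blessed", "god bless", "goodbye")
CALLER_FAREWELLS = ("bye", "good night", "that's all")


def is_mutual_farewell(agent_text: str, caller_text: str,
                       crisis_detected: bool = False) -> bool:
    """True only if both sides said goodbye and no crisis was detected."""
    if crisis_detected:
        return False  # auto-hangup is disabled during a crisis
    agent, caller = agent_text.lower(), caller_text.lower()
    return (any(p in agent for p in AGENT_FAREWELLS)
            and any(p in caller for p in CALLER_FAREWELLS))


class FarewellTimer:
    """Hang up after a grace period unless cancelled first."""

    def __init__(self, hang_up: Callable[[], Awaitable[None]],
                 grace_seconds: float = 4.0) -> None:
        self._hang_up = hang_up
        self._grace = grace_seconds
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        self.cancel()  # never keep two timers pending
        self._task = asyncio.create_task(self._run())

    def cancel(self) -> None:
        """Called at the start of every caller turn (step 15b)."""
        if self._task is not None and not self._task.done():
            self._task.cancel()
        self._task = None

    async def _run(self) -> None:
        await asyncio.sleep(self._grace)
        await self._hang_up()
```

Calling cancel() unconditionally at the top of each turn means the turn handler never needs to know whether a farewell was pending.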
Phase 7: Call Ends
16. Call ends (CallEnded event received)
a. Extract transcript from agent history
Convert all events (UserTextSent, AgentTextSent, tool calls)
into a JSON array.
b. Update call log (update_call_log_end)
Write to voice_call_logs:
- status: "completed"
- duration_seconds: elapsed time since call start
- transcript: the full JSON transcript
c. Async classification via Gemini 2.5 Flash
Send the transcript text to Gemini Flash with a structured
prompt requesting exactly 7 fields:
SUMMARY: 1-2 sentence factual summary of the call
SENTIMENT: -1.0 (very distressed) to 1.0 (very positive)
TOPICS: comma-separated list (prayer, visitor, giving, etc.)
CATEGORY: single primary category (prayer_request, visitor, etc.)
URGENCY: low | normal | urgent | pastoral_emergency
FOLLOW_UP: true | false (should a staff member review?)
ASSIGNEE: pastor | office_admin | prayer_team | care_team |
volunteer_coordinator | finance_team | none
d. Update call log with classification
Write the parsed classification fields back to voice_call_logs.
This powers the admin dashboard's call list with AI-generated
summaries, urgency flags, and suggested assignees.
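Parsing the classifier's reply could look like the sketch below. It assumes the model returns one "FIELD: value" pair per line. The lowercase output keys and the clamping of SENTIMENT to the stated range are illustrative choices, not confirmed column names or behavior.

```python
from typing import Any, Dict

FIELDS = ("SUMMARY", "SENTIMENT", "TOPICS", "CATEGORY",
          "URGENCY", "FOLLOW_UP", "ASSIGNEE")


def parse_classification(text: str) -> Dict[str, Any]:
    """Parse 'FIELD: value' lines into typed classification fields."""
    raw: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, so a summary may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip().upper()
        if sep and key in FIELDS:
            raw[key] = value.strip()
    missing = [f for f in FIELDS if f not in raw]
    if missing:
        raise ValueError(f"classification missing fields: {missing}")
    return {
        "summary": raw["SUMMARY"],
        # Clamp to the documented range of -1.0 to 1.0.
        "sentiment": max(-1.0, min(1.0, float(raw["SENTIMENT"]))),
        "topics": [t.strip() for t in raw["TOPICS"].split(",") if t.strip()],
        "category": raw["CATEGORY"],
        "urgency": raw["URGENCY"].lower(),
        "follow_up": raw["FOLLOW_UP"].lower() == "true",
        "assignee": raw["ASSIGNEE"].lower(),
    }
```

Since classification runs after the call has ended, a ValueError here would only need to be caught and logged by the caller. The call log row written in step 16b is already complete without it.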
Key Design Principles
- Non-fatal everything: Every Supabase call, every notification, every classification is wrapped in try/except. A database hiccup never drops a live call.
- Parallel loading: Session init loads RAG, product knowledge, FAQs, repeat history, and datetime all at the same time. This keeps call setup under 2 seconds.
- Cache-first with stale fallback: Church data is cached for 5 minutes. If Supabase errors during a cache refresh, the stale cached data is served rather than failing the call.
- Safety before convenience: Moderation checks run before noise filtering. A crisis utterance is never silently dropped, even if it matches noise patterns.
- Grace periods: Farewell auto-hangup waits 4 seconds. Threat response waits 4 seconds. Abuse response waits 2 seconds. These let the audio finish playing before the line drops.
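The cache-first principle can be illustrated with a small TTL cache that serves stale data when a refresh fails. This is a minimal sketch under stated assumptions: the class name, the synchronous `fetch` callback, and the injectable clock are all hypothetical and chosen to keep the example self-contained.

```python
import time
from typing import Any, Callable, Dict, Tuple


class StaleFallbackCache:
    """TTL cache that returns the stale value if a refresh raises."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit: no database round trip
        try:
            value = fetch(key)
        except Exception:
            if entry is not None:
                return entry[1]  # refresh failed: serve the stale copy
            raise  # nothing cached yet, so the error must surface
        self._entries[key] = (now, value)
        return value
```

The first lookup for a key has nothing to fall back on, so its failure still propagates. That case corresponds to the Sales Agent fallback described in Phase 3.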