Tests
The source of truth for portfolio-level testing. Not the tests themselves — most tests live in their respective code repos under e2e/ and src/**/__tests__/ — but the registries, specs, handoffs, and meta-tests that coordinate them.
The registries
Two YAML files drive much of the CI and agent behavior. Edit them here; downstream workflows pick up changes automatically:
| File | Purpose |
|---|---|
| registry.yaml | The test + critical-path registry. Every tracked test domain (chatbot, voice, visual, knowledge, journeys, smoke, unit) and every critical_path: true entry lives here. PRs touching a critical-path code_files entry must attach a matching Playwright artifact or carry an override label. |
| tag-registry.yaml | Persona + lifecycle + property tag vocabulary. Prevents tag drift across journeys, acceptance specs, and test personas. Add new tags here first, then reference from content. |
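
The critical-path rule described for registry.yaml can be sketched as a small check. This is a minimal illustration under stated assumptions, not the actual CI workflow: only `critical_path` and `code_files` are named in this README, while the `id` field, the `PullRequest` shape, exact-path matching, and the label name `critical-path-override` are hypothetical. "Matching artifact" is reduced to a single boolean here.

```typescript
// Hypothetical shape of one parsed registry.yaml entry.
// Only `critical_path` and `code_files` come from this README; `id` is assumed.
interface RegistryEntry {
  id: string;
  critical_path: boolean;
  code_files: string[];
}

// Hypothetical summary of a pull request as seen by a CI check.
interface PullRequest {
  changedFiles: string[];
  labels: string[];
  hasPlaywrightArtifact: boolean;
}

// Assumed label name; the real override label is not specified in this README.
const OVERRIDE_LABEL = "critical-path-override";

// Returns the ids of critical-path entries whose code_files the PR touches.
// Assumes code_files are exact paths (no glob matching).
function touchedCriticalPaths(entries: RegistryEntry[], pr: PullRequest): string[] {
  const changed = new Set(pr.changedFiles);
  return entries
    .filter((e) => e.critical_path && e.code_files.some((f) => changed.has(f)))
    .map((e) => e.id);
}

// A PR passes if it touches no critical path, attaches an artifact,
// or carries the override label.
function passesCriticalPathGate(entries: RegistryEntry[], pr: PullRequest): boolean {
  if (touchedCriticalPaths(entries, pr).length === 0) return true;
  return pr.hasPlaywrightArtifact || pr.labels.includes(OVERRIDE_LABEL);
}
```

Keeping `touchedCriticalPaths` separate from the pass/fail decision lets a workflow report which entries triggered the gate, not just that it failed.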
Derive verification tests
TypeScript unit tests that guard the knowledge-repo derive pipeline itself. Run via pnpm test (vitest):
- derive-e2e.test.ts, derive-narrative.test.ts, derive-pricing.test.ts, derive-policies.test.ts, derive-products.test.ts, derive-gdrive.test.ts: per-target derive test suites.
- index-completeness.test.ts, links-integrity.test.ts, manifest.test.ts: cross-cutting structural tests.
- secret-detection.test.ts, google-oauth-config.test.ts, supabase-auth-config.test.ts: config safety tests.
These run in the knowledge repo's CI and locally before every derive push.
Persona-based tests
Agents (chatbot + voice) are not just tested for "does it respond?" — they're tested for "does Pastor Ruth in crisis feel heard?" Persona-driven tests sit in personas/.
- persona-test-prompts.md: the canonical personas + their expected experiences
- persona-test-results-2026-03-31.md: historical result snapshot
- See ../processes/agent-persona-team.md for how personas are used
Journeys
Machine-readable YAML journey definitions at journeys/ drive the /journey-runner skill. They're distinct from the human-readable persona × product journey docs at ../journeys/ — these are evaluable, those are narrative. Each file describes a concrete user goal, start URL, expected observations, and success criteria.
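
As a rough picture of what "evaluable" means here, the four parts of a journey file can be modeled as a typed record with a simple pass/fail check. This is a sketch only: the field names, the `JourneyResult` shape, and the rule that a journey passes when every expected observation was seen are all assumptions, and how /journey-runner actually evaluates a journey is not documented in this README.

```typescript
// Hypothetical field names. This README only says each file describes a goal,
// a start URL, expected observations, and success criteria.
interface JourneyDefinition {
  goal: string;
  startUrl: string;
  expectedObservations: string[];
  // Carried along for reporting; this sketch does not evaluate criteria.
  successCriteria: string[];
}

interface JourneyResult {
  passed: boolean;
  missingObservations: string[];
}

// Assumed rule: a journey passes when every expected observation
// appears in the list of observations recorded during the run.
function evaluateJourney(journey: JourneyDefinition, observed: string[]): JourneyResult {
  const seen = new Set(observed);
  const missingObservations = journey.expectedObservations.filter((o) => !seen.has(o));
  return { passed: missingObservations.length === 0, missingObservations };
}
```

Returning the missing observations rather than a bare boolean makes a failed run explain itself, which is the practical difference from a narrative journey doc.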
Specs + baselines
- specs/: per-feature expected-output specs (tier-gating behavior, chatbot response bounds)
- baselines/: captured ground-truth outputs for regression comparison
- results/: historical test run results preserved for trend analysis
- scripts/: helper scripts for test setup, fixture generation, baseline capture
- manual-test-paths/: click-through scripts for things too hard to automate (cross-browser manual passes)
Handoffs + audits
Point-in-time context docs that informed the current test infrastructure:
- HANDOFF.md, handoff-2026-03-31-morning.md, handoff-starter-kit-pdf-refresh.md: legacy handoff snapshots
- SPEC_WRITING_HANDOFF.md: guidance for writing new expected-output specs
- audit-chatbot-vs-canonical-2026-03-31.md: chatbot behavior audit
- pre-ceo-hardening-audit-2026-03-30.md: pre-launch hardening audit
Related
- ../processes/qa-checklist.md: the 10-section QA pass run before declaring customer-facing work complete
- ../processes/critical-path-definition.md: what counts as a critical path and why
- ../acceptance/: tier-specific expected-output specs; the "what works looks like" that these tests validate against
- ../readiness/: /ensure-solid scorecards that orchestrate tests here into a go-live readiness signal