Skip to main content

Knowledge > Runbooks > Content Ops > Audit Content Quality and Completeness

Audit Content Quality and Completeness

Systematically audit the unified_rag_content table to identify quality gaps, missing content, and improvement opportunities across ITW illustrations and church knowledge base entries.

Prerequisites

  • Supabase MCP or direct DB access
  • Access to sermon-illustrations/scripts/ for batch fixes
  • Time: a full audit typically takes 1–2 hours

Safety Rules

  • NEVER bulk-delete from unified_rag_content — use is_public = false to suppress content.
  • Always scope queries by category to avoid accidentally modifying non-illustration content.
  • ONE production database — all changes are live immediately.

Steps

Phase 1: Quantitative Overview

  1. Count content by category and visibility:

    SELECT
    category,
    is_public,
    COUNT(*) AS count,
    AVG(length(content)) AS avg_chars,
    MIN(length(content)) AS min_chars,
    MAX(length(content)) AS max_chars
    FROM unified_rag_content
    WHERE category IN ('illustration', 'church_kb')
    GROUP BY category, is_public
    ORDER BY category, is_public;
  2. Check theological lens coverage (illustrations only):

    SELECT
    tl.name AS lens,
    COUNT(urc.id) AS total,
    COUNT(CASE WHEN urc.is_public = true THEN 1 END) AS public_count,
    COUNT(CASE WHEN urc.image_url IS NOT NULL THEN 1 END) AS has_image
    FROM sai_theological_lenses tl
    LEFT JOIN unified_rag_content urc
    ON urc.theological_lens_id = tl.id
    AND urc.category = 'illustration'
    GROUP BY tl.id, tl.name
    ORDER BY public_count ASC;

    Flag lenses with fewer than 20 public illustrations as gaps.

  3. Identify stubs (short content):

    SELECT COUNT(*) AS stub_count
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND length(content) < 300
    AND is_public = true;
  4. Identify illustrations missing images:

    SELECT COUNT(*) AS missing_image_count
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND is_public = true
    AND image_url IS NULL;
  5. Check for missing scripture references:

    SELECT COUNT(*) AS missing_ref
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND is_public = true
    AND (scripture_reference IS NULL OR scripture_reference = '');

Phase 2: Topic Coverage Analysis

  1. Check topic tag distribution:

    SELECT unnest(topic_tags) AS tag, COUNT(*) AS count
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND is_public = true
    GROUP BY tag
    ORDER BY count DESC
    LIMIT 40;

    Look for: missing important topics (grace, prayer, hope, forgiveness) or over-indexed topics.

  2. Check scripture book coverage — identify major books with no illustrations:

    SELECT
    SUBSTRING(scripture_reference FROM '^[^0-9]+') AS book,
    COUNT(*) AS count
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND is_public = true
    AND scripture_reference IS NOT NULL
    GROUP BY book
    ORDER BY count ASC
    LIMIT 20;

Phase 3: Quality Spot-Check

  1. Sample 20 random illustrations for manual quality review:
    SELECT id, title, scripture_reference, content, theological_lens_id
    FROM unified_rag_content
    WHERE category = 'illustration'
    AND is_public = true
    ORDER BY RANDOM()
    LIMIT 20;
    Review each for: theological accuracy, appropriate length, scripture match, readability.

Phase 4: Prioritize and Fix

  1. Build a priority fix list:

    Gap TypePriorityFix Runbook
    Crisis-level stubs (< 100 chars)P1regenerate-stubs.md
    Lens with < 10 illustrationsP1generate-lens-content.md
    Common topics with < 5 illustrationsP2generate-by-topic.md
    Missing scripture referencesP2Manual UPDATE
    Missing imagesP3illustration-images.md
    Lens with 10–20 illustrationsP3generate-lens-content.md
  2. Suppress low-quality content that cannot be improved by regeneration:

    UPDATE unified_rag_content
    SET is_public = false, updated_at = now()
    WHERE id IN ('[uuid1]', '[uuid2]')
    AND category = 'illustration';

Verification

Run the overview queries again after fixes to confirm:

  • Stub count is zero or near-zero.
  • All lenses have at least 20 public illustrations.
  • No major topic gaps remain.
  • Image coverage has improved.

See Also