Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.encoreos.io/llms.txt

Use this file to discover all available pages before exploring further.

Version: 1.0.0
Date: 2026-01-16
Status: Planning
Reference: docs/migration/SUPABASE_PERFORMANCE_RECOMMENDATIONS.md

Executive Summary

This document outlines the data archival strategy for the Encore Health OS platform. The goal is to maintain database performance by archiving old data while ensuring compliance with healthcare data retention requirements.

1. Archival Candidates

1.1 Audit & Activity Logs

TableRetention PeriodArchive AfterPriority
pf_audit_logs7 years (HIPAA)1 yearHigh
pf_activity_logs3 years6 monthsHigh
pf_login_history2 years6 monthsMedium
pf_api_logs1 year3 monthsMedium

1.2 Workflow & Execution Logs

TableRetention PeriodArchive AfterPriority
fw_workflow_executions3 years1 yearHigh
fw_execution_logs2 years6 monthsHigh
fw_approval_history5 years1 yearMedium
fw_debug_sessions90 days30 daysLow
fw_sandbox_executions90 days30 daysLow

1.3 Notifications & Communications

TableRetention PeriodArchive AfterPriority
pf_notifications2 years6 monthsHigh
pf_notification_deliveries1 year3 monthsMedium
pf_email_logs2 years6 monthsMedium

1.4 Temporary & Session Data

TableRetention PeriodArchive AfterPriority
pf_sessions30 daysN/A (delete)High
pf_temp_files7 daysN/A (delete)High
fw_form_drafts90 days30 daysMedium

2. Healthcare Compliance Requirements

2.1 HIPAA Requirements

  • Patient Records: Minimum 6 years from date of creation or last effective date
  • Audit Logs: Minimum 6 years (some states require longer)
  • Business Records: Generally 6-7 years

2.2 State-Specific Requirements

StateRequirementNotes
California7 yearsFrom date of service
New York6 yearsFrom date of discharge
Florida7 yearsFor adults, 7 years for minors after reaching majority
Texas10 yearsFor adults

2.3 Compliance Rules

  1. Never archive PHI-containing records prematurely
  2. Maintain audit trail for archived data
  3. Ensure archived data is retrievable
  4. Document archival process for compliance audits

3. Archival Methods

Create parallel archive tables in a separate schema.
-- Create archive schema
CREATE SCHEMA IF NOT EXISTS archive;

-- Create archive table for audit logs
CREATE TABLE archive.pf_audit_logs (
    LIKE public.pf_audit_logs INCLUDING ALL
);

-- Move old records to archive
INSERT INTO archive.pf_audit_logs
SELECT * FROM public.pf_audit_logs
WHERE created_at < NOW() - INTERVAL '1 year';

-- Delete from main table
DELETE FROM public.pf_audit_logs
WHERE created_at < NOW() - INTERVAL '1 year';
Advantages:
  • Data stays in same database
  • Easy to query archived data when needed
  • Simple implementation
Disadvantages:
  • Storage costs remain in database
  • May need to manage archive table growth

3.2 Method 2: External Storage (Future)

Export to external storage for long-term retention. Options:
  • AWS S3 / Glacier (cold storage)
  • Azure Blob Storage
  • Google Cloud Storage
Use When:
  • Archive data exceeds practical database size
  • Cost optimization is priority
  • Data is rarely accessed

3.3 Method 3: Hard Delete (Non-PHI Only)

For temporary data that doesn’t require retention.
-- Delete old sessions
DELETE FROM pf_sessions
WHERE expires_at < NOW() - INTERVAL '7 days';

-- Delete old temp files
DELETE FROM pf_temp_files
WHERE created_at < NOW() - INTERVAL '7 days';
Use Only For:
  • Session data
  • Temporary files
  • Debug/sandbox data
  • Non-business-critical logs

4. Implementation Plan

Phase 1: Infrastructure Setup

  1. Create archive schema
  2. Create archive tables for high-priority candidates
  3. Set up archive RLS policies (platform_admin only)
  4. Create archival functions
-- Create archive schema
CREATE SCHEMA IF NOT EXISTS archive;

-- Grant access to platform admins only
GRANT USAGE ON SCHEMA archive TO authenticated;

Phase 2: Archival Functions

-- Example archival function
CREATE OR REPLACE FUNCTION archive_old_audit_logs(retention_days INTEGER DEFAULT 365)
RETURNS INTEGER
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public
AS $$
DECLARE
    archived_count INTEGER;
BEGIN
    -- Insert into archive
    INSERT INTO archive.pf_audit_logs
    SELECT * FROM public.pf_audit_logs
    WHERE created_at < NOW() - (retention_days || ' days')::INTERVAL;
    
    GET DIAGNOSTICS archived_count = ROW_COUNT;
    
    -- Delete from main table
    DELETE FROM public.pf_audit_logs
    WHERE created_at < NOW() - (retention_days || ' days')::INTERVAL;
    
    RETURN archived_count;
END;
$$;

Phase 3: Scheduled Archival

Options for scheduling:
  1. Supabase Edge Functions (cron trigger)
  2. External scheduler (n8n, Vercel cron)
  3. pg_cron extension (if available)
Recommended Schedule:
  • Weekly archival runs (low-traffic period)
  • Monthly summary reports
  • Quarterly compliance audits

Phase 4: Monitoring

  1. Track archive table sizes
  2. Monitor archival job success/failure
  3. Alert on failed archival runs
  4. Report on archived record counts

5. Table Size Assessment Query

Run this query to identify archival candidates by size:
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
    pg_total_relation_size(schemaname||'.'||tablename) as size_bytes
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 30;

6. Archival Verification Query

After archival, verify data integrity:
-- Count records by date range
SELECT 
    DATE_TRUNC('month', created_at) as month,
    COUNT(*) as record_count
FROM public.pf_audit_logs
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month DESC
LIMIT 12;

-- Verify no records older than retention period
SELECT COUNT(*) 
FROM public.pf_audit_logs
WHERE created_at < NOW() - INTERVAL '1 year';

-- Compare archive counts
SELECT 
    'public' as schema,
    COUNT(*) as count
FROM public.pf_audit_logs
UNION ALL
SELECT 
    'archive' as schema,
    COUNT(*) as count
FROM archive.pf_audit_logs;

7. Risk Mitigation

7.1 Data Loss Prevention

  • Always archive before delete
  • Test archival process in staging first
  • Maintain backup before first production run
  • Verify record counts after each run

7.2 Compliance

  • Document archival dates for audit
  • Maintain chain of custody for PHI
  • Test data retrieval from archive
  • Regular compliance review

7.3 Performance

  • Run archival during low-traffic periods
  • Batch large archival operations
  • Monitor lock contention
  • Use transactions with appropriate isolation

8. Success Metrics

MetricTargetMeasurement
Archive completion rate100%Jobs completed / Jobs scheduled
Data integrity100%Archived count = Deleted count
Query performance<1sArchive query execution time
Storage reduction>20%Main table size reduction

9. Next Steps

  1. Run table size assessment query
  2. Identify top 10 archival candidates by size
  3. Create archive schema and initial tables
  4. Implement archival functions
  5. Test in staging environment
  6. Schedule production archival
  7. Set up monitoring and alerting

10. References


Document Owner: Platform Team
Review Schedule: Quarterly
Last Updated: 2026-01-16