Version: 1.0
Last Updated: 2026-03-15
Status: Architecture Reference
Related: PF_FW_BUSINESS_AUTOMATION_WORKFLOW_ARCHITECTURE_RESEARCH.md
Recommendation: R-FW-12

Overview

Encore Health OS uses 5 distinct event delivery paths to handle the full spectrum of reactive behavior across the platform, from real-time UI updates to durable business automation to external system integration. Each path makes different trade-offs around durability, latency, retry semantics, and auditability. Choosing the wrong path leads to subtle bugs: lost events, duplicate processing, blocked request cycles, or audit gaps that surface during compliance reviews.

This document gives developers a clear framework for selecting the right delivery path for any given use case.

Key principle: prefer the least powerful path that satisfies your requirements. UI updates do not need durable queues. Business workflows should not depend on WebSocket connectivity. External integrations must not block database triggers.

Event Delivery Paths

Path 1: Table-Driven Domain Events (Primary)

Role: The backbone of Encore Health OS business automation. Domain events are the canonical mechanism for triggering workflows, automation rules, and cross-core reactions. Flow:
Application Code
    │
    ▼
publishEvent(event_type, payload, org_id)
    │
    ▼
INSERT → fw_domain_events table
    │
    ▼
fw_process_domain_event() AFTER INSERT trigger
    │
    ├── Pattern-match against fw_automation_rules
    ├── Pattern-match against fw_workflow_definitions (trigger config)
    │
    ▼
INSERT → fw_workflow_executions (status = 'pending')
    │
    ▼
Enqueue → workflow_execution_queue (pgmq)
    │
    ▼
pg_cron worker picks up message (10-30s poll)
    │
    ▼
Execute workflow steps → update execution status
Use when:
  • Business automation triggers (e.g., “when a resident moves in, create billing record”)
  • Workflow execution that must survive server restarts
  • Cross-core reactions that need an audit trail
  • Any event that compliance or governance may need to review
Guarantees:
| Property | Value |
|---|---|
| Durability | Persisted in PostgreSQL (fw_domain_events) |
| Delivery | At-least-once via pgmq worker retry |
| Ordering | Per-event-type ordering within an organization |
| Audit trail | Full: event, trigger match, execution log, step results |
| Idempotency | Consumer responsibility; event_id provided for dedup |
Key tables:
| Table | Purpose |
|---|---|
| fw_domain_events | Immutable event log (event_type, payload, org_id, timestamp) |
| fw_automation_rules | Configurable trigger-condition-action rules |
| fw_workflow_definitions | Workflow templates with trigger configuration |
| fw_workflow_executions | Running/completed workflow instances |
| workflow_execution_queue | pgmq queue for pending execution pickup |
Latency: 10-30 seconds (pg_cron poll interval). Not suitable for user-facing real-time feedback.
Example event types:
  • resident.moved_in, resident.moved_out, resident.status_changed
  • invoice.created, payment.received, claim.submitted
  • staff.onboarded, credential.expiring, shift.completed
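
Because Path 1 delivery is at-least-once and idempotency is left to the consumer, a handler will typically deduplicate on event_id. The following is a minimal sketch of that idea, not the platform's implementation: the `DomainEvent` shape and the `makeIdempotent` helper are hypothetical, and the in-memory `Set` stands in for a durable store (for example, a table with a unique constraint on event_id).

```typescript
// Hypothetical event shape; only event_id, event_type, payload, and org_id
// are named in this document.
interface DomainEvent {
  event_id: string;
  event_type: string;
  org_id: string;
  payload: Record<string, unknown>;
}

type Handler = (event: DomainEvent) => void;

// Wraps a handler so that a redelivered event (same event_id) is skipped.
// Returns true when the handler ran, false when the event was a duplicate.
// Assumption: `seen` is in-memory for illustration only; it would not
// survive a restart, so a real consumer needs durable dedup state.
function makeIdempotent(
  handler: Handler,
  seen: Set<string> = new Set(),
): (event: DomainEvent) => boolean {
  return (event) => {
    if (seen.has(event.event_id)) return false; // duplicate delivery
    handler(event);
    seen.add(event.event_id); // mark only after the handler succeeds
    return true;
  };
}

// Usage: the same event delivered twice runs the handler once.
const billed: string[] = [];
const onMoveIn = makeIdempotent((e) => {
  billed.push(e.event_id);
});
const moveInEvent: DomainEvent = {
  event_id: "evt-1",
  event_type: "resident.moved_in",
  org_id: "org-1",
  payload: {},
};
const firstDelivery = onMoveIn(moveInEvent);
const secondDelivery = onMoveIn(moveInEvent);
```

Marking the event as seen only after the handler returns means a handler that throws will be retried on redelivery, which matches at-least-once semantics.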

Path 2: HTTP Event Consumer (Synchronous)

Role: Immediate, request-scoped side effects that must complete before responding to the user. Flow:
Application Code (browser or server)
    │
    ▼
supabase.functions.invoke('event-consumer', {
  body: { event_type, payload }
})
    │
    ▼
Edge Function: event-consumer
    │
    ├── Validate event
    ├── Route to handler
    ├── Execute side effect (send email, call API, update record)
    │
    ▼
Synchronous HTTP response (success/failure)
    │
    ▼
Caller handles response
Use when:
  • Immediate side effects needed in the request cycle (e.g., send welcome email on signup)
  • Real-time validation against external systems
  • Operations where the caller needs confirmation of completion
  • Simple request-response patterns that do not fan out
Guarantees:
| Property | Value |
|---|---|
| Durability | None (in-memory only during execution) |
| Delivery | At-most-once (if Edge Function fails, caller gets error) |
| Ordering | N/A (single synchronous call) |
| Audit trail | Only if the handler explicitly logs |
| Timeout | Supabase Edge Function limit (typically 30s) |
Latency: < 2 seconds for typical operations.
When NOT to use:
  • Long-running workflows (> 10 seconds)
  • Fan-out to multiple consumers
  • Operations that must retry on failure without user intervention
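
The validate, route, and execute steps of the event-consumer flow can be sketched as a small synchronous dispatcher. This is an illustrative sketch only: the `consume` function, the `handlers` registry, and the `user.signed_up` event type are hypothetical names, and the real Edge Function may be structured differently.

```typescript
interface IncomingEvent {
  event_type: string;
  payload: Record<string, unknown>;
}

type ConsumeResult =
  | { ok: true; handled: string }
  | { ok: false; error: string };

type EventHandler = (payload: Record<string, unknown>) => void;

// Hypothetical handler registry keyed by event_type.
const handlers = new Map<string, EventHandler>();
const welcomeEmailsSentTo: string[] = [];
handlers.set("user.signed_up", (payload) => {
  // Stand-in for the real side effect (sending a welcome email).
  welcomeEmailsSentTo.push(String(payload["email"]));
});

// Validate the body, route to a handler, and report success or failure
// so the caller can decide what to do. There is no retry on this path.
function consume(body: unknown): ConsumeResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "body must be an object" };
  }
  const { event_type, payload } = body as Partial<IncomingEvent>;
  if (typeof event_type !== "string" || event_type.length === 0) {
    return { ok: false, error: "event_type is required" };
  }
  const handler = handlers.get(event_type);
  if (!handler) {
    return { ok: false, error: `no handler for ${event_type}` };
  }
  try {
    handler(payload ?? {});
    return { ok: true, handled: event_type };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a result object rather than throwing keeps the at-most-once contract explicit: a failed side effect surfaces to the caller, who owns the retry decision.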

Path 3: Supabase Realtime (UI Updates Only)

Role: Push updates to connected browser clients for live UI reactivity. This path is strictly for display purposes and must never drive business logic. Flow (Postgres Changes):
Database INSERT/UPDATE/DELETE
    │
    ▼
Supabase Realtime (Postgres Changes listener)
    │
    ▼
WebSocket broadcast to subscribed clients
    │
    ▼
Client callback → React state update → UI re-render
Flow (Broadcast):
Application Code (any client)
    │
    ▼
supabase.channel('room').send({ type: 'broadcast', event, payload })
    │
    ▼
Supabase Realtime relay
    │
    ▼
WebSocket broadcast to other clients in channel
    │
    ▼
Client callback → React state update → UI re-render
Use when:
  • Notification badge counts and live notification feeds
  • Dashboard widgets that reflect latest data
  • Presence indicators (who is online, who is editing)
  • Collaborative editing cursors or selection highlights
  • Live list/table updates without polling
Guarantees:
| Property | Value |
|---|---|
| Durability | None (ephemeral, not persisted) |
| Delivery | Best-effort; clients miss events if disconnected |
| Ordering | Preserved per channel within a single connection |
| Audit trail | None |
| Reconnection | Client library auto-reconnects but missed events are lost |
Latency: < 500 milliseconds (WebSocket push).
NOT suitable for:
  • Business logic execution
  • Data consistency enforcement
  • Audit-required operations
  • Anything that must happen even if no client is connected

Path 4: External Event Forwarding (PF-35 Phase 2)

Role: Deliver events to external systems (EHR platforms, clearinghouses, payer portals, partner organizations) via outbound webhooks with compliance safeguards. Flow:
fw_domain_events INSERT
    │
    ▼
fw_process_domain_event() trigger
    │
    ├── Check pf_event_subscriptions for matching external subscribers
    ├── Apply glob pattern matching on event_type
    │
    ▼
Match found
    │
    ├── 42 CFR Part 2 consent guard check (substance use disorder treatment data)
    ├── Apply JSONPath payload transformation (strip internal fields)
    │
    ▼
Enqueue → event_forwarding_queue (pgmq)
    │
    ▼
Delivery worker picks up message
    │
    ├── POST to subscriber webhook URL
    ├── Verify response (2xx = success)
    │
    ▼
    ├── Success → ACK message, log delivery
    └── Failure → Retry (up to 3 attempts, exponential backoff)
              │
              └── Max retries exceeded → fw_dead_letter_queue
Use when:
  • External EHR system integration (HL7 FHIR event notifications)
  • Clearinghouse claim status callbacks
  • Payer portal authorization updates
  • Partner organization data sharing
  • Webhook-based integrations with third-party platforms
Guarantees:
| Property | Value |
|---|---|
| Durability | Persisted via pgmq (event_forwarding_queue) |
| Delivery | At-least-once with retry (3 attempts, exponential backoff) |
| Ordering | Best-effort per subscriber; not strictly guaranteed |
| Audit trail | Full: subscription match, transformation, delivery attempts, DLQ |
| Dead letter | Failed deliveries move to fw_dead_letter_queue for manual review |
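
The retry policy above (3 attempts with exponential backoff, then dead letter) can be sketched as a pure decision function. This is a sketch under stated assumptions: the 30-second base delay and the doubling factor are illustrative values that this document does not specify, and `backoffDelayMs` and `afterFailure` are hypothetical helper names.

```typescript
// Delay before the retry that follows failed attempt number `attempt`.
// Assumption: baseMs and factor are illustrative defaults, not documented values.
function backoffDelayMs(attempt: number, baseMs = 30_000, factor = 2): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return baseMs * Math.pow(factor, attempt - 1);
}

type NextStep =
  | { action: "retry"; delayMs: number }
  | { action: "dead_letter" };

// Decide what the delivery worker does after attempt `failedAttempt` fails.
// Once maxAttempts is reached, the message goes to the dead letter queue.
function afterFailure(failedAttempt: number, maxAttempts = 3): NextStep {
  if (failedAttempt >= maxAttempts) return { action: "dead_letter" };
  return { action: "retry", delayMs: backoffDelayMs(failedAttempt) };
}

// Usage: walk through the three attempts of a delivery that keeps failing.
const schedule = [1, 2, 3].map((attempt) => afterFailure(attempt));
```

With these assumed defaults, the first failure waits 30 seconds, the second waits 60 seconds, and the third failure routes the message to the dead letter queue.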
Key tables:
| Table | Purpose |
|---|---|
| pf_event_subscriptions | External subscriber registration (URL, event patterns, transform) |
| event_forwarding_queue | pgmq queue for pending deliveries |
| fw_dead_letter_queue | Failed deliveries after max retries |
Compliance:
  • 42 CFR Part 2: Before forwarding any event involving substance use disorder treatment data, the consent guard verifies that appropriate patient consent exists for the receiving organization. Events without valid consent are blocked and logged.
  • HIPAA: Payload transformation strips internal identifiers and limits PHI to the minimum necessary for the subscriber’s stated purpose.
  • Audit: Every forwarding attempt (success or failure) is logged with timestamp, subscriber ID, response status, and payload hash.
Supports:
  • Glob pattern matching on event types (e.g., resident.*, billing.payment.*)
  • JSONPath payload transformation for per-subscriber field selection
  • Per-subscriber retry configuration
  • Webhook payload signing (HMAC-SHA256) so subscribers can verify the signature on receipt
Latency: 30 seconds to 5 minutes depending on queue depth and retry schedule.
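
Glob matching on event types, as used for subscription patterns such as resident.* and billing.payment.*, might look like the sketch below. The exact glob semantics are not specified in this document, so this sketch assumes that `*` matches exactly one dot-separated segment; `globToRegExp` and `matchesEventType` are hypothetical helper names.

```typescript
// Convert a glob pattern to an anchored regular expression.
// Assumption: "*" matches one or more characters within a single
// dot-separated segment (it does not cross a "."). The real matcher
// may use different semantics.
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^.]+");
  return new RegExp(`^${source}$`);
}

function matchesEventType(pattern: string, eventType: string): boolean {
  return globToRegExp(pattern).test(eventType);
}

// Usage
const residentMatch = matchesEventType("resident.*", "resident.moved_in");
const paymentMatch = matchesEventType("billing.payment.*", "billing.payment.received");
const crossSegment = matchesEventType("resident.*", "resident.status.changed");
```

Anchoring the expression at both ends prevents a pattern like resident.* from accidentally matching a longer type that merely contains it.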

Path 5: pgmq Direct Queue (Internal Async)

Role: General-purpose internal async processing with guaranteed delivery and backpressure support. Used when work must happen asynchronously but does not fit the domain event / workflow model. Flow:
Application Code (Edge Function, trigger, or worker)
    │
    ▼
SELECT pgmq.send('queue_name', message_json)
    │
    ▼
Message persisted in pgmq queue table
    │
    ▼
Consumer: SELECT pgmq.read('queue_name', visibility_timeout, batch_size)
    │
    ├── Message becomes invisible to other consumers
    ├── Process message
    │
    ▼
    ├── Success → SELECT pgmq.archive('queue_name', msg_id)  -- ACK
    └── Failure → Message becomes visible again after timeout  -- implicit NACK
              │
              └── Max retries exceeded → move to DLQ
Active queues in Encore Health OS:
| Queue | Purpose | Consumer |
|---|---|---|
| workflow_execution_queue | Pending workflow executions | pg_cron workflow worker |
| workflow_dlq | Failed workflow executions | Manual review / retry |
| event_forwarding_queue | Outbound webhook deliveries | pg_cron forwarding worker |
| notification_queue | Email, SMS, push notifications | Notification Edge Function |
| pf_embedding_jobs | AI embedding generation | Embedding Edge Function |
Use when:
  • Guaranteed async processing with backpressure
  • Batch operations (e.g., nightly report generation, bulk imports)
  • Rate-limited external API calls (e.g., clearinghouse submissions)
  • Work that benefits from a visibility timeout, which keeps two consumers from processing the same message at the same time
  • Fan-out from a single event to multiple independent work items
Guarantees:
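| Property | Value |
|---|---|
| Durability | Persisted in PostgreSQL (pgmq tables) |
| Delivery | At-least-once; the visibility timeout hides an in-flight message from other consumers, but a message is redelivered if it is not archived before the timeout expires, so handlers should be idempotent |
| Ordering | FIFO within a single queue |
| Audit trail | Via pgmq archive tables (configurable retention) |
| Backpressure | Consumers control read batch size and poll interval |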
Latency: Depends on consumer poll interval. Typically 5-60 seconds.
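
The read, hide, and archive cycle described above can be illustrated with a small in-memory model. This is not the pgmq API: `VisibilityQueue` is a hypothetical class whose methods loosely mirror pgmq.send, pgmq.read, and pgmq.archive, with the clock passed in explicitly (in seconds) so the behavior is deterministic.

```typescript
interface QueuedMessage<T> {
  msgId: number;
  readCount: number; // how many times the message has been read
  visibleAt: number; // time (seconds) at which the message is readable again
  message: T;
}

// In-memory model of visibility-timeout semantics, for illustration only.
class VisibilityQueue<T> {
  private messages: QueuedMessage<T>[] = [];
  private nextId = 1;
  readonly archived: QueuedMessage<T>[] = [];

  send(message: T, now: number): number {
    const msgId = this.nextId++;
    this.messages.push({ msgId, readCount: 0, visibleAt: now, message });
    return msgId;
  }

  // Return up to `qty` visible messages in FIFO order and hide each one
  // from other readers for `vtSeconds`.
  read(vtSeconds: number, qty: number, now: number): QueuedMessage<T>[] {
    const batch = this.messages.filter((m) => m.visibleAt <= now).slice(0, qty);
    for (const m of batch) {
      m.readCount += 1;
      m.visibleAt = now + vtSeconds;
    }
    return batch.map((m) => ({ ...m })); // copies, so callers cannot mutate queue state
  }

  // Acknowledge a message: remove it from the queue and keep it in the archive.
  archive(msgId: number): boolean {
    const index = this.messages.findIndex((m) => m.msgId === msgId);
    if (index === -1) return false;
    this.archived.push(...this.messages.splice(index, 1));
    return true;
  }
}

// Usage: a consumer reads but crashes before archiving, so the message reappears.
const queue = new VisibilityQueue<string>();
const jobId = queue.send("embedding-job", 0);
const firstRead = queue.read(30, 1, 0); // consumer A takes the message
const whileHidden = queue.read(30, 1, 10); // consumer B sees nothing yet
const afterTimeout = queue.read(30, 1, 31); // timeout expired: redelivered
const wasArchived = queue.archive(jobId); // explicit ACK
```

The read count increments on every delivery, which is the value a consumer can compare against a retry limit before moving a message to a dead letter queue.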

Decision Tree

Use this flowchart to select the appropriate event delivery path:
    "I need to react to something happening in the system"
                          │
                          ▼
              ┌───────────────────────┐
              │ Is it a UI-only       │
              │ update? (badge count, │    YES    ┌──────────────────────────┐
              │ live list, presence)  │──────────>│ Path 3: Supabase         │
              └───────────┬───────────┘           │ Realtime                 │
                          │ NO                    │ (WebSocket, best-effort) │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Must it complete      │
              │ synchronously in the  │    YES    ┌──────────────────────────┐
              │ current request       │──────────>│ Path 2: HTTP Event       │
              │ cycle?                │           │ Consumer                 │
              └───────────┬───────────┘           │ (Edge Function, sync)    │
                          │ NO                    └──────────────────────────┘
                          ▼
              ┌───────────────────────┐
              │ Does it trigger a     │
              │ business workflow or  │    YES    ┌──────────────────────────┐
              │ automation rule?      │──────────>│ Path 1: Table-Driven     │
              └───────────┬───────────┘           │ Domain Events            │
                          │ NO                    │ (fw_domain_events, pgmq) │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Does it need to reach │
              │ an external system?   │    YES    ┌──────────────────────────┐
              │ (EHR, payer, partner) │──────────>│ Path 4: External Event   │
              └───────────┬───────────┘           │ Forwarding               │
                          │ NO                    │ (webhook + DLQ)          │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Is it async internal  │
              │ work? (embedding,     │    YES    ┌──────────────────────────┐
              │ notification, batch)  │──────────>│ Path 5: pgmq Direct      │
              └───────────┬───────────┘           │ Queue                    │
                          │ NO                    │ (internal async work)    │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Re-evaluate: you may  │
              │ need a combination of │
              │ paths. See "Composed  │
              │ Patterns" below.      │
              └───────────────────────┘
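
The flowchart evaluates its questions in a fixed order, which can be expressed as a short function. This is an illustrative encoding only: the `Requirements` field names and `selectPath` are hypothetical, and a `null` result corresponds to the final "re-evaluate" box.

```typescript
// Hypothetical answers to the five questions in the decision tree.
interface Requirements {
  uiOnly: boolean; // badge count, live list, presence
  mustCompleteInRequest: boolean; // synchronous in the current request cycle
  triggersWorkflowOrRule: boolean; // business workflow or automation rule
  reachesExternalSystem: boolean; // EHR, payer, partner
  internalAsyncWork: boolean; // embedding, notification, batch
}

type PathNumber = 1 | 2 | 3 | 4 | 5;

// Questions are checked in the same order as the flowchart, so the first
// "yes" wins. A null result means no single path fits and a composition
// of paths may be needed.
function selectPath(r: Requirements): PathNumber | null {
  if (r.uiOnly) return 3;
  if (r.mustCompleteInRequest) return 2;
  if (r.triggersWorkflowOrRule) return 1;
  if (r.reachesExternalSystem) return 4;
  if (r.internalAsyncWork) return 5;
  return null;
}

// Usage
const none: Requirements = {
  uiOnly: false,
  mustCompleteInRequest: false,
  triggersWorkflowOrRule: false,
  reachesExternalSystem: false,
  internalAsyncWork: false,
};
const badgeCount = selectPath({ ...none, uiOnly: true });
const moveInBilling = selectPath({ ...none, triggersWorkflowOrRule: true });
const ehrWebhook = selectPath({ ...none, reachesExternalSystem: true });
```

Because the first matching question wins, a requirement that is both a workflow trigger and an external delivery resolves to Path 1, which is consistent with Path 4 being fed from fw_domain_events.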

Path Comparison Matrix

| Property | Path 1: Domain Events | Path 2: HTTP Consumer | Path 3: Realtime | Path 4: External Forwarding | Path 5: pgmq Direct |
|---|---|---|---|---|---|
| Durability | Persistent (DB) | None (in-flight) | None (ephemeral) | Persistent (pgmq) | Persistent (pgmq) |
| Latency | 10-30s | < 2s | < 500ms | 30s-5min | 5-60s |
| Delivery | At-least-once | At-most-once | Best-effort | At-least-once | At-least-once |
| Retry | Automatic (worker) | Caller responsibility | None | 3 attempts + DLQ | Visibility timeout |
| Audit trail | Full | Explicit only | None | Full | Archive tables |
| Ordering | Per-type/org | N/A | Per-channel | Best-effort | FIFO per queue |
| Multi-tenant | org_id scoped | Caller scoped | Channel scoped | Subscription scoped | Message scoped |
| Backpressure | Queue depth | Connection pool | Channel limits | Queue depth | Consumer batch size |
| Use case | Business automation | Immediate side effects | UI reactivity | External integration | Internal async work |
| Complexity | Medium | Low | Low | High | Medium |

Composed Patterns

Many real-world scenarios combine multiple paths. Here are common compositions:

Resident Move-In (Paths 1 + 3 + 5)

Resident record updated (status = 'active')
    │
    ├── Path 1: publishEvent('resident.moved_in') → triggers billing workflow,
    │           creates admission documentation tasks, updates census
    │
    ├── Path 3: Realtime Postgres Changes → dashboard census widget updates live
    │
    └── Path 5: pgmq notification_queue → send welcome packet email + SMS

Claim Submission (Paths 1 + 4 + 2)

Claim finalized by billing staff
    │
    ├── Path 1: publishEvent('claim.submitted') → audit trail, status tracking
    │
    ├── Path 4: External forwarding → clearinghouse receives claim via webhook
    │
    └── Path 2: HTTP consumer → real-time eligibility verification response
                shown to user before submission completes

Anti-Patterns

1. Using Realtime (Path 3) for Business Logic

Wrong:
// Subscribing to Realtime to trigger a billing workflow
supabase.channel('billing')
  // schema: 'public' is assumed here; postgres_changes filters need an event and a schema
  .on('postgres_changes', { event: 'INSERT', schema: 'public', table: 'invoices' }, (payload) => {
    createBillingRecord(payload.new);  // BUG: missed if client disconnects
  })
  .subscribe();
Why it fails: Realtime is best-effort. If the browser tab closes, the billing record is never created. There is no retry, no audit trail, and no guarantee of delivery.
Correct: Use Path 1 (domain events) to trigger the billing workflow server-side.

2. Using HTTP Consumer (Path 2) for Long-Running Workflows

Wrong:
// Calling an Edge Function that runs a 15-step approval workflow
// (the response is bound to `result` so it does not shadow the `data` being sent)
const { data: result } = await supabase.functions.invoke('run-full-workflow', {
  body: { workflowId, data }
});
// User waits 45 seconds... Edge Function times out
Why it fails: Edge Functions have execution time limits. Long workflows should be broken into steps and processed asynchronously.
Correct: Use Path 1 (domain events) to enqueue the workflow, which the worker processes step-by-step with checkpointing.

3. Bypassing Domain Events for Automatable Triggers

Wrong:
-- Direct INSERT in a trigger instead of going through domain events
CREATE TRIGGER create_billing_on_admission
AFTER INSERT ON rh_residents
FOR EACH ROW
EXECUTE FUNCTION create_billing_record_directly();
Why it fails: No audit trail, no pattern matching, no ability to configure or disable the automation via the UI, no visibility into what triggered what.
Correct: Publish a domain event and let the automation engine match it to configured rules.

4. Calling External APIs Synchronously from DB Triggers

Wrong:
-- Making an HTTP call inside a trigger
CREATE TRIGGER notify_ehr_on_discharge
AFTER UPDATE ON rh_residents
FOR EACH ROW
WHEN (NEW.status = 'discharged')
EXECUTE FUNCTION http_post_to_ehr();  -- blocks transaction if EHR is slow
Why it fails: Database triggers run inside the transaction. If the external API is slow or down, the transaction is blocked or rolled back, affecting the user.
Correct: Use Path 4 (external event forwarding) to asynchronously deliver the event to the EHR with retry and DLQ support.

5. Using pgmq for User-Facing Real-Time Feedback

Wrong:
// Polling pgmq from the browser to check if a task completed
setInterval(async () => {
  const result = await checkQueueForMyResult();
  if (result) updateUI(result);
}, 1000);
Why it fails: Polling adds unnecessary load and latency. The browser should not interact with pgmq directly.
Correct: Use Path 5 (pgmq) for the async processing, then use Path 3 (Realtime) to push the result to the UI when processing completes.

Failure Handling Summary

| Path | Failure Mode | Recovery |
|---|---|---|
| Path 1 | Worker crash mid-execution | pgmq visibility timeout re-exposes message; workflow resumes from last checkpoint |
| Path 2 | Edge Function error | HTTP error returned to caller; caller decides retry strategy |
| Path 3 | Client disconnect | Events lost; client re-fetches state on reconnect |
| Path 4 | Webhook delivery failure | Exponential backoff retry (3 attempts); dead letter queue; manual review dashboard |
| Path 5 | Consumer crash | Visibility timeout re-exposes message; next read picks it up |

Monitoring and Observability

Each path has different monitoring needs:
| Path | Key Metrics | Alert Thresholds |
|---|---|---|
| Path 1 | fw_domain_events unprocessed count, workflow execution duration, failure rate | > 100 unprocessed events, execution > 5 min, failure rate > 5% |
| Path 2 | Edge Function invocation latency, error rate | p95 > 5s, error rate > 2% |
| Path 3 | Realtime channel count, message throughput | > 1000 concurrent channels per org |
| Path 4 | event_forwarding_queue depth, DLQ size, delivery success rate | Queue depth > 500, DLQ > 0, success rate < 95% |
| Path 5 | Per-queue depth, processing latency, DLQ size | Queue-specific thresholds; any DLQ entry triggers alert |
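
As one example of turning these thresholds into a check, the Path 4 row can be evaluated by a small pure function. This is a sketch only: the `ForwardingMetrics` shape, the alert names, and `forwardingAlerts` are hypothetical, and how the metrics are collected is outside the scope of this sketch.

```typescript
// Hypothetical snapshot of Path 4 metrics. successRate is a fraction in [0, 1].
interface ForwardingMetrics {
  queueDepth: number;
  dlqSize: number;
  successRate: number;
}

// Apply the Path 4 thresholds from the table:
// queue depth > 500, DLQ > 0, success rate < 95%.
function forwardingAlerts(m: ForwardingMetrics): string[] {
  const alerts: string[] = [];
  if (m.queueDepth > 500) alerts.push("queue_depth");
  if (m.dlqSize > 0) alerts.push("dlq_non_empty");
  if (m.successRate < 0.95) alerts.push("low_success_rate");
  return alerts;
}

// Usage
const healthy = forwardingAlerts({ queueDepth: 12, dlqSize: 0, successRate: 0.99 });
const degraded = forwardingAlerts({ queueDepth: 501, dlqSize: 1, successRate: 0.9 });
```

The comparisons are strict, so a queue depth of exactly 500 or a success rate of exactly 95% does not raise an alert.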

Multi-Tenancy Considerations

All event delivery paths enforce tenant isolation:
  • Path 1: fw_domain_events.organization_id scopes events; RLS policies prevent cross-tenant reads; trigger matching is org-scoped.
  • Path 2: Edge Function receives organization_id from authenticated JWT; all queries are tenant-filtered.
  • Path 3: Realtime channels include org ID in channel name (e.g., org:{uuid}:notifications); RLS on underlying tables.
  • Path 4: pf_event_subscriptions.organization_id ensures subscribers only receive their own events; consent checks are per-org.
  • Path 5: pgmq messages include organization_id in payload; consumers filter and validate.
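
For Path 3, the org:{uuid}:notifications naming convention can be built and checked with small helpers so that a channel name is never assembled from unvalidated input. This is a sketch under assumptions: `orgChannel` and `channelBelongsToOrg` are hypothetical helpers, and the allowed topic characters are an illustrative choice. Channel naming complements, and does not replace, RLS on the underlying tables.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Build a tenant-scoped channel name of the form org:{uuid}:{topic}.
// Assumption: topics are limited to lowercase letters, digits, "_" and "-".
function orgChannel(orgId: string, topic: string): string {
  if (!UUID_RE.test(orgId)) throw new Error(`invalid organization id: ${orgId}`);
  if (!/^[a-z0-9_-]+$/.test(topic)) throw new Error(`invalid topic: ${topic}`);
  return `org:${orgId}:${topic}`;
}

// Check that a channel name is scoped to the given organization.
function channelBelongsToOrg(channel: string, orgId: string): boolean {
  const parts = channel.split(":");
  return (
    parts.length === 3 &&
    parts[0] === "org" &&
    parts[1]?.toLowerCase() === orgId.toLowerCase()
  );
}

// Usage
const orgA = "123e4567-e89b-12d3-a456-426614174000";
const orgB = "123e4567-e89b-12d3-a456-426614174999";
const channelName = orgChannel(orgA, "notifications");
const sameOrg = channelBelongsToOrg(channelName, orgA);
const otherOrg = channelBelongsToOrg(channelName, orgB);
```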

| Reference | Title | Relevance |
|---|---|---|
| FW-16 | Workflow Engine Core | Defines fw_workflow_definitions, fw_workflow_executions, and step execution model used by Path 1 |
| FW-46 | Durable Execution Worker | Durable worker process, retry/state management for executions |
| FW-47 | Dead Letter Queue | Failed-message capture and inspection for Path 1 |
| PF-35 | Platform Event Bus & External Forwarding | Covers pf_event_subscriptions, webhook delivery, DLQ, and 42 CFR Part 2 consent guard (Path 4) |
| PF-10 | Supabase Realtime Integration | Platform patterns for Realtime channels, presence, and Postgres Changes subscriptions (Path 3) |
| PF-85 | pgmq Queue Infrastructure | Queue provisioning, consumer patterns, visibility timeout configuration, and DLQ management (Path 5) |

Revision History

| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-03-15 | Architecture Team | Initial document based on R-FW-12 recommendation |