Documentation Index
Fetch the complete documentation index at: https://docs.encoreos.io/llms.txt
Use this file to discover all available pages before exploring further.
Version: 1.0
Last Updated: 2026-03-15
Status: Architecture Reference
Related: PF_FW_BUSINESS_AUTOMATION_WORKFLOW_ARCHITECTURE_RESEARCH.md
Recommendation: R-FW-12
Overview
Encore Health OS uses 5 distinct event delivery paths to handle the full spectrum of
reactive behavior across the platform — from real-time UI updates to durable business
automation to external system integration. Each path makes different trade-offs around
durability, latency, retry semantics, and auditability.
Choosing the wrong path leads to subtle bugs: lost events, duplicate processing, blocked
request cycles, or audit gaps that surface during compliance reviews. This document gives
developers a clear framework for selecting the right delivery path for any given use case.
Key principle: prefer the least powerful path that satisfies your requirements. UI
updates do not need durable queues. Business workflows should not depend on WebSocket
connectivity. External integrations must not block database triggers.
Event Delivery Paths
Path 1: Table-Driven Domain Events (Primary)
Role: The backbone of Encore Health OS business automation. Domain events are the
canonical mechanism for triggering workflows, automation rules, and cross-core reactions.
Flow:
Application Code
│
▼
publishEvent(event_type, payload, org_id)
│
▼
INSERT → fw_domain_events table
│
▼
fw_process_domain_event() AFTER INSERT trigger
│
├── Pattern-match against fw_automation_rules
├── Pattern-match against fw_workflow_definitions (trigger config)
│
▼
INSERT → fw_workflow_executions (status = 'pending')
│
▼
Enqueue → workflow_execution_queue (pgmq)
│
▼
pg_cron worker picks up message (10-30s poll)
│
▼
Execute workflow steps → update execution status
Use when:
- Business automation triggers (e.g., “when a resident moves in, create billing record”)
- Workflow execution that must survive server restarts
- Cross-core reactions that need an audit trail
- Any event that compliance or governance may need to review
Guarantees:
| Property | Value |
|---|
| Durability | Persisted in PostgreSQL (fw_domain_events) |
| Delivery | At-least-once via pgmq worker retry |
| Ordering | Per-event-type ordering within an organization |
| Audit trail | Full — event, trigger match, execution log, step results |
| Idempotency | Consumer responsibility; event_id provided for dedup |
Key tables:
| Table | Purpose |
|---|
fw_domain_events | Immutable event log (event_type, payload, org_id, timestamp) |
fw_automation_rules | Configurable trigger-condition-action rules |
fw_workflow_definitions | Workflow templates with trigger configuration |
fw_workflow_executions | Running/completed workflow instances |
workflow_execution_queue | pgmq queue for pending execution pickup |
Latency: 10-30 seconds (pg_cron poll interval). Not suitable for user-facing
real-time feedback.
Example event types:
resident.moved_in, resident.moved_out, resident.status_changed
invoice.created, payment.received, claim.submitted
staff.onboarded, credential.expiring, shift.completed
Path 2: HTTP Event Consumer (Synchronous)
Role: Immediate, request-scoped side effects that must complete before responding to
the user.
Flow:
Application Code (browser or server)
│
▼
supabase.functions.invoke('event-consumer', {
body: { event_type, payload }
})
│
▼
Edge Function: event-consumer
│
├── Validate event
├── Route to handler
├── Execute side effect (send email, call API, update record)
│
▼
Synchronous HTTP response (success/failure)
│
▼
Caller handles response
Use when:
- Immediate side effects needed in the request cycle (e.g., send welcome email on signup)
- Real-time validation against external systems
- Operations where the caller needs confirmation of completion
- Simple request-response patterns that do not fan out
Guarantees:
| Property | Value |
|---|
| Durability | None — in-memory only during execution |
| Delivery | At-most-once (if Edge Function fails, caller gets error) |
| Ordering | N/A — single synchronous call |
| Audit trail | Only if the handler explicitly logs |
| Timeout | Supabase Edge Function limit (typically 30s) |
Latency: < 2 seconds for typical operations.
When NOT to use:
- Long-running workflows (> 10 seconds)
- Fan-out to multiple consumers
- Operations that must retry on failure without user intervention
Path 3: Supabase Realtime (UI Updates Only)
Role: Push updates to connected browser clients for live UI reactivity. This path is
strictly for display purposes and must never drive business logic.
Flow (Postgres Changes):
Database INSERT/UPDATE/DELETE
│
▼
Supabase Realtime (Postgres Changes listener)
│
▼
WebSocket broadcast to subscribed clients
│
▼
Client callback → React state update → UI re-render
Flow (Broadcast):
Application Code (any client)
│
▼
supabase.channel('room').send({ type: 'broadcast', event, payload })
│
▼
Supabase Realtime relay
│
▼
WebSocket broadcast to other clients in channel
│
▼
Client callback → React state update → UI re-render
Use when:
- Notification badge counts and live notification feeds
- Dashboard widgets that reflect latest data
- Presence indicators (who is online, who is editing)
- Collaborative editing cursors or selection highlights
- Live list/table updates without polling
Guarantees:
| Property | Value |
|---|
| Durability | None — ephemeral, not persisted |
| Delivery | Best-effort; clients miss events if disconnected |
| Ordering | Preserved per channel within a single connection |
| Audit trail | None |
| Reconnection | Client library auto-reconnects but missed events are lost |
Latency: < 500 milliseconds (WebSocket push).
NOT suitable for:
- Business logic execution
- Data consistency enforcement
- Audit-required operations
- Anything that must happen even if no client is connected
Path 4: External Event Forwarding (PF-35 Phase 2)
Role: Deliver events to external systems (EHR platforms, clearinghouses, payer portals,
partner organizations) via outbound webhooks with compliance safeguards.
Flow:
fw_domain_events INSERT
│
▼
fw_process_domain_event() trigger
│
├── Check pf_event_subscriptions for matching external subscribers
├── Apply glob pattern matching on event_type
│
▼
Match found
│
├── 42 CFR Part 2 consent guard check (substance abuse treatment data)
├── Apply JSONPath payload transformation (strip internal fields)
│
▼
Enqueue → event_forwarding_queue (pgmq)
│
▼
Delivery worker picks up message
│
├── POST to subscriber webhook URL
├── Verify response (2xx = success)
│
▼
├── Success → ACK message, log delivery
└── Failure → Retry (up to 3 attempts, exponential backoff)
│
└── Max retries exceeded → fw_dead_letter_queue
Use when:
- External EHR system integration (HL7 FHIR event notifications)
- Clearinghouse claim status callbacks
- Payer portal authorization updates
- Partner organization data sharing
- Webhook-based integrations with third-party platforms
Guarantees:
| Property | Value |
|---|
| Durability | Persisted via pgmq (event_forwarding_queue) |
| Delivery | At-least-once with retry (3 attempts, exponential backoff) |
| Ordering | Best-effort per subscriber; not strictly guaranteed |
| Audit trail | Full — subscription match, transformation, delivery attempts, DLQ |
| Dead letter | Failed deliveries move to fw_dead_letter_queue for manual review |
Key tables:
| Table | Purpose |
|---|
pf_event_subscriptions | External subscriber registration (URL, event patterns, transform) |
event_forwarding_queue | pgmq queue for pending deliveries |
fw_dead_letter_queue | Failed deliveries after max retries |
Compliance:
- 42 CFR Part 2: Before forwarding any event involving substance abuse treatment data,
the consent guard verifies that appropriate patient consent exists for the receiving
organization. Events without valid consent are blocked and logged.
- HIPAA: Payload transformation strips internal identifiers and limits PHI to the
minimum necessary for the subscriber’s stated purpose.
- Audit: Every forwarding attempt (success or failure) is logged with timestamp,
subscriber ID, response status, and payload hash.
Supports:
- Glob pattern matching on event types (e.g.,
resident.*, billing.payment.*)
- JSONPath payload transformation for per-subscriber field selection
- Per-subscriber retry configuration
- Webhook signature verification (HMAC-SHA256)
Latency: 30 seconds to 5 minutes depending on queue depth and retry schedule.
Path 5: pgmq Direct Queue (Internal Async)
Role: General-purpose internal async processing with guaranteed delivery and
backpressure support. Used when work must happen asynchronously but does not fit the
domain event / workflow model.
Flow:
Application Code (Edge Function, trigger, or worker)
│
▼
SELECT pgmq.send('queue_name', message_json)
│
▼
Message persisted in pgmq queue table
│
▼
Consumer: SELECT pgmq.read('queue_name', visibility_timeout, batch_size)
│
├── Message becomes invisible to other consumers
├── Process message
│
▼
├── Success → SELECT pgmq.archive('queue_name', msg_id) -- ACK
└── Failure → Message becomes visible again after timeout -- implicit NACK
│
└── Max retries exceeded → move to DLQ
Active queues in Encore Health OS:
| Queue | Purpose | Consumer |
|---|
workflow_execution_queue | Pending workflow executions | pg_cron workflow worker |
workflow_dlq | Failed workflow executions | Manual review / retry |
event_forwarding_queue | Outbound webhook deliveries | pg_cron forwarding worker |
notification_queue | Email, SMS, push notifications | Notification Edge Function |
pf_embedding_jobs | AI embedding generation | Embedding Edge Function |
Use when:
- Guaranteed async processing with backpressure
- Batch operations (e.g., nightly report generation, bulk imports)
- Rate-limited external API calls (e.g., clearinghouse submissions)
- Work that benefits from visibility timeout to prevent double-processing
- Fan-out from a single event to multiple independent work items
Guarantees:
| Property | Value |
|---|
| Durability | Persisted in PostgreSQL (pgmq tables) |
| Delivery | At-least-once; visibility timeout prevents double-processing |
| Ordering | FIFO within a single queue |
| Audit trail | Via pgmq archive tables (configurable retention) |
| Backpressure | Consumers control read batch size and poll interval |
Latency: Depends on consumer poll interval. Typically 5-60 seconds.
Decision Tree
Use this flowchart to select the appropriate event delivery path:
"I need to react to something happening in the system"
│
▼
┌───────────────────────┐
│ Is it a UI-only │
│ update? (badge count, │ YES ┌──────────────────────────┐
│ live list, presence) │──────────>│ Path 3: Supabase │
└───────────┬───────────┘ │ Realtime │
│ NO │ (WebSocket, best-effort) │
▼ └──────────────────────────┘
┌───────────────────────┐
│ Must it complete │
│ synchronously in the │ YES ┌──────────────────────────┐
│ current request │──────────>│ Path 2: HTTP Event │
│ cycle? │ │ Consumer │
└───────────┬───────────┘ │ (Edge Function, sync) │
│ NO └──────────────────────────┘
▼
┌───────────────────────┐
│ Does it trigger a │
│ business workflow or │ YES ┌──────────────────────────┐
│ automation rule? │──────────>│ Path 1: Table-Driven │
└───────────┬───────────┘ │ Domain Events │
│ NO │ (fw_domain_events, pgmq) │
▼ └──────────────────────────┘
┌───────────────────────┐
│ Does it need to reach │
│ an external system? │ YES ┌──────────────────────────┐
│ (EHR, payer, partner) │──────────>│ Path 4: External Event │
└───────────┬───────────┘ │ Forwarding │
│ NO │ (webhook + DLQ) │
▼ └──────────────────────────┘
┌───────────────────────┐
│ Is it async internal │
│ work? (embedding, │ YES ┌──────────────────────────┐
│ notification, batch) │──────────>│ Path 5: pgmq Direct │
└───────────┬───────────┘ │ Queue │
│ NO │ (internal async work) │
▼ └──────────────────────────┘
┌───────────────────────┐
│ Re-evaluate: you may │
│ need a combination of │
│ paths. See "Composed │
│ Patterns" below. │
└───────────────────────┘
Path Comparison Matrix
| Property | Path 1: Domain Events | Path 2: HTTP Consumer | Path 3: Realtime | Path 4: External Forwarding | Path 5: pgmq Direct |
|---|
| Durability | Persistent (DB) | None (in-flight) | None (ephemeral) | Persistent (pgmq) | Persistent (pgmq) |
| Latency | 10-30s | < 2s | < 500ms | 30s-5min | 5-60s |
| Delivery | At-least-once | At-most-once | Best-effort | At-least-once | At-least-once |
| Retry | Automatic (worker) | Caller responsibility | None | 3 attempts + DLQ | Visibility timeout |
| Audit trail | Full | Explicit only | None | Full | Archive tables |
| Ordering | Per-type/org | N/A | Per-channel | Best-effort | FIFO per queue |
| Multi-tenant | org_id scoped | Caller scoped | Channel scoped | Subscription scoped | Message scoped |
| Backpressure | Queue depth | Connection pool | Channel limits | Queue depth | Consumer batch size |
| Use case | Business automation | Immediate side effects | UI reactivity | External integration | Internal async work |
| Complexity | Medium | Low | Low | High | Medium |
Composed Patterns
Many real-world scenarios combine multiple paths. Here are common compositions:
Resident Move-In (Paths 1 + 3 + 5)
Resident record updated (status = 'active')
│
├── Path 1: publishEvent('resident.moved_in') → triggers billing workflow,
│ creates admission documentation tasks, updates census
│
├── Path 3: Realtime Postgres Changes → dashboard census widget updates live
│
└── Path 5: pgmq notification_queue → send welcome packet email + SMS
Claim Submission (Paths 1 + 4 + 2)
Claim finalized by billing staff
│
├── Path 1: publishEvent('claim.submitted') → audit trail, status tracking
│
├── Path 4: External forwarding → clearinghouse receives claim via webhook
│
└── Path 2: HTTP consumer → real-time eligibility verification response
shown to user before submission completes
Anti-Patterns
1. Using Realtime (Path 3) for Business Logic
Wrong:
// Subscribing to Realtime to trigger a billing workflow
supabase.channel('billing')
.on('postgres_changes', { table: 'invoices' }, (payload) => {
createBillingRecord(payload.new); // BUG: missed if client disconnects
})
Why it fails: Realtime is best-effort. If the browser tab closes, the billing record
is never created. There is no retry, no audit trail, and no guarantee of delivery.
Correct: Use Path 1 (domain events) to trigger the billing workflow server-side.
2. Using HTTP Consumer (Path 2) for Long-Running Workflows
Wrong:
// Calling an Edge Function that runs a 15-step approval workflow
const { data } = await supabase.functions.invoke('run-full-workflow', {
body: { workflowId, data }
});
// User waits 45 seconds... Edge Function times out
Why it fails: Edge Functions have execution time limits. Long workflows should be
broken into steps and processed asynchronously.
Correct: Use Path 1 (domain events) to enqueue the workflow, which the worker
processes step-by-step with checkpointing.
3. Bypassing Domain Events for Automatable Triggers
Wrong:
-- Direct INSERT in a trigger instead of going through domain events
CREATE TRIGGER create_billing_on_admission
AFTER INSERT ON rh_residents
FOR EACH ROW
EXECUTE FUNCTION create_billing_record_directly();
Why it fails: No audit trail, no pattern matching, no ability to configure or disable
the automation via the UI, no visibility into what triggered what.
Correct: Publish a domain event and let the automation engine match it to configured
rules.
4. Calling External APIs Synchronously from DB Triggers
Wrong:
-- Making an HTTP call inside a trigger
CREATE TRIGGER notify_ehr_on_discharge
AFTER UPDATE ON rh_residents
FOR EACH ROW
WHEN (NEW.status = 'discharged')
EXECUTE FUNCTION http_post_to_ehr(); -- blocks transaction if EHR is slow
Why it fails: Database triggers run inside the transaction. If the external API is
slow or down, the transaction is blocked or rolled back, affecting the user.
Correct: Use Path 4 (external event forwarding) to asynchronously deliver the event
to the EHR with retry and DLQ support.
5. Using pgmq for User-Facing Real-Time Feedback
Wrong:
// Polling pgmq from the browser to check if a task completed
setInterval(async () => {
const result = await checkQueueForMyResult();
if (result) updateUI(result);
}, 1000);
Why it fails: Polling adds unnecessary load and latency. The browser should not
interact with pgmq directly.
Correct: Use Path 5 (pgmq) for the async processing, then use Path 3 (Realtime) to
push the result to the UI when processing completes.
Failure Handling Summary
| Path | Failure Mode | Recovery |
|---|
| Path 1 | Worker crash mid-execution | pgmq visibility timeout re-exposes message; workflow resumes from last checkpoint |
| Path 2 | Edge Function error | HTTP error returned to caller; caller decides retry strategy |
| Path 3 | Client disconnect | Events lost; client re-fetches state on reconnect |
| Path 4 | Webhook delivery failure | Exponential backoff retry (3 attempts); dead letter queue; manual review dashboard |
| Path 5 | Consumer crash | Visibility timeout re-exposes message; next read picks it up |
Monitoring and Observability
Each path has different monitoring needs:
| Path | Key Metrics | Alert Thresholds |
|---|
| Path 1 | fw_domain_events unprocessed count, workflow execution duration, failure rate | > 100 unprocessed events, execution > 5 min, failure rate > 5% |
| Path 2 | Edge Function invocation latency, error rate | p95 > 5s, error rate > 2% |
| Path 3 | Realtime channel count, message throughput | > 1000 concurrent channels per org |
| Path 4 | event_forwarding_queue depth, DLQ size, delivery success rate | Queue depth > 500, DLQ > 0, success rate < 95% |
| Path 5 | Per-queue depth, processing latency, DLQ size | Queue-specific thresholds; any DLQ entry triggers alert |
Multi-Tenancy Considerations
All event delivery paths enforce tenant isolation:
- Path 1:
fw_domain_events.organization_id scopes events; RLS policies prevent cross-tenant reads; trigger matching is org-scoped.
- Path 2: Edge Function receives
organization_id from authenticated JWT; all queries are tenant-filtered.
- Path 3: Realtime channels include org ID in channel name (e.g.,
org:{uuid}:notifications); RLS on underlying tables.
- Path 4:
pf_event_subscriptions.organization_id ensures subscribers only receive their own events; consent checks are per-org.
- Path 5: pgmq messages include
organization_id in payload; consumers filter and validate.
| Reference | Title | Relevance |
|---|
| FW-16 | Workflow Engine Core | Defines fw_workflow_definitions, fw_workflow_executions, and step execution model used by Path 1 |
| FW-46 | Durable Execution Worker | Durable worker process, retry/state management for executions |
| FW-47 | Dead Letter Queue | Failed-message capture and inspection for Path 1 |
| PF-35 | Platform Event Bus & External Forwarding | Covers pf_event_subscriptions, webhook delivery, DLQ, and 42 CFR Part 2 consent guard (Path 4) |
| PF-10 | Supabase Realtime Integration | Platform patterns for Realtime channels, presence, and Postgres Changes subscriptions (Path 3) |
| PF-85 | pgmq Queue Infrastructure | Queue provisioning, consumer patterns, visibility timeout configuration, and DLQ management (Path 5) |
Revision History
| Version | Date | Author | Changes |
|---|
| 1.0 | 2026-03-15 | Architecture Team | Initial document based on R-FW-12 recommendation |