> ## Documentation Index
> Fetch the complete documentation index at: https://docs.encoreos.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Event Delivery Architecture

> Version: 1.0 Last Updated: 2026-03-15 Status: Architecture Reference Related: PF_FW_BUSINESS_AUTOMATION_WORKFLOW_ARCHITECTURE_RESEARCH.md Recommendation: R-FW-…

**Version:** 1.0
**Last Updated:** 2026-03-15
**Status:** Architecture Reference
**Related:** [PF\_FW\_BUSINESS\_AUTOMATION\_WORKFLOW\_ARCHITECTURE\_RESEARCH.md](PF_FW_BUSINESS_AUTOMATION_WORKFLOW_ARCHITECTURE_RESEARCH.md)
**Recommendation:** R-FW-12

***

## Overview

Encore Health OS uses **5 distinct event delivery paths** to handle the full spectrum of
reactive behavior across the platform -- from real-time UI updates to durable business
automation to external system integration. Each path makes different trade-offs around
durability, latency, retry semantics, and auditability.

Choosing the wrong path leads to subtle bugs: lost events, duplicate processing, blocked
request cycles, or audit gaps that surface during compliance reviews. This document gives
developers a clear framework for selecting the right delivery path for any given use case.

**Key principle:** prefer the *least powerful* path that satisfies your requirements. UI
updates do not need durable queues. Business workflows should not depend on WebSocket
connectivity. External integrations must not block database triggers.

***

## Event Delivery Paths

### Path 1: Table-Driven Domain Events (Primary)

**Role:** The backbone of Encore Health OS business automation. Domain events are the
canonical mechanism for triggering workflows, automation rules, and cross-core reactions.

**Flow:**

```
Application Code
    │
    ▼
publishEvent(event_type, payload, org_id)
    │
    ▼
INSERT → fw_domain_events table
    │
    ▼
fw_process_domain_event() AFTER INSERT trigger
    │
    ├── Pattern-match against fw_automation_rules
    ├── Pattern-match against fw_workflow_definitions (trigger config)
    │
    ▼
INSERT → fw_workflow_executions (status = 'pending')
    │
    ▼
Enqueue → workflow_execution_queue (pgmq)
    │
    ▼
pg_cron worker picks up message (10-30s poll)
    │
    ▼
Execute workflow steps → update execution status
```

**Use when:**

* Business automation triggers (e.g., "when a resident moves in, create billing record")
* Workflow execution that must survive server restarts
* Cross-core reactions that need an audit trail
* Any event that compliance or governance may need to review

**Guarantees:**

| Property    | Value                                                     |
| ----------- | --------------------------------------------------------- |
| Durability  | Persisted in PostgreSQL (`fw_domain_events`)              |
| Delivery    | At-least-once via pgmq worker retry                       |
| Ordering    | Per-event-type ordering within an organization            |
| Audit trail | Full -- event, trigger match, execution log, step results |
| Idempotency | Consumer responsibility; `event_id` provided for dedup    |

**Key tables:**

| Table                      | Purpose                                                        |
| -------------------------- | -------------------------------------------------------------- |
| `fw_domain_events`         | Immutable event log (event\_type, payload, org\_id, timestamp) |
| `fw_automation_rules`      | Configurable trigger-condition-action rules                    |
| `fw_workflow_definitions`  | Workflow templates with trigger configuration                  |
| `fw_workflow_executions`   | Running/completed workflow instances                           |
| `workflow_execution_queue` | pgmq queue for pending execution pickup                        |

**Latency:** 10-30 seconds (pg\_cron poll interval). Not suitable for user-facing
real-time feedback.

**Example event types:**

* `resident.moved_in`, `resident.moved_out`, `resident.status_changed`
* `invoice.created`, `payment.received`, `claim.submitted`
* `staff.onboarded`, `credential.expiring`, `shift.completed`

***

### Path 2: HTTP Event Consumer (Synchronous)

**Role:** Immediate, request-scoped side effects that must complete before responding to
the user.

**Flow:**

```
Application Code (browser or server)
    │
    ▼
supabase.functions.invoke('event-consumer', {
  body: { event_type, payload }
})
    │
    ▼
Edge Function: event-consumer
    │
    ├── Validate event
    ├── Route to handler
    ├── Execute side effect (send email, call API, update record)
    │
    ▼
Synchronous HTTP response (success/failure)
    │
    ▼
Caller handles response
```

**Use when:**

* Immediate side effects needed in the request cycle (e.g., send welcome email on signup)
* Real-time validation against external systems
* Operations where the caller needs confirmation of completion
* Simple request-response patterns that do not fan out

**Guarantees:**

| Property    | Value                                                    |
| ----------- | -------------------------------------------------------- |
| Durability  | None -- in-memory only during execution                  |
| Delivery    | At-most-once (if Edge Function fails, caller gets error) |
| Ordering    | N/A -- single synchronous call                           |
| Audit trail | Only if the handler explicitly logs                      |
| Timeout     | Supabase Edge Function limit (typically 30s)             |

**Latency:** \< 2 seconds for typical operations.

**When NOT to use:**

* Long-running workflows (> 10 seconds)
* Fan-out to multiple consumers
* Operations that must retry on failure without user intervention

***

### Path 3: Supabase Realtime (UI Updates Only)

**Role:** Push updates to connected browser clients for live UI reactivity. This path is
strictly for display purposes and must never drive business logic.

**Flow (Postgres Changes):**

```
Database INSERT/UPDATE/DELETE
    │
    ▼
Supabase Realtime (Postgres Changes listener)
    │
    ▼
WebSocket broadcast to subscribed clients
    │
    ▼
Client callback → React state update → UI re-render
```

**Flow (Broadcast):**

```
Application Code (any client)
    │
    ▼
supabase.channel('room').send({ type: 'broadcast', event, payload })
    │
    ▼
Supabase Realtime relay
    │
    ▼
WebSocket broadcast to other clients in channel
    │
    ▼
Client callback → React state update → UI re-render
```

**Use when:**

* Notification badge counts and live notification feeds
* Dashboard widgets that reflect latest data
* Presence indicators (who is online, who is editing)
* Collaborative editing cursors or selection highlights
* Live list/table updates without polling

**Guarantees:**

| Property     | Value                                                     |
| ------------ | --------------------------------------------------------- |
| Durability   | None -- ephemeral, not persisted                          |
| Delivery     | Best-effort; clients miss events if disconnected          |
| Ordering     | Preserved per channel within a single connection          |
| Audit trail  | None                                                      |
| Reconnection | Client library auto-reconnects but missed events are lost |

**Latency:** \< 500 milliseconds (WebSocket push).

**NOT suitable for:**

* Business logic execution
* Data consistency enforcement
* Audit-required operations
* Anything that must happen even if no client is connected

***

### Path 4: External Event Forwarding (PF-35 Phase 2)

**Role:** Deliver events to external systems (EHR platforms, clearinghouses, payer portals,
partner organizations) via outbound webhooks with compliance safeguards.

**Flow:**

```
fw_domain_events INSERT
    │
    ▼
fw_process_domain_event() trigger
    │
    ├── Check pf_event_subscriptions for matching external subscribers
    ├── Apply glob pattern matching on event_type
    │
    ▼
Match found
    │
    ├── 42 CFR Part 2 consent guard check (substance abuse treatment data)
    ├── Apply JSONPath payload transformation (strip internal fields)
    │
    ▼
Enqueue → event_forwarding_queue (pgmq)
    │
    ▼
Delivery worker picks up message
    │
    ├── POST to subscriber webhook URL
    ├── Verify response (2xx = success)
    │
    ▼
    ├── Success → ACK message, log delivery
    └── Failure → Retry (up to 3 attempts, exponential backoff)
              │
              └── Max retries exceeded → fw_dead_letter_queue
```

**Use when:**

* External EHR system integration (HL7 FHIR event notifications)
* Clearinghouse claim status callbacks
* Payer portal authorization updates
* Partner organization data sharing
* Webhook-based integrations with third-party platforms

**Guarantees:**

| Property    | Value                                                              |
| ----------- | ------------------------------------------------------------------ |
| Durability  | Persisted via pgmq (`event_forwarding_queue`)                      |
| Delivery    | At-least-once with retry (3 attempts, exponential backoff)         |
| Ordering    | Best-effort per subscriber; not strictly guaranteed                |
| Audit trail | Full -- subscription match, transformation, delivery attempts, DLQ |
| Dead letter | Failed deliveries move to `fw_dead_letter_queue` for manual review |

**Key tables:**

| Table                    | Purpose                                                           |
| ------------------------ | ----------------------------------------------------------------- |
| `pf_event_subscriptions` | External subscriber registration (URL, event patterns, transform) |
| `event_forwarding_queue` | pgmq queue for pending deliveries                                 |
| `fw_dead_letter_queue`   | Failed deliveries after max retries                               |

**Compliance:**

* **42 CFR Part 2:** Before forwarding any event involving substance abuse treatment data,
  the consent guard verifies that appropriate patient consent exists for the receiving
  organization. Events without valid consent are blocked and logged.
* **HIPAA:** Payload transformation strips internal identifiers and limits PHI to the
  minimum necessary for the subscriber's stated purpose.
* **Audit:** Every forwarding attempt (success or failure) is logged with timestamp,
  subscriber ID, response status, and payload hash.

**Supports:**

* Glob pattern matching on event types (e.g., `resident.*`, `billing.payment.*`)
* JSONPath payload transformation for per-subscriber field selection
* Per-subscriber retry configuration
* Webhook signature verification (HMAC-SHA256)

**Latency:** 30 seconds to 5 minutes depending on queue depth and retry schedule.

***

### Path 5: pgmq Direct Queue (Internal Async)

**Role:** General-purpose internal async processing with guaranteed delivery and
backpressure support. Used when work must happen asynchronously but does not fit the
domain event / workflow model.

**Flow:**

```
Application Code (Edge Function, trigger, or worker)
    │
    ▼
SELECT pgmq.send('queue_name', message_json)
    │
    ▼
Message persisted in pgmq queue table
    │
    ▼
Consumer: SELECT pgmq.read('queue_name', visibility_timeout, batch_size)
    │
    ├── Message becomes invisible to other consumers
    ├── Process message
    │
    ▼
    ├── Success → SELECT pgmq.archive('queue_name', msg_id)  -- ACK
    └── Failure → Message becomes visible again after timeout  -- implicit NACK
              │
              └── Max retries exceeded → move to DLQ
```

**Active queues in Encore Health OS:**

| Queue                      | Purpose                        | Consumer                   |
| -------------------------- | ------------------------------ | -------------------------- |
| `workflow_execution_queue` | Pending workflow executions    | pg\_cron workflow worker   |
| `workflow_dlq`             | Failed workflow executions     | Manual review / retry      |
| `event_forwarding_queue`   | Outbound webhook deliveries    | pg\_cron forwarding worker |
| `notification_queue`       | Email, SMS, push notifications | Notification Edge Function |
| `pf_embedding_jobs`        | AI embedding generation        | Embedding Edge Function    |

**Use when:**

* Guaranteed async processing with backpressure
* Batch operations (e.g., nightly report generation, bulk imports)
* Rate-limited external API calls (e.g., clearinghouse submissions)
* Work that benefits from visibility timeout to prevent double-processing
* Fan-out from a single event to multiple independent work items

**Guarantees:**

| Property     | Value                                                        |
| ------------ | ------------------------------------------------------------ |
| Durability   | Persisted in PostgreSQL (pgmq tables)                        |
| Delivery     | At-least-once; visibility timeout prevents double-processing |
| Ordering     | FIFO within a single queue                                   |
| Audit trail  | Via pgmq archive tables (configurable retention)             |
| Backpressure | Consumers control read batch size and poll interval          |

**Latency:** Depends on consumer poll interval. Typically 5-60 seconds.

***

## Decision Tree

Use this flowchart to select the appropriate event delivery path:

```
    "I need to react to something happening in the system"
                          │
                          ▼
              ┌───────────────────────┐
              │ Is it a UI-only       │
              │ update? (badge count, │    YES    ┌──────────────────────────┐
              │ live list, presence)  │──────────>│ Path 3: Supabase         │
              └───────────┬───────────┘           │ Realtime                 │
                          │ NO                    │ (WebSocket, best-effort) │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Must it complete      │
              │ synchronously in the  │    YES    ┌──────────────────────────┐
              │ current request       │──────────>│ Path 2: HTTP Event       │
              │ cycle?                │           │ Consumer                 │
              └───────────┬───────────┘           │ (Edge Function, sync)    │
                          │ NO                    └──────────────────────────┘
                          ▼
              ┌───────────────────────┐
              │ Does it trigger a     │
              │ business workflow or  │    YES    ┌──────────────────────────┐
              │ automation rule?      │──────────>│ Path 1: Table-Driven     │
              └───────────┬───────────┘           │ Domain Events            │
                          │ NO                    │ (fw_domain_events, pgmq) │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Does it need to reach │
              │ an external system?   │    YES    ┌──────────────────────────┐
              │ (EHR, payer, partner) │──────────>│ Path 4: External Event   │
              └───────────┬───────────┘           │ Forwarding               │
                          │ NO                    │ (webhook + DLQ)          │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Is it async internal  │
              │ work? (embedding,     │    YES    ┌──────────────────────────┐
              │ notification, batch)  │──────────>│ Path 5: pgmq Direct     │
              └───────────┬───────────┘           │ Queue                    │
                          │ NO                    │ (internal async work)    │
                          ▼                       └──────────────────────────┘
              ┌───────────────────────┐
              │ Re-evaluate: you may  │
              │ need a combination of │
              │ paths. See "Composed  │
              │ Patterns" below.      │
              └───────────────────────┘
```

***

## Path Comparison Matrix

| Property         | Path 1: Domain Events | Path 2: HTTP Consumer  | Path 3: Realtime | Path 4: External Forwarding | Path 5: pgmq Direct |
| ---------------- | --------------------- | ---------------------- | ---------------- | --------------------------- | ------------------- |
| **Durability**   | Persistent (DB)       | None (in-flight)       | None (ephemeral) | Persistent (pgmq)           | Persistent (pgmq)   |
| **Latency**      | 10-30s                | \< 2s                  | \< 500ms         | 30s-5min                    | 5-60s               |
| **Delivery**     | At-least-once         | At-most-once           | Best-effort      | At-least-once               | At-least-once       |
| **Retry**        | Automatic (worker)    | Caller responsibility  | None             | 3 attempts + DLQ            | Visibility timeout  |
| **Audit trail**  | Full                  | Explicit only          | None             | Full                        | Archive tables      |
| **Ordering**     | Per-type/org          | N/A                    | Per-channel      | Best-effort                 | FIFO per queue      |
| **Multi-tenant** | org\_id scoped        | Caller scoped          | Channel scoped   | Subscription scoped         | Message scoped      |
| **Backpressure** | Queue depth           | Connection pool        | Channel limits   | Queue depth                 | Consumer batch size |
| **Use case**     | Business automation   | Immediate side effects | UI reactivity    | External integration        | Internal async work |
| **Complexity**   | Medium                | Low                    | Low              | High                        | Medium              |

***

## Composed Patterns

Many real-world scenarios combine multiple paths. Here are common compositions:

### Resident Move-In (Paths 1 + 3 + 5)

```
Resident record updated (status = 'active')
    │
    ├── Path 1: publishEvent('resident.moved_in') → triggers billing workflow,
    │           creates admission documentation tasks, updates census
    │
    ├── Path 3: Realtime Postgres Changes → dashboard census widget updates live
    │
    └── Path 5: pgmq notification_queue → send welcome packet email + SMS
```

### Claim Submission (Paths 1 + 4 + 2)

```
Claim finalized by billing staff
    │
    ├── Path 1: publishEvent('claim.submitted') → audit trail, status tracking
    │
    ├── Path 4: External forwarding → clearinghouse receives claim via webhook
    │
    └── Path 2: HTTP consumer → real-time eligibility verification response
                shown to user before submission completes
```

***

## Anti-Patterns

### 1. Using Realtime (Path 3) for Business Logic

**Wrong:**

```
// Subscribing to Realtime to trigger a billing workflow
supabase.channel('billing')
  .on('postgres_changes', { table: 'invoices' }, (payload) => {
    createBillingRecord(payload.new);  // BUG: missed if client disconnects
  })
```

**Why it fails:** Realtime is best-effort. If the browser tab closes, the billing record
is never created. There is no retry, no audit trail, and no guarantee of delivery.

**Correct:** Use Path 1 (domain events) to trigger the billing workflow server-side.

***

### 2. Using HTTP Consumer (Path 2) for Long-Running Workflows

**Wrong:**

```
// Calling an Edge Function that runs a 15-step approval workflow
const { data } = await supabase.functions.invoke('run-full-workflow', {
  body: { workflowId, data }
});
// User waits 45 seconds... Edge Function times out
```

**Why it fails:** Edge Functions have execution time limits. Long workflows should be
broken into steps and processed asynchronously.

**Correct:** Use Path 1 (domain events) to enqueue the workflow, which the worker
processes step-by-step with checkpointing.

***

### 3. Bypassing Domain Events for Automatable Triggers

**Wrong:**

```sql theme={null}
-- Direct INSERT in a trigger instead of going through domain events
CREATE TRIGGER create_billing_on_admission
AFTER INSERT ON rh_residents
FOR EACH ROW
EXECUTE FUNCTION create_billing_record_directly();
```

**Why it fails:** No audit trail, no pattern matching, no ability to configure or disable
the automation via the UI, no visibility into what triggered what.

**Correct:** Publish a domain event and let the automation engine match it to configured
rules.

***

### 4. Calling External APIs Synchronously from DB Triggers

**Wrong:**

```sql theme={null}
-- Making an HTTP call inside a trigger
CREATE TRIGGER notify_ehr_on_discharge
AFTER UPDATE ON rh_residents
FOR EACH ROW
WHEN (NEW.status = 'discharged')
EXECUTE FUNCTION http_post_to_ehr();  -- blocks transaction if EHR is slow
```

**Why it fails:** Database triggers run inside the transaction. If the external API is
slow or down, the transaction is blocked or rolled back, affecting the user.

**Correct:** Use Path 4 (external event forwarding) to asynchronously deliver the event
to the EHR with retry and DLQ support.

***

### 5. Using pgmq for User-Facing Real-Time Feedback

**Wrong:**

```
// Polling pgmq from the browser to check if a task completed
setInterval(async () => {
  const result = await checkQueueForMyResult();
  if (result) updateUI(result);
}, 1000);
```

**Why it fails:** Polling adds unnecessary load and latency. The browser should not
interact with pgmq directly.

**Correct:** Use Path 5 (pgmq) for the async processing, then use Path 3 (Realtime) to
push the result to the UI when processing completes.

***

## Failure Handling Summary

| Path   | Failure Mode               | Recovery                                                                           |
| ------ | -------------------------- | ---------------------------------------------------------------------------------- |
| Path 1 | Worker crash mid-execution | pgmq visibility timeout re-exposes message; workflow resumes from last checkpoint  |
| Path 2 | Edge Function error        | HTTP error returned to caller; caller decides retry strategy                       |
| Path 3 | Client disconnect          | Events lost; client re-fetches state on reconnect                                  |
| Path 4 | Webhook delivery failure   | Exponential backoff retry (3 attempts); dead letter queue; manual review dashboard |
| Path 5 | Consumer crash             | Visibility timeout re-exposes message; next read picks it up                       |

***

## Monitoring and Observability

Each path has different monitoring needs:

| Path   | Key Metrics                                                                     | Alert Thresholds                                               |
| ------ | ------------------------------------------------------------------------------- | -------------------------------------------------------------- |
| Path 1 | `fw_domain_events` unprocessed count, workflow execution duration, failure rate | > 100 unprocessed events, execution > 5 min, failure rate > 5% |
| Path 2 | Edge Function invocation latency, error rate                                    | p95 > 5s, error rate > 2%                                      |
| Path 3 | Realtime channel count, message throughput                                      | > 1000 concurrent channels per org                             |
| Path 4 | `event_forwarding_queue` depth, DLQ size, delivery success rate                 | Queue depth > 500, DLQ > 0, success rate \< 95%                |
| Path 5 | Per-queue depth, processing latency, DLQ size                                   | Queue-specific thresholds; any DLQ entry triggers alert        |

***

## Multi-Tenancy Considerations

All event delivery paths enforce tenant isolation:

* **Path 1:** `fw_domain_events.organization_id` scopes events; RLS policies prevent cross-tenant reads; trigger matching is org-scoped.
* **Path 2:** Edge Function receives `organization_id` from authenticated JWT; all queries are tenant-filtered.
* **Path 3:** Realtime channels include org ID in channel name (e.g., `org:{uuid}:notifications`); RLS on underlying tables.
* **Path 4:** `pf_event_subscriptions.organization_id` ensures subscribers only receive their own events; consent checks are per-org.
* **Path 5:** pgmq messages include `organization_id` in payload; consumers filter and validate.

***

## Related Specifications

| Reference | Title                                    | Relevance                                                                                            |
| --------- | ---------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| **FW-16** | Workflow Engine Core                     | Defines `fw_workflow_definitions`, `fw_workflow_executions`, and step execution model used by Path 1 |
| **FW-46** | Durable Execution Worker                 | Durable worker process, retry/state management for executions                                        |
| **FW-47** | Dead Letter Queue                        | Failed-message capture and inspection for Path 1                                                     |
| **PF-35** | Platform Event Bus & External Forwarding | Covers `pf_event_subscriptions`, webhook delivery, DLQ, and 42 CFR Part 2 consent guard (Path 4)     |
| **PF-10** | Supabase Realtime Integration            | Platform patterns for Realtime channels, presence, and Postgres Changes subscriptions (Path 3)       |
| **PF-85** | pgmq Queue Infrastructure                | Queue provisioning, consumer patterns, visibility timeout configuration, and DLQ management (Path 5) |

***

## Revision History

| Version | Date       | Author            | Changes                                          |
| ------- | ---------- | ----------------- | ------------------------------------------------ |
| 1.0     | 2026-03-15 | Architecture Team | Initial document based on R-FW-12 recommendation |
