Spec: PF-07 Phase 2 & 3 — Error Handling & Monitoring
Last Updated: 2026-03-14
1. Overview
Encore Health OS uses Sentry for error tracking, performance monitoring, and session replay. Custom performance metrics are stored in pf_health_metrics via the platform performanceMonitor.
Architecture
┌─────────────┐   errors, traces    ┌─────────────┐
│  React App  │ ──────────────────→ │   Sentry    │
│  (browser)  │   replay, logs      │  Dashboard  │
└─────────────┘                     └─────────────┘
       │
       │ page load, custom marks,
       │ API histograms
       ▼
┌─────────────────────┐
│  pf_health_metrics  │
│     (Supabase)      │
└─────────────────────┘
2. Sentry Configuration
File: src/platform/monitoring/sentry.ts
| Setting | Value | Notes |
|---|---|---|
| tracesSampleRate | 0.5 (50%) | Auth/billing/clinical/HR-payroll forced to 1.0 |
| profilesSampleRate | 0.1 (10%) | JS Self-Profiling API |
| replaysSessionSampleRate | 0.0 | Off by default |
| replaysOnErrorSampleRate | 1.0 | 100% on errors |
| enableLogs | true | Structured log search |
| enableMetrics | true | Custom metrics (SDK 10.25+) |
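One way to force critical areas to 1.0 while everything else stays at 0.5 is a sampler function of the kind Sentry accepts through its `tracesSampler` option. The sketch below is not the contents of sentry.ts: the route prefixes and the `SamplingContextLike` type are assumptions made for illustration, and the real sampling context carries more fields.

```typescript
// Hypothetical route prefixes; the actual list in sentry.ts may differ.
const CRITICAL_ROUTE_PREFIXES = ["/auth", "/billing", "/clinical", "/hr-payroll"];

// Default rate from the table above.
const DEFAULT_TRACES_SAMPLE_RATE = 0.5;

// Minimal stand-in for the context object passed to a traces sampler.
interface SamplingContextLike {
  name: string; // transaction name, assumed here to be a route path
}

// Returns 1.0 for critical routes and the default rate for everything else.
function tracesSampler(ctx: SamplingContextLike): number {
  const isCritical = CRITICAL_ROUTE_PREFIXES.some(
    (prefix) => ctx.name === prefix || ctx.name.startsWith(prefix + "/"),
  );
  return isCritical ? 1.0 : DEFAULT_TRACES_SAMPLE_RATE;
}
```

Matching on `prefix + "/"` rather than the bare prefix keeps an unrelated route such as `/authors` at the default rate.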
PHI Scrubbing
The beforeSend callback:
- Truncates all event/exception messages to 500 characters
- Strips emails, phone numbers, SSNs, DOBs via regex
- Drops breadcrumb messages matching PHI patterns
- Only UUIDs are sent as user/org context — never names, emails, or clinical data
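The scrubbing rules above can be sketched as a pair of pure helpers. The regular expressions here are simplified assumptions (US-style phone numbers and SSNs, two common date formats), not the patterns used in the actual `beforeSend` callback, and the function names are hypothetical.

```typescript
const MAX_MESSAGE_LENGTH = 500;

// Illustrative patterns only; production patterns are likely broader.
const PHI_PATTERNS: RegExp[] = [
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, // email address
  /\b\d{3}-\d{2}-\d{4}\b/g, // SSN, e.g. 123-45-6789
  /\b\d{1,2}\/\d{1,2}\/\d{4}\b|\b\d{4}-\d{2}-\d{2}\b/g, // DOB as MM/DD/YYYY or YYYY-MM-DD
  /(?<!\d)(?:\+?1[-. ]?)?(?:\(\d{3}\)|\d{3})[-. ]?\d{3}[-. ]?\d{4}(?!\d)/g, // US phone
];

// Replaces PHI-like substrings, then truncates to the 500-character limit.
function scrubPhi(text: string): string {
  let out = text;
  for (const pattern of PHI_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]");
  }
  return out.length > MAX_MESSAGE_LENGTH ? out.slice(0, MAX_MESSAGE_LENGTH) : out;
}

// True when a string (for example a breadcrumb message) matches any PHI pattern.
function containsPhi(text: string): boolean {
  return PHI_PATTERNS.some((pattern) => {
    pattern.lastIndex = 0; // global regexes keep state between test() calls
    return pattern.test(text);
  });
}
```

Scrubbing before truncating is a deliberate choice in this sketch: truncating first could cut an email or SSN in half so that the pattern no longer matches the remaining fragment.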
Source Maps
Source maps are uploaded via @sentry/vite-plugin during the Vercel build. The SENTRY_AUTH_TOKEN, SENTRY_ORG, and SENTRY_PROJECT environment variables must be set in the Vercel project settings.
Verification: After a deploy, check Sentry → Settings → Source Maps → Artifacts to confirm the release has uploaded maps.
3. Alerting Thresholds
Error Rate
| Metric | Warning | Critical | Action |
|---|---|---|---|
| Error rate (events/min) | > 10/min | > 50/min | Check Sentry Issues feed; page on-call if critical |
| Unique issues (new/hour) | > 5 | > 15 | Review new issues for regressions |
| Unhandled rejection rate | > 1% of sessions | > 5% | Investigate JS errors in production |
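As an illustration of how the warning and critical columns might be applied in code, the sketch below classifies a measured value against a threshold pair. The object keys and the `classify` helper are hypothetical; only the numbers come from the table.

```typescript
type Severity = "ok" | "warning" | "critical";

interface Threshold {
  warning: number;
  critical: number;
}

// Values copied from the Error Rate table; the key names are illustrative.
const ERROR_THRESHOLDS = {
  eventsPerMinute: { warning: 10, critical: 50 },
  newIssuesPerHour: { warning: 5, critical: 15 },
  unhandledRejectionPct: { warning: 1, critical: 5 },
};

// Both bounds are strict ("> 10/min"), so a value equal to a bound stays in the lower band.
function classify(value: number, threshold: Threshold): Severity {
  if (value > threshold.critical) return "critical";
  if (value > threshold.warning) return "warning";
  return "ok";
}
```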
Web Vitals
| Metric | Good | Needs Improvement | Poor |
|---|---|---|---|
| LCP (Largest Contentful Paint) | ≤ 2.5s | 2.5–4.0s | > 4.0s |
| INP (Interaction to Next Paint) | ≤ 200ms | 200–500ms | > 500ms |
| CLS (Cumulative Layout Shift) | ≤ 0.1 | 0.1–0.25 | > 0.25 |
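The three rating bands above map onto a small lookup. This is a sketch with hypothetical names; it assumes LCP and INP are reported in milliseconds and CLS as a unitless score.

```typescript
type VitalName = "LCP" | "INP" | "CLS";
type Rating = "good" | "needs-improvement" | "poor";

// [upper bound of "good", upper bound of "needs improvement"].
// LCP and INP are in milliseconds; CLS is unitless.
const VITAL_BOUNDS: Record<VitalName, [number, number]> = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
};

function rateVital(name: VitalName, value: number): Rating {
  const [goodMax, needsImprovementMax] = VITAL_BOUNDS[name];
  if (value <= goodMax) return "good";
  if (value <= needsImprovementMax) return "needs-improvement";
  return "poor";
}
```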
API Performance
| Metric | Warning | Critical |
|---|---|---|
| p95 API latency | > 2s | > 5s |
| API error rate (5xx) | > 1% | > 5% |
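To make the p95 threshold concrete, the sketch below computes a nearest-rank percentile over a window of latency samples and a 5xx share over a window of status codes. How the platform actually aggregates these values is not stated here, so the functions and the windowing are assumptions.

```typescript
type ApiSeverity = "ok" | "warning" | "critical";

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile needs at least one sample");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// p95 latency: warning above 2s, critical above 5s.
function apiLatencySeverity(latenciesMs: number[]): ApiSeverity {
  const p95 = percentile(latenciesMs, 95);
  if (p95 > 5000) return "critical";
  if (p95 > 2000) return "warning";
  return "ok";
}

// 5xx share of all responses: warning above 1%, critical above 5%.
function apiErrorRateSeverity(statusCodes: number[]): ApiSeverity {
  if (statusCodes.length === 0) return "ok";
  const serverErrors = statusCodes.filter((s) => s >= 500 && s <= 599).length;
  const pct = (serverErrors / statusCodes.length) * 100;
  if (pct > 5) return "critical";
  if (pct > 1) return "warning";
  return "ok";
}
```

A single slow request in a window of twenty does not move the p95, which is why the threshold is defined on a percentile rather than on the maximum.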
4. Dashboards
Sentry Project
- Issues: Real-time error feed with stack traces and session replay
- Performance: Transaction duration, Web Vitals, throughput
- Replays: Session recordings for error context
- Logs: Structured log search (Sentry.logger.*)
Key Sentry Queries
# High-frequency errors in the last hour
is:unresolved times_seen:>10 firstSeen:-1h
# Errors on auth routes
transaction:/auth/* is:unresolved
# Clinical module errors
module:cl is:unresolved
Custom metrics in pf_health_metrics (Supabase):
- Page load timing (page_load, dom_ready)
- Custom marks (startMark/endMark)
- API response time histograms
Query via Supabase dashboard or the platform health module.
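For readers unfamiliar with mark-based timing, here is a minimal sketch of how a `startMark`/`endMark` pair could produce a duration sample. Only the two method names come from this document; the class, its signatures, and the injected clock are assumptions, and the real performanceMonitor may work differently.

```typescript
interface MetricSample {
  name: string;
  durationMs: number;
}

// Hypothetical timer; the clock is injected so the logic can be tested deterministically.
class MarkTimer {
  private readonly starts = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Records the start time for a named mark, overwriting any earlier start.
  startMark(name: string): void {
    this.starts.set(name, this.now());
  }

  // Returns the elapsed time since startMark, or undefined if the mark was never started.
  endMark(name: string): MetricSample | undefined {
    const startedAt = this.starts.get(name);
    if (startedAt === undefined) return undefined;
    this.starts.delete(name);
    return { name, durationMs: this.now() - startedAt };
  }
}
```

In a browser the clock would typically be `() => performance.now()`.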
5. Error Boundaries
The application uses a layered error boundary strategy:
| Level | Location | Behavior |
|---|---|---|
| Global (root) | main.tsx | Catches catastrophic failures; shows full-page fallback |
| Global (app) | App.tsx | Defense-in-depth; catches errors inside providers |
| Feature | RouteLoader.tsx | Per-module isolation; module crash doesn’t break other routes |
| Component | Individual components | Optional; for non-critical widgets |
The double global boundary (main.tsx + App.tsx) is intentional — the outer boundary catches errors that occur during provider initialization.
6. Correlation IDs
Every auth state change (sign-in, sign-out, token refresh) generates a correlation_id via crypto.randomUUID(). This ID is:
- Set in the logger context for all subsequent logs
- Included in structured log entries
- Useful for tracing a user session across log entries
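The flow above can be sketched as follows. The `ContextLogger` class, the event names, and the `handleAuthStateChange` function are hypothetical stand-ins for the platform logger and auth listener; the ID generator is passed in so the example stays deterministic.

```typescript
type AuthEvent = "sign-in" | "sign-out" | "token-refresh";

interface LogEntry {
  level: "info" | "error";
  message: string;
  correlation_id?: string;
}

// Hypothetical logger that stamps every entry with the current correlation ID.
class ContextLogger {
  readonly entries: LogEntry[] = [];
  private correlationId: string | undefined;

  setCorrelationId(id: string): void {
    this.correlationId = id;
  }

  info(message: string): void {
    this.entries.push({ level: "info", message, correlation_id: this.correlationId });
  }
}

// Generates a fresh ID on each auth state change and stores it in the logger context.
function handleAuthStateChange(
  logger: ContextLogger,
  event: AuthEvent,
  newId: () => string,
): string {
  const id = newId();
  logger.setCorrelationId(id);
  logger.info(`auth state change: ${event}`);
  return id;
}
```

In the application, `newId` would be `() => crypto.randomUUID()`, matching the generator named above.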
7. Escalation Procedure
- P3 (Low): New non-critical issue appears in Sentry → assign to the relevant core team at the next standup
- P2 (Medium): Error rate crosses the warning threshold → investigate within 4 hours
- P1 (High): Error rate crosses the critical threshold, or auth/billing errors occur → investigate within 1 hour
- P0 (Critical): Application-wide crash or data integrity issue → page on-call immediately
8. Maintenance
Sentry Housekeeping
- Review and resolve/archive stale issues monthly
- Update ignoreErrors patterns when new non-actionable errors are identified
- Verify source map uploads after Vite/build tool upgrades
- Review sampling rates quarterly (adjust based on event volume and budget)
- performanceMonitor flushes metrics to pf_health_metrics every 30 seconds
- Metrics are sampled at 10% in production, 100% in development
- Stale metrics can be cleaned up via SQL on pf_health_metrics
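The sampling and flush behavior described above can be pictured as a sampled in-memory buffer that is drained on a timer. This is a sketch under assumptions: the class name, the per-metric sampling decision, and the `sink` callback (standing in for the write to pf_health_metrics) are all hypothetical.

```typescript
interface Metric {
  name: string;
  value: number;
}

const FLUSH_INTERVAL_MS = 30000; // 30-second flush cadence

interface BufferOptions {
  isProduction: boolean;
  sink: (batch: Metric[]) => void; // stand-in for the insert into pf_health_metrics
  random?: () => number; // injectable for deterministic tests
}

class SampledMetricBuffer {
  private buffer: Metric[] = [];
  private readonly sampleRate: number;
  private readonly random: () => number;
  private readonly sink: (batch: Metric[]) => void;

  constructor(opts: BufferOptions) {
    this.sampleRate = opts.isProduction ? 0.1 : 1.0; // 10% in production, 100% in development
    this.random = opts.random ?? Math.random;
    this.sink = opts.sink;
  }

  // Keeps the metric only if it falls inside the sample; returns whether it was kept.
  record(metric: Metric): boolean {
    if (this.random() >= this.sampleRate) return false;
    this.buffer.push(metric);
    return true;
  }

  // Sends the buffered batch to the sink and clears the buffer; returns the batch size.
  flush(): number {
    if (this.buffer.length === 0) return 0;
    const batch = this.buffer;
    this.buffer = [];
    this.sink(batch);
    return batch.length;
  }
}
```

A caller would schedule the drain with `setInterval(() => buffer.flush(), FLUSH_INTERVAL_MS)`. Swapping the buffer before calling the sink means metrics recorded during a slow write land in the next batch rather than being lost.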