Operational procedures for responding to system health alerts.

Severity Levels

| Level | Response Time | Action |
|---|---|---|
| Critical | < 15 minutes | Immediate investigation, escalate if needed |
| Warning | < 1 hour | Investigate root cause |
| Info | Next business day | Review and document |
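As a rough illustration of the table above, the response windows can be turned into a deadline calculation. This is a minimal sketch, not part of any existing tooling: the function name `response_deadline` is hypothetical, and the handling of "next business day" (Monday to Friday, end of day) is an assumption.

```python
from datetime import datetime, timedelta

# Fixed response windows from the severity table.
RESPONSE_WINDOWS = {
    "critical": timedelta(minutes=15),
    "warning": timedelta(hours=1),
}


def response_deadline(severity: str, fired_at: datetime) -> datetime:
    """Return the latest time by which the alert should be picked up."""
    level = severity.lower()
    if level in RESPONSE_WINDOWS:
        return fired_at + RESPONSE_WINDOWS[level]
    if level == "info":
        # Assumption: business days are Monday to Friday, and the
        # deadline is the end of the next business day.
        day = fired_at + timedelta(days=1)
        while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            day += timedelta(days=1)
        return day.replace(hour=23, minute=59, second=59, microsecond=0)
    raise ValueError(f"unknown severity: {severity!r}")
```

A Critical alert fired at 10:00 would be due by 10:15 the same day; an Info alert fired on a Friday would roll over to the following Monday.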

Alert Response Procedures

API Response Time High (> 500ms p95)

Symptoms: Slow page loads, timeouts

Steps:
  1. Check server resource utilization (CPU, memory)
  2. Review slow query logs in database
  3. Check for traffic spikes
  4. Verify no ongoing deployments
  5. Consider scaling resources if the high latency is sustained
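To confirm the alert condition by hand, the p95 can be recomputed from raw latency samples. The sketch below uses the nearest-rank definition of a percentile, which may differ slightly from what the monitoring system uses; the function names are hypothetical.

```python
def p95(latencies_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    # 1-based nearest rank: ceil(0.95 * n), computed with integers
    # to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_alert(latencies_ms: list[float], threshold_ms: float = 500.0) -> bool:
    """True when the p95 exceeds the alert threshold (500 ms by default)."""
    return p95(latencies_ms) > threshold_ms
```

Because p95 ignores the slowest 5% of requests, a handful of outliers will not trip the alert, but a slow tail affecting more than 5% of traffic will.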

Error Rate High (> 5%)

Symptoms: User-facing errors, failed operations

Steps:
  1. Check application logs for error patterns
  2. Identify affected endpoints
  3. Review recent deployments
  4. Roll back if the errors are deployment-related
  5. Fix and deploy a patch if the cause is a code issue
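Steps 1 and 2 can be approximated by grouping request records by endpoint and comparing each error rate with the 5% threshold. This is a sketch under assumptions: the input format of `(endpoint, status_code)` pairs is hypothetical, and counting only 5xx responses as errors is a choice the runbook does not specify.

```python
from collections import defaultdict


def error_rates_by_endpoint(requests):
    """Map each endpoint to its error rate.

    `requests` is an iterable of (endpoint, status_code) pairs.
    Assumption: only 5xx responses count as errors.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for endpoint, status in requests:
        totals[endpoint] += 1
        if status >= 500:
            errors[endpoint] += 1
    return {ep: errors[ep] / totals[ep] for ep in totals}


def endpoints_over_threshold(requests, threshold=0.05):
    """Endpoints whose error rate exceeds the threshold, sorted by name."""
    rates = error_rates_by_endpoint(requests)
    return sorted(ep for ep, rate in rates.items() if rate > threshold)
```

If only one or two endpoints appear in the result, the problem is more likely a code path than an infrastructure-wide failure.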

Database Query Slow (> 100ms avg)

Symptoms: Slow data operations, timeouts

Steps:
  1. Identify slow queries in logs
  2. Check for missing indexes
  3. Review query plans with EXPLAIN
  4. Optimize queries or add indexes
  5. Consider read replicas for heavy reads
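The effect of steps 2 to 4 can be demonstrated in miniature. The runbook does not name the database engine, so this sketch uses SQLite's `EXPLAIN QUERY PLAN` from the Python standard library as a stand-in; the `orders` table, its columns, and the index name are hypothetical. Other engines expose plans with their own `EXPLAIN` syntax and output.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
sql = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index on customer_id, the plan is a full table scan.
before = query_plan(conn, sql, (42,))

# After adding the index, the plan should switch to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = query_plan(conn, sql, (42,))

print("before:", before)
print("after: ", after)
```

A plan step that scans the whole table for a selective filter is the usual sign of a missing index; once the index exists, the plan should reference it by name.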

LCP Poor (> 2.5s)

Symptoms: Slow initial page render

Steps:
  1. Check bundle sizes
  2. Review image optimization
  3. Verify lazy loading implementation
  4. Check CDN performance
  5. Optimize critical render path
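For step 1, a quick way to spot oversized bundles is to scan the build output for files above a size budget. This is a sketch only: the 250 kB default budget, the `*.js` pattern, and the function name are assumptions, not values taken from this runbook.

```python
from pathlib import Path


def oversized_bundles(build_dir, budget_bytes=250_000, pattern="*.js"):
    """Return (relative path, size in bytes) for files over budget, largest first."""
    root = Path(build_dir)
    sizes = [
        (str(path.relative_to(root)), path.stat().st_size)
        for path in root.rglob(pattern)
        if path.is_file()
    ]
    over = [entry for entry in sizes if entry[1] > budget_bytes]
    return sorted(over, key=lambda entry: entry[1], reverse=True)
```

Pointing it at the build output directory, for example `oversized_bundles("dist")`, lists the files most worth splitting or lazy loading first.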

Integration Unhealthy

Symptoms: Third-party service failures

Steps:
  1. Verify integration credentials
  2. Check service status page
  3. Test API connectivity
  4. Review rate limits
  5. Contact the provider if the failures persist
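The checks above map fairly directly onto the HTTP status returned by a connectivity probe. The sketch below uses only the Python standard library; the category labels and function names are hypothetical, and any URL passed to `probe` would be the provider's own health or status endpoint, which this runbook does not specify.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, and so on


def classify_status(status: Optional[int]) -> str:
    """Map a probe result to the runbook step it points at."""
    if status is None:
        return "unreachable"      # steps 2 and 3: status page, connectivity
    if status in (401, 403):
        return "credentials"      # step 1: verify integration credentials
    if status == 429:
        return "rate_limited"     # step 4: review rate limits
    if status >= 500:
        return "provider_error"   # steps 2 and 5: status page, contact provider
    if 200 <= status < 300:
        return "healthy"
    return "other"
```

Keeping the classification separate from the network call makes it easy to reuse against status codes already recorded in application logs.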

Escalation Matrix

| Condition | Escalate To |
|---|---|
| Multiple critical alerts | Engineering Lead |
| Data loss risk | CTO + Engineering Lead |
| Security incident | Security Team + CTO |
| Provider outage | Vendor Contact |
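When several conditions apply at once, the matrix can be expressed as a lookup that merges the contacts without duplicates. This is a minimal sketch; the condition keys and the function name are hypothetical labels for the rows above.

```python
# Escalation targets copied from the matrix; the keys are illustrative names.
ESCALATION = {
    "multiple_critical_alerts": ["Engineering Lead"],
    "data_loss_risk": ["CTO", "Engineering Lead"],
    "security_incident": ["Security Team", "CTO"],
    "provider_outage": ["Vendor Contact"],
}


def escalation_targets(conditions):
    """Return the de-duplicated contacts for all active conditions, in order."""
    targets = []
    for condition in conditions:
        for contact in ESCALATION[condition]:
            if contact not in targets:
                targets.append(contact)
    return targets
```

For example, an incident that is both a data loss risk and a security incident would notify the CTO once, alongside the Engineering Lead and the Security Team.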

Post-Incident

  1. Resolve the alert with detailed notes
  2. Document the root cause
  3. Create follow-up tasks for prevention
  4. Update this runbook if a new procedure is needed