On-call basics
Topic: Monitoring basics
Summary
Set up an on-call rotation and an escalation policy. Route alerts to a primary responder, with a secondary as backup. Use this when you need someone to respond to incidents 24/7 or during business hours.
Intent: How-to
Quick answer
- Define rotation: who is primary and secondary. Use PagerDuty, Opsgenie, or similar. Set schedule and override.
- Route alerts to the primary. Escalate to the secondary or a manager if an alert is not acknowledged within the SLA. Document the escalation path.
- Limit alert fatigue. Tune alerts; use runbooks. Track load and balance rotation.
Prerequisites
Steps
- Rotation and tool: Define a primary and a secondary responder. Configure PagerDuty or a similar tool. Set the schedule and overrides.
- Escalation: Escalate if an alert is not acknowledged within X minutes. Document the escalation path. Test the escalation.
- Tune and balance: Reduce noise. Use runbooks. Balance the rotation; track pages per person.
Summary
Set rotation and tool; route alerts to primary; escalate per SLA; tune and balance load.
Steps
Step 1: Rotation and tool
Define primary and secondary; configure tool; set schedule.
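To make the rotation step concrete, here is a minimal sketch of how a weekly rotation with overrides could be resolved. It is not how PagerDuty or Opsgenie work internally; the names `Override` and `on_call`, the roster, and the one-week shift length are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Override:
    """A temporary swap: `person` takes primary between start and end."""
    start: datetime
    end: datetime
    person: str


def on_call(roster, epoch, shift, at, overrides=()):
    """Return (primary, secondary) for the moment `at`.

    The roster rotates one position every `shift`, starting at `epoch`.
    The secondary is simply the next person in the roster (an assumption).
    """
    index = ((at - epoch) // shift) % len(roster)
    primary = roster[index]
    secondary = roster[(index + 1) % len(roster)]
    for o in overrides:
        if o.start <= at < o.end:
            displaced = primary
            primary = o.person
            # Avoid the same person holding both roles during an override.
            if secondary == primary:
                secondary = displaced
    return primary, secondary


roster = ["ana", "bo", "cy"]  # hypothetical team
epoch = datetime(2024, 1, 1, tzinfo=timezone.utc)
week = timedelta(days=7)
at = datetime(2024, 1, 10, tzinfo=timezone.utc)
swap = Override(
    datetime(2024, 1, 9, tzinfo=timezone.utc),
    datetime(2024, 1, 11, tzinfo=timezone.utc),
    "cy",
)

print(on_call(roster, epoch, week, at))
print(on_call(roster, epoch, week, at, [swap]))
```

In practice the paging tool owns this logic; a sketch like this is mainly useful for checking that the schedule you configured matches the one you intended.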
Step 2: Escalation
Escalate after timeout; document path; test.
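The escalation rule can be sketched as a small function: each time the acknowledgement timeout elapses without an ack, the page moves one level up the documented path. The function name `page_target`, the three-level path, and the 10-minute timeout are assumptions chosen for the example, not values from any particular tool.

```python
from datetime import datetime, timedelta, timezone


def page_target(fired_at, now, acked, path, ack_timeout):
    """Return who should be paged at `now`, or None once the alert is acked.

    `path` is the escalation path in order, e.g. primary first.
    The level rises by one for every full `ack_timeout` that has elapsed,
    and stays at the last entry once the path is exhausted.
    """
    if acked:
        return None
    level = (now - fired_at) // ack_timeout
    level = min(max(level, 0), len(path) - 1)
    return path[level]


path = ["primary", "secondary", "manager"]  # hypothetical escalation path
timeout = timedelta(minutes=10)             # the "X minutes" from the policy
t0 = datetime(2024, 1, 10, 3, 0, tzinfo=timezone.utc)

print(page_target(t0, t0 + timedelta(minutes=3), False, path, timeout))
print(page_target(t0, t0 + timedelta(minutes=12), False, path, timeout))
print(page_target(t0, t0 + timedelta(minutes=45), False, path, timeout))
```

Running the same function against a test alert, with known timestamps, is one way to check that the documented path and the configured path agree.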
Step 3: Tune and balance
Reduce noise; use runbooks; balance rotation.
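Tracking pages per person can be as simple as counting names in an export of past pages. The sketch below assumes such an export is available as a list of responder names; `page_load`, `overloaded`, and the 1.5x-of-average threshold are illustrative choices, not a standard.

```python
from collections import Counter


def page_load(pages, roster):
    """Count pages per person, including roster members with zero pages."""
    counts = Counter({person: 0 for person in roster})
    counts.update(pages)
    return dict(counts)


def overloaded(counts, factor=1.5):
    """Return people paged more than `factor` times the team average."""
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(p for p, n in counts.items() if n > factor * mean)


roster = ["ana", "bo", "cy"]                 # hypothetical team
pages = ["ana"] * 6 + ["bo"] * 2 + ["cy"]    # hypothetical page history

counts = page_load(pages, roster)
print(counts)
print(overloaded(counts))
```

A persistently overloaded person usually points at either an unbalanced schedule or a noisy alert that fires during one particular shift, so it is worth checking both.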
Verification
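- Send a test alert and confirm it reaches the primary.
- Leave a test alert unacknowledged and confirm it escalates.
- Review pages per person and confirm the load is acceptable.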
Troubleshooting
- No one paged: Check routing and schedule.
- Too many pages: Tune alerts and thresholds.
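For the "no one paged" case, one common cause is a hole in the schedule. The following sketch looks for uncovered time in a list of shifts; it assumes shifts can be exported as `(start, end, person)` tuples, and `coverage_gaps` is a hypothetical helper, not part of any paging tool.

```python
from datetime import datetime, timezone


def coverage_gaps(shifts, start, end):
    """Return (gap_start, gap_end) pairs in [start, end) that no shift covers."""
    gaps = []
    cursor = start  # everything before `cursor` is known to be covered
    for shift_start, shift_end, _person in sorted(shifts, key=lambda s: s[0]):
        if shift_start >= end:
            break
        if shift_start > cursor:
            gaps.append((cursor, shift_start))
        cursor = max(cursor, shift_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


def day(n):
    """Helper for the example: midnight UTC on 2024-01-<n>."""
    return datetime(2024, 1, n, tzinfo=timezone.utc)


shifts = [
    (day(1), day(2), "ana"),  # hypothetical schedule with a hole on Jan 2
    (day(3), day(4), "bo"),
]

for gap_start, gap_end in coverage_gaps(shifts, day(1), day(4)):
    print("uncovered:", gap_start.isoformat(), "to", gap_end.isoformat())
```

If the schedule has no gaps, the next thing to check is routing: whether the alert was sent to the service or team that the schedule is attached to.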