Uptime and availability monitoring
Topic: Monitoring basics
Summary
Monitor endpoint availability with external (synthetic) checks. Run HTTP or TCP checks from multiple regions. Use this when you need to know whether users can reach your service.
Intent: How-to
Quick answer
- Run HTTP or TCP checks from outside your network. Check every 1-5 minutes. Alert on consecutive failures rather than on a single failed check.
- Use multiple regions or providers so the monitor itself is not a single point of failure. Record latency and status code for each check.
- Report uptime as a percentage. Track it against your SLA and page on-call when availability drops below the threshold.
Prerequisites
Steps
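- Define checks
  Send an HTTP GET to a key URL and expect a 200 or a status code in an accepted range. Use a TCP connect to the service port when there is no HTTP endpoint. Set an interval and a timeout for each check.
- Multi-region
  Run the same check from several locations. Aggregate the success rate across locations. Alert if the majority of locations fail.
- Report and alert
  Calculate uptime over a fixed window. Alert when it falls below the SLA. Link the alert to a status page or runbook.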
Summary
Run external HTTP or TCP checks; use multiple regions; report uptime and alert on SLA breach.
Prerequisites
Steps
Step 1: Define checks
HTTP or TCP check; interval and timeout; expected response.
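A minimal sketch of Step 1 using only the Python standard library. The names `CheckResult`, `http_check`, and `tcp_check`, the 5-second default timeout, and the 2xx default range are illustrative assumptions, not part of any particular monitoring product. Scheduling the check every 1-5 minutes is left to whatever runs the function (cron, a job runner, or a loop).

```python
import socket
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    ok: bool                     # True when the check met its expectation
    status: Optional[int]        # HTTP status code; None for TCP checks and connection errors
    latency_s: float             # wall-clock time spent on the check
    error: Optional[str] = None  # error text when an exception was raised


def http_check(url: str, timeout_s: float = 5.0,
               expected: range = range(200, 300)) -> CheckResult:
    """HTTP GET; succeeds when the status code falls inside `expected`."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
        return CheckResult(status in expected, status, time.monotonic() - start)
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError but still carry a status code.
        return CheckResult(exc.code in expected, exc.code,
                           time.monotonic() - start, str(exc))
    except OSError as exc:
        # DNS failure, refused connection, timeout (URLError is an OSError).
        return CheckResult(False, None, time.monotonic() - start, str(exc))


def tcp_check(host: str, port: int, timeout_s: float = 5.0) -> CheckResult:
    """TCP connect; succeeds when the connection is established in time."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            pass
        return CheckResult(True, None, time.monotonic() - start)
    except OSError as exc:
        return CheckResult(False, None, time.monotonic() - start, str(exc))
```

Note that `urlopen` follows redirects by default, so the status seen here is the one from the final response. Run this from a host outside the network you are monitoring, otherwise it only proves internal reachability.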
Step 2: Multi-region
Run from several locations; aggregate results.
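One possible way to aggregate a round of results, assuming each location reports a simple pass/fail. The function name `aggregate` and the region labels in the usage note are hypothetical. "Majority" is taken here to mean strictly more than half of the locations, so a 1-of-2 split does not alert.

```python
from typing import Mapping, Tuple


def aggregate(results: Mapping[str, bool]) -> Tuple[float, bool]:
    """Return (success_rate, majority_failed) for one round of checks.

    `results` maps a location name to True (check passed) or False.
    """
    if not results:
        raise ValueError("no location results to aggregate")
    total = len(results)
    successes = sum(1 for ok in results.values() if ok)
    failures = total - successes
    success_rate = successes / total
    # Strict majority: more than half of the locations failed.
    majority_failed = failures > total / 2
    return success_rate, majority_failed
```

For example, `aggregate({"us-east": True, "eu-west": False, "ap-south": False})` reports a success rate of one third and flags a majority failure, while a single failing location out of three does not.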
Step 3: Report and alert
Uptime percentage; alert below SLA; link runbook.
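A sketch of the uptime calculation, assuming uptime is defined as the share of successful checks in the window (other definitions, such as time-weighted downtime, are equally valid). The function names and the 99.9 default SLA are illustrative assumptions.

```python
from typing import Sequence


def uptime_percent(checks_ok: Sequence[bool]) -> float:
    """Percentage of successful checks in the reporting window."""
    if not checks_ok:
        raise ValueError("window contains no checks")
    return 100.0 * sum(1 for ok in checks_ok if ok) / len(checks_ok)


def breaches_sla(checks_ok: Sequence[bool], sla_percent: float = 99.9) -> bool:
    """True when uptime over the window is below the SLA target."""
    return uptime_percent(checks_ok) < sla_percent
```

With one check per minute, a day holds 1440 checks. Three failed checks give 100 * 1437 / 1440, about 99.79%, which breaches a 99.9% target; a single failed check gives about 99.93%, which does not.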
Verification
- Checks run on schedule, uptime is reported, and the alert fires when you simulate an outage.
Troubleshooting
- False down: Verify the check's target endpoint and the network path from the probe location.
- Missed outage: Shorten the check interval or add probe locations.
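Requiring consecutive failures before alerting, as the quick answer suggests, is one common way to cut false-down alerts. The sketch below is an assumption about how that gate might look; the class name `ConsecutiveFailureAlert` and the default threshold of 3 are hypothetical.

```python
class ConsecutiveFailureAlert:
    """Fire an alert once a streak of failed checks reaches a threshold."""

    def __init__(self, threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.failures = 0  # length of the current failure streak

    def record(self, ok: bool) -> bool:
        """Record one check result; return True when the alert should fire.

        Returns True only on the check that brings the streak to the
        threshold, so a long outage pages once instead of on every check.
        """
        if ok:
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        return self.failures == self.threshold
```

The trade-off runs against the missed-outage case: an outage that spans fewer checks than the threshold never alerts, so a higher threshold calls for a shorter check interval.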