Uptime and availability monitoring

Topic: Monitoring basics

Summary

Monitor endpoint availability with external (synthetic) checks. Run HTTP or TCP checks from multiple regions. Use this approach when you need to know whether users can reach your service.

Intent: How-to

Quick answer

  • Run HTTP or TCP checks from outside your network every 1 to 5 minutes. Alert on consecutive failures rather than on a single failed check.
  • Run checks from multiple regions or providers so the monitor itself is not a single point of failure. Record latency and status code for each check.
  • Report uptime as a percentage and track it against your SLA. Page on-call when availability drops below the threshold.
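
Alerting on consecutive failures can be sketched as a small counter that resets on every successful check. This is a minimal illustration, not a prescribed implementation: the class name `ConsecutiveFailureAlert` and the default threshold of 3 are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureAlert:
    """Fires once the run of consecutive failed checks reaches `threshold`."""

    threshold: int = 3  # assumed value; tune it to your check interval
    failures: int = 0   # length of the current run of failed checks

    def record(self, ok: bool) -> bool:
        """Record one check result; return True while the alert should fire."""
        if ok:
            self.failures = 0  # any success ends the run
        else:
            self.failures += 1
        return self.failures >= self.threshold
```

With a 1-minute interval and a threshold of 3, a single dropped probe never pages anyone, while a real outage is flagged after roughly three minutes.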

Prerequisites

Steps

  1. Define checks

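    Send an HTTP GET to a key URL and expect a 200 response or an accepted range of status codes. If the service does not speak HTTP, open a TCP connection to its port instead. Set a check interval and a timeout for every check.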
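
A check of this kind might look like the following sketch, which uses only the Python standard library. The function names `http_check` and `tcp_check`, the 5-second default timeout, and the default accepted range of 200 to 299 are assumptions for illustration. Note that `urllib` follows redirects, so the status tested is that of the final response.

```python
import http.client
import socket
import urllib.error
import urllib.request


def http_check(url: str, timeout: float = 5.0, expected=range(200, 300)) -> bool:
    """Return True if a GET to `url` answers in time with an expected status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status in expected
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses raise HTTPError; the status is still available.
        return err.code in expected
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, TLS error, bad response, or timeout.
        return False


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A scheduler (cron, or a loop with a sleep equal to the check interval) would call one of these functions and pass the result to the alerting logic.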

  2. Multi-region

    Run the same check from several locations and aggregate the success rate. Alert only if a majority of locations fail, so that a network problem local to one probe does not page anyone.
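
The majority rule can be expressed in a few lines. In this sketch the function name `aggregate` and the region labels in the usage note are hypothetical, and "majority" is taken to mean strictly more than half of the locations.

```python
def aggregate(results: dict[str, bool]) -> tuple[float, bool]:
    """Return (success_rate, majority_failed) for one round of regional checks.

    `results` maps a location name to whether its check succeeded.
    """
    if not results:
        raise ValueError("no regional results to aggregate")
    successes = sum(results.values())
    success_rate = successes / len(results)
    failures = len(results) - successes
    # Strict majority: more than half of the locations report a failure.
    majority_failed = failures > len(results) / 2
    return success_rate, majority_failed
```

For example, `aggregate({"us-east": True, "eu-west": False, "ap-south": False})` reports a majority failure, while a 1-of-2 split does not. An odd number of locations avoids ties.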

  3. Report and alert

    Calculate the uptime percentage over a reporting window. Alert when it falls below the SLA target, and link the alert to a status page or runbook.
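
As a rough sketch, uptime over a window is the share of successful checks in that window. The function names and the 99.9% default target below are assumptions for the example; use the figure from your own SLA.

```python
def uptime_percent(results: list[bool]) -> float:
    """Percentage of successful checks among the results in the window."""
    if not results:
        raise ValueError("no check results in window")
    return 100.0 * sum(results) / len(results)


def breaches_sla(results: list[bool], sla_percent: float = 99.9) -> bool:
    """Return True if uptime over the window is below the SLA target."""
    return uptime_percent(results) < sla_percent
```

To put the target in perspective: a 30-day window contains 43,200 minutes, so a 99.9% target leaves an error budget of about 43 minutes of downtime.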

Verification

  • Checks run on schedule, uptime is reported, and the alert fires when you simulate an outage.

Troubleshooting

False down: verify the endpoint that the check targets and the network path from the probe location.
Missed outage: shorten the check interval or add probe locations.

Next steps

Continue to