RED and USE metrics

Topic: Monitoring basics

Summary

RED covers services: Rate, Errors, Duration. USE covers resources: Utilization, Saturation, Errors. Together they give a checklist for choosing what to measure and what to alert on.

Intent: How-to

Quick answer

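  • RED: request Rate (requests per second), Errors (failed requests, tracked as a rate or ratio), and Duration (latency, best viewed as percentiles rather than an average). Use it for HTTP and RPC services. Alert when the rate drops unexpectedly, errors spike, or latency exceeds its threshold.
  • USE: Utilization (how busy the resource is, or how much of its capacity is in use), Saturation (work that is queued or waiting), and Errors (count of error events). Use it for CPU, disk, memory, and network. Alert when utilization or saturation is high or errors are non-zero.
  • Implement RED per service and USE per resource, then build dashboards and alerts from those metrics. This keeps coverage consistent and makes gaps easy to spot.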
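
The three RED alert conditions above can be sketched as a small pure function. This is only an illustration: the `RedSnapshot` type, the `red_alerts` name, and every default threshold are assumptions chosen for the example, not values the method prescribes.

```python
from dataclasses import dataclass


@dataclass
class RedSnapshot:
    """RED values for one service over one evaluation window (hypothetical type)."""
    rate_per_s: float     # requests per second
    error_ratio: float    # failed requests / total requests, 0.0 to 1.0
    p95_latency_s: float  # 95th-percentile duration in seconds


def red_alerts(current, baseline_rate_per_s, *,
               min_rate_fraction=0.5,   # assumed: alert below 50% of baseline
               max_error_ratio=0.05,    # assumed: alert above 5% errors
               max_p95_latency_s=1.0):  # assumed: alert above 1 s at p95
    """Return the names of the RED conditions that are currently firing."""
    alerts = []
    if current.rate_per_s < baseline_rate_per_s * min_rate_fraction:
        alerts.append("rate_drop")
    if current.error_ratio > max_error_ratio:
        alerts.append("error_spike")
    if current.p95_latency_s > max_p95_latency_s:
        alerts.append("latency_high")
    return alerts
```

For example, `red_alerts(RedSnapshot(10.0, 0.10, 0.2), baseline_rate_per_s=100.0)` reports a rate drop and an error spike but no latency alert. Real thresholds depend on the service and are usually tuned over time.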

Prerequisites

Steps

  1. RED for services

    Instrument request rate, error count, and duration for each service, or for each endpoint where finer detail is needed. Add the three metrics to a dashboard and define alerts on them.
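
A minimal sketch of per-endpoint RED instrumentation, using only the Python standard library. The `RedRecorder` class and its method names are hypothetical, and keeping raw samples in memory is a simplification; a production service would more likely use a metrics library with counters and histograms.

```python
import math
import time
from collections import defaultdict


class RedRecorder:
    """In-memory RED counters keyed by endpoint (illustrative only)."""

    def __init__(self):
        self.requests = defaultdict(int)    # total requests per endpoint
        self.errors = defaultdict(int)      # failed requests per endpoint
        self.durations = defaultdict(list)  # duration samples in seconds

    def observe(self, endpoint, duration_s, ok):
        """Record one finished request."""
        self.requests[endpoint] += 1
        if not ok:
            self.errors[endpoint] += 1
        self.durations[endpoint].append(duration_s)

    def timed(self, endpoint, handler, *args, **kwargs):
        """Run handler, time it, and count an exception as an error."""
        start = time.perf_counter()
        try:
            result = handler(*args, **kwargs)
        except Exception:
            self.observe(endpoint, time.perf_counter() - start, ok=False)
            raise
        self.observe(endpoint, time.perf_counter() - start, ok=True)
        return result

    def summary(self, endpoint, window_s):
        """Rate, error ratio, and nearest-rank p95 over a window of window_s seconds."""
        total = self.requests[endpoint]
        samples = sorted(self.durations[endpoint])
        if samples:
            p95 = samples[max(0, math.ceil(0.95 * len(samples)) - 1)]
        else:
            p95 = 0.0
        return {
            "rate_per_s": total / window_s,
            "error_ratio": self.errors[endpoint] / total if total else 0.0,
            "p95_latency_s": p95,
        }
```

Wrapping a handler as `recorder.timed("/checkout", handle_checkout, request)` (both names hypothetical) records all three signals in one place, so an endpoint cannot end up with rate but no duration.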

  2. USE for resources

    Measure utilization, saturation, and errors for each resource: CPU, disk, memory, and network. Add them to a dashboard and define alerts on them.
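
The USE signals for a single resource can be modelled as below. This sketch assumes the raw numbers (busy time, queue length, error count) have already been collected by some agent; the `UseReading` type, the function names, and the default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UseReading:
    """USE values for one resource over one sampling interval (hypothetical type)."""
    resource: str
    utilization: float  # fraction of capacity in use, 0.0 to 1.0
    saturation: float   # amount of queued or waiting work
    errors: int         # error events seen in the interval


def utilization(busy_s, elapsed_s, capacity=1):
    """Busy time divided by available time across `capacity` units (e.g. CPU cores)."""
    return busy_s / (elapsed_s * capacity)


def use_alerts(reading, *,
               max_utilization=0.8,  # assumed threshold
               max_saturation=1.0):  # assumed threshold
    """Return the names of the USE conditions that are currently firing."""
    alerts = []
    if reading.utilization > max_utilization:
        alerts.append("utilization_high")
    if reading.saturation > max_saturation:
        alerts.append("saturation_high")
    if reading.errors > 0:
        alerts.append("errors_nonzero")
    return alerts
```

A disk that was busy for 45 of the last 60 seconds has `utilization(45.0, 60.0)` equal to 0.75, which stays under the assumed 0.8 threshold, while any non-zero error count raises an alert regardless of load.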

  3. Review coverage

    Confirm that every service has all three RED metrics and every resource has all three USE metrics. Add instrumentation wherever something is missing.
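
The coverage review can be automated as a set difference between what each service or resource should emit and what it does emit. The inventory format here (a dict from name to the set of signals collected) is an assumption; in practice it might be derived from a metrics catalogue or dashboard definitions.

```python
# Signals each method requires.
RED = {"rate", "errors", "duration"}
USE = {"utilization", "saturation", "errors"}


def coverage_gaps(inventory, required):
    """Map each name in inventory to its missing signals, skipping fully covered names.

    inventory: dict of name -> iterable of signal names currently collected.
    required:  set of signal names every entry should have (RED or USE).
    """
    gaps = {}
    for name, signals in inventory.items():
        missing = required - set(signals)
        if missing:
            gaps[name] = sorted(missing)
    return gaps
```

Calling `coverage_gaps({"checkout": {"rate", "errors"}}, RED)` returns `{"checkout": ["duration"]}`, which points directly at the instrumentation that still needs to be added.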

Summary

Use RED for services and USE for resources. Instrument both, alert on both, then review coverage for gaps.

Verification

  • RED metrics are present for every service and USE metrics for every resource, with dashboards and alerts in place for both.

Troubleshooting

  • Missing metric: add the instrumentation, or add the target to the scrape configuration.
  • Noisy alerts: tune the thresholds.

Next steps

Continue to