Prometheus Scrape Target Down — Metrics Endpoint Unreachable or Timeout
About Prometheus Scrape Target Down
Fix a Prometheus scrape target that shows DOWN status because the metrics endpoint is unreachable, returns errors, or times out during metric collection. This guide covers the common causes, step-by-step solutions, and a frequently asked question.
Prometheus collects metrics by sending HTTP GET requests to each target's /metrics endpoint. A target status of DOWN means Prometheus could not scrape that endpoint successfully. Scrapes can fail because of network problems, a crashed target, missing or incorrect authentication, or a timeout. The Targets page in the Prometheus UI shows the status of every configured scrape target. Service discovery (Kubernetes, Consul, DNS) can add targets before they are ready to serve metrics.
The most common causes are: the target application has crashed or is not exposing the /metrics endpoint; network connectivity or DNS resolution fails between Prometheus and the target; the target takes longer than the scrape timeout to generate its metrics response; or the endpoint requires authentication (bearer token, basic auth) that is not configured in the scrape config. Identifying which of these applies is the first step toward the right fix.
To resolve the problem, work through these steps in order. First, open the Targets page in the Prometheus UI (Status > Targets) to see which targets are down and the error reported for each. Second, verify that the endpoint is reachable by running curl http://target:port/metrics from the Prometheus server. Third, check the target application's logs for crashes or errors from the /metrics endpoint. Fourth, if the target is slow to respond, increase scrape_timeout in prometheus.yml (default 10s), keeping it no larger than scrape_interval. Fifth, verify the service discovery configuration and make sure labels, namespaces, and selectors are correct.
Quick Answer
What is an acceptable scrape interval?
15-30 seconds is common. Shorter intervals (5s) give better resolution but increase storage and load. Longer intervals (60s) save resources but may miss short-lived anomalies. Match the interval to your alerting needs.
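The storage side of this trade-off can be estimated with simple arithmetic: one time series produces one sample per scrape, so the sample count scales inversely with the interval. The sketch below is illustrative only; the function name is hypothetical, and it counts raw samples rather than bytes on disk, which depend on compression and the number of series.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(scrape_interval_s: int) -> int:
    """Samples a single time series produces per day at a given scrape interval."""
    if scrape_interval_s <= 0:
        raise ValueError("scrape interval must be positive")
    return SECONDS_PER_DAY // scrape_interval_s


# Compare the intervals mentioned above.
for interval in (5, 15, 30, 60):
    print(f"{interval:>2}s -> {samples_per_day(interval)} samples per series per day")
```

Going from a 60s interval to a 5s interval multiplies the sample count per series by twelve, which is why short intervals are usually reserved for targets whose alerts need that resolution.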
Key Details
- Prometheus scrapes metrics from targets by sending HTTP GET requests to /metrics endpoints
- Target status DOWN means Prometheus cannot successfully scrape the endpoint
- Scrape failures can be due to network issues, target crashes, authentication, or timeout
- The Targets page in Prometheus UI shows the status of all configured scrape targets
- Service discovery (Kubernetes, Consul, DNS) can add targets that are not yet ready
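The information on the Targets page is also available from the Prometheus HTTP API at /api/v1/targets, which is convenient when many targets are configured. The following is a minimal sketch that filters an already-fetched response for targets whose health is "down". The function name and the sample payload (job names, URLs, error text) are hypothetical; only the field names activeTargets, health, labels, scrapeUrl, and lastError are taken from the API response format.

```python
def down_targets(targets_response: dict) -> list[dict]:
    """Return job, scrape URL, and last error for every active target marked down."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        {
            "job": target.get("labels", {}).get("job", ""),
            "scrape_url": target.get("scrapeUrl", ""),
            "last_error": target.get("lastError", ""),
        }
        for target in active
        # health is "up", "down", or "unknown" (not scraped yet); keep only "down".
        if target.get("health") == "down"
    ]


# Hypothetical response body, shaped like the output of GET /api/v1/targets.
sample_response = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "api"},
                "scrapeUrl": "http://api:8080/metrics",
                "lastError": "",
                "health": "up",
            },
            {
                "labels": {"job": "worker"},
                "scrapeUrl": "http://worker:9100/metrics",
                "lastError": "context deadline exceeded",
                "health": "down",
            },
        ],
        "droppedTargets": [],
    },
}

for entry in down_targets(sample_response):
    print(entry["job"], entry["scrape_url"], entry["last_error"])
```

In practice the dictionary would come from decoding the JSON body returned by the Prometheus server rather than from a literal.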
Common Causes
- Target application crashed or not exposing the /metrics endpoint
- Network connectivity or DNS resolution failure between Prometheus and the target
- Scrape timeout exceeded because the target takes too long to generate its metrics response
- Authentication required (bearer token, basic auth) but not configured in the scrape config
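The error text shown next to a down target usually points at one of the causes above. The sketch below maps some error substrings that commonly appear in scrape errors to a likely cause. It is a heuristic under stated assumptions: the function name and category labels are hypothetical, and the exact wording of errors can vary between Prometheus versions and environments, so treat a match as a hint rather than a diagnosis.

```python
# (substring to look for, likely cause) pairs, checked in order.
# The substrings are assumptions based on commonly seen error text.
ERROR_HINTS = [
    ("no such host", "DNS resolution failure"),
    ("connection refused", "target crashed or not listening"),
    ("context deadline exceeded", "scrape timeout exceeded"),
    ("http status 401", "authentication not configured"),
    ("http status 403", "authentication not configured"),
    ("http status 404", "metrics endpoint not exposed"),
]


def classify_scrape_error(last_error: str) -> str:
    """Guess the likely cause of a failed scrape from its error message."""
    text = last_error.lower()
    for needle, cause in ERROR_HINTS:
        if needle in text:
            return cause
    return "unknown"


print(classify_scrape_error('Get "http://app:8080/metrics": context deadline exceeded'))
print(classify_scrape_error("server returned HTTP status 401 Unauthorized"))
```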
Steps
1. Check the Targets page in the Prometheus UI (Status > Targets) to see which targets are down and the error reported for each
2. Verify the endpoint is accessible: run curl http://target:port/metrics from the Prometheus server
3. Check the target application logs for crashes or /metrics endpoint errors
4. Increase scrape_timeout in prometheus.yml if the target is slow to respond (default 10s); it must not exceed scrape_interval
5. Verify the service discovery configuration: ensure labels, namespaces, and selectors are correct
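The curl check and the timeout check can be combined by fetching the endpoint once and timing the response, then comparing the duration with the configured scrape_timeout. The following is a minimal sketch using only the Python standard library. The function name and the opener parameter are hypothetical (the parameter exists so the function can be exercised without a network), and the script should be run from the Prometheus server so it sees the same network path as the scraper.

```python
import time
import urllib.error
import urllib.request


def probe_metrics(url, timeout_s=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            body = resp.read()
            ok = resp.status == 200
            detail = f"HTTP {resp.status}, {len(body)} bytes"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 401 or 404.
        ok, detail = False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        ok, detail = False, f"error: {exc}"
    return ok, time.monotonic() - start, detail
```

For example, calling probe_metrics("http://target:port/metrics", timeout_s=10.0) with the real host and port substituted returns the outcome and the elapsed time. If the elapsed time is close to or above the configured scrape_timeout, a slow metrics response is the likely cause; an immediate failure points to connectivity, DNS, or authentication instead.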