How do I debug scrape failures?

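Check the 'Last Error' column on the Targets page. Common errors: connection refused (target down or wrong port), timeout, shown as "context deadline exceeded" (target slower than scrape_timeout), 401 or 403 (authentication or authorization not configured), 404 (wrong metrics path). Run curl against the endpoint from the Prometheus server to reproduce the failure.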
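The same check can be scripted when curl is not convenient. The following is a minimal sketch using only the Python standard library; the function name probe_metrics and the hint wording are illustrative assumptions, not part of Prometheus. It sends one GET request and maps the outcome onto the error categories listed above.

```python
import socket
import urllib.error
import urllib.request

# Hints keyed by HTTP status (illustrative wording, not Prometheus output).
STATUS_HINTS = {
    401: "auth needed (credentials missing or rejected)",
    403: "auth needed (access forbidden)",
    404: "wrong metrics path",
}


def probe_metrics(url: str, timeout: float = 10.0) -> str:
    """Send one GET to a metrics URL and return a short diagnosis."""
    # An empty ProxyHandler ignores proxy environment variables, so the
    # request goes straight to the target.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return f"ok: HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The target answered, but with an error status.
        hint = STATUS_HINTS.get(exc.code, "unexpected status")
        return f"HTTP {exc.code}: {hint}"
    except urllib.error.URLError as exc:
        # Connection-level failures are wrapped in URLError.
        if isinstance(exc.reason, ConnectionRefusedError):
            return "connection refused: target down or wrong port"
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout: target slow or unreachable"
        return f"network error: {exc.reason}"
    except (socket.timeout, TimeoutError):
        # A timeout while waiting for the response is raised unwrapped.
        return "timeout: target slow or unreachable"
```

Run it from the Prometheus host so it follows the same network path as the scrape, for example probe_metrics("http://target:port/metrics") with the real host and port substituted. Passing the job's scrape_timeout as the timeout argument makes the result comparable to what Prometheus sees.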

Can Prometheus scrape targets in other Kubernetes namespaces?

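Yes. Prometheus needs network access to the targets (any NetworkPolicy must allow its traffic) and RBAC permissions to list and watch the discovered resources in those namespaces, typically granted through a ClusterRole. In kubernetes_sd_configs, list the namespaces under namespaces.names, or omit that field to discover targets in all namespaces. This applies to role: endpoints as well as the other roles.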

Prometheus Scrape Target Down — Metrics Endpoint Unreachable or Timeout

Warning / system

About Prometheus Scrape Target Down

This guide explains how to fix a Prometheus scrape target that shows DOWN status because its metrics endpoint is unreachable, returns errors, or times out during metric collection. It covers common causes, step-by-step fixes, and frequently asked questions.

This article is part of our Linux Error Codes collection on Error Codes Wiki.

Quick Answer

What is an acceptable scrape interval?

15-30 seconds is common. Shorter intervals (5s) give better resolution but increase storage and load. Longer intervals (60s) save resources but may miss short-lived anomalies. Match the interval to your alerting needs.
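To make the trade-off concrete, the arithmetic below counts how many samples a single time series produces per day at each interval. It is a rough sketch: the helper name samples_per_day is hypothetical, and it counts samples only, not bytes on disk. At 15 seconds a series yields 5760 samples per day, four times as many as at 60 seconds.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def samples_per_day(scrape_interval_s: int, series: int = 1) -> int:
    """Samples collected per day for `series` time series.

    Uses floor division, so a partial final interval is not counted.
    """
    return series * (SECONDS_PER_DAY // scrape_interval_s)


# Compare the intervals mentioned above for one series.
for interval in (5, 15, 30, 60):
    print(f"{interval}s interval: {samples_per_day(interval)} samples per series per day")
```

Multiply by the number of active series to estimate the total daily sample count for a Prometheus server.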

Key Details

Prometheus scrapes metrics from targets by sending HTTP GET requests to each target's metrics endpoint (/metrics by default)
Target status DOWN means the most recent scrape of the endpoint failed
Scrape failures can be due to network issues, target crashes, authentication, or timeout
The Targets page in Prometheus UI shows the status of all configured scrape targets
Service discovery (Kubernetes, Consul, DNS) can add targets that are not yet ready
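To see what a scrape expects on the other end, here is a minimal sketch of a target built with the Python standard library. The metric name demo_requests_total, its value, and the helper make_server are made up for illustration. Real applications normally use a Prometheus client library instead of hand-writing the text format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# One counter in the Prometheus text exposition format (demo values).
METRICS_BODY = (
    "# HELP demo_requests_total Requests handled by the demo target.\n"
    "# TYPE demo_requests_total counter\n"
    "demo_requests_total 42\n"
)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Anything other than /metrics returns 404, which a scraper
        # would report as a wrong metrics path.
        if self.path != "/metrics":
            self.send_error(404, "metrics are served at /metrics")
            return
        body = METRICS_BODY.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def make_server(port: int = 0) -> HTTPServer:
    """Bind the demo target to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling make_server(8000).serve_forever() starts the demo target on port 8000 (an arbitrary choice). Stopping the process, or requesting a different path, reproduces the connection refused and 404 failures a DOWN target would show.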

Common Causes

Target application crashed or not exposing the /metrics endpoint
Network connectivity or DNS resolution failure between Prometheus and the target
Scrape timeout exceeded because the target takes too long to generate the metrics response
Authentication required (bearer token, basic auth) but not configured in the scrape config

Steps

1. Check the Targets page in the Prometheus UI (Status > Targets) to see which targets are down and their error messages
2. Verify the endpoint is accessible: run curl http://target:port/metrics from the Prometheus server
3. Check the target application logs for crashes or /metrics endpoint errors
4. Increase scrape_timeout in prometheus.yml if the target is slow to respond (default 10s); it must not exceed scrape_interval
5. Verify the service discovery configuration: ensure labels, namespaces, and selectors are correct
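The first step can also be done from a script through the Prometheus HTTP API, which exposes the same data as the Targets page at /api/v1/targets. The sketch below uses only the standard library. The function names are hypothetical, and the base URL in the usage note is an assumption about where Prometheus is listening.

```python
import json
import urllib.request


def fetch_targets(prometheus_url: str, timeout: float = 10.0) -> dict:
    """Fetch the parsed JSON response of /api/v1/targets."""
    url = f"{prometheus_url.rstrip('/')}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def down_targets(payload: dict) -> list:
    """Return (scrape pool, scrape URL, last error) for each down target."""
    rows = []
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("health") == "down":
            rows.append(
                (
                    target.get("scrapePool", ""),
                    target.get("scrapeUrl", ""),
                    target.get("lastError", ""),
                )
            )
    return rows
```

For example, down_targets(fetch_targets("http://localhost:9090")) lists every failing target with its last error, assuming Prometheus is reachable at that address without authentication.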