Monitoring Endpoints with Blackbox Exporter

homelab

2026-04-05

Most of the monitoring in the homelab is internal. Prometheus scrapes metrics that applications expose. But that only tells you the application thinks it's healthy. It doesn't tell you if a user can actually reach it.

Blackbox Exporter flips the perspective. Instead of asking "are you healthy?", it asks "can I reach you from the outside?" It probes HTTP endpoints, checks TLS certificates, measures response time, and reports back to Prometheus.

What it probes

I configured four probe modules:

http_2xx for plain HTTP health checks (internal services)
http_2xx_tls for HTTPS endpoints with TLS validation (external-facing services)
tcp_connect for raw TCP connectivity
icmp for ping

The interesting part is the target list. Two groups:

HTTPS endpoints via Envoy Gateway. All services exposed to my network follow the pattern <service>.ruiz.sh (like grafana.ruiz.sh). Blackbox Exporter hits them over HTTPS and validates the TLS certificate. If cert-manager fails to renew the wildcard cert, this is where I'd see it first.

Internal service health checks. Direct HTTP calls to the health endpoints of Prometheus, Alertmanager, Loki, Thanos Query Frontend, and MinIO. These bypass the Gateway and hit the services directly inside the cluster.

How it works with Prometheus

Blackbox Exporter doesn't scrape targets on its own. Prometheus drives the whole flow:

Prometheus                    Blackbox Exporter              Target
    │                               │                          │
    │  scrape with ?target=URL      │                          │
    │──────────────────────────────>│                          │
    │                               │   HTTP/TCP/ICMP probe    │
    │                               │─────────────────────────>│
    │                               │                          │
    │                               │   response (200, 3ms)    │
    │                               │<─────────────────────────│
    │                               │                          │
    │  probe_success=1              │                          │
    │  probe_duration_seconds=0.003 │                          │
    │  probe_http_status_code=200   │                          │
    │<──────────────────────────────│                          │
    │                               │                          │

Every 60 seconds, Prometheus sends an HTTP GET to Blackbox Exporter's /probe endpoint, passing the target URL and the module name as query parameters. Blackbox Exporter runs the probe at that moment (an HTTP call, TCP connection, or ping, depending on the module) and returns the result as metrics: success or failure, response time, HTTP status code, and TLS certificate expiry. It stores nothing between scrapes; each probe runs only when Prometheus asks, and the result comes back in the same response.
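The request Prometheus sends can be reproduced by hand, which helps when debugging a probe. Here is a minimal Python sketch, assuming Blackbox Exporter's `/probe` endpoint on its default port 9115. The `blackbox-exporter` hostname is a hypothetical in-cluster service name, the helper names are made up for illustration, and the sample response is abbreviated rather than captured from a real probe.

```python
from urllib.parse import urlencode


def probe_url(exporter: str, module: str, target: str) -> str:
    """Build the URL Prometheus requests on each scrape of a probed target."""
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


def parse_metrics(text: str) -> dict[str, float]:
    """Parse unlabeled samples from Prometheus text exposition output."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labeled series like name{...}.
        if not line or line.startswith("#") or "{" in line:
            continue
        name, value = line.split()[:2]
        metrics[name] = float(value)
    return metrics


# Hypothetical exporter address; the target follows the <service>.ruiz.sh pattern.
url = probe_url("blackbox-exporter:9115", "http_2xx_tls", "https://grafana.ruiz.sh")

# Abbreviated, illustrative response body (not real probe output).
sample = """\
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
probe_duration_seconds 0.003
probe_http_status_code 200
"""
result = parse_metrics(sample)
print(url)
print(result)
```

Fetching that URL with curl or a browser returns the same metrics Prometheus would store for that scrape, so it is a quick way to tell a failing target from a misconfigured module.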

Prometheus stores those metrics like any other time series, which means you can query them in Grafana, build dashboards, and create alerts. A dashboard shows uptime, response time, and TLS certificate expiry for every probed endpoint.
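Because `probe_success` is a 0/1 gauge, uptime is simply its average over a time window, which is what a PromQL expression such as `avg_over_time(probe_success[1h])` computes. Here is a rough Python illustration of the same arithmetic, using made-up sample values and a hypothetical helper name:

```python
def uptime_percent(samples: list[int]) -> float:
    """Return the share of successful probes as a percentage."""
    if not samples:
        raise ValueError("no samples in the window")
    return 100.0 * sum(samples) / len(samples)


# Made-up window: 60 probes at one per minute, 3 of them failed.
window = [1] * 57 + [0] * 3
uptime = uptime_percent(window)
print(f"{uptime:.1f}%")
```

At a 60-second scrape interval a single failed probe costs about 1.7 percentage points of hourly uptime, so short windows look noisier than long ones.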

The Grafana dashboard

There's a ready-made Blackbox Exporter HTTP Prober dashboard on Grafana's dashboard marketplace that works out of the box. I imported it as a ConfigMap so it gets loaded automatically via the Grafana sidecar. It shows all probed targets with their status, response time, and HTTP status codes.

Figure: Blackbox Exporter dashboard showing endpoint status, HTTP codes, SSL validation, and certificate expiry.

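The TLS expiry metric (probe_ssl_earliest_cert_expiry) is especially useful. Combined with cert-manager's own metrics, it gives two independent signals that the certificate is valid: one from what the endpoint actually serves, one from what cert-manager believes it issued. If the two disagree, something is wrong.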
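As a sketch of that cross-check: Blackbox Exporter reports `probe_ssl_earliest_cert_expiry` as a Unix timestamp, and cert-manager exposes `certmanager_certificate_expiration_timestamp_seconds`. The Python below compares two such values. The timestamps, the one-hour tolerance, and the function names are illustrative assumptions, not values from this cluster.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86_400


def days_until_expiry(expiry_ts: float, now_ts: float) -> float:
    """Convert an expiry Unix timestamp into days remaining (negative if expired)."""
    return (expiry_ts - now_ts) / SECONDS_PER_DAY


def signals_disagree(probe_ts: float, certmanager_ts: float, tolerance_s: float = 3600) -> bool:
    """True when the served certificate and cert-manager's record expire at different times."""
    return abs(probe_ts - certmanager_ts) > tolerance_s


# Illustrative values only.
now = datetime(2026, 4, 5, tzinfo=timezone.utc).timestamp()
# Expiry of the certificate the endpoint actually serves.
probe_expiry = datetime(2026, 5, 5, tzinfo=timezone.utc).timestamp()
# Expiry of the certificate cert-manager reports having issued.
certmanager_expiry = datetime(2026, 7, 4, tzinfo=timezone.utc).timestamp()

remaining = days_until_expiry(probe_expiry, now)
stale = signals_disagree(probe_expiry, certmanager_expiry)
print(f"{remaining:.0f} days left, signals disagree: {stale}")
```

In this made-up case cert-manager has already renewed, but the endpoint still serves the old certificate. Neither metric would reveal that mismatch on its own.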

Blackbox Exporter is lightweight (10m CPU, 32Mi memory) and runs on the medium tier node. For what it does, it's one of the cheapest components in the stack.