Evan Schwartz

autometrics-rs 0.5: Automatically connecting Prometheus metrics to traces

Originally published on the Fiberplane Blog

Exemplars are a powerful but lesser-used feature of Prometheus that enable you to jump from high-level metrics charts to the trace for an individual request. autometrics-rs 0.5 enables you to automatically produce metrics with exemplars, making it easy to add this debugging superpower to your toolkit.

Background on Exemplars

Metrics are great for giving you a high-level understanding of the health and performance of your system. While it is possible to add fairly granular metrics (and Autometrics makes this easy!), metrics systems like Prometheus tend not to work well with high-cardinality or highly variable data. As a result, you don’t want to add all of the details of a particular request to your metrics. For debugging certain types of issues where you need the full request context, it’s useful to turn to traces or logs.

Exemplars enable you to quickly jump from looking at a chart of your metrics to looking at a specific trace. They work by associating sample trace IDs with the metrics at a given point in time. This idea was originally introduced in the OpenMetrics project and incorporated as an experimental feature in Prometheus in 2021. Exemplars are also included in the OpenTelemetry spec, though at the time of writing most of the OpenTelemetry client libraries do not yet support them.
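To make this concrete, here is a minimal sketch of how a sample with an exemplar is written in the OpenMetrics text format: the usual name and value, followed by `#`, the exemplar's labels in braces, and the exemplar's value. The `render_sample` helper and the `function_calls_total` metric name are assumptions made up for illustration; they are not part of the Autometrics API.

```rust
/// Renders one sample in the OpenMetrics text format.
/// When an exemplar is present, it is appended as `# {trace_id="..."} value`.
/// (Illustrative helper only; real client libraries also handle labels,
/// escaping, and timestamps.)
fn render_sample(name: &str, value: f64, exemplar: Option<(&str, f64)>) -> String {
    match exemplar {
        // `{{` and `}}` are literal braces around the exemplar's label set
        Some((trace_id, exemplar_value)) => format!(
            "{name} {value} # {{trace_id=\"{trace_id}\"}} {exemplar_value}"
        ),
        None => format!("{name} {value}"),
    }
}
```

Calling `render_sample("function_calls_total", 42.0, Some(("abc123", 1.0)))` yields `function_calls_total 42 # {trace_id="abc123"} 1`. The trace ID rides along with the sample without becoming a label on the time series itself, which is why exemplars sidestep the cardinality problem.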

Exemplars are so powerful that one engineer I spoke with said, “once you’ve debugged an issue with exemplars, it’s hard to go back.”

Here’s a small example of what exemplars look like in the Prometheus UI. When you hover over the line for a metric, you will see the Trace exemplar with a linked trace_id. You can also wire things up in Grafana such that you can immediately jump from a chart to the trace in Jaeger or Tempo, but configuring that is a topic for another blog post.

Prometheus with exemplars

Automatically producing metrics with exemplars

Autometrics can now automatically extract the trace_id and other details from the trace context and attach them to the metrics as exemplars.

Let’s see what that looks like in code.

Here, we have an HTTP handler that uses the [tracing](http://crates.io/crates/tracing) library for producing traces and logs. We’ve also “autometricized” it so that it produces metrics as well.

use autometrics::autometrics;
use tracing::{debug, instrument};
use uuid::Uuid;

#[autometrics]
#[instrument(fields(trace_id = %Uuid::new_v4()))]
async fn my_handler() {
    debug!("My handler was called");
}

Then, in the main function, we expose the trace details to Autometrics using the AutometricsExemplarExtractor:

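use autometrics::exemplars::tracing::AutometricsExemplarExtractor;
// Brings the `.with()` and `.init()` extension methods into scope
use tracing_subscriber::prelude::*;

pub fn main() {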
    tracing_subscriber::fmt::fmt()
        .finish()
        .with(AutometricsExemplarExtractor::from_fields(&["trace_id"]))
        .init();

    autometrics::prometheus_exporter::init();
}
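Conceptually, an extractor configured with a list of field names picks those fields out of the current span and uses them as the exemplar's labels. The following is a simplified, standard-library-only sketch of that idea; `extract_exemplar_labels` and the `HashMap` of span fields are hypothetical stand-ins, not how Autometrics or tracing-subscriber actually implement it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an exemplar extractor: keeps only the span
/// fields whose names were configured (for example `["trace_id"]`),
/// preserving the configured order and skipping fields the span lacks.
fn extract_exemplar_labels(
    span_fields: &HashMap<String, String>,
    wanted: &[&str],
) -> Vec<(String, String)> {
    wanted
        .iter()
        .filter_map(|key| {
            span_fields
                .get(*key)
                .map(|value| ((*key).to_string(), value.clone()))
        })
        .collect()
}
```

Given a span with `trace_id` and `user_agent` fields and a configuration of `["trace_id", "span_id"]`, only the `trace_id` pair is returned: unrelated fields are ignored and missing ones are skipped, so the exemplar stays small.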

Alternatively, you can also have Autometrics extract the trace_id and span_id from the OpenTelemetry Context using the [tracing-opentelemetry](https://docs.rs/autometrics/latest/autometrics/exemplars/index.html#tracing-opentelemetry) integration.

Finally, we expose the metrics to Prometheus. The [encode_http_response](https://docs.rs/autometrics/latest/autometrics/prometheus_exporter/fn.encode_http_response.html) helper function serializes the metrics and automatically uses the OpenMetrics format when exemplars are enabled.

use autometrics::prometheus_exporter::{self, PrometheusResponse};

// Mount this on the /metrics endpoint
pub async fn get_metrics() -> PrometheusResponse {
    prometheus_exporter::encode_http_response()
}
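The format switch matters because exemplars are only valid in the OpenMetrics exposition format, which has its own content type and must end with an `# EOF` line. Here is a rough sketch of that decision, assuming a hypothetical `encode_response` function and pre-rendered metric lines; it is not the implementation behind `encode_http_response`.

```rust
/// Content type of the OpenMetrics 1.0 text format, which can carry exemplars.
const OPENMETRICS: &str = "application/openmetrics-text; version=1.0.0; charset=utf-8";
/// Content type of the classic Prometheus text format.
const PROMETHEUS_TEXT: &str = "text/plain; version=0.0.4";

/// Hypothetical encoder: joins already-rendered metric lines into a body and
/// picks the content type based on whether exemplars are enabled.
fn encode_response(lines: &[&str], exemplars_enabled: bool) -> (&'static str, String) {
    let mut body = lines.join("\n");
    body.push('\n');
    if exemplars_enabled {
        // OpenMetrics requires the exposition to be terminated by `# EOF`
        body.push_str("# EOF\n");
        (OPENMETRICS, body)
    } else {
        (PROMETHEUS_TEXT, body)
    }
}
```

A handler would set the returned string as the `Content-Type` header of the `/metrics` response so that the scraper knows which format it is parsing.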

Now we’re producing metrics and traces, which together give us both high-level and detailed pictures of our system. Exemplars link the two so we can quickly jump from a chart of our error rates or latencies to the specific traces that show why an issue might be occurring.

Supporting the official prometheus-client crate

Unfortunately, many Prometheus and OpenTelemetry client libraries do not yet support producing exemplars. In Rust, [prometheus-client](https://crates.io/crates/prometheus-client) is the only crate that does.

This release of Autometrics adds support for this crate and makes it the default if you use the helper [prometheus_exporter](https://docs.rs/autometrics/latest/autometrics/prometheus_exporter/index.html) functions we provide.

Get involved

The idea of supporting exemplars originally came from engineers I had discussed Autometrics with. If you have other ideas or want to get involved in the Autometrics project, come join us on Discord or chime in on our GitHub Discussions!
