Evan Schwartz

autometrics-rs 0.3: Defining Service-Level Objectives (SLOs) in Rust source code

Originally published on the Fiberplane Blog

Autometrics Logo

Autometrics makes it easy to track the latency, error rate, and production usage of any function in your code. autometrics-rs v0.3 also enables you to define Service-Level Objectives (SLOs) directly in your Rust code.

SLOs are a way to describe expectations for your code’s performance in production. For example, you might create an objective specifying that 99% of requests to your API should respond in 250 milliseconds or less. Then, you can create alerts that will fire when the service is in danger of failing to meet your targets. (For more background on SLOs and details on how this feature works under the hood, check out: An adventure with SLOs, generic Prometheus alerting rules, and complex PromQL queries.)

Normally, creating SLOs and alerts requires writing complex queries and YAML config files to specify when you want to be alerted. Autometrics allows you to define them right in your source code. Let’s see how.

Instrumenting code with autometrics

When using autometrics, you can instrument any function with the autometrics macro. This tracks the request rate and error rate of the function, as well as its latency.

Creating SLOs in Rust

Now, you can also create an Objective and include one or multiple functions’ metrics in it. The objective defines your SLO and can specify the target success rate, latency, or both.

Here we’ll define an SLO called “api” that stipulates that we want our API to return successful responses 99% of the time and that it should respond to 99.9% of requests within 250 milliseconds:

use autometrics::objectives::{Objective, ObjectiveLatency, ObjectivePercentile};

const API_SLO: Objective = Objective::new("api")
    .success_rate(ObjectivePercentile::P99)
    .latency(ObjectiveLatency::Ms250, ObjectivePercentile::P99_9);

One or more functions can be included in this SLO using the macro’s objective parameter:

#[autometrics(objective = API_SLO)]
pub fn create_user_handler() {
    // ...
}

#[autometrics(objective = API_SLO)]
pub fn get_user_handler() {
    // ...
}

This works by adding additional labels to the metrics produced by these functions: objective_name, objective_percentile, and objective_latency_threshold.

Alerting based on SLOs

Once you have objectives defined in your code, you can set Prometheus up to alert you when the targets are at risk.

Autometrics makes it possible to use SLO best practices for alerting without you needing to write complex PromQL queries or YAML config files.

The library provides a single Prometheus recording and alerting rules file that will work for any autometrics-instrumented project. This works by setting up a number of rules that are dormant by default but become activated when metrics are produced using the objective-related labels.

To enable alerts, you’ll need to configure your Prometheus instance to load the autometrics.rules.yml file. Then, Prometheus will use the SLO best practices to alert you when your objectives are at risk.
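For example, assuming the rules file is saved alongside your Prometheus configuration, the relevant entry in prometheus.yml would look something like this:

```yaml
# prometheus.yml (snippet)
rule_files:
  - autometrics.rules.yml
```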

Debugging alerts with function-level metrics

One nice benefit of defining SLOs this way is that it becomes easy to graph all the functions that comprise the SLO. If you’re debugging an alert related to high error rates, you can use the following query to quickly see the error rates broken down by function:

sum by (function, module) (rate(function_calls_count{objective_name="api", result="error"}[5m]))
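Similarly, when debugging a latency-related alert, you can query the duration histogram to see which functions are slow (this sketch assumes the histogram metric is named function_calls_duration_bucket, matching the counter naming above):

```
histogram_quantile(
  0.99,
  sum by (le, function, module) (
    rate(function_calls_duration_bucket{objective_name="api"}[5m])
  )
)
```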


Customizing the metric’s result

Finally, v0.3 also adds the ability to specify ok_if or error_if functions as parameters to the autometrics macro.

By default, the macro will determine whether the function’s result was “good” or “bad” based on whether it returns a Result::Ok or Result::Err. The ok_if and error_if functions can be used to customize this behavior.

For example, you might want to ignore certain error variants, like those related to invalid user requests. Or, you could assign the ok or error labels to variants of enums other than Result, such as an http::Response.

Get involved!

Want to add autometrics to your project? You can find it on GitHub and crates.io. If you’re interested in getting involved in the project, come join us on Discord – we’d love to hear what you think!

#autometrics #fiberplane #observability #rust