Patterns/rules for applying function level instrumentation #756

ctron started this conversation in Ideas

We've been applying instrumentation (like `#[instrument]`) on functions when necessary. And that's fine.

However, I am not sure what the right approach is when it comes to handling fields and return values/errors. This also touches a bit on the log levels. Also, I get it wrong myself. So I'd appreciate a guideline I could look at too.

Checking back with syslog (RFC5424) there are severities defined like this:

Emergency: system is unusable
Alert: action must be taken immediately
Critical: critical conditions
Error: error conditions
Warning: warning conditions
Notice: normal but significant condition
Informational: informational messages
Debug: debug-level messages

I know, we don't have all of them in the log levels, but close enough.

I guess everything at Error and above is the same for us. Anything above Error might be a panic on the console, but that's not part of the application's log system.
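One possible reading of the mapping described above, as a sketch only: the eight RFC 5424 severities collapse onto five levels named after those in the tracing crate. The `Severity` and `Level` enums and the `to_level` function are hypothetical names used for illustration (they are not the tracing types), and where Notice lands is an assumption, not something decided in this thread.

```rust
/// RFC 5424 severities, most severe first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Emergency,
    Alert,
    Critical,
    Error,
    Warning,
    Notice,
    Informational,
    Debug,
}

/// Local stand-in for the five log levels (hypothetical; not `tracing::Level`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace, // no syslog counterpart, so nothing maps here
}

/// Collapse a syslog severity onto one of the five levels.
pub fn to_level(severity: Severity) -> Level {
    match severity {
        // "Everything at Error and above is the same for us."
        Severity::Emergency | Severity::Alert | Severity::Critical | Severity::Error => {
            Level::Error
        }
        Severity::Warning => Level::Warn,
        // Assumption: Notice is folded into Info.
        Severity::Notice | Severity::Informational => Level::Info,
        Severity::Debug => Level::Debug,
    }
}
```

Under this reading, an operator running at WARN would see only what syslog calls Warning or worse.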

Looking at what we do, and what tracing does by default, we spam a lot of information at the INFO level. That is OK if we assume the application runs at WARN, because, apart from a developer debugging, only WARN and above matters to an operator of the software.

For instrumenting we mostly use #[instrument(..., ret)] or #[instrument(..., err)], which logs the span with the result at INFO, or the error at ERROR. However, an error returned from a function isn't necessarily a log-worthy ERROR for the person operating this software.
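One rule that could address this, sketched under assumptions: choose the level from who has to act on the failure, rather than from the mere fact that a function returned `Err`. `RequestError`, its variants, `Level`, and `level_for` are hypothetical names for illustration, not trustify or tracing types.

```rust
/// Local stand-in for log levels (hypothetical; not `tracing::Level`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
}

/// Hypothetical error type for a request handler.
#[derive(Debug)]
pub enum RequestError {
    /// The caller sent something invalid; nothing for the operator to fix.
    BadInput(String),
    /// The requested item does not exist.
    NotFound,
    /// A dependency (for example the database) is unavailable.
    Backend(String),
}

/// Pick the log level by asking whether the operator needs to react.
pub fn level_for(err: &RequestError) -> Level {
    match err {
        // Caller mistakes are routine: record them, but below WARN.
        RequestError::BadInput(_) | RequestError::NotFound => Level::Info,
        // The service itself is failing: the operator should see this.
        RequestError::Backend(_) => Level::Error,
    }
}
```

For a function whose errors are mostly of the first kind, the `err(level = ...)` argument used in the `decompress_async` snippet further down this thread is one way to apply the lower level.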

So I think we need a set of rules, maybe with examples, that explain how to consistently apply instrumentation and log levels.


Replies: 4 comments 7 replies


And the performance impact:

Random non-scientific examples:

  • #[instrument],
  • #[instrument(skip_all, ret)],
  • #[instrument(skip_all)]

[Screenshots a, b, c: images not preserved]


Currently we have these parts instrumented (link).

As the tracing crate is a 'general purpose system' to emit events, and the OTEL Rust crates make use of it, I think the team needs to decide what needs to be instrumented, considering:

  • The information spam (mentioned in the first comment)
  • The performance impact
  • Who is interested in looking at the trace results

Example:

Instrument everything by layers?

  • Endpoint
    • Business / service layer
      • DB

Or specific points of interest? Or a mix of both?
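A sketch of the "by layers" option, assuming each layer gets a fixed span level and the operator picks a threshold (for example via `RUST_LOG`). `Layer`, `Level`, `span_level`, and `is_recorded` are hypothetical names, and the level assigned to each layer is an illustrative guess, not a decision from this thread.

```rust
/// Local stand-in for log levels, declared from least to most severe so the
/// derived ordering compares severity (hypothetical; not `tracing::Level`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

/// The three layers from the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Endpoint,
    Service,
    Db,
}

/// Assumed policy: the deeper the layer, the more verbose its spans.
pub fn span_level(layer: Layer) -> Level {
    match layer {
        Layer::Endpoint => Level::Info,
        Layer::Service => Level::Debug,
        Layer::Db => Level::Trace,
    }
}

/// A span is recorded when its level is at least as severe as the threshold.
pub fn is_recorded(layer: Layer, threshold: Level) -> bool {
    span_level(layer) >= threshold
}
```

With this policy a threshold of Info keeps only endpoint spans, Debug adds the service layer, and Trace adds the DB layer; a threshold of Warn records none of them.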


ctron Jan 23, 2025
Maintainer Author

I assume this is only true for the initial endpoint function? Further on, service functions (etc), still need the instrumentation macro?


> Further on, service functions (etc), still need the instrumentation macro?

Yes, while also respecting the declared log level 👍

  • Upload endpoint example:

OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled

Result:

[Screenshot: 2025-01-23_07-05]


RUST_LOG=info OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled

[Screenshot: 2025-01-23_07-09]

/// An async version of [`decompress`].
#[instrument(skip(bytes), fields(bytes_len=bytes.len()), err(level=tracing::Level::INFO))]
pub async fn decompress_async(
    bytes: Bytes,
    content_type: Option<header::ContentType>,
    limit: usize,
) -> Result<Result<Bytes, Error>, JoinError> {
    Handle::current()
        .spawn_blocking(move || decompress(bytes, content_type, limit))
        .await
}

personal-opinion-only-with-a-limited-knowledge-of-both-crates:

I prefer tracing-actix-web (screenshot), but actix-web-opentelemetry is working already.

  • less magic (we need to instrument each endpoint function), respecting the tracing crate usage

  • I have no idea how it behaves with utoipa yet

    • We have configs in the trustify code that bind more strongly to actix-web-opentelemetry; not sure how to remove those right now
  • What if I want to instrument only one endpoint? How do I hide all the others with actix-web-opentelemetry?

  • tracing-actix-web seems to respect end-user mental health when dealing with OTEL versions (link)

  • It shows the name of the function instead of the endpoint, and manual instrumentation (span creation) based on the tracing crate is easy to use (screenshots)

  • I'm not advocating changing the crate

    • I have no idea if that will work
  • I have no idea about the performance impact comparing both


Apparently they have the same "automatic" instrumentation via middleware, so adding the instrument attribute macro is not required (no idea about the implications of that, btw).

  • I'm not using the middleware in my example

  • I don't know if actix-web-opentelemetry works with both middleware and manual attribute-macro instrumentation


Yes, it supports middleware usage (link).


We don't have OTEL metrics with tracing-actix-web, so we can discard this option and continue with actix-web-opentelemetry 👍


related to #1198

Category
Ideas
Labels
performance Observability The ability to observe
