Patterns/rules for applying function level instrumentation · guacsec/trustify · Discussion #756

ctron
Sep 4, 2024
Maintainer

We've been applying instrumentation (like `#[instrument]`) on functions when necessary. And that's fine.

However, I am not sure what the right approach is when it comes to handling fields and return values/errors. This also touches a bit on the log levels. Also, I do it wrong myself. So I'd appreciate a guideline I could look at too.

Checking back with syslog (RFC5424) there are severities defined like this:

Emergency: system is unusable
Alert: action must be taken immediately
Critical: critical conditions
Error: error conditions
Warning: warning conditions
Notice: normal but significant condition
Informational: informational messages
Debug: debug-level messages

I know, we don't have all of them in the log levels, but close enough.

I guess everything at Error and above is the same for us. Anything above Error might be a panic on the console, but that's not part of the application's log system.
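One way to read the list above against the level names used by the tracing crate (ERROR, WARN, INFO, DEBUG, TRACE) is sketched below. The `Severity` and `Level` enums are local, hypothetical types for illustration (not the tracing crate's own `Level`), and folding Notice into INFO is an assumption, not an agreed rule.

```rust
/// RFC 5424 severities, most severe first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Emergency,
    Alert,
    Critical,
    Error,
    Warning,
    Notice,
    Informational,
    Debug,
}

/// Local stand-in that mirrors the level names of the `tracing` crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    /// No syslog counterpart; kept only so the list of names is complete.
    #[allow(dead_code)]
    Trace,
}

/// One possible mapping: everything at Error and above collapses into `Error`.
fn to_level(severity: Severity) -> Level {
    match severity {
        Severity::Emergency | Severity::Alert | Severity::Critical | Severity::Error => {
            Level::Error
        }
        Severity::Warning => Level::Warn,
        // Assumption: Notice has no direct counterpart, so it is folded into Info.
        Severity::Notice | Severity::Informational => Level::Info,
        Severity::Debug => Level::Debug,
    }
}

fn main() {
    let all = [
        Severity::Emergency,
        Severity::Alert,
        Severity::Critical,
        Severity::Error,
        Severity::Warning,
        Severity::Notice,
        Severity::Informational,
        Severity::Debug,
    ];
    for severity in all {
        println!("{:?} -> {:?}", severity, to_level(severity));
    }
}
```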

Looking at what we do, and what tracing does by default, a lot of information gets spammed at the INFO level. That is OK if we assume that the application will be run at WARN, because apart from a developer debugging, only WARN and above matters to an operator of the software.

For instrumenting we mostly have #[instrument(..., ret)] or #[instrument(..., err)], which logs the span with the result on INFO, or the error on ERROR. However, an error returned from a function isn't necessarily a log-worthy ERROR for the person operating this software.

So I think we need a set of rules, maybe with examples, that explain how to consistently apply instrumentation and log levels.
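As a possible starting point for such rules, the level of a failure could be chosen by who has to act on it rather than by the fact that a function returned an error. The sketch below is only an illustration: the `Failure` categories, the level assigned to each, and the `Level` enum itself are hypothetical and not taken from the trustify code base or the tracing crate.

```rust
/// Local stand-in for log levels, most severe first (not `tracing::Level`).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
}

/// Hypothetical failure categories, for discussion only.
#[derive(Debug, Clone, Copy)]
enum Failure {
    /// The caller sent something invalid; the operator has nothing to fix.
    InvalidInput,
    /// The requested item does not exist; again the caller's concern.
    NotFound,
    /// A dependency is unreachable; the operator may need to look.
    DependencyUnavailable,
    /// A bug or broken invariant; the operator should know.
    Internal,
}

/// Pick the level by who has to act on the failure.
fn level_for(failure: Failure) -> Level {
    match failure {
        Failure::InvalidInput | Failure::NotFound => Level::Debug,
        Failure::DependencyUnavailable => Level::Warn,
        Failure::Internal => Level::Error,
    }
}

/// True if an event at `event` is shown when the application runs at `configured`.
fn visible_at(event: Level, configured: Level) -> bool {
    event <= configured
}

fn main() {
    // Assumption from the discussion: operators run the application at WARN.
    let configured = Level::Warn;
    let failures = [
        Failure::InvalidInput,
        Failure::NotFound,
        Failure::DependencyUnavailable,
        Failure::Internal,
    ];
    for failure in failures {
        let level = level_for(failure);
        println!(
            "{:?} -> {:?} (visible at {:?}: {})",
            failure,
            level,
            configured,
            visible_at(level, configured)
        );
    }
}
```

With a table like this, an operator running at WARN would see dependency and internal failures, while rejected requests would only show up for a developer running at DEBUG.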

Replies: 4 comments 7 replies

helio-frota
Jan 8, 2025
Collaborator

Docs link

0 replies

helio-frota
Jan 22, 2025
Collaborator

And the performance impact:

Random non-scientific examples:

#[instrument],
#[instrument(skip_all, ret)],
#[instrument(skip_all)]

[Screenshots a, b, c not preserved]

0 replies

helio-frota
Jan 22, 2025
Collaborator

Currently we have these parts instrumented (link)

As the tracing crate is a 'general purpose system' for emitting events, and the OTEL Rust crates make use of it, I think the team needs to decide what needs to be instrumented, considering:

The spamming information (mentioned in the first comment)
The performance impact
Who is interested in looking at the trace results

Example:

Instrument everything by layers?

Endpoint
- Business / service layer
  - DB

Or specific points of interest? Or a mix of both?

6 replies

ctron Jan 23, 2025
Maintainer Author

I assume this is only true for the initial endpoint function? Further on, service functions (etc), still need the instrumentation macro?

helio-frota Jan 23, 2025
Collaborator

Further on, service functions (etc), still need the instrumentation macro?

Yes and also respecting/following the declared-log-level 👍

Upload endpoint example:

OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled

Result:

[Screenshot: 2025-01-23_07-05]

RUST_LOG=info OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled

[Screenshot: 2025-01-23_07-09]

/// An async version of [`decompress`].
#[instrument(skip(bytes), fields(bytes_len=bytes.len()), err(level=tracing::Level::INFO))]
pub async fn decompress_async(
    bytes: Bytes,
    content_type: Option<header::ContentType>,
    limit: usize,
) -> Result<Result<Bytes, Error>, JoinError> {
    Handle::current()
        .spawn_blocking(move || decompress(bytes, content_type, limit))
        .await
}
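One detail worth noting about the snippet above: as far as I understand the attribute, `err` only emits an event when the function itself returns `Err`. With a return type of `Result<Result<Bytes, Error>, JoinError>`, a decompression failure comes back as `Ok(Err(..))`, so only the outer `JoinError` would be recorded. Below is a minimal std-only sketch of flattening the two layers so that both failures surface as the outer `Err`. `DecompressError`, `TaskError`, and `FlatError` are hypothetical stand-ins, not trustify types.

```rust
/// Stand-in for the inner decompression error (hypothetical).
#[derive(Debug, PartialEq)]
struct DecompressError(String);

/// Stand-in for the outer task/join error (hypothetical).
#[derive(Debug, PartialEq)]
struct TaskError(String);

/// A single error type covering both layers.
#[derive(Debug, PartialEq)]
enum FlatError {
    Decompress(DecompressError),
    Task(TaskError),
}

/// Collapse `Ok(Err(e))` and `Err(e)` into one outer `Err`.
fn flatten<T>(nested: Result<Result<T, DecompressError>, TaskError>) -> Result<T, FlatError> {
    match nested {
        Ok(Ok(value)) => Ok(value),
        Ok(Err(e)) => Err(FlatError::Decompress(e)),
        Err(e) => Err(FlatError::Task(e)),
    }
}

fn main() {
    // The task completed, but the work inside it failed.
    let nested: Result<Result<Vec<u8>, DecompressError>, TaskError> =
        Ok(Err(DecompressError("payload too large".to_string())));

    // Seen from the outside, the nested value still looks like a success.
    assert!(nested.is_ok());

    // After flattening, the failure is visible as the outer `Err`.
    let flat = flatten(nested);
    assert!(flat.is_err());
    println!("{:?}", flat);
}
```

Whether flattening is the right choice here depends on whether callers need to tell the two failure kinds apart at the type level; the sketch only shows where the `err` argument would and would not apply.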

helio-frota Jan 23, 2025
Collaborator

personal-opinion-only-with-a-limited-knowledge-of-both-crates:

I prefer tracing-actix-web (screenshot), though actix-web-opentelemetry is already working.

Less magic (we need to instrument each endpoint function), which respects how the tracing crate is meant to be used
I have no idea how it behaves with utoipa yet
- We have configs in the trustify code that bind more strongly to actix-web-opentelemetry; not sure how to remove those right now
What if I want to instrument only one endpoint? How do I hide all the others with actix-web-opentelemetry?
tracing-actix-web seems to respect end-user mental health when dealing with OTEL versions (link)
It shows the name of the function instead of the endpoint, and manual instrumentation (span creation) based on the tracing crate is easy to use (screenshots)
I'm not advocating changing the crate
- I have no idea if that will work
  - If yes, then it will be nice and less risky (see commit comment)
I have no idea about the performance impact of one compared to the other

helio-frota Jan 23, 2025
Collaborator

Apparently they have the same "automatic" instrumentation via middleware, so adding the instrument attribute macro is not required (no idea about the implications of that, by the way).

I'm not using the middleware in my example
I don't know if actix-web-opentelemetry works with both middleware and manual attribute-macro instrumentation

helio-frota Jan 31, 2025
Collaborator

Yes, it supports middleware usage (link)

helio-frota
Jan 27, 2025
Collaborator

we don't have OTEL-metrics with tracing-actix-web, so we can discard this option and continue with actix-web-opentelemetry 👍

1 reply

helio-frota Jan 31, 2025
Collaborator

related to #1198
