-
We've been applying instrumentation (like `#[instrument]`) to functions when necessary, and that's fine.
However, I am not sure what the right approach is when it comes to handling fields and return values/errors. This also touches a bit on log levels. I get it wrong myself too, so I'd appreciate a guideline I could look at.
Checking back with syslog (RFC5424) there are severities defined like this:
Emergency: system is unusable
Alert: action must be taken immediately
Critical: critical conditions
Error: error conditions
Warning: warning conditions
Notice: normal but significant condition
Informational: informational messages
Debug: debug-level messages
I know we don't have all of them as log levels, but it's close enough.
I guess everything at Error and above is the same for us. Anything above Error might be a panic on the console, but that's not part of the application's log system.
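The collapse described above could be written down as a small mapping. This is only a sketch: the `Severity` and `Level` enums and the `to_level` function are hypothetical names for illustration, not part of `tracing` or of our code base, and folding Notice into INFO is an assumption on my side.

```rust
/// The eight severities defined by syslog (RFC 5424), most severe first.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Emergency,
    Alert,
    Critical,
    Error,
    Warning,
    Notice,
    Informational,
    Debug,
}

/// The five levels the `tracing` crate offers, most severe first.
/// (Hypothetical stand-in so the sketch has no dependencies.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Map a syslog severity onto the closest available log level.
pub fn to_level(severity: Severity) -> Level {
    match severity {
        // Everything at Error and above is treated the same.
        Severity::Emergency
        | Severity::Alert
        | Severity::Critical
        | Severity::Error => Level::Error,
        Severity::Warning => Level::Warn,
        // Assumption: Notice has no direct counterpart, so it joins INFO.
        Severity::Notice | Severity::Informational => Level::Info,
        Severity::Debug => Level::Debug,
    }
}
```

With this mapping, `to_level(Severity::Critical)` yields `Level::Error`, and nothing maps to `Level::Trace`, which has no syslog counterpart.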
Looking at what we do, and what tracing does by default, a lot of information is spammed at the INFO level. That is OK if we assume the application will be run at WARN, because apart from a developer debugging, only WARN and above matters to an operator of the software.
For instrumenting we mostly have #[instrument(..., ret)] or #[instrument(..., err)], which logs the span with the result at INFO, or the error at ERROR. However, an error returned from a function isn't necessarily a log-worthy ERROR for the person operating this software.
So I think we need a set of rules, maybe with examples, that explain how to consistently apply instrumentation and log levels.
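One such rule could look like the following sketch, which picks the log level from the kind of error instead of always using ERROR. Everything here is hypothetical: `ServiceError`, its variants, the `log_level` function, and the chosen levels are assumptions meant to start the discussion, not existing code.

```rust
/// Log levels as named by the `tracing` crate, most severe first.
/// (Hypothetical stand-in so the sketch has no dependencies.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical error type of a service function.
#[derive(Debug)]
pub enum ServiceError {
    /// The caller asked for something that does not exist.
    NotFound(String),
    /// The caller sent malformed input.
    BadRequest(String),
    /// The database could not be reached; an operator has to act.
    DatabaseUnavailable(String),
}

/// Decide at which level a failed call should be logged.
pub fn log_level(error: &ServiceError) -> Level {
    match error {
        // Caller mistakes are part of normal operation, so they stay
        // below what an operator running at WARN would see.
        ServiceError::NotFound(_) => Level::Debug,
        ServiceError::BadRequest(_) => Level::Info,
        // Only failures that need operator attention are logged as ERROR.
        ServiceError::DatabaseUnavailable(_) => Level::Error,
    }
}
```

The point of the sketch is the split between errors caused by the caller and errors that need an operator; the exact levels are open for debate.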
-
And the performance impact:
Random non-scientific examples:
#[instrument], #[instrument(skip_all, ret)], #[instrument(skip_all)]
-
Currently we have these parts instrumented link
As the tracing crate is a 'general purpose system' for emitting events, and the OTEL Rust crates make use of it, I think the team needs to decide what should be instrumented, considering:
- The information spam (mentioned in the first comment)
- The performance impact
- Who is interested in looking at the trace results
Example:
Instrument everything by layers?
- Endpoint
- Business / service layer
- DB
Or specific points of interest? Or a mix of both?
-
I assume this is only true for the initial endpoint function? Further on, service functions (etc), still need the instrumentation macro?
-
Further on, service functions (etc), still need the instrumentation macro?
Yes and also respecting/following the declared-log-level 👍
- Upload endpoint example:
OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled
Result:
RUST_LOG=info OTEL_TRACES_SAMPLER_ARG=1 OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317" cargo run --bin trustd api --db-password trustify --devmode --auth-disabled --tracing enabled
/// An async version of [`decompress`].
#[instrument(skip(bytes), fields(bytes_len=bytes.len()), err(level=tracing::Level::INFO))]
pub async fn decompress_async(
    bytes: Bytes,
    content_type: Option<header::ContentType>,
    limit: usize,
) -> Result<Result<Bytes, Error>, JoinError> {
    Handle::current()
        .spawn_blocking(move || decompress(bytes, content_type, limit))
        .await
}
-
personal-opinion-only-with-a-limited-knowledge-of-both-crates:
I prefer tracing-actix-web (screenshot), but actix-web-opentelemetry is working already.
- Less magic (we need to instrument each endpoint function), respecting the `tracing` crate usage
- I have no idea how it behaves with utoipa yet
  - We have configs in the trustify code that bind more strongly to `actix-web-opentelemetry`; not sure how to remove those right now
- What if I want to instrument only one endpoint? How do I hide all the others with `actix-web-opentelemetry`?
- `tracing-actix-web` seems to respect end-user mental health when dealing with OTEL versions (link)
- It shows the name of the function instead of the endpoint, and manual instrumentation (span creation) based on the `tracing` crate is easy to use (screenshots)
- I'm not advocating changing the crate
  - I have no idea if that will work
  - If yes, then it will be nice and less risky (see commit comment)
- I have no idea about the performance impact comparing both
-
Apparently they have the same "automatic" instrumentation via middleware, so adding the instrument attribute macro is not required (no idea about the implications of that, by the way).
- I'm not using the middleware in my example
- I don't know if actix-web-opentelemetry works with both middleware and manual attribute-macro instrumentation
-
yes it supports middleware usage link
-
we don't have OTEL-metrics with tracing-actix-web, so we can discard this option and continue with actix-web-opentelemetry 👍
-
related to #1198