-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Add custom labels support for metrics #7207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CLA assistant check
All committers have signed the CLA.
d48d43b
to
9391633
Compare
9391633
to
0f85110
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm afraid of, happened. I've just tested and all the replaced labels have the value "unknown". Applying replacements here is too early because many of the values in the replacer aren't filled in yet. Can this be moved lower or refactored so it picks up the labeling? I imagine it needs something similar to observeRequest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you tell me the labels key that is being recorded as unknown at that point? i thought the only value net yet determined at that point was status
, and that status
is determined in the code below, so i thought there was no problem.
but i'm not familiar with this code base, so if there's something i haven't discovered yet, plz let me know for reproducing test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general problem with this I think is cardinality: if users use data that is different on pretty much every request (e.g. the whole URL), then it will balloon the amount of data tracked in memory and returned in metrics responses.
Imagine it's configured with orig_uri
and the app has URLs like /user/{user_id}/page/{page_id}
or something, every one of those URLs would be uniquely tracked for metrics, meaning infinite cardinality. Very dangerous for performance.
One way to work around that would be to use vars in sites to only track specific path prefixes, doing something like vars /user* subpath "user-path"
and vars /admin* subpath "admin-path"
and such, then you could have a label subpath {vars.subpath}
so it's controlled and you only have cardinality of 3 in that case (those two, plus ""
for any other paths). Obviously, that can be expanded as needed for tracking more areas of an app.
Also the other problem is that since metrics are configured globally, if you do take that vars
approach, you don't necessarily know that the vars will be set in all server routes, so you almost always end up with empty label values in some case.
So all that said, I think we need to be careful about what we recommend, the config in your PR description is absolutely dangerous as-is, so I think you should edit it (don't recommend remote
or orig_uri
at the very least, unbound cardinality). Also, our docs would certainly need an explainer on the risks of cardinality when it comes to labels and have certain recommendations to keep it in check.
Disclaimer: I'm saying this as someone who has literally never worked with metrics or prometheus etc, but I've absorbed some of this risk from talking a bunch with @hairyhenderson about this stuff.
close #7205
close #7161
With this fix, we can specify labels in the metric settings like below.
placeholders are automatically converted to real values, and static values that don't include placeholders are left in.
Also, since this PR allows us to add a
proto
label, I think we can just close this PR without merging it.