[RFC] Remove/hook unused metrics #2901 #5182
Conversation
block throughput metrics on m8g.metal instances for test scenarios using the async fio engine and more than 1 vcpu are volatile, so exclude them from A/B-testing.
Suggested-by: Riccardo Mancini <mancio@amazon.com>
Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Signed-off-by: LDagnachew <leulmdagnachew@gmail.com>
Force-pushed from 9990ff7 to 2cc23e7
This is great, thanks for tackling this! I think we can probably even include your script in this PR (somewhere under `tools/`) and then run it as part of our regular test suite, so we don't reintroduce unneeded metrics later on :)
Btw, I think you can catch a few more unused ones with a tiny tweak to your script: add `metrics.rs` files to the check that excludes specific use-sites. For example, the `rx_partial_writes` metric in `net/metrics.rs` is unused, but there's an `.add` call due to metrics aggregation inside the metrics file itself that makes your script believe it is in use.
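The tweak described above could look something like this hedged sketch: when scanning the tree for use-sites of a metric, skip `metrics.rs` files themselves, since any `.add()` call there comes from aggregation inside the metrics module rather than real usage. The helper name is illustrative, not the script's actual code.

```python
# Sketch only: skip metrics.rs files when looking for real use-sites,
# so aggregation calls inside the metrics modules don't count as usage.
def is_real_use_site(path: str) -> bool:
    """Return False for files whose metric accesses are definition-side."""
    return not path.endswith("metrics.rs")
```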
script finds metric files, extracts fields, and reports validity of each metric
Signed-off-by: Milan Dhaduk <mdhaduk7@gmail.com>
Signed-off-by: LDagnachew <leulmdagnachew@gmail.com>
- metric is not used
Signed-off-by: Milan Dhaduk <mdhaduk7@gmail.com>
Signed-off-by: LDagnachew <leulmdagnachew@gmail.com>
Force-pushed from 78eac72 to d8bfb5a
Great! Thank you for all the feedback, and I'm super stoked to potentially have my script integrated within the test suite!!
I went ahead and made the change to the script as you recommended, which you can see above or here. Additionally, I'm unsure whether the following metrics should be removed:
- metrics_count (SharedIncMetric)
- metrics_fails (SharedIncMetric)
- missed_metrics_count (SharedIncMetric)
I opted to ask you first, since removing them would require reworking `src/metrics.rs` and `requests/metrics.rs`; if you think that's okay, I will go ahead and make those changes as well!
Thanks!
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main    #5182      +/-   ##
==========================================
+ Coverage   82.88%   82.92%   +0.04%
==========================================
  Files         250      250
  Lines       26936    26927       -9
==========================================
+ Hits        22325    22330       +5
+ Misses       4611     4597      -14
```
Ahh, I didn't realize we had files called `metrics.rs` that don't actually contain definitions of metrics structs. How inconvenient haha. Those 3 metrics will need to stick around.
In this case, we can change

`or "metrics.rs" in path`

to

`or ("metrics.rs" in path and "vmm" in path)`

since all metrics definitions are in the `vmm` crate and `requests/metrics.rs` is in the `firecracker` crate, and then it won't get these false positives anymore :)
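The refined check amounts to something like the sketch below: only `metrics.rs` files inside the `vmm` crate hold metric definitions, so `requests/metrics.rs` in the `firecracker` crate no longer triggers a false positive. The helper name and example paths are assumptions, not the script's actual code.

```python
# Sketch only: a path counts as a metrics-definition file only when it is
# a metrics.rs file located inside the vmm crate.
def is_definition_file(path: str) -> bool:
    return "metrics.rs" in path and "vmm" in path
```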
btw, I think your last push maybe combined changes from multiple branches into this PR? :D
- metric is not used
Signed-off-by: Milan Dhaduk <mdhaduk7@gmail.com>
Signed-off-by: LDagnachew <leulmdagnachew@gmail.com>
Force-pushed from cdfbef8 to 02817e8
- metric is not used
Signed-off-by: Milan Dhaduk <mdhaduk7@gmail.com>
Signed-off-by: LDagnachew <leulmdagnachew@gmail.com>
Ahhh, I see. Thanks for pointing that out; I made the change, and the script seems to be working as intended.
And yes, sorry about that; I went ahead and cleaned up the history of this feature branch to only include changes related to the PR.
Let me know if any additional changes are needed, thank you!
This is great! I think we can squash all the commits about removing individual metrics into a single one. Then, could you include your script under `tests/host_tools/` as part of this PR, and add a very simple test in `tests/integration_tests/style` that just executes the script, failing if any unused metrics are found? :o
Sounds good! I will get to that as soon as I finish up some touches on the script!
So to clarify, if a metric is defined and/or aggregated within the vmm crate, but not used anywhere else (excluding tests), then that metric should be marked unused, right?
I ask because I've added a modification to account for metrics like the following that are improperly being marked as unused:
```rust
Some((&METRICS.latencies_us.load_snapshot, "load snapshot"))
```
The modification added `rf".+{field}"` as part of the usage patterns to account for instances similar to the above example. This solves the issue, and the only remaining metrics are the following:
- process_startup_time_us
- process_startup_time_cpu_us
- min_us
- max_us
- sum_us
All of which are only defined within the "vmm" crate and not used elsewhere in the codebase, so they should be removed, correct?
Let me know what you think! :D
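The catch-all pattern mentioned above can be sketched roughly as follows: besides direct accesses like `METRICS.net.rx_bytes.add(1)`, `rf".+{field}"` also matches a field referenced mid-expression, e.g. inside a tuple. The surrounding helper is illustrative, not the script's actual code.

```python
# Sketch only: check a source line against per-field usage patterns,
# including the catch-all rf".+{field}" discussed above.
import re

def field_is_used(field: str, line: str) -> bool:
    patterns = [
        rf"METRICS\.\w+\.{field}\b",  # direct METRICS.<group>.<field> access
        rf".+{field}",                # catch-all: field preceded by anything
    ]
    return any(re.search(p, line) for p in patterns)
```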
> So to clarify, if a metric is defined and/or aggregated within the vmm crate, but not used anywhere else (excluding tests), then that metric should be marked unused, right?
Yeah, if the only thing a metric is used for is aggregation, or some other logic contained within a metrics.rs file in the vmm crate, then it should be marked as unused, except...
> This solves the issue, and the only remaining metrics are the following:
>
> - process_startup_time_us
> - process_startup_time_cpu_us
> - min_us
> - max_us
> - sum_us
...these are indeed special .-. The first two only get `store()`'d in `logger/metrics.rs`, but it's not actually an aggregation; it's because `metrics.rs` defines a helper function for setting the metric, and this utility function is then used outside of the metrics module. Similarly, the min/max/sum thingies are set in the `Drop` impl for `LatencyMetricsRecorder`. I don't think there's a general rule that would catch these, so let's just have an "allowlist" that causes the script to ignore precisely these 5 metrics.
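The allowlist could be a simple constant like the sketch below: these five metrics are only touched via a helper in `logger/metrics.rs` or the `Drop` impl of `LatencyMetricsRecorder`, so the script should skip them instead of reporting them. The constant and function names are assumptions, not the script's actual code.

```python
# Sketch only: metrics known to be used indirectly (helper functions or
# Drop impls), which the checker should never report as unused.
KNOWN_USED_METRICS = {
    "process_startup_time_us",
    "process_startup_time_cpu_us",
    "min_us",
    "max_us",
    "sum_us",
}

def should_report_unused(field: str) -> bool:
    return field not in KNOWN_USED_METRICS
```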
Hey @mdhaduk are you planning to continue working on this and taking it to the finish line?
Hey @JackThomson2, yes, I have been meaning to add the integration test to complete this PR; I've just been busy with my current internship. I was thinking of completing this over the weekend, is that alright?
Hey @mdhaduk of course, that's great thank you! If you need any help feel free to reach out :)
Opening this as a Draft PR
Changes
Removed unused metrics; I used this script to generally identify which metrics might not be used, then manually removed them.
To identify unused metrics, I used the following convention:
For SharedIncMetric:
For LatencyAggregateMetrics:
Metrics Removed:
Reason
Resolving issue #2901 - Remove/hook unused metrics
License Acceptance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.

PR Checklist

- `tools/devtool checkstyle` to verify that the PR passes the automated style checks.
- how they are solving the problem in a clear and encompassing way.
- in the PR.
- CHANGELOG.md
- Runbook for Firecracker API changes.
- integration tests.
- TODO
- rust-vmm