perf: improve performance of 2018 day 2 part 2 #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Merged

maneatingape merged 4 commits into maneatingape:main from Rolv-Apneseth:2018_day2_p2_perf

Jul 12, 2025

Merged

perf: improve performance of 2018 day 2 part 2 #11

maneatingape merged 4 commits into maneatingape:main from Rolv-Apneseth:2018_day2_p2_perf

Jul 12, 2025

Conversation

@Rolv-Apneseth

Copy link

Contributor

@Rolv-Apneseth Rolv-Apneseth commented Jul 12, 2025

Hi again. This is a quick one for improving the performance of 2018 day 2 part 2 (85μs => 15μs for me). Let me know if you want any changes, and no hard feelings if you don't want the PR at all.

Rolv-Apneseth added 3 commits

July 12, 2025 13:27

@Rolv-Apneseth


 perf: improve performance of 2018 day 2 part 2

6977c03

@Rolv-Apneseth


 fix: clippy lint

22569fd

@Rolv-Apneseth


 fix: clippy lint for the clippy lint

f6c50ce

@Rolv-Apneseth Rolv-Apneseth force-pushed the 2018_day2_p2_perf branch from 02094ef to f6c50ce Compare

July 12, 2025 12:39

@maneatingape

Copy link

Owner

maneatingape commented Jul 12, 2025

Let me know if you want any changes, and no hard feelings if you don't want the PR at all.

I really appreciate anyone taking the time to make performance improvements or showing how solutions could be better, so all PR are much appreciated!

maneatingape

maneatingape requested changes

Jul 12, 2025

View reviewed changes

src/year2018/day02.rs Outdated Show resolved Hide resolved

src/year2018/day02.rs Outdated

buffer[0..width].copy_from_slice(id);

buffer[column] = b'*';

let mut diff = false;

for (a, b) in id1.iter().zip(id2) {

Copy link

Owner

@maneatingape maneatingape Jul 12, 2025 •

edited

Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes a performance regression on my input, 87 μs to 129 μs.

Two nested loops have worst case quadratic complexity O(n2).
Hashing is a constant time operation so the original solution worst case complexity is O(1) * 26 * O(n) = O(n).

So it looks like you're getting lucky, while I'm getting unlucky with the input order.

Copy link

Contributor Author

@Rolv-Apneseth Rolv-Apneseth Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no expert in O(N) notation, but wouldn't the N here be the number of IDs, so it should be lower than O(N^2) (second loop starts at the index of the first loop)?

Anyway, fair enough about the performance being worse. Just luck based that it's so much faster for my input then. I'll close the PR if that's alright with you, but out of curiosity would mind sharing your input too?

Copy link

Owner

@maneatingape maneatingape Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no expert in O(N) notation, but wouldn't the N here be the number of IDs, so it should be lower than O(N^2) (second loop starts at the index of the first loop)?

Say you have 10 ids. First inner loop iteration will check 9 items, next iteration 8 and so on...
9 + 8 + 7 + ... + 2 + 1 = n(n + 1) / 2 = (1/2)(n2 + n)
So you save half the comparisons but it's still overall quadratic.

Anyway, fair enough about the performance being worse. Just luck based that it's so much faster for my input then. I'll close the PR if that's alright with you, but out of curiosity would mind sharing your input too?

If you re-order the elements (for example swap the first and second halves) you can change the performance drastically. With a few random swaps I got benchmarks from 33 μs to 250 μs.

Copy link

Owner

@maneatingape maneatingape Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it might be possible to improve the performance by taking advantage of the special structure of the input (always 26 lower case ASCII characters).

Copy link

Contributor Author

@Rolv-Apneseth Rolv-Apneseth Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you save half the comparisons but it's still overall quadratic.

I seeee, thank you for helping me understand.

If you re-order the elements (for example swap the first and second halves) you can change the performance drastically

Ah yes, can confirm moving one of the solutions can make it much worse. I really did get quite lucky.

I still think it might be possible to improve the performance by taking advantage of the special structure of the input (always 26 lower case ASCII characters).

I'll have a look at it again later and see if I can come up with anything (unless you do first)

@maneatingape

Copy link

Owner

maneatingape commented Jul 12, 2025

BTW I welcome PRs (even if they don't always work out). It's always an interesting discussion and could spark an improvement either in this or another solution.

@Rolv-Apneseth

Copy link

Contributor Author

Rolv-Apneseth commented Jul 12, 2025

BTW I welcome PRs (even if they don't always work out). It's always an interesting discussion and could spark an improvement either in this or another solution.

Sure yeah. Thank you for being so understanding.

@Rolv-Apneseth


 refactor: revert back to the original approach - but only insert the ...

845d760

...prefix and suffix of IDs into the hashset

@maneatingape maneatingape closed this

Jul 12, 2025

@Rolv-Apneseth

Copy link

Contributor Author

Rolv-Apneseth commented Jul 12, 2025

I tried a couple different (more complicated) approaches without much luck. Went back to the original approach and changed it from storing the ID with a character replaced in the set, to just storing the prefix and suffix around the character, and I believe there is some performance gained from that

@Rolv-Apneseth

Copy link

Contributor Author

Rolv-Apneseth commented Jul 12, 2025

If you're interested @maneatingape, it was just a simple change but for me it's giving around 30-45% speed up:

pub fn part2(input: &[&[u8]]) -> String {
 let width = input[0].len();
 let mut seen = FastSet::with_capacity(input.len());
 // Use a set to check for duplicates by comparing the prefix and suffix of IDs excluding one
 // column at a time.
 for column in 0..width {
 for &id in input {
 let prefix = &id[..column];
 let suffix = &id[column + 1..];
 if !seen.insert([prefix, suffix]) {
 // Convert to String
 return prefix.iter().chain(suffix).cloned().map(char::from).collect();
 }
 }
 seen.clear();
 }
 unreachable!()
}

@maneatingape maneatingape reopened this

Jul 12, 2025

maneatingape

maneatingape approved these changes

Jul 12, 2025

View reviewed changes

Copy link

Owner

@maneatingape maneatingape left a comment •

edited

Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Faster and more elegant.

79 μs => 49 μs.

@maneatingape maneatingape merged commit a8a3007 into maneatingape:main

Jul 12, 2025

4 of 6 checks passed

@Rolv-Apneseth

Copy link

Contributor Author

Rolv-Apneseth commented Jul 12, 2025

Missed that lint - not getting these from clippy locally but yeah cloned -> copied makes sense

@Rolv-Apneseth Rolv-Apneseth deleted the 2018_day2_p2_perf branch

July 12, 2025 18:52

Labels

None yet

2 participants

@Rolv-Apneseth @maneatingape

perf: improve performance of 2018 day 2 part 2 #11

perf: improve performance of 2018 day 2 part 2 #11

Uh oh!

Conversation

@Rolv-Apneseth Rolv-Apneseth commented Jul 12, 2025

Uh oh!

maneatingape commented Jul 12, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

@maneatingape maneatingape Jul 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

@Rolv-Apneseth Rolv-Apneseth Jul 12, 2025

Choose a reason for hiding this comment

Uh oh!

@maneatingape maneatingape Jul 12, 2025

Choose a reason for hiding this comment

Uh oh!

@maneatingape maneatingape Jul 12, 2025

Choose a reason for hiding this comment

Uh oh!

@Rolv-Apneseth Rolv-Apneseth Jul 12, 2025

Choose a reason for hiding this comment

Uh oh!

maneatingape commented Jul 12, 2025

Uh oh!

Rolv-Apneseth commented Jul 12, 2025

Uh oh!

Rolv-Apneseth commented Jul 12, 2025

Uh oh!

Rolv-Apneseth commented Jul 12, 2025

Uh oh!

@maneatingape maneatingape left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Rolv-Apneseth commented Jul 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

@maneatingape maneatingape Jul 12, 2025 •

edited

Loading

@maneatingape maneatingape left a comment •

edited

Loading