Optimize IntMap.alter using unboxed sums. #523


Draft

m-renaud wants to merge 1 commit into haskell:master from m-renaud:optimize-intmap-alter

Conversation

m-renaud (Contributor) commented Feb 1, 2018 · edited

Use ptrEq and unboxed sums (available in GHC >= 8.2) to track whether or not the map required modification. If not, skip rebuilding the spine and return the original map.

Benchmark results:

Before:

benchmarking alter
time 881.1 μs (862.0 μs .. 899.2 μs)
0.996 R² (0.994 R² .. 0.998 R²)
mean 854.9 μs (841.8 μs .. 872.5 μs)
std dev 49.55 μs (38.01 μs .. 69.97 μs)
variance introduced by outliers: 48% (moderately inflated)

After: (41% speedup)

benchmarking alter
time 513.1 μs (506.5 μs .. 519.6 μs)
0.998 R² (0.997 R² .. 0.999 R²)
mean 517.8 μs (511.7 μs .. 528.5 μs)
std dev 27.05 μs (17.74 μs .. 47.52 μs)
variance introduced by outliers: 45% (moderately inflated)
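
For readers skimming the thread, here is a minimal sketch of the shape of the change described above. It is reconstructed from the description, not copied from the patch: the worker alter# (whose signature is quoted later in the thread) returns an unboxed sum whose empty alternative means "nothing changed", and the public alter hands back the original map in that case, so the old spine is shared rather than rebuilt.

{-# LANGUAGE UnboxedSums, UnboxedTuples, BangPatterns, MagicHash #-}

import Data.IntMap.Internal (IntMap, Key)

-- Worker: stubbed out here; its real cases are discussed further down.
-- The empty alternative (# (# #) | #) means "no change was needed";
-- the other alternative carries the rebuilt map.
alter# :: (Maybe a -> Maybe a) -> Key -> IntMap a -> (# (# #) | IntMap a #)
alter# = undefined

-- Public wrapper: only return a new map when the worker built one;
-- otherwise share the original spine untouched.
alter :: (Maybe a -> Maybe a) -> Key -> IntMap a -> IntMap a
alter f !k m = case alter# f k m of
  (# (# #) | #) -> m
  (# | m' #)    -> m'

The open question, explored in the rest of the thread, is whether inspecting and re-wrapping that sum at every level costs more than the sharing saves.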

m-renaud (Contributor, Author) commented Feb 1, 2018

I'll add more fine-grained benchmarks if I get the chance tomorrow. Sending this out for feedback on the code.

--
-- If no modifications are made to the map, (# (# #) | #) is returned; otherwise
-- (# | newMap #) is returned.
alter# :: (Maybe a -> Maybe a) -> Key -> IntMap a -> (# (# #) | IntMap a #)

treeowl (Contributor) commented Feb 1, 2018

What happens if you make the function have type Maybe# a -> Maybe# a, where Maybe# is the unboxed version of Maybe? Couldn't we often avoid the Maybe allocation that way? Benchmark!

m-renaud (Contributor, Author) commented Feb 1, 2018

That function is user-defined (passed to alter), so at some point you would have to do the conversion from Maybe to Maybe#. As far as I know, coerce wouldn't work for f because they have different representations (Maybe and Maybe#, that is), but there may be another way that I'm unaware of. Did you have a specific approach in mind?

treeowl (Contributor) commented Feb 1, 2018

I think it's a decent bet that the function we're passed will be small enough to inline. So calling alter# with

\ p -> fromMaybe (f (toMaybe p))

(essentially) should usually avoid any actual Maybes. Or so I imagine.
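
To make that concrete, here is a rough sketch of what such an unboxed Maybe and its conversions could look like. The names Maybe#, toMaybe, and fromMaybe are illustrative assumptions, not part of this PR:

{-# LANGUAGE UnboxedSums, UnboxedTuples, MagicHash #-}

-- Hypothetical unboxed analogue of Maybe: no constructor allocation;
-- "Nothing" is the empty first alternative of the sum.
type Maybe# a = (# (# #) | a #)

toMaybe :: Maybe# a -> Maybe a
toMaybe (# (# #) | #) = Nothing
toMaybe (# | x #)     = Just x
{-# INLINE toMaybe #-}

fromMaybe :: Maybe a -> Maybe# a
fromMaybe Nothing  = (# (# #) | #)
fromMaybe (Just x) = (# | x #)
{-# INLINE fromMaybe #-}

The worker would then accept a function of type Maybe# a -> Maybe# a, and alter would hand it \ p -> fromMaybe (f (toMaybe p)); if the user's f inlines, GHC can often eliminate the intermediate Just/Nothing boxes entirely.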

m-renaud (Contributor, Author) commented Feb 1, 2018

So, as is always the case, using unboxed sums makes some operations faster, and others a lot slower :(

Before:

benchmarking alter absent
time 283.1 μs (262.8 μs .. 298.1 μs)
0.982 R² (0.977 R² .. 0.991 R²)
mean 270.9 μs (262.3 μs .. 283.0 μs)
std dev 32.15 μs (24.01 μs .. 45.31 μs)
variance introduced by outliers: 84% (severely inflated)
 
benchmarking alter insert
time 277.1 μs (273.3 μs .. 282.3 μs)
0.997 R² (0.995 R² .. 0.999 R²)
mean 275.7 μs (271.7 μs .. 281.7 μs)
std dev 16.49 μs (11.93 μs .. 27.52 μs)
variance introduced by outliers: 56% (severely inflated)
 
benchmarking alter update
time 280.3 μs (276.7 μs .. 284.3 μs)
0.998 R² (0.998 R² .. 0.999 R²)
mean 280.5 μs (277.7 μs .. 284.5 μs)
std dev 11.21 μs (8.620 μs .. 15.42 μs)
variance introduced by outliers: 36% (moderately inflated)
 
benchmarking alter delete
time 282.5 μs (274.8 μs .. 289.7 μs)
0.996 R² (0.995 R² .. 0.998 R²)
mean 281.1 μs (276.0 μs .. 290.9 μs)
std dev 21.75 μs (12.40 μs .. 34.62 μs)
variance introduced by outliers: 68% (severely inflated)
 
benchmarking alter delete absent
time 243.8 μs (240.8 μs .. 246.9 μs)
0.998 R² (0.997 R² .. 0.999 R²)
mean 241.7 μs (239.1 μs .. 245.8 μs)
std dev 11.08 μs (8.402 μs .. 14.84 μs)
variance introduced by outliers: 43% (moderately inflated)

After:

benchmarking alter absent (-16%)
time 237.9 μs (234.0 μs .. 241.8 μs)
0.997 R² (0.995 R² .. 0.999 R²)
mean 237.0 μs (232.9 μs .. 246.5 μs)
std dev 20.16 μs (7.155 μs .. 32.95 μs)
variance introduced by outliers: 74% (severely inflated)
 
benchmarking alter insert (+290%)
time 805.2 μs (783.5 μs .. 828.1 μs)
0.990 R² (0.979 R² .. 0.996 R²)
mean 822.7 μs (799.7 μs .. 872.8 μs)
std dev 110.8 μs (59.63 μs .. 192.9 μs)
variance introduced by outliers: 84% (severely inflated)
 
benchmarking alter update (+250%)
time 705.5 μs (699.2 μs .. 712.7 μs)
0.997 R² (0.995 R² .. 0.999 R²)
mean 703.5 μs (695.1 μs .. 716.0 μs)
std dev 34.66 μs (25.56 μs .. 49.30 μs)
variance introduced by outliers: 41% (moderately inflated)
 
benchmarking alter delete (+255%)
time 720.1 μs (684.8 μs .. 766.0 μs)
0.987 R² (0.978 R² .. 0.998 R²)
mean 684.8 μs (674.2 μs .. 702.9 μs)
std dev 46.01 μs (28.89 μs .. 72.45 μs)
variance introduced by outliers: 57% (severely inflated)
 
benchmarking alter delete absent (-8%)
time 225.4 μs (223.0 μs .. 227.6 μs)
0.999 R² (0.998 R² .. 0.999 R²)
mean 225.1 μs (222.4 μs .. 231.9 μs)
std dev 13.72 μs (6.353 μs .. 26.03 μs)
variance introduced by outliers: 58% (severely inflated)

It looks like, in cases where we need to modify the spine anyway, the extra pattern matches on every Bin (despite being on unboxed sums) add quite a bit of overhead.
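
The hunk quoted in the review thread below shows only the nomatch guard. As a hedged illustration of where that per-level cost comes from (a sketch under an assumed name, not the PR's code), the recursive guards have to case on the child's unboxed-sum result and re-wrap it at every Bin along the search path:

{-# LANGUAGE UnboxedSums, UnboxedTuples, BangPatterns #-}

import Data.IntMap.Internal (IntMap (..), Key, link, nomatch, zero)

-- Sketch of the Bin equation only; the Tip and Nil cases are omitted here.
alterBin :: (Maybe a -> Maybe a) -> Key -> IntMap a -> (# (# #) | IntMap a #)
alterBin f !k t@(Bin p m l r)
  | nomatch k p m = case f Nothing of
      Nothing -> (# (# #) | #)
      Just x  -> (# | link k (Tip k x) p t #)
  | zero k m = case alterBin f k l of
      (# (# #) | #) -> (# (# #) | #)          -- left child unchanged
      (# | l' #)    -> (# | Bin p m l' r #)   -- child changed: rebuild this node
  | otherwise = case alterBin f k r of
      (# (# #) | #) -> (# (# #) | #)          -- right child unchanged
      (# | r' #)    -> (# | Bin p m l r' #)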

alter# f !k t@(Bin p m l r)
  | nomatch k p m = case f Nothing of
      Nothing -> (# (# #) | #)
      Just x -> (# | link k (Tip k x) p t #)

treeowl (Contributor) commented Feb 1, 2018

I don't think you're being strict enough with your return values.

m-renaud (Contributor, Author) commented Feb 6, 2018

Do you mean something like the following?

let !t' = link k (Tip k x) p t in (# | t' #)

Or something else?

treeowl (Contributor) commented Feb 6, 2018

Yes, but see the link below for a clearer way.


treeowl (Contributor) commented Feb 5, 2018

Take a look at treeowl@6f6fa02. That's not very well organized yet, and doesn't deal with the lazy version, but it shows how the strict version can be made more eager, how pattern synonyms can cut down dramatically on the noise, and how we can "unbox" the Maybes consumed and produced by the passed function. Feel free to incorporate those ideas into your branch if you wish.
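
As a rough illustration of the "more eager" part (an assumption about the general idea, not a transcription of treeowl@6f6fa02): forcing the rebuilt map before wrapping it in the sum keeps thunks from hiding behind (# | _ #). A hypothetical helper with an illustrative name:

{-# LANGUAGE UnboxedSums, UnboxedTuples, BangPatterns #-}

import Data.IntMap.Internal (IntMap)

-- Force the new map to WHNF before boxing it into the unboxed-sum result.
changed :: IntMap a -> (# (# #) | IntMap a #)
changed !t = (# | t #)
{-# INLINE changed #-}

Call sites would then read changed (link k (Tip k x) p t) rather than the lazier (# | link k (Tip k x) p t #); this is the same fix as the let !t' suggestion above, packaged in one place.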

treeowl (Contributor) commented Feb 5, 2018

We need to benchmark the unboxed sums version against one that just uses pointer equality on each recursive call.
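
For reference, a hedged sketch of what that pointer-equality variant might look like, written against the internal helpers already used in the snippets above (Bin, Tip, Nil, link, nomatch, zero from Data.IntMap.Internal). The ptrEq helper is spelled out with reallyUnsafePtrEquality# so the sketch stands alone; it mirrors the one in containers' internals but is not this PR's code:

{-# LANGUAGE BangPatterns, MagicHash #-}

import Data.IntMap.Internal (IntMap (..), Key, link, nomatch, zero)
import GHC.Exts (isTrue#, reallyUnsafePtrEquality#)

ptrEq :: a -> a -> Bool
ptrEq x y = isTrue# (reallyUnsafePtrEquality# x y)
{-# INLINE ptrEq #-}

-- No unboxed sums: each recursive call rebuilds the child, and the parent
-- reuses the original node whenever the child comes back pointer-equal.
alterPtrEq :: (Maybe a -> Maybe a) -> Key -> IntMap a -> IntMap a
alterPtrEq f !k t@(Bin p m l r)
  | nomatch k p m = case f Nothing of
      Nothing -> t
      Just x  -> link k (Tip k x) p t
  | zero k m  = let !l' = alterPtrEq f k l
                in if l' `ptrEq` l then t else Bin p m l' r
  | otherwise = let !r' = alterPtrEq f k r
                in if r' `ptrEq` r then t else Bin p m l r'
alterPtrEq f !k t@(Tip ky y)
  | k == ky = case f (Just y) of
      Just x  -> if x `ptrEq` y then t else Tip ky x
      Nothing -> Nil
  | otherwise = case f Nothing of
      Just x  -> link k (Tip k x) ky t
      Nothing -> t
alterPtrEq f !k Nil = case f Nothing of
      Just x  -> Tip k x
      Nothing -> Nil

This trades the unboxed-sum plumbing for one pointer comparison per level, which is exactly the comparison being asked for here.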

m-renaud (Contributor, Author) commented Feb 7, 2018

Thanks for the info, David! I've been really busy the last week; sorry I haven't had time to move this forward as quickly as I'd like. I'm hoping to get around to this sometime this week.

sjakobi (Member) commented Aug 1, 2020

What's the status here, @m-renaud?

In light of #538 it would be nice to make progress on this.

m-renaud (Contributor, Author) commented:

Oh geeze, it's been a while since I've looked at this, and realistically I won't have substantial time to look into it and incorporate the suggestions from the comment above in the near future.

treeowl (Contributor) commented Aug 10, 2020

I will try to look at it shortly. No guarantees.


sjakobi marked this pull request as draft on April 29, 2021.

sjakobi (Member) commented Apr 29, 2021

I've marked this PR as a draft to clarify that it's not ready for final review / merge.

A contributor commented:

Shall I try to take this up and finish it? Anyone know what there is left to do?


treeowl (Contributor) commented Sep 10, 2021

I don't remember, sorry. Lost track of this one. Worth trying!

sjakobi (Member) commented Sep 15, 2021

> Shall I try to take this up and finish it?

That would be great! :)

> Anyone know what there is left to do?

I think the minimum is to fix any strictness issues such as #523 (comment).

If the benchmarks show good improvements at this stage, I think this should be good to merge.

Adopting some of the noise-reduction measures suggested in treeowl@6f6fa02 probably wouldn't hurt, though.

Regarding further optimization ideas, I think those could be left for future work.

Kleidukos (Member) commented Nov 2, 2022 · edited

What is the status on this one? Are these benchmark numbers still the same on GHC 9.2?

treeowl (Contributor) commented Nov 2, 2022

> What is the status on this one? Are these benchmark numbers still the same on GHC 9.2?

Has something changed there?

Kleidukos (Member) commented:

One can only hope, that's why I'm asking :)

treeowl (Contributor) commented Nov 2, 2022

I haven't tested recently. Do you have time to try?

Kleidukos (Member) commented:

No, unfortunately I'm quite committed to Haddock and other stuff for the time being. How's the maintainer structure for containers going, btw? Need any help on that front?
