Enable -fworker-wrapper-cbv on ghc-9.4 #1003

Open
AndreasPK wants to merge 1 commit into haskell:master
from AndreasPK:wip/ww-cbv

Contributor

@AndreasPK AndreasPK commented May 23, 2024

This flag enables some significant performance improvements, and its downsides don't apply to containers.

In particular, the flag can cause rewrite rules to not fire if the relevant functions don't have NOINLINE pragmas. However, the relevant functions in containers seem to have such a pragma, so there should be no downside.
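
For readers unfamiliar with the pattern, here is a minimal sketch of a function that carries both a rewrite rule and a NOINLINE pragma. The function `clampNonNeg` and its rule are hypothetical and not taken from containers. The phase annotation `[1]` is one common convention, assumed here purely for illustration.

```haskell
module Main (main) where

-- Hypothetical helper (not part of containers): clamp negative numbers to 0.
clampNonNeg :: Int -> Int
clampNonNeg n = if n < 0 then 0 else n
-- Keep calls to 'clampNonNeg' visible under that name until phase 1,
-- so the rule below has a chance to match them first.
{-# NOINLINE [1] clampNonNeg #-}

-- Clamping twice is the same as clamping once, so this rewrite is
-- semantics-preserving whether or not it fires.
{-# RULES
"clampNonNeg/idempotent" forall n. clampNonNeg (clampNonNeg n) = clampNonNeg n
  #-}

main :: IO ()
main = print (clampNonNeg (clampNonNeg (-5)))  -- prints 0
```

The program prints 0 whether or not the rule fires. The pragma only affects whether the optimiser still sees a call to `clampNonNeg` that the rule can match.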

Here are the results for map-benchmarks:

Warning: Unknown/unsupported 'ghc' version detected (Cabal 3.8.1.0 supports
'ghc' version < 9.6): /opt/ghc-9.10.1/bin/ghc is version 9.10.1
Resolving dependencies...
Up to date
All
 lookup absent: OK
 89.4 μs ± 6.4 μs, same as baseline
 lookup present: OK
 75.3 μs ± 5.7 μs, same as baseline
 map: OK
 35.8 μs ± 3.0 μs, same as baseline
 map really: OK
 84.9 μs ± 5.3 μs, same as baseline
 <$: OK
 23.9 μs ± 1.4 μs, 11% less than baseline
 <$ really: OK
 57.1 μs ± 2.6 μs, same as baseline
 alterF lookup absent: OK
 89.6 μs ± 5.8 μs, same as baseline
 alterF lookup present: OK
 75.0 μs ± 5.3 μs, same as baseline
 alterF no rules lookup absent: OK
 91.6 μs ± 6.1 μs, same as baseline
 alterF no rules lookup present: OK
 80.9 μs ± 5.5 μs, same as baseline
 insert absent: OK
 209 μs ± 14 μs, 20% less than baseline
 insert present: OK
 159 μs ± 12 μs, 23% less than baseline
 alterF insert absent: OK
 253 μs ± 22 μs, 20% less than baseline
 alterF insert present: OK
 180 μs ± 11 μs, same as baseline
 alterF no rules insert absent: OK
 296 μs ± 23 μs, 16% less than baseline
 alterF no rules insert present: OK
 223 μs ± 11 μs, same as baseline
 delete absent: OK
 143 μs ± 12 μs, same as baseline
 delete present: OK
 199 μs ± 11 μs, 25% less than baseline
 alterF delete absent: OK
 163 μs ± 14 μs, same as baseline
 alterF delete present: OK
 234 μs ± 22 μs, 22% less than baseline
 alterF no rules delete absent: OK
 97.8 μs ± 6.0 μs, same as baseline
 alterF no rules delete present: OK
 272 μs ± 26 μs, 19% less than baseline
 alter absent: OK
 211 μs ± 11 μs, 23% less than baseline
 alter insert: OK
 215 μs ± 11 μs, 24% less than baseline
 alter update: OK
 170 μs ± 11 μs, 24% less than baseline
 alter delete: OK
 213 μs ± 13 μs, 25% less than baseline
 alterF alter absent: OK
 169 μs ± 11 μs, same as baseline
 alterF alter insert: OK
 245 μs ± 23 μs, 19% less than baseline
 alterF alter update: OK
 177 μs ± 11 μs, 9% less than baseline
 alterF alter delete: OK
 240 μs ± 22 μs, 21% less than baseline
 alterF no rules alter absent: OK
 98.5 μs ± 5.5 μs, same as baseline
 alterF no rules alter insert: OK
 293 μs ± 22 μs, 16% less than baseline
 alterF no rules alter update: OK
 222 μs ± 11 μs, same as baseline
 alterF no rules alter delete: OK
 272 μs ± 22 μs, 19% less than baseline
 insertWith absent: OK
 209 μs ± 11 μs, 20% less than baseline
 insertWith present: OK
 164 μs ± 11 μs, 22% less than baseline
 insertWith' absent: OK
 201 μs ± 11 μs, 20% less than baseline
 insertWith' present: OK
 171 μs ± 13 μs, 20% less than baseline
 insertWithKey absent: OK
 215 μs ± 11 μs, 21% less than baseline
 insertWithKey present: OK
 164 μs ± 11 μs, 23% less than baseline
 insertWithKey' absent: OK
 201 μs ± 11 μs, 21% less than baseline
 insertWithKey' present: OK
 159 μs ± 12 μs, 23% less than baseline
 insertLookupWithKey absent: OK
 216 μs ± 12 μs, 23% less than baseline
 insertLookupWithKey present: OK
 172 μs ± 11 μs, 23% less than baseline
 insertLookupWithKey' absent: OK
 207 μs ± 11 μs, 24% less than baseline
 insertLookupWithKey' present: OK
 174 μs ± 13 μs, 22% less than baseline
 mapWithKey: OK
 39.4 μs ± 3.5 μs, same as baseline
 foldlWithKey: OK
 351 μs ± 23 μs, 24% less than baseline
 foldlWithKey': OK
 15.0 μs ± 816 ns, same as baseline
 foldrWithKey: OK
 61.5 ns ± 5.3 ns, same as baseline
 foldrWithKey': OK
 32.2 μs ± 1.4 μs, same as baseline
 update absent: OK
 189 μs ± 12 μs, 21% less than baseline
 update present: OK
 146 μs ± 11 μs, 24% less than baseline
 update delete: OK
 200 μs ± 13 μs, 25% less than baseline
 updateLookupWithKey absent: OK
 199 μs ± 11 μs, 21% less than baseline
 updateLookupWithKey present: OK
 161 μs ± 11 μs, 23% less than baseline
 updateLookupWithKey delete: OK
 207 μs ± 12 μs, 24% less than baseline
 mapMaybe: OK
 95.0 μs ± 3.4 μs, 14% less than baseline
 mapMaybeWithKey: OK
 94.4 μs ± 5.7 μs, 10% less than baseline
 lookupIndex: OK
 167 μs ± 11 μs, same as baseline
 union: OK
 72.7 μs ± 5.9 μs, 26% less than baseline
 difference: OK
 63.3 μs ± 6.0 μs, 17% less than baseline
 intersection: OK
 28.9 μs ± 2.7 μs, 11% less than baseline
 split: OK
 7.58 ns ± 644 ps, same as baseline
 fromList: OK
 59.5 μs ± 2.8 μs, 4% less than baseline
 fromList-desc: OK
 363 μs ± 14 μs, 30% less than baseline
 fromAscList: OK
 78.6 μs ± 5.3 μs, same as baseline
 fromDistinctAscList: OK
 34.5 μs ± 2.9 μs, same as baseline
 fromDistinctAscList:fusion: OK
 30.9 μs ± 2.7 μs, same as baseline
 fromDistinctDescList: OK
 33.2 μs ± 1.6 μs, same as baseline
 fromDistinctDescList:fusion: OK
 30.3 μs ± 2.7 μs, same as baseline
 minView: OK
 18.8 ns ± 1.3 ns, 22% less than baseline
All 72 tests passed (13.93s)

Mikolaj, meooow25, and treeowl reacted with hooray emoji
Contributor Author

For reference, I benchmarked this using GHC 9.10 on a Skylake machine. If others could try to reproduce these results, I would be grateful.

Contributor Author

AndreasPK commented May 23, 2024

There is a segfault on 9.4.8. Weirdly enough, I'm not yet sure it has anything to do with this flag. In particular, GHC segfaults when building the Main.hs of the seq-properties tests:

Loading unit process-1.6.18.0 ... linking ... done.
Loading unit transformers-compat-0.7.2 ... linking ... done.
Loading unit optparse-applicative-0.18.1.0 ... linking ... done.
Loading unit tagged-0.8.8 ... linking ... done.
Loading unit stm-2.5.1.0 ... linking ... done.
Loading unit tasty-1.5 ... linking ... done.
Loading unit tasty-quickcheck-0.10.3 ... linking ... done.
Loading unit call-stack-0.4.0 ... linking ... done.
Loading unit tasty-hunit-0.10.1 ... linking ... done.
Loading unit ghc-heap-9.10.1 ... linking ... done.
Loading unit primitive-0.9.0.0 ... linking ... done.
Loading unit vector-stream-0.1.0.1 ... linking ... done.
Loading unit vector-0.13.1.0 ... linking ... done.
Loading unit wherefrom-compat-0.1.1.0 ... linking ... done.
Loading unit nothunks-0.2.1.0 ... linking ... done.
Loading unit containers-tests-0 ... linking ... done.
Search directories (user):
Search directories (gcc):
Segmentation fault (core dumped)

I opened a GHC ticket.

Contributor Author

Sadly, it seems there is a GHC bug in 9.4+ that causes the segfault, so I will let this rest until new point releases containing a fix are out.

Contributor

Is the GHC issue fixed? If not, why don't we just enable the flag on >= 9.6?

Copy link
Contributor Author

AndreasPK commented Feb 19, 2025

> Is the GHC issue fixed? If not, why don't we just enable the flag on >= 9.6?

I believe 9.10.1 does not yet have the fix, but 9.10.2 isn't far away and contains it. Once that is out, all recent GHC versions will have it. Here is the upstream ticket: https://gitlab.haskell.org/ghc/ghc/-/issues/24870
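
As a rough sketch of what gating the flag on the compiler version could look like at the level of a single module, the snippet below uses CPP around an `OPTIONS_GHC` pragma. This is an assumption for illustration and not necessarily how this PR enables the flag. The `906` bound (GHC 9.6) simply mirrors the question above and is a placeholder. A real bound would need to exclude any releases still affected by the upstream bug.

```haskell
{-# LANGUAGE CPP #-}
-- Placeholder bound: 906 corresponds to GHC 9.6. Adjust it to whichever
-- releases actually contain the fix for the upstream GHC issue.
#if __GLASGOW_HASKELL__ >= 906
{-# OPTIONS_GHC -fworker-wrapper-cbv #-}
#endif
module Main (main) where

import Data.List (foldl')

-- Trivial body so the module compiles and runs on its own; the flag only
-- influences how GHC optimises the code, not its result.
main :: IO ()
main = print (foldl' (+) 0 [1 .. 10 :: Int])  -- prints 55
```

On compilers below the bound the pragma is skipped entirely, so older GHC versions never see the flag.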

meooow25 reacted with thumbs up emoji
