-
Notifications
You must be signed in to change notification settings - Fork 102
WIP: Represent empty maps as BitmapIndexed 0 []
#579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| -- empty vs. anything | ||
| go !_ t1 Empty = t1 | ||
| go _ Empty t2 = t2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just one of probably many performance bugs:
- We want to be strict in the
Shiftargument! - Any BmI vs. BmI cases (including empties) are now handled via
unionArrayBy, which doesn't include any shortcuts for handling empties yet!
| new :: Int -> a -> ST s (MArray s a) | ||
| new _n@(I# n#) b = | ||
| CHECK_GT("new",_n,(0 :: Int)) | ||
| CHECK_GE("new",_n,(0 :: Int)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest you leave this as it was, and define empty from first principles. We really don't want to create any other empty arrays!
| in HM.bitmapIndexedOrFull (b .|. m) ary' | ||
| let !l = leaf h k x | ||
| in if b == 0 | ||
| then l |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmmm.... The extent of special casing is making me skeptical about this whole endeavor, especially if I'm right that we can't safely use pointer equality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO this should still improve branch prediction. We've simply moved the Empty alternative from the outer-most "layer" to this inner branch.
However we still check for the empty case at every level of the tree, which is still suboptimal.
What we can do more easily now, is to split off an inner function that doesn't need to check for empties.
Something like:
insertWith f k v Empty = Leaf ...
insertWith f k v m = go ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, we could know the branch at the tag check, before loading the heap object. Now we need to wait. How's that improve matters?
8021eb6 to
81fbd2e
Compare
|
So far (at 81fbd2e), the For present keys, the speedups are around 5–10% for size >=10. For absent keys, there is of course a slowdown of about 10% for the empty map. But the other sizes are faster, especially the smaller ones. |
|
size 1 is about 10% faster and for the sizes >= 50, I see a speedup of ~3–8%. EDIT: After a small refactoring, |
$ cabal run fine-grained -- -p size --stdev 1
All
HashMap.Strict
size
Int
0: OK
2.04 ns ± 22 ps
1: OK
3.01 ns ± 40 ps
5: OK
20.0 ns ± 128 ps
10: OK
45.3 ns ± 568 ps
50: OK
186 ns ± 1.1 ns
100: OK
396 ns ± 4.1 ns
500: OK
1.88 μs ± 5.8 ns
1000: OK
4.59 μs ± 84 ns
5000: OK
28.3 μs ± 483 ns
10000: OK
56.2 μs ± 885 ns
50000: OK
387 μs ± 4.4 μs
100000: OK
735 μs ± 4.9 μs
500000: OK
5.36 ms ± 51 μs
81fbd2e to
f02530b
Compare
...in order to reduce code size.
| (# st #) -> | ||
| let !st' = go collPos (nextSH shiftedHash) k st | ||
| -- These let-bindings help GHC form join points in order to | ||
| -- prevent code duplication. | ||
| deletion = BitmapIndexed (b .&. complement m) (A.delete ary i) | ||
| update_ = BitmapIndexed b (A.update ary i st') | ||
| {-# NOINLINE update_ #-} | ||
| in case st' of | ||
| Empty | A.length ary == 2 | ||
| , (# l #) <- A.index# ary (otherOfOneOrZero i) | ||
| , isLeafOrCollision l | ||
| -> l | ||
| | otherwise | ||
| -> deletion | ||
| _ | isLeafOrCollision st' && A.length ary == 1 -> st' | ||
| | otherwise -> update_ | ||
| where m = maskSH shiftedHash | ||
| i = sparseIndex b m | ||
| go collPos shiftedHash k (Full ary) = | ||
| case A.index# ary i of | ||
| (# st #) -> case go collPos (nextSH shiftedHash) k st of | ||
| Empty -> | ||
| (# st #) -> | ||
| let !st' = go collPos (nextSH shiftedHash) k st | ||
| -- This let-binding helps GHC form a join point in order to | ||
| -- prevent code duplication. | ||
| update_ = Full (updateFullArray ary i st') | ||
| {-# NOINLINE update_ #-} | ||
| in if null st' | ||
| then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code duplication issue is pretty annoying. Apparently, when a constructor check is replaced by a constructor check and an additional field comparison (bitmap == 0), GHC tends to duplicate the code for the fallthrough case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative way to handle this, would be to change the return type of go to (# (# #) | HashMap k v #), where the "Nothing"-case represents an empty map.
Addresses #578.
TODO:
lookupdeleteadjustsizedisjointfirst.empty.