BitVector::build_index: 100x speedup #28
Process the BitVector unit-by-unit instead of bit-by-bit. Use PopCount::count() to update num_1s and use select_bit to find the bit positions for the select0s_ and select1s_ indexes.

According to my benchmarks, the old bit-by-bit version processed a 256kbit vector at about 20MB/s, independent of enables_select0 and enables_select1. The new version is 50x-150x faster, depending on the compiler and build_index options.

enables_select0=enables_select1=false:
no popcnt, no bmi2: 1600MB/s
popcnt, no bmi2: 2500MB/s
popcnt and bmi2: 2900MB/s

enables_select0=enables_select1=true:
no popcnt, no bmi2: 1100MB/s
popcnt, no bmi2: 1600MB/s
popcnt and bmi2: 1800MB/s
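To illustrate the unit-by-unit idea, here is a minimal sketch, not the code from this PR. It counts the 1 bits of each 64-bit unit in one step and performs an in-word select only when a sampled rank falls inside that unit. All names (popcount64, select_in_word, SketchIndex, build_index_sketch) and the sample_interval parameter are hypothetical, and the portable popcount stands in for whatever PopCount::count() does in the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count; a POPCNT instruction or compiler intrinsic
// would normally replace this.
inline unsigned popcount64(std::uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return static_cast<unsigned>((x * 0x0101010101010101ULL) >> 56);
}

// Position (0-based, counted from the least significant bit) of the i-th
// set bit of x, with i also 0-based. Precondition: i < popcount64(x).
inline unsigned select_in_word(std::uint64_t x, unsigned i) {
  for (; i > 0; --i) x &= x - 1;  // clear the lowest set bit i times
  unsigned pos = 0;
  while ((x & 1) == 0) {          // locate the lowest remaining set bit
    x >>= 1;
    ++pos;
  }
  return pos;
}

// Hypothetical result type: total number of 1 bits plus the position of
// every sample_interval-th 1 bit (ranks 0, sample_interval, ...).
struct SketchIndex {
  std::size_t num_1s = 0;
  std::vector<std::size_t> select1s;
};

// Assumes sample_interval > 0.
SketchIndex build_index_sketch(const std::vector<std::uint64_t>& units,
                               std::size_t sample_interval) {
  SketchIndex index;
  for (std::size_t u = 0; u < units.size(); ++u) {
    const std::uint64_t unit = units[u];
    const unsigned ones = popcount64(unit);
    // Smallest sampled rank that is >= the number of 1s seen so far.
    std::size_t next =
        (index.num_1s + sample_interval - 1) / sample_interval * sample_interval;
    // Only units that contain a sampled rank pay for a select.
    while (next < index.num_1s + ones) {
      const unsigned i = static_cast<unsigned>(next - index.num_1s);
      index.select1s.push_back(u * 64 + select_in_word(unit, i));
      next += sample_interval;
    }
    index.num_1s += ones;
  }
  return index;
}
```

The same loop applied to the complemented unit (~unit) would give the samples for the 0 bits. The per-bit branch of a bit-by-bit loop disappears, which is where a speedup of this kind would be expected to come from.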
This is safe and there is no truncation.
The 32-bit select_bit will be used to make the new build_index implementation work for MARISA_WORD_SIZE == 32.
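As a rough illustration of what a select on a 32-bit unit can look like, the sketch below narrows down the position of the i-th set bit by halving the search window and counting the 1 bits in the lower half. It is an assumption-laden example, not the select_bit from this PR: the names popcount32 and select_bit32_sketch and the argument order are hypothetical.

```cpp
#include <cstdint>

// Portable 32-bit population count.
inline unsigned popcount32(std::uint32_t x) {
  x = x - ((x >> 1) & 0x55555555u);
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
  x = (x + (x >> 4)) & 0x0F0F0F0Fu;
  return (x * 0x01010101u) >> 24;
}

// Returns the 0-based position of the i-th (0-based) set bit of unit.
// Precondition: i < popcount32(unit).
inline unsigned select_bit32_sketch(unsigned i, std::uint32_t unit) {
  unsigned pos = 0;
  // Window widths 16, 8, 4, 2, 1: at each step, decide whether the
  // target bit lies in the lower or upper half of the current window.
  for (unsigned width = 16; width > 0; width >>= 1) {
    const std::uint32_t mask = (1u << width) - 1u;
    const unsigned low_ones = popcount32((unit >> pos) & mask);
    if (i >= low_ones) {
      i -= low_ones;  // skip the 1 bits in the lower half
      pos += width;   // continue in the upper half
    }
  }
  return pos;
}
```

Five popcount steps bound the cost regardless of where the bit sits, which keeps the per-unit work constant when the unit type is 32 bits wide.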
This is already used by build_index and fixes the 32-bit build.
Benchmark
The following table shows build speed [1,000 keys/second].
jmr:build-index is 2-3% faster than s-yata:master.
Did you configure with …? My benchmark was just on …. I will have time to run/profile the benchmarks myself later in the week.
The table shows the speed of dictionary construction and …
It looks good to me.