Use rapidhashNano on folly::hasher<string/range> #9617

Open
wants to merge 2 commits into
base: master
from

Conversation

Nicoshev
Contributor

@Nicoshev Nicoshev commented Jun 5, 2025

Summary:
X-link: facebook/folly#2448

Replacing SpookyHashV2 with rapidhashNano

folly::hasher::operator() accounts for almost $3M in $cpu_t1_equiv_per_year_q2_2025: https://fburl.com/strobelight/izute4k3
Given that integral hashing is the identity function, most of the registered cycles should come from hashing strings and byte ranges.
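
To illustrate the reasoning above, here is a minimal sketch of a hasher whose integral overload is the identity and whose string overload walks every byte. `SketchHasher` and `checkSketch` are hypothetical names, and the byte hash is 64-bit FNV-1a, used only as a simple stand-in; it is neither SpookyHashV2 nor rapidhashNano, and this is not folly's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <type_traits>

// Hypothetical stand-in for a generic hasher; NOT folly::hasher.
template <typename T, typename = void>
struct SketchHasher;

// Integral keys: the hash is the value itself, so it costs almost nothing.
template <typename T>
struct SketchHasher<T, std::enable_if_t<std::is_integral_v<T>>> {
  constexpr std::uint64_t operator()(T v) const noexcept {
    return static_cast<std::uint64_t>(v);
  }
};

// String keys: every byte is mixed in, so the cost grows with the length.
// FNV-1a (64-bit) is an assumption chosen for brevity.
template <>
struct SketchHasher<std::string_view> {
  constexpr std::uint64_t operator()(std::string_view s) const noexcept {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (char c : s) {
      h ^= static_cast<unsigned char>(c);
      h *= 1099511628211ull;  // FNV prime
    }
    return h;
  }
};

// Small self-check showing the two code paths.
inline void checkSketch() {
  assert(SketchHasher<int>{}(42) == 42u);  // identity
  assert(SketchHasher<std::string_view>{}("") == 14695981039346656037ull);
  assert(SketchHasher<std::string_view>{}("a") == 0xaf63dc4c8601ec8cull);
}
```

Under this model, any cycles attributed to the hasher have to come from the per-byte loop, which is the part this pull request swaps out.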

See D66326393 and D75697257 for a detailed discussion around benchmarks and canaries

Differential Revision: D76052916

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D76052916

Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from ddeb5e5 to 67ec734 Compare June 5, 2025 18:54
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 67ec734 to e5fac7f Compare June 5, 2025 18:58
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from e5fac7f to 4f227c0 Compare June 5, 2025 19:56
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 4f227c0 to 4da9652 Compare June 5, 2025 19:57
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 4da9652 to 30df8f2 Compare June 5, 2025 20:00
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 30df8f2 to 249b6dd Compare June 5, 2025 20:02
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 249b6dd to c47d9be Compare June 5, 2025 20:05
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from c47d9be to ef712fe Compare June 5, 2025 20:10
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 1dc6d0a to 22d2d37 Compare June 5, 2025 20:20
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 5, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 22d2d37 to abcc873 Compare June 5, 2025 20:27
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 5, 2025
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 6, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 6, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from abcc873 to 895a475 Compare June 6, 2025 21:02
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 6, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 6, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 895a475 to 99eeb69 Compare June 6, 2025 21:07
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 7, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 99eeb69 to 54fcd6d Compare June 7, 2025 16:59
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 7, 2025
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 7, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from 54fcd6d to c6b0672 Compare June 7, 2025 17:05
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 7, 2025
Summary:
Pull Request resolved: facebook#9617

X-link: facebook/folly#2448

Replacing SpookyHashV2 with rapidhashNano

folly::hasher::operator() accounts for almost 3M$ in $cpu_t1_equiv_per_year_q2_2025 https://fburl.com/strobelight/izute4k3
Given that integral hashing is the identity function, most of the registered cycles should come from strings/byteRanges

See D66326393 and D75697257 for a detailed discussion around benchmarks and canaries

Differential Revision: D76052916
Summary:
We've seen high CPU usage from two hashing functions: std::_Hash_bytes and multifeed/common/Hash.h::hashBytesImpl

For the record, std::_Hash_bytes compiles to ~60 instructions on aarch64 and ~100 instructions on AMD64: https://godbolt.org/z/xeoqf1aaE

hashBytesImpl compiles to slightly over 100 instructions on aarch64 and slightly over 160 instructions on AMD64: https://godbolt.org/z/bTroqGE7o

The diff adds three new hash functions: rapidhash, rapidhashMicro and rapidhashNano

RapidhashNano is designed for situations where keeping code size small is a top priority.
Clang-19 compiles it to fewer than 100 instructions without stack usage, both on x86-64 and aarch64.
It is the fastest for inputs up to 48 bytes, but may be considerably slower for larger inputs.

RapidhashMicro is designed for situations where cache misses noticeably hurt performance.
Clang-19 compiles it to ~140 instructions without stack usage, both on x86-64 and aarch64.
It is faster for inputs up to 512 bytes and only 15%-20% slower for inputs above 1 KB.

rapidhash provides formidable speed across all input sizes.
Clang-19 compiles it to ~185 instructions, both on x86-64 and aarch64.
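
The trade-offs listed above can be read as a rough selection rule. The sketch below is only one way to encode them: `RapidhashVariant` and `pickVariant` are hypothetical names that do not exist in folly, the 48-byte and 512-byte thresholds are taken from the summary, and treating code size as an override is an assumption. Real choices should come from the per-workload benchmarks.

```cpp
#include <cstddef>

// Hypothetical labels for the three functions added by the diff.
enum class RapidhashVariant { Nano, Micro, Full };

// Hypothetical helper: picks a variant from the typical input length.
// Thresholds (48 and 512 bytes) come from the summary; the rest is an
// illustrative assumption, not the author's rule.
constexpr RapidhashVariant pickVariant(std::size_t typicalInputBytes,
                                       bool codeSizeCritical) {
  if (codeSizeCritical || typicalInputBytes <= 48) {
    return RapidhashVariant::Nano;   // smallest code, best on short inputs
  }
  if (typicalInputBytes <= 512) {
    return RapidhashVariant::Micro;  // mid-sized code, good up to 512 bytes
  }
  return RapidhashVariant::Full;     // largest code, best on large inputs
}

static_assert(pickVariant(16, false) == RapidhashVariant::Nano);
static_assert(pickVariant(256, false) == RapidhashVariant::Micro);
static_assert(pickVariant(4096, false) == RapidhashVariant::Full);
```

A short-key hash map would land on the Nano branch under this rule, which matches the choice this pull request makes for string and range hashing.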

Benchmark results on BGM: P1826606121, and Grace: P1826591223

On AMD64, RapidhashNano should be strictly better than both std::_Hash_bytes and hashBytesImpl.

On aarch64, std::_Hash_bytes compiles to fewer instructions. RapidhashNano should still be faster in most situations, given its much higher throughput. It should also be strictly better than hashBytesImpl.

In many situations, RapidhashMicro should be a better choice, due to its higher throughput. This diff allows us to analyze workloads on a case-by-case basis.

rapidhash seems to be the fastest high-quality hash function for aarch64 systems. It may still be useful for large inputs.

Folly's benchmark results have been updated to include runs from Bergamo and Neoverse-V2

Differential Revision: D66326393
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 9, 2025
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from c6b0672 to e4b893d Compare June 9, 2025 21:36
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Jun 9, 2025
Nicoshev added a commit to Nicoshev/folly that referenced this pull request Jun 9, 2025
Summary:
X-link: facebook/hhvm#9617

Pull Request resolved: facebook#2448

Replacing SpookyHashV2 with rapidhashNano

folly::hasher::operator() accounts for almost 3M$ in $cpu_t1_equiv_per_year_q2_2025 https://fburl.com/strobelight/izute4k3
Given that integral hashing is the identity function, most of the registered cycles should come from strings/byteRanges

See D66326393 and D75697257 for a detailed discussion around benchmarks and canaries

Differential Revision: D76052916
Summary:
Pull Request resolved: facebook#9617

X-link: facebook/folly#2448

Replacing SpookyHashV2 with rapidhashNano

folly::hasher::operator() accounts for almost 3M$ in $cpu_t1_equiv_per_year_q2_2025 https://fburl.com/strobelight/izute4k3
Given that integral hashing is the identity function, most of the registered cycles should come from strings/byteRanges

See D66326393 and D75697257 for a detailed discussion around benchmarks and canaries

Differential Revision: D76052916
@Nicoshev Nicoshev force-pushed the export-D76052916 branch from e4b893d to 26465d9 Compare June 9, 2025 21:43

2 participants