Files · 3e59efa2ad1d1777257bd3b1845d5acc4a931687 · open / abseil-cpp

Optimize `absl::Hash` by making `LowLevelHash` faster. · 3e59efa2

Throughput of the 64-byte chunk loop inside `LowLevelHash` (now in `LowLevelHashLenGt16`) is limited by the loop-carried dependency on `current_state`. Using 4 states instead of 2 shortens that dependency chain by 1 cycle per iteration. On Skylake, an iteration drops from 9 cycles to 8 cycles (12.5% higher throughput asymptotically).

To see the reduction in a simplified version of the `LowLevelHash` implementation on Skylake:
* Before: https://godbolt.org/z/Tcj9vsGax, llvm-mca (https://godbolt.org/z/3o78Msr63) shows 9 cycles / iteration.
* After: https://godbolt.org/z/q4GM4EjPr, llvm-mca (https://godbolt.org/z/W5d1KEMzq) shows 8 cycles / iteration.
* This CL removes 1 xor (1 cycle) per iteration from the critical path.
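
The difference between the two loop shapes can be sketched as follows. This is a simplified illustration, not the Abseil source: the names `ChunkLoopTwoStates`, `ChunkLoopFourStates`, `Load64`, and `kSalt`, the salt values, and the final combination of the states are all assumptions made for the example. `Mix` is written here as a 64x64-to-128-bit multiply folded with xor, and it relies on the `unsigned __int128` extension available in GCC and Clang.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Arbitrary constants for illustration only; NOT Abseil's salt values.
constexpr uint64_t kSalt[5] = {
    0x9E3779B97F4A7C15ULL, 0xC2B2AE3D27D4EB4FULL, 0x165667B19E3779F9ULL,
    0x85EBCA77C2B2AE63ULL, 0x27D4EB2F165667C5ULL};

// Unaligned 64-bit load via memcpy (hypothetical helper).
inline uint64_t Load64(const unsigned char* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Multiply to 128 bits and fold the halves together with xor.
// Assumes the GCC/Clang `unsigned __int128` extension.
inline uint64_t Mix(uint64_t a, uint64_t b) {
  unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
  return static_cast<uint64_t>(p) ^ static_cast<uint64_t>(p >> 64);
}

// Two states: each state feeds two Mix calls whose results are xor-ed
// together, so the loop-carried chain per iteration is Mix plus one xor.
uint64_t ChunkLoopTwoStates(const unsigned char* ptr, size_t len,
                            uint64_t seed) {
  uint64_t s0 = seed ^ kSalt[0];
  uint64_t s1 = s0;
  while (len >= 64) {
    uint64_t w[8];
    for (int i = 0; i < 8; ++i) w[i] = Load64(ptr + 8 * i);
    uint64_t m0 = Mix(w[0] ^ kSalt[1], w[1] ^ s0);
    uint64_t m1 = Mix(w[2] ^ kSalt[2], w[3] ^ s0);
    s0 = m0 ^ m1;  // extra xor on the critical path
    uint64_t m2 = Mix(w[4] ^ kSalt[3], w[5] ^ s1);
    uint64_t m3 = Mix(w[6] ^ kSalt[4], w[7] ^ s1);
    s1 = m2 ^ m3;  // extra xor on the critical path
    ptr += 64;
    len -= 64;
  }
  return s0 ^ s1;  // trailing bytes (< 64) are ignored in this sketch
}

// Four states: each state feeds exactly one Mix call, so the xor that
// merged two Mix results no longer sits on the loop-carried chain.
uint64_t ChunkLoopFourStates(const unsigned char* ptr, size_t len,
                             uint64_t seed) {
  uint64_t s0 = seed ^ kSalt[0];
  uint64_t s1 = s0, s2 = s0, s3 = s0;
  while (len >= 64) {
    uint64_t w[8];
    for (int i = 0; i < 8; ++i) w[i] = Load64(ptr + 8 * i);
    s0 = Mix(w[0] ^ kSalt[1], w[1] ^ s0);
    s1 = Mix(w[2] ^ kSalt[2], w[3] ^ s1);
    s2 = Mix(w[4] ^ kSalt[3], w[5] ^ s2);
    s3 = Mix(w[6] ^ kSalt[4], w[7] ^ s3);
    ptr += 64;
    len -= 64;
  }
  // The states are merged once, after the loop (combination is an assumption).
  return (s0 ^ s1) ^ (s2 ^ s3);
}
```

Both functions perform the same number of `Mix` calls per 64 bytes. The four-state variant only moves the merging xor out of the loop, which is why the gain appears in the per-iteration latency reported by llvm-mca rather than in the instruction count.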

A block that handles a 32-byte chunk is also added.

Finally, just before returning, `Mix` is called once instead of twice.

PiperOrigin-RevId: 605090653
Change-Id: Ib7517ebb8bef7484066cd14cf41a943953e93377

committed Feb 07, 2024


Name
.github
CMake
absl
ci
.clang-format
.gitignore
ABSEIL_ISSUE_TEMPLATE.md
AUTHORS
BUILD.bazel
CMakeLists.txt
CONTRIBUTING.md
FAQ.md
LICENSE
MODULE.bazel
PrivacyInfo.xcprivacy
README.md
UPGRADES.md
WORKSPACE
WORKSPACE.bzlmod
conanfile.py
create_lts.py

README.md