Optimize SwissMap iteration by another 5-10% for ARM
https://pastebin.com/fDvgWgHe After having a chat with Dougall Johnson (https://twitter.com/dougallj/status/1534213050944802816), we realized that __clzll works with zero arguments per documentation: https://developer.arm.com/documentation/101028/0009/Data-processing-intrinsics ``` Returns the number of leading zero bits in x. When x is zero it returns the argument width, i.e. 32 or 64. ``` Codegen improves https://godbolt.org/z/ebadf717Y Thus we can use a little bit different construction not involving CLS but using more understandable CLZ and removing some operations. PiperOrigin-RevId: 453879080 Change-Id: Ie2d7f834f63364d7bd50dd6a682c107985f21942
Showing
Please
register
or
sign in
to comment