1. 26 Sep, 2023 5 commits
  2. 23 Sep, 2023 1 commit
  3. 22 Sep, 2023 1 commit
  4. 21 Sep, 2023 4 commits
    • Mutex: Rollback requeuing waiters as LIFO · 90e8f6f7
      PiperOrigin-RevId: 567415671
      Change-Id: I59bfcb5ac9fbde227a4cdb3b497b0bd5969b0770
      Abseil Team committed
    • Optimize CRC32 Extend for large inputs on Arm · aa3c949a
      This is a temporary workaround for an apparent compiler bug with pmull(2) instructions. The current hot loop looks like this:
      
      mov	w14, #0xef02
      lsl	x15, x15, #6
      mov	x13, xzr
      movk	w14, #0x740e, lsl #16
      sub	x15, x15, #0x40
      ldr	q4, [x16, #0x4e0]
      
      _LOOP_START:
      add	x16, x9, x13
      add	x17, x12, x13
      fmov	d19, x14            <--------- This is loop invariant and expensive
      add	x13, x13, #0x40
      cmp	x15, x13
      prfm	pldl1keep, [x16, #0x140]
      prfm	pldl1keep, [x17, #0x140]
      ldp	x18, x0, [x16, #0x40]
      crc32cx	w10, w10, x18
      ldp	x2, x18, [x16, #0x50]
      crc32cx	w10, w10, x0
      crc32cx	w10, w10, x2
      ldp	x0, x2, [x16, #0x60]
      crc32cx	w10, w10, x18
      ldp	x18, x16, [x16, #0x70]
      pmull2	v5.1q, v1.2d, v4.2d
      pmull2	v6.1q, v0.2d, v4.2d
      pmull2	v7.1q, v2.2d, v4.2d
      pmull2	v16.1q, v3.2d, v4.2d
      ldp	q17, q18, [x17, #0x40]
      crc32cx	w10, w10, x0
      pmull	v1.1q, v1.1d, v19.1d
      crc32cx	w10, w10, x2
      pmull	v0.1q, v0.1d, v19.1d
      crc32cx	w10, w10, x18
      pmull	v2.1q, v2.1d, v19.1d
      crc32cx	w10, w10, x16
      pmull	v3.1q, v3.1d, v19.1d
      ldp	q20, q21, [x17, #0x60]
      eor	v1.16b, v17.16b, v1.16b
      eor	v0.16b, v18.16b, v0.16b
      eor	v1.16b, v1.16b, v5.16b
      eor	v2.16b, v20.16b, v2.16b
      eor	v0.16b, v0.16b, v6.16b
      eor	v3.16b, v21.16b, v3.16b
      eor	v2.16b, v2.16b, v7.16b
      eor	v3.16b, v3.16b, v16.16b
      b.ne	_LOOP_START
      
      There is a redundant fmov that moves the same constant into a Neon register on every loop iteration for use by the PMULL instructions, even though the PMULL2 instructions already have this constant loaded into a Neon register. After this change, both the PMULL and PMULL2 instructions use the values in q4, which are not reloaded every iteration. The fmov was expensive because it contends for execution units with the crc32cx instructions. This change is up to 20% faster for large inputs.
      
      PiperOrigin-RevId: 567391972
      Change-Id: I4c8e49750cfa5cc5730c3bb713bd9fd67657804a
      Connal de Souza committed
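The fix is an instance of loop-invariant code motion at the register level. As a rough scalar analogy (hypothetical function name, using the constant 0x740eef02 assembled by the mov/movk pair above), the folding constant should be materialized once before the loop rather than rebuilt each iteration:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the pattern behind the fix: the folding constant is
// materialized once before the loop (like keeping the value live in q4)
// instead of being rebuilt every iteration (the redundant per-iteration fmov).
uint64_t FoldHoisted(const std::vector<uint64_t>& data) {
  const uint64_t kFoldConst = 0x740eef02;  // loaded once, stays in a register
  uint64_t acc = 0;
  for (uint64_t v : data) {
    acc = (acc >> 1) ^ (v * kFoldConst);  // constant reused, never reloaded
  }
  return acc;
}
```

On the real hot loop the equivalent hoisting matters because rebuilding the constant competes with the crc32cx instructions for execution units.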
    • Replace BtreeAllocatorTest with individual test cases for copy/move/swap… · 821756c3
      Replace BtreeAllocatorTest with individual test cases for copy/move/swap propagation (defined in test_allocator.h) and minimal alignment.
      
      Also remove some extraneous value_types from typed tests. The motivation is to reduce btree_test compile time.
      
      PiperOrigin-RevId: 567376572
      Change-Id: I6ac6130b99faeadaedab8c2c7b05d5e23e77cc1e
      Evan Brown committed
    • Rollback "absl: speed up Mutex::Lock" · e313f0ed
      There are some regressions reported.
      
      PiperOrigin-RevId: 567181925
      Change-Id: I4ee8a61afd336de7ecb22ec307adb2068932bc8b
      Dmitry Vyukov committed
  5. 20 Sep, 2023 6 commits
  6. 19 Sep, 2023 4 commits
    • Refactor for preliminary API update. · d91f39ab
      PiperOrigin-RevId: 566675048
      Change-Id: Ie598c21474858974e4b4adbad401c61a38924c98
      Abseil Team committed
    • Additional StrCat microbenchmarks. · bd467aad
      PiperOrigin-RevId: 566650311
      Change-Id: Ibfabee88ea9999d08ade05ece362f5a075d19695
      Abseil Team committed
    • absl: speed up Mutex::Lock · cffc9ef2
      Currently Mutex::Lock contains a non-inlined, non-tail call chain:
      TryAcquireWithSpinning -> GetMutexGlobals -> LowLevelCallOnce -> init closure
      This turns the function into a non-leaf function with stack frame allocation
      and additional register use. Remove this non-tail call to make the function a leaf,
      and move spin-iteration initialization to LockSlow.
      
      Current Lock happy path:
      
      00000000001edc20 <absl::Mutex::Lock()>:
        1edc20:	55                   	push   %rbp
        1edc21:	48 89 e5             	mov    %rsp,%rbp
        1edc24:	53                   	push   %rbx
        1edc25:	50                   	push   %rax
        1edc26:	48 89 fb             	mov    %rdi,%rbx
        1edc29:	48 8b 07             	mov    (%rdi),%rax
        1edc2c:	a8 19                	test   $0x19,%al
        1edc2e:	75 0e                	jne    1edc3e <absl::Mutex::Lock()+0x1e>
        1edc30:	48 89 c1             	mov    %rax,%rcx
        1edc33:	48 83 c9 08          	or     $0x8,%rcx
        1edc37:	f0 48 0f b1 0b       	lock cmpxchg %rcx,(%rbx)
        1edc3c:	74 42                	je     1edc80 <absl::Mutex::Lock()+0x60>
        ... unhappy path ...
        1edc80:	48 83 c4 08          	add    $0x8,%rsp
        1edc84:	5b                   	pop    %rbx
        1edc85:	5d                   	pop    %rbp
        1edc86:	c3                   	ret
      
      New Lock happy path:
      
      00000000001eea80 <absl::Mutex::Lock()>:
        1eea80:	48 8b 07             	mov    (%rdi),%rax
        1eea83:	a8 19                	test   $0x19,%al
        1eea85:	75 0f                	jne    1eea96 <absl::Mutex::Lock()+0x16>
        1eea87:	48 89 c1             	mov    %rax,%rcx
        1eea8a:	48 83 c9 08          	or     $0x8,%rcx
        1eea8e:	f0 48 0f b1 0f       	lock cmpxchg %rcx,(%rdi)
        1eea93:	75 01                	jne    1eea96 <absl::Mutex::Lock()+0x16>
        1eea95:	c3                   	ret
        ... unhappy path ...
      
      PiperOrigin-RevId: 566488042
      Change-Id: I62f854b82a322cfb1d42c34f8ed01b4677693fca
      Dmitry Vyukov committed
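The shape of this optimization can be sketched with a toy lock (illustrative names, not Abseil's actual internals): the uncontended path is a single CAS with no callees, so the compiler can emit it as a leaf function without a stack frame, and everything else lives in an out-of-line slow path:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative sketch of the fast-path/slow-path split described above.
class TinyLock {
 public:
  void Lock() {
    uintptr_t expected = 0;
    // Fast path: a single lock cmpxchg, no callees, no stack frame needed.
    if (state_.compare_exchange_strong(expected, 1,
                                       std::memory_order_acquire)) {
      return;
    }
    LockSlow();  // all setup (e.g. spin-iteration init) lives off the hot path
  }
  bool IsLocked() const { return state_.load(std::memory_order_relaxed) != 0; }
  void Unlock() { state_.store(0, std::memory_order_release); }

 private:
  // GCC/Clang attribute keeps the slow path out of line so it cannot
  // pessimize the fast path's register allocation.
  __attribute__((noinline)) void LockSlow() {
    uintptr_t expected;
    do {  // spin until the CAS succeeds; real code would bound the spin
      expected = 0;
    } while (!state_.compare_exchange_weak(expected, 1,
                                           std::memory_order_acquire));
  }
  std::atomic<uintptr_t> state_{0};
};
```

This mirrors the before/after disassembly above: the old Lock pushed %rbp/%rbx and adjusted %rsp, while the new leaf version is just test, cmpxchg, ret.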
    • absl: requeue waiters as LIFO · a5dc018f
      Currently, if a thread has already blocked on a Mutex but then fails
      to acquire it, we queue it in FIFO order again.
      As a result, unlucky threads can suffer bad latency
      if they are requeued several times.
      The least we can do for them is to queue them in LIFO order after blocking.
      
      PiperOrigin-RevId: 566478783
      Change-Id: I8bac08325f20ff6ccc2658e04e1847fd4614c653
      Dmitry Vyukov committed
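A toy model of this queueing policy (illustrative only, not the real waiter list): fresh waiters join at the back in FIFO order, but a waiter that already blocked and then lost the race goes to the front, so it is served next:

```cpp
#include <cassert>
#include <deque>
#include <string>

// Toy model of the requeue policy described above.
struct WaiterQueue {
  std::deque<std::string> q;
  // A thread waiting for the first time joins in FIFO order.
  void EnqueueNew(const std::string& w) { q.push_back(w); }
  // A thread that already blocked but failed to acquire is requeued LIFO,
  // bounding how many times it can be passed over.
  void RequeueAfterBlock(const std::string& w) { q.push_front(w); }
  std::string PopNext() {
    std::string w = q.front();
    q.pop_front();
    return w;
  }
};
```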
  7. 18 Sep, 2023 1 commit
  8. 15 Sep, 2023 7 commits
    • absl: remove special case for timed CondVar waits · 2c1e7e3c
      CondVar wait morphing has a special case for timed waits.
      The code goes back to 2006; there may have been reasons
      to do this back then, but it no longer appears necessary.
      Wait morphing should work just fine after timed CondVar waits.
      Remove the special case and simplify the code.
      
      PiperOrigin-RevId: 565798838
      Change-Id: I4e4d61ae7ebd521f5c32dfc673e57a0c245e7cfb
      Dmitry Vyukov committed
    • Honor ABSL_MIN_LOG_LEVEL in CHECK_XX, CHECK_STRXX, CHECK_OK, and the QCHECK flavors of these. · 9356553a
      In particular, if ABSL_MIN_LOG_LEVEL exceeds kFatal, these should, upon failure, terminate the program without logging anything.  The lack of logging should be visible to the optimizer so that it can strip string literals and stringified variable names from the object file.
      
      Making some edge cases work under Clang required rewriting NormalizeLogSeverity to help make constraints on its return value more obvious to the optimizer.
      
      PiperOrigin-RevId: 565792699
      Change-Id: Ibb6a47d4956191bbbd0297e04492cddc354578e2
      Andy Getzendanner committed
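The intended behavior can be sketched with a simplified macro (hypothetical names, not the real Abseil macros): when the compiled-in minimum log level is above fatal, a failed check terminates without constructing any log message, so the stringified condition never reaches the object file:

```cpp
#include <cassert>
#include <cstdlib>

// Illustrative sketch only. With the minimum log level above fatal, the
// failure branch terminates without referencing any message text, so the
// optimizer can strip string literals and stringified variable names.
#define SKETCH_MIN_LOG_LEVEL_ABOVE_FATAL 1

#if SKETCH_MIN_LOG_LEVEL_ABOVE_FATAL
#define SKETCH_CHECK(cond) ((cond) ? (void)0 : std::abort())
#else
// The logging flavor would stringify the condition, keeping #cond alive:
// #define SKETCH_CHECK(cond) ((cond) ? (void)0 : Die(__FILE__, __LINE__, #cond))
#endif
```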
    • Fix a bug in which we used propagate_on_container_copy_assignment in btree move assignment. · f44e2cac
      PiperOrigin-RevId: 565730754
      Change-Id: Id828847d32c812736669803c179351433dda4aa6
      Evan Brown committed
    • Move CountingAllocator into test_allocator.h and add some other allocators that… · 49be2e68
      Move CountingAllocator into test_allocator.h and add some other allocators that can be shared between different container tests.
      
      PiperOrigin-RevId: 565693736
      Change-Id: I59af987e30da03a805ce59ff0fb7eeae3fc08293
      Evan Brown committed
    • Allow const qualified FunctionRef instances. This allows the signature to be… · e68f1412
      Allow const qualified FunctionRef instances. This allows the signature to be compatible with AnyInvokable for const uses.
      
      PiperOrigin-RevId: 565682320
      Change-Id: I924dadf110481e572bdb8af0111fa62d6f553d90
      Abseil Team committed
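Why the const qualifier matters can be sketched with a stripped-down FunctionRef-like type (illustrative, not Abseil's implementation): a const-qualified operator() lets the wrapper be invoked through a const reference, matching how a const AnyInvocable would be used:

```cpp
#include <cassert>

// Minimal FunctionRef-like wrapper for int(int), with a const call operator.
class IntFnRef {
 public:
  template <typename F>
  IntFnRef(const F& f)
      : obj_(&f), invoke_([](const void* o, int x) {
          return (*static_cast<const F*>(o))(x);
        }) {}
  // const-qualified: callable even when the IntFnRef itself is const.
  int operator()(int x) const { return invoke_(obj_, x); }

 private:
  const void* obj_;               // non-owning reference to the callable
  int (*invoke_)(const void*, int);
};

// Without the const qualifier on operator(), this would not compile.
int CallThroughConstRef(const IntFnRef& f) { return f(21); }
```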
    • absl: optimize Condition checks in Mutex code · 9a592abd
      1. Remove special handling of Condition::kTrue.
      
      Condition::kTrue is used very rarely (frequently its uses even indicate
      confusion and bugs), yet we pay a few additional branches for kTrue
      on all Condition operations.
      Remove that special handling and simplify the logic.
      
      2. Remove the known_false condition in Mutex code.
      
      Checking the known_false condition only causes slowdown because:
      1. We already built a skip list of equivalent conditions
      (and keep improving it on every Skip call), and when we built
      the skip list we used the more capable GuaranteedEqual function
      (it does not just check pointer equality,
      but also equality of function/arg).
      
      2. Condition pointers are rarely equal even for equivalent conditions
      because temporary Condition objects are usually created on the stack.
      We could call GuaranteedEqual(cond, known_false) instead of cond == known_false,
      but that slows things down even more (see point 1).
      
      So remove the known_false optimization.
      Benchmark results for this and the previous change:
      
      name                        old cpu/op   new cpu/op   delta
      BM_ConditionWaiters/0/1     36.0ns ± 0%  34.9ns ± 0%   -3.02%  (p=0.008 n=5+5)
      BM_ConditionWaiters/1/1     36.0ns ± 0%  34.9ns ± 0%   -2.98%  (p=0.008 n=5+5)
      BM_ConditionWaiters/2/1     35.9ns ± 0%  34.9ns ± 0%   -3.03%  (p=0.016 n=5+4)
      BM_ConditionWaiters/0/8     55.5ns ± 5%  49.8ns ± 3%  -10.33%  (p=0.008 n=5+5)
      BM_ConditionWaiters/1/8     36.2ns ± 0%  35.2ns ± 0%   -2.90%  (p=0.016 n=5+4)
      BM_ConditionWaiters/2/8     53.2ns ± 7%  48.3ns ± 7%     ~     (p=0.056 n=5+5)
      BM_ConditionWaiters/0/64     295ns ± 1%   254ns ± 2%  -13.73%  (p=0.008 n=5+5)
      BM_ConditionWaiters/1/64    36.2ns ± 0%  35.2ns ± 0%   -2.85%  (p=0.008 n=5+5)
      BM_ConditionWaiters/2/64     290ns ± 6%   250ns ± 4%  -13.68%  (p=0.008 n=5+5)
      BM_ConditionWaiters/0/512   5.50µs ±12%  4.99µs ± 8%     ~     (p=0.056 n=5+5)
      BM_ConditionWaiters/1/512   36.7ns ± 3%  35.2ns ± 0%   -4.10%  (p=0.008 n=5+5)
      BM_ConditionWaiters/2/512   4.44µs ±13%  4.01µs ± 3%   -9.74%  (p=0.008 n=5+5)
      BM_ConditionWaiters/0/4096   104µs ± 6%   101µs ± 3%     ~     (p=0.548 n=5+5)
      BM_ConditionWaiters/1/4096  36.2ns ± 0%  35.1ns ± 0%   -3.03%  (p=0.008 n=5+5)
      BM_ConditionWaiters/2/4096  90.4µs ± 5%  85.3µs ± 7%     ~     (p=0.222 n=5+5)
      BM_ConditionWaiters/0/8192   384µs ± 5%   367µs ± 7%     ~     (p=0.222 n=5+5)
      BM_ConditionWaiters/1/8192  36.2ns ± 0%  35.2ns ± 0%   -2.84%  (p=0.008 n=5+5)
      BM_ConditionWaiters/2/8192   363µs ± 3%   316µs ± 7%  -12.84%  (p=0.008 n=5+5)
      
      PiperOrigin-RevId: 565669535
      Change-Id: I5180c4a787933d2ce477b004a111853753304684
      Dmitry Vyukov committed
    • Remove implicit int64_t->uint64_t conversion in ARM version of V128_Extract64 · c78a3f32
      PiperOrigin-RevId: 565662176
      Change-Id: I18d5d9eb444b0090e3f4ab8f66ad214a67344268
      Abseil Team committed
  9. 14 Sep, 2023 1 commit
  10. 13 Sep, 2023 2 commits
  11. 12 Sep, 2023 3 commits
  12. 11 Sep, 2023 2 commits
  13. 08 Sep, 2023 3 commits
    • Remove CordRepRing experiment. · efb035a5
      We have no intention of using it instead of the CordRepBtree implementation, so clean up and remove all code and references.
      
      PiperOrigin-RevId: 563803813
      Change-Id: I95a67318d0f722f3eb7ecdcc7b6c87e28f2e26dd
      Martijn Vels committed
    • Fix strict weak ordering in convert_test.cc · 09d29c58
      It sorts NaNs, which made the test flaky: the sort-order checks randomize and verify only about 100 elements, while we sort around a thousand here.
      
      PiperOrigin-RevId: 563783036
      Change-Id: Id25bcb47483acf9c40be3fd1747c37d046197330
      Abseil Team committed
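The underlying issue can be sketched as follows (illustrative comparator, not the test's exact code): raw operator< on doubles violates strict weak ordering once NaNs are present, which randomized order checks catch nondeterministically, while a comparator that deterministically orders NaNs last restores a strict weak ordering:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// NaN-aware comparator: all NaNs compare equivalent and sort after every
// non-NaN value, so the comparator is a valid strict weak ordering.
bool NanLastLess(double a, double b) {
  if (std::isnan(b)) return !std::isnan(a);  // non-NaN < NaN; NaN !< NaN
  if (std::isnan(a)) return false;           // NaN is never less than non-NaN
  return a < b;
}
```

With plain operator<, `a < b` and `b < a` are both false for a NaN paired with any value, which falsely marks unequal elements as equivalent and breaks transitivity of equivalence.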
    • Rollback: · 792e55fc
      absl: remove special handling of Condition::kTrue
      absl: remove known_false condition in Mutex code
      There are some test breakages.
      
      PiperOrigin-RevId: 563751370
      Change-Id: Ie14dc799e0a0d286a7e1b47f0a9bbe59dfb23f70
      Abseil Team committed