1. 07 Oct, 2023 1 commit
  2. 06 Oct, 2023 5 commits
  3. 05 Oct, 2023 1 commit
  4. 04 Oct, 2023 1 commit
  5. 03 Oct, 2023 3 commits
    • Use ABSL_RAW_LOG and ABSL_PREDICT_* for all debug checks in swisstable, including sanitizer mode checks. · d26b6250
      
      Sanitizer mode can be used for canaries, so performance is still relevant. This change also makes the code more uniform.
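
      A standalone sketch of the check pattern the commit describes, for illustration only: `__builtin_expect` and `fprintf` stand in for ABSL_PREDICT_FALSE and ABSL_RAW_LOG (which logs without allocating), and the capacity check itself is a hypothetical example, not swisstable's code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Stand-in for ABSL_PREDICT_FALSE: a branch hint telling the compiler the
// condition is expected to be false, keeping the failure path cold.
#define PREDICT_FALSE(x) __builtin_expect(!!(x), 0)

// Hypothetical debug check in the style the commit describes: a hinted
// branch plus raw, allocation-free logging, then abort (as a FATAL raw log
// would). The check itself is illustrative.
inline void AssertIsValidCapacity(std::size_t capacity) {
  if (PREDICT_FALSE(capacity == 0)) {
    std::fprintf(stderr, "Invalid capacity: %zu\n", capacity);
    std::abort();
  }
}
```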
      
      PiperOrigin-RevId: 570438923
      Change-Id: I62859160eb9323e6420680a43fd23e97e8a62389
      Evan Brown committed
    • Refactor swisstable copy/move assignment to fix issues with allocator propagation and improve performance. · 22dc7911
      
      Correctness:
      - We use swap to implement copy assignment and move assignment, which means that allocator propagation in copy/move assignment depends on `propagate_on_container_swap` in addition to `propagate_on_container_copy_assignment`/`propagate_on_container_move_assignment`.
      - In swap, if `propagate_on_container_swap` is `false` and `get_allocator() != other.get_allocator()`, the behavior is undefined (https://en.cppreference.com/w/cpp/container/unordered_set/swap) - we should assert that this UB case isn't happening. For this reason, we also delete the NoPropagateOn_Swap test case in raw_hash_set_allocator_test.
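
      A minimal sketch of the precondition described above, assuming standard allocator-traits semantics (the helper name is illustrative, not Abseil's code):

```cpp
#include <cassert>
#include <memory>

// Hedged sketch: before swapping two containers, assert the condition under
// which swap is defined. If the allocator does not propagate on swap,
// swapping containers whose allocators compare unequal is undefined
// behavior per the standard. The helper name is illustrative.
template <class Alloc>
void AssertSwapIsDefined(const Alloc& a, const Alloc& b) {
  using Traits = std::allocator_traits<Alloc>;
  if (!Traits::propagate_on_container_swap::value) {
    assert(a == b && "swap with unequal, non-propagating allocators is UB");
  }
}
```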
      
      Performance:
      - Don't rely on swap so we don't have to do unnecessary copying into the moved-from sets.
      - Don't use temp sets in move assignment.
      - Default the move constructor of CommonFields.
      - Avoid using exchange in raw_hash_set move constructor.
      - In `raw_hash_set(raw_hash_set&& that, const allocator_type& a)` with unequal allocators and in move assignment with non-propagating unequal allocators, move set keys instead of copying them.
      PiperOrigin-RevId: 570419290
      Change-Id: I499e54f17d9cb0b0836601f5c06187d1f269a5b8
      Evan Brown committed
    • Update a dead link. · 74a8f6fa
      This CL updates the link provided in the comment to point to a valid website. Currently the link points to https://screenshot.googleplex.com/BZhRp6mNJAtjMmz, which is now a software company landing page.
      
      PiperOrigin-RevId: 570384723
      Change-Id: Ib6d17851046125957e092b59d845ddb7ecb1f7b7
      Abseil Team committed
  6. 02 Oct, 2023 2 commits
  7. 27 Sep, 2023 3 commits
  8. 26 Sep, 2023 6 commits
  9. 23 Sep, 2023 1 commit
  10. 22 Sep, 2023 1 commit
  11. 21 Sep, 2023 4 commits
    • Mutex: Rollback requeueing waiters as LIFO · 90e8f6f7
      PiperOrigin-RevId: 567415671
      Change-Id: I59bfcb5ac9fbde227a4cdb3b497b0bd5969b0770
      Abseil Team committed
    • Optimize CRC32 Extend for large inputs on Arm · aa3c949a
      This is a temporary workaround for an apparent compiler bug with pmull(2) instructions. The current hot loop looks like this:
      
      mov	w14, #0xef02
      lsl	x15, x15, #6
      mov	x13, xzr
      movk	w14, #0x740e, lsl #16
      sub	x15, x15, #0x40
      ldr	q4, [x16, #0x4e0]
      
      _LOOP_START:
      add	x16, x9, x13
      add	x17, x12, x13
      fmov	d19, x14            <--------- this is loop-invariant and expensive
      add	x13, x13, #0x40
      cmp	x15, x13
      prfm	pldl1keep, [x16, #0x140]
      prfm	pldl1keep, [x17, #0x140]
      ldp	x18, x0, [x16, #0x40]
      crc32cx	w10, w10, x18
      ldp	x2, x18, [x16, #0x50]
      crc32cx	w10, w10, x0
      crc32cx	w10, w10, x2
      ldp	x0, x2, [x16, #0x60]
      crc32cx	w10, w10, x18
      ldp	x18, x16, [x16, #0x70]
      pmull2	v5.1q, v1.2d, v4.2d
      pmull2	v6.1q, v0.2d, v4.2d
      pmull2	v7.1q, v2.2d, v4.2d
      pmull2	v16.1q, v3.2d, v4.2d
      ldp	q17, q18, [x17, #0x40]
      crc32cx	w10, w10, x0
      pmull	v1.1q, v1.1d, v19.1d
      crc32cx	w10, w10, x2
      pmull	v0.1q, v0.1d, v19.1d
      crc32cx	w10, w10, x18
      pmull	v2.1q, v2.1d, v19.1d
      crc32cx	w10, w10, x16
      pmull	v3.1q, v3.1d, v19.1d
      ldp	q20, q21, [x17, #0x60]
      eor	v1.16b, v17.16b, v1.16b
      eor	v0.16b, v18.16b, v0.16b
      eor	v1.16b, v1.16b, v5.16b
      eor	v2.16b, v20.16b, v2.16b
      eor	v0.16b, v0.16b, v6.16b
      eor	v3.16b, v21.16b, v3.16b
      eor	v2.16b, v2.16b, v7.16b
      eor	v3.16b, v3.16b, v16.16b
      b.ne	_LOOP_START
      
      There is a redundant fmov that moves the same constant into a Neon register every loop iteration to be used in the PMULL instructions. The PMULL2 instructions already have this constant loaded into Neon registers. After this change, both the PMULL and PMULL2 instructions use the values in q4, and they are not reloaded every iteration. This fmov was expensive because it contends for execution units with crc32cx instructions. This is up to 20% faster for large inputs.
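
      An illustrative-only scalar model of the fix: the constant `k` plays the role of the value the fmov re-materialized each iteration, and initializing it once outside the loop mirrors keeping it resident in a register. `FoldDemo` and its ordinary multiply are stand-ins for the real carry-less pmull fold, not the actual CRC code:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model of loop-invariant hoisting: `k` (the constant built by
// mov/movk in the asm above) is computed once before the loop instead of
// being re-materialized on every iteration. The multiply stands in for the
// carry-less pmull folding step.
inline std::uint64_t FoldDemo(const std::uint64_t* data, int n) {
  const std::uint64_t k = 0x740eef02u;  // loop-invariant constant
  std::uint64_t acc = 0;
  for (int i = 0; i < n; ++i) {
    acc ^= data[i] * k;  // fold each block against the constant
  }
  return acc;
}
```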
      
      PiperOrigin-RevId: 567391972
      Change-Id: I4c8e49750cfa5cc5730c3bb713bd9fd67657804a
      Connal de Souza committed
    • Replace BtreeAllocatorTest with individual test cases for copy/move/swap propagation (defined in test_allocator.h) and minimal alignment. · 821756c3
      
      Also remove some extraneous value_types from typed tests. The motivation is to reduce btree_test compile time.
      
      PiperOrigin-RevId: 567376572
      Change-Id: I6ac6130b99faeadaedab8c2c7b05d5e23e77cc1e
      Evan Brown committed
    • Rollback "absl: speed up Mutex::Lock" · e313f0ed
      There are some regressions reported.
      
      PiperOrigin-RevId: 567181925
      Change-Id: I4ee8a61afd336de7ecb22ec307adb2068932bc8b
      Dmitry Vyukov committed
  12. 20 Sep, 2023 6 commits
  13. 19 Sep, 2023 4 commits
    • Refactor for preliminary API update. · d91f39ab
      PiperOrigin-RevId: 566675048
      Change-Id: Ie598c21474858974e4b4adbad401c61a38924c98
      Abseil Team committed
    • Additional StrCat microbenchmarks. · bd467aad
      PiperOrigin-RevId: 566650311
      Change-Id: Ibfabee88ea9999d08ade05ece362f5a075d19695
      Abseil Team committed
    • absl: speed up Mutex::Lock · cffc9ef2
      Currently Mutex::Lock contains a non-inlined, non-tail call chain:
      TryAcquireWithSpinning -> GetMutexGlobals -> LowLevelCallOnce -> init closure.
      This makes the function non-leaf, requiring stack frame allocation
      and additional register use. Remove this non-tail call to make the function a leaf,
      and move spin-iteration initialization to LockSlow.
      
      Current Lock happy path:
      
      00000000001edc20 <absl::Mutex::Lock()>:
        1edc20:	55                   	push   %rbp
        1edc21:	48 89 e5             	mov    %rsp,%rbp
        1edc24:	53                   	push   %rbx
        1edc25:	50                   	push   %rax
        1edc26:	48 89 fb             	mov    %rdi,%rbx
        1edc29:	48 8b 07             	mov    (%rdi),%rax
        1edc2c:	a8 19                	test   $0x19,%al
        1edc2e:	75 0e                	jne    1edc3e <absl::Mutex::Lock()+0x1e>
        1edc30:	48 89 c1             	mov    %rax,%rcx
        1edc33:	48 83 c9 08          	or     $0x8,%rcx
        1edc37:	f0 48 0f b1 0b       	lock cmpxchg %rcx,(%rbx)
        1edc3c:	74 42                	je     1edc80 <absl::Mutex::Lock()+0x60>
        ... unhappy path ...
        1edc80:	48 83 c4 08          	add    $0x8,%rsp
        1edc84:	5b                   	pop    %rbx
        1edc85:	5d                   	pop    %rbp
        1edc86:	c3                   	ret
      
      New Lock happy path:
      
      00000000001eea80 <absl::Mutex::Lock()>:
        1eea80:	48 8b 07             	mov    (%rdi),%rax
        1eea83:	a8 19                	test   $0x19,%al
        1eea85:	75 0f                	jne    1eea96 <absl::Mutex::Lock()+0x16>
        1eea87:	48 89 c1             	mov    %rax,%rcx
        1eea8a:	48 83 c9 08          	or     $0x8,%rcx
        1eea8e:	f0 48 0f b1 0f       	lock cmpxchg %rcx,(%rdi)
        1eea93:	75 01                	jne    1eea96 <absl::Mutex::Lock()+0x16>
        1eea95:	c3                   	ret
        ... unhappy path ...
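
      The fast path in both listings amounts to one test of the lock word followed by a single compare-exchange. A hedged sketch of that shape (the function name and semantics are illustrative, not Abseil's actual implementation; the masks 0x19 and 0x8 are taken from the listings above):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Illustrative fast path: if none of the "held/contended" bits (0x19 in the
// listing) are set, try to claim the lock bit (0x8) with one CAS — the
// `lock cmpxchg` in the disassembly. On failure the real code falls through
// to a slow path. Names and bit meanings are assumptions for illustration.
bool TryLockFastPath(std::atomic<std::uint64_t>& word) {
  std::uint64_t v = word.load(std::memory_order_relaxed);
  if (v & 0x19) return false;  // held or contended: take the slow path
  return word.compare_exchange_strong(v, v | 0x8,
                                      std::memory_order_acquire,
                                      std::memory_order_relaxed);
}
```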
      
      PiperOrigin-RevId: 566488042
      Change-Id: I62f854b82a322cfb1d42c34f8ed01b4677693fca
      Dmitry Vyukov committed
    • absl: requeue waiters as LIFO · a5dc018f
      Currently, if a thread has already blocked on a Mutex
      but then fails to acquire it, we queue the thread in FIFO order again.
      As a result, unlucky threads can suffer bad latency
      if they are requeued several times.
      The least we can do for them is to requeue in LIFO order after blocking.
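
      A minimal sketch of the two queueing policies (illustrative names and a plain deque, not the actual Mutex waiter list):

```cpp
#include <cassert>
#include <deque>

// Illustrative waiter queue: a thread blocking for the first time goes to
// the back (FIFO), but a waiter that woke and failed to acquire the lock is
// requeued at the front (LIFO) so it is not penalized repeatedly.
struct WaiterQueue {
  std::deque<int> q;  // thread ids; front = next to wake

  void EnqueueNew(int tid) { q.push_back(tid); }         // first block: FIFO
  void RequeueAfterFail(int tid) { q.push_front(tid); }  // requeue: LIFO

  int Next() {
    int t = q.front();
    q.pop_front();
    return t;
  }
};
```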
      
      PiperOrigin-RevId: 566478783
      Change-Id: I8bac08325f20ff6ccc2658e04e1847fd4614c653
      Dmitry Vyukov committed
  14. 18 Sep, 2023 1 commit
  15. 15 Sep, 2023 1 commit
    • absl: remove special case for timed CondVar waits · 2c1e7e3c
      CondVar wait morphing has a special case for timed waits.
      The code dates back to 2006; there may have been reasons for it then,
      but it no longer appears necessary.
      Wait morphing should work just fine after timed CondVar waits.
      Remove the special case and simplify the code.
      
      PiperOrigin-RevId: 565798838
      Change-Id: I4e4d61ae7ebd521f5c32dfc673e57a0c245e7cfb
      Dmitry Vyukov committed