Commit 62ce712e by Abseil Team, committed by Derek Mauro

Export of internal Abseil changes

--
b927776da818c674a674e46a7bbbdd54170a0ad3 by Todd Lipcon <tlipcon@google.com>:

Include priority in the calculation of mutex waiter equivalence

This changes the behavior of the absl::Mutex wait list to take into account
waiter priority when creating "skip chains". A skip chain on the wait list
is a set of adjacent waiters that share some property and enable skipping
during traversal.

Prior to this CL, the skip chains were formed of waiters with the same
wait type (e.g. exclusive vs read) and Condition. With this CL, the priority
is also taken into account.

This avoids O(n) behavior when enqueueing a waiter onto a wait list where
the oldest waiter is at a lower priority than the waiter to be enqueued.
With the prior notion of equivalence class, a skip chain could contain
waiters of different priority, so we had to walk the linked list one-by-one
until finding the appropriate insertion point. With the new equivalence
class computation, we can skip past all of the equivalent waiters to find
the right insertion point.
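The effect of the new equivalence class can be illustrated with a toy model (illustrative types and names only — `Waiter`, `Skip`, and `FindEnqueueAfter` here are simplified sketches, not the real absl::Mutex internals). Because every waiter in a skip chain now shares a priority, an enqueue can hop over whole chains instead of walking node by node:

```cpp
// Toy model of a wait list with "skip chains": each waiter may point past
// a run of equivalent waiters via `skip`. Simplified for illustration.
struct Waiter {
  int priority;
  Waiter* next = nullptr;
  Waiter* skip = nullptr;  // points to the end of this waiter's run
};

// Advance to the end of w's skip chain, or stay put if it has none.
Waiter* Skip(Waiter* w) { return w->skip != nullptr ? w->skip : w; }

// Find the waiter after which a new waiter of priority p belongs, assuming
// the list is already in descending-priority order. Since (after this CL)
// all waiters in one skip chain share a priority, each iteration can jump
// an entire chain rather than visiting every node.
Waiter* FindEnqueueAfter(Waiter* head, int p) {
  Waiter* w = head;
  while (w->next != nullptr && Skip(w->next)->priority >= p) {
    w = Skip(w->next);  // hop over the whole equivalent run
  }
  return w;
}

// Builds priorities [3, 3, 3, 1, 1] with one skip chain per priority run,
// then finds the insertion point for a priority-2 waiter: index 2, the
// tail of the priority-3 run, reached in one hop instead of three.
int DemoInsertAfterIndex() {
  Waiter w[5] = {{3}, {3}, {3}, {1}, {1}};
  for (int i = 0; i < 4; i++) w[i].next = &w[i + 1];
  w[0].skip = w[1].skip = &w[2];  // run of priority-3 waiters
  w[3].skip = &w[4];              // run of priority-1 waiters
  Waiter* after = FindEnqueueAfter(&w[0], 2);
  return static_cast<int>(after - w);
}
```

Under the old equivalence class, a single chain could mix priorities 3 and 1, so `Skip` could overshoot the insertion point and the code had to fall back to advancing one node at a time — the O(n) behavior this CL removes.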

This gives a substantial improvement to the enqueue performance in the
case where there's already a waiter at lower priority.

Note that even though this code path isn't a hot one, it's performed while
holding the Mutex's spinlock, which prevents other threads from unlocking
the Mutex, so minimizing the time under the critical section can have
"knock-on" throughput benefits.

Notable performance differences:

name                                                                    old cpu/op  new cpu/op  delta
BM_MutexEnqueue/multiple_priorities:0/threads:4                         8.60µs ± 7%  8.69µs ± 6%     ~     (p=0.365 n=19+20)
BM_MutexEnqueue/multiple_priorities:0/threads:64                        8.47µs ± 5%  8.64µs ±10%     ~     (p=0.569 n=19+20)
BM_MutexEnqueue/multiple_priorities:0/threads:128                       8.56µs ± 3%  8.55µs ± 6%     ~     (p=0.563 n=17+17)
BM_MutexEnqueue/multiple_priorities:0/threads:512                       8.98µs ± 8%  8.86µs ± 4%     ~     (p=0.232 n=19+17)
BM_MutexEnqueue/multiple_priorities:1/threads:4                         6.64µs ±10%  6.45µs ± 4%     ~     (p=0.097 n=20+17)
BM_MutexEnqueue/multiple_priorities:1/threads:64                        15.2µs ± 8%   9.1µs ± 4%  -39.93%  (p=0.000 n=20+17)
BM_MutexEnqueue/multiple_priorities:1/threads:128                       22.3µs ± 6%   9.4µs ± 4%  -57.82%  (p=0.000 n=20+17)
BM_MutexEnqueue/multiple_priorities:1/threads:512                       61.5µs ± 3%  10.1µs ± 8%  -83.53%  (p=0.000 n=20+20)

name                                                                    old time/op             new time/op             delta
BM_Mutex/real_time/threads:1                                            19.6ns ± 4%             19.8ns ±11%     ~           (p=0.534 n=17+17)
BM_Mutex/real_time/threads:112                                           120ns ±17%              122ns ±14%     ~           (p=0.988 n=20+18)
BM_MutexEnqueue/multiple_priorities:0/threads:4                         5.18µs ± 6%             5.23µs ± 6%     ~           (p=0.428 n=19+20)
BM_MutexEnqueue/multiple_priorities:0/threads:64                        5.06µs ± 5%             5.18µs ±10%     ~           (p=0.235 n=19+20)
BM_MutexEnqueue/multiple_priorities:0/threads:128                       5.16µs ± 3%             5.14µs ± 6%     ~           (p=0.474 n=17+17)
BM_MutexEnqueue/multiple_priorities:0/threads:512                       5.40µs ± 8%             5.32µs ± 5%     ~           (p=0.196 n=20+18)
BM_MutexEnqueue/multiple_priorities:1/threads:4                         3.99µs ±10%             3.88µs ± 3%     ~           (p=0.074 n=20+17)
BM_MutexEnqueue/multiple_priorities:1/threads:64                        8.48µs ± 9%             5.41µs ± 3%  -36.20%        (p=0.000 n=20+16)
BM_MutexEnqueue/multiple_priorities:1/threads:128                       12.2µs ± 6%              5.6µs ± 4%  -54.43%        (p=0.000 n=20+17)
BM_MutexEnqueue/multiple_priorities:1/threads:512                       32.1µs ± 3%              5.9µs ± 8%  -81.45%        (p=0.000 n=20+20)
...
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:32   1.69µs ± 4%             1.66µs ± 2%   -1.91%        (p=0.000 n=20+20)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:48   1.90µs ± 2%             1.82µs ± 2%   -4.09%        (p=0.000 n=20+19)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:64   2.19µs ± 2%             1.80µs ± 1%  -17.89%        (p=0.000 n=20+20)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:96   2.18µs ± 5%             1.81µs ± 1%  -16.94%        (p=0.000 n=17+19)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:128  2.18µs ± 1%             1.91µs ± 2%  -12.33%        (p=0.000 n=19+20)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:192  2.27µs ± 2%             1.89µs ± 1%  -16.79%        (p=0.000 n=20+19)
BM_Contended<absl::Mutex>/cs_ns:2000/num_prios:2/real_time/threads:256  2.36µs ± 2%             1.83µs ± 1%  -22.25%        (p=0.000 n=20+19)

PiperOrigin-RevId: 350775432

--
e7812590e5dbd75d21e2e8762713bd04c0353ef6 by Todd Lipcon <tlipcon@google.com>:

Fix test timeouts for sequence_lock_test on TSAN

PiperOrigin-RevId: 350680903

--
3090d8154d875f3eabce48876321ae8d6a197302 by Todd Lipcon <tlipcon@google.com>:

Add benchmarks for Mutex performance with multiple priorities

This adds a new benchmark to mutex_benchmark which forces threads to go
through the slow "Enqueue" path. The benchmark runs with varying numbers
of threads and with/without the presence of a lower-priority waiter.

PiperOrigin-RevId: 350655403
GitOrigin-RevId: b927776da818c674a674e46a7bbbdd54170a0ad3
Change-Id: If739e5e205f0d3867661a52466b8f64e7e033b22
parent 92ba5359
@@ -13,6 +13,7 @@
 // limitations under the License.
 #include "absl/flags/internal/sequence_lock.h"
+#include <algorithm>
 #include <atomic>
 #include <thread>  // NOLINT(build/c++11)
 #include <tuple>
@@ -112,13 +113,21 @@ std::vector<int> MultiplicativeRange(int low, int high, int scale) {
   return result;
 }
-INSTANTIATE_TEST_SUITE_P(TestManyByteSizes, ConcurrentSequenceLockTest,
-                         testing::Combine(
-                             // Buffer size (bytes).
-                             testing::Range(1, 128),
-                             // Number of reader threads.
-                             testing::ValuesIn(MultiplicativeRange(
-                                 1, absl::base_internal::NumCPUs(), 2))));
+#ifndef ABSL_HAVE_THREAD_SANITIZER
+const int kMaxThreads = absl::base_internal::NumCPUs();
+#else
+// With TSAN, a lot of threads contending for atomic access on the sequence
+// lock make this test run too slowly.
+const int kMaxThreads = std::min(absl::base_internal::NumCPUs(), 4);
+#endif
+
+INSTANTIATE_TEST_SUITE_P(
+    TestManyByteSizes, ConcurrentSequenceLockTest,
+    testing::Combine(
+        // Buffer size (bytes).
+        testing::Range(1, 128),
+        // Number of reader threads.
+        testing::ValuesIn(MultiplicativeRange(1, kMaxThreads, 2))));
 // Simple single-threaded test, parameterized by the size of the buffer to be
 // protected.
...
@@ -761,11 +761,13 @@ void SetMutexDeadlockDetectionMode(OnDeadlockCycle mode) {
   synch_deadlock_detection.store(mode, std::memory_order_release);
 }
-// Return true iff threads x and y are waiting on the same condition for the
-// same type of lock.  Requires that x and y be waiting on the same Mutex
-// queue.
-static bool MuSameCondition(PerThreadSynch *x, PerThreadSynch *y) {
-  return x->waitp->how == y->waitp->how &&
+// Return true iff threads x and y are part of the same equivalence
+// class of waiters. An equivalence class is defined as the set of
+// waiters with the same condition, type of lock, and thread priority.
+//
+// Requires that x and y be waiting on the same Mutex queue.
+static bool MuEquivalentWaiter(PerThreadSynch *x, PerThreadSynch *y) {
+  return x->waitp->how == y->waitp->how && x->priority == y->priority &&
          Condition::GuaranteedEqual(x->waitp->cond, y->waitp->cond);
 }
@@ -784,18 +786,19 @@ static inline PerThreadSynch *GetPerThreadSynch(intptr_t v) {
 //   - invalid (iff x is not in a Mutex wait queue),
 //   - null, or
 //   - a pointer to a distinct thread waiting later in the same Mutex queue
-//     such that all threads in [x, x->skip] have the same condition and
-//     lock type (MuSameCondition() is true for all pairs in [x, x->skip]).
+//     such that all threads in [x, x->skip] have the same condition, priority
+//     and lock type (MuEquivalentWaiter() is true for all pairs in [x,
+//     x->skip]).
 // In addition, if x->skip is valid, (x->may_skip || x->skip == null)
 //
-// By the spec of MuSameCondition(), it is not necessary when removing the
+// By the spec of MuEquivalentWaiter(), it is not necessary when removing the
 // first runnable thread y from the front a Mutex queue to adjust the skip
 // field of another thread x because if x->skip==y, x->skip must (have) become
 // invalid before y is removed.  The function TryRemove can remove a specified
 // thread from an arbitrary position in the queue whether runnable or not, so
 // it fixes up skip fields that would otherwise be left dangling.
 // The statement
-//   if (x->may_skip && MuSameCondition(x, x->next)) { x->skip = x->next; }
+//   if (x->may_skip && MuEquivalentWaiter(x, x->next)) { x->skip = x->next; }
 // maintains the invariant provided x is not the last waiter in a Mutex queue
 // The statement
 //   if (x->skip != null) { x->skip = x->skip->skip; }
@@ -929,24 +932,17 @@ static PerThreadSynch *Enqueue(PerThreadSynch *head,
     if (s->priority > head->priority) {  // s's priority is above head's
       // try to put s in priority-fifo order, or failing that at the front.
       if (!head->maybe_unlocking) {
-        // No unlocker can be scanning the queue, so we can insert between
-        // skip-chains, and within a skip-chain if it has the same condition as
-        // s.  We insert in priority-fifo order, examining the end of every
-        // skip-chain, plus every element with the same condition as s.
+        // No unlocker can be scanning the queue, so we can insert into the
+        // middle of the queue.
+        //
+        // Within a skip chain, all waiters have the same priority, so we can
+        // skip forward through the chains until we find one with a lower
+        // priority than the waiter to be enqueued.
         PerThreadSynch *advance_to = head;    // next value of enqueue_after
-        PerThreadSynch *cur;                  // successor of enqueue_after
         do {
           enqueue_after = advance_to;
-          cur = enqueue_after->next;  // this advance ensures progress
-          advance_to = Skip(cur);   // normally, advance to end of skip chain
-                                    // (side-effect: optimizes skip chain)
-          if (advance_to != cur && s->priority > advance_to->priority &&
-              MuSameCondition(s, cur)) {
-            // but this skip chain is not a singleton, s has higher priority
-            // than its tail and has the same condition as the chain,
-            // so we can insert within the skip-chain
-            advance_to = cur;         // advance by just one
-          }
+          // (side-effect: optimizes skip chain)
+          advance_to = Skip(enqueue_after->next);
         } while (s->priority <= advance_to->priority);
         // termination guaranteed because s->priority > head->priority
         // and head is the end of a skip chain
@@ -965,21 +961,21 @@ static PerThreadSynch *Enqueue(PerThreadSynch *head,
       //     enqueue_after can be: head, Skip(...), or cur.
       //     The first two imply enqueue_after->skip == nullptr, and
-      //     the last is used only if MuSameCondition(s, cur).
+      //     the last is used only if MuEquivalentWaiter(s, cur).
       //     We require this because clearing enqueue_after->skip
       //     is impossible; enqueue_after's predecessors might also
       //     incorrectly skip over s if we were to allow other
       //     insertion points.
-      ABSL_RAW_CHECK(
-          enqueue_after->skip == nullptr || MuSameCondition(enqueue_after, s),
-          "Mutex Enqueue failure");
+      ABSL_RAW_CHECK(enqueue_after->skip == nullptr ||
+                         MuEquivalentWaiter(enqueue_after, s),
+                     "Mutex Enqueue failure");
       if (enqueue_after != head && enqueue_after->may_skip &&
-          MuSameCondition(enqueue_after, enqueue_after->next)) {
+          MuEquivalentWaiter(enqueue_after, enqueue_after->next)) {
         // enqueue_after can skip to its new successor, s
         enqueue_after->skip = enqueue_after->next;
       }
-      if (MuSameCondition(s, s->next)) {  // s->may_skip is known to be true
+      if (MuEquivalentWaiter(s, s->next)) {  // s->may_skip is known to be true
         s->skip = s->next;                // s may skip to its successor
       }
     } else {   // enqueue not done any other way, so
@@ -989,7 +985,7 @@ static PerThreadSynch *Enqueue(PerThreadSynch *head,
     head->next = s;
     s->readers = head->readers;                  // reader count is from previous head
     s->maybe_unlocking = head->maybe_unlocking;  // same for unlock hint
-    if (head->may_skip && MuSameCondition(head, s)) {
+    if (head->may_skip && MuEquivalentWaiter(head, s)) {
       // head now has successor; may skip
       head->skip = s;
     }
@@ -1009,7 +1005,7 @@ static PerThreadSynch *Dequeue(PerThreadSynch *head, PerThreadSynch *pw) {
   pw->next = w->next;         // snip w out of list
   if (head == w) {            // we removed the head
     head = (pw == w) ? nullptr : pw;  // either emptied list, or pw is new head
-  } else if (pw != head && MuSameCondition(pw, pw->next)) {
+  } else if (pw != head && MuEquivalentWaiter(pw, pw->next)) {
     // pw can skip to its new successor
     if (pw->next->skip !=
         nullptr) {  // either skip to its successors skip target
@@ -1079,11 +1075,13 @@ void Mutex::TryRemove(PerThreadSynch *s) {
       PerThreadSynch *w;
       if ((w = pw->next) != s) {  // search for thread,
         do {                      // processing at least one element
-          if (!MuSameCondition(s, w)) {  // seeking different condition
+          // If the current element isn't equivalent to the waiter to be
+          // removed, we can skip the entire chain.
+          if (!MuEquivalentWaiter(s, w)) {
             pw = Skip(w);                // so skip all that won't match
             // we don't have to worry about dangling skip fields
             // in the threads we skipped; none can point to s
-            // because their condition differs from s
+            // because they are in a different equivalence class.
           } else {          // seeking same condition
             FixSkip(w, s);  // fix up any skip pointer from w to s
             pw = w;
@@ -2148,7 +2146,7 @@ ABSL_ATTRIBUTE_NOINLINE void Mutex::UnlockSlow(SynchWaitParams *waitp) {
           !old_h->may_skip) {                  // we used old_h as a terminator
         old_h->may_skip = true;                // allow old_h to skip once more
         ABSL_RAW_CHECK(old_h->skip == nullptr, "illegal skip from head");
-        if (h != old_h && MuSameCondition(old_h, old_h->next)) {
+        if (h != old_h && MuEquivalentWaiter(old_h, old_h->next)) {
           old_h->skip = old_h->next;  // old_h not head & can skip to successor
         }
       }
...
@@ -61,8 +61,124 @@ class RaiiLocker<std::mutex> {
   std::mutex* mu_;
 };
+
+// RAII object to change the Mutex priority of the running thread.
+class ScopedThreadMutexPriority {
+ public:
+  explicit ScopedThreadMutexPriority(int priority) {
+    absl::base_internal::ThreadIdentity* identity =
+        absl::synchronization_internal::GetOrCreateCurrentThreadIdentity();
+    identity->per_thread_synch.priority = priority;
+    // Bump next_priority_read_cycles to the infinite future so that the
+    // implementation doesn't re-read the thread's actual scheduler priority
+    // and replace our temporary scoped priority.
+    identity->per_thread_synch.next_priority_read_cycles =
+        std::numeric_limits<int64_t>::max();
+  }
+  ~ScopedThreadMutexPriority() {
+    // Reset the "next priority read time" back to the infinite past so that
+    // the next time the Mutex implementation wants to know this thread's
+    // priority, it re-reads it from the OS instead of using our overridden
+    // priority.
+    absl::synchronization_internal::GetOrCreateCurrentThreadIdentity()
+        ->per_thread_synch.next_priority_read_cycles =
+        std::numeric_limits<int64_t>::min();
+  }
+};
+
+void BM_MutexEnqueue(benchmark::State& state) {
+  // In the "multiple priorities" variant of the benchmark, one of the
+  // threads runs with Mutex priority 0 while the rest run at elevated
+  // priority. This benchmarks the performance impact of the presence of a
+  // low priority waiter when a higher priority waiter adds itself to the
+  // queue (b/175224064).
+  //
+  // NOTE: The actual scheduler priority is not modified in this benchmark:
+  // all of the threads get CPU slices with the same priority. Only the
+  // Mutex queueing behavior is modified.
+  const bool multiple_priorities = state.range(0);
+  ScopedThreadMutexPriority priority_setter(
+      (multiple_priorities && state.thread_index != 0) ? 1 : 0);
+
+  struct Shared {
+    absl::Mutex mu;
+    std::atomic<int> looping_threads{0};
+    std::atomic<int> blocked_threads{0};
+    std::atomic<bool> thread_has_mutex{false};
+  };
+  static Shared* shared = new Shared;
+
+  // Set up 'blocked_threads' to count how many threads are currently blocked
+  // in Abseil synchronization code.
+  //
+  // NOTE: Blocking done within the Google Benchmark library itself (e.g.
+  // the barrier which synchronizes threads entering and exiting the benchmark
+  // loop) does _not_ get registered in this counter. This is because Google
+  // Benchmark uses its own synchronization primitives based on std::mutex,
+  // not Abseil synchronization primitives. If at some point the benchmark
+  // library merges into Abseil, this code may break.
+  absl::synchronization_internal::PerThreadSem::SetThreadBlockedCounter(
+      &shared->blocked_threads);
+
+  // The benchmark framework may run several iterations in the same process,
+  // reusing the same static-initialized 'shared' object. Given the semantics
+  // of the members here, we expect everything to be reset to zero by the
+  // end of any iteration. Assert that's the case, just to be sure.
+  ABSL_RAW_CHECK(
+      shared->looping_threads.load(std::memory_order_relaxed) == 0 &&
+          shared->blocked_threads.load(std::memory_order_relaxed) == 0 &&
+          !shared->thread_has_mutex.load(std::memory_order_relaxed),
+      "Shared state isn't zeroed at start of benchmark iteration");
+
+  static constexpr int kBatchSize = 1000;
+  while (state.KeepRunningBatch(kBatchSize)) {
+    shared->looping_threads.fetch_add(1);
+    for (int i = 0; i < kBatchSize; i++) {
+      {
+        absl::MutexLock l(&shared->mu);
+        shared->thread_has_mutex.store(true, std::memory_order_relaxed);
+        // Spin until all other threads are either out of the benchmark loop
+        // or blocked on the mutex. This ensures that the mutex queue is kept
+        // at its maximal length to benchmark the performance of queueing on
+        // a highly contended mutex.
+        while (shared->looping_threads.load(std::memory_order_relaxed) -
+                   shared->blocked_threads.load(std::memory_order_relaxed) !=
+               1) {
+        }
+        shared->thread_has_mutex.store(false);
+      }
+      // Spin until some other thread has acquired the mutex before we block
+      // again. This ensures that we always go through the slow (queueing)
+      // acquisition path rather than reacquiring the mutex we just released.
+      while (!shared->thread_has_mutex.load(std::memory_order_relaxed) &&
             shared->looping_threads.load(std::memory_order_relaxed) > 1) {
+      }
+    }
+    // The benchmark framework uses a barrier to ensure that all of the
+    // threads complete their benchmark loop together before any of the
+    // threads exit the loop. So, we need to remove ourselves from the
+    // "looping threads" counter here before potentially blocking on that
+    // barrier. Otherwise, another thread spinning above might wait forever
+    // for this thread to block on the mutex while we in fact are waiting
+    // to exit.
+    shared->looping_threads.fetch_add(-1);
+  }
+  absl::synchronization_internal::PerThreadSem::SetThreadBlockedCounter(
+      nullptr);
+}
+
+BENCHMARK(BM_MutexEnqueue)
+    ->Threads(4)
+    ->Threads(64)
+    ->Threads(128)
+    ->Threads(512)
+    ->ArgName("multiple_priorities")
+    ->Arg(false)
+    ->Arg(true);
 template <typename MutexType>
 void BM_Contended(benchmark::State& state) {
+  int priority = state.thread_index % state.range(1);
+  ScopedThreadMutexPriority priority_setter(priority);
+
   struct Shared {
     MutexType mu;
     int data = 0;
@@ -85,81 +201,51 @@ void BM_Contended(benchmark::State& state) {
     DelayNs(state.range(0), &shared->data);
   }
 }
+
+void SetupBenchmarkArgs(benchmark::internal::Benchmark* bm,
+                        bool do_test_priorities) {
+  const int max_num_priorities = do_test_priorities ? 2 : 1;
+  bm->UseRealTime()
+      // ThreadPerCpu poorly handles non-power-of-two CPU counts.
+      ->Threads(1)
+      ->Threads(2)
+      ->Threads(4)
+      ->Threads(6)
+      ->Threads(8)
+      ->Threads(12)
+      ->Threads(16)
+      ->Threads(24)
+      ->Threads(32)
+      ->Threads(48)
+      ->Threads(64)
+      ->Threads(96)
+      ->Threads(128)
+      ->Threads(192)
+      ->Threads(256)
+      ->ArgNames({"cs_ns", "num_prios"});
+  // Some empirically chosen amounts of work in critical section.
+  // 1 is low contention, 2000 is high contention and few values in between.
+  for (int critical_section_ns : {1, 20, 50, 200, 2000}) {
+    for (int num_priorities = 1; num_priorities <= max_num_priorities;
+         num_priorities++) {
+      bm->ArgPair(critical_section_ns, num_priorities);
+    }
+  }
+}
+
 BENCHMARK_TEMPLATE(BM_Contended, absl::Mutex)
-    ->UseRealTime()
-    // ThreadPerCpu poorly handles non-power-of-two CPU counts.
-    ->Threads(1)
-    ->Threads(2)
-    ->Threads(4)
-    ->Threads(6)
-    ->Threads(8)
-    ->Threads(12)
-    ->Threads(16)
-    ->Threads(24)
-    ->Threads(32)
-    ->Threads(48)
-    ->Threads(64)
-    ->Threads(96)
-    ->Threads(128)
-    ->Threads(192)
-    ->Threads(256)
-    // Some empirically chosen amounts of work in critical section.
-    // 1 is low contention, 200 is high contention and few values in between.
-    ->Arg(1)
-    ->Arg(20)
-    ->Arg(50)
-    ->Arg(200);
+    ->Apply([](benchmark::internal::Benchmark* bm) {
+      SetupBenchmarkArgs(bm, /*do_test_priorities=*/true);
+    });
+
 BENCHMARK_TEMPLATE(BM_Contended, absl::base_internal::SpinLock)
-    ->UseRealTime()
-    // ThreadPerCpu poorly handles non-power-of-two CPU counts.
-    ->Threads(1)
-    ->Threads(2)
-    ->Threads(4)
-    ->Threads(6)
-    ->Threads(8)
-    ->Threads(12)
-    ->Threads(16)
-    ->Threads(24)
-    ->Threads(32)
-    ->Threads(48)
-    ->Threads(64)
-    ->Threads(96)
-    ->Threads(128)
-    ->Threads(192)
-    ->Threads(256)
-    // Some empirically chosen amounts of work in critical section.
-    // 1 is low contention, 200 is high contention and few values in between.
-    ->Arg(1)
-    ->Arg(20)
-    ->Arg(50)
-    ->Arg(200);
+    ->Apply([](benchmark::internal::Benchmark* bm) {
+      SetupBenchmarkArgs(bm, /*do_test_priorities=*/false);
+    });
+
 BENCHMARK_TEMPLATE(BM_Contended, std::mutex)
-    ->UseRealTime()
-    // ThreadPerCpu poorly handles non-power-of-two CPU counts.
-    ->Threads(1)
-    ->Threads(2)
-    ->Threads(4)
-    ->Threads(6)
-    ->Threads(8)
-    ->Threads(12)
-    ->Threads(16)
-    ->Threads(24)
-    ->Threads(32)
-    ->Threads(48)
-    ->Threads(64)
-    ->Threads(96)
-    ->Threads(128)
-    ->Threads(192)
-    ->Threads(256)
-    // Some empirically chosen amounts of work in critical section.
-    // 1 is low contention, 200 is high contention and few values in between.
-    ->Arg(1)
-    ->Arg(20)
-    ->Arg(50)
-    ->Arg(200);
+    ->Apply([](benchmark::internal::Benchmark* bm) {
+      SetupBenchmarkArgs(bm, /*do_test_priorities=*/false);
+    });
+
 // Measure the overhead of conditions on mutex release (when they must be
 // evaluated).  Mutex has (some) support for equivalence classes allowing
...