Published by Ruby Andrews. Modified over 9 years ago.
1
Pessimistic Software Lock-Elision
Nir Shavit (joint work with Yehuda Afek and Alexander Matveev)
2
Read-Write Locks
– One of the most prevalent lock forms in concurrent applications: the 80/20 rule applies to reading vs. writing of data
– Mutual exclusion between write calls, and between writes and read-only calls
– Read-only calls are allowed to proceed in parallel with one another
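For reference, the read-write lock discipline described on this slide can be sketched with the standard C++17 `std::shared_mutex`. This is only an illustration of the locking semantics that PLE later replaces; the `Table` class and the `table_demo` function are hypothetical names, not part of the talk.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical example type: a string-to-int table guarded by a read-write lock.
class Table {
public:
    // Read-only call: takes the lock in shared mode, so readers run in parallel.
    bool lookup(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> guard(lock_);
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }

    // Write call: takes the lock in exclusive mode, so it excludes
    // both readers and other writers.
    void store(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        data_[key] = value;
    }

private:
    mutable std::shared_mutex lock_;
    std::map<std::string, int> data_;
};

// Small self-check: store one entry, then look up a present and an absent key.
inline int table_demo() {
    Table t;
    t.store("x", 42);
    int v = 0;
    bool found = t.lookup("x", v);
    bool missing = !t.lookup("y", v);
    assert(found && missing);
    return v;
}
```

With this kind of lock a writer still blocks every reader, which is the limitation the rest of the talk addresses.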
3
Coming Next Year: HTM and Hardware Lock Elision
4
Speculative Lock Elision (SLE)
– Speculate: try to execute the critical sections concurrently using transactions
– On failure: revert back to the lock
– Rajwar and Goodman: speculative execution of locks by optimistic hardware transactions (Haswell)
– Roy, Hand, and Harris: software implementation of SLE, with transactions executed speculatively in software
[Diagram: timelines for Thread 1 and Thread 2 (Start, Acquire, Release), with the lock acquire elided]
5
SLE: Good and Bad
Advantages:
– Concurrency among writes, and among reads and writes, as long as they do not share or contend for memory
Disadvantages:
– Contention implies defaulting to the lock, so reads are delayed by writes
– System calls and I/O cannot be used: they cause the transaction to fail
– Debugging is hard because of the speculative, non-deterministic behavior
– Speculative execution breaks the lock semantics, so you need to rewrite the code
6
Pessimistic Lock Elision (PLE)
Non-speculatively replace read-write locks by pessimistic software transactions, in a way that:
– Preserves the lock semantics: no code rewriting, and I/O is allowed in transactions
– Always allows read-write concurrency
Disadvantage:
– Does not allow concurrency among writes. How important is this for RW-locked code?
7
Pessimistic STM [MatveevShavit2011]
– A commit-time privatizing STM in which all transactions execute once and never abort
– Read-only transactions run in parallel with one another and with writes
– To create PLE, we designed a new encounter-order version of this pessimistic STM that has wait-free read-only transactions
8
Encounter Order Pessimistic STM
– A quiescence mechanism [MatveevShavit2010] tells when reads terminate
– Write transactions execute sequentially (commits are serialized) by “passing a baton”
– Writes maintain a public undo log
– Wait-free reads collect a snapshot of memory using the undo log
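One common way to build a quiescence mechanism is a per-thread activity counter that is odd while a read is in progress. The sketch below follows that generic pattern; it is an assumption for illustration, not the mechanism from [MatveevShavit2010]. The names `Quiescence`, `kSlots`, and `quiescence_demo` are hypothetical, and the default sequentially consistent atomics are a simplification.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

constexpr std::size_t kSlots = 8;  // assumed fixed bound on reader threads

class Quiescence {
public:
    using Snapshot = std::array<unsigned long, kSlots>;

    // Each slot is written only by its owner thread, so a load followed
    // by a store is enough here (no CAS).
    void read_begin(std::size_t slot) { bump(slot); }  // even -> odd: read in progress
    void read_end(std::size_t slot)   { bump(slot); }  // odd -> even: read finished

    // Writer side: record the current state of every reader slot.
    Snapshot snapshot() const {
        Snapshot s{};
        for (std::size_t i = 0; i < kSlots; ++i) s[i] = activity_[i].load();
        return s;
    }

    // True once every read that was in progress at snapshot time has ended.
    // Reads that started after the snapshot are ignored.
    bool quiesced(const Snapshot& s) const {
        for (std::size_t i = 0; i < kSlots; ++i) {
            if ((s[i] & 1UL) != 0 && activity_[i].load() == s[i]) return false;
        }
        return true;
    }

    // Blocking form used by a committing writer.
    void wait_for_readers() const {
        const Snapshot s = snapshot();
        while (!quiesced(s)) std::this_thread::yield();
    }

private:
    void bump(std::size_t slot) {
        activity_[slot].store(activity_[slot].load() + 1);
    }
    std::array<std::atomic<unsigned long>, kSlots> activity_{};
};

// Single-threaded walk-through of the protocol.
inline bool quiescence_demo() {
    Quiescence q;
    q.read_begin(0);                          // reader 0 is inside a read
    const Quiescence::Snapshot s = q.snapshot();
    bool blocked = !q.quiesced(s);            // writer must still wait
    q.read_begin(1);                          // a later reader is not waited for
    q.read_end(0);                            // reader 0 finishes
    bool released = q.quiesced(s);            // writer may now proceed
    q.read_end(1);
    return blocked && released;
}
```

The writer only waits for reads that overlap its snapshot, so readers themselves never block.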
9
Pessimistic Read-Write Interaction
Write transactions must not write to locations being read by overlapping reads
Solution:
– On a write, the old value is logged publicly before the new value is written
– In the read phase, logged values of concurrent writes are read
– In the commit phase, the old values are discarded once the quiescence mechanism has ensured that no one is still reading them
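The three steps above can be illustrated with a small sequential model. It is a sketch under stated assumptions, not the authors' implementation: it is single-threaded and omits all synchronization. The per-location version stamps used to detect "concurrent writes" are an assumption of this sketch, and `PessimisticModel` and `undo_log_demo` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Sequential model of the undo-log bookkeeping (no threads, no atomics).
class PessimisticModel {
public:
    explicit PessimisticModel(std::size_t words)
        : mem_(words, 0), ver_(words, 0) {}

    // Reader: remember the last committed version when the read phase starts.
    unsigned long read_begin() const { return committed_; }

    // Reader: a location stamped after our start was overwritten by a
    // concurrent writer, so return the publicly logged old value instead.
    int read(unsigned long start, std::size_t addr) const {
        if (ver_[addr] > start) return undo_.at(addr);
        return mem_[addr];
    }

    // Writer (one at a time): log the old value first, then write in place.
    void write(std::size_t addr, int value) {
        undo_.emplace(addr, mem_[addr]);  // keeps only the first (oldest) value
        mem_[addr] = value;
        ver_[addr] = committed_ + 1;      // stamp with the writer's version
    }

    // Commit makes the writes visible to readers that start afterwards.
    void commit() { ++committed_; }

    // Must run only after quiescence, and before the next writer starts.
    void discard_log() { undo_.clear(); }

private:
    std::vector<int> mem_;
    std::vector<unsigned long> ver_;
    std::unordered_map<std::size_t, int> undo_;
    unsigned long committed_ = 0;
};

// Walk-through: an overlapping reader keeps seeing the old value.
inline bool undo_log_demo() {
    PessimisticModel m(4);
    m.write(0, 10);
    m.commit();
    m.discard_log();

    const unsigned long old_reader = m.read_begin();
    m.write(0, 20);                                   // overlaps old_reader
    bool sees_old = (m.read(old_reader, 0) == 10);    // served from the log
    m.commit();
    bool still_old = (m.read(old_reader, 0) == 10);   // log not yet discarded
    const unsigned long new_reader = m.read_begin();
    bool sees_new = (m.read(new_reader, 0) == 20);    // served from memory
    m.discard_log();                                  // old_reader is done by now
    return sees_old && still_old && sees_new;
}
```

In this model a reader never waits for the writer: it either finds the value in memory or in the log, which is why the log can only be discarded after quiescence.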
10
Why does this work well?
– No need for CAS or even memory barriers in the common case
– Even though logging is public, it is done by only one transaction at a time, so it is very easy to implement
11
Applying Pessimistic Lock-Elision
[Diagram: a program with RW-locks is the input to the STM compiler (Intel STM Compiler with PLE transactions), whose output is a program with PLE. The output executes either on a standard processor (PLE code is executed) or on a processor with HLE (Intel’s Haswell; HLE code is executed with software fallback to PLE).]
– Point 1: The semantics are not changed by adding PLE
– Point 2: Concurrency between read and write critical sections
– Point 3: HLE has limitations, but HLE + PLE does not
– Point 4: PLE works on current processors
12
Performance
We empirically evaluated our algorithm on an Intel 40-way machine with 2 Xeon E7-4870 chips in a NUMA setup.
1. PLE: our fully pessimistic encounter-time STM
2. RW_Lock_Egress: an ingress-egress counter based reader-writer mutex implementation for Intel platforms
3. MCS-Lock: Mellor-Crummey and Scott's MCS lock
4. RW_Lock_SPAA: the new RWLock proposal from SPAA 2012
[Plot regions: NORMAL, HYPERTHREADS, NUMA]
13
Three Ways to Elide Locks
– Software-only lock elision: if you don’t have hardware support
– A fallback (slow path) for the hardware HLE: Intel’s SLE
– A fallback using HTM: Intel’s RTM
14
If Your Machine Doesn’t Have Hardware Support
– Automatically replace, at compile time, all read-write locked code with PLE STM code: as easy as STM in the new C++ compiler
– This will improve on your RW-locks because it allows read-only calls to proceed in parallel with writes
– Write calls are sequential, but they were sequential anyhow…
15
If Your Machine Has SLE
– There is an XTEST instruction that reports whether the thread is currently executing in SLE
– Execute XTEST after the XACQUIRE-prefixed instruction (the HLE transaction start)
– At compile time, create a duplicate PLE code path
– If the XTEST fails, the duplicate PLE path is executed
16
If Your Machine Has RTM
Two copies of the code: one is the PLE path, the other is the RTM path:
– The RTM hardware fallback routine is the start of the PLE code path
– After the XBEGIN, add a read (load) of the is_abort variable
– The PLE code path first executes a small RTM transaction that updates is_abort
– This causes all concurrently executing RTM transactions to fail
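The `is_abort` handshake can be sketched with the RTM intrinsics from `<immintrin.h>`. This is a minimal illustration under several assumptions. A plain mutex stands in for the PLE slow path. The slow path updates `is_abort` with an ordinary atomic store rather than the small RTM transaction mentioned on the slide. The hardware path is compiled only when the compiler defines `__RTM__` (for example with `-mrtm`), in which case the CPU is assumed to support RTM. The names `counter`, `fallback_lock`, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#if defined(__RTM__)
#include <immintrin.h>  // _xbegin, _xend, _xabort
#endif

std::atomic<int> is_abort{0};  // variable name from the slide; layout is assumed
std::mutex fallback_lock;      // stand-in for the PLE slow path (assumption)
long counter = 0;              // example shared data

// Fast path: returns true only if the hardware transaction committed.
inline bool try_rtm_increment() {
#if defined(__RTM__)
    if (_xbegin() == _XBEGIN_STARTED) {
        // Loading is_abort adds it to the transaction's read set, so a
        // later write by the slow path aborts this transaction.
        if (is_abort.load(std::memory_order_relaxed) != 0) _xabort(0x01);
        ++counter;
        _xend();
        return true;
    }
#endif
    return false;  // aborted, or built without RTM support
}

// Slow path: announce itself through is_abort, then update the data.
inline void slow_path_increment() {
    std::lock_guard<std::mutex> guard(fallback_lock);
    is_abort.store(1);  // aborts in-flight hardware transactions
    ++counter;
    is_abort.store(0);  // hardware transactions may run again
}

// Try the hardware path once, then fall back.
inline void increment() {
    if (!try_rtm_increment()) slow_path_increment();
}
```

A real deployment would likely retry the hardware path a few times before falling back; that policy is left out here to keep the handshake visible.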
17
Lock-Elision Theory
– We are going to see a lot of use of lock elision in industry…
– So, what are the inherent costs of lock elision using STMs?
– What are the inherent costs of pessimistic STM implementations?
– Can we quantify the interaction between hardware and software transactions (or with locks)?
18
Thanks