Concurrent Programming Without Locks
Keir Fraser & Tim Harris
Adapted from an earlier presentation by Phil Howard

Motivation
  Locking precludes parallelism
  Recall “A Lock-Free Multiprocessor OS Kernel” by Massalin et al.
    – Extensive use of CAS2 (aka DCAS, DCADS)
    – instruction does not exist on today’s CPUs
  Need a practical and general non-blocking solution

Solutions?
  Only use data structures that can be implemented with CAS?
    – Limiting
  RCU
    – Still uses locks for writers
    – Still limited to CAS data structures
  Software MCAS
  Transactional Memory

Goals
  Concreteness
  Linearizability
  Non-blocking progress guarantee
  Disjoint access parallelism
  Read parallelism
  Dynamicity
  Practicable space costs
  Composability

Caveats
  “It remains possible for a thread to see a mutually inconsistent view of shared memory if it performs a series of [read] calls.”

Definitions
  Obstruction-freedom: a thread will make progress as long as it does not contend with other threads for access to any location
  Lock-freedom: the system as a whole will make progress
  Wait-freedom: every thread makes progress
  Focus is on lock-free design
  Whole transactions are lock-free, not just the sub-components
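
To make the lock-freedom definition concrete, here is a minimal C11 sketch of a shared counter updated with a CAS retry loop. The name lock_free_increment is hypothetical and does not come from the slides. With a strong CAS, a thread's attempt fails only because another thread changed the word in the meantime, so the system as a whole keeps advancing even though one unlucky thread could retry many times (which is why this is lock-free but not wait-free).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical helper: atomically add 1 to *counter and return the new value.
 * No thread ever holds a lock, so a stalled thread cannot block the others. */
word lock_free_increment(_Atomic word *counter) {
    assert(counter != NULL);
    word seen = atomic_load(counter);
    /* On failure, C11 stores the current contents of *counter into `seen`,
     * so the next iteration retries against the fresh value. */
    while (!atomic_compare_exchange_strong(counter, &seen, seen + 1)) {
        /* another thread updated the counter first; retry */
    }
    return seen + 1;
}
```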

Design considerations
  Need to update multiple locations atomically, using only “real” instructions
  The secret?
    – Indirection!
    – Use descriptors to access values

[Diagram: Memory and a Descriptor; descriptor fields: Status, Address, Old Value, New Value]

Implications of Descriptors
  Commit operation atomically updates status field
  All accesses are indirect
    – Need to distinguish between descriptor and value
    – Need to choose “actual”, “old”, or “new” value
  Once a descriptor is made visible, only the status field changes
  Once an outcome is decided, the status value doesn’t change
    – Retries use a new descriptor
  Descriptors are managed via garbage collection

Other requirements
  Descriptors must be able to own locations
  Uncontended commits must work
    – Prepare phase
    – Decision point
    – Update status value
    – Clean up
    – Status values: UNDECIDED, READ-CHECK, SUCCESSFUL, FAILED

Other Requirements
  Contended commits must make progress
    – Decided, but not complete
        Help the other thread complete
    – Undecided, not read-check
        Abort contending transactions
          – Without contention management can lead to live-lock
        Help contending transactions
          – Sort memory addresses to prevent looping
    – Read-check
        Abort at least one contender
        Prevent live-locks by totally ordering transactions

Algorithms
  MCAS: Multiple Compare And Swap
  WSTM: Word Software Transactional Memory
  OSTM: Object Software Transactional Memory

MCAS
    CAS(word *address,           // actual value
        word expected_value,
        word new_value);

  (logically)
    MCAS(int count,
         word *address[],        // actual values
         word expected_value[],
         word new_value[]);

  (but an extra indirection is added)
  (pointers must indirect through the descriptor!)
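
The code on later slides tests the return value of CAS against the expected value, which implies a CAS that returns the value it observed rather than a success flag. A minimal sketch of such a wrapper over C11 atomics follows; the lowercase name cas and the choice of uintptr_t for word are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

/* Returns the value observed at *address. The swap took place if and only if
 * the returned value equals expected_value. */
word cas(_Atomic word *address, word expected_value, word new_value) {
    assert(address != NULL);
    word observed = expected_value;
    /* On success `observed` is left untouched (it equals expected_value);
     * on failure C11 overwrites it with the current contents of *address. */
    atomic_compare_exchange_strong(address, &observed, new_value);
    return observed;
}
```

A caller detects success the same way the slides do: `if (cas(&x, old, new) == old) { ... }`.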

MCAS
  Operates only on aligned pointers
  Lower 2 bits used to distinguish value/descriptor
  Descriptors contain
    – status
    – N
    – address[]
    – expected[]
    – new_value[]
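
The low-bit trick can be sketched as follows. The slides only say that the lower 2 bits distinguish a value from a descriptor; the specific tag values (0 for a plain value, 1 for a CCAS descriptor, 2 for an MCAS descriptor) and all function names here are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical tag assignment; not specified in the slides. */
enum { TAG_VALUE = 0, TAG_CCAS_DESC = 1, TAG_MCAS_DESC = 2, TAG_MASK = 3 };

/* Store a tag in the two low bits of a pointer that is at least 4-byte aligned. */
word tag_pointer(void *p, word tag) {
    word w = (word)p;
    assert((w & (word)TAG_MASK) == 0);  /* alignment leaves the low bits free */
    assert(tag <= (word)TAG_MASK);
    return w | tag;
}

bool is_mcas_desc(word w) { return (w & (word)TAG_MASK) == (word)TAG_MCAS_DESC; }
bool is_ccas_desc(word w) { return (w & (word)TAG_MASK) == (word)TAG_CCAS_DESC; }

/* Recover the original pointer by clearing the tag bits. */
void *untag_pointer(word w) { return (void *)(w & ~(word)TAG_MASK); }
```

This is why MCAS operates only on aligned pointers: an arbitrary integer stored in the location could have its low bits set and be mistaken for a descriptor reference.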

Data Access
  [Diagram: memory locations holding either a value or a descriptor reference; two descriptors with columns Address, Old Value, New Value, one with Status: SUCCESS and one with Status: UNKNOWN]

CCAS
  Conditional CAS, built from CAS
    – takes effect only if condition == undecided
    – used to insert descriptor references

    CCAS(word *address,
         word expected_value,
         word new_value,
         word *condition);

  Returns the original value of *address
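
The intended effect of CCAS can be written down as ordinary sequential code. The sketch below is a specification only and is not thread-safe; the real CCAS obtains the same effect atomically by using CAS and its own descriptor. The name ccas_spec and the numeric encoding of the status values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

/* Hypothetical encoding of the status values named in the slides. */
enum { UNDECIDED = 0, SUCCESSFUL = 1, FAILED = 2 };

/* Sequential specification of CCAS (NOT safe for concurrent use):
 * write new_value only if *address holds expected_value AND the condition
 * word is still UNDECIDED. Always return the original contents of *address. */
word ccas_spec(word *address, word expected_value, word new_value,
               const word *condition) {
    assert(address != NULL && condition != NULL);
    word original = *address;
    if (original == expected_value && *condition == UNDECIDED)
        *address = new_value;
    return original;
}
```

In MCAS the condition word is the descriptor's status field, so a descriptor reference can no longer be installed once the outcome has been decided.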

word *MCASRead(word **addr) {
    word *v;
retry_read:
    v = CCASRead(addr);
    if (!IsMCASDesc(v)) return v;
    // v refers to an MCAS descriptor: find our address in it
    for (int i = 0; i < v->N; i++) {
        if (v->addr[i] == addr) {
            if (v->status == SUCCESS) {
                if (CCASRead(addr) == v) return v->new[i];
                else goto retry_read;
            } else {  // FAILED or UNDECIDED
                if (CCASRead(addr) == v) return v->expected[i];
                else goto retry_read;
            }
        }
    }
    return v;
}

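[Diagram sequence: MCAS example on memory locations a, b, c]
  Call: MCAS(3, {a,b,c}, {1,2,3}, {4,5,6}), also shown with unsorted arguments as MCAS(3, {a,c,b}, {1,3,2}, {4,6,5})
  Descriptor contents:
    Address  Old Value  New Value
    a        1          4
    b        2          5
    c        3          6
  Status shown as UNKNOWN, then as SUCCESS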

bool MCAS(int N, word **a[], word *e[], word *n[]) {
    mcas_descriptor *d = new mcas_descriptor();
    d->N = N;
    d->status = UNDECIDED;
    for (int i = 0; i < N; i++) {
        d->a[i] = a[i];
        d->e[i] = e[i];
        d->n[i] = n[i];
    }
    address_sort(d);
    return mcas_help(d);
}
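
The slides call address_sort but do not show it. A minimal sketch follows, under the assumption that the entries are stored as one array of (address, expected, new) records rather than the three parallel arrays used in the slides; mcas_entry and by_address are hypothetical names. Sorting by address gives every operation the same acquisition order, which is what prevents helping from looping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t word;

/* Hypothetical record layout: one entry per location to update. */
typedef struct {
    word *addr;
    word expected;
    word new_value;
} mcas_entry;

/* Order entries by numeric address (via uintptr_t, which is well defined
 * for comparison even when the pointers refer to unrelated objects). */
static int by_address(const void *x, const void *y) {
    uintptr_t a = (uintptr_t)((const mcas_entry *)x)->addr;
    uintptr_t b = (uintptr_t)((const mcas_entry *)y)->addr;
    return (a > b) - (a < b);
}

/* Sort in place so that all threads acquire locations in the same order. */
void address_sort(mcas_entry *entries, size_t n) {
    assert(entries != NULL || n == 0);
    if (n > 1)
        qsort(entries, n, sizeof entries[0], by_address);
}
```

Each expected and new value travels with its address, so a call given as ({a,c,b}, {1,3,2}, {4,6,5}) ends up with the same entries as ({a,b,c}, {1,2,3}, {4,5,6}).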

bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;

    // Phase 1: acquire
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;   // acquired, or already ours
            if (IsMCASDesc(v))
                mcas_help((mcas_descriptor *)v); // help the owner, then retry
            else
                goto decision_point;             // value changed: fail
        }
    }
    desired = SUCCESS;

decision_point:
    // phases 2 and 3 follow on the "mcas_help continued" slide

mcas_help continued
    // Phase 2: read (not used by MCAS)

decision_point:
    CAS(&d->status, UNDECIDED, desired);

    // Phase 3: clean up
    success = (d->status == SUCCESS);
    for (int i = 0; i < d->N; i++) {
        CAS(d->a[i], d, success ? d->n[i] : d->e[i]);
    }
    return success;
}

Claiming Ownership
  [Diagram: an MCAS descriptor (Status: UNKNOWN; columns Address, Old Value, New Value) and a CCAS descriptor labelled with 999, &MCAS_Descr and &mcas->status]

word *CCAS(word **a, word *e, word *n, word *cond) {
    ccas_descriptor *d = new ccas_descriptor();
    word *v;
    (d->a, d->e, d->n, d->cond) = (a, e, n, cond);
    while ((v = CAS(d->a, d->e, d)) != d->e) {
        if (IsCCASDesc(v))
            CCASHelp((ccas_descriptor *)v);
        else
            return v;
    }
    CCASHelp(d);
    return v;
}

void CCASHelp(ccas_descriptor *d) {
    bool success = (*d->cond == UNDECIDED);
    CAS(d->a, d, success ? d->n : d->e);
}

word *CCASRead(word **a) {
    word *v = *a;
    while (IsCCASDesc(v)) {
        CCASHelp((ccas_descriptor *)v);
        v = *a;
    }
    return v;
}

Conflicts
  [Diagram: two descriptors, each with Status: UNKNOWN and columns Address, Old Value, New Value]

Conflicts
  [Diagram: a single descriptor with Status: UNKNOWN and columns Address, Old Value, New Value]

bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;

    // Phase 1: acquire (same logic, with the failure test written first)
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;
            if (!IsMCASDesc(v)) goto decision_point;
            mcas_help((mcas_descriptor *)v);
        }
    }
    desired = SUCCESS;

decision_point:
    // phases 2 and 3 follow, as on the "mcas_help continued" slide

CCAS “failure modes”
  Someone helped us with the CCAS
    – call CCASHelp with our own descriptor
    – next time around, return MCAS descriptor
    – MCAS continues
  Someone else beat us to CCAS
    – help them with their CCAS
    – next time around, return their MCAS descriptor
    – help with their MCAS
    – our MCAS likely aborts
  Source value changed
    – return new value
    – MCAS aborts

word *CCAS(word **a, word *e, word *n, word *cond) {
    ccas_descriptor *d = new ccas_descriptor();
    word *v;
    (d->a, d->e, d->n, d->cond) = (a, e, n, cond);
    while ((v = CAS(d->a, d->e, d)) != d->e) {
        if (!IsCCASDesc(v)) return v;
        CCASHelp((ccas_descriptor *)v);
    }
    CCASHelp(d);
    return v;
}

void CCASHelp(ccas_descriptor *d) {
    bool success = (*d->cond == UNDECIDED);
    CAS(d->a, d, success ? d->n : d->e);
}

CCASHelp “failure modes”
  MCAS aborted, so status isn’t UNKNOWN
    – old value put back in place
  MCAS aborted, CCASHelp doesn’t restore value
    – MCAS cleanup will put old value back in place
  Race: status switches to SUCCESS between check and CAS
    – CAS will fail because CCAS descriptor already removed
    – CCAS return will not cause MCAS failure
  Race: status switches to FAILED between check and CAS
    – CAS will always fail because for MCAS to fail, someone must have read beyond us

Cost
  3N + 1 CAS instructions (plus all the other code)
  “it is worth noting that the three batches of N updates all act on the same locations”
  “[improvements] may be useful if there are systems in which CAS operates substantially more slowly than an ordinary write.”
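
The 3N + 1 figure can be read off the uncontended path of the code on the earlier slides: each of the N CCAS calls in phase 1 performs one CAS to install its CCAS descriptor and one CAS in CCASHelp to replace it with the MCAS descriptor reference, the decision point is a single CAS on the status field, and phase 3 performs one CAS per location. The breakdown below is a reading of that code, not a formula quoted from the paper.

```latex
\[
\underbrace{N}_{\text{install CCAS descriptors}}
+ \underbrace{N}_{\text{CCASHelp: install MCAS descriptor}}
+ \underbrace{1}_{\text{decision point}}
+ \underbrace{N}_{\text{phase 3: clean up}}
= 3N + 1
\]
```

For the three-location example used earlier (N = 3), this gives 10 CAS instructions.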

Deep Breath

WSTM
  Remove requirement for space reserved in values being updated
  WSTM keeps track of locations rather than caller
  Provides read parallelism
  Obstruction free, not lock free nor wait free

Data Structures
  [Diagram: Orecs (one labelled version 52) and a transaction descriptor]
  Status: Undecided
    a1: (100,15) -> (200,16)
    a2: (200,52) -> (100,53)

Logical contents
  Orec contains a version number:
    – value comes direct from memory
  Orec contains a descriptor reference:
    – descriptor contains address: value comes from descriptor, based on status
    – descriptor does not contain address: value comes direct from memory
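
The three cases above can be sketched as a single lookup function. This is a single-threaded model under assumed data layouts: the struct names, the fixed-size entry array, and the use of a NULL owner to mean "orec holds a version number" are all hypothetical simplifications, and the real WSTM must also handle concurrent status changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

typedef enum { UNDECIDED, READ_CHECK, SUCCESSFUL, FAILED } tx_status;

typedef struct {
    word *addr;
    word old_value;
    word new_value;
} tx_entry;

typedef struct {
    tx_status status;
    size_t n;
    tx_entry entries[4];   /* hypothetical fixed capacity */
} tx_descriptor;

/* An orec holds either a version number or a reference to its owner. */
typedef struct {
    tx_descriptor *owner;  /* NULL: the orec holds a version number */
    word version;
} orec;

/* Logical contents of addr, given the orec that covers it. */
word logical_contents(const orec *o, word *addr) {
    assert(o != NULL && addr != NULL);
    if (o->owner == NULL)
        return *addr;  /* version number: value comes direct from memory */
    for (size_t i = 0; i < o->owner->n; i++) {
        const tx_entry *e = &o->owner->entries[i];
        if (e->addr == addr)  /* descriptor contains address: pick by status */
            return o->owner->status == SUCCESSFUL ? e->new_value : e->old_value;
    }
    return *addr;      /* descriptor lacks address: direct from memory */
}
```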

Transaction Process
  Call WSTMRead/WSTMWrite to gather/change data
    – Builds transaction data structure, but it’s NOT visible
  WSTMCommitTransaction
    – Get ownership: update Orecs
    – Read-check: check version numbers
    – Decide
    – Clean up

Data Structures
  [Diagram sequence: Orecs labelled version 52, version 15, version 53, version 16]
  Descriptor entries shown:
    a1: (100,15) -> (200,16)
    a2: (200,52) -> (200,52)
    a2: (200,52) -> (100,53)
  Status shown as UNKNOWN and as SUCCESS

Complications
  Fixed number of Orecs
  Hash collisions lead to false sharing
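
A small sketch of why a fixed orec table causes false sharing. The table size OREC_COUNT and the hash (word index modulo table size) are assumptions chosen for illustration; the slides do not specify the hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;

#define OREC_COUNT 1024u  /* hypothetical fixed table size */

/* Map a word-aligned address to the index of the orec that covers it.
 * Relies on the pointer-to-integer conversion yielding the byte address,
 * which holds on common platforms but is implementation-defined in C. */
size_t orec_index(const void *addr) {
    uintptr_t a = (uintptr_t)addr;
    assert(a % sizeof(word) == 0);  /* WSTM operates on whole words */
    return (size_t)((a / sizeof(word)) % OREC_COUNT);
}
```

With this hash, two words that are exactly OREC_COUNT words apart map to the same orec, so transactions touching unrelated data can still conflict. That is the aliasing the WSTM read and write code has to handle.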

Issues
  Orec ownership acts like a lock, so simple scheme is not even obstruction free
  Can’t help with “cleanup” because might overwrite newer data
  Can’t determine value during READ-CHECK, so we’re forced to shoot down
  force_decision() might be circular, causing live-lock
  Helping requires stealing of transactions
  Uncontended cost is N+2

OSTM
  Objects are represented as opaque handles
    – can’t use pointers directly
    – must rewrite data structures
  Get accessible pointers via OSTMOpenForReading/OSTMOpenForWriting
  Eliminates need for Orecs/aliasing

Evaluation
  “We use … reference-counting garbage collection”
  Evaluated with one thread/CPU
  “Since we know the number of threads participating in our experiments…”

Uncontended Performance

Contended Locks

Data Contention

Data/Lock Contention

Spare Slides

word WSTMRead(wstm_transaction *tx, word *addr) {
    if (entry_exists) return entry->new_value;
    if (orec->type != descriptor)
        create entry [current value, orec version]
    else {
        force_decision(descriptor);  // can’t be ours: not in commit
        if (descriptor contains our address)
            if (status == SUCCESS)
                create entry [descr.new_val, descr.new_ver]
            else
                create entry [descr.old_val, descr.old_ver]
        else
            create entry [current value, descr.aliased.new_ver]
    }
    if (aliased) {
        if (entry->old_version != aliased->old_version)
            status = FAILED;
        entry->old_version = aliased->old_version;
        entry->new_version = aliased->new_version;
    }
    return entry->new_value;
}

void WSTMWrite(wstm_transaction *tx, word *addr, word new_value) {
    get entry using WSTMRead logic
    entry->new_value = new_value;
    for each aliased entry {
        entry->new_version++;
    }
}

bool WSTMCommit(wstm_transaction *tx) {
    if (tx->status == FAILED) return false;
    sort descriptor entries
    desired_status = FAILED;
    for each update
        if (!acquire_orec) goto decision_point;
    CAS(status, UNDECIDED, READ_CHECK);
    for each read
        if (!read_check) goto decision_point;
    desired_status = SUCCESS;

decision_point:
    status = tx->status;
    while (status != FAILED && status != SUCCESS) {
        CAS(tx->status, status, desired_status);
        status = tx->status;
    }
    if (tx->status == SUCCESS)
        for each update
            *addr = entry->new_value;
    for each update
        release_orec
    return (tx->status == SUCCESS);
}

bool read_check(wstm_transaction *tx, wstm_entry *entry) {
    if (orec is WSTM_descriptor) {
        force_decision()
        if (SUCCESS)
            version = new_version;
        else
            version = old_version
    } else {
        version = orec_version;
    }
    return (version == entry->old_version);
}

Data Structures
  [Diagram: Orecs (one labelled version 52) with addresses a1, a2, a3, and a transaction descriptor]
  Status: Undecided
    a1: (100,15) -> (200,16)
    a2: (200,52) -> (100,53)
    a3: (300,15) -> (300,16)