Published by Oswin Barnett. Modified over 9 years ago.
1
Concurrent Programming Without Locks
Keir Fraser & Tim Harris
Adapted from an earlier presentation by Phil Howard
2
Motivation
Locking precludes parallelism
Recall “A Lock-Free Multiprocessor OS Kernel” by Massalin et al.
  – Extensive use of CAS2 (aka DCAS, DCADS)
  – Instruction does not exist on today’s CPUs
Need a practical and general non-blocking solution
3
Solutions?
Only use data structures that can be implemented with CAS?
  – Limiting
RCU
  – Still uses locks for writers
  – Still limited to CAS data structures
Software MCAS
Transactional Memory
4
Goals
Concreteness
Linearizability
Non-blocking progress guarantee
Disjoint access parallelism
Read parallelism
Dynamicity
Practicable space costs
Composability
5
Caveats
“It remains possible for a thread to see a mutually inconsistent view of shared memory if it performs a series of [read] calls.”
6
Definitions
Obstruction-freedom – a thread will make progress as long as it doesn’t contend with other threads for access to any location
Lock-freedom – the system as a whole will make progress
Wait-freedom – every thread makes progress
Focus is on lock-free design
Whole transactions are lock-free, not just the sub-components
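The lock-freedom definition can be illustrated with the usual CAS retry loop. This is a minimal sketch rather than anything taken from the slides; `lockfree_increment` is a hypothetical name, and `std::atomic` stands in for the raw CAS instruction.

```cpp
#include <atomic>
#include <cassert>

// Lock-free increment (illustrative; not from the presentation).
// With compare_exchange_strong, a failed attempt means the counter no longer
// held `seen`, i.e. some other thread's update landed in between. This thread
// may retry, but the system as a whole has made progress.
inline void lockfree_increment(std::atomic<long>& counter) {
    long seen = counter.load();
    while (!counter.compare_exchange_strong(seen, seen + 1)) {
        // On failure, `seen` has been refreshed with the current value; retry.
    }
}
```

An individual thread can, in principle, lose the race indefinitely, which is why this loop is lock-free but not wait-free.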
7
Design considerations
Need to update multiple locations atomically
  – using only “real” instructions
The secret?
  – Indirection!
  – Use descriptors to access values
8
[Figure: memory locations 100–107 shown next to a descriptor. The descriptor has a Status field and a table with columns Address, Old Value, New Value; its rows include (102, 100, 200) and (106, 456, 789).]
9
Implications of Descriptors
Commit operation atomically updates status field
All accesses are indirect
  – Need to distinguish between descriptor or value
  – Need to choose “actual”, “old”, or “new” value
Once a descriptor is made visible, only the status field changes
Once an outcome is decided, the status value doesn’t change
  – Retries use a new descriptor
Descriptors are managed via garbage collection
10
Other requirements
Descriptors must be able to own locations
Uncontended commits must work
  – Prepare phase
  – Decision point
  – Update status value
  – Clean up
  – Status values: UNDECIDED, READ-CHECK, SUCCESSFUL, FAILED
11
Other Requirements
Contended commits must make progress
  – Decided, but not complete
      Help the other thread complete
  – Undecided, not read-check
      Abort contending transactions
        – Without contention management can lead to live-lock
      Help contending transactions
        – Sort memory addresses to prevent looping
  – Read-check
      Abort at least one contender
      Prevent live-locks by totally ordering transactions
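The “sort memory addresses” point can be sketched as follows: if every operation acquires its locations in ascending address order, two operations that help each other cannot wait on one another in a cycle. The `entry` struct and this `address_sort` signature are assumptions made for illustration; the slides only name an `address_sort(d)` step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flattened descriptor entry: one (address, old, new) triple.
struct entry {
    std::uintptr_t address;
    long expected;
    long new_value;
};

// Order entries by address so every MCAS acquires locations in the same
// global order (illustrative stand-in for the slides' address_sort).
inline void address_sort(std::vector<entry>& entries) {
    std::sort(entries.begin(), entries.end(),
              [](const entry& x, const entry& y) { return x.address < y.address; });
}
```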
12
Algorithms
MCAS – Multiple Compare And Swap
WSTM – Word Software Transactional Memory
OSTM – Object Software Transactional Memory
13
MCAS
CAS(word *address,          // actual value
    word expected_value,
    word new_value);
(logically)
MCAS(int count,
     word *address[],       // actual values
     word expected_value[],
     word new_value[]);
(but an extra indirection is added)
(pointers must indirect through the descriptor!)
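The logical behaviour of MCAS can be written down as a sequential reference model: either every location matches its expected value and all are updated, or nothing changes. The sketch below is deliberately not thread-safe and is not the algorithm from the slides; `mcas_model` is a hypothetical name, and `word` is assumed to be a plain integer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using word = long;  // assumption: a word is modelled as a plain integer

// Sequential model of MCAS semantics. NOT thread-safe: it only states what
// the real, descriptor-based MCAS must appear to do atomically.
inline bool mcas_model(const std::vector<word*>& address,
                       const std::vector<word>& expected,
                       const std::vector<word>& new_value) {
    // Compare phase: any mismatch fails the whole operation, writing nothing.
    for (std::size_t i = 0; i < address.size(); ++i)
        if (*address[i] != expected[i]) return false;
    // Swap phase: all locations are updated together.
    for (std::size_t i = 0; i < address.size(); ++i)
        *address[i] = new_value[i];
    return true;
}
```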
14
MCAS
Operates only on aligned pointers
Lower 2 bits used to distinguish value/descriptor
Descriptors contain
  – status
  – N
  – address[]
  – expected[]
  – new_value[]
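Using the lower two bits of an aligned pointer as a tag might look like the sketch below. The particular bit assignments (01 for a CCAS descriptor, 10 for an MCAS descriptor) and the helper names are assumptions for illustration; the slides only say that the low two bits distinguish values from descriptors.

```cpp
#include <cassert>
#include <cstdint>

using word = std::uintptr_t;

// Assumed encoding (illustrative): 00 = plain value, 01 = CCAS, 10 = MCAS.
constexpr word TAG_MASK = 0x3;
constexpr word TAG_CCAS = 0x1;
constexpr word TAG_MCAS = 0x2;

// 4-byte alignment guarantees the two low bits of a descriptor address are 0.
struct alignas(4) mcas_descriptor {
    int status;
    int N;
};

inline word tag_mcas(mcas_descriptor* d) { return reinterpret_cast<word>(d) | TAG_MCAS; }
inline bool is_mcas_desc(word w) { return (w & TAG_MASK) == TAG_MCAS; }
inline bool is_ccas_desc(word w) { return (w & TAG_MASK) == TAG_CCAS; }
inline mcas_descriptor* untag(word w) {
    return reinterpret_cast<mcas_descriptor*>(w & ~TAG_MASK);
}
```

This is why MCAS is restricted to aligned pointers: an arbitrary integer could have its low bits set and be mistaken for a descriptor reference.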
15
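Data Access
[Figure: memory words that hold either a plain value (300) or a reference to a descriptor. One descriptor has Status: SUCCESS and the entry (Address 102, Old Value 100, New Value 200); another has Status: UNKNOWN and the entry (Address 105, Old Value 100, New Value 200).]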
16
CCAS
Conditional CAS built from CAS
  - takes effect only if condition == undecided
  - used to insert descriptor references
CCAS(word *address,
     word expected_value,
     word new_value,
     word *condition);
return original value of *address
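As a sequential reference model, CCAS behaves like an ordinary CAS with one extra guard on the condition word. The sketch below is not thread-safe and is not the descriptor-based implementation; `ccas_model` and the numeric value of `UNDECIDED` are assumptions for illustration.

```cpp
#include <cassert>

using word = long;             // assumption: a word is modelled as an integer
constexpr word UNDECIDED = 0;  // assumed encoding of the "undecided" status

// Sequential model of CCAS semantics (NOT thread-safe): the write happens
// only if *address matches expected_value AND *condition is still UNDECIDED.
// The original contents of *address are returned in every case.
inline word ccas_model(word* address, word expected_value, word new_value,
                       const word* condition) {
    word original = *address;
    if (original == expected_value && *condition == UNDECIDED)
        *address = new_value;
    return original;
}
```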
17
word *MCASRead(word **addr) {
    word *v;
retry_read:
    v = CCASRead(addr);
    if (!IsMCASDesc(v)) return v;          // plain value: nothing to resolve
    for (int i = 0; i < v->N; i++) {
        if (v->addr[i] == addr) {
            if (v->status == SUCCESS) {
                if (CCASRead(addr) == v) return v->new[i];
                else goto retry_read;
            } else {                       // FAILED or UNKNOWN
                if (CCASRead(addr) == v) return v->expected[i];
                else goto retry_read;
            }
        }
    }
    return v;
}
18
MCAS(3, {a,b,c}, {1,2,3}, {4,5,6})
[Figure: locations a, b, c hold the values 1, 2, 3.]
19
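MCAS(3, {a,c,b}, {1,3,2}, {4,6,5})
[Figure: locations a, b, c hold 1, 2, 3. The descriptor has N = 3, status UNKNOWN, and entries (address, old, new) sorted by address: (a, 1, 4), (b, 2, 5), (c, 3, 6).]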
20
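MCAS(3, {a,b,c}, {1,2,3}, {4,5,6})
[Figure: the descriptor (N = 3; entries (a, 1, 4), (b, 2, 5), (c, 3, 6)) now has status SUCCESS, and locations a, b, c change from 1, 2, 3 to 4, 5, 6.]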
21
bool MCAS(int N, word **a[], word *e[], word *n[]) {
    mcas_descriptor *d = new mcas_descriptor();
    d->N = N;
    d->status = UNDECIDED;
    for (int i = 0; i < N; i++) {
        d->a[i] = a[i];
        d->e[i] = e[i];
        d->n[i] = n[i];
    }
    address_sort(d);
    return mcas_help(d);
}
22
bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;
    // Phase 1: acquire
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;   // acquired, or already ours
            if (IsMCASDesc(v))
                mcas_help((mcas_descriptor *)v); // help the owner, then retry
            else
                goto decision_point;             // value changed: fail
        }
    }
    desired = SUCCESS;
decision_point:
23
mcas_help continued
    // PHASE 2: read – not used by MCAS
decision_point:
    CAS(&d->status, UNDECIDED, desired);
    // PHASE 3: clean up
    success = (d->status == SUCCESS);
    for (int i = 0; i < d->N; i++) {
        CAS(d->a[i], d, success ? d->n[i] : d->e[i]);
    }
    return success;
}
24
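Claiming Ownership
[Figure: an MCAS descriptor with Status: UNKNOWN and entries (Address, Old Value, New Value) = (102, 100, 200), (104, 456, 789), (108, 999, 777), next to memory locations 102, 104, 108. A CCAS descriptor for location 108 holds expected value 999, new value &MCAS_Descr, and condition &mcas->status; location 108 holds 999.]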
25
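Claiming Ownership
[Figure: an MCAS descriptor with Status: UNKNOWN and entries (Address, Old Value, New Value) = (102, 100, 200), (104, 456, 789), (108, 999, 777), next to memory locations 102, 104, 108. A CCAS descriptor for location 108 holds expected value 999, new value &MCAS_Descr, and condition &mcas->status; location 108 holds 999.]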
26
word *CCAS(word **a, word *e, word *n, word *cond) {
    ccas_descriptor *d = new ccas_descriptor();
    word *v;
    d->a = a; d->e = e; d->n = n; d->cond = cond;
    // Install our CCAS descriptor; help any CCAS that got there first.
    while ((v = CAS(d->a, d->e, d)) != d->e) {
        if (IsCCASDesc(v))
            CCASHelp((ccas_descriptor *)v);
        else
            return v;
    }
    CCASHelp(d);
    return v;
}

void CCASHelp(ccas_descriptor *d) {
    bool success = (*d->cond == UNDECIDED);
    CAS(d->a, d, success ? d->n : d->e);
}
27
word *CCASRead(word **a) {
    word *v = *a;
    while (IsCCASDesc(v)) {
        CCASHelp((ccas_descriptor *)v);
        v = *a;
    }
    return v;
}
28
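Conflicts
[Figure: memory locations 102, 104, 108 and two MCAS descriptors, both with Status: UNKNOWN. The first has entries (Address, Old Value, New Value) = (102, 100, 200), (104, 456, 789), (108, 999, 777); the second has the entry (108, 999, 200).]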
29
bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;
    // Phase 1: acquire
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;   // acquired, or already ours
            if (IsMCASDesc(v))
                mcas_help((mcas_descriptor *)v); // help the owner, then retry
            else
                goto decision_point;             // value changed: fail
        }
    }
    desired = SUCCESS;
decision_point:
30
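Conflicts
[Figure: memory locations 102, 104, 108 and two MCAS descriptors, both with Status: UNKNOWN. The first has entries (Address, Old Value, New Value) = (102, 100, 200), (104, 456, 789), (108, 999, 777); the second has the entry (108, 999, 200).]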
31
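Conflicts
[Figure: memory locations 102, 104, 108, with the value 200 shown at location 108, and one MCAS descriptor with Status: UNKNOWN and entries (Address, Old Value, New Value) = (102, 100, 200), (104, 456, 789), (108, 999, 777).]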
32
bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;
    // Phase 1: acquire
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;   // acquired, or already ours
            if (IsMCASDesc(v))
                mcas_help((mcas_descriptor *)v); // help the owner, then retry
            else
                goto decision_point;             // value changed: fail
        }
    }
    desired = SUCCESS;
decision_point:
33
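Conflicts
[Figure: memory locations 102, 104, 108 and two MCAS descriptors, both with Status: UNKNOWN, whose entries overlap on locations 104 and 108. The first covers 102 (old 100, new 200), 104 (old 456) and 108 (old 999); the second has entries (Address, Old Value, New Value) = (104, 456, 123), (108, 999, 200).]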
34
bool mcas_help(mcas_descriptor *d) {
    word *v, desired = FAILED;
    bool success;
    // Phase 1: acquire
    for (int i = 0; i < d->N; i++) {
        while (TRUE) {
            v = CCAS(d->a[i], d->e[i], d, &d->status);
            if (v == d->e[i] || v == d) break;            // acquired, or already ours
            if (!IsMCASDesc(v)) goto decision_point;      // value changed: fail
            mcas_help((mcas_descriptor *)v);              // help the owner, then retry
        }
    }
    desired = SUCCESS;
decision_point:
35
mcas_help continued
    // PHASE 2: read – not used by MCAS
decision_point:
    CAS(&d->status, UNDECIDED, desired);
    // PHASE 3: clean up
    success = (d->status == SUCCESS);
    for (int i = 0; i < d->N; i++) {
        CAS(d->a[i], d, success ? d->n[i] : d->e[i]);
    }
    return success;
}
36
CCAS “failure modes”
Someone helped us with the CCAS
  – call CCASHelp with our own descriptor
  – next time around, return MCAS descriptor
  – MCAS continues
Someone else beat us to CCAS
  – help them with their CCAS
  – next time around, return their MCAS descriptor
  – Help with their MCAS
  – Our MCAS likely aborts
Source value changed
  – return new value
  – MCAS aborts
37
word *CCAS(word **a, word *e, word *n, word *cond) {
    ccas_descriptor *d = new ccas_descriptor();
    word *v;
    d->a = a; d->e = e; d->n = n; d->cond = cond;
    // Install our CCAS descriptor; help any CCAS that got there first.
    while ((v = CAS(d->a, d->e, d)) != d->e) {
        if (!IsCCASDesc(v)) return v;
        CCASHelp((ccas_descriptor *)v);
    }
    CCASHelp(d);
    return v;
}

void CCASHelp(ccas_descriptor *d) {
    bool success = (*d->cond == UNDECIDED);
    CAS(d->a, d, success ? d->n : d->e);
}
38
CCASHelp “failure modes”
MCAS aborted so status isn’t UNKNOWN
  – old value put back in place
MCAS aborted, CCASHelp doesn’t restore value
  – MCAS cleanup will put old value back in place
Race: status switches to SUCCESS between check and CAS
  – CAS will fail because CCAS descriptor already removed
  – CCAS return will not cause MCAS failure
Race: status switches to FAILURE between check and CAS
  – CAS will always fail because for MCAS to fail, someone must have read beyond us
39
Cost
3N + 1 CAS instructions (plus all the other code)
“it is worth noting that the three batches of N updates all act on the same locations”
“[improvements] may be useful if there are systems in which CAS operates substantially more slowly than an ordinary write.”
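One way to read the 3N + 1 figure against the code on the earlier slides: each location takes one CAS to install the CCAS descriptor and one CAS in CCASHelp to swap in the MCAS descriptor, the status word takes a single CAS, and cleanup takes one CAS per location. The function name below is hypothetical, and the count applies only to the uncontended case.

```cpp
#include <cassert>

// Uncontended CAS count for an N-location MCAS, broken down by phase
// (illustrative accounting based on the slides' CCAS / mcas_help code).
inline int uncontended_mcas_cas_count(int n) {
    const int install_ccas_descriptors = n;  // CAS inside CCAS
    const int install_mcas_descriptor  = n;  // CAS inside CCASHelp
    const int decide_status            = 1;  // CAS on d->status
    const int release_locations        = n;  // cleanup CAS per location
    return install_ccas_descriptors + install_mcas_descriptor +
           decide_status + release_locations;
}
```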
40
Deep Breath
41
WSTM
Remove requirement for space reserved in values being updated
WSTM keeps track of locations rather than caller
Provides read parallelism
Obstruction free, not lock free nor wait free
42
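Data Structures
[Figure: application memory holding the values 100, 200, 300, 400; a table of Orecs, one showing “version 52”; and a transaction descriptor with Status: Undecided and entries a1: (100,15) -> (200,16), a2: (200,52) -> (100,53), each written as (value, version).]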
43
Logical contents
Orec contains a version number:
  – value comes direct from memory
Orec contains a descriptor reference
  – descriptor contains address
      value comes from descriptor based on status
  – descriptor does not contain address
      value comes direct from memory
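The three cases above can be sketched as a single lookup function. The struct layouts, the `owner == nullptr` convention for “orec holds a version number”, and the name `logical_value` are all assumptions for illustration, not the WSTM data structures themselves.

```cpp
#include <cassert>
#include <vector>

enum class status { UNDECIDED, READ_CHECK, FAILED, SUCCESSFUL };

// Hypothetical, simplified layouts.
struct wstm_entry { const long* addr; long old_value; long new_value; };
struct wstm_descriptor { status st; std::vector<wstm_entry> entries; };
struct orec {
    const wstm_descriptor* owner;  // nullptr: the orec holds a version number
    long version;
};

// Logical contents of *addr, given the orec that covers it.
inline long logical_value(const orec& o, const long* addr) {
    if (o.owner == nullptr) return *addr;          // version number: read memory
    for (const auto& e : o.owner->entries)
        if (e.addr == addr)                        // descriptor contains address
            return o.owner->st == status::SUCCESSFUL ? e.new_value : e.old_value;
    return *addr;                                  // descriptor lacks address
}
```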
44
Transaction Process
Call WSTMRead/WSTMWrite to gather/change data
  – Builds transaction data structure, but it’s NOT visible
WSTMCommitTransaction
  – Get ownership – update ORecs
  – Read-Check – check version numbers
  – Decide
  – Clean up
45
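Data Structures
[Figure: animation of a commit over memory values 100, 200, 300, 400. The descriptor holds a1: (100,15) -> (200,16) and a2: (200,52) -> (100,53), with a2 also shown as (200,52) -> (200,52); its status moves from UNKNOWN to SUCCESS, the values 100 and 200 are swapped in memory, and the Orec versions shown are 15, 52, 16, 53.]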
46
Complications
Fixed number of Orecs
Hash collisions lead to false sharing
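A toy example shows how a fixed orec table produces false sharing. The table size, the 8-byte word assumption, and the hash itself are made up for illustration; the slides do not specify the hash function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Deliberately tiny table so collisions are easy to see (illustrative only).
constexpr std::size_t OREC_COUNT = 4;

// Assumed hash: word index (8-byte words) modulo the table size.
inline std::size_t orec_index(std::uintptr_t addr) {
    return static_cast<std::size_t>(addr >> 3) % OREC_COUNT;
}
```

Under this hash, addresses 0x1000 and 0x1020 map to the same orec, so transactions touching these unrelated words conflict even though they share no data.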
47
Issues
Orec ownership acts like a lock, so simple scheme is not even obstruction free
Can’t help with “cleanup” because might overwrite newer data
Can’t determine value during READCHECK, so we’re forced to shoot down
force_decision() might be circular, causing live lock
Helping requires stealing of transactions
Uncontended cost is N+2
48
OSTM
Objects are represented as opaque handles
  – can’t use pointers directly
  – must rewrite data structures
Get accessible pointers via OSTMOpenForReading/OSTMOpenForWriting
Eliminates need for Orecs/aliasing
49
Evaluation
“We use … reference-counting garbage collection”
Evaluated with one thread/CPU
“Since we know the number of threads participating in our experiments…”
50
Uncontended Performance
51
Contended Locks
52
Data Contention
53
Data/Lock Contention
54
Spare Slides
55
word WSTMRead(wstm_transaction *tx, word *addr) {
    if (entry_exists) return entry->new_value;
    if (orec->type != descriptor)
        create entry [current value, orec version]
    else {
        force_decision(descriptor);   // can’t be ours: not in commit
        if (descriptor contains our address)
            if (status == SUCCESS)
                create entry [descr.new_val, descr.new_ver]
            else
                create entry [descr.old_val, descr.old_ver]
        else
            create entry [current value, descr.aliased.new_ver]
    }
    if (aliased) {
        if (entry->old_version != aliased->old_version)
            status = FAILED;
        entry->old_version = aliased->old_version;
        entry->new_version = aliased->new_version;
    }
    return entry->new_value;
}
56
void WSTMWrite(wstm_transaction *tx, word *addr, word new_value) {
    get entry using WSTMRead logic
    entry->new_value = new_value;
    for each aliased entry {
        entry->new_version++;
    }
}
57
bool WSTMCommit(wstm_transaction *tx) {
    if (tx->status == FAILED) return false;
    sort descriptor entries
    desired_status = FAILED;
    for each update
        if (!acquire_orec) goto decision_point;
    CAS(status, UNDECIDED, READ_CHECK);
    for each read
        if (!read_check) goto decision_point;
    desired_status = SUCCESS;
decision_point:
58
    status = tx->status;
    while (status != FAILED && status != SUCCESS) {
        CAS(tx->status, status, desired_status);
        status = tx->status;
    }
    if (tx->status == SUCCESS)
        for each update
            *addr = entry->new_value;
    for each update
        release_orec
    return (tx->status == SUCCESS);
}
59
bool read_check(wstm_transaction *tx, wstm_entry *entry) {
    if (orec is WSTM_descriptor) {
        force_decision()
        if (SUCCESS)
            version = new_version;
        else
            version = old_version
    } else {
        version = orec_version;
    }
    return (version == entry->old_version);
}
60
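Data Structures
[Figure: application memory holding the values 100, 200, 300, 400; a table of Orecs, one showing “version 52”, referenced by entries a1, a2, a3; and a transaction descriptor with Status: Undecided and entries a1: (100,15) -> (200,16), a2: (200,52) -> (100,53), a3: (300,15) -> (300,16), each written as (value, version).]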