1
Thread-Level Speculation as a Memory Consistency Protocol for Software DSM?
Marcelo Cintra
University of Edinburgh
http://www.dcs.ed.ac.uk/home/mc
2
Dagstuhl Seminar - October 2003
Thread-Level Speculation (TLS)
Speculatively run whole “threads” and backtrack if necessary
Track data accesses to detect cross-thread “conflicting” memory accesses
Buffer state of speculative threads and commit when appropriate
Enforce some expected correct execution behavior
3
Example 1: Speculative Parallelization
Original code: sequential with non-decidable dependences
Squash on data flow dependences

    for(i=0; i<100; i++) {
      … = A[L[i]]+…
      A[K[i]] = …
    }

Iteration J: … = A[4]+… ; A[5] = ...
Iteration J+1: … = A[2]+… ; A[2] = ...
Iteration J+2: … = A[5]+… ; A[5] = ...
RAW: iteration J writes A[5], iteration J+2 reads A[5]
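A minimal sketch of the dependence that forces a squash in Example 1, assuming the index arrays L and K are known after the fact. The helper name `first_raw_violation` is hypothetical and not from the talk. A real TLS system detects this at run time while iterations execute concurrently; this sequential check only shows which iteration would be affected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first iteration that reads
 * an element written by an EARLIER iteration (a cross-iteration RAW
 * dependence), or -1 if there is none. L[i] is the element iteration i
 * reads and K[i] is the element it writes, as in the slide's loop. */
int first_raw_violation(const int *L, const int *K, size_t n)
{
    assert(L != NULL && K != NULL);
    for (size_t j = 1; j < n; j++) {
        for (size_t i = 0; i < j; i++) {
            if (K[i] == L[j])
                return (int)j;   /* iteration j depends on iteration i */
        }
    }
    return -1;
}
```

With the slide's three iterations (reads of A[4], A[2], A[5] and writes of A[5], A[2], A[5]) the third iteration is the one flagged, since it reads A[5] after the first iteration wrote it.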
4
Example 2: Speculative Synchronization [Martinez and Torrellas, ASPLOS02]
Original code: parallel with locks and barriers
Squash on conflicting accesses
Thread A: acquire; … = A[4]+… ; A[5] = … ; release
Thread B: acquire; … = A[2]+… ; A[2] = … ; release
Thread C: acquire; … = A[5]+… ; A[5] = … ; release
Conflicts: RAW, WAW
5
Example 2: Speculative Synchronization
Non-conflicting memory operations can perform out-of-order
Conflicting memory operations eventually complete in-order after rollback
–Relaxes the order of non-conflicting memory operations while still providing the release consistency (RC) abstraction
At release/commit all pending stores must complete
TLS used to enforce RC in a more “relaxed” way by means of speculation and rollback
6
Outline
Background and motivation
A TLS-based protocol for software DSM
Summary
Related work
Conclusions
7
LRC Consistency Protocol
Block on acquires and wait for lock
Obtain lock along with invalidations
On load page fault allocate local page and get diff update
On store page fault generate twin copy
On release compare twin and private copy to generate diff; send invalidations and lock to next thread in line
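The twin and diff steps can be sketched as follows. This is an illustration only: the page size `PAGE_WORDS`, the `diff_entry` layout, and the function names are assumptions, not taken from the talk or from any particular DSM system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_WORDS 8   /* illustrative page size, not from the slides */

typedef struct {
    size_t index;  /* word offset within the page */
    int    value;  /* new contents of that word   */
} diff_entry;

/* On the first store fault: keep an unmodified copy of the page (the twin). */
void make_twin(int *twin, const int *page)
{
    memcpy(twin, page, PAGE_WORDS * sizeof(int));
}

/* On release: record every word that differs from the twin.
 * Returns the number of entries written to out (at most PAGE_WORDS). */
size_t make_diff(const int *twin, const int *page, diff_entry *out)
{
    size_t n = 0;
    for (size_t i = 0; i < PAGE_WORDS; i++) {
        if (page[i] != twin[i]) {
            out[n].index = i;
            out[n].value = page[i];
            n++;
        }
    }
    return n;
}

/* On a load fault at a later acquirer: bring its local copy up to date. */
void apply_diff(int *page, const diff_entry *d, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        assert(d[k].index < PAGE_WORDS);
        page[d[k].index] = d[k].value;
    }
}
```

Comparing against the twin means only the words actually changed inside the critical section travel to the next thread, rather than the whole page.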
8
Example LRC Operation
Thread A: acquire; … = A[4]+… ; A[5] = … ; release
Thread B: acquire; … = A[2]+… ; A[2] = … ; release
Thread C: acquire; … = A[5]+… ; A[5] = … ; release
Annotations:
–Generate diff
–Obtain diff from Thread A
9
TLS-based Consistency Protocol
On load or write miss allocate local page and twin copy
Expand loads and stores to keep a record of the accesses to individual fields of shared objects
On commit
–Wait for “diff” from non-speculative thread
–Check for violations
–Merge “diffs” and pass to next speculative thread in line
If violation detected
–Incorporate received “diff” into twin copy and discard local copy
–Discard own “diff”
–Discard some private data (may require extra buffering)
–Re-execute
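A rough sketch of the violation path, under several assumptions that are not in the talk: the received “diff” is represented as the committed page contents plus a per-field state array, and the names `spec_page`, `squash`, and `PAGE_WORDS` are hypothetical. Discarding private data and re-executing are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_WORDS 8   /* illustrative page size */

typedef enum { NotAccessed, Loaded, Modified } field_state;

typedef struct {
    int         local[PAGE_WORDS];  /* speculative working copy          */
    int         twin[PAGE_WORDS];   /* last known committed contents     */
    field_state sa[PAGE_WORDS];     /* per-field access record ("diff")  */
} spec_page;

/* Squash a speculative thread's page after a violation. */
void squash(spec_page *p, const int *committed, const field_state *nonspec_sa)
{
    assert(p != NULL && committed != NULL && nonspec_sa != NULL);

    /* Incorporate the received diff into the twin copy. */
    for (size_t i = 0; i < PAGE_WORDS; i++) {
        if (nonspec_sa[i] == Modified)
            p->twin[i] = committed[i];
    }

    /* Discard the local copy: restart from the updated twin. */
    memcpy(p->local, p->twin, sizeof p->local);

    /* Discard the thread's own diff. */
    for (size_t i = 0; i < PAGE_WORDS; i++)
        p->sa[i] = NotAccessed;
}
```

After `squash` returns, the thread sees the committed values and can re-execute from the start of its speculative section.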
10
TLS “diff” and Violations
3 possible states for each field of shared object:
–NotAccessed: thread did not touch this field
–Loaded: thread loaded this field but did not store to it
–Modified: thread stored to this field and possibly loaded it
Violation and merging of “diff”s:
Non-spec      Speculative    Result
NotAccessed   NotAccessed    NotAccessed
NotAccessed   Loaded         Loaded
NotAccessed   Modified       Modified
Modified      NotAccessed    Modified
Modified      Loaded         Violation
Modified      Modified       Violation
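One way to express the violation check and merge in code, as a sketch. It assumes that a field Modified by the non-speculative thread conflicts with any speculative access to that field, and that a field the non-speculative thread only Loaded never causes a conflict (the latter is an assumption not stated on the slide). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { NotAccessed, Loaded, Modified } field_state;

/* Compare the two records for one field. Returns false on a violation;
 * otherwise stores the merged state in *merged and returns true. */
bool merge_field(field_state nonspec, field_state spec, field_state *merged)
{
    assert(merged != NULL);
    if (nonspec == Modified && spec != NotAccessed)
        return false;   /* speculative thread may have used a stale value */
    *merged = (nonspec == Modified) ? Modified : spec;
    return true;
}

/* Check every field first; merge into spec[] only if none conflicts,
 * so that spec[] is left untouched when a violation is found. */
bool check_and_merge(const field_state *nonspec, field_state *spec, size_t n)
{
    field_state m = NotAccessed;
    for (size_t i = 0; i < n; i++) {
        if (!merge_field(nonspec[i], spec[i], &m))
            return false;
    }
    for (size_t i = 0; i < n; i++) {
        merge_field(nonspec[i], spec[i], &m);
        spec[i] = m;
    }
    return true;
}
```

Treating Modified against Modified as a violation is conservative: the Modified state does not say whether the speculative thread also loaded the field before storing to it.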
11
Example TLS DSM Operation
Thread A: TLS_start; … = A[4]+… (TLS_load); A[5] = … (TLS_store); TLS_end
Thread B: TLS_start; … = A[2]+… (TLS_load); A[2] = … (TLS_store); TLS_end
Thread C: TLS_start; … = A[5]+… (TLS_load); A[5] = … (TLS_store); TLS_end
Annotations:
–Update “diff” to have A[5] as Modified
–Update “diff” to have A[2] as Loaded
–Wait for non-spec (A) to finish. Obtain “diff” from A. Compare “diff” with own “diff”. No violations, so become non-spec. Merge “diffs”
–Get page with stale data
–No need to update “diff”
–Wait for non-spec (B) to finish. Obtain “diff” from B. Compare “diff” with own “diff”. Violation detected.
12
Example Implementation
TLS_load: if (SA[i]==NotAccessed) SA[i]=Loaded
TLS_store: SA[i]=Modified
TLS_start:
–Try to acquire lock with a non-blocking operation
–If successful then become non-speculative
–Otherwise get a place in line for the lock, and execute speculatively
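The TLS_load and TLS_store bookkeeping might look like this when wrapped around the actual accesses. This is a sketch: the wrapper names `tls_load` and `tls_store`, the array size `N_FIELDS`, and the use of plain global arrays are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_FIELDS 8   /* illustrative number of fields */

typedef enum { NotAccessed, Loaded, Modified } field_state;

int         A[N_FIELDS];    /* local copy of the shared object     */
field_state SA[N_FIELDS];   /* per-field access record for A       */

/* Instrumented load: record the first touch, then read. */
int tls_load(size_t i)
{
    assert(i < N_FIELDS);
    if (SA[i] == NotAccessed)
        SA[i] = Loaded;
    return A[i];
}

/* Instrumented store: a store always leaves the field Modified. */
void tls_store(size_t i, int v)
{
    assert(i < N_FIELDS);
    SA[i] = Modified;
    A[i] = v;
}
```

A load never downgrades a field that is already Modified, which matches the definition of Modified as “stored to this field and possibly loaded it”.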
13
Example Implementation
TLS_end:
–If non-speculative then “pass” lock to next thread in line; next thread becomes non-speculative
–Else, if next thread waiting for lock then
    Wait for non-speculative to finish
    Get “diff” from non-speculative thread
    Check for violations
    Merge “diff”s
    “Pass” lock to next thread in line
–Else, wait for lock
14
Outline
Background and motivation
A TLS-based protocol for software DSM
Summary
Related work
Conclusions
15
Will It Work?
Overheads
–Augmented loads and stores
    Both speculative parallelization and optimistic concurrency control in software have been done successfully
    Compiler instrumentation for write trapping in DSM is not so bad [Adve et al., HPCA96]
–Serialization of commits
Implementation
–Hopefully not much more complex than a software DSM
–Use source code augmentation and user help
Applications
–Irregular applications with little overlap of modifications in critical sections
–Easy to switch back to normal DSM operation
16
Outline
Background and motivation
A TLS-based protocol for software DSM
Summary
Related work
Conclusions
17
Related Work
Speculative Synchronization:
–Martinez and Torrellas (ASPLOS 2002); Rajwar and Goodman (MICRO 2001)
    Hardware-based
Optimistic Concurrency Control and Software Transactional Memory:
–Herlihy (ACM TODS 1990); Kung and Robinson (ACM TODS 1981)
    Source-code level speculation for transaction processing
–Shavit and Touitou (PODC 1995); Herlihy et al. (PODC 2003)
    Run-time system speculation on top of hardware coherent systems
18
Related Work
Speculation and consistency models:
–Gniady, Falsafi, and Vijaykumar (ISCA 1999)
    SC plus speculation in hardware
    Speculation only within instruction window and ld/st queue
19
Related Work
Software Speculative Parallelization:
–Dang, Yu, and Rauchwerger (IPDPS 2002); Rundberg and Stenström (WSSMM 2000); Cintra and Llanos (PPoPP 2003)
    Speculative parallelization at source-code level
–Papadimitriou and Mowry (CMU-CS-01-145)
    Speculative parallelization on software DSM protocol
20
Related Work
Software DSM systems:
–TreadMarks: Amza et al. (IEEE Computer 1996)
    Lazy RC (LRC)
–Midway: Bershad, Zekauskas, and Sawdon (CompCon 1993)
    Entry Consistency (EC)
–Adve et al. (HPCA 1996)
    Compared LRC versus EC
    Compared twinning versus compiler instrumentation for write trapping
21
Outline
Background and motivation
A TLS-based protocol for software DSM
Summary
Related work
Conclusions
22
Conclusions and Future Work
TLS can provide RC with more relaxed synchronization
Hardware speculative synchronization and software speculative parallelization have been successful
Must find applications
Must perform detailed performance evaluation
?
23
Thread-Level Speculation as a Memory Consistency Protocol for Software DSM?
Marcelo Cintra
University of Edinburgh
http://www.dcs.ed.ac.uk/home/mc