An Efficient Lock Protocol for Home-based Lazy Release Consistency Electronics and Telecommunications Research Institute (ETRI) 2001/5/16 HeeChul Yun.

2/18 Contents
- Introduction
- Motivation
- Our Approach
- Performance Evaluation
- Conclusion

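3/18 Introduction (1/2)
- SVM (Shared Virtual Memory)
  - Page-based software DSM
  - Cost-effective, but performance is limited
[Figure: Proc1, Proc2, Proc3, ..., ProcN, each with its own memory (Mem1, Mem2, Mem3, ..., MemN), connected by a network and presented as a shared memory abstraction.]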

4/18 Introduction (2/2)
- Lazy Release Consistency (LRC)
  - Most popular
  - Allows multiple writers per page using the diff mechanism
- Home-based LRC (HLRC)
  - A home is assigned to each page

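5/18 HLRC (1/2)
[Figure: message timelines comparing LRC and HLRC. In both, P0 acquires L, writes X (making a twin), and releases L; P1 then acquires L, invalidates the page, and reads X. LRC: a diff is created and P1 fetches the diff. HLRC: a diff is created and applied at the home P2, and P1 fetches the page.]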

6/18 HLRC (2/2)
- Advantages
  - No diff creation at the home
  - Short diff lifetime (low memory overhead)
- Problems
  - Home assignment is important for performance
  - A whole page is fetched on a page fault -> poor lock performance

7/18 Motivation (1/2)
- Characteristics of lock-protected data
  - Migratory access pattern
  - Modifications are small (fine-grained)
- Poor lock performance of HLRC
  - A proper fixed home assignment is difficult
  - Migration schemes are not effective
  - A whole page is fetched for a small modification

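8/18 Motivation (2/2)
[Figure: HLRC timeline. P0 acquires L, writes X (4 bytes), and releases L; the diff is applied at the home P2. P1 acquires L, invalidates the page, and on reading X fetches the whole 4 KB page.]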
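The mismatch in the figure can be put in numbers. The following is a minimal sketch, not the authors' accounting: the 4-byte update and the 4 KB page come from the slide, while the diff wire format (a 2-byte offset and a 2-byte length per modified run, here `RUN_HEADER`) is an assumption made only for illustration.

```python
PAGE_SIZE = 4096  # 4 KB page, as on the slide
WORD_SIZE = 4     # size of X, as on the slide

# Assumption (not from the slides): every modified run in a diff is
# preceded by a 2-byte offset and a 2-byte length.
RUN_HEADER = 4


def diff_bytes(run_lengths):
    """Bytes needed to ship a diff made of runs with the given lengths."""
    return sum(RUN_HEADER + length for length in run_lengths)


page_fetch = PAGE_SIZE              # HLRC: the whole page crosses the network
small_diff = diff_bytes([WORD_SIZE])  # diff-based update of X only
ratio = page_fetch / small_diff

print(f"page fetch: {page_fetch} B, diff: {small_diff} B, ratio: {ratio:.0f}x")
```

Under this assumed encoding the page fetch moves several hundred times more data than the modification itself, which is the overhead the rest of the talk targets.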

9/18 Our Approach (1/5)
- Update small lock-protected data
  - Removes page fetching overhead
- Selectively piggyback diffs on the lock grant message
- Diff selection metric
  - Acquirer's access history inside the critical section
  - Diff granularity

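10/18 Our Approach (2/5)
[Figure: HLRC versus our protocol for a variable a on page X. HLRC: P0 acquires L, writes a, and releases L, creating a diff that is applied at the home P2; P1 acquires L and fetches the page to read a. Ours: P0 creates the diff, it is applied at the home P2, and the diff is also sent so that P1 applies it and reads a without a page fetch.]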

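11/18 Our Approach (3/5)
- Hint: access history
- Filter: select only small diffs
- Apply diff: lazily, when a fault occurs
[Figure: a is on page X and b is on page Y. Inside the critical section the sequence is Acq(L), W(a), W(b), R(a), R(b), Rel(L). The acquire ACQ(X,Y) carries the hint (X, Y); the filter drops Y, so Diff(X0) is sent between P0 and P1. The acquirer applies the diff for a and fetches page Y.]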
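The hint, filter, and lazy-apply steps above can be sketched as follows. This is an illustrative sketch, not the KDSM code: the function names, the diff representation as `(offset, data)` runs, and the 256-byte cutoff are all assumptions. It also assumes the acquirer's stale copy is missing only the piggybacked diff.

```python
def diff_size(diff):
    """Payload bytes carried by a diff given as (offset, data) runs."""
    return sum(len(data) for _, data in diff)


def select_diffs(hint_pages, pending_diffs, max_diff_bytes):
    """Granter side: pick the diffs to piggyback on the lock grant message."""
    selected = {}
    for page in hint_pages:              # hint: pages the acquirer touched before
        diff = pending_diffs.get(page)
        if diff is not None and diff_size(diff) <= max_diff_bytes:
            selected[page] = diff        # filter: only small diffs pass
    return selected


def on_page_fault(page, stale_copy, piggybacked, fetch_from_home):
    """Acquirer side: apply a piggybacked diff lazily, else fetch the page."""
    if page in piggybacked:
        data = bytearray(stale_copy)
        for offset, run in piggybacked.pop(page):
            data[offset:offset + len(run)] = run
        return bytes(data)
    return fetch_from_home(page)


# Example shaped like the slide: hint (X, Y), small diff on X, large diff on Y.
PAGE_SIZE = 4096
pending = {
    "X": [(0, b"\x01\x02\x03\x04")],   # small update to a
    "Y": [(0, bytes(2048))],           # too large to piggyback
}
grant = select_diffs(["X", "Y"], pending, max_diff_bytes=256)
granted_pages = sorted(grant)

home_fetches = []


def fetch_from_home(page):
    home_fetches.append(page)            # stands in for a remote page request
    return bytes(PAGE_SIZE)


page_x = on_page_fault("X", bytes(PAGE_SIZE), grant, fetch_from_home)
page_y = on_page_fault("Y", bytes(PAGE_SIZE), grant, fetch_from_home)
```

In this run only page Y triggers a request to the home; the fault on page X is resolved locally from the diff that arrived with the lock grant.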

12/18 Our Approach (4/5)
- Benefits
  - No additional messages
  - Less message traffic: diff size < page size
  - Avoids page requests inside the critical section
    - Applying a small diff is much faster than fetching a remote page
    - Minimizes lock serialization

13/18 Our Approach (5/5)
- Overhead
  - Diff creation for home pages
    - Only for hinted pages
    - Stops if the diff grows beyond a threshold
  - Memory overhead for maintaining diffs
    - Only for hinted pages
    - Old diffs are simply freed at some interval; this does not affect correctness
  -> small overhead
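The "stop if the size is bigger than the threshold" rule can be sketched as a diff routine that gives up early. This is a minimal sketch under assumptions: the function name `bounded_diff`, the byte-granularity comparison, and the 256-byte threshold are hypothetical and not taken from the slides.

```python
def bounded_diff(twin, current, threshold):
    """Encode changes as (offset, data) runs.

    Returns None as soon as more than `threshold` bytes have changed, so a
    heavily modified page costs only a partial scan and produces no diff.
    """
    runs = []
    size = 0
    i = 0
    n = len(current)
    while i < n:
        if twin[i] == current[i]:
            i += 1
            continue
        start = i
        while i < n and twin[i] != current[i]:
            i += 1
            size += 1
            if size > threshold:
                return None          # too big to piggyback: stop scanning
        runs.append((start, bytes(current[start:i])))
    return runs


twin = bytes(4096)

lightly_modified = bytearray(twin)
lightly_modified[8:12] = b"\xff" * 4
small = bounded_diff(twin, lightly_modified, threshold=256)

heavily_modified = bytearray(twin)
heavily_modified[0:1024] = b"\x01" * 1024
large = bounded_diff(twin, heavily_modified, threshold=256)
```

Here `small` holds a single 4-byte run, while `large` is `None` because the scan was abandoned once the changed bytes passed the threshold; such a page would simply be fetched from the home as in plain HLRC.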

14/18 Performance Evaluation
- Platform
  - 8-node PIII 500 MHz Linux cluster
  - 100 Mbps switched Ethernet
- Implementation
  - KDSM (KAIST Distributed Shared Memory)
  - Base HLRC
  - KDSM + Ours
- Applications
  - From SPLASH2
  - TSP, Water, Raytrace(o), Raytrace(r), IS

15/18 Page Requests Inside Critical Section

16/18 Message amount

17/18 Execution Time Breakdown

18/18 Conclusion
- Locks are inefficient in HLRC
  - The home is at a fixed location
  - A whole page must be fetched on a page fault
- Our protocol
  - Removes page fetching overhead for small data
  - Improves lock performance with negligible overhead

19/18 Diff Creation
[Figure: diff creation for page X. On Write(x) a twin of X is created; at release, X is compared with its twin and the changes are encoded into a diff.]
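The twin and diff life cycle on this slide can be sketched in a few lines. This is an illustration under assumptions, not the KDSM implementation: the `Page` class, the 4-byte comparison granularity (`WORD`), and the `(offset, data)` diff encoding are all hypothetical.

```python
WORD = 4  # assumption: pages are compared in 4-byte words


class Page:
    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.twin = None

    def write(self, offset, payload):
        """First write after an acquire copies the page into a twin."""
        if self.twin is None:
            self.twin = bytes(self.data)   # create twin
        self.data[offset:offset + len(payload)] = payload

    def release(self):
        """Encode changes: keep every word that differs from the twin."""
        if self.twin is None:
            return []                      # page was never written
        diff = [
            (off, bytes(self.data[off:off + WORD]))
            for off in range(0, len(self.data), WORD)
            if self.data[off:off + WORD] != self.twin[off:off + WORD]
        ]
        self.twin = None                   # twin is no longer needed
        return diff


def apply_diff(target, diff):
    """Patch a page copy (for example, the home copy) with a diff."""
    for off, words in diff:
        target[off:off + len(words)] = words


writer = Page()
writer.write(16, b"\xde\xad\xbe\xef")      # Write(x)
diff = writer.release()                    # Release -> Diff

home_copy = bytearray(4096)
apply_diff(home_copy, diff)
```

After the release, `diff` contains only the modified word, and applying it to `home_copy` brings that copy in line with the writer's page.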