CSS490 Distributed Shared Memory

CSS490 Distributed Shared Memory
Textbook Ch5 Instructor: Munehiro Fukuda These slides were compiled from the course textbook and the reference books. Hello! Everyone, My name is Shinya Kobayashi. Today, I am going to present our paper titled “Inter-Cluster Job Coordination Using Mobile Agents” on behalf of the first author, Munehiro Fukuda. Munehiro was hoping to show up and present the paper at AMS2001, however he got to wait in Japan until he will get an H1B visa. Since I received the presentation materials from him quite recently, please allow me to present this paper using this script. I can respond to your questions as far as I know, however you can also ask Munehiro by . His address is on the title page of our paper. (time 1:05) Winter, 2004 CSS490 DSM

Basic Concept : : : … address write(address, data);
Distributed Shared Memory (exists only virtually) Data = read(address); write(address, data); CPU 1 CPU n MMU Page Mgr : Memory Node 1 CPU 1 CPU n MMU Page Mgr : Memory Node 1 CPU 1 CPU n MMU Page Mgr : Memory Node 1 … Communication Network Winter, 2004 CSS490 DSM

Why DSM? Simpler abstraction
Underlying tedious communication primitives are all shielded by memory accesses Better portability of distributed application programs Natural transition from sequential to distributed application Better performance of some applications Data locality, one-demand data movement, and large memory space reduce network traffic and paging/swapping activities. Flexible communication environment Sender and receiver have no need to know each other. They even need not coexist. Ease of process migration Migration is completed only by transferring the corresponding PCB to the destination. Winter, 2004 CSS490 DSM

Main Issues Granularity Structure of shared-memory space
Fine (less false sharing but more network traffic) Cache line (e.g. Dash and Alewife), Object (e.g. Orca and Linda), Page (e.g. Ivy)  Coarse(more false sharing but less network traffice) Structure of shared-memory space Memory coherence and access synchronization Strict, Sequential, Causal, Weak, and Release Consistency models Data location and access Broadcasting, centralized data locator, fixed distributed data locator, and dynamic distributed data locator Replacement strategy LRU or FIFO, and using secondary store or the memory space of other nodes (COMA) Thrashing How to prevent a block from being exchanged back and forth between two nodes. Heterogeneity Winter, 2004 CSS490 DSM

Consistency Models Strict Consistency
Wi(x, a): Processor i writes a on variable x. BRi(x): Processor i reads b from variable x. Any read on x must return the value of the most recent write on x. Strict Consistency Not Strict Consistency P1 P2 P3 P1 P2 P3 W2(x, a) W2(x, a) aR1(x) nilR1(x) aR1(x) aR3(x) aR3(x) aR1(x) Winter, 2004 CSS490 DSM

Consistency Models Linearizability and Sequential Consistency
Linearlizability: Operations of each individual process appear to all processes in the same order as they happen. Sequential Consistency: Operations of each individual process appear in the same order to all processes. Linearlizability Sequential Consistency P1 P2 P3 P4 P1 P2 P3 P4 W2(x, a) W2(x, a) W3(x, b) W3(x, b) aR1(x) bR1(x) aR4(x) bR4(x) bR1(x) bR4(x) aR1(x) aR4(x) Winter, 2004 CSS490 DSM

Consistency Models FIFO and Processor Consistency
FIFO Consistency: writes by a single process are visible to all other processes in the order in which they were issued. Processor Consistency: FIFO Consistency + all write to the same memory location must be visible in the same order. FIFO Consistency Processor Consistency P1 P2 P4 P3 P4 P1 P2 P3 W2(x, a) W2(x, a) W3(x, 0) W2(x, b) aR1(x) W3(y, 0) aR1(x) W2(x, b) W3(x, 1) 0R1(x) aR1(x) 0R1(x) aR1(x) W3(y, 1) bR1(x) 0R1(x) 0R1(x) 1R1(x) 1R1(x) 1R1(x) W3(z, a) W3(z, 1) W2(y, a) bR1(x) 1R1(x) W2(y, a) bR1(x) bR1(x) aR1(z) aR1(z) aR1(y) aR1(y) aR1(z) aR1(y) aR1(z) aR1(y) Winter, 2004 CSS490 DSM

Consistency Models Causal Consistency
Causally related write must be visible to all processes in the same order. Concurrent writes may be propagated in a different order. Causal Consistency Not Causal Consistency P1 P2 P3 P4 P1 P2 P3 P4 W2(x, a) W2(x, a) aR3(x) aR4(x) aR3(x) aR3(x) W2(x, c) W3(x, b) W3(x, b) bR4(x) cR1(x) aR1(x) bR4(x) bR1(x) cR4(x) bR1(x) aR4(x) Winter, 2004 CSS490 DSM

Consistency Models Weak Consistency
Accesses to synchronization variables must obey sequential consistency. All previous writes must be completed before an access to a synchronization variable. All previous accesses to synchronization variables must be completed before access to non-synchronization variable. Weak Consistency Not Weak Consistency P1 P2 P3 P1 P2 P3 W2(x, a) W2(x, a) W2(x, b) W2(y, c) W2(y, c) bR4(x) aR4(x) W2(x, b) NilR4(y) S3 S3 S1 S1 S2 S2 bR4(x) cR4(y) bR4(x) aR4(x) cR4(y) cR4(y) cR4(y) bR4(x) Winter, 2004 CSS490 DSM

Consistency Models Release Consistency
Access to acquire and release variables obey processor consistency. Previous acquires requested by a process must be completed before the process performs a data access. All previous data accesses performed by a process must be completed before the process performs a release. P1 P2 P3 Acq1(L) W1(x, a) W1(x, b) Rel1(L) Acq2(L) bR2(x) bR2(x) aR3(x) Rel2(L) Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Replicated and Migrating Data Blocks
Node 1 memory cache a b Processor Node 2 Node 3 memory cache x y Processor memory cache m n Processor x Duplicate x b m Then what if Node 2 updates x? Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Write Invalidation
Client wants to write: new copy new copy 2. Replicate block 3. Invalidate block 1. Request block a copy of block block a copy of block Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Write Update
Client wants to write: new copy 3. Update block new copy 2. Replicate block 1. Request block a copy of block block a copy of block Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Read/Write Request
Unused Read (Read a copy from the onwer) Replacement Replacement Replacement Replacement Read only Write invalidate Nil Read (Read from memory and get an ownership) Write (invalidate others if they have a copy and get an ownership) Write invalidate Write (invalidate others if they have a copy and get an ownership) Write invalidate Read-owned Writable Write (invalidate others if they have a copy) Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Locating Data –Fixed Distributed-Server Algorithms
Processor 0 Processor 1 Processor 2 Address Owner P0 1 2 P2 Address Owner 3 P1 4 P2 5 P0 Address Owner 6 P2 7 P1 8 Location search Read request Addr0 writable Addr3 read owned Addr2 read owned Addr1 read owned Addr7 writable Block replication Addr4 read owned Addr5 writable Addr6 writable Read addr2 Addr2 read only Addr8 read owned Winter, 2004 CSS490 DSM

Implementing Sequential Consistency Locating Data – Dynamic Distributed-Server Algorithms
Processor 0 Processor 1 Processor 2 Breaking the chain of nodes: When the node receives an invalidation When the node relinquishes ownership When the node forwards a fault request The node points to a new owner Address Probable P0 1 2 P2 Address Probable 3 P1 4 P2 5 P0 Address Probable 2 P2 7 P1 8 p1 Location search Read request Addr0 writable Addr3 read owned Addr2 read owned Addr2 read only Addr1 read owned Addr7 writable Block replication Addr4 read owned Addr5 writable Addr8 read owned Read addr2 Addr2 read owned Winter, 2004 CSS490 DSM

Replacement Strategy Which block to replace
Non-usage based (e.g. FIFO) Usage based (e.g. LRU) Mixed of those (e.g. Ivy ) Unused/Nil: replaced with the highest priority Read-only: the second priority Read-owned: the third priority Writable: the lowest priority and LRU used. Where to place a replaced block Invalidating a block if other nodes have a copy. Using secondary store Using the memory space of other nodes Winter, 2004 CSS490 DSM

Thrashing Thrashing: Two or more processes try to write the same shared block. An owner keeps writing its block shared by two or more reader processes. The larger a block, the more chances of false sharing that causes thrashing. Solutions: Allow a process to prevent a block from accessed from the others, using a lock. Allow a process to hold a block for a certain amount of time. Apply a different coherence algorithm to each block. What do those solutions require users to do? Are there any perfect solutions? Winter, 2004 CSS490 DSM

Paper Review by Students
IVY Dash Munin Linda/Jini JavaSpace Winter, 2004 CSS490 DSM

CSS490 Distributed Shared Memory

Similar presentations

Presentation on theme: "CSS490 Distributed Shared Memory"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CSS490 Distributed Shared Memory

Similar presentations

Presentation on theme: "CSS490 Distributed Shared Memory"— Presentation transcript:

Similar presentations

About project

Feedback