Region-Based Software Distributed Shared Memory Song Li, Yu Lin, and Michael Walker CS Operating Systems May 1, 2000
Outline
  Motivation
  Problems
  DSM Models
  Proposed Solution
  Conclusion
Motivation
  Share distributed resources to increase computing power
    –First Solution: Tightly-coupled system
    –Problem: Central memory bus
    –Alternative: Loosely-coupled system
    –New Problem: Need IPC in distributed systems
Motivation: RPC
  Solution: Remote Procedure Calls (RPC)
    –Send/receive protocol
    –Pass-by-value
    –OS Assignment 1
  Problems with RPC
    –Programmer must be explicitly aware of communication
    –Marshalling complex data structures is hard
Motivation: DSM
  Solution: Distributed Shared Memory (DSM) [Li86--Yale]
    –Idea: Nonresident pages are fetched from network
    –Illusion of tightly-coupled system in a loosely-coupled one
    –Abstraction on top of message-passing model
    –Programmers use memory-accessing paradigm
DSM: Potential Benefits
  Shared access to memory
  Avoids von Neumann bottleneck
  Familiar abstraction
  Pass-by-reference
  Spreads communication load over time
  Provides more memory than is locally available
Problems
  Who handles remote access?
  What should be shared?
  Cache, cache, cache...but how?
  Page Replacement & Thrashing
  More than one way to solve the problems...
DSM Models: Decisions
  Hardware: MMU controls message passing, data migration, and caching
  Software: Control managed by OS or library
  Sharing granularity: page-based versus shared-variable versus object-based
  Cache consistency models
Proposed Solution: Overview
  Software-based model using libraries
  Variable-size regions, similar to segments
    –Every granularity has its advantages/drawbacks
    –Make the region size flexible, provide reasonable defaults (sound familiar?)
  Multiple reader, single writer (MRSW) caching support
Solution: User Interface
  r_handle sm_malloc(size);
  int sm_regionat(r_handle, attrib);
  int sm_read(r_handle, offset, buf, size);
  int sm_write(r_handle, offset, buf, size);
  int sm_regiondt(r_handle);
  int sm_free(r_handle);
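The call sequence implied by this interface can be sketched as a single-process stand-in. This is only an illustration under stated assumptions: the slide gives function names but no parameter types, so `r_handle` as `int`, `size_t` sizes and offsets, the `SM_RDONLY`/`SM_RDWR` attach attributes, the 0/-1 return convention, and the local `regions` table (standing in for memory that a provider would hold remotely) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed types and constants: the slide gives only names, not types. */
typedef int r_handle;
enum { SM_RDONLY = 1, SM_RDWR = 2 };  /* hypothetical attach attributes */
enum { MAX_REGIONS = 16 };

/* Local stand-in for a region that a provider would hold remotely. */
struct region { void *mem; size_t size; int attached; int attrib; };
static struct region regions[MAX_REGIONS];

static struct region *lookup(r_handle h) {
    if (h < 0 || h >= MAX_REGIONS || regions[h].mem == NULL) return NULL;
    return &regions[h];
}

/* Allocate a zero-filled region; returns a handle, or -1 on failure. */
r_handle sm_malloc(size_t size) {
    if (size == 0) return -1;
    for (r_handle h = 0; h < MAX_REGIONS; h++) {
        if (regions[h].mem == NULL) {
            regions[h].mem = calloc(1, size);
            if (regions[h].mem == NULL) return -1;
            regions[h].size = size;
            regions[h].attached = 0;
            return h;
        }
    }
    return -1;  /* table full */
}

/* Attach a region with the given access attribute. */
int sm_regionat(r_handle h, int attrib) {
    struct region *r = lookup(h);
    if (r == NULL || r->attached) return -1;
    r->attached = 1;
    r->attrib = attrib;
    return 0;
}

/* Copy size bytes out of the region, with bounds checking. */
int sm_read(r_handle h, size_t offset, void *buf, size_t size) {
    struct region *r = lookup(h);
    if (r == NULL || !r->attached) return -1;
    if (offset > r->size || size > r->size - offset) return -1;
    memcpy(buf, (char *)r->mem + offset, size);
    return 0;
}

/* Copy size bytes into the region; requires a read-write attachment. */
int sm_write(r_handle h, size_t offset, const void *buf, size_t size) {
    struct region *r = lookup(h);
    if (r == NULL || !r->attached || r->attrib != SM_RDWR) return -1;
    if (offset > r->size || size > r->size - offset) return -1;
    memcpy((char *)r->mem + offset, buf, size);
    return 0;
}

int sm_regiondt(r_handle h) {
    struct region *r = lookup(h);
    if (r == NULL || !r->attached) return -1;
    r->attached = 0;
    return 0;
}

/* Free a region; in this sketch it must be detached first. */
int sm_free(r_handle h) {
    struct region *r = lookup(h);
    if (r == NULL || r->attached) return -1;
    free(r->mem);
    r->mem = NULL;
    return 0;
}

/* Walks the call sequence from the slide; returns 0 if every step succeeds. */
int sm_demo(void) {
    char out[6] = {0};
    r_handle h = sm_malloc(64);
    if (h < 0) return -1;
    if (sm_regionat(h, SM_RDWR) != 0) return -1;
    if (sm_write(h, 8, "hello", 6) != 0) return -1;
    if (sm_read(h, 8, out, 6) != 0) return -1;
    if (sm_regiondt(h) != 0) return -1;
    if (sm_free(h) != 0) return -1;
    return strcmp(out, "hello") == 0 ? 0 : -1;
}
```

Calling `sm_demo()` from a `main` function exercises allocate, attach, write, read, detach, and free in order. In the real system the reads and writes would go through the client cache and the network rather than a local `memcpy`.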
Solution: Mode of Operation
  [Diagram: a Manager holding the Master Table; Providers 1..n, each holding Shared Memory; Clients 1..n, each holding a Cache and a Client Table]
  (1) Allocate memory
  (2) Query table & attach region
  (3) Send region info to client
  (4) Client communicates with provider
  (5) Client uses cached copy
  (6) Client detaches and frees memory
  (7) Memory is freed
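The manager's side of these steps might look like the bookkeeping below. It is a minimal sketch, not the authors' implementation: the slides name a Master Table but do not describe its contents, so the `master_entry` fields, the `region_info` reply, the round-robin choice of provider, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_REGIONS = 8, NUM_PROVIDERS = 3 };

/* One master-table row; the field set is an assumption, not from the slides. */
struct master_entry {
    int in_use;
    int provider;      /* which provider holds the region's memory */
    size_t size;
    int attach_count;  /* clients currently attached */
};

/* What the manager sends back to a client in step (3). */
struct region_info { int provider; size_t size; };

static struct master_entry master_table[MAX_REGIONS];
static int next_provider;  /* round-robin placement, an assumed policy */

static int valid(int h) {
    return h >= 0 && h < MAX_REGIONS && master_table[h].in_use;
}

/* Step (1): record a new region and pick a provider. Returns handle or -1. */
int manager_allocate(size_t size) {
    for (int h = 0; h < MAX_REGIONS; h++) {
        if (!master_table[h].in_use) {
            master_table[h].in_use = 1;
            master_table[h].provider = next_provider;
            master_table[h].size = size;
            master_table[h].attach_count = 0;
            next_provider = (next_provider + 1) % NUM_PROVIDERS;
            return h;
        }
    }
    return -1;  /* table full */
}

/* Steps (2)-(3): look up the region and hand its location to the client. */
int manager_attach(int h, struct region_info *out) {
    if (!valid(h)) return -1;
    master_table[h].attach_count++;
    out->provider = master_table[h].provider;
    out->size = master_table[h].size;
    return 0;
}

/* Step (6): a client detaches from the region. */
int manager_detach(int h) {
    if (!valid(h) || master_table[h].attach_count == 0) return -1;
    master_table[h].attach_count--;
    return 0;
}

/* Step (7): drop the row once no client is attached. */
int manager_free(int h) {
    if (!valid(h) || master_table[h].attach_count > 0) return -1;
    master_table[h].in_use = 0;
    return 0;
}
```

Steps (4) and (5) do not appear here because, as the diagram suggests, the client talks to the provider directly once it has the region info, and the manager stays off the data path.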
Solution: MRSW Caching Model
  Replication is good for reads, so allow MR
    –No communication overhead
  Writer requires cache invalidation, so restrict to SW
    –MW is too complicated
Solution: MRSW Caching Model...
  [Diagram: a Manager holding the Master Table; Providers 1..n, each holding Shared Memory; Clients 1..n, each holding a Cache and a Client Table; clients are marked as Readers or Writer]
  (1) Multiple readers
  (2) Write request
  (3) Cache invalidation to selected clients only
  (4) Single writer, multiple readers
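One way to realize the four steps above is a write-invalidate scheme with a per-region copy set. The sketch below assumes that design; the slides do not say how the manager tracks cached copies. The `mrsw_state` layout, the 32-client bitmask, and the function names are hypothetical. The "selected clients" of step (3) are modeled as exactly those clients whose bit is set in the copy set.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_CLIENTS = 32, NO_WRITER = -1 };

/* Per-region sharing state kept by the manager (assumed layout). */
struct mrsw_state {
    uint32_t copyset;  /* bit i set: client i holds a valid cached copy */
    int writer;        /* client holding write access, or NO_WRITER */
};

static uint32_t bit(int client) { return (uint32_t)1 << client; }

static int in_range(int client) { return client >= 0 && client < MAX_CLIENTS; }

void mrsw_init(struct mrsw_state *s) {
    s->copyset = 0;
    s->writer = NO_WRITER;
}

/* Step (1): a client fetches the region for reading and joins the copy set. */
int mrsw_read(struct mrsw_state *s, int client) {
    if (!in_range(client)) return -1;
    s->copyset |= bit(client);
    return 0;
}

/* Step (2): a client requests write access; only one writer is allowed. */
int mrsw_acquire_writer(struct mrsw_state *s, int client) {
    if (!in_range(client)) return -1;
    if (s->writer != NO_WRITER && s->writer != client) return -1;
    s->writer = client;
    return 0;
}

/* Step (3): on each write, report which cached copies must be invalidated.
 * Only clients in the copy set are selected; the writer keeps its own copy. */
int mrsw_write(struct mrsw_state *s, int client, uint32_t *to_invalidate) {
    if (!in_range(client) || s->writer != client) return -1;
    *to_invalidate = s->copyset & ~bit(client);
    s->copyset = bit(client);
    return 0;
}

/* The writer gives up write access so that another client can acquire it. */
int mrsw_release_writer(struct mrsw_state *s, int client) {
    if (s->writer != client) return -1;
    s->writer = NO_WRITER;
    return 0;
}
```

After an invalidation, a reader that calls `mrsw_read` again rejoins the copy set with a fresh copy, which gives the single writer, multiple readers state of step (4). A client that never cached the region receives no invalidation message, so the cost of a write scales with the number of actual sharers rather than with the number of clients.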
Conclusion: Current Status Basic client/manager model is implemented
Conclusion: Remaining Work
  Page replacement strategy: LRU
  Region protection mechanism
  Replicated managers for better performance and fault tolerance
  Evaluation...
Conclusion: Evaluation
  Find distributed apps to use DSM model
  Performance evaluation:
    –Test apps with DSM model and message-passing model
    –Desired result: DSM outperforms message passing