Distributed Shared Memory Systems and Programming

1 Distributed Shared Memory Systems and Programming
By: Kenzie MacNeil. Adapted from Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers by Barry Wilkinson and Michael Allen

2 Distributed Shared Memory Systems
Shared memory programming model on a cluster Has physically distributed and separate memory Programming viewpoint: memory is grouped together and sharable between processes Known as Distributed Shared Memory (DSM)

3 Distributed Shared Memory Systems
Can be achieved by software or hardware Software DSM: easy to use on clusters, but slower than explicit message passing on the same cluster Utilizes the same techniques as true shared memory systems (Chapter 8)

4 Distributed Shared Memory
Shared memory programming is generally more convenient than message passing Data can be accessed by individual processors without explicitly sending it Access to shared data has to be controlled, using locks or other means Both message passing and shared memory often require synchronization

5 Distributed Shared Memory
Distributed Shared Memory makes a group of interconnected computers appear to have a single memory with a single address space Each computer has its own, physically distributed memory Any memory location can be accessed by any processor in the cluster, regardless of whether the memory resides locally

6 Distributed Shared Memory

7 Advantages of DSM Normal shared memory programming techniques can be used Easily scalable, compared to traditional bus-connected shared memory multiprocessors Message passing is hidden from the user Can handle complex and large databases without replication or sending the data to processes

8 Disadvantages of DSM Lower performance than true shared memory multiprocessor systems Must provide protection against simultaneous access to shared data (locks, etc.) Little programmer control over the actual messages being generated Incurs performance penalties compared to message passing routines on the same cluster

9 Hardware DSM Systems Special network interfaces and cache coherence circuits are required Several network interfaces exist that support shared memory operations Higher level of performance, but more expensive

10 Software DSM Systems Requires no hardware changes
Performed by software routines A software layer is added between the operating system and the applications The kernel may or may not be modified The software layer can be: Page based Shared variable based Object based

11 Page Based DSM Existing virtual memory is used to instigate movement of data between computers Occurs when the page referenced does not reside locally Referred to as a virtual shared memory system Page based systems include: the first DSM system by Li (1986), TreadMarks (1996), Locust (1998)

12 Page Based DSM System

13 Page Based DSM Disadvantages
Size of the unit of data, a page, can be too big More than the specific data is usually transferred, leading to longer messages Not portable, because they are tied to particular virtual memory hardware and software False sharing effects appear at the page level: different parts of a page are required by different processors without any actual sharing of information, yet the whole page must be moved between the processors for each of them to access its own part
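As an illustration of false sharing (a hypothetical C fragment, not from the slides), two unrelated variables that happen to lie on the same page cause the whole page to move back and forth even though no data is actually shared:

    /* x and y are independent, but if they fall on the same page a
       page-based DSM system moves or invalidates the whole page
       whenever either processor writes its own variable. */
    int x;   /* written only by processor 0 */
    int y;   /* written only by processor 1, possibly on the same page as x */

    void worker(int my_id)
    {
        for (int i = 0; i < 100000; i++) {
            if (my_id == 0)
                x++;   /* may invalidate processor 1's copy of the page */
            else
                y++;   /* may invalidate processor 0's copy of the page */
        }
    }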

14 Shared Variable DSM Only variables declared as shared are transferred
Transferred on demand The paging mechanism is not used; software routines perform the actions Shared variable DSM systems include: Munin (1990), JIAJIA (1999), Adsmith (1996)

15 Object Based DSM Shared data is embodied in objects
Includes data items and procedures/methods Methods used to access data Similar to shared variable approach, even considered an extension Easily implemented in OO languages

16 Managing Shared Data There are many ways a processor can be given access to shared data The simplest is the use of a central server responsible for all read and write operations on the shared data Requests are sent to this server and handled sequentially This implements a single reader/single writer policy

17 Managing Shared Data A single reader/single writer policy incurs a bottleneck
Additional servers can be added to relieve this bottleneck, each handling a share of the variables However, keeping multiple copies of the data is preferable It allows simultaneous access to the data by different processors A coherence policy must be used to maintain these copies

18 Multiple Reader / Single Writer
Allows multiple processors to read shared data Which can be achieved by replicating data Allows only one processor, the owner, to alter data at any instant When an owner alters data two policies are available: Update policy Invalidate policy

19 Multiple Reader/Single Writer Policy
Update policy Utilizes broadcast: all copies are altered to reflect the broadcast message Invalidate policy All unaltered copies of the data are flagged as invalid A processor must then request the data from the owner before using its copy Any copies of the data that are not accessed simply remain invalid Both policies must be implemented reliably
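A minimal sketch of the two policies in C, assuming a hypothetical owner that tracks which processors hold copies (copy_holders, send_update, and send_invalidate are illustrative names, not part of any particular DSM system):

    #include <stdbool.h>

    #define MAX_PROCS 64

    void send_update(int proc, int new_value);   /* assumed messaging routines */
    void send_invalidate(int proc);

    bool copy_holders[MAX_PROCS];   /* which processors hold a copy */
    int  num_procs;

    /* Called by the owner after it has altered the shared variable. */
    void owner_write(int new_value, bool use_update_policy)
    {
        for (int p = 0; p < num_procs; p++) {
            if (!copy_holders[p])
                continue;
            if (use_update_policy) {
                send_update(p, new_value);   /* update: broadcast the new value  */
            } else {
                send_invalidate(p);          /* invalidate: flag the remote copy */
                copy_holders[p] = false;     /* holder must re-request the data  */
            }
        }
    }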

20 Multiple Reader/Single Writer Policy
Page based approach The complete page holding the variable is transferred A variable stored on the same page that is not actually shared will be moved or invalidated along with it Systems such as TreadMarks offer multiple-writer protocols that allow more than one processor to write to a single page

21 Achieving Consistent Memory in DSM
Memory consistency addresses when the current value of a shared variable is seen by other processors Various models are available: Strict Consistency Sequential Consistency Relaxed Consistency Weak Consistency Release Consistency Lazy Release Consistency

22 Strict Consistency A read returns the value of the most recent write to the shared variable As soon as a variable is altered, all other processors are informed Can be done by update or invalidate Disadvantage is the large number of messages, and changes are not instantaneous With relaxed memory consistency, writes are delayed to reduce message passing

23 Strict Consistency

24 Sequential and Weak Consistency
Sequential consistency: the result of any execution is the same as some interleaving of the individual programs Weak consistency: synchronization operations are used by the programmer to enforce sequential consistency Any accesses to shared data can be controlled with synchronization operations (locks, etc.)

25 Release Consistency Extension of weak consistency
Specifies two synchronization operations Acquire operation: used before a shared variable or variables are to be read Release operation: used after the shared variable or variables have been altered Acquire is performed with a lock operation Release is performed with an unlock operation
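For example, a sketch in C of the acquire/release pattern, assuming hypothetical dsm_lock/dsm_unlock routines that implement the acquire and release operations:

    void dsm_lock(int lock_id);     /* acquire: performed before accessing shared data */
    void dsm_unlock(int lock_id);   /* release: performed after altering shared data   */

    extern int shared_total;        /* a variable declared as shared elsewhere */

    void add_local_result(int local_result)
    {
        dsm_lock(0);                    /* acquire */
        shared_total += local_result;   /* protected update of the shared variable */
        dsm_unlock(0);                  /* release: changes made visible according
                                           to the release consistency policy */
    }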

26 Release Consistency

27 Lazy Release Consistency
Version of release consistency The update is only done at the time of acquire rather than at release Generates fewer messages than release consistency

28 Lazy Release Consistency

29 Distributed Shared Memory Programming Primitives
Four fundamental and necessary operations of shared memory programming: Process/thread creation and termination Shared data creation Mutual exclusion synchronization (controlled access to shared data) Process/thread and event synchronization Typically provided by user-level library calls

30 Process Creation A set of routines is defined by DSM systems
Such as Adsmith and TreadMarks Used to start a new process, if process creation is supported, e.g.: dsm_spawn(filename, num_processes);
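A short usage sketch, following the call shown above (the dsm_spawn signature and the master/worker split are assumptions for illustration):

    void dsm_spawn(const char *filename, int num_processes);   /* assumed signature */

    int main(void)
    {
        dsm_spawn("worker", 4);   /* start four copies of the program "worker" */
        /* ... master code; typically followed by a routine that waits for
           the workers to terminate ... */
        return 0;
    }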

31 Shared Data Creation A routine is necessary to declare shared data
dsm_shared(&x); or shared int x; Dynamically creates memory space for shared data, in the manner of a C malloc The memory space can be discarded once it is no longer needed
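A hedged sketch of the two declaration styles mentioned above (the exact syntax differs between DSM systems):

    void dsm_shared(void *addr);   /* assumed signature for the library-call style */

    int x;                         /* ordinary variable ... */

    void setup(void)
    {
        dsm_shared(&x);            /* ... now marked as shared between DSM processes */
    }

    /* Some systems instead extend the language with a qualifier, e.g.
           shared int x;
       Space for shared data is created dynamically, much like a C malloc,
       and can be released again once the data is no longer needed. */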

32 Shared Data Access Various forms of data access are provided, depending on the memory consistency model used Some systems provide efficient routines for different classes of accesses Adsmith provides three types of accesses: Ordinary Accesses Synchronization Accesses Non-Synchronization Accesses

33 Synchronization Accesses
Two principal forms: global synchronization and process-process pair synchronization Global synchronization is usually done through barrier routines Process-process pair synchronization can be done by the same routine or by separate routines built on simple synchronous send/receive operations DSM systems could also provide their own routines
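A sketch of global synchronization through a barrier, using an assumed dsm_barrier routine (pair synchronization could similarly use synchronous send/receive calls):

    void dsm_barrier(int barrier_id);   /* assumed: block until all processes arrive */
    void compute_local_part(void);
    void read_shared_results(void);

    void phase(void)
    {
        compute_local_part();    /* each process works on its own part of the data   */
        dsm_barrier(0);          /* global synchronization: wait for every process   */
        read_shared_results();   /* now safe to read data written by other processes */
    }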

34 Overlapping Computations with Communications
Can be provided by starting a nonblocking communication before its results are needed Called a prefetch routine The program continues execution after the prefetch has been called, while the data is being fetched Could even be done speculatively A special mechanism must be in place to handle memory exceptions Similar to the speculative load mechanisms used in advanced processors to overlap memory operations with program execution
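A sketch of overlapping communication with computation through an assumed dsm_prefetch routine (name and semantics are illustrative):

    void dsm_prefetch(void *addr);      /* assumed: nonblocking fetch of remote shared data */
    void independent_work(void);
    void use_value(int v);

    extern int shared_table[1024];      /* shared data that may reside on another computer */

    void overlap(void)
    {
        dsm_prefetch(&shared_table[0]);   /* start the fetch early, do not wait for it */
        independent_work();               /* computation that does not need the table  */
        use_value(shared_table[0]);       /* by now the data has (hopefully) arrived;
                                             the DSM layer blocks here if it has not  */
    }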

35 Distributed Shared Memory Programming
DSM programming on a cluster uses the same concepts as shared memory programming on a shared memory multiprocessor system Uses user level library routines or methods Message passing is hidden from the user

36 Basic Shared-Variable Implementation
The simplest DSM implementation is to use a shared variable approach with user-level DSM library routines Sits on top of an existing message passing system, such as MPI The routines can be embodied into classes and methods The routines could send messages to a central location that is responsible for the shared variables, as sketched below
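A minimal sketch of such routines layered on MPI, with a central server at rank 0 holding the shared variables (the message tags, request encoding, and variable table are assumptions, not a particular system's protocol):

    #include <mpi.h>

    #define TAG_READ  1
    #define TAG_WRITE 2

    /* Client side: every access is forwarded to the server (rank 0). */
    int dsm_read(int var_id)
    {
        int value;
        MPI_Send(&var_id, 1, MPI_INT, 0, TAG_READ, MPI_COMM_WORLD);
        MPI_Recv(&value, 1, MPI_INT, 0, TAG_READ, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        return value;
    }

    void dsm_write(int var_id, int value)
    {
        int msg[2] = { var_id, value };
        MPI_Send(msg, 2, MPI_INT, 0, TAG_WRITE, MPI_COMM_WORLD);
    }

    /* Server loop on rank 0: requests are handled one at a time, which is
       what makes this a single reader/single writer scheme. */
    void dsm_server(void)
    {
        int vars[256] = { 0 };   /* table of shared variables, indexed by var_id */
        MPI_Status st;
        for (;;) {
            int msg[2];
            MPI_Recv(msg, 2, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_READ)
                MPI_Send(&vars[msg[0]], 1, MPI_INT, st.MPI_SOURCE,
                         TAG_READ, MPI_COMM_WORLD);
            else   /* TAG_WRITE */
                vars[msg[0]] = msg[1];
        }
    }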

37 Simple DSM System using a Centralized Server
Single reader/writer protocol

38 Basic Shared-Variable Implementation
A simple DSM system using a centralized server can easily result in a bottleneck One method to reduce this bottleneck is to have multiple servers running on different processors Each server is responsible for a specific set of shared variables This is still a single reader/single writer protocol; a simple way to map variables to servers is sketched below
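One simple (assumed) way to partition the variables is to map each variable identifier to the rank of the server that owns it; clients then send their read/write requests to that rank instead of a single fixed server:

    /* Hypothetical mapping of a shared-variable id to its owning server,
       assuming the first num_servers ranks act as servers. */
    int server_for(int var_id, int num_servers)
    {
        return var_id % num_servers;
    }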

39 Simple DSM System using Multiple Servers

40 Basic Shared-Variable Implementation
Can also provide multiple reader capability A specific server remains responsible for each shared variable When the variable is written, the other local copies are invalidated

41 Simple DSM System using Multiple Servers and Multiple Reader Policy

42 Overlapping Data Groups
Shared data can be grouped according to the existing interconnection structure or the access patterns of the application Static overlapping: groups are defined by the programmer prior to execution Alternatively, shared variables can migrate according to usage

43 Symmetrical Multiprocessor System with Overlapping Data Regions

44 Simple DSM System using Multiple Servers and Multiple Reader Policy

45 Questions or Comments?

