Chapter 6 Multiprocessor System
Introduction
Each processor in a multiprocessor system can be executing a different instruction at any time.
The major advantages of MIMD systems:
– Reliability
– High performance
The overheads involved with MIMD:
– Communication between processors
– Synchronization of the work
– Wasted processor time when any processor runs out of work to do
– Processor scheduling
Introduction (continued)
Task
– An entity to which a processor is assigned
– A program, a function, or a procedure in execution
Process
– Another word for a task
Processor (or processing element)
– The hardware resource on which tasks are executed
Introduction (continued)
Thread
– The sequence of tasks performed in succession by a given processor
– The path of execution of a processor through a number of tasks
– Multiprocessors provide for the simultaneous presence of a number of threads of execution in an application
– Refer to Example 6.1 (degree of parallelism = 3)
R-to-C ratio
A measure of how much overhead is produced per unit of computation.
– R: the run time of the task (the computation time)
– C: the communication overhead
This ratio signifies task granularity.
A high R-to-C ratio implies that communication overhead is insignificant compared to computation time.
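As a rough illustration, the ratio can be computed directly; the timings and the coarse/fine threshold below are made-up assumptions, not values from the text:

```python
# Hypothetical illustration of the R-to-C ratio; all numbers are made up.
def r_to_c_ratio(run_time, comm_overhead):
    """Return R/C: units of computation per unit of communication overhead."""
    return run_time / comm_overhead

def classify_granularity(ratio, threshold=10.0):
    """Label a task coarse- or fine-grained (the threshold is an assumption)."""
    return "coarse grain" if ratio >= threshold else "fine grain"

ratio = r_to_c_ratio(run_time=100.0, comm_overhead=5.0)
print(ratio)                         # 20.0
print(classify_granularity(ratio))   # coarse grain
```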
Task granularity
– Coarse-grain parallelism: high R-to-C ratio
– Fine-grain parallelism: low R-to-C ratio
The general tendency toward maximum performance is to resort to the finest possible granularity, providing the highest degree of parallelism.
Maximum parallelism does not, however, lead to maximum performance, because communication overhead also grows; a trade-off is required to reach an optimum level.
6.1 MIMD Organization (Figure 6.2)
Two popular MIMD organizations:
– Shared-memory (or tightly coupled) architecture
– Message-passing (or loosely coupled) architecture
Shared-memory architecture:
– UMA (uniform memory access)
– Rapid memory access
– Memory contention
6.1 MIMD Organization (continued)
Message-passing architecture:
– Distributed-memory MIMD system
– NUMA (nonuniform memory access)
– Heavy communication overhead for remote memory access
– No memory contention problem
Other models:
– Hybrids of the two
6.2 Memory Organization
Two parameters of interest in MIMD memory system design: bandwidth and latency.
Memory latency is reduced by increasing the memory bandwidth:
– By building the memory system with multiple independent memory modules (banked and interleaved memory architectures)
– By reducing the memory access and cycle times
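A minimal sketch of low-order interleaving, one common way to build such a multi-module memory; the module count and addresses are illustrative assumptions:

```python
# Sketch of low-order address interleaving across M independent memory
# modules (M and the addresses below are illustrative assumptions).
M = 4  # number of memory modules

def module_and_offset(addr, modules=M):
    """Map a word address to (module index, offset within that module)."""
    return addr % modules, addr // modules

# Consecutive addresses land in different modules, so a run of sequential
# accesses can be serviced by the modules in parallel.
for addr in range(8):
    print(addr, module_and_offset(addr))
```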
Multi-port memories (Figure 6.3(b))
– Each memory module is a three-port memory device.
– All three ports can be active simultaneously.
– The only restriction is that only one port at a time can write data into a given memory location.
Cache incoherence
The problem wherein the value of a data item is not consistent throughout the memory system.
– Write-through: a processor updates the cache and also the corresponding entry in the main memory.
  – Updating protocol
  – Invalidating protocol
– Write-back: an updated cache block is written back to the main memory just before that block is replaced in the cache.
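A toy sketch contrasting the two write policies; the classes and their methods are hypothetical illustrations, not a real coherence protocol:

```python
# Toy contrast of write-through vs. write-back (hypothetical classes,
# not a real protocol): when does main memory see a STORE?
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
    def store(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value   # main memory updated on every STORE

class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()
    def store(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)        # memory updated only when block leaves cache
    def evict(self, addr):
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)

mem = {0: 1}
wt, wb = WriteThroughCache(dict(mem)), WriteBackCache(dict(mem))
wt.store(0, 7); wb.store(0, 7)
print(wt.memory[0], wb.memory[0])   # 7 1  (write-back memory is stale)
wb.evict(0)
print(wb.memory[0])                 # 7
```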
6.2 Memory Organization (continued)
Cache coherence schemes:
– Do not use private caches (Figure 6.4)
– Use a private-cache architecture, but cache only non-sharable data items
– Cache flushing: shared data are allowed to be cached only when it is known that only one processor will be accessing the data
6.2 Memory Organization (continued)
Cache coherence schemes (continued):
– Bus watching (or bus snooping) (Figure 6.5): incorporates hardware into each processor's cache controller that monitors the shared bus for LOAD and STORE operations.
– Write-once: the first STORE causes a write-through to the main memory.
– Ownership protocol
6.3 Interconnection Network
Bus (Figure 6.6)
– Bus window (Figure 6.7(a))
– Fat tree (Figure 6.7(b))
Loop or ring
– Token-ring standard
Mesh
6.3 Interconnection Network (continued)
Hypercube
– Routing is straightforward.
– The number of nodes must be increased in powers of two.
Crossbar
– Offers multiple simultaneous communications, but at high hardware complexity.
Multistage switching networks
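Hypercube routing is straightforward because a message only needs to correct, one dimension at a time, the address bits in which source and destination differ. A sketch of this dimension-ordered (e-cube) routing, with illustrative node labels:

```python
# Sketch of dimension-ordered (e-cube) routing in a hypercube: flip the
# differing address bits of src one at a time until dst is reached.
def hypercube_route(src, dst):
    """Return the list of node addresses visited from src to dst."""
    path = [src]
    diff = src ^ dst            # bits in which the two addresses differ
    node = src
    bit = 0
    while diff:
        if diff & 1:
            node ^= (1 << bit)  # move across one differing dimension
            path.append(node)
        diff >>= 1
        bit += 1
    return path

print(hypercube_route(0b000, 0b101))  # [0, 1, 5]
```

The number of hops equals the Hamming distance between the two addresses, at most log2(N) for an N-node cube.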
6.4 Operating System Considerations
The major functions of the multiprocessor operating system:
– Keeping track of the status of all the resources at all times
– Assigning tasks to processors in a justifiable manner
– Spawning and creating new processes such that they can be executed in parallel or independently of each other
– Collecting their individual results when all the spawned processes are completed, and passing them to other processes as required
6.4 Operating System Considerations (continued)
Synchronization mechanisms
– Processes in an MIMD system operate in a cooperative manner, and a sequence control mechanism is needed to ensure the ordering of operations.
– Processes compete with each other to gain access to shared data items, so an access control mechanism is needed to maintain orderly access.
6.4 Operating System Considerations (continued)
Synchronization mechanisms (continued)
– The most primitive synchronization techniques:
  Test & set
  Semaphores
  Barrier synchronization
  Fetch & add
– Heavy-weight processes and light-weight processes
Scheduling
– Static
– Dynamic: load balancing
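As one illustration, a test-and-set spin lock can be sketched as follows. CPython exposes no atomic test-and-set instruction, so an inner `threading.Lock` stands in for it; the whole sketch is illustrative, and production Python code would simply use `threading.Lock` directly:

```python
# Illustrative test-and-set spin lock. The inner threading.Lock emulates
# the atomic test&set instruction that real hardware would provide.
import threading

class TestAndSetLock:
    def __init__(self):
        self._flag = False
        self._guard = threading.Lock()
    def test_and_set(self):
        with self._guard:           # atomically read old value, set flag True
            old = self._flag
            self._flag = True
            return old
    def acquire(self):
        while self.test_and_set():  # spin until the old value was False
            pass
    def release(self):
        self._flag = False

counter = 0
lock = TestAndSetLock()

def work():
    global counter
    for _ in range(10_000):
        lock.acquire()
        counter += 1                # critical section
        lock.release()

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 40000
```

The standard library also provides `threading.Semaphore` and `threading.Barrier` for the other two primitives listed above.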
6.5 Programming
Four main structures of parallel programming:
– Parbegin / parend
– Fork / join
– Doall
– Declarations of processes, tasks, procedures, and so on for parallel execution
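These constructs can be approximated in Python (a language choice assumed here, not prescribed by the text) with `concurrent.futures`: `submit`/`result` behave like fork/join, and `map` over independent iterations behaves like a doall:

```python
# Sketch mapping two of the constructs onto Python's concurrent.futures:
# fork/join via submit()/result(), doall via map() over independent work.
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    # fork / join: spawn a task, continue, then wait for its result
    handle = pool.submit(square, 6)   # fork
    print(handle.result())            # join -> 36

    # doall: all iterations are independent and may run in parallel
    results = list(pool.map(square, range(5)))
    print(results)                    # [0, 1, 4, 9, 16]
```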
6.6 Performance Evaluation and Scalability
Performance evaluation
– Speed-up: S = Ts / Tp
– Total overhead: To = p·Tp − Ts, so Tp = (To + Ts) / p
– Hence S = Ts·p / (To + Ts)
– Efficiency: E = S / p = Ts / (Ts + To) = 1 / (1 + To/Ts)
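Plugging made-up timings into the formulas above (the values of Ts, Tp, and p are assumptions for illustration only):

```python
# Worked example of the speed-up / efficiency formulas; Ts, Tp, p are
# made-up numbers, not measurements from the text.
def speedup(ts, tp):
    return ts / tp                  # S = Ts / Tp

def total_overhead(ts, tp, p):
    return p * tp - ts              # To = p*Tp - Ts

def efficiency(ts, tp, p):
    return speedup(ts, tp) / p      # E = S / p = Ts / (Ts + To)

ts, tp, p = 100.0, 30.0, 4
print(speedup(ts, tp))              # about 3.33
print(total_overhead(ts, tp, p))    # 20.0
print(efficiency(ts, tp, p))        # about 0.83, i.e. 100 / (100 + 20)
```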
Scalability
Scalability: the ability to increase speed-up as the number of processors increases.
A parallel system is scalable if its efficiency can be maintained at a fixed value by increasing the number of processors as the problem size increases.
– Time-constrained scaling
– Memory-constrained scaling
Isoefficiency function
E = 1 / (1 + To/Ts), so To/Ts = (1 − E)/E and hence Ts = E·To / (1 − E).
For a given value of E, E/(1 − E) is a constant K. Then Ts = K·To (the isoefficiency function).
A small isoefficiency function indicates that small increments in problem size are sufficient to maintain efficiency when p is increased.
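A one-liner for the constant K in Ts = K·To; the target efficiency values below are assumptions for illustration:

```python
# K = E / (1 - E): how fast serial work Ts must grow relative to the
# overhead To to hold efficiency at E. E = 0.8 and 0.5 are made-up targets.
def isoefficiency_constant(e):
    return e / (1.0 - e)

print(isoefficiency_constant(0.8))  # about 4: Ts must grow as 4x the overhead
print(isoefficiency_constant(0.5))  # 1.0
```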
6.6 Performance Evaluation and Scalability (continued)
Performance models
– The basic model: each task is equal and takes R time units to execute on a processor. If two tasks on different processors wish to communicate with each other, they do so at a cost of C time units.
– Model with linear communication overhead
– Model with overlapped communication
– Stochastic model
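The basic model can be turned into a small cost estimator. The communication pattern assumed below (each processor exchanges once with every other) is an illustrative assumption, not the book's exact formulation:

```python
# Cost estimator for the basic model: n equal tasks of R time units each,
# spread over p processors; each cross-processor exchange costs C time
# units. The all-to-all exchange count is an assumption for illustration.
import math

def basic_model_time(n, p, R, C):
    compute = math.ceil(n / p) * R            # tasks split evenly, last round padded
    communicate = (p - 1) * C if p > 1 else 0 # one exchange with each other processor
    return compute + communicate

print(basic_model_time(n=16, p=1, R=10, C=2))  # 160
print(basic_model_time(n=16, p=4, R=10, C=2))  # 46
```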
Examples
Alliant FX series (Figure 6.17)
– Parallelism:
  Instruction level
  Loop level
  Task level