Multiprocessor
Uses a large number of processors designed for the workstation or PC market
Has an efficient medium for communication among the processors, memory, and I/O
Provides high performance
Multiprocessor Systems
Multiprocessors: multiple CPUs with shared memory
Memory access delays of about 10–50 nsec
Shared Memory Architectures
One large memory
–On the same side of the interconnect, mostly a bus
–Every memory reference has the same latency
–Uniform memory access (UMA)
Many small memories
–Local and remote memory
–Memory latency differs by location
–Non-uniform memory access (NUMA)
Multiprocessor Architecture
UMA (Uniform Memory Access)
–Time to access each memory word is the same
–Bus-based UMA: CPUs connected to memory modules through switches
NUMA (Non-Uniform Memory Access)
–Memory is distributed (partitioned among the processors)
–Different access times for local and remote accesses
Communication between system components is done over a shared medium called a bus
A UMA multiprocessor
A NUMA multiprocessor
Multiprocessing Performance
Many computations can proceed in parallel
Difficulty: the application must be broken down into small tasks that can be assigned to individual processors
Processors must communicate with each other to exchange data
Bus-Based UMA
All CPUs and memory modules are connected over a shared bus
To reduce traffic, each CPU also has a cache
Key design issue: how to maintain coherency of data that appears in multiple places?
Each CPU can also have a local memory module that is not shared with the others
Compilers can be designed to exploit the memory structure
Switched UMA
Goal: to reduce traffic on the bus, provide multiple connections between CPUs and memory units so that many accesses can be concurrent
Crossbar switch: a grid with horizontal lines from the CPUs and vertical lines from the memory modules
The crosspoint at (i, j) can connect the i-th CPU to the j-th memory module
As long as different processors access different modules, all requests can proceed in parallel
Non-blocking: waiting is caused only by contention for memory, not for the bus
Crossbar Switch
Cache Coherence
Many processors can have locally cached copies of the same object
We want to maximize concurrency: if many processors just want to read, each one can have a local copy
We want to ensure coherence: if a processor writes a value, all subsequent reads by other processors should return the latest value
Coherence refers to a logically consistent global ordering of the reads and writes of multiple processors
Example
Initial state: X = 1 in P1’s cache, in P2’s cache, and in memory
Invalidate vs. Update Protocols
P1 writes X = 3 (starting from X = 1 in both caches and in memory)
–Invalidate: P1’s cache holds X = 3; P2’s copy is invalidated
–Update: P1’s cache, P2’s cache, and memory all hold X = 3
Shared Memory Architectures
Multiple CPUs (or cores)
One memory with a global address space
All CPUs access all memory through the global address space
All CPUs can make changes to the shared memory
Changes made by one processor are visible to all other processors