Multiprocessors
So far, we have spoken at length about microprocessors. We will now study multiprocessors: how they work and which specific problems they raise.
The types of multiprocessors
Processors can be used in different ways:
- Uniprocessors: a single sequence of instructions in a single context (single-instruction, single-data, or SISD).
- Multiple, independent sequences of instructions in multiple contexts within a single system (multiple-instruction, multiple-data, or MIMD).
- A single sequence of instructions in multiple contexts (single-instruction, multiple-data, or SIMD, often used in vector processing).
- Multiple sequences of instructions in a single context (multiple-instruction, single-data, or MISD, used for redundancy in fail-safe systems and sometimes applied to describe pipelined processors or hyper-threading).
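As a rough illustration of the SISD and SIMD categories above, the following Python sketch counts how many "instructions" are needed to add two lists. It is only a toy model: the function names `sisd_add` and `simd_add` and the lane width of 4 are assumptions made for this example, and real SIMD execution happens in hardware vector units, not in Python.

```python
# Toy model of two of Flynn's categories; not real hardware behavior.

def sisd_add(a, b):
    """SISD: one instruction stream, one data element per step."""
    result = []
    steps = 0
    for x, y in zip(a, b):
        result.append(x + y)  # one add instruction handles one pair
        steps += 1
    return result, steps


def simd_add(a, b, lanes=4):
    """SIMD: one instruction operates on up to `lanes` elements per step."""
    result = []
    steps = 0
    for i in range(0, len(a), lanes):
        # One (modelled) vector instruction adds a whole chunk at once.
        chunk_a = a[i:i + lanes]
        chunk_b = b[i:i + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        steps += 1
    return result, steps


if __name__ == "__main__":
    a = list(range(8))
    b = [10] * 8
    print(sisd_add(a, b))
    print(simd_add(a, b))
```

Both functions return the same sums; only the modelled step count differs, which is the point of the SIMD category.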
Multiprocessor types used
The first multiprocessors were of the SIMD kind, and this architecture is still used for certain specialized machines.
The MIMD type now seems to be the architecture of choice for general-purpose computers:
- MIMD machines are flexible: they can be used as single-user machines or as multiprogrammed machines.
- MIMD machines can be built from existing processors.
At the center of MIMD processors: memory
MIMD machines can be divided into two classes depending on the number of processors in the machine. Ultimately, it is the memory organization that is affected:
- Centralized shared memory
- Distributed memory
Centralized shared memory
Centralized shared memory is used by machines with at most a dozen processors (as of 1995). A bus connects the processors and the memory, with the help of local caches. This type of memory organization is called Uniform Memory Access (UMA).
Distributed memory
Distributed memory is used in machines with "a lot" of processors, which together require too much bandwidth for a single memory.
Advantages of distributed memory:
- It is easier to increase memory bandwidth, provided most memory accesses are local.
- Latency is also improved when using the local memory.
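The bandwidth argument above can be made concrete with a back-of-the-envelope model. The sketch below assumes, purely for illustration, that every processor generates the same constant memory traffic and that all accesses in the distributed case are local; the function names and the numbers in the usage lines are hypothetical.

```python
def max_processors_on_shared_memory(memory_bw, per_cpu_demand):
    """Largest processor count a single memory can feed without saturating.

    Both arguments use the same unit (for example MB/s).
    """
    return int(memory_bw // per_cpu_demand)


def aggregate_bandwidth_distributed(n_nodes, local_memory_bw):
    """With one memory per node, total bandwidth grows with the node count.

    Assumes every access is local, which is the best case.
    """
    return n_nodes * local_memory_bw


if __name__ == "__main__":
    # Hypothetical figures: an 800 MB/s memory, 100 MB/s demand per processor.
    print(max_processors_on_shared_memory(800, 100))
    print(aggregate_bandwidth_distributed(64, 800))
```

In this model the single memory saturates at a fixed processor count, while the distributed organization adds bandwidth with every node, as long as the accesses stay local.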
Distributed memory models
There are two distributed memory models:
- A single address space, accessible by all processors but physically distributed among them. Such a system is said to be Non-Uniform Memory Access (NUMA), because the access time depends on the location of the addressed area (local or remote).
- Private address spaces, where each processor has exclusive access to its local memory. These systems are sometimes called multicomputers.
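To see why the location of the addressed area matters on a NUMA machine, one can compute a weighted average of local and remote access times. This is a simplified sketch: the function name `average_access_time` and the latencies in the usage lines are assumptions for illustration, and the model ignores caches and contention.

```python
def average_access_time(local_fraction, t_local, t_remote):
    """Mean memory access time when a fraction of accesses is local.

    local_fraction: share of accesses served by local memory (0.0 to 1.0).
    t_local, t_remote: access times in the same unit (for example ns).
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * t_local + (1.0 - local_fraction) * t_remote


if __name__ == "__main__":
    # Hypothetical latencies: 100 ns local, 400 ns remote.
    for fraction in (1.0, 0.9, 0.5):
        print(fraction, average_access_time(fraction, 100, 400))
```

The closer the local fraction is to 1, the closer the machine behaves to one with uniformly fast memory, which is why data placement matters on NUMA systems.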
Distributed memory models
The two models use different communication modes:
Message passing
- Processors communicate via message passing
- Processors have private memories
- Focuses attention on costly non-local operations
Shared memory
- Processors communicate by memory reads and writes
- Easy on small-scale machines
- Lower latency
- SMP or NUMA
- The kind that we will focus on today
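The two communication modes above can be contrasted with a small Python sketch that sums several chunks of numbers. It is only an analogy: Python threads always share one address space, so the message-passing variant merely imitates private memories by exchanging data through a `queue.Queue` that stands in for the interconnect. The function names are hypothetical.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Workers communicate by writing to one shared variable."""
    total = [0]                  # shared location, visible to all threads
    lock = threading.Lock()      # protects the read-modify-write

    def worker(chunk):
        partial = sum(chunk)
        with lock:
            total[0] += partial  # communication is an ordinary memory write

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def sum_message_passing(chunks):
    """Workers keep their data private and send explicit messages."""
    mailbox = queue.Queue()      # stands in for the interconnect

    def worker(chunk):
        mailbox.put(sum(chunk))  # explicit send of the partial result

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Explicit receive: exactly one message per worker.
    return sum(mailbox.get() for _ in chunks)


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5], [6]]
    print(sum_shared_memory(data), sum_message_passing(data))
```

In the shared-memory version the communication is implicit and needs a lock for correctness; in the message-passing version every transfer is visible in the code, which is what focuses attention on the costly non-local operations.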
Advantages and disadvantages of communication mechanisms
Shared memory:
- Well-known mechanism
- Easy to program (and easy to build compilers for)
- Better use of bandwidth (memory protection at the hardware level, not at the level of the operating system)
- Possibility of using caching techniques
Message passing:
- Simpler hardware
- Explicit communication, requiring the intervention of the programmer
Types of Shared-Memory Architectures
UMA (Uniform Memory Access)
- Access to all memory occurs at the same speed for all processors.
- We will focus on UMA today.
NUMA (Non-Uniform Memory Access)
- The interconnection is typically a grid or a hypercube.
- For a given processor, access to some parts of memory is faster than access to other parts.
- Harder to program, but scales to more processors.
Shared Memory Multiprocessors
Memory is shared either globally or locally, or a combination of the two.
Shared Memory Access
Uniform Memory Access (UMA) systems use a shared memory pool, where all memory takes the same amount of time to access. This quickly becomes expensive as more processors are added.
Shared Memory Access
Non-Uniform Memory Access (NUMA) systems have memory distributed across all the processors, and it takes less time for a processor to read from its own local memory than from non-local memory. They are prone to cache coherence problems, which occur when a local cache is not in sync with non-local caches holding the same data. Dealing with these problems requires extra mechanisms to ensure coherence.
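The coherence problem described above can be reproduced with a toy model of two caches in front of one memory. The sketch below is an assumption-laden simplification: the `Memory` and `Cache` classes and the `demo` function are hypothetical, writes go straight through to memory, and the "extra mechanism" shown is a simple write-invalidate scheme, which is one possible approach and not necessarily the one used by any particular machine.

```python
class Memory:
    """Backing store shared by all caches."""

    def __init__(self):
        self.data = {}


class Cache:
    """Per-processor cache with optional write-invalidate of peer copies."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}   # address -> cached value
        self.peers = []   # other caches that may hold the same address

    def read(self, addr):
        if addr not in self.lines:  # miss: fetch the value from memory
            self.lines[addr] = self.memory.data.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value, invalidate=True):
        self.lines[addr] = value
        self.memory.data[addr] = value  # write-through, for simplicity
        if invalidate:
            for peer in self.peers:
                peer.lines.pop(addr, None)  # drop the peer's stale copy


def demo(invalidate):
    """Return what processor 1 reads after processor 0 writes the value 2."""
    mem = Memory()
    mem.data[0x10] = 1
    c0, c1 = Cache(mem), Cache(mem)
    c0.peers, c1.peers = [c1], [c0]
    c1.read(0x10)                   # c1 now caches the old value 1
    c0.write(0x10, 2, invalidate)   # c0 updates the same location
    return c1.read(0x10)


if __name__ == "__main__":
    print("without invalidation:", demo(False))
    print("with invalidation:", demo(True))
```

Without invalidation, the second cache keeps serving its stale copy; with invalidation, its next read misses and fetches the updated value from memory.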
References cgi.cse.unsw.edu.au/~cs3231/06s1/lectures