Parallel Computer Architectures

1 Parallel Computer Architectures
Chapter 8 Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006 Pearson Education, Inc. All rights reserved

2 Parallel Computer Architectures
(a) On-chip parallelism. (b) A coprocessor. (c) A multiprocessor. (d) A multicomputer. (e) A grid.

3 Instruction-Level Parallelism
(a) A CPU pipeline. (b) A sequence of VLIW instructions. (c) An instruction stream with bundles marked.

4 The TriMedia VLIW CPU (1)
A typical TriMedia instruction, showing five possible operations.

5 The TriMedia VLIW CPU (2)
The TM3260 functional units, their quantity, latency, and which instruction slots they can use.

6 The TriMedia VLIW CPU (3)
The major groups of TriMedia custom operations.

7 The TriMedia VLIW CPU (4)
(a) An array of 8-bit elements. (b) The transposed array. (c) The original array fetched into four registers. (d) The transposed array in four registers.
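
The register view in (c) and (d) can be mimicked in software. The following is a minimal sketch, assuming a 4 × 4 array and four 8-bit elements packed into each 32-bit register with the first element in the high byte. The helper names (pack_row, unpack_row, transpose_registers) are hypothetical. It shows only the data movement and does not model the TriMedia custom operations themselves.

```python
def pack_row(row):
    """Pack four 8-bit elements into one 32-bit word (first element in the high byte)."""
    assert len(row) == 4 and all(0 <= b <= 0xFF for b in row)
    word = 0
    for b in row:
        word = (word << 8) | b
    return word


def unpack_row(word):
    """Split a 32-bit word back into its four 8-bit elements."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def transpose_registers(regs):
    """Transpose a 4 x 4 byte array held in four 32-bit registers."""
    rows = [unpack_row(r) for r in regs]
    cols = [list(col) for col in zip(*rows)]  # column i becomes row i
    return [pack_row(c) for c in cols]


# Assumed example data: elements 0..15 stored row by row.
array = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
regs = [pack_row(r) for r in array]
transposed = transpose_registers(regs)
```

Unpacking each word of transposed gives the columns of the original array as rows.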

8 On-Chip Multithreading (1)
(a) – (c) Three threads. The empty boxes indicate that the thread has stalled waiting for memory. (d) Fine-grained multithreading. (e) Coarse-grained multithreading.

9 On-Chip Multithreading (2)
Multithreading with a dual-issue superscalar CPU. (a) Fine-grained multithreading. (b) Coarse-grained multithreading. (c) Simultaneous multithreading.

10 Hyperthreading on the Pentium 4
Resource sharing between threads in the Pentium 4 NetBurst microarchitecture.

11 Homogeneous Multiprocessors on a Chip
Single-chip multiprocessors. (a) A dual-pipeline chip. (b) A chip with two cores.

12 Heterogeneous Multiprocessors on a Chip (1)
The logical structure of a simple DVD player, which contains a heterogeneous multiprocessor with multiple cores for different functions.

13 Heterogeneous Multiprocessors on a Chip (2)
An example of the IBM CoreConnect architecture.

14 Introduction to Networking (1)
How users are connected to servers on the Internet.

15 Introduction to Networking (2)
A packet as it appears on the Ethernet.

16 Introduction to Network Processors
A typical network processor board and chip.

17 The Nexperia Media Processor
The Nexperia heterogeneous multiprocessor on a chip.

18 Multiprocessors (a) A multiprocessor with 16 CPUs sharing a common memory. (b) An image partitioned into 16 sections, each being analyzed by a different CPU.

19 Multicomputers (1) (a) A multicomputer with 16 CPUs, each with its own private memory. (b) The bit-map image of Fig split up among the 16 memories.

20 Multicomputers (2) Various layers where shared memory can be implemented. (a) The hardware. (b) The operating system. (c) The language runtime system.

21 Taxonomy of Parallel Computers (1)
Flynn’s taxonomy of parallel computers.

22 Taxonomy of Parallel Computers (2)
A taxonomy of parallel computers.

23 Sequential Consistency
(a) Two CPUs writing and two CPUs reading a common memory word. (b) - (d) Three possible ways the two writes and four reads might be interleaved in time.

24 Weak Consistency Weakly consistent memory uses synchronization operations to divide time into sequential epochs.

25 UMA Symmetric Multiprocessor Architectures
Three bus-based multiprocessors. (a) Without caching. (b) With caching. (c) With caching and private memories.

26 Snooping Caches The write-through cache coherence protocol.
The empty boxes indicate that no action is taken.

27 The MESI Cache Coherence Protocol

28 UMA Multiprocessors Using Crossbar Switches
(a) An 8 × 8 crossbar switch. (b) An open crosspoint. (c) A closed crosspoint.

29 UMA Multiprocessors Using Multistage Switching Networks (1)
(a) A 2 × 2 switch. (b) A message format.

30 UMA Multiprocessors Using Multistage Switching Networks (2)
An omega switching network.
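
Routing through an omega network built from 2 × 2 switches can be sketched as follows. This is a minimal illustration, assuming 8 inputs and 8 outputs (3 stages), perfect-shuffle wiring in front of every stage, and the common convention that each switch examines one destination bit, most significant first, with 0 selecting the upper output and 1 the lower output. The function name omega_route is hypothetical.

```python
def omega_route(source, dest, n_bits=3):
    """Trace a message from input line `source` to output line `dest`.

    Returns (final_line, hops), where each hop is (stage, switch, output).
    """
    size = 1 << n_bits
    assert 0 <= source < size and 0 <= dest < size
    position = source
    hops = []
    for stage in range(n_bits):
        # Perfect shuffle wiring: rotate the line number left by one bit.
        position = ((position << 1) | (position >> (n_bits - 1))) & (size - 1)
        # The switch examines one destination bit, most significant first.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        switch = position >> 1  # two adjacent lines feed one 2 x 2 switch
        position = (switch << 1) | bit  # 0 = upper output, 1 = lower output
        hops.append((stage, switch, "lower" if bit else "upper"))
    return position, hops


final_line, hops = omega_route(source=3, dest=6)
```

After the last stage the message sits on the line whose number equals the destination, whatever the source was, because each shuffle-and-switch step replaces one source bit with one destination bit.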

31 NUMA Multiprocessors A NUMA machine based on two levels of buses. The Cm* was the first multiprocessor to use this design.

32 Cache Coherent NUMA Multiprocessors
(a) A 256-node directory-based multiprocessor. (b) Division of a 32-bit memory address into fields. (c) The directory at node 36.
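
The address division in (b) can be sketched in code. The caption does not give the field widths, so the split below (8-bit node, 18-bit cache line, 6-bit offset) is an assumption chosen to be consistent with 256 nodes and a 32-bit address. The function name and the sample address are also hypothetical.

```python
# Assumed field widths; they sum to 32 bits and allow 2**8 = 256 nodes.
NODE_BITS, LINE_BITS, OFFSET_BITS = 8, 18, 6


def split_address(addr):
    """Split a 32-bit physical address into (node, line, offset)."""
    assert 0 <= addr < 1 << 32
    offset = addr & ((1 << OFFSET_BITS) - 1)
    line = (addr >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    node = addr >> (OFFSET_BITS + LINE_BITS)
    return node, line, offset


# Hypothetical address whose home is node 36.
node, line, offset = split_address(0x24000108)
```

Under these assumptions the node field selects which directory to consult, and the line field indexes the entry within that node's directory.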

33 The Sun Fire E25K NUMA Multiprocessor (1)
The Sun Microsystems E25K multiprocessor.

34 The Sun Fire E25K NUMA Multiprocessor (2)
The Sun Fire E25K uses a four-level interconnect. Dashed lines are address paths. Solid lines are data paths.

35 Message-Passing Multicomputers
A generic multicomputer.

36 Topology Various topologies. The heavy dots represent switches. The CPUs and memories are not shown. (a) A star. (b) A complete interconnect. (c) A tree. (d) A ring. (e) A grid. (f) A double torus. (g) A cube. (h) A 4D hypercube.
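
The structure of the 4D hypercube in (h) can be illustrated with a short sketch. It assumes the usual labeling in which each switch gets a 4-bit number and two switches are linked exactly when their numbers differ in one bit. The function names are hypothetical.

```python
def hypercube_neighbors(node, dims=4):
    """Return the switches directly linked to `node` (flip one bit per dimension)."""
    assert 0 <= node < 1 << dims
    return [node ^ (1 << d) for d in range(dims)]


def hop_distance(a, b):
    """Minimum number of links between two switches: the count of differing bits."""
    return bin(a ^ b).count("1")


neighbors_of_zero = hypercube_neighbors(0)
# Diameter: the largest hop distance over all pairs of the 16 switches.
diameter = max(hop_distance(a, b) for a in range(16) for b in range(16))
```

With this labeling every switch has as many links as there are dimensions, and the diameter equals the number of dimensions.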

37 The BlueGene/L custom processor chip.

38 The BlueGene/L. (a) Chip. (b) Card. (c) Board. (d) Cabinet. (e) System.

39 Packaging of the Red Storm components.

40 The Red Storm system as viewed from above.

41 A Comparison of BlueGene/L and Red Storm

42 Processing of a Google query.

43 Google (2) A typical Google cluster.

44 Scheduling Scheduling a cluster. (a) FIFO. (b) Without head-of-line blocking. (c) Tiling. The shaded areas indicate idle CPUs.

45 Distributed Shared Memory (1)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. (a) The initial situation. …

46 Distributed Shared Memory (2)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. … (b) After CPU 0 references page …

47 Distributed Shared Memory (3)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. … (c) After CPU 1 references page 10, here assumed to be a read-only page.

48 Linda Three Linda tuples.

49 A simplified ORCA stack object, with internal data and two operations.

50 Software Metrics (1) Real programs achieve less than the perfect speedup indicated by the dotted line.

51 Software Metrics (2) (a) A program has a sequential part and a parallelizable part. (b) Effect of running part of the program in parallel.
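
The split into a sequential part and a parallelizable part is the setup for Amdahl's law. The following is a minimal sketch, assuming f is the fraction of the single-CPU run time spent in the sequential part and n is the number of CPUs. If the total time on one CPU is T, the parallel run takes fT + (1 - f)T/n, which gives a speedup of n / (1 + (n - 1)f). The function name is hypothetical.

```python
def amdahl_speedup(f, n):
    """Speedup on n CPUs when a fraction f of the work is inherently sequential."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return n / (1 + (n - 1) * f)


# Assumed example: 10% sequential code on 10 CPUs.
speedup = amdahl_speedup(0.1, 10)
```

With f = 0 the speedup is the perfect value n. With any f > 0 it stays below 1/f no matter how many CPUs are added.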

52 Achieving High Performance
(a) A 4-CPU bus-based system. (b) A 16-CPU bus-based system. (c) A 4-CPU grid-based system. (d) A 16-CPU grid-based system.

53 Grid Computing The grid layers.

