CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies.


1 CS668- Lecture 2 - Sept. 30 Today’s topics Parallel Architectures (Chapter 2) Memory Hierarchy Busses and Switched Networks Interconnection Network Topologies Multiprocessors / Multicomputers Flynn’s Taxonomy Analysis of Interconnection Networks

2 Theoretic Computer Architectures Turing Machine Von Neumann Architecture Fetch/Execute Cycle Memory Models RAM model PRAM model extension Shared Memory vs. Distributed Shared Memory vs. Distributed Memory

3 Processors and the Memory Hierarchy: Registers (1 clock cycle, 100s of bytes); 1st-level cache (3-5 clock cycles, 100s of KBytes); 2nd-level cache (~10 clock cycles, MBytes); Main memory (~100 clock cycles, GBytes); Disk (milliseconds, 100 GB and up). [Figure: CPU with registers, separate 1st-level instruction and data caches, and a unified 2nd-level cache (instructions and data)]

4 IBM Dual Core From Intel® 64 and IA-32 Architectures Optimization Reference Manual http://www.intel.com/design/processor/manuals/248966.pdf

5 Shared Memory Multiprocessor One or more memories Global address space (all system memory visible to all processors) Transfer of data between processors is usually implicit: just read from (write to) a given address (OpenMP) Complex cache-coherency protocols maintain consistency between processors. [Figure: Uniform-memory-access (UMA) shared-memory system, with CPUs and memories connected through an interconnection network]

6 Distributed Shared Memory Single address space with implicit communication Hardware support for read/write to non-local memories, cache coherency Latency for a memory operation is greater when accessing non-local data than when accessing data within a CPU’s own memory. [Figure: Non-uniform-memory-access (NUMA) shared-memory system, with CPU/memory pairs connected through an interconnection network]

7 Distributed Memory / Message Passing Each processor has access to its own memory only Data transfer between processors is explicit: the user calls message-passing functions Common libraries for message passing: MPI, PVM User has complete control of, and responsibility for, data placement and management. [Figure: CPU/memory pairs connected through an interconnection network]

8 Hybrid Systems Distributed memory system with multiprocessor shared memory nodes. Most common architecture for current generation of parallel machines. [Figure: several nodes, each with multiple CPUs sharing one memory and a network interface, connected through an interconnection network]

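9 Flynn’s Taxonomy (figure 2.20 from Quinn)
–SISD (single instruction stream, single data stream): Uniprocessor
–SIMD (single instruction stream, multiple data streams): Processor arrays, Pipelined vector processors
–MISD (multiple instruction streams, single data stream): Systolic array
–MIMD (multiple instruction streams, multiple data streams): Multiprocessors, Multicomputers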

10 Analysis of Switch Network Topologies View switched network as a graph – n - Vertices = processors or switches – m - Edges = communication paths Two kinds of topologies –Direct - ratio of switches to processors 1:1 –Indirect - ratio is d:1

11 Evaluating Switch Topologies Diameter Bisection width Number of edges / node (d = degree) Constant edge length? (yes/no) –Layout area/wire length
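
As a minimal sketch of the graph view above, the diameter and the maximum number of edges per node can be measured with breadth-first search on an adjacency list. The function names and the 6-switch ring used as input are illustrative assumptions, not part of the lecture.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return the hop count from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter(adj):
    """Largest shortest-path distance between any two vertices."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)

def max_degree(adj):
    """Largest number of edges incident on a single vertex."""
    return max(len(neighbors) for neighbors in adj.values())

# Hypothetical example network: a ring of 6 switches.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(diameter(ring), max_degree(ring))
```

The same two functions can be applied to any of the topologies that follow once their edges are written as an adjacency list.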

12 2-D Mesh Network Direct topology Switches arranged into a 2-D lattice Communication allowed only between neighboring switches Variants allow wraparound connections between switches on edge of mesh

13 2-D Meshes

14 Evaluating 2-D Meshes Diameter: Θ(√n) Bisection width: Θ(√n) Number of edges per switch: 4 Constant edge length? Yes
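
The Θ(√n) figures can be checked on a small case. The sketch below assumes a square mesh without wraparound, with `side` switches per row (so n = side × side) and an even `side`; the helper names are hypothetical.

```python
def mesh_neighbors(r, c, side):
    """Neighbors of switch (r, c) in a side x side mesh without wraparound."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < side and 0 <= j < side]

def mesh_diameter(side):
    """Opposite corners are the farthest pair: side - 1 hops in each dimension."""
    return 2 * (side - 1)

def mesh_bisection_width(side):
    """Count edges crossing the cut between the two middle columns."""
    half = side // 2
    return sum(
        1
        for r in range(side)
        for c in range(side)
        for (_, j) in mesh_neighbors(r, c, side)
        if c < half <= j
    )

# n = 16 switches arranged 4 x 4
print(mesh_diameter(4), mesh_bisection_width(4))  # prints "6 4"
```

Both values grow in proportion to `side`, that is, to √n. Interior switches have 4 neighbors; switches on the edge have fewer unless wraparound links are added.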

15 Binary Tree Network Indirect topology n = 2^d processor nodes, n-1 switches

16 Evaluating Binary Tree Network Diameter: 2 log n Bisection width: 1 Edges / node: 3 Constant edge length? Yes/No?
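
A short sketch of where the 2 log n diameter comes from. It assumes heap-style numbering, which is not in the slides: switches are numbered 1 to n-1 with the root at 1, processors are the leaves numbered n to 2n-1, and the parent of node i is i // 2.

```python
def tree_distance(a, b):
    """Hops between two nodes of a heap-numbered binary tree."""
    hops = 0
    while a != b:
        # The larger index is never closer to the root, so move it up.
        if a > b:
            a //= 2
        else:
            b //= 2
        hops += 1
    return hops

n = 8                               # n = 2^d processors, d = 3
processors = range(n, 2 * n)        # leaves 8..15
worst = max(tree_distance(a, b) for a in processors for b in processors)
print(worst)                        # prints "6", which is 2 * log2(8)
```

The worst case is a pair of processors in different halves of the tree: the message climbs log n levels to the root and descends log n levels. Every such message crosses the root, which is why the bisection width is only 1.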

17 Hypertree Network Indirect topology Shares low diameter of binary tree Greatly improves bisection width From “front” looks like k-ary tree of height d From “side” looks like upside down binary tree of height d

18 Hypertree Network

19 Evaluating 4-ary Hypertree Diameter: log n Bisection width: n / 2 Edges / node: 6 Constant edge length? No

20 Butterfly Network Indirect topology n = 2^d processor nodes connected by n(log n + 1) switching nodes

21 Butterfly Network Routing

22 Evaluating Butterfly Network Diameter: log n Bisection width: n / 2 Edges per node: 4 Constant edge length? No
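
The routing figure is not reproduced in this transcript, so the following is only a sketch under an assumed wiring: switches sit in ranks 0 to d with n columns each, and a switch in one rank links to the same column in the next rank and to the column that differs in one address bit (the most significant bit first). Under that assumption a message is routed by fixing one destination bit per rank.

```python
def butterfly_route(src, dst, d):
    """Switches (rank, column) visited from rank 0, column src to rank d, column dst."""
    col = src
    path = [(0, col)]
    for rank in range(1, d + 1):
        bit = 1 << (d - rank)              # rank-th most significant bit
        col = (col & ~bit) | (dst & bit)   # go straight if the bit matches, else cross
        path.append((rank, col))
    return path

d = 3                                      # n = 2^3 = 8 columns
print(butterfly_route(2, 5, d))            # [(0, 2), (1, 6), (2, 4), (3, 5)]
print((2 ** d) * (d + 1))                  # 32 switching nodes, n(log n + 1)
```

Every route takes exactly d = log n hops between ranks, matching the diameter given above.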

23 Hypercube Direct topology 2 x 2 x … x 2 mesh Number of nodes a power of 2 Node addresses 0, 1, …, 2^k - 1 Node i connected to k nodes whose addresses differ from i in exactly one bit position

24 Hypercube Addressing

25 Evaluating Hypercube Network Diameter: log n Bisection width: n / 2 Edges per node: log n Constant edge length? No
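
The addressing rule translates directly into bit operations. In the sketch below, `hypercube_neighbors` follows the rule as stated on the slide; `hypercube_route`, which corrects differing bits from least to most significant, is one simple routing order chosen for illustration and is not taken from the lecture.

```python
def hypercube_neighbors(i, k):
    """The k nodes whose addresses differ from i in exactly one bit."""
    return [i ^ (1 << b) for b in range(k)]

def hypercube_route(src, dst, k):
    """Path from src to dst that flips one differing bit per hop."""
    path = [src]
    cur = src
    for b in range(k):
        if (cur ^ dst) & (1 << b):
            cur ^= 1 << b
            path.append(cur)
    return path

k = 3                                   # n = 2^3 = 8 nodes
print(hypercube_neighbors(5, k))        # [4, 7, 1]
print(hypercube_route(0, 7, k))         # [0, 1, 3, 7]
```

The hop count equals the number of bits in which the two addresses differ, so the longest route has k = log n hops, which is the diameter listed above.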

26 Shuffle-exchange Direct topology Number of nodes a power of 2 Nodes have addresses 0, 1, …, 2^k - 1 Two outgoing links from node i –Shuffle link to node LeftCycle(i) –Exchange link to node [xor (i, 1)]
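
A minimal sketch of the two links, assuming LeftCycle means a left rotation of the k-bit address by one position (the top bit wraps around to the bottom). The function names are illustrative.

```python
def left_cycle(i, k):
    """Rotate the k-bit address i left by one bit."""
    top = (i >> (k - 1)) & 1
    return ((i << 1) & ((1 << k) - 1)) | top

def exchange(i):
    """Flip the least significant bit of the address."""
    return i ^ 1

k = 3                                             # 8 nodes, addresses 0..7
print([left_cycle(i, k) for i in range(2 ** k)])  # [0, 2, 4, 6, 1, 3, 5, 7]
print([exchange(i) for i in range(2 ** k)])       # [1, 0, 3, 2, 5, 4, 7, 6]
```

Nodes 0 and 7 shuffle to themselves, since rotating an all-zeros or all-ones address leaves it unchanged.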

27 Shuffle-exchange Illustrated [Figure: shuffle-exchange network on nodes 0 1 2 3 4 5 6 7]

28 Shuffle-exchange Addressing

29 Evaluating Shuffle-exchange Diameter: 2 log n - 1 Bisection width: ≈ n / log n Edges per node: 2 Constant edge length? No

30 Comparing Networks All have logarithmic diameter except 2-D mesh Hypertree, butterfly, and hypercube have bisection width n / 2 All have constant edges per node except hypercube Only 2-D mesh keeps edge lengths constant as network size increases

