CS668 - Lecture 2 - Sept. 30

Today’s topics: Parallel Architectures (Chapter 2)
- Memory Hierarchy
- Busses and Switched Networks
- Interconnection Network Topologies
- Multiprocessors / Multicomputers
- Flynn’s Taxonomy
- Analysis of Interconnection Networks

Theoretical Computer Architectures
- Turing Machine
- Von Neumann Architecture
  – Fetch/Execute Cycle
- Memory Models
  – RAM model
  – PRAM model extension
- Shared Memory vs. Distributed Shared Memory vs. Distributed Memory

Processors and the Memory Hierarchy
- Registers (1 clock cycle, 100s of bytes)
- 1st-level cache (3-5 clock cycles, 100s of KBytes)
- 2nd-level cache (~10 clock cycles, MBytes)
- Main memory (~100 clock cycles, GBytes)
- Disk (milliseconds, 100 GB and up)
Figure: CPU with registers, separate 1st-level instruction and data caches, and a unified 2nd-level cache (instructions and data).
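The latencies above can be combined into a rough average cost per memory access. The sketch below is only illustrative: the cycle counts come from the slide, but the hit rates (95% and 90%) and the function name `average_access_cycles` are assumptions made up for this example.

```python
def average_access_cycles(levels, memory_cycles):
    """Expected cycles per access.

    levels: list of (latency_cycles, hit_rate), ordered from the fastest cache outward.
    memory_cycles: latency of main memory, paid by accesses that miss every cache.
    """
    expected = 0.0
    reach = 1.0  # fraction of accesses that get as far as the current level
    for latency, hit_rate in levels:
        expected += reach * latency   # every access reaching this level pays its latency
        reach *= (1.0 - hit_rate)     # only the misses continue outward
    return expected + reach * memory_cycles


# Latencies from the slide; hit rates are hypothetical.
caches = [(4, 0.95), (10, 0.90)]
print(average_access_cycles(caches, 100))
```

With these assumed hit rates the result is about 5 cycles, far closer to the 1st-level cache latency than to main memory, which is the point of the hierarchy.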

IBM Dual Core
From the Intel® 64 and IA-32 Architectures Optimization Reference Manual: http://www.intel.com/design/processor/manuals/248966.pdf

Shared Memory Multiprocessor
- One or more memories
- Global address space (all system memory visible to all processors)
- Transfer of data between processors is usually implicit: just read from (or write to) a given address (e.g., OpenMP)
- Complex cache-coherency protocols maintain consistency between processors
Figure: uniform-memory-access (UMA) shared-memory system, with CPUs and memories connected through an interconnection network.
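As a loose analogy for implicit communication through a global address space, the sketch below uses Python threads, which all see the same objects. It is not OpenMP and says nothing about cache coherency; the function name `parallel_sum` and the strided work split are assumptions chosen for illustration.

```python
import threading


def parallel_sum(data, num_threads=4):
    """Sum `data` with several threads that read and write shared objects directly."""
    total = [0]                  # shared accumulator, visible to every thread
    lock = threading.Lock()      # serializes updates to the shared accumulator

    def worker(tid):
        # Plain reads of the shared list: no explicit data transfer is needed.
        partial = sum(data[tid::num_threads])
        with lock:
            total[0] += partial  # plain write to shared memory

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


print(parallel_sum(list(range(1000))))
```

No thread ever sends anything to another thread; the only coordination is the lock that protects the shared accumulator.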

Distributed Shared Memory
- Single address space with implicit communication
- Hardware support for reads/writes to non-local memories and for cache coherency
- Latency of a memory operation is greater when accessing non-local data than when accessing data within a CPU’s own memory
Figure: non-uniform-memory-access (NUMA) shared-memory system, with CPU/memory pairs connected through an interconnection network.

Distributed Memory / Message Passing
- Each processor has access to its own memory only
- Data transfer between processors is explicit: the user calls message-passing functions
- Common libraries for message passing: MPI, PVM
- User has complete control of, and responsibility for, data placement and management
Figure: CPU/memory pairs connected through an interconnection network.
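The explicit style can be sketched with mailboxes. This is a stand-in only: it uses Python threads and `queue.Queue` objects to play the role of processes and network links, not MPI or PVM, and the names `message_passing_sum`, `inboxes`, and `results` are hypothetical.

```python
import queue
import threading


def message_passing_sum(data, num_workers=4):
    """Sum `data` where workers only see what is explicitly sent to them."""
    inboxes = [queue.Queue() for _ in range(num_workers)]  # one mailbox per worker
    results = queue.Queue()                                # mailbox of the root

    def worker(rank):
        chunk = inboxes[rank].get()        # explicit receive of this worker's data
        results.put((rank, sum(chunk)))    # explicit send of the partial result

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for rank in range(num_workers):
        # The root decides data placement: each worker gets a private copy of a slice.
        inboxes[rank].put(list(data[rank::num_workers]))
    partials = [results.get() for _ in range(num_workers)]
    for t in threads:
        t.join()
    return sum(value for _, value in partials)


print(message_passing_sum(list(range(1000))))
```

Every piece of data a worker touches arrived in a message, which mirrors the slide's point that placement and transfer are entirely the user's responsibility.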

Hybrid Systems
- Distributed-memory system whose nodes are shared-memory multiprocessors
- Most common architecture for the current generation of parallel machines
Figure: nodes, each with two CPUs, a memory, and a network interface, connected through an interconnection network.

Flynn’s Taxonomy (figure 2.20 from Quinn)
                        Single instruction stream         Multiple instruction streams
Single data stream      SISD: uniprocessors               MISD: systolic arrays
Multiple data streams   SIMD: processor arrays,           MIMD: multiprocessors,
                        pipelined vector processors       multicomputers
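The taxonomy depends on just two counts, which a few lines make explicit. The helper name `flynn_class` is an assumption for illustration.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn category for the given numbers of instruction and data streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))   # uniprocessor
print(flynn_class(1, 64))  # processor array
print(flynn_class(8, 8))   # multiprocessor or multicomputer
```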

Analysis of Switch Network Topologies
- View a switched network as a graph
  – n vertices = processors or switches
  – m edges = communication paths
- Two kinds of topologies
  – Direct: ratio of switches to processors is 1:1
  – Indirect: ratio of switches to processors is d:1, with d > 1

Evaluating Switch Topologies
- Diameter
- Bisection width
- Number of edges per node (d = degree)
- Constant edge length? (yes/no)
  – Layout area / wire length
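For small graphs these criteria can be computed directly. The sketch below assumes the network is given as an adjacency dictionary, treats every vertex alike (it does not distinguish processors from switches), and finds the bisection width by brute force, so it is only usable on tiny examples. All function names are assumptions.

```python
from collections import deque
from itertools import combinations


def diameter(adj):
    """Largest shortest-path distance between any two vertices (graph must be connected)."""
    best = 0
    for src in adj:
        dist = {src: 0}
        frontier = deque([src])
        while frontier:                    # breadth-first search from src
            u = frontier.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    frontier.append(v)
        best = max(best, max(dist.values()))
    return best


def max_degree(adj):
    """Largest number of edges incident to any single vertex."""
    return max(len(neighbors) for neighbors in adj.values())


def bisection_width(adj):
    """Fewest edges cut by any split into two halves; exponential-time brute force."""
    nodes = list(adj)
    half = len(nodes) // 2
    best = None
    for group in combinations(nodes, half):
        inside = set(group)
        cut = sum(1 for u in inside for v in adj[u] if v not in inside)
        best = cut if best is None else min(best, cut)
    return best


# Example: a ring of 6 switches.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(diameter(ring), max_degree(ring), bisection_width(ring))
```

For the 6-node ring this reports a diameter of 3, a degree of 2, and a bisection width of 2.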

2-D Mesh Network
- Direct topology
- Switches arranged into a 2-D lattice
- Communication allowed only between neighboring switches
- Variants allow wraparound connections between switches on the edges of the mesh

2-D Meshes

Evaluating 2-D Meshes
- Diameter: $\Theta(n^{1/2})$
- Bisection width: $\Theta(n^{1/2})$
- Number of edges per switch: 4
- Constant edge length? Yes
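To see where the square-root growth comes from, the sketch below models a square mesh with `side` switches per dimension, so n = side * side. The exact diameter formulas are standard results for a square mesh and torus rather than something stated on the slide, and the function names are assumptions.

```python
def mesh_neighbors(row, col, side, wraparound=False):
    """Switches directly connected to (row, col) in a side x side mesh."""
    out = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if wraparound:
            out.append((r % side, c % side))      # edges wrap to the opposite side
        elif 0 <= r < side and 0 <= c < side:
            out.append((r, c))                    # border switches have fewer links
    return out


def mesh_distance(a, b, side, wraparound=False):
    """Minimum number of hops between switches a and b."""
    total = 0
    for x, y in zip(a, b):
        d = abs(x - y)
        if wraparound:
            d = min(d, side - d)                  # going around the edge may be shorter
        total += d
    return total


def mesh_diameter(side, wraparound=False):
    """Worst-case hop count: corner to corner, or to the farthest switch on a torus."""
    per_dimension = side // 2 if wraparound else side - 1
    return 2 * per_dimension


print(mesh_diameter(4), mesh_diameter(4, wraparound=True))
```

For a 4 x 4 mesh (n = 16) the diameter is 6 without wraparound and 4 with it; both grow in proportion to the side length, that is, to the square root of n.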

Binary Tree Network
- Indirect topology
- $n = 2^d$ processor nodes, $n - 1$ switches

Evaluating Binary Tree Network
- Diameter: $2 \log n$
- Bisection width: 1
- Edges per node: 3
- Constant edge length? No
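The diameter of 2 log n follows from a message climbing to a common ancestor switch and descending again. The sketch below assumes a heap-style labeling that is not in the slides: switches are numbered 1 to n - 1 with the root at 1, and processor p is attached as node n + p.

```python
def tree_hops(p, q, d):
    """Hops between processors p and q in a binary tree network with 2**d processors."""
    n = 1 << d
    a, b = n + p, n + q      # heap-style labels: switches 1..n-1, processors n..2n-1
    hops = 0
    while a != b:
        a //= 2              # one hop up from p's side
        b //= 2              # matched by one hop down toward q
        hops += 2
    return hops


d = 3
worst = max(tree_hops(p, q, d) for p in range(1 << d) for q in range(1 << d))
print(worst)
```

With d = 3 (n = 8) the worst case is 6 hops, which is 2 log n. Any pair of processors in opposite halves must pass through the root switch, which is why the bisection width is only 1.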

Hypertree Network
- Indirect topology
- Shares the low diameter of the binary tree
- Greatly improves bisection width
- From the “front” it looks like a k-ary tree of height d
- From the “side” it looks like an upside-down binary tree of height d

Hypertree Network

Evaluating 4-ary Hypertree
- Diameter: $\log n$
- Bisection width: $n / 2$
- Edges per node: 6
- Constant edge length? No

Butterfly Network
- Indirect topology
- $n = 2^d$ processor nodes connected by $n(\log n + 1)$ switching nodes

Butterfly Network Routing
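A minimal sketch of destination-based routing, assuming the usual butterfly wiring in which switches sit in ranks 0 through k (k = log n) and a switch in rank r and column j connects to rank r + 1 in column j and in the column obtained by flipping the (r + 1)-th most significant bit of j. The function name and the (rank, column) path format are assumptions.

```python
def butterfly_route(src_col, dst_col, k):
    """Switches visited, as (rank, column) pairs, from rank 0 down to rank k."""
    col = src_col
    path = [(0, col)]
    for rank in range(k):
        bit = k - 1 - rank                 # bit that the edges leaving this rank can change
        if ((col ^ dst_col) >> bit) & 1:
            col ^= 1 << bit                # take the cross edge to fix this bit
        # otherwise take the straight edge and stay in the same column
        path.append((rank + 1, col))
    return path


print(butterfly_route(2, 5, 3))
```

For k = 3 the route from column 2 to column 5 is [(0, 2), (1, 6), (2, 4), (3, 5)]. Every route crosses exactly k edges between switch ranks, one address bit per rank.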

Evaluating Butterfly Network
- Diameter: $\log n$
- Bisection width: $n / 2$
- Edges per node: 4
- Constant edge length? No

Hypercube
- Direct topology
- 2 x 2 x … x 2 mesh
- Number of nodes is a power of 2
- Node addresses 0, 1, …, $2^k - 1$
- Node i is connected to the k nodes whose addresses differ from i in exactly one bit position
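The one-bit-difference rule translates directly into XOR operations. The sketch below lists a node's neighbors and builds one shortest route by correcting differing bits from least to most significant; the bit order and the function names are assumptions, and any order of corrections gives a route of the same length.

```python
def hypercube_neighbors(i, k):
    """Addresses that differ from i in exactly one of the k bit positions."""
    return [i ^ (1 << bit) for bit in range(k)]


def hypercube_route(src, dst, k):
    """One shortest path from src to dst, fixing differing bits from low to high."""
    path = [src]
    cur = src
    for bit in range(k):
        if ((cur ^ dst) >> bit) & 1:
            cur ^= 1 << bit        # cross the link in this dimension
            path.append(cur)
    return path


print(hypercube_neighbors(5, 3))
print(hypercube_route(0, 7, 3))
```

For k = 3, node 5 has neighbors [4, 7, 1] and the route from 0 to 7 is [0, 1, 3, 7]. The hop count equals the number of differing address bits, so it never exceeds k = log n.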

Hypercube Addressing

Evaluating Hypercube Network
- Diameter: $\log n$
- Bisection width: $n / 2$
- Edges per node: $\log n$
- Constant edge length? No

Shuffle-exchange
- Direct topology
- Number of nodes is a power of 2
- Nodes have addresses 0, 1, …, $2^k - 1$
- Two outgoing links from node i
  – Shuffle link to node LeftCycle(i), the k-bit address of i rotated left by one bit
  – Exchange link to node xor(i, 1)
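Both links are simple bit operations on the k-bit address. The sketch below implements them and one simple routing scheme that sets one destination bit per round; the scheme and the function names are assumptions for illustration, and the route it finds is not always the shortest one.

```python
def shuffle(i, k):
    """LeftCycle(i): rotate the k-bit address of i left by one position."""
    mask = (1 << k) - 1
    return ((i << 1) | (i >> (k - 1))) & mask


def exchange(i):
    """Flip the least significant bit of the address."""
    return i ^ 1


def shuffle_exchange_route(src, dst, k):
    """Route using at most k exchange links and k - 1 shuffle links."""
    path = [src]
    cur = src
    for t in range(k):
        target_bit = (dst >> (k - 1 - t)) & 1   # bit that the current low bit will become
        if (cur & 1) != target_bit:
            cur = exchange(cur)
            path.append(cur)
        if t < k - 1:
            nxt = shuffle(cur, k)
            if nxt != cur:                      # all-zeros and all-ones shuffle to themselves
                path.append(nxt)
            cur = nxt
    return path


print([shuffle(i, 3) for i in range(8)])
print(shuffle_exchange_route(0, 7, 3))
```

With k = 3 the shuffle links map nodes 0 through 7 to [0, 2, 4, 6, 1, 3, 5, 7], and the route from 0 to 7 is [0, 1, 2, 3, 6, 7]: five hops, matching a worst case of 2k - 1.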

Shuffle-exchange Illustrated
Figure: eight-node shuffle-exchange network with nodes labeled 0 through 7.

Shuffle-exchange Addressing

Evaluating Shuffle-exchange
- Diameter: $2 \log n - 1$
- Bisection width: $\approx n / \log n$
- Edges per node: 2
- Constant edge length? No

Comparing Networks
- All have logarithmic diameter except the 2-D mesh
- Hypertree, butterfly, and hypercube have bisection width $n / 2$
- All have a constant number of edges per node except the hypercube
- Only the 2-D mesh keeps edge lengths constant as network size increases
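To make the comparison concrete, the sketch below evaluates the formulas from the preceding slides for one value of n. Several details are assumptions: n is restricted to powers of 4 so that every topology is well defined, the mesh has no wraparound and uses the exact values 2(sqrt(n) - 1) and sqrt(n) in place of the asymptotic bounds, and the shuffle-exchange bisection width is only approximate.

```python
import math


def compare_networks(n):
    """Map each topology to (diameter, bisection width, edges per node) for n processors."""
    k = n.bit_length() - 1       # log2(n) when n is a power of 2
    side = math.isqrt(n)         # mesh side length when n is a perfect square
    if n < 4 or (1 << k) != n or side * side != n:
        raise ValueError("this sketch assumes n is a power of 4, e.g. 16, 64, 256")
    return {
        "2-D mesh (no wraparound)": (2 * (side - 1), side, 4),
        "binary tree": (2 * k, 1, 3),
        "4-ary hypertree": (k, n // 2, 6),
        "butterfly": (k, n // 2, 4),
        "hypercube": (k, n // 2, k),
        "shuffle-exchange": (2 * k - 1, n / k, 2),   # bisection width is approximate
    }


for name, (diam, bisect, degree) in compare_networks(64).items():
    print(f"{name:26s} diameter={diam:<3} bisection={bisect:<6.4g} edges/node={degree}")
```

At n = 64 the mesh already has the largest diameter (14 hops against 6 for the hypercube), while the hypercube is the only entry whose edges per node grows with n.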
