Interconnection Networks
In the name of God
Dr. Mohammad Kazem Akbari
Morteza Sargolzaei Javan

Taxonomy

MIMD: Multiprocessor (shared memory)
[Figure: processors P1, P2, ..., Pn connected through an interconnection network (IN) to memory modules M1, M2, ..., Mn. Tightly coupled architecture.]

Shared Memory
Uniform Memory Access (UMA): tightly coupled system.
Non-Uniform Memory Access (NUMA): loosely coupled system. Examples: Cedar from the University of Illinois, BBN Butterfly.
Cache Only Memory Access (COMA): uses global distributed caches. Example: Kendall Square Research-1 (KSR-1).

MIMD (cont.)
[Figure: Cedar, a loosely coupled architecture. Global memory modules GM1, GM2, ..., GMn attach to a global interconnection network (global IN). Three clusters attach to the global IN; each contains processors P1, P2, ..., Pn, a CIN, and memories CM1, CM2, CM3.]

MIMD (cont.)
[Figure: BBN Butterfly, a loosely coupled architecture. Each processor Pi has its own memory Mi, and the processor-memory pairs are connected by an interconnection network (IN).]

MIMD (cont.)
[Figure: COMA architecture. Each node i consists of Di, Ci, and processor Pi; the nodes are connected by an interconnection network (IN).]

MIMD (cont.): Multicomputer (message passing)
[Figure: processor-memory pairs P1/M1, P2/M2, ..., Pn/Mn connected by an IN.]

MIMD (cont.): Data flow machine
An instruction is ready for execution when the data for its operands have been made available.
Purely self-contained.
No program counter.

SIMD: Array processor with a centralized control unit

MISD: Pipelined vector processor

MISD (cont.): Systolic array

Hybrid Architecture
Combines features of different architectures to provide better performance for parallel computations.
Two types of parallelism:
Control parallelism (MIMD)
Data parallelism (SIMD)

Special Purpose Devices
Artificial Neural Networks (ANN)
Fuzzy logic

Neural Networks (Definition)
A large number of PEs connected in parallel.
Capable of learning.
Adaptive to change.
Able to cope with serious disruptions.
Power of connectivity vs. power of processors.

Fuzzy logic (Definition)
Approximate reasoning.
Formal principles of reasoning.

Interconnection Network (IN)
The measure of an IN is “how quickly it can deliver how much of what’s needed to the right place, reliably and at good cost and value”.

Performance Criteria for IN
Latency: transit time for a single message.
Bandwidth: how much message traffic the IN can handle, e.g., in Mbytes/s.
Connectivity: how many immediate neighbors each node has, and how often each neighbor can be reached.
Hardware cost: what fraction of the total hardware cost the IN represents, e.g., wires, switches, connectors, arbitration logic.

Performance Criteria for IN (cont.)
Reliability: redundant paths.
Functionality: additional functions performed by the IN, such as message combining and fault tolerance, e.g., data routing, interrupt handling, request/message combining, coherence.
Scalability: the ability to be expanded.

Definitions
Node degree: the number of links (edges) connected to the node.
Diameter: the largest minimum distance between any pair of nodes. The minimum distance between a pair of nodes is the minimum number of communication links (hops) that data from one node must traverse to reach the other.
Network size: the number of nodes in the IN.
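
The definitions above can be checked on small networks. The following is a minimal sketch, assuming the network is given as an adjacency list and is connected; the helper names (`node_degrees`, `hops_from`, `diameter`, `linear_array`, `ring`) are illustrative and do not come from the slides.

```python
from collections import deque

def node_degrees(adj):
    """Return {node: number of links attached to that node}."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def hops_from(adj, src):
    """Breadth-first search: minimum hop count from src to each reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def diameter(adj):
    """Largest minimum distance over all node pairs (network assumed connected)."""
    return max(max(hops_from(adj, s).values()) for s in adj)

def linear_array(n):
    """n nodes in a line; end nodes have one neighbor."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

def ring(n):
    """n nodes in a bidirectional ring (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

print(diameter(linear_array(8)), diameter(ring(8)))
```

For an 8-node linear array this gives a diameter of 7, and for an 8-node bidirectional ring it gives 4. The network size is simply `len(adj)`.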

Data Routing
Functions in data routing:
Shifting
Rotation
Permutation (one-to-one)
Broadcast (one-to-all)
Multicast (many-to-many)
Personalized communication (one-to-many)
Shuffle / Exchange

Types of IN
Static networks
Dynamic networks

Static Networks: Shared bus
Degree = 1
Diameter = 1

Static Networks (cont.): Linear array
Degree = 2
Diameter = n-1

Static Networks (cont.): Ring
Degree = 2
Diameter: unidirectional = n-1; bidirectional = Ceil((n-1)/2)

Static Networks (cont.): Binary tree
Degree: leaf = 1, root = 2, others = 3
Diameter = 2(h-1), where h is the number of levels

Static Networks (cont.): Fat tree
Degree and diameter are the same as for the binary tree.
Because traffic is heaviest toward the root, the number of links increases gradually toward the root (e.g., CM-5).

Static Networks (cont.): Star
Degree: central node = n-1, others = 1
Diameter = 2

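Static Networks (cont.): Shuffle and exchange
$\mathrm{Shuffle}(s_{n-1} s_{n-2} \dots s_0) = s_{n-2} s_{n-3} \dots s_0 s_{n-1}$
$\mathrm{Exchange}(s_{n-1} s_{n-2} \dots s_1 s_0) = s_{n-1} s_{n-2} \dots s_1 \bar{s}_0$
[Table: source and destination addresses, starting at source 000.]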
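
The two address functions can be sketched in a few lines. This assumes the standard definitions: shuffle rotates the n-bit address left by one position, and exchange complements the least significant bit. The function names and the choice of N = 8 (3 address bits) are illustrative.

```python
def shuffle(addr, n_bits):
    """Perfect shuffle: rotate the n-bit address left by one position."""
    mask = (1 << n_bits) - 1
    return ((addr << 1) & mask) | (addr >> (n_bits - 1))

def exchange(addr):
    """Exchange: complement the least significant address bit."""
    return addr ^ 1

# Print the source -> destination mapping for N = 8 nodes.
for src in range(8):
    print(f"{src:03b} -> shuffle {shuffle(src, 3):03b}, exchange {exchange(src):03b}")
```

Addresses 000 and 111 map to themselves under shuffle, which is why the exchange links are needed to reach every node.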

Shuffle-Exchange Network
[Figure: shuffle-exchange network for N = 8.]
Applications: the shuffle-exchange network provides suitable interconnection patterns for implementing certain parallel algorithms, such as polynomial evaluation, Fast Fourier Transform (FFT), sorting, and matrix transposition.

Static Networks (cont.): Mesh
Degree: corner = 2, sides = 3, middle = 4
Diameter = 2(n-1)

Mesh Routing Algorithm
A simple routing algorithm routes a packet from source S to destination D in a mesh with $n^2$ nodes.
1. Compute the row distance R as [formula missing].
2. Compute the column distance C as [formula missing].
3. Add the values R and C to the packet header at the source node.
4. Starting from the source, send the packet for R rows and then for C columns.
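
The four steps can be sketched as follows. The formulas for R and C did not survive in the transcript, so this sketch makes its own assumptions: nodes are numbered in row-major order starting at 0, R is the signed difference of row indices, and C is the signed difference of column indices. The function name `mesh_route` is hypothetical.

```python
def mesh_route(s, d, n):
    """Row-first routing in an n x n mesh; returns (R, C, path).

    Assumption: row-major node numbering from 0, so node k sits at
    row k // n and column k % n. The slide's own formulas are not preserved.
    """
    r = d // n - s // n   # signed row distance (step 1)
    c = d % n - s % n     # signed column distance (step 2)
    path = [s]            # R and C would travel in the packet header (step 3)
    node = s
    row_step = n if r > 0 else -n
    for _ in range(abs(r)):       # step 4a: move R rows
        node += row_step
        path.append(node)
    col_step = 1 if c > 0 else -1
    for _ in range(abs(c)):       # step 4b: then move C columns
        node += col_step
        path.append(node)
    return r, c, path

print(mesh_route(6, 12, 4))
```

Under these numbering assumptions, `mesh_route(6, 12, 4)` returns `(2, -2, [6, 10, 14, 13, 12])`: two hops along the rows, then two hops along the columns. A corner-to-corner route such as `mesh_route(0, 15, 4)` takes 6 hops, matching the mesh diameter 2(n-1).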

Example (Mesh)
To route a packet from node 6 (i.e., S = 6) to node 12 (i.e., D = 12), the packet goes through two paths, as shown in the figure.

Static Networks (cont.): Illiac, chordal ring
Degree = 4
Diameter = n-1

Static Networks (cont.): Torus
Degree = 4
Diameter = 2(Floor(n/2))

Static Networks (cont.): Hypercube
Degree = n
Diameter = n
Address bits = n
Dimensions = n
Neighbors = n
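
A short sketch shows why all of these quantities equal n. It assumes the usual binary addressing, where two nodes are linked exactly when their n-bit addresses differ in one bit; the function names are illustrative.

```python
def hypercube_neighbors(node, n):
    """Neighbors of a node in an n-cube: flip each of the n address bits in turn."""
    return [node ^ (1 << bit) for bit in range(n)]

def hypercube_distance(a, b):
    """Minimum hop count between two nodes: Hamming distance of their addresses."""
    return bin(a ^ b).count("1")

n = 4
print(sorted(hypercube_neighbors(0b0101, n)))
print(max(hypercube_distance(a, b) for a in range(2 ** n) for b in range(2 ** n)))
```

Each node has one neighbor per address bit, so the degree is n, and the farthest pair of nodes differs in all n bits, so the diameter is n.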

Example: embedding a 4-by-4 mesh in a 4-cube
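
One common way to build such an embedding uses reflected Gray codes. The slide's figure is not preserved, so the sketch below is an assumption about the technique, not a reproduction of the slide's mapping: the row index and the column index are each Gray-coded into 2 bits, and the two codes are concatenated into a 4-bit cube address. The names `gray` and `embed_mesh_in_cube` are hypothetical.

```python
def gray(i):
    """Reflected binary Gray code of i; consecutive values differ in one bit."""
    return i ^ (i >> 1)

def embed_mesh_in_cube(row_bits, col_bits):
    """Map mesh cell (r, c) to a cube address: Gray(r) bits followed by Gray(c) bits."""
    rows, cols = 1 << row_bits, 1 << col_bits
    return {(r, c): (gray(r) << col_bits) | gray(c)
            for r in range(rows) for c in range(cols)}

mapping = embed_mesh_in_cube(2, 2)   # 4-by-4 mesh into a 4-cube
for r in range(4):
    print(" ".join(f"{mapping[(r, c)]:04b}" for c in range(4)))
```

Because neighboring rows (and neighboring columns) receive Gray codes that differ in a single bit, every mesh link maps onto a single hypercube link.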

Static Networks (cont.): n-Mesh
Degree: corner = n, internal = 2n, others between n and 2n (n < others < 2n)
Diameter = [formula missing]

Static Networks (cont.): k-ary n-cube
Degree: if k = 2 then degree = n; if k > 2 then degree = 2n
Diameter = [formula missing]
[Figures: (a) 4-ary 2-cube network; (b) 3-ary 3-cube network.]
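
The degree rule can be checked by building the network directly. This is a minimal sketch, assuming each node is addressed by n radix-k digits and is linked to the nodes whose address differs by +1 or -1 (mod k) in exactly one digit; the function names are illustrative. Since the diameter formula is not preserved in the transcript, the sketch measures it with a breadth-first search instead of asserting a formula.

```python
from collections import deque

def kary_ncube_neighbors(node, k):
    """Neighbors of a node (tuple of n radix-k digits): +/-1 mod k in one dimension."""
    nbrs = set()
    for dim, digit in enumerate(node):
        for delta in (1, -1):
            nbr = list(node)
            nbr[dim] = (digit + delta) % k
            nbrs.add(tuple(nbr))
    nbrs.discard(node)   # guards against the degenerate case k = 1
    return nbrs

def degree_and_diameter(k, n):
    """Degree of a node and the largest hop count from it (the network is symmetric)."""
    origin = (0,) * n
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        u = queue.popleft()
        for w in kary_ncube_neighbors(u, k):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(kary_ncube_neighbors(origin, k)), max(dist.values())

for k, n in ((4, 2), (3, 3), (2, 3)):
    print(k, n, degree_and_diameter(k, n))
```

For k = 2 the +1 and -1 neighbors coincide, which is why the degree drops from 2n to n. The search measures a diameter of 4 hops for the 4-ary 2-cube and 3 hops for the 3-ary 3-cube.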

Cache Coherence
Multiprocessor environment.
A cache is dedicated to each processor.
Cache coherence problem: how to keep multiple copies of the data consistent during execution?

Cache Coherence Mechanisms
1. Hardware-based schemes
Snoopy cache protocols: used if the IN has broadcast features.
Directory cache protocols: used if the IN has no broadcast features.
2. Software-based schemes
3. Combination

Cache Coherence Mechanisms (cont.)
Action taken on:
Read miss
Write hit
Write miss

Snoopy Cache Protocol
[Figure: a two-processor configuration with copies of data block x, shown for write-through and write-back.]
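
The problem that a snoopy protocol has to solve can be illustrated with a toy model of the two-processor configuration. This is only a sketch under stated assumptions: both processors start with a cached copy of x, P1 writes a new value, and no coherence action is taken. The function name `write_without_coherence` and the values used are hypothetical.

```python
def write_without_coherence(policy):
    """P1 writes x = 1 while P2 holds a cached copy; no snooping takes place."""
    memory = {"x": 0}
    caches = {"P1": 0, "P2": 0}     # both processors cache block x
    caches["P1"] = 1                # P1 updates its own copy
    if policy == "write-through":
        memory["x"] = 1             # memory is updated on every write
    # write-back: memory is updated only when the block is replaced later
    return caches, memory

for policy in ("write-through", "write-back"):
    print(policy, write_without_coherence(policy))
```

With write-through, memory holds the new value but P2's cached copy is stale. With write-back, both memory and P2's copy are stale. A snoopy protocol addresses this by having each cache monitor the shared medium for writes to blocks it holds.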

Centralized Directory Protocols
Full-map protocol directory

Scalable Cache Coherency

Classification of Dynamic Networks

Dynamic Networks (Crossbar)

Dynamic Networks (Single-Stage)
In a single-stage network, any permutation can be reached in at most $3\log_2 N - 1$ passes.

Multi Stages - Blocking
Example: Multistage Cube, Omega

Multi Stages - Nonblocking
Example: three-stage Clos

Dynamic Networks (Clos)

Multi Stages - Rearrangeable
Example: 8-to-8 (Benes)

Interconnection Design Decisions
Considerations in selecting the architecture of an interconnection network:
Operation mode
Control strategy
Network topology
Switching methodology
Functional characteristics of the switch

Interconnection Design Decisions (cont.)
Operation mode: synchronous, asynchronous, combined
Control strategy: centralized control, distributed control
Switching methodology: circuit switching, packet switching, integrated switching

Cloud and Rain