Download presentation
Presentation is loading. Please wait.
Published byLydia Bocook Modified over 9 years ago
1
Fundamental of Computer Architecture By Panyayot Chaikan panyayot@coe.psu.ac.th 240-208 November 01, 2003
2
Chapter 10 - Introduction to Parallel processing 2 240-208 Fundamental of Computer Architecture Chapter 10 แนะนำการประมวลผลแบบ ขนาน Introduction to Parallel processing
3
Chapter 10 - Introduction to Parallel processing 3 240-208 Fundamental of Computer Architecture เนื้อหา แนะนำสถาปัตยกรรมการประมวลผลแบบขนาน มัลติโพรเซสเซอร์ เวกเตอร์คอมพิวเตอร์ คลัส เตอร์ Interconnection network แบบต่างๆ แนะนำการเขียนโปรแกรมแบบขนาน
4
Chapter 10 - Introduction to Parallel processing 4 240-208 Fundamental of Computer Architecture High performance computer Large computing capacity Required to compute large amount of data in a reasonable amount of time Often called Supercomputer
5
Chapter 10 - Introduction to Parallel processing 5 240-208 Fundamental of Computer Architecture Supercomputer Applications Weather forecasting Finite element analysis in structural design Fluid flow analysis Simulation of large complex physical system Computer Aided Design (CAD)
6
Chapter 10 - Introduction to Parallel processing 6 240-208 Fundamental of Computer Architecture Parallel processing Picture from http://www.byte.com/art/9601/img/509029c2.htm
7
Chapter 10 - Introduction to Parallel processing 7 240-208 Fundamental of Computer Architecture 3 ways to construct Supercomputer Vector processing Multiprocessing Distributed computer system
8
Chapter 10 - Introduction to Parallel processing 8 240-208 Fundamental of Computer Architecture Vector Supercomputing Using fastest possible circuit Wide path for access large main memory Extensive I/O capability Dissipate considerable power and require expensive cooling arrangement Provide excellent performance but at very high price
9
Chapter 10 - Introduction to Parallel processing 9 240-208 Fundamental of Computer Architecture Vector Supercomputing NEC SX5 CRAY CRAY1, Y-MP Fujitsu VP5000 Hitachi SR8000
10
Chapter 10 - Introduction to Parallel processing 10 240-208 Fundamental of Computer Architecture Cray Supercomputer Picture from http://www.meteo.fr/scem/images/cray.gif
11
Chapter 10 - Introduction to Parallel processing 11 240-208 Fundamental of Computer Architecture Multiprocessor Use large number of processor design for workstation or PC market Has an efficient high bandwidth medium for communication among the processor memory I/O Provide High performance but cheaper than vector processing
12
Chapter 10 - Introduction to Parallel processing 12 240-208 Fundamental of Computer Architecture Distributed computer system Using many workstation connected by Local area network Provide large computing capabilities at a reasonable cost
13
Chapter 10 - Introduction to Parallel processing 13 240-208 Fundamental of Computer Architecture Multiprocessing performance Many computation can proceed in parallel Difficulty: the application must be broken down into small task that can be assigned to individual processor Processors must communicate with each other to exchange data
14
Chapter 10 - Introduction to Parallel processing 14 240-208 Fundamental of Computer Architecture Classification of Parallel structure Proposed by Flynn[1966] 4 types of computation SISD SIMD MIMD MISD
15
Chapter 10 - Introduction to Parallel processing 15 240-208 Fundamental of Computer Architecture SISD Single Instruction stream, Single Data stream Used in single-processor computer system
16
Chapter 10 - Introduction to Parallel processing 16 240-208 Fundamental of Computer Architecture SIMD Single Instruction stream, Multiple Data stream Single stream of instruction is broadcast to a number of processor Each processor operates on its own data Each processor has its own memories All processors executes the same program but operate on different data
17
Chapter 10 - Introduction to Parallel processing 17 240-208 Fundamental of Computer Architecture MIMD Multiple Instruction stream, Multiple Data stream Many processor execute a different program and access its own sequence of data
18
Chapter 10 - Introduction to Parallel processing 18 240-208 Fundamental of Computer Architecture MISD Multiple Instruction stream, Single Data stream Common data structure is manipulated by separate processor Each processor executes a different program This form does not occur often in practice
19
Chapter 10 - Introduction to Parallel processing 19 240-208 Fundamental of Computer Architecture Array processing Is the SIMD form of parallel processing Instruction is broadcast from a central processor
20
Chapter 10 - Introduction to Parallel processing 20 240-208 Fundamental of Computer Architecture 2 types of Array processing Use small number of powerful processor ILLIAC-IV: 64 processors, each processor is 64-bit Use large number of very simple processor CM2: 65536 processors, each processor is 1-bit MP-1216: 16384 processors, each processor is 4-bit Gamma II plus: 4096 processors, each processor is 8- bit
21
Chapter 10 - Introduction to Parallel processing 21 240-208 Fundamental of Computer Architecture Array processing Well suited to numerical problem that can be expressed in matrix or vector format
22
Chapter 10 - Introduction to Parallel processing 22 240-208 Fundamental of Computer Architecture The structure of general- purpose multiprocessors UMA multiprocessor NUMA multiprocessor Distributed memory system
23
Chapter 10 - Introduction to Parallel processing 23 240-208 Fundamental of Computer Architecture A UMA multiprocessor
24
Chapter 10 - Introduction to Parallel processing 24 240-208 Fundamental of Computer Architecture A NUMA multiprocessor
25
Chapter 10 - Introduction to Parallel processing 25 240-208 Fundamental of Computer Architecture A distributed memory system
26
Chapter 10 - Introduction to Parallel processing 26 240-208 Fundamental of Computer Architecture Taxonomy of parallel processing
27
Chapter 10 - Introduction to Parallel processing 27 240-208 Fundamental of Computer Architecture Interconnection network Single bus Crossbar networks Multistage networks Hypercube networks Mesh networks Tree networks Ring networks
28
Chapter 10 - Introduction to Parallel processing 28 240-208 Fundamental of Computer Architecture Crossbar interconnection network
29
Chapter 10 - Introduction to Parallel processing 29 240-208 Fundamental of Computer Architecture Multistage shuffle network
30
Chapter 10 - Introduction to Parallel processing 30 240-208 Fundamental of Computer Architecture A 3-dimensional Hypercube Network
31
Chapter 10 - Introduction to Parallel processing 31 240-208 Fundamental of Computer Architecture A 2-dimensional mesh network
32
Chapter 10 - Introduction to Parallel processing 32 240-208 Fundamental of Computer Architecture Four-way tree network
33
Chapter 10 - Introduction to Parallel processing 33 240-208 Fundamental of Computer Architecture Flat tree network
34
Chapter 10 - Introduction to Parallel processing 34 240-208 Fundamental of Computer Architecture Ring network
35
Chapter 10 - Introduction to Parallel processing 35 240-208 Fundamental of Computer Architecture HP Convex architecture Picture from http://www.byte.com/art/9601/img/509029g2.htm
36
Chapter 10 - Introduction to Parallel processing 36 240-208 Fundamental of Computer Architecture HP Convex Hypernode Picture from http://www.byte.com/art/9601/img/509029f2.htm
37
Chapter 10 - Introduction to Parallel processing 37 240-208 Fundamental of Computer Architecture SGI Power Challenge Picture from http://www.byte.com/art/9601/img/509029l2.htm
38
Chapter 10 - Introduction to Parallel processing 38 240-208 Fundamental of Computer Architecture Clustered Supercomputer Picture from http://arstechnica.com/cpu/2q00/klat2/klat2-1.html
39
Chapter 10 - Introduction to Parallel processing 39 240-208 Fundamental of Computer Architecture Clusters
40
Chapter 10 - Introduction to Parallel processing 40 240-208 Fundamental of Computer Architecture Benefits of clustering Incremental scalability High availability Superior price/performance
41
Chapter 10 - Introduction to Parallel processing 41 240-208 Fundamental of Computer Architecture Parallel programming Task must be broken down into small task that can be assigned to individual processors at program level Need operating system support Different architecture, different programming method
42
Chapter 10 - Introduction to Parallel processing 42 240-208 Fundamental of Computer Architecture A sequential program to compute the dot product integer array a[1..N], b[1..N] integer dot_product. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N] for k:= 1 to N dot_product := dot_product + x[k] * y[k] end for end do_dot
43
Chapter 10 - Introduction to Parallel processing 43 240-208 Fundamental of Computer Architecture First attempt of 2- processor computation shared integer array a[1..N], b[1..N] shared integer dot_product shared lock dot_product_lock shared barrier done. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 create_thread (do_dot, a, b) do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N]) private integer id id := mypid() for k:= (id*N/2)+1 to (id+1)*N/2 lock (dot_product_lock) dot_product := dot_product + x[k] * y[k] unlock (dot_product_lock) end barrier (done) end do_dot
44
Chapter 10 - Introduction to Parallel processing 44 240-208 Fundamental of Computer Architecture An efficient 2-processor computation of a shared memory machine shared integer array a[1..N], b[1..N] shared integer dot_product shared lock dot_product_lock shared barrier done. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 create_thread (do_dot, a, b) do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N]) private integer local_dot_product private integer id id := mypid() local_dot_product := 0 for k:= (id*N/2)+1 to (id+1)*N/2 local_dot_product := local_dot_product + x[k] * y[k] end lock (dot_product_lock) dot_product := dot_product + local_dot_product unlock (dot_product_lock) barrier (done) end do_dot
45
Chapter 10 - Introduction to Parallel processing 45 240-208 Fundamental of Computer Architecture Performance considerations
46
Chapter 10 - Introduction to Parallel processing 46 240-208 Fundamental of Computer Architecture จบ บทที่ 10
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.