Presentation is loading. Please wait.

Presentation is loading. Please wait.

Fundamental of Computer Architecture By Panyayot Chaikan 240-208 November 01, 2003.

Similar presentations


Presentation on theme: "Fundamental of Computer Architecture By Panyayot Chaikan 240-208 November 01, 2003."— Presentation transcript:

1 Fundamental of Computer Architecture By Panyayot Chaikan panyayot@coe.psu.ac.th 240-208 November 01, 2003

2 Chapter 10 - Introduction to Parallel processing 2 240-208 Fundamental of Computer Architecture Chapter 10 แนะนำการประมวลผลแบบ ขนาน Introduction to Parallel processing

3 Chapter 10 - Introduction to Parallel processing 3 240-208 Fundamental of Computer Architecture เนื้อหา แนะนำสถาปัตยกรรมการประมวลผลแบบขนาน มัลติโพรเซสเซอร์ เวกเตอร์คอมพิวเตอร์ คลัส เตอร์ Interconnection network แบบต่างๆ แนะนำการเขียนโปรแกรมแบบขนาน

4 Chapter 10 - Introduction to Parallel processing 4 240-208 Fundamental of Computer Architecture High performance computer  Large computing capacity  Required to compute large amount of data in a reasonable amount of time  Often called Supercomputer

5 Chapter 10 - Introduction to Parallel processing 5 240-208 Fundamental of Computer Architecture Supercomputer Applications  Weather forecasting  Finite element analysis in structural design  Fluid flow analysis  Simulation of large complex physical system  Computer Aided Design (CAD)

6 Chapter 10 - Introduction to Parallel processing 6 240-208 Fundamental of Computer Architecture Parallel processing Picture from http://www.byte.com/art/9601/img/509029c2.htm

7 Chapter 10 - Introduction to Parallel processing 7 240-208 Fundamental of Computer Architecture 3 ways to construct Supercomputer  Vector processing  Multiprocessing  Distributed computer system

8 Chapter 10 - Introduction to Parallel processing 8 240-208 Fundamental of Computer Architecture Vector Supercomputing  Using fastest possible circuit  Wide path for access large main memory  Extensive I/O capability  Dissipate considerable power and require expensive cooling arrangement  Provide excellent performance but at very high price

9 Chapter 10 - Introduction to Parallel processing 9 240-208 Fundamental of Computer Architecture Vector Supercomputing  NEC SX5  CRAY CRAY1, Y-MP  Fujitsu VP5000  Hitachi SR8000

10 Chapter 10 - Introduction to Parallel processing 10 240-208 Fundamental of Computer Architecture Cray Supercomputer Picture from http://www.meteo.fr/scem/images/cray.gif

11 Chapter 10 - Introduction to Parallel processing 11 240-208 Fundamental of Computer Architecture Multiprocessor  Use large number of processor design for workstation or PC market  Has an efficient high bandwidth medium for communication among  the processor  memory  I/O  Provide High performance but cheaper than vector processing

12 Chapter 10 - Introduction to Parallel processing 12 240-208 Fundamental of Computer Architecture Distributed computer system  Using many workstation connected by Local area network  Provide large computing capabilities at a reasonable cost

13 Chapter 10 - Introduction to Parallel processing 13 240-208 Fundamental of Computer Architecture Multiprocessing performance  Many computation can proceed in parallel  Difficulty:  the application must be broken down into small task that can be assigned to individual processor  Processors must communicate with each other to exchange data

14 Chapter 10 - Introduction to Parallel processing 14 240-208 Fundamental of Computer Architecture Classification of Parallel structure  Proposed by Flynn[1966]  4 types of computation  SISD  SIMD  MIMD  MISD

15 Chapter 10 - Introduction to Parallel processing 15 240-208 Fundamental of Computer Architecture SISD  Single Instruction stream, Single Data stream  Used in single-processor computer system

16 Chapter 10 - Introduction to Parallel processing 16 240-208 Fundamental of Computer Architecture SIMD  Single Instruction stream, Multiple Data stream  Single stream of instruction is broadcast to a number of processor  Each processor operates on its own data  Each processor has its own memories  All processors executes the same program but operate on different data

17 Chapter 10 - Introduction to Parallel processing 17 240-208 Fundamental of Computer Architecture MIMD  Multiple Instruction stream, Multiple Data stream  Many processor execute a different program and access its own sequence of data

18 Chapter 10 - Introduction to Parallel processing 18 240-208 Fundamental of Computer Architecture MISD  Multiple Instruction stream, Single Data stream  Common data structure is manipulated by separate processor  Each processor executes a different program  This form does not occur often in practice

19 Chapter 10 - Introduction to Parallel processing 19 240-208 Fundamental of Computer Architecture Array processing  Is the SIMD form of parallel processing  Instruction is broadcast from a central processor

20 Chapter 10 - Introduction to Parallel processing 20 240-208 Fundamental of Computer Architecture 2 types of Array processing  Use small number of powerful processor  ILLIAC-IV: 64 processors, each processor is 64-bit  Use large number of very simple processor  CM2: 65536 processors, each processor is 1-bit  MP-1216: 16384 processors, each processor is 4-bit  Gamma II plus: 4096 processors, each processor is 8- bit

21 Chapter 10 - Introduction to Parallel processing 21 240-208 Fundamental of Computer Architecture Array processing  Well suited to numerical problem that can be expressed in matrix or vector format

22 Chapter 10 - Introduction to Parallel processing 22 240-208 Fundamental of Computer Architecture The structure of general- purpose multiprocessors  UMA multiprocessor  NUMA multiprocessor  Distributed memory system

23 Chapter 10 - Introduction to Parallel processing 23 240-208 Fundamental of Computer Architecture A UMA multiprocessor

24 Chapter 10 - Introduction to Parallel processing 24 240-208 Fundamental of Computer Architecture A NUMA multiprocessor

25 Chapter 10 - Introduction to Parallel processing 25 240-208 Fundamental of Computer Architecture A distributed memory system

26 Chapter 10 - Introduction to Parallel processing 26 240-208 Fundamental of Computer Architecture Taxonomy of parallel processing

27 Chapter 10 - Introduction to Parallel processing 27 240-208 Fundamental of Computer Architecture Interconnection network  Single bus  Crossbar networks  Multistage networks  Hypercube networks  Mesh networks  Tree networks  Ring networks

28 Chapter 10 - Introduction to Parallel processing 28 240-208 Fundamental of Computer Architecture Crossbar interconnection network

29 Chapter 10 - Introduction to Parallel processing 29 240-208 Fundamental of Computer Architecture Multistage shuffle network

30 Chapter 10 - Introduction to Parallel processing 30 240-208 Fundamental of Computer Architecture A 3-dimensional Hypercube Network

31 Chapter 10 - Introduction to Parallel processing 31 240-208 Fundamental of Computer Architecture A 2-dimensional mesh network

32 Chapter 10 - Introduction to Parallel processing 32 240-208 Fundamental of Computer Architecture Four-way tree network

33 Chapter 10 - Introduction to Parallel processing 33 240-208 Fundamental of Computer Architecture Flat tree network

34 Chapter 10 - Introduction to Parallel processing 34 240-208 Fundamental of Computer Architecture Ring network

35 Chapter 10 - Introduction to Parallel processing 35 240-208 Fundamental of Computer Architecture HP Convex architecture Picture from http://www.byte.com/art/9601/img/509029g2.htm

36 Chapter 10 - Introduction to Parallel processing 36 240-208 Fundamental of Computer Architecture HP Convex Hypernode Picture from http://www.byte.com/art/9601/img/509029f2.htm

37 Chapter 10 - Introduction to Parallel processing 37 240-208 Fundamental of Computer Architecture SGI Power Challenge Picture from http://www.byte.com/art/9601/img/509029l2.htm

38 Chapter 10 - Introduction to Parallel processing 38 240-208 Fundamental of Computer Architecture Clustered Supercomputer Picture from http://arstechnica.com/cpu/2q00/klat2/klat2-1.html

39 Chapter 10 - Introduction to Parallel processing 39 240-208 Fundamental of Computer Architecture Clusters

40 Chapter 10 - Introduction to Parallel processing 40 240-208 Fundamental of Computer Architecture Benefits of clustering  Incremental scalability  High availability  Superior price/performance

41 Chapter 10 - Introduction to Parallel processing 41 240-208 Fundamental of Computer Architecture Parallel programming  Task must be broken down into small task that can be assigned to individual processors at program level  Need operating system support  Different architecture, different programming method

42 Chapter 10 - Introduction to Parallel processing 42 240-208 Fundamental of Computer Architecture A sequential program to compute the dot product integer array a[1..N], b[1..N] integer dot_product. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N] for k:= 1 to N dot_product := dot_product + x[k] * y[k] end for end do_dot

43 Chapter 10 - Introduction to Parallel processing 43 240-208 Fundamental of Computer Architecture First attempt of 2- processor computation shared integer array a[1..N], b[1..N] shared integer dot_product shared lock dot_product_lock shared barrier done. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 create_thread (do_dot, a, b) do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N]) private integer id id := mypid() for k:= (id*N/2)+1 to (id+1)*N/2 lock (dot_product_lock) dot_product := dot_product + x[k] * y[k] unlock (dot_product_lock) end barrier (done) end do_dot

44 Chapter 10 - Introduction to Parallel processing 44 240-208 Fundamental of Computer Architecture An efficient 2-processor computation of a shared memory machine shared integer array a[1..N], b[1..N] shared integer dot_product shared lock dot_product_lock shared barrier done. read a[1..N] from vector_a read b[1..N] from vector_b dot_product := 0 create_thread (do_dot, a, b) do_dot (a,b) print dot_product. do_dot (integer array x[1..N], integer array y[1..N]) private integer local_dot_product private integer id id := mypid() local_dot_product := 0 for k:= (id*N/2)+1 to (id+1)*N/2 local_dot_product := local_dot_product + x[k] * y[k] end lock (dot_product_lock) dot_product := dot_product + local_dot_product unlock (dot_product_lock) barrier (done) end do_dot

45 Chapter 10 - Introduction to Parallel processing 45 240-208 Fundamental of Computer Architecture Performance considerations

46 Chapter 10 - Introduction to Parallel processing 46 240-208 Fundamental of Computer Architecture จบ บทที่ 10


Download ppt "Fundamental of Computer Architecture By Panyayot Chaikan 240-208 November 01, 2003."

Similar presentations


Ads by Google