Published by Britton Farmer. Modified over 9 years ago.
Principles of Parallel Programming, First Edition, by Calvin Lin and Lawrence Snyder. Chapter 1: Introduction
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Types of parallelism: like an assembly line; like a call center; like building a house
ILP (instruction-level parallelism): (a+b)*(c+d)
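The expression on this slide can be spelled out to show where the parallelism comes from. This is a minimal sketch, not taken from the book; the function name `ilp_example` and the temporaries `left` and `right` are illustrative assumptions.

```c
/* Evaluate (a + b) * (c + d).
   The two additions do not depend on each other, so a processor that
   can issue several instructions per cycle may execute them at the
   same time. Only the multiply has to wait for both results. */
int ilp_example(int a, int b, int c, int d)
{
    int left  = a + b;    /* independent of right */
    int right = c + d;    /* independent of left  */
    return left * right;  /* depends on both sums */
}
```

Calling `ilp_example(1, 2, 3, 4)` evaluates 3 * 7. Whether the two additions actually overlap is decided by the hardware and compiler, not by the source code.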
Computer types: supercomputers; clusters; cloud servers; grid computers
Concurrency vs. parallelism: similar but different. Parallelism can enhance sequential execution.
Figure 1.1
Figure 1.1 (cont.)
Figure 1.2 Summing in sequence. The order of combining a sequence of numbers (7, 3, 15, 10, 13, 18, 6, 4) when adding them to an accumulation variable.
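The sequential order in this caption corresponds to an ordinary accumulation loop. The sketch below is an illustration under assumed names (`sum_in_sequence`, `values`), not the book's listing.

```c
#include <stddef.h>

/* Add the values one at a time into a single accumulation variable.
   Each addition needs the result of the previous one, so the n - 1
   additions form a chain that cannot overlap. */
int sum_in_sequence(const int *values, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return sum;
}
```

For the eight numbers in the caption, the loop performs seven dependent additions.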
Figure 1.3 Summing in pairs. The order of combining a sequence of numbers (7, 3, 15, 10, 13, 18, 6, 4) by (recursively) combining pairs of values, then pairs of results, and so on.
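The pairwise order can be sketched as a recursive split into halves. This is a sequential illustration of the combining order only; the name `sum_in_pairs` is an assumption, and no threads are created here.

```c
#include <stddef.h>

/* Sum by splitting the range in half, summing each half, and adding
   the two results. The two recursive calls are independent of each
   other, which is what makes this order suitable for parallel
   execution. */
int sum_in_pairs(const int *values, size_t n)
{
    if (n == 0)
        return 0;
    if (n == 1)
        return values[0];

    size_t half = n / 2;
    int left  = sum_in_pairs(values, half);             /* first half  */
    int right = sum_in_pairs(values + half, n - half);  /* second half */
    return left + right;
}
```

With eight values the additions form a tree of depth three, compared with a chain of seven dependent additions in the sequential order.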
Figure 1.4
Figure 1.5 Organization of a multi-core computer system on which the experiments are run. Each processor has a private L1 cache; it shares an L2 cache with its “chip-mate” and shares an L3 cache with the other processors.
Figure 1.6 Schematic diagram of data allocation to threads. Allocations are consecutive indices.
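One way to compute such consecutive allocations is sketched below. The type `range_t`, the function `thread_range`, and the choice to give any leftover elements to the last thread are assumptions made for illustration; the figure itself does not specify how a remainder is handled.

```c
#include <stddef.h>

/* Half-open index range [start, end) owned by one thread. */
typedef struct {
    size_t start;
    size_t end;
} range_t;

/* Give each of num_threads threads a block of consecutive indices.
   Assumption: when length is not a multiple of num_threads, the last
   thread also takes the leftover elements. */
range_t thread_range(size_t length, size_t num_threads, size_t id)
{
    size_t per_thread = length / num_threads;
    range_t r;
    r.start = id * per_thread;
    r.end   = (id == num_threads - 1) ? length : r.start + per_thread;
    return r;
}
```

For example, with 16 elements and 4 threads, thread 1 owns indices 4 through 7.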
Figure 1.7 The first try at a Count 3s solution using threads.
Figure 1.7 The first try at a Count 3s solution using threads. (cont.)
Figure 1.8 One of several possible interleavings of references to the unprotected variable count, illustrating a race condition.
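The kind of interleaving the caption describes can be replayed deterministically in a single thread. The sketch below is an illustration, not the figure's exact sequence: `reg1` and `reg2` stand in for the private registers of two hypothetical threads, and `lost_update_demo` is an assumed name.

```c
/* An increment of a shared counter is really three steps:
   load the value, add one, store the result. This function replays
   one bad ordering of those steps for two threads. */
int lost_update_demo(void)
{
    int count = 0;   /* the shared, unprotected variable */
    int reg1, reg2;  /* each thread's private copy       */

    reg1 = count;     /* thread 1 loads 0                         */
    reg2 = count;     /* thread 2 loads 0 before thread 1 stores  */
    reg1 = reg1 + 1;  /* thread 1 computes 1                      */
    reg2 = reg2 + 1;  /* thread 2 computes 1                      */
    count = reg1;     /* thread 1 stores 1                        */
    count = reg2;     /* thread 2 stores 1, overwriting thread 1  */

    return count;     /* 1, although two increments were issued   */
}
```

With real threads the ordering varies from run to run, so the lost update appears only some of the time.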
Figure 1.9 The second try at a Count 3s solution showing the count3s_thread() with mutex protection for the count variable.
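The figure's listing is not reproduced in this transcript. The following is a minimal sketch of what mutex protection for `count` could look like with POSIX threads. Only `count3s_thread` and `count` come from the caption; `NUM_THREADS`, `count_lock`, the driver `count3s`, and the block allocation are assumptions.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS ((size_t)4)  /* assumed thread count */

static const int *array;         /* data to scan        */
static size_t length;            /* number of elements  */
static int count;                /* shared result       */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

static void *count3s_thread(void *arg)
{
    size_t id = (size_t)(uintptr_t)arg;
    size_t per_thread = length / NUM_THREADS;
    size_t start = id * per_thread;
    size_t end = (id == NUM_THREADS - 1) ? length : start + per_thread;

    for (size_t i = start; i < end; i++) {
        if (array[i] == 3) {
            /* Only one thread at a time may update count. */
            pthread_mutex_lock(&count_lock);
            count++;
            pthread_mutex_unlock(&count_lock);
        }
    }
    return NULL;
}

/* Returns the number of 3s, or -1 if a thread could not be created. */
int count3s(const int *data, size_t n)
{
    pthread_t threads[NUM_THREADS];
    size_t created = 0;

    array = data;
    length = n;
    count = 0;

    while (created < NUM_THREADS &&
           pthread_create(&threads[created], NULL, count3s_thread,
                          (void *)(uintptr_t)created) == 0)
        created++;
    for (size_t t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    return (created == NUM_THREADS) ? count : -1;
}
```

The result is correct, but the lock is acquired once for every 3 found, so the threads spend much of their time waiting on one another.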
Figure 1.10 Performance of our second Count 3s solution.
Figure 1.11 The count3s_thread() for our third Count 3s solution using private_count array elements.
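A sketch of the private-counter idea follows, assuming POSIX threads. `count3s_thread` and `private_count` are named in the caption; `NUM_THREADS`, `count_lock`, the driver `count3s`, and the detail of merging each private count under the lock at the end are assumptions, since the listing itself is not in this transcript.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS ((size_t)4)  /* assumed thread count */

static const int *array;
static size_t length;
static int count;                       /* shared total           */
static int private_count[NUM_THREADS];  /* one counter per thread */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

static void *count3s_thread(void *arg)
{
    size_t id = (size_t)(uintptr_t)arg;
    size_t per_thread = length / NUM_THREADS;
    size_t start = id * per_thread;
    size_t end = (id == NUM_THREADS - 1) ? length : start + per_thread;

    private_count[id] = 0;
    for (size_t i = start; i < end; i++) {
        if (array[i] == 3)
            private_count[id]++;  /* no lock: only this thread writes it */
    }

    /* Take the lock once per thread to merge the private result. */
    pthread_mutex_lock(&count_lock);
    count += private_count[id];
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Returns the number of 3s, or -1 if a thread could not be created. */
int count3s(const int *data, size_t n)
{
    pthread_t threads[NUM_THREADS];
    size_t created = 0;

    array = data;
    length = n;
    count = 0;

    while (created < NUM_THREADS &&
           pthread_create(&threads[created], NULL, count3s_thread,
                          (void *)(uintptr_t)created) == 0)
        created++;
    for (size_t t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    return (created == NUM_THREADS) ? count : -1;
}
```

Lock traffic drops to one acquisition per thread, but adjacent `private_count` elements are likely to sit in the same cache line, so writes by different threads can still interfere with each other.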
Figure 1.12 Performance results for our third Count 3s solution.
Figure 1.13
Figure 1.14 The count3s_thread() for our fourth solution to the Count 3s computations; the private count elements are padded to force them to be allocated to different cache lines.
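Padding of this kind can be sketched as follows. The 64-byte line size (`CACHE_LINE_BYTES`), the struct name `padded_int`, `NUM_THREADS`, `count_lock`, and the driver `count3s` are all assumptions for illustration; the actual line size depends on the machine, and the figure's listing is not in this transcript.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_THREADS ((size_t)4)  /* assumed thread count           */
#define CACHE_LINE_BYTES 64      /* assumed cache line size, bytes */

/* Each counter is followed by padding so that consecutive array
   elements are a full (assumed) cache line apart. */
struct padded_int {
    int value;
    char padding[CACHE_LINE_BYTES - sizeof(int)];
};

static const int *array;
static size_t length;
static int count;
static struct padded_int private_count[NUM_THREADS];
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

static void *count3s_thread(void *arg)
{
    size_t id = (size_t)(uintptr_t)arg;
    size_t per_thread = length / NUM_THREADS;
    size_t start = id * per_thread;
    size_t end = (id == NUM_THREADS - 1) ? length : start + per_thread;

    private_count[id].value = 0;
    for (size_t i = start; i < end; i++) {
        if (array[i] == 3)
            private_count[id].value++;  /* written by this thread only */
    }

    pthread_mutex_lock(&count_lock);
    count += private_count[id].value;   /* merge once per thread */
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Returns the number of 3s, or -1 if a thread could not be created. */
int count3s(const int *data, size_t n)
{
    pthread_t threads[NUM_THREADS];
    size_t created = 0;

    array = data;
    length = n;
    count = 0;

    while (created < NUM_THREADS &&
           pthread_create(&threads[created], NULL, count3s_thread,
                          (void *)(uintptr_t)created) == 0)
        created++;
    for (size_t t = 0; t < created; t++)
        pthread_join(threads[t], NULL);

    return (created == NUM_THREADS) ? count : -1;
}
```

Because the counters are spaced one assumed line apart, no two of them can fall in the same 64-byte line, at the cost of some unused memory per thread.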
Figure 1.15
Figure 1.16 Performance for our fourth solution to the Count 3s problem on an array that does not contain any 3s suggests that memory bandwidth limitations are preventing performance gains for eight processors.