Published by Arline McDowell. Modified over 8 years ago.
1
Concurrency Idea
2
Concurrency Idea
Challenge
– Print primes from 1 to 10^10
Given
– Ten-processor multiprocessor
– One thread per processor
Goal
– Get ten-fold speedup (or close)
3
Load Balancing
Split the work evenly
Each thread tests a range of 10^9 numbers
[Figure: the interval 1 … 10^10 cut at 10^9, 2·10^9, … into ten equal ranges, assigned to P0, P1, …, P9]
4
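Procedure for Thread i
  void primePrint() {
    int i = ThreadID.get();        // thread IDs in {0..9}
    long block = 1_000_000_000L;   // 10^9 numbers per thread
    // Thread i tests i*10^9 + 1 through (i+1)*10^9, inclusive.
    for (long j = i * block + 1; j <= (i + 1) * block; j++) {
      if (isPrime(j))
        print(j);
    }
  }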
5
Issues
Higher ranges have fewer primes
Yet larger numbers are harder to test
Thread workloads
– Uneven
– Hard to predict
Static split rejected: need dynamic load balancing
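The uneven workloads can be seen on a scaled-down version of the static split. The sketch below is an illustration under stated assumptions, not the course's code: the class name `StaticSplitDemo`, the trial-division `isPrime`, and the block size of 1,000 (instead of 10^9) are all hypothetical choices made so the program finishes quickly.

```java
import java.util.Arrays;

// Hypothetical demo class; names and sizes are illustrative, not from the slides.
class StaticSplitDemo {
    // Trial division: the cost of one test grows with the size of n.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts the primes thread i finds in its static range
    // [i*block + 1, (i+1)*block].
    static int countInRange(int i, long block) {
        int count = 0;
        for (long j = i * block + 1; j <= (i + 1) * block; j++) {
            if (isPrime(j)) count++;
        }
        return count;
    }

    // One thread per range; each thread writes only its own array slot.
    static int[] primesPerThread(int threads, long block) throws InterruptedException {
        int[] counts = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> counts[id] = countInRange(id, block));
            workers[i].start();
        }
        for (Thread t : workers) t.join();   // join makes each slot visible here
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Prints the number of primes found by each of the ten threads.
        System.out.println(Arrays.toString(primesPerThread(10, 1_000)));
    }
}
```

The per-thread counts shrink as the ranges get higher, while each individual test gets more expensive, so the work per thread is hard to predict from the range alone.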
7
Shared Counter
Each thread takes a number
[Figure: a shared counter handing out the successive values 17, 18, 19]
8
Procedure for Thread i
  Counter counter = new Counter(1);   // shared counter object
  void primePrint() {
    long j = 0;
    long limit = 10_000_000_000L;     // 10^10
    while (j < limit) {
      j = counter.getAndIncrement();
      if (isPrime(j))
        print(j);
    }
  }
10
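Where Things Reside
[Figure labels: cache, bus, cache, shared memory, shared counter (value 1), code, local variables]
  void primePrint() {
    int i = ThreadID.get();        // thread IDs in {0..9}
    long block = 1_000_000_000L;   // 10^9 numbers per thread
    for (long j = i * block + 1; j <= (i + 1) * block; j++) {
      if (isPrime(j))
        print(j);
    }
  }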
11
Procedure for Thread i
  Counter counter = new Counter(1);
  void primePrint() {
    long j = 0;
    long limit = 10_000_000_000L;        // 10^10
    while (j < limit) {                  // stop when every value is taken
      j = counter.getAndIncrement();     // increment and return each new value
      if (isPrime(j))
        print(j);
    }
  }
13
Counter Implementation
  public class Counter {
    private long value;
    public long getAndIncrement() {
      return value++;
    }
  }
OK for a single thread, not for concurrent threads
15
What It Means
  public class Counter {
    private long value;
    public long getAndIncrement() {
      return value++;
    }
  }
The statement return value++ stands for three separate steps:
  temp = value;
  value = temp + 1;
  return temp;
17
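Not so good…
[Figure: timeline of the shared value, starting at 1, under interleaved operations: read 1, read 1, write 2, read 2, write 3, write 2. The value becomes 2, then 3, then 2 again.]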
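The interleaving in the timeline can be replayed deterministically in a single thread. This is a sketch for illustration only: the class name `LostUpdateReplay` and the assignment of operations to threads A, B, and C are assumptions, since the slide does not say which thread performs which step.

```java
// Hypothetical replay of the timeline's interleaving. Each "thread" is just a
// local temp variable, so the outcome is the same on every run.
class LostUpdateReplay {
    static long value;

    static long replay() {
        value = 1;
        long tempA = value;   // A: read 1
        long tempB = value;   // B: read 1
        value = tempA + 1;    // A: write 2
        long tempC = value;   // C: read 2
        value = tempC + 1;    // C: write 3
        value = tempB + 1;    // B: write 2, overwriting the updates by A and C
        return value;
    }

    public static void main(String[] args) {
        // Three increments started from 1, yet the final value is 2, not 4.
        System.out.println(replay());
    }
}
```

Thread B wrote a value computed from a stale read, so two of the three increments were lost.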
18
Is this problem inherent?
If we could only glue reads and writes together…
[Figure: read, write, read, write on a timeline, marked "!!"]
19
Challenge
  public class Counter {
    private long value;
    public long getAndIncrement() {
      long temp = value;
      value = temp + 1;
      return temp;
    }
  }
Make these steps atomic (indivisible)
Hardware solution: ReadModifyWrite() instruction
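In Java, one way to get an atomic read-modify-write without writing the instruction yourself is `java.util.concurrent.atomic.AtomicLong`, whose `getAndIncrement()` performs the read, add, and write as one indivisible step. The wrapper below is a minimal sketch: `AtomicCounter` and `hammer` are hypothetical names, not part of the slides.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter built on AtomicLong; a sketch, not the slides' Counter.
class AtomicCounter {
    private final AtomicLong value;

    AtomicCounter(long initial) {
        value = new AtomicLong(initial);
    }

    // Atomically returns the current value and adds one to it.
    long getAndIncrement() {
        return value.getAndIncrement();
    }

    long get() {
        return value.get();
    }

    // Starts `threads` workers that each increment the counter `perThread`
    // times, then returns the final counter value.
    static long hammer(int threads, int perThread) {
        AtomicCounter c = new AtomicCounter(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) c.getAndIncrement();
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        // No increment is lost: the result is 1 + 4 * 10000.
        System.out.println(hammer(4, 10_000));
    }
}
```

With the plain `value++` version, the same experiment can return a smaller number, because concurrent increments may overwrite each other.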
22
An Aside: Java™
  public class Counter {
    private long value;
    public long getAndIncrement() {
      long temp;
      synchronized (this) {    // synchronized block: mutual exclusion
        temp = value;
        value = temp + 1;
      }
      return temp;
    }
  }
Synchronized block
Mutual exclusion
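Putting the pieces together, the following sketch runs the shared-counter scheme on a small range. It rests on assumptions that are not in the slides: the names `SyncCounter` and `DynamicSplitDemo`, a trial-division `isPrime`, counting primes instead of printing them, and a limit of 1,000 instead of 10^10. It uses a synchronized method, which locks the same object as a `synchronized (this)` block around the method body.

```java
// Hypothetical thread-safe counter; the synchronized method gives mutual exclusion.
class SyncCounter {
    private long value;

    SyncCounter(long initial) {
        value = initial;
    }

    synchronized long getAndIncrement() {
        return value++;
    }
}

// Hypothetical driver for dynamic load balancing with a shared counter.
class DynamicSplitDemo {
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts primes in [1, limit) using `threads` workers that take numbers
    // from one shared counter.
    static int countPrimes(long limit, int threads) {
        SyncCounter counter = new SyncCounter(1);
        int[] found = new int[threads];           // one slot per thread, no sharing
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                long j;
                // Check the bound right after taking a number, so no value
                // at or above the limit is ever tested.
                while ((j = counter.getAndIncrement()) < limit) {
                    if (isPrime(j)) found[id]++;
                }
            });
            workers[i].start();
        }
        int total = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();                // makes found[i] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            total += found[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1_000, 4));
    }
}
```

A thread that draws cheap numbers simply comes back for more, so no thread sits idle while another works through an expensive range.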
25
Why do we care?
We want as much of the code as possible to execute concurrently (in parallel)
A larger sequential part implies reduced performance
Amdahl’s law: this relation is not linear…
26
Amdahl’s Law
$$\text{Speedup} = \frac{1}{(1 - p) + \frac{p}{n}}$$
Speedup: of computation given n CPUs instead of 1
p: parallel fraction
1 - p: sequential fraction
n: number of processors
31
Example
Ten processors
60% concurrent, 40% sequential
How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.6) + \frac{0.6}{10}} \approx 2.17$$
33
Example
Ten processors
80% concurrent, 20% sequential
How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.8) + \frac{0.8}{10}} \approx 3.57$$
35
Example
Ten processors
90% concurrent, 10% sequential
How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.9) + \frac{0.9}{10}} \approx 5.26$$
37
Example
Ten processors
99% concurrent, 1% sequential
How close to 10-fold speedup?
$$\text{Speedup} = \frac{1}{(1 - 0.99) + \frac{0.99}{10}} \approx 9.17$$
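The four examples can be recomputed with a few lines of code. This is a minimal sketch; the class name `Amdahl`, the method name `speedup`, and the symbols p (parallel fraction) and n (number of processors) are naming assumptions made for the illustration.

```java
// Hypothetical helper for Amdahl's law; p is the parallel fraction of the
// computation and n is the number of processors.
class Amdahl {
    static double speedup(double p, int n) {
        // The sequential part (1 - p) runs at full cost; the parallel part
        // p is divided among n processors.
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // The four cases from the examples, all on ten processors.
        for (double p : new double[] {0.60, 0.80, 0.90, 0.99}) {
            System.out.printf("p = %.2f -> speedup %.2f%n", p, speedup(p, 10));
        }
    }
}
```

Going from 90% to 99% concurrent nearly doubles the speedup on ten processors, which is why the last few percent of sequential code matter so much.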
39
Back to Real-World Multicore Scaling
[Figure: multicore speedup of user code: 1.8x, 2x, 2.9x]
Not reducing the sequential % of code
41
Shared Data Structures
[Figure: two panels, Coarse Grained and Fine Grained, each showing work that is 75% unshared and 25% shared]
The reason we get only 2.9 speedup
Fine-grained parallelism has a huge performance benefit
42
Diminishing Returns
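One way to read this title, assuming it refers to Amdahl's law with p as the parallel fraction and n as the number of processors (symbols chosen here for illustration), is that adding processors helps less and less because the sequential part caps the speedup:

```latex
\lim_{n \to \infty} \frac{1}{(1 - p) + \frac{p}{n}} = \frac{1}{1 - p}
```

For example, with p = 0.9 the speedup can never exceed 10, however many processors are added; ten processors already give about 5.26, and a hundred give only about 9.17.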
43
Multiprocessor Programming
This is what this course is about…
– The % that is not easy to make concurrent yet may have a large impact on overall speedup