Advanced Computer Networks
Lecture 1 - Parallelization
Scale increases complexity
From a single-core machine to a large-scale distributed system, each step brings more challenges:
– Single-core machine
– Multicore server: true concurrency
– Cluster: network, message passing, more failure modes (faulty nodes, ...)
– Wide-area network: even more failure modes; incentives, laws, ...
Parallelization
The algorithm below works fine on one core. Can we make it faster on multiple cores?
– Difficult - we need to find something for the other cores to do
– There are other sorting algorithms where this is much easier
– Not all algorithms are equally parallelizable

```java
void bubblesort(int[] nums) {
    boolean done = false;
    while (!done) {
        done = true;
        for (int i = 1; i < nums.length; i++) {
            if (nums[i-1] > nums[i]) {
                // Swap nums[i-1] and nums[i] in place; a swap(a, b) helper
                // on ints would not work in Java (arguments pass by value).
                int tmp = nums[i-1];
                nums[i-1] = nums[i];
                nums[i] = tmp;
                done = false;
            }
        }
    }
}
```
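To make the contrast concrete, here is a sketch (not from the slides) of a merge-style sort, which parallelizes much more naturally than bubblesort: the two halves of the array are independent, so a second core has something useful to do. The class and method names are illustrative.

```java
import java.util.Arrays;

// Sketch: a merge-style sort parallelizes naturally, unlike bubblesort.
// Each half can be sorted on its own core, then the results are merged.
public class ParallelSortSketch {
    static int[] parallelSort(int[] nums) {
        int mid = nums.length / 2;
        int[] left = Arrays.copyOfRange(nums, 0, mid);
        int[] right = Arrays.copyOfRange(nums, mid, nums.length);

        // The halves are independent, so another core can sort one of them.
        Thread t = new Thread(() -> Arrays.sort(left));
        t.start();
        Arrays.sort(right);   // this core sorts the other half meanwhile
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        // Merge the two sorted halves into one sorted array.
        int[] out = new int[nums.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length)  out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parallelSort(new int[]{5, 2, 9, 1, 7})));
    }
}
```

Splitting recursively (each half split again) would occupy more than two cores; this two-way version keeps the idea minimal.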
Parallelization
If we increase the number of processors, will the speed also increase?
– Yes, but (in almost all cases) only up to a point

Speedup = (completion time with one core) / (completion time with n cores)

[Figure: numbers sorted per second vs. cores used, showing the ideal and expected curves]
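The speedup definition above translates directly into code. The timings below are made-up illustrative values, not measurements:

```java
// Speedup = completion time with one core / completion time with n cores.
public class Speedup {
    static double speedup(double timeOneCore, double timeNCores) {
        return timeOneCore / timeNCores;
    }

    public static void main(String[] args) {
        double t1 = 120.0;  // seconds on one core (hypothetical)
        double t8 = 20.0;   // seconds on eight cores (hypothetical)
        // Ideal speedup on 8 cores would be 8; here we only reach 6.
        System.out.println(speedup(t1, t8));  // 6.0
    }
}
```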
Amdahl's law
Usually, not all parts of the algorithm can be parallelized.
Let f be the fraction of the algorithm that can be parallelized, and let S be the corresponding speedup. Then the overall speedup is

Speedup = 1 / ((1 - f) + f / S)

More generally, if part i takes fraction f_i of the running time and speeds up by a factor S_i, then Speedup = 1 / (sum over i of f_i / S_i).

[Figure: timeline with the sequential parts running on core #1 only, and the parallel part spread across cores #1-#6]
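Amdahl's law in its classic two-part form (one sequential fraction, one parallelizable fraction f with speedup S) can be written as a one-line function; the 90%/1000-core numbers below are an illustrative assumption:

```java
// Amdahl's law: if a fraction f of the work parallelizes with speedup S,
// the overall speedup is 1 / ((1 - f) + f / S).
public class Amdahl {
    static double overallSpeedup(double f, double s) {
        return 1.0 / ((1.0 - f) + f / s);
    }

    public static void main(String[] args) {
        // Even if 90% of the work parallelizes perfectly across 1000 cores,
        // the sequential 10% caps the overall speedup just below 10x.
        System.out.println(overallSpeedup(0.9, 1000.0));
    }
}
```

This is why the sequential fraction, not the core count, dominates the limit: as S grows without bound, the speedup approaches 1 / (1 - f).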
Amdahl's law
We are given a sequential task which is split into four consecutive parts P1, P2, P3 and P4, whose shares of the running time are 11%, 18%, 23% and 48% respectively. Then we are told that P1 does not speed up, so S1 = 1, while P2 speeds up 5×, P3 speeds up 20×, and P4 speeds up 1.6×. The new running time, as a fraction of the original, is:

0.11/1 + 0.18/5 + 0.23/20 + 0.48/1.6 = 0.11 + 0.036 + 0.0115 + 0.30 = 0.4575
Amdahl's law
That is a little less than 1/2 the original running time. The overall speed boost is 1 / 0.4575 ≈ 2.186, or a little more than double the original speed.
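The worked example above can be checked mechanically by summing f_i / S_i over the four parts:

```java
// The four-part example: runtime fractions 11%, 18%, 23%, 48% with
// speedups 1x, 5x, 20x, 1.6x respectively.
public class AmdahlExample {
    // New running time, as a fraction of the original: sum of f_i / S_i.
    static double newTimeFraction(double[] fraction, double[] speedup) {
        double t = 0.0;
        for (int i = 0; i < fraction.length; i++)
            t += fraction[i] / speedup[i];
        return t;
    }

    public static void main(String[] args) {
        double[] fraction = {0.11, 0.18, 0.23, 0.48};
        double[] speedup  = {1.0, 5.0, 20.0, 1.6};
        double newTime = newTimeFraction(fraction, speedup);
        System.out.println(newTime);        // ≈ 0.4575 of the original time
        System.out.println(1.0 / newTime);  // ≈ 2.186x overall speedup
    }
}
```

Note how the unsped-up 11% (P1) contributes almost a quarter of the new running time, even though it was the smallest part originally.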
Is more parallelism always better?
Increasing parallelism beyond a certain point can cause performance to decrease!
– Example: we need to send a message to each core to tell it what to do, and those messages travel back and forth

[Figure: numbers sorted per second vs. cores, showing the ideal and expected curves, the reality (often) curve that declines, and the sweet spot before the decline]
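One way to see why a sweet spot appears is a toy cost model (an assumption for illustration, not from the slides): splitting work W over n cores divides the computation but adds a fixed coordination cost c per core, so total time is W/n + c·n, which first falls and then rises as n grows:

```java
// Toy model: completionTime(n) = W / n + c * n. The W/n term shrinks with
// more cores, but the per-core coordination term c*n grows, so beyond some
// n, adding cores makes things slower.
public class SweetSpot {
    static double completionTime(double work, double overheadPerCore, int cores) {
        return work / cores + overheadPerCore * cores;
    }

    public static void main(String[] args) {
        double work = 10000.0, c = 1.0;  // hypothetical units
        int best = 1;
        for (int n = 1; n <= 1000; n++)
            if (completionTime(work, c, n) < completionTime(work, c, best))
                best = n;
        // Analytically the minimum is at n = sqrt(W / c) = 100 here.
        System.out.println(best);
    }
}
```

Real systems have messier overheads than c·n, but the qualitative shape (a sweet spot followed by decline) is the same one the slide's "reality" curve shows.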
Parallelization
What size of task should we assign to each core? Frequent coordination creates overhead:
– Cores need to send messages back and forth and wait for other cores...
– Result: cores spend most of their time communicating
– Bad: ask each core to sort three numbers
– Good: ask each core to sort a million numbers
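The bad/good contrast above can be quantified with a toy calculation (the per-item and per-task costs are made-up assumptions): if every task handed to a core carries a fixed messaging overhead, only large chunks amortize it well:

```java
// Toy illustration: each task dispatched to a core costs a fixed messaging
// overhead, so larger chunks spend a larger share of time on useful work.
public class Granularity {
    // Fraction of a task's total time spent on useful work
    // rather than on coordination overhead.
    static double usefulFraction(double workPerItem, int itemsPerTask,
                                 double overheadPerTask) {
        double work = workPerItem * itemsPerTask;
        return work / (work + overheadPerTask);
    }

    public static void main(String[] args) {
        double perItem = 0.001, overhead = 10.0;  // hypothetical costs
        // Bad: 3 items per task -> almost all time is overhead.
        System.out.println(usefulFraction(perItem, 3, overhead));
        // Good: a million items per task -> overhead is negligible.
        System.out.println(usefulFraction(perItem, 1_000_000, overhead));
    }
}
```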