Presentation on theme: "Distributed and Parallel Processing George Wells."— Presentation transcript:

1 Distributed and Parallel Processing George Wells

2 Performance Analysis
Sources of performance loss:
– Overheads – not present in the sequential program
– Sequential code
– Idle processors (load balancing)
– Contention for resources

3 Parallel Computing Terminology The worst-case time complexity of a parallel algorithm is a function f(n) giving the maximum, over all inputs of size n, of the time taken. The cost of a parallel algorithm is its complexity multiplied by p, the number of processors.

4 Parallel Computing Terminology The speedup achieved by a parallel algorithm running on p processors is the ratio between the time taken by the parallel computer executing the fastest sequential algorithm and the time taken executing the parallel algorithm on p processors. The efficiency of a parallel algorithm running on p processors is speedup / p. For example, if the fastest sequential algorithm takes 100 s and the parallel algorithm takes 25 s on 8 processors, the speedup is 4 and the efficiency is 50%.

5 Speedup
Linear speedup: s = p
– Efficiency = 100%
Superlinear speedup: s > p
– Unusual (but wonderful!)
Usually:
– s < p
– Efficiency < 100%

6 Amdahl’s Law If f is the inherently sequential fraction of a computation to be solved by p processors, then the speedup s is limited according to the formula:
s ≤ 1 / (f + (1 − f) / p)

7 The ≤ symbol accounts for both sequential overhead and non-ideal load balancing. In the formula, f is the fraction of the overall computation that is inherently sequential, and (1 − f)/p is the part that can be done in parallel, shared among the p processors.

8 Assume f = 0: then s ≤ 1 / (0 + (1 − 0)/p) = p, i.e. you can’t get better than linear speedup (according to Amdahl)

9 This time try substituting f = 0.1 and p = ∞: s ≤ 1 / (0.1 + 0.9/∞) = 1 / 0.1 = 10, no matter how many processors are used. The sequential component is a limiting factor
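To see these limits concretely, here is a minimal Java sketch (not from the slides; the class and method names are mine) that evaluates Amdahl’s bound for a few values of f and p:

// Evaluates Amdahl's bound s <= 1 / (f + (1 - f) / p).
public class AmdahlDemo {
    static double amdahlBound(double f, int p) {
        return 1.0 / (f + (1.0 - f) / p);
    }

    public static void main(String[] args) {
        System.out.println(amdahlBound(0.0, 8));    // 8.0  -- linear speedup when f = 0
        System.out.println(amdahlBound(0.1, 8));    // ~4.7
        System.out.println(amdahlBound(0.1, 1000)); // ~9.9 -- nearing the 1/f = 10 ceiling
    }
}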

10 Suppose that a parallel computer E has been built by using the herd-of-elephants approach, where each processor in the computer is capable of performing sequential operations at the rate of X megaflops. Suppose that another parallel computer A has been built by using the army-of-ants approach; each processor in this computer is capable of performing sequential operations at the rate of βX megaflops, for some β where 0 < β << 1. If parallel computer A attempts a computation whose inherently sequential fraction f is greater than β, then A will execute the algorithm more slowly than a single processor of parallel computer E. Why? Because the time taken to execute the sequential part on A is greater than the time taken to execute the entire algorithm on E: for a computation of W operations, the sequential part alone takes fW/(βX) time on one of A's processors, which exceeds the W/X needed to run everything on a single processor of E whenever f > β.

11 If a parallel computer built by using the army-of-ants approach is to be successful, one of the following conditions must be true:
– at least one processor must be capable of extremely fast sequential computations, or
– the problem being solved must admit a solution containing virtually no sequential components.
Computers capable of high processing speeds must have high memory bandwidth and high I/O rates too.

12 Multithreading

13 Multithreading Example: Mandelbrot Set
A fractal calculation – embarrassingly parallel
– The value of each point in a 2D space can be calculated independently of any other
– Values are false-coloured to give a graphical representation

14 Mandelbrot Set

15 Sequential Code A little complicated by graphics event handling in Java
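The slides' sequential code is not reproduced in this transcript, but the core escape-time calculation looks something like the following sketch (the constant and method names are my assumptions, and the graphics event handling is omitted):

// Sketch: the per-point escape-time calculation at the heart of the
// Mandelbrot computation. Iterate z = z*z + c for the point c = (cr, ci)
// until |z| > 2 or the iteration limit is reached.
public class MandelbrotCore {
    static final int MAX_ITERATIONS = 1000;

    static int escapeTime(double cr, double ci) {
        double zr = 0.0, zi = 0.0;
        int iter = 0;
        while (zr * zr + zi * zi <= 4.0 && iter < MAX_ITERATIONS) {
            double tmp = zr * zr - zi * zi + cr; // real part of z*z + c
            zi = 2.0 * zr * zi + ci;             // imaginary part of z*z + c
            zr = tmp;
            iter++;
        }
        return iter; // this count is what gets false-coloured
    }
}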

16 Parallel Code Can create threads to perform the calculations
Threads in Java:
– Implement the Runnable interface
  Specifies one method: public void run ();
– Create a Thread object
– Call its start method
  In turn calls the run method
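A minimal sketch of these steps (the class name and printed message are illustrative, not from the slides):

public class ThreadDemo {
    // Implementing Runnable: supply the one method it specifies, run().
    static class Worker implements Runnable {
        public void run() {
            System.out.println("Running in " + Thread.currentThread().getName());
        }
    }

    public static void main(String[] args) {
        Thread t = new Thread(new Worker()); // create the Thread object
        t.start();                           // start() arranges for run() to be called
    }
}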

17 Creating Threads in Java
Implement the Runnable interface
Can also extend the Thread class
– Override the run() method
The first approach is preferred
– Single-inheritance limitation in Java
– Better OO style: extending Thread implies an is-a relationship that usually does not hold
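For comparison, the second approach looks like this (again an illustrative sketch, not the slides' code):

// Extending Thread and overriding run(): it works, but it consumes Java's
// single superclass slot and asserts that a Worker2 "is a" Thread.
class Worker2 extends Thread {
    @Override
    public void run() {
        System.out.println("Running in " + getName());
    }
}
// Usage: new Worker2().start(); -- no separate Thread object is needed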

18 Java Threads We can think of a thread as having three parts:
– A virtual CPU (provides execution ability)
– Code
– Data

19 Mandelbrot Application
WorkerThread w = new WorkerThread(i, i+taskSize);
Thread t = new Thread(w);
t.start();
(Diagram: the Thread object t wrapping the WorkerThread object w)
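The WorkerThread class itself is not shown in the transcript; given the constructor call above, it presumably looks something like this sketch (field and method names are my assumptions):

// Sketch of a plausible WorkerThread: each worker computes the rows
// [start, end) of the image independently of the other workers.
class WorkerThread implements Runnable {
    private final int start, end;

    WorkerThread(int start, int end) {
        this.start = start;
        this.end = end;
    }

    public void run() {
        for (int row = start; row < end; row++) {
            computeRow(row); // per-pixel escape-time calculation for this row
        }
    }

    private void computeRow(int row) {
        // ... compute and store the colour of each pixel in the row ...
    }
}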

20 Threads in Java Simplified lifecycle:
– New → Runnable: start()
– Runnable → Running: chosen by the scheduler
– Running → Blocked: blocking event; Blocked → Runnable: unblocked
– Running → Dead: run() completes
Thread scheduling is preemptive, but not necessarily time-sliced
– Use Thread.sleep(x) or Thread.yield() to give other threads a chance to run
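A small sketch of voluntarily giving up the processor (illustrative only, not from the slides):

public class YieldDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            for (int i = 0; i < 5; i++) {
                System.out.println("working: " + i);
                Thread.yield(); // hint to the scheduler: let another runnable thread run
            }
        });
        worker.start();
        Thread.sleep(100); // block the main thread for ~100 ms, freeing the CPU
        System.out.println("main thread done");
    }
}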

