Parallel Computing Lecture

1 Parallel Computing Lecture

2 Why Parallel
Want to do more per unit of time
Because we care about performance
We want to better exploit the hardware resources
Divide and conquer
Load balancing (be careful)

3 Why Parallel
Solve large problems
Regular PC using 1 core: X time
Regular PC using Y cores: Z time, where Z < X
Regular PC using Y cores plus additional features: P time, where P << Z and P <<< X

4 Speedup

5 Speedup Example 1
Objective: convert sequential code to parallel code.
Conditions: 30% of the execution time is spent in the part that can be parallelized. Assume that you can achieve a 100x speedup on the parallel portion.
Question 1: What is the total speedup?
Question 2: By what percentage does the execution time decrease?
Question 3: Assume now that you can reach an infinite speedup on the parallel portion. By what percentage does the execution time decrease?
Question 4: What is the total speedup in that case?
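The four questions can be checked with a few lines of arithmetic. The sketch below assumes the intended model is Amdahl's law (the slide itself does not name a formula): the serial part of the run time is unchanged and the parallel part shrinks by the given factor. The function names `amdahl_speedup` and `time_decrease_percent` are illustrative, not taken from the lecture.

```python
import math


def amdahl_speedup(parallel_fraction, portion_speedup):
    """Total speedup when only `parallel_fraction` of the run time is accelerated.

    Assumes Amdahl's law: the serial fraction keeps its original cost and the
    parallel fraction is divided by `portion_speedup`.
    """
    serial_fraction = 1.0 - parallel_fraction
    # Dividing by math.inf gives 0.0, which models an "infinite" speedup.
    new_time = serial_fraction + parallel_fraction / portion_speedup
    return 1.0 / new_time


def time_decrease_percent(total_speedup):
    """Percentage by which the execution time drops for a given total speedup."""
    return (1.0 - 1.0 / total_speedup) * 100.0


# Example 1: 30% of the time is parallelizable.
for portion_speedup in (100.0, math.inf):
    total = amdahl_speedup(0.30, portion_speedup)
    print(f"portion speedup {portion_speedup}: "
          f"total speedup {total:.4f}, "
          f"time decrease {time_decrease_percent(total):.2f}%")
```

Under that assumption, a 100x speedup on the parallel portion gives a total speedup of about 1.42x (execution time drops by about 29.7%), and even an infinite speedup only reaches about 1.43x (a 30% drop), because the remaining 70% of the time is serial.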

6 Speedup Example 2
Objective: convert sequential code to parallel code.
Conditions: 99% of the execution time is spent in the part that can be parallelized. Assume that you can achieve a 100x speedup on the parallel portion.
Question 1: What is the total speedup?
Question 2: By what percentage does the execution time decrease?
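With 99% of the time parallelizable, the serial remainder still caps the result. The sketch below again assumes Amdahl's law as the model and also prints the upper bound 1 / (1 - p) that no portion speedup can exceed. The name `total_speedup` and the variable names are hypothetical.

```python
def total_speedup(p, s):
    """Amdahl's law (assumed model): p = parallel fraction, s = speedup of that part."""
    return 1.0 / ((1.0 - p) + p / s)


p = 0.99                                   # parallelizable fraction from the slide
speedup_100x = total_speedup(p, 100.0)     # Question 1
decrease_pct = (1.0 - 1.0 / speedup_100x) * 100.0  # Question 2
upper_bound = 1.0 / (1.0 - p)              # limit as s grows without bound

print(f"total speedup at 100x: {speedup_100x:.2f}")
print(f"execution time decrease: {decrease_pct:.2f}%")
print(f"upper bound on total speedup: {upper_bound:.2f}")
```

Under that assumption, the total speedup is about 50x rather than 100x, and the execution time drops by about 98%. The 1% serial part alone limits the total speedup to 100x.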

7 Homogeneous Multicore Architectures.

8 Heterogeneous Multicore Architectures.

9 Heterogeneous Multicore Architectures.
The CBE (Cell Broadband Engine). Have I used one before? Quite possibly.

10 Heterogeneous Multicore Architectures.
SIMD: Single Instruction, Multiple Data

11 Heterogeneous Multicore Architectures.
SIMD: Single Instruction, Multiple Data

12 Heterogeneous Multicore Architectures.
SIMD: Single Instruction, Multiple Data

13 SIMT: Single Instruction, Multiple Threads
Nvidia GPUs.

14 SIMT: Single Instruction, Multiple Threads
Nvidia GPUs.
This is a G80:
SP = Streaming Processor
SM = Streaming Multiprocessor
2 SMs = 1 building block
128 SPs, grouped as 16 SMs with 8 SPs each
768 threads per SM
768 threads x 16 SMs = 12,288 threads per chip
This is a GT200:
1024 threads per SM
~30K threads per chip
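The thread counts on this slide are simple products, sketched below. The G80 figures (16 SMs, 8 SPs per SM, 768 threads per SM) and the GT200 figure of 1024 threads per SM come from the slide. The GT200 SM count of 30 is not stated on the slide and is an assumption here, chosen because it matches the "~30K threads" figure. The function name `resident_threads` is illustrative.

```python
def resident_threads(sm_count, threads_per_sm):
    """Maximum number of threads resident on the chip at once."""
    return sm_count * threads_per_sm


# G80: all numbers taken from the slide.
g80_sm_count = 16
g80_sp_total = g80_sm_count * 8            # 8 SPs per SM
g80_threads = resident_threads(g80_sm_count, 768)

# GT200: 1024 threads per SM is from the slide;
# the SM count of 30 is an assumption, not stated on the slide.
gt200_sm_count = 30
gt200_threads = resident_threads(gt200_sm_count, 1024)

print(f"G80:   {g80_sp_total} SPs, {g80_threads} threads per chip")
print(f"GT200: {gt200_threads} threads per chip")
```

With those inputs the G80 totals 128 SPs and 12,288 threads per chip, and the GT200 comes to 30,720 threads, consistent with the slide's rounded figure.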

