1
Parallel Computing Lecture
2
Why Parallel?
Want to do more per unit of time
Because we care about performance
We want to better exploit the hardware resources
Divide and conquer
Load balancing (be careful)
3
Why Parallel?
Solve large problems
Regular PC using 1 core: X time
Regular PC using Y cores: Z time, where Z < X
Regular PC using Y cores plus additional features: P time, where P << Z and P <<< X
4
Speedup
5
Speedup Example 1
Objective: convert sequential code to parallel code.
Conditions: 30% of the execution time is spent in the part that can be parallelized. Assume that you can achieve a 100x speedup on the parallel portion.
Question 1: What is the total speedup?
Question 2: By what percentage does the execution time decrease?
Question 3: Assume now that you can reach an infinite speedup on the parallel portion. By what percentage does the execution time decrease?
Question 4: What is the total speedup in that case?
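One way to check the four questions is with Amdahl's law. The slide does not name it, so treat it as the assumed model here. The sketch below is illustrative only: `amdahl_speedup` and `time_reduction_percent` are hypothetical helper names, not part of the lecture.

```python
import math


def amdahl_speedup(parallel_fraction, portion_speedup):
    """Total speedup under Amdahl's law (assumed model, not named on the slide)."""
    serial_time = 1.0 - parallel_fraction
    # Dividing by math.inf gives 0.0, i.e. the parallel portion takes no time.
    parallel_time = parallel_fraction / portion_speedup
    return 1.0 / (serial_time + parallel_time)


def time_reduction_percent(total_speedup):
    """Percentage by which execution time shrinks for a given total speedup."""
    return (1.0 - 1.0 / total_speedup) * 100.0


# Conditions from the slide: 30% parallelizable, 100x on that portion.
s_100x = amdahl_speedup(0.30, 100.0)
# Questions 3 and 4: infinite speedup on the parallel portion.
s_inf = amdahl_speedup(0.30, math.inf)

print(f"Q1 total speedup:          {s_100x:.4f}x")
print(f"Q2 execution time down by: {time_reduction_percent(s_100x):.2f}%")
print(f"Q3 execution time down by: {time_reduction_percent(s_inf):.2f}%")
print(f"Q4 total speedup:          {s_inf:.4f}x")
```

Under this model the 100x case gives a total speedup of about 1.42x (execution time down 29.7%), and even an infinite speedup on the parallel portion only reaches about 1.43x (execution time down 30%), because the 70% serial part is untouched.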
6
Speedup Example 2
Objective: convert sequential code to parallel code.
Conditions: 99% of the execution time is spent in the part that can be parallelized. Assume that you can achieve a 100x speedup on the parallel portion.
Question 1: What is the total speedup?
Question 2: By what percentage does the execution time decrease?
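A short sketch of the arithmetic, again assuming Amdahl's law as the model (the slide does not state the formula). It runs the 30% case from Example 1 next to the 99% case to show how strongly the parallelizable fraction drives the result. The function name `total_speedup` is hypothetical.

```python
def total_speedup(parallel_fraction, portion_speedup):
    # Amdahl's law (assumed): the serial part is unchanged,
    # the parallel part shrinks by portion_speedup.
    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction / portion_speedup
    return 1.0 / (serial_time + parallel_time)


# Same 100x speedup on the parallel portion, two different fractions.
for fraction in (0.30, 0.99):
    speedup = total_speedup(fraction, 100.0)
    reduction = (1.0 - 1.0 / speedup) * 100.0
    print(f"{fraction:.0%} parallelizable: {speedup:.2f}x total, "
          f"execution time down {reduction:.2f}%")
```

With 99% of the time parallelizable, the total speedup comes out at about 50.25x and the execution time drops by 98.01%, compared with roughly 1.42x for the 30% case.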
7
Homogeneous Multicore Architectures.
8
Heterogeneous Multicore Architectures.
9
Heterogeneous Multicore Architectures.
The CBE (Cell Broadband Engine)
Have I used one before? Quite possibly.
10
Heterogeneous Multicore Architectures.
SIMD: Single Instruction, Multiple Data
13
SIMT: Single Instruction, Multiple Threads
Nvidia GPUs.
14
SIMT: Single Instruction, Multiple Threads
Nvidia GPUs.
This is a G80:
SP = Streaming Processor
SM = Streaming Multiprocessor
2 SMs = 1 building block
128 SPs, grouped as 16 SMs with 8 SPs each
768 threads per SM
768 threads x 16 SMs = 12,288 threads per chip
This is a GT200:
1024 threads per SM
~30K threads per chip
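The chip-level totals follow from multiplying the per-SM figures by the SM count. The sketch below redoes that arithmetic. The `GpuChip` class is a hypothetical helper, and the GT200 figures of 30 SMs with 8 SPs each are an assumption: the slide gives only 1024 threads per SM and the rounded total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuChip:
    name: str
    sm_count: int        # streaming multiprocessors (SM) per chip
    sp_per_sm: int       # streaming processors (SP) per SM
    threads_per_sm: int  # maximum resident threads per SM

    @property
    def total_sp(self):
        return self.sm_count * self.sp_per_sm

    @property
    def total_threads(self):
        return self.sm_count * self.threads_per_sm


# G80 figures as given on the slide.
g80 = GpuChip("G80", sm_count=16, sp_per_sm=8, threads_per_sm=768)
# Assumption: 30 SMs x 8 SPs for GT200; only threads_per_sm is from the slide.
gt200 = GpuChip("GT200", sm_count=30, sp_per_sm=8, threads_per_sm=1024)

for chip in (g80, gt200):
    print(f"{chip.name}: {chip.total_sp} SPs, "
          f"{chip.total_threads} threads per chip")
```

For the G80 this reproduces the 128 SPs and 12,288 threads from the slide. Under the assumed SM count, the GT200 comes to 30,720 threads, consistent with the slide's "~30K".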