Published by Johnathan Benson. Modified over 9 years ago.
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign ECE408 / CS483 Applied Parallel Programming Lecture 24: Application Case Study – Electrostatic Potential Calculation Part 2
2
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Objective To learn how to apply parallel programming techniques to an application –Thread coarsening for more work efficiency –Data structure padding for reduced divergence –Memory access locality and pre-computation techniques
3
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Outline of A Fast Sequential Code for all z { for all atoms {pre-compute dz 2 } for all y { for all atoms {pre-compute dy 2 (+ dz 2 ) } for all x { for all atoms { compute contribution to current x,y,z point using pre-computed dy 2 + dz 2 } 3
4
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign More Thoughts on Fast Sequential Code Need temporary arrays for pre-calculated dz 2 and dy 2 + dz 2 values So, why does this code has better cache behaior on CPUs? 4
5
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Reuse Distance Calculation for More Computation Efficiency
6
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Thread Coarsening
7
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign A Compute Efficient Gather Kernel
8
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Thread Coarsening for More Computation Efficiency
9
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign Performance Comparison
10
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign More Work is Needed to Feed a GPU
11
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007-2012 ECE408/CS483, ECE 498AL, University of Illinois, Urbana-Champaign ANY QUESTIONS?