Optimizing the trace transform Using OpenMP and CUDA Tim Besard 2013-06-19.


1 Optimizing the trace transform Using OpenMP and CUDA Tim Besard 2013-06-19

2 Trace transform needs to be real-time
MATLAB
– Slow
– Difficult to optimize
C++ base implementation
– Allows for optimizations

3 Optimizing the trace transform
How to parallelize?
OpenMP
CUDA
Performance

4 How to parallelize?
[Figure: diagram; only the label p survives in the transcript]

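5 Coarse-grained parallelism
[Figure: the image is rotated over each angle from 0° to 359°; the T-functionals applied to each rotated image yield one sinogram row, and the rows together form the sinogram]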
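The structure in this diagram (one rotation plus T-functionals per angle, each producing one sinogram row) could be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the `Image` type and the `sinogram` function are hypothetical names, the only T-functional shown is a plain sum (which gives the Radon transform), and the rotation uses nearest-neighbour sampling for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Square grayscale image stored row-major; a hypothetical minimal type.
struct Image {
    int size;
    std::vector<float> pixels;
    float at(int row, int col) const {
        // Pixels outside the image count as zero (background).
        if (row < 0 || row >= size || col < 0 || col >= size) return 0.0f;
        return pixels[static_cast<std::size_t>(row) * size + col];
    }
};

// Builds a sinogram with one row per angle (0..359 degrees) and one column
// per trace line. Assumption: the T-functional is a plain sum.
std::vector<float> sinogram(const Image& img) {
    const int angles = 360;
    const double pi = std::acos(-1.0);
    const double center = (img.size - 1) / 2.0;
    std::vector<float> out(static_cast<std::size_t>(angles) * img.size, 0.0f);

    // Coarse-grained parallelism: every angle is independent and writes
    // only its own sinogram row, so no synchronization is needed.
    #pragma omp parallel for
    for (int a = 0; a < angles; ++a) {
        const double c = std::cos(a * pi / 180.0);
        const double s = std::sin(a * pi / 180.0);
        for (int col = 0; col < img.size; ++col) {
            float sum = 0.0f;
            for (int row = 0; row < img.size; ++row) {
                // Rotate the sample position around the image center
                // and look up the nearest source pixel.
                const double x = col - center, y = row - center;
                const int src_col = static_cast<int>(std::lround(c * x - s * y + center));
                const int src_row = static_cast<int>(std::lround(s * x + c * y + center));
                sum += img.at(src_row, src_col);
            }
            out[static_cast<std::size_t>(a) * img.size + col] = sum;
        }
    }
    return out;
}
```

Because the pragma is ignored when OpenMP is disabled, the same source also serves as the sequential baseline.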

6 How to parallelize?
Fine-grained parallelism
– Rotation
– Functionals: prefix sum
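The slide names the prefix sum as the parallel building block for the functionals but does not say which scan algorithm was used. As one possible illustration, the sketch below uses the Hillis-Steele formulation, in which each pass updates all elements independently. The function name `inclusive_scan_steps` is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Inclusive prefix sum in the Hillis-Steele style: about log2(n) passes,
// each of which updates all elements independently of one another.
std::vector<float> inclusive_scan_steps(std::vector<float> data) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());
    std::vector<float> next(data.size());
    for (std::ptrdiff_t stride = 1; stride < n; stride *= 2) {
        // Fine-grained parallelism: every i could be handled by its own
        // thread, since each iteration reads `data` and writes `next`.
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < n; ++i) {
            next[i] = (i >= stride) ? data[i] + data[i - stride] : data[i];
        }
        data.swap(next);
    }
    return data;
}
```

For the input `{1, 2, 3, 4, 5}` the passes with strides 1, 2 and 4 produce `{1, 3, 6, 10, 15}`. A functional that needs, say, the position where the running sum crosses half the total can then read it off the scanned row.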

7 OpenMP
Compiler directives
– #pragma omp parallel for
– #pragma omp critical
– #pragma omp barrier
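To show how these directives fit together, here is a small self-contained sketch unrelated to the trace transform itself. The `normalize` function is a hypothetical example, and it splits `parallel for` into separate `parallel` and `for` constructs so that the barrier has something to do.

```cpp
#include <vector>

// Sums the values and normalizes them in place so that they add up to 1.
// Returns the original sum. Compiles with or without OpenMP enabled.
float normalize(std::vector<float>& values) {
    const int n = static_cast<int>(values.size());
    float total = 0.0f;

    #pragma omp parallel
    {
        float local = 0.0f;  // per-thread partial sum

        // Split the loop iterations across the threads of the team.
        #pragma omp for
        for (int i = 0; i < n; ++i) local += values[i];

        // Only one thread at a time may update the shared total.
        #pragma omp critical
        total += local;

        // Wait until every thread has added its partial sum.
        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < n; ++i) values[i] /= total;
    }
    return total;
}
```

In practice a `reduction(+:total)` clause would replace the critical section; the longer form is used here only to exercise all three directives from the slide.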

8 OpenMP
Compiler directives
Address coarse-grained parallelism
– Unobtrusive
– Significant overhead
5× speed-up
– 8-core machine
– Unoptimized
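As a back-of-the-envelope reading of the 5× figure, assume that all 8 cores were in use and that Amdahl's law is an adequate model (the slide states neither). With speed-up S, core count N, parallel efficiency E and parallelizable fraction f:

```latex
E = \frac{S}{N} = \frac{5}{8} = 0.625,
\qquad
S = \frac{1}{(1 - f) + f/N}
\;\Rightarrow\;
f = \frac{1 - 1/S}{1 - 1/N} = \frac{0.8}{0.875} \approx 0.91
```

Under those assumptions, roughly 9% of the run time behaves as serial work or overhead, which is consistent with the "significant overhead" remark.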

9 CUDA
Parallel computing platform
Programming model
– Lightweight threads
– Massively parallel
Address fine-grained parallelism
– Pixel-centric approach
– Complete re-implementation
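One plausible reading of "pixel-centric" is that every output pixel is computed by its own lightweight thread. The sketch below expresses that idea in plain C++ rather than CUDA: `rotated_pixel` plays the role of the per-thread kernel body and the loops in `rotate` stand in for the thread grid. Both names, and the choice of bilinear interpolation, are assumptions made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Computes one pixel of the rotated image by looking up where it came from
// in the source image. It only reads shared input, so it maps naturally
// onto one thread per output pixel.
float rotated_pixel(const std::vector<float>& src, int size,
                    int row, int col, double radians) {
    const double center = (size - 1) / 2.0;
    const double x = col - center, y = row - center;
    const double sx = std::cos(radians) * x - std::sin(radians) * y + center;
    const double sy = std::sin(radians) * x + std::cos(radians) * y + center;

    // Bilinear interpolation between the four surrounding source pixels;
    // positions outside the image contribute zero.
    const int x0 = static_cast<int>(std::floor(sx));
    const int y0 = static_cast<int>(std::floor(sy));
    const double fx = sx - x0, fy = sy - y0;
    auto sample = [&](int r, int c) -> double {
        if (r < 0 || r >= size || c < 0 || c >= size) return 0.0;
        return src[static_cast<std::size_t>(r) * size + c];
    };
    const double top = (1 - fx) * sample(y0, x0) + fx * sample(y0, x0 + 1);
    const double bottom = (1 - fx) * sample(y0 + 1, x0) + fx * sample(y0 + 1, x0 + 1);
    return static_cast<float>((1 - fy) * top + fy * bottom);
}

// Stand-in for a kernel launch: on a GPU the two loops would disappear and
// each (row, col) pair would be handled by its own thread.
// Assumes src holds size * size pixels in row-major order.
std::vector<float> rotate(const std::vector<float>& src, int size, double radians) {
    std::vector<float> dst(src.size());
    for (int row = 0; row < size; ++row)
        for (int col = 0; col < size; ++col)
            dst[static_cast<std::size_t>(row) * size + col] =
                rotated_pixel(src, size, row, col, radians);
    return dst;
}
```

Writing the computation from the point of view of the output pixel avoids write conflicts between threads, which is one reason a CPU implementation organised around whole images tends to need a re-implementation rather than a port.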

10 CUDA
Low-level details matter a lot
– Memory access patterns
– Branch divergence
10× speed-up
– GeForce GTX TITAN (20% usage)
– Unoptimized

11 Performance for 10 signatures
[Chart: execution time in milliseconds for MEX, C++, OpenMP and CUDA]

12 Future work
Optimize CUDA
– Compare against state of the art
Julia implementation
– Algorithmic IR

13 Optimizing the trace transform Using OpenMP and CUDA Tim Besard 2013-06-19

