CS 179: Lecture 2 Lab Review 1


1 CS 179: Lecture 2 Lab Review 1

2 The Problem
- Add two arrays
- A[] + B[] -> C[]

3 GPU Computing: Step by Step
- Set up inputs on the host (CPU-accessible memory)
- Allocate memory for inputs on the GPU
- Copy inputs from host to GPU
- Allocate memory for outputs on the host
- Allocate memory for outputs on the GPU
- Start GPU kernel
- Copy output from GPU to host
- (Copying can be asynchronous)
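The steps above can be sketched as a single host function. This is a minimal illustration, not the lecture's own code: the names `addKernel` and `addOnGpu` are hypothetical, the `i < n` guard in the kernel is an addition of this sketch, both inputs are assumed to have the same length, and error checking of the CUDA calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Element-wise add, one thread per element. The guard (i < n) is an
// assumption of this sketch so that any array length is safe.
__global__ void addKernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical helper that walks through the listed steps in order.
// Assumes a.size() == b.size(); return codes are not checked here.
std::vector<float> addOnGpu(const std::vector<float>& a,
                            const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Inputs already live on the host; allocate the host output buffer.
    std::vector<float> c(n);
    if (n == 0) return c;

    // Allocate GPU memory for inputs and output.
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);

    // Copy inputs from host to GPU.
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    // Start the kernel with enough blocks to cover n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    addKernel<<<blocks, threads>>>(dA, dB, dC, n);

    // Copy the output back; this blocking copy waits for the kernel.
    cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return c;
}
```

Calling `addOnGpu({1, 2, 3}, {10, 20, 30})` would return the element-wise sums. The copies here are synchronous; an asynchronous variant would use `cudaMemcpyAsync` with streams, which this sketch leaves out.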

4 The Kernel
- Determine a thread index from block ID and thread ID within a block:
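The slide's code is not part of this transcript. As a hedged sketch of the usual 1D pattern, each thread can combine `blockIdx.x`, `blockDim.x`, and `threadIdx.x` into one global index. The kernel `recordIndex` and the helper `globalIndices` are hypothetical names used only for illustration; each thread simply writes its own index so the mapping can be inspected on the host.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread computes its global index and stores it at that position.
__global__ void recordIndex(int* out) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = idx;
}

// Hypothetical helper: launch blocks * threadsPerBlock threads and return
// what each one computed. Error checking is omitted for brevity.
std::vector<int> globalIndices(int blocks, int threadsPerBlock) {
    int n = blocks * threadsPerBlock;
    std::vector<int> host(n, -1);
    int* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(int));
    recordIndex<<<blocks, threadsPerBlock>>>(dev);
    cudaMemcpy(host.data(), dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

With 4 blocks of 8 threads, thread 5 of block 1 lands on index 1 * 8 + 5 = 13, so every slot of the returned array holds its own position.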

5 Calling the Kernel …

6 CUDA implementation (2)

7 Fixing the Kernel
- For large arrays, our kernel doesn’t work!
- Bounds-checking: be on the lookout!
- Also, need a way for kernel to handle a few more elements…

8 Fixing the Kernel – Part 1

9 Fixing the Kernel – Part 2

10 Fixing our Call
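The code for these fixes is missing from the transcript, so the following is one possible reading rather than the lecture's version: a bounds check folded into a grid-stride loop, so each thread handles several elements when the array is larger than the grid, and a launch that rounds the block count up and caps it. The names `addKernelStrided` and `addLarge` and the `maxBlocks` cap are assumptions of this sketch.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

// The loop condition (i < n) is the bounds check; the stride lets one
// thread process several elements when n exceeds the thread count.
__global__ void addKernelStrided(const float* a, const float* b,
                                 float* c, int n) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper. Assumes a.size() == b.size(); error checking
// is omitted for brevity.
std::vector<float> addLarge(const std::vector<float>& a,
                            const std::vector<float>& b,
                            int maxBlocks = 64) {
    int n = static_cast<int>(a.size());
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    std::vector<float> c(n);
    if (n == 0) return c;

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    // Round up so a partial last block is still launched, then cap.
    int threads = 256;
    int blocks = std::min((n + threads - 1) / threads, maxBlocks);
    addKernelStrided<<<blocks, threads>>>(dA, dB, dC, n);

    cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return c;
}
```

With 100,000 elements and the default cap, 64 blocks of 256 threads are launched, so each thread loops over about six elements.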

11 Lab 1!
- Sum of polynomials: a fun, parallelizable example!
- Suppose we have a polynomial P(r) with coefficients c_0, …, c_{n-1}, given by: P(r) = c_0 + c_1 r + c_2 r^2 + … + c_{n-1} r^{n-1}
- We want, for r_0, …, r_{N-1}, the sum: P(r_0) + P(r_1) + … + P(r_{N-1})
- Output condenses to one number!

12 Calculating P(r) once
- Pseudocode (one possible method):
    Given r, coefficients[]
    result <- 0.0
    power <- 1.0
    for all coefficient indices i from 0 to n-1:
        result += (coefficients[i] * power)
        power *= r
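The pseudocode above translates fairly directly into CUDA. In this sketch, `evalPoly` is a hypothetical name for a function usable on both host and device, and `evalPolyKernel` is an assumed wrapper that evaluates the polynomial at one point per thread; neither name comes from the lab itself.

```cuda
#include <cuda_runtime.h>

// Evaluate P(r) with a running power of r, as in the pseudocode.
// Marked __host__ __device__ so it can be checked on the CPU as well.
__host__ __device__ inline float evalPoly(const float* coefficients,
                                          int n, float r) {
    float result = 0.0f;
    float power = 1.0f;  // r^0
    for (int i = 0; i < n; ++i) {
        result += coefficients[i] * power;
        power *= r;
    }
    return result;
}

// Assumed wrapper: thread j computes P(r[j]) and stores it in out[j].
__global__ void evalPolyKernel(const float* coefficients, int n,
                               const float* r, float* out, int numPoints) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < numPoints) out[j] = evalPoly(coefficients, n, r[j]);
}
```

For coefficients {1, 2, 3} and r = 2, the loop accumulates 1 + 2·2 + 3·4 = 17.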

13 Accumulation
- atomicAdd() function
- Important for safe operations!

14 Accumulation
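As a minimal sketch of accumulation with `atomicAdd()`, every thread can add its value into a single total in global memory; the atomic operation keeps concurrent updates from overwriting each other. The names `atomicSumKernel` and `sumWithAtomicAdd` are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Every in-range thread adds one value into the shared total.
// A plain "*total += values[i]" would be a data race.
__global__ void atomicSumKernel(const float* values, int n, float* total) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(total, values[i]);
}

// Hypothetical host wrapper returning the sum of all values.
float sumWithAtomicAdd(const std::vector<float>& values) {
    int n = static_cast<int>(values.size());
    if (n == 0) return 0.0f;

    float* dValues = nullptr;
    float* dTotal = nullptr;
    cudaMalloc(&dValues, n * sizeof(float));
    cudaMalloc(&dTotal, sizeof(float));
    cudaMemcpy(dValues, values.data(), n * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemset(dTotal, 0, sizeof(float));  // all-zero bytes represent 0.0f

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    atomicSumKernel<<<blocks, threads>>>(dValues, n, dTotal);

    float total = 0.0f;
    cudaMemcpy(&total, dTotal, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dValues);
    cudaFree(dTotal);
    return total;
}
```

Because floating-point addition is not associative and the order of atomic updates is unspecified, results for general inputs can differ slightly between runs.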

15 Shared Memory
- Faster than global memory
- Per-block
- One block

16 Linear Accumulation
- atomicAdd() has a choke point!
- What if we reduced our results in parallel?

17 Linear Accumulation …

18 Linear Accumulation (2)
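The slides' code for this step is not in the transcript. One common way to relieve the `atomicAdd()` choke point, offered here as an assumption rather than as the lecture's method, is a tree reduction in shared memory within each block, followed by a single `atomicAdd()` per block. The names `blockReduceKernel`, `sumWithBlockReduce`, and `kThreads` are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kThreads = 256;  // must be a power of two for the halving loop

// Each block sums its slice in shared memory, then issues one atomicAdd.
__global__ void blockReduceKernel(const float* values, int n, float* total) {
    __shared__ float partial[kThreads];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    // Out-of-range threads contribute zero.
    partial[tid] = (i < n) ? values[i] : 0.0f;
    __syncthreads();

    // Halve the number of active threads each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) partial[tid] += partial[tid + s];
        __syncthreads();
    }

    // One atomic update per block instead of one per thread.
    if (tid == 0) atomicAdd(total, partial[0]);
}

// Hypothetical host wrapper; error checking is omitted for brevity.
float sumWithBlockReduce(const std::vector<float>& values) {
    int n = static_cast<int>(values.size());
    if (n == 0) return 0.0f;

    float* dValues = nullptr;
    float* dTotal = nullptr;
    cudaMalloc(&dValues, n * sizeof(float));
    cudaMalloc(&dTotal, sizeof(float));
    cudaMemcpy(dValues, values.data(), n * sizeof(float),
               cudaMemcpyHostToDevice);
    cudaMemset(dTotal, 0, sizeof(float));

    int blocks = (n + kThreads - 1) / kThreads;
    blockReduceKernel<<<blocks, kThreads>>>(dValues, n, dTotal);

    float total = 0.0f;
    cudaMemcpy(&total, dTotal, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dValues);
    cudaFree(dTotal);
    return total;
}
```

For 1,000 inputs this launches 4 blocks, so only 4 atomic updates reach global memory instead of 1,000.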

19 Can we do better?

20 Last notes
- minuteman.cms.caltech.edu: the easiest option
- CMS accounts!
- Office hours
    - Kevin: Monday, 8-10 PM
    - Connor: Tuesday, 8-10 PM

