OpenCL Usman Roshan Department of Computer Science NJIT.


1 OpenCL Usman Roshan Department of Computer Science NJIT

2 OpenCL
Open standard for parallel programming across different devices
https://www.khronos.org/opencl/
Increasing usage in GPU computing
Pros: your GPU program will run not just on NVIDIA GPUs but on other GPUs as well (such as AMD)
Cons: not as easy to program in as CUDA

3 SimpleOpenCL
Open source API for writing OpenCL programs
Main challenge in OpenCL programs is the setup
SimpleOpenCL provides simple functions for setting up the GPU
https://code.google.com/p/simple-opencl/

4 Strategy to convert Chi2 in CUDA to OpenCL
Define blocks and threads
– With arrays global_work_size[2] and local_work_size[2]
– global_work_size[0] = BLOCKS * THREADS;
– global_work_size[1] = 1;
– local_work_size[0] = THREADS;
– local_work_size[1] = 1;
Initialize hardware
– hardware = sclGetAllHardware(&found);
– sclPrintHardwareStatus(*hardware);
Initialize software
– software = sclGetCLSoftware(OPENCL_KERNEL_FILE, "name_of_kernel_function", hardware[0]);
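The work-size setup above can be sketched in plain C, without any OpenCL headers, to show how the CUDA launch shape maps onto the two arrays. The values of BLOCKS and THREADS and the helper names set_work_sizes and flat_index are assumptions made for illustration; they do not come from the Chi2 program or from SimpleOpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical launch shape: BLOCKS and THREADS mirror the CUDA grid and block sizes. */
enum { BLOCKS = 4, THREADS = 256 };

/* Fill the two work-size arrays as described on the slide (helper name is illustrative). */
void set_work_sizes(size_t global_work_size[2], size_t local_work_size[2])
{
    global_work_size[0] = (size_t)BLOCKS * THREADS; /* total work-items in dimension 0 */
    global_work_size[1] = 1;
    local_work_size[0] = THREADS;                   /* work-items per work-group */
    local_work_size[1] = 1;

    /* OpenCL 1.x requires the global size to be a multiple of the local size. */
    assert(global_work_size[0] % local_work_size[0] == 0);
}

/* CUDA-style flat index: blockIdx.x * blockDim.x + threadIdx.x.
   With no global offset, get_global_id(0) yields the same value in an OpenCL kernel. */
size_t flat_index(size_t group_id, size_t local_id, size_t local_size)
{
    return group_id * local_size + local_id;
}
```

Note that global_work_size counts work-items in total, not work-groups, which is why it is BLOCKS * THREADS rather than BLOCKS.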

5 CUDA to OpenCL
Device arrays defined with cl_mem
Replace cudaMalloc with
– dev_results_clmem = sclMalloc( hardware[0], CL_MEM_READ_WRITE, size * sizeof(float) );
To write to GPU memory replace cudaMemcpy with
– sclWrite( hardware[0], size * sizeof(unsigned char), dev_dataT_clmem, (void*) dataT );
To read from GPU memory replace cudaMemcpy with
– sclRead( hardware[0], cols * sizeof(float), results_clmem, host_results );

6 CUDA to OpenCL
Replace kernel call by first setting kernel parameters
– sclSetKernelArg( software, 0, sizeof(uint), &var);
– sclSetKernelArg( software, 1, sizeof(cl_mem), (void*) &dev_var_clmem);
– sclSetKernelArg( software, 2, sizeof(cl_mem), (void*) &dev_const_var_clmem);
Then call the kernel with
– sclLaunchKernel( hardware[0], software, global_work_size, local_work_size );

7 Modifications to GPU kernel code
Use __kernel to define kernel function
Use __global and __local for global and local variables
Use __constant for constant memory definitions
Get thread id with get_global_id(0);
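As a rough illustration of these kernel changes, here is a small kernel written so that it also compiles as ordinary C for checking on the CPU. The kernel square_elements is hypothetical and unrelated to the Chi2 kernel. The host-side shim (empty __kernel and __global macros, a stubbed get_global_id, and the run_square_elements driver) is an assumption of this sketch, not part of OpenCL or SimpleOpenCL; under a real OpenCL compiler, which predefines __OPENCL_VERSION__, the shim is skipped.

```c
#ifndef __OPENCL_VERSION__
/* Host-side shim (assumption of this sketch): lets the kernel build as plain C. */
#include <stddef.h>
#define __kernel
#define __global
static size_t emulated_global_id = 0;
static size_t get_global_id(unsigned int dim) { (void)dim; return emulated_global_id; }
#endif

/* Hypothetical kernel: each work-item squares one element of the input array. */
__kernel void square_elements(__global const float *in, __global float *out, unsigned int n)
{
    size_t id = get_global_id(0); /* replaces blockIdx.x * blockDim.x + threadIdx.x */
    if (id < n)                   /* guard: the global size may exceed n */
        out[id] = in[id] * in[id];
}

#ifndef __OPENCL_VERSION__
/* Host-only driver: runs the kernel body once per emulated work-item, sequentially. */
void run_square_elements(const float *in, float *out, unsigned int n, size_t global_size)
{
    for (emulated_global_id = 0; emulated_global_id < global_size; ++emulated_global_id)
        square_elements(in, out, n);
}
#endif
```

The bounds check matters because the global work size is usually rounded up to a multiple of the local work size, so some work-items may have no element to process.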

