CS 732: Advance Machine Learning


1 CS 732: Advance Machine Learning
Usman Roshan, Department of Computer Science, NJIT

2 Course material
- Parallel programming, with emphasis on GPUs and the CUDA and OpenCL languages
- Optimization results
- Deep learning results
- Methods for learning new features
Grading:
- Each student presents one paper in a 45-minute PowerPoint talk.
- Each student does a project and presents it toward the end of the semester. The project involves some programming and is an experimental performance study: either implement an algorithm in C or Python, or compare existing algorithms on different datasets.

3 Parallel computing
Why in an advanced machine learning course?
- Some machine learning programs, such as large neural networks and kernel methods, take a long time to finish.
- Dataset sizes are getting larger. Linear classification and regression programs are generally very fast, but even they can be slow on large datasets.

4 Examples
- Dot product evaluation
- Gradient descent algorithms
- Cross-validation
  - Evaluating many folds in parallel
  - Parameter estimation
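The dot product example can be sketched as a split into independent partial sums followed by a final addition. This is a minimal illustration in Python, not the course's code: the names `partial_dot` and `parallel_dot` and the `n_workers` parameter are assumptions made for the sketch. It uses a thread pool only to show the decomposition; in CPython, threads will not speed up pure-Python arithmetic, so a real speedup would need processes, C with OpenMP, or a GPU.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_dot(x, y, start, stop):
    """Dot product restricted to indices [start, stop)."""
    return sum(x[i] * y[i] for i in range(start, stop))


def parallel_dot(x, y, n_workers=4):
    """Split the index range into contiguous chunks, reduce each chunk
    independently, then add the partial sums."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    chunk = max(1, -(-n // n_workers))  # ceil(n / n_workers), at least 1
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each chunk is an independent task; no chunk reads another's result.
        partials = pool.map(lambda b: partial_dot(x, y, b[0], b[1]), bounds)
        return sum(partials)
```

The same pattern (independent partial results, then one reduction) applies to summing per-example gradients in gradient descent.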

5 Parallel computing
Multi-core programming:
- OpenMP: ideal for running the same program on different inputs
- MPI: master-slave setup that allows message passing
Graphics Processing Units:
- Equipped with hundreds to thousands of cores
- Designed for running hundreds of short functions, called threads, in parallel
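Running the same program on different inputs is the shape of cross-validation: every fold is evaluated by the same function on a different held-out subset. The sketch below illustrates that in Python with a thread pool standing in for OpenMP-style workers. Everything in it is an assumption made for illustration: the function names, the contiguous fold split, and the toy model in `fold_error`, which just predicts the training mean. It assumes at least two folds and at least as many examples as folds.

```python
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fold_error(ys, test_idx):
    """Hypothetical model: predict the mean of the training targets and
    return the mean squared error on the held-out fold."""
    held_out = set(test_idx)
    train = [y for i, y in enumerate(ys) if i not in held_out]
    mean = sum(train) / len(train)
    return sum((ys[i] - mean) ** 2 for i in test_idx) / len(test_idx)


def cross_validate(ys, k=5, n_workers=4):
    """Evaluate every fold as an independent task and average the errors."""
    folds = k_fold_indices(len(ys), k)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        errors = list(pool.map(lambda idx: fold_error(ys, idx), folds))
    return sum(errors) / k
```

Because no fold depends on another fold's result, the tasks can be handed to workers in any order.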

6 GPU programming
Memory has four types with different sizes and access times:
- Global: largest, ranges from 3 to 6 GB, slow access time
- Local: same as global but specific to a thread
- Shared: on-chip, fastest, and limited to threads in a block
- Constant: cached global memory, accessible by all threads
Coalesced memory access is key to fast GPU programs. The main idea is that consecutive threads access consecutive memory locations.
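To see why consecutive threads should access consecutive locations, one can count how many memory segments a group of threads touches for a given access stride. The sketch below is a simplified model in Python, not a measurement. The constants are assumptions chosen for illustration: 32 threads issuing a load together, 128-byte memory transactions, and 4-byte elements. The function name `segments_touched` is also hypothetical.

```python
WARP_SIZE = 32        # assumed number of threads that issue a load together
SEGMENT_BYTES = 128   # assumed size of one memory transaction
ELEM_BYTES = 4        # assumed 4-byte float elements


def segments_touched(stride, warp_size=WARP_SIZE):
    """Count distinct memory segments read when thread t loads element t * stride."""
    addresses = [t * stride * ELEM_BYTES for t in range(warp_size)]
    return len({addr // SEGMENT_BYTES for addr in addresses})
```

Under these assumptions, `segments_touched(1)` is 1 (all threads share one transaction), while `segments_touched(32)` is 32 (every thread needs its own transaction), which is the access pattern to avoid.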

7 GPU programming
- Designed for running hundreds of short functions, called threads, in parallel
- Threads are organized into blocks, which are in turn organized into grids
- Ideal for running the same function on millions of different inputs
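The thread, block, and grid organization can be sketched by simulating a one-dimensional launch sequentially in Python. This only illustrates the index arithmetic: on a GPU the simulated threads would run in parallel. The names `launch` and `square_all` and the default block size of 256 are assumptions for the sketch.

```python
def launch(kernel, n, block_size=256):
    """Call kernel(i) once per simulated thread, as a 1-D grid of blocks would."""
    n_blocks = (n + block_size - 1) // block_size  # enough blocks to cover n items
    for block_idx in range(n_blocks):
        for thread_idx in range(block_size):
            i = block_idx * block_size + thread_idx  # global thread index
            if i < n:  # the last block may have surplus threads
                kernel(i)
    return n_blocks


def square_all(xs, block_size=256):
    """Run the same short function on every input: out[i] = xs[i] ** 2."""
    out = [0] * len(xs)

    def kernel(i):
        out[i] = xs[i] * xs[i]

    launch(kernel, len(xs), block_size)
    return out
```

With 1000 inputs and 256 threads per block, the grid needs 4 blocks, and the bounds check keeps the 24 surplus threads in the last block from writing out of range.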

8 Languages
CUDA:
- C-like language introduced by NVIDIA
- CUDA programs run only on NVIDIA GPUs
OpenCL:
- OpenCL programs run on all GPUs
- Same as C
- Requires no special compiler except for the OpenCL header and object files (both easily available)

9 CUDA
We will compile and run a program for determining interacting SNPs in a genome-wide association study.
Location:

