Multi-Layer Perceptron On A GPU

Presentation transcript:

Multi-Layer Perceptron On A GPU
Scott Finley, ECE 539, Fall 2008, UW-Madison

General Purpose GPU
Modern GPUs have hundreds of "stream processors"
They can now be used for non-graphics computing
NVIDIA CUDA (used for this project)
OpenCL
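The "stream processor" model means every processor runs the same small program on a different element of the data. The same idea can be sketched on the CPU in Python (the SAXPY kernel below is a standard illustrative example, not from the slides; a GPU would launch all the per-element calls in parallel):

```python
# Data-parallel "kernel": one logical thread per element,
# every thread runs the same function on its own index (SIMT style).
def saxpy_kernel(i, a, x, y, out):
    out[i] = a * x[i] + y[i]

n = 8
x = list(range(n))
y = [1.0] * n
out = [0.0] * n
for i in range(n):  # a GPU would run these iterations in parallel
    saxpy_kernel(i, 2.0, x, y, out)
print(out)  # [1.0, 3.0, 5.0, ...]
```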

Three MLP Implementations
CPU-only, using Basic Linear Algebra Subprograms (BLAS)
NVIDIA's cuBLAS library: no explicit GPU use; the library uses the GPU "under the hood", with many copies of data between CPU and GPU
cuBLAS with CUDA: same cuBLAS use as above, with the non-BLAS operations done in CUDA
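All three implementations rest on the same observation: the core work of an MLP forward pass is a chain of dense matrix products (GEMM), which is exactly what BLAS and cuBLAS provide. A minimal sketch, using NumPy's `@` operator to stand in for SGEMM (the layer sizes and tanh activation are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def forward(x, weights, biases):
    """MLP forward pass: each layer is one GEMM (matrix product)
    plus a bias add and an elementwise activation."""
    a = x
    for W, b in zip(weights, biases):
        a = np.tanh(a @ W + b)  # the GEMM is where BLAS/cuBLAS is called
    return a

rng = np.random.default_rng(0)
# illustrative sizes: 54 inputs, two hidden layers of 32, 7 outputs
sizes = [54, 32, 32, 7]
weights = [0.1 * rng.standard_normal((m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

batch = rng.standard_normal((500, 54))  # one 500-sample epoch
out = forward(batch, weights, biases)
print(out.shape)  # (500, 7)
```

Batching the whole epoch into one matrix is what makes the GPU version pay off: one large GEMM per layer instead of many small ones.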

Classifying Forestry Data
Data from the US Forest Service
Large feature vectors: 54 features
Large number of training samples: 500 per epoch
Two hidden layers; the number of neurons per layer was varied
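Varying the neurons per layer directly scales the size of each layer's GEMM, and with it the amount of parallel work the GPU can exploit. A back-of-envelope flop count per forward pass (the helper names and the 7-class output size are assumptions for illustration):

```python
def gemm_flops(m, k, n):
    # A (m x k) times B (k x n) costs about 2*m*k*n floating-point ops
    return 2 * m * k * n

def epoch_forward_flops(batch, layer_sizes):
    """Approximate forward-pass flops for one epoch of `batch` samples."""
    return sum(gemm_flops(batch, k, n)
               for k, n in zip(layer_sizes, layer_sizes[1:]))

# 54 inputs, two hidden layers of h neurons, 7 outputs, 500-sample epoch
for h in (16, 64, 256):
    print(h, epoch_forward_flops(500, [54, h, h, 7]))
```

The hidden-layer width enters the middle GEMM quadratically, so large layers are where the GPU's advantage over the CPU should show up.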

Small, Contrived Data Set

Cross-Platform GUI

Conclusion
The GPU is a very powerful parallel processor
Up to two orders of magnitude improvement is possible
It is much more effective for large computations
Many improvements remain possible; a CUDA-only version is needed