Tim Madden ODG/XSD.  Graphics Processing Unit  Graphics card on your PC.  “Hardware accelerated graphics”  Video game industry is main driver.  More.

Presentation transcript:

Tim Madden ODG/XSD

- GPU stands for Graphics Processing Unit.
- It is the graphics card in your PC.
- It provides "hardware-accelerated graphics."
- The video game industry is the main driver.
- More recently, GPUs are used for non-graphics applications.

- The GPU is a card on the PCI Express bus.
- The GPU card contains its own RAM and processor(s).
- What is a core? Roughly speaking, a core is an ALU (arithmetic logic unit).
- An ALU is basically a single processor that can run a computer program.
- Modern PCs are "quad core": basically 4 processors. This refers to the processor on the motherboard that runs Windows.
- A GPU has hundreds of cores!

- The programmer uses an API to write graphics code:
  - OpenGL: originally from Silicon Graphics; now a common standard on most computers.
  - DirectX: Xbox, Windows.
- The programmer calls functions from these APIs and compiles.
- If the computer has a suitable GPU ("Direct3D compatible" etc.), the code magically runs on the GPU. The compiled program can link at run time to libraries that run on the GPU. This is "hardware acceleration."
- These APIs are only useful for drawing and for responding to the mouse, joystick, etc.
- MSDN- us/directx/default

- DirectX and OpenGL are predefined sets of graphics functions that can run on the GPU.
- NVIDIA created CUDA for non-graphics applications on GPUs.
- CUDA allows writing a C++ program to run on the GPU.
- Cross-compiler: you write the code on a Windows box and compile it on the Windows box. The code runs on the GPU.
- The CUDA tools interface with the Microsoft compiler.
- CUDA allows you to create your OWN functions that run on the GPU, not just whatever DirectX gives you.

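- Parallel programming?
- What is a "thread"?
- A thread is a sequence of commands in a program that run one after another.

    #include <stdio.h>
    #include <windows.h>   /* Sleep() */

    void oneThread(int N)
    {
        int counter = 0;
        while (1) {
            printf("Thread %d Count %d\n", N, counter++);
            Sleep(1000);   /* pause this thread for 1000 ms */
        }
    }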

- A typical program on a PC has many threads running at once.
- An EPICS IOC has about 20 threads running.
- This PowerPoint program is running 8 threads (at the time of typing this sentence).

[Slide graphic: nine copies of oneThread shown running side by side.]
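As a rough sketch of what "many threads running at once" looks like in code, the following starts several copies of the slide's oneThread using the C++ standard library. This is not the author's code: std::thread replaces whatever Windows threading call the original used, the loop is bounded so the example terminates, and runThreads, iterations, and totalTicks are hypothetical names added for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

// Bounded variant of the slide's oneThread: it counts a fixed number of
// times instead of looping forever (an assumption made so the sketch ends).
void oneThread(int n, int iterations, std::atomic<int>& totalTicks) {
    for (int counter = 0; counter < iterations; ++counter) {
        std::printf("Thread %d Count %d\n", n, counter);
        ++totalTicks;  // atomic, so concurrent increments are safe
    }
}

// Starts numThreads copies of oneThread and waits for all of them.
// Returns the total number of lines printed across all threads.
int runThreads(int numThreads, int iterations) {
    assert(numThreads >= 0 && iterations >= 0);
    std::atomic<int> totalTicks{0};
    std::vector<std::thread> threads;
    for (int n = 0; n < numThreads; ++n) {
        threads.emplace_back(oneThread, n, iterations, std::ref(totalTicks));
    }
    for (std::thread& t : threads) {
        t.join();  // block until this thread has finished
    }
    return totalTicks.load();
}
```

Calling runThreads(8, 3) starts eight threads that each print three lines. The order in which lines from different threads appear is up to the operating system's scheduler and can change from run to run.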

- The more threads running, the slower each thread.
- The solution is to add more processors. A "core" is a processor.
- A "quad core" PC has 4 processors, each running hundreds of threads.

[Slide graphic: four boxes labeled PROCESSOR, each running six copies of oneThread.]

- Instead of running hundreds of threads, let us run millions of threads!
- A GPU can have 1024 processors. Each processor can run thousands of threads at once.
- Adding more processors speeds up the program, provided the work can be split across threads.
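To make the idea of splitting work across processors concrete, here is a minimal host-side sketch that divides a buffer into contiguous chunks and hands each chunk to its own thread. It runs on CPU cores with std::thread rather than on a GPU, and addOneToRange, addOneParallel, and numWorkers are hypothetical names chosen for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Adds 1 to data[begin, end). Each worker gets its own range, so no two
// threads touch the same element and no locking is needed.
void addOneToRange(std::vector<unsigned short>& data,
                   std::size_t begin, std::size_t end) {
    for (std::size_t k = begin; k < end; ++k) {
        data[k] = static_cast<unsigned short>(data[k] + 1);
    }
}

// Splits the buffer into up to numWorkers contiguous chunks and processes
// each chunk on its own thread.
void addOneParallel(std::vector<unsigned short>& data, unsigned numWorkers) {
    assert(numWorkers > 0);
    // Round up so the chunks together cover every element.
    const std::size_t chunk = (data.size() + numWorkers - 1) / numWorkers;
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < numWorkers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        workers.emplace_back(addOneToRange, std::ref(data), begin, end);
    }
    for (std::thread& t : workers) {
        t.join();
    }
}
```

On a PC, a reasonable choice for numWorkers is the value reported by std::thread::hardware_concurrency(), which may return 0 when the count is unknown. A GPU takes the same idea much further by giving every element its own thread.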

Thread

- On the host (not the GPU) we write a single thread to process an image.
- It handles 1 pixel at a time.
- For a 1k x 1k image, this is 1M operations in sequence.

    // My image data (zero-initialized here so the loop reads defined values)
    short *image = new short[1024*1024]();
    int k;
    for (k = 0; k < 1024*1024; k++) {
        image[k] = image[k] + 1;   // one pixel per loop iteration
    }

- Write code for a single pixel, and call the code in 1M separate threads.
- CUDA will dole out threads to cores on the GPU for you.
- Pixel X runs on thread X.

    // Kernel: each GPU thread handles exactly one pixel.
    __global__ void subtractDarkImage_k(
        unsigned short *d_Dst,
        unsigned short *d_Src,
        int dataSize)
    {
        // Global index of this thread across all blocks.
        const int i = blockDim.x * blockIdx.x + threadIdx.x;

        // Threads past the end of the image do nothing.
        if (i >= dataSize)
            return;

        d_Dst[i] = d_Src[i] + 1;
    }
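The index arithmetic in the kernel above can be checked on the host without a GPU. The sketch below is an emulation, not CUDA: it walks every (block, thread) pair in an ordinary loop and runs the kernel body for each one. The names blocksFor, kernelBody, and emulateLaunch are hypothetical, and the block size passed in is just an example value. On a real GPU the two loops would be replaced by a CUDA kernel launch of the form `subtractDarkImage_k<<<numBlocks, threadsPerBlock>>>(...)`, and the threads would run in parallel.

```cpp
#include <cassert>
#include <vector>

// Number of blocks needed so that blocks * threadsPerBlock >= dataSize.
int blocksFor(int dataSize, int threadsPerBlock) {
    assert(threadsPerBlock > 0);
    return (dataSize + threadsPerBlock - 1) / threadsPerBlock;
}

// Body of the slide's kernel for one (blockIdx, threadIdx) pair.
void kernelBody(int blockIdxX, int threadIdxX, int blockDimX,
                std::vector<unsigned short>& dst,
                const std::vector<unsigned short>& src) {
    const int dataSize = static_cast<int>(src.size());
    const int i = blockDimX * blockIdxX + threadIdxX;
    if (i >= dataSize) return;  // surplus threads in the last block do nothing
    dst[i] = static_cast<unsigned short>(src[i] + 1);
}

// Host-only stand-in for a launch: visits every thread index in sequence.
// dst must be at least as large as src.
void emulateLaunch(std::vector<unsigned short>& dst,
                   const std::vector<unsigned short>& src,
                   int threadsPerBlock) {
    assert(dst.size() >= src.size());
    const int numBlocks =
        blocksFor(static_cast<int>(src.size()), threadsPerBlock);
    for (int b = 0; b < numBlocks; ++b) {
        for (int t = 0; t < threadsPerBlock; ++t) {
            kernelBody(b, t, threadsPerBlock, dst, src);
        }
    }
}
```

For a 1000-pixel buffer and 256 threads per block, blocksFor returns 4, so 1024 thread indices are generated and the last 24 fall through the bounds check. That check is why the kernel takes dataSize as a parameter.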

- Install the Microsoft compiler.
- Download CUDA and install it.
- Open the examples.
- CUDA plugs into Microsoft Visual Studio.
- When you build, both the host code and the GPU code are built. .cpp files run on the host; .cu and .cuh files hold the code that runs on the GPU.