Demo of running CUDA programs on GPU and potential speed-up over CPU
ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, Jan 10, 2011.


2 Typical user interface (using a Windows PC)
Xclock running on client PC
Xclock running on coit-grid01.uncc.edu
Xclock running on coit-grid06.uncc.edu
Xterm running on client PC, logged onto coit-grid06.uncc.edu
WinSCP running on client PC, connected to grid01.uncc.edu
To make sure all X servers are running

3 Heat distribution problem (Solving Laplace’s equation)
800 x 800 points with 2000 iterations
Speed-up = 21.2 (Not sufficiently converged)
[Figure label: Fireplace]
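For reference, the sequential CPU side of a heat distribution demo like this is commonly written as a Jacobi iteration, in which every interior point is repeatedly replaced by the average of its four neighbours. The slides do not show the course's code, so the following is only a minimal sketch under assumptions: the function names `jacobi_sweep` and `solve_heat` are hypothetical, the grid is square and stored row-major, and the boundary values (for example the fireplace) are assumed to be set by the caller and held fixed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One Jacobi sweep over an n x n row-major grid: each interior point of
 * `out` becomes the average of its four neighbours in `in`.
 * Boundary points are not written. */
static void jacobi_sweep(const double *in, double *out, int n)
{
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            out[i * n + j] = 0.25 * (in[(i - 1) * n + j] + in[(i + 1) * n + j] +
                                     in[i * n + (j - 1)] + in[i * n + (j + 1)]);
        }
    }
}

/* Hypothetical helper: runs `iterations` Jacobi sweeps in place on `grid`.
 * The caller fills in the fixed boundary temperatures beforehand.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int solve_heat(double *grid, int n, int iterations)
{
    size_t bytes = (size_t)n * (size_t)n * sizeof(double);
    double *scratch = malloc(bytes);
    if (scratch == NULL) {
        return -1;
    }
    /* Copy everything once so both buffers hold the same boundary values. */
    memcpy(scratch, grid, bytes);

    double *cur = grid;
    double *next = scratch;
    for (int it = 0; it < iterations; it++) {
        jacobi_sweep(cur, next, n);
        double *tmp = cur;   /* swap buffers for the next sweep */
        cur = next;
        next = tmp;
    }
    /* After an odd number of sweeps the newest values sit in the scratch buffer. */
    if (cur != grid) {
        memcpy(grid, cur, bytes);
    }
    free(scratch);
    return 0;
}
```

With an 800 x 800 array whose boundary has been initialised, a call such as `solve_heat(grid, 800, 2000)` would correspond to the problem size quoted on the slide. A GPU version would typically assign one thread per grid point, which is where the choice of block structure comes in.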

4 800 x 800 points iterations
Different GPU block structure
Speed-up =
[Figure label: Fireplace]

5 200 x 200 points with iterations
Different GPU block structure
Speed-up = 3.9
[Figure label: Fireplace]

6 Potential speed-up
Speed-up factor = (Execution time on CPU) / (Execution time using GPU)
One can get one or two orders of magnitude speed-up just by using a single GPU!
But achieving large speed-ups takes care.
The algorithm used on the GPU may differ from that used on the CPU because of constraints on the GPU, so one should really compare the best sequential version on the CPU with the algorithm used on the GPU.
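The speed-up factor defined on this slide is a plain ratio of two measured times. As a minimal sketch (the function name `speedup_factor` is hypothetical, and the times passed in are assumed to be measured elsewhere, in the same units):

```c
#include <assert.h>

/* Speed-up factor = execution time on CPU / execution time using GPU.
 * Both times must be in the same units (for example seconds).
 * Returns 0.0 if either time is not positive, since the ratio is then
 * meaningless. */
double speedup_factor(double cpu_time, double gpu_time)
{
    if (cpu_time <= 0.0 || gpu_time <= 0.0) {
        return 0.0;
    }
    return cpu_time / gpu_time;
}
```

For example, with purely illustrative timings of 10.0 s on the CPU and 0.5 s on the GPU, `speedup_factor(10.0, 0.5)` gives 20.0. For the comparison to be fair, `cpu_time` should come from the best sequential version, as the slide notes.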

7 N Body problem

8 Questions