NVIDIA® TESLA™ GPU Based Super Computer By : Adam Powell Student # 3198371 For COSC 3P93.


What is a GPU?
- A Graphics Processing Unit
- First introduced in 1999
- First non-graphical applications appeared in 2003
- Non-graphical use had many problems, e.g. the programmer needed knowledge of the graphics API and the GPU architecture

Along came CUDA
- Introduction of CUDA (Compute Unified Device Architecture)
- Changed the architecture to better suit general-purpose programming
- CUDA is both a software and a hardware architecture
- Supports C programming

Along came CUDA (cont.)
- Replaced the separate pixel and vertex pipelines with a single pipeline
- Added SIMT (single instruction, multiple thread)
- Still, programmers asked for more

Next Gen: Code Name “Fermi”
- Improved double-precision performance
- ECC (error-correcting code) support
- True cache hierarchy
- More shared memory
- Faster context switching
- Faster atomic operations

“Fermi” Architecture
- 16 SMs (streaming multiprocessors)
- Each SM contains 32 CUDA cores, for a total of 16 x 32 = 512 CUDA cores
- Has six 64-bit memory partitions (a 384-bit memory interface)
- Supports up to 6 GB of GDDR5 DRAM
- Connects to a host CPU via PCI Express
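The figures on this slide can be compared against whatever card is actually installed. The following is a minimal sketch in CUDA C/C++ that asks the CUDA runtime for the SM count, memory size, memory bus width, and ECC state of each device. It assumes a working CUDA toolkit and driver. The runtime does not report the number of CUDA cores per SM, so the 32-cores-per-SM figure used below is taken from the slide and only holds for Fermi-class parts.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumption taken from the slide: 32 CUDA cores per SM (Fermi only).
// The CUDA runtime does not expose this number directly.
constexpr int kAssumedCoresPerSM = 32;

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }

        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);
        std::printf("  Streaming multiprocessors: %d\n",
                    prop.multiProcessorCount);
        std::printf("  Cores if 32 per SM (Fermi assumption): %d\n",
                    prop.multiProcessorCount * kAssumedCoresPerSM);
        std::printf("  Global memory: %.1f GB\n",
                    static_cast<double>(prop.totalGlobalMem) /
                        (1024.0 * 1024.0 * 1024.0));
        std::printf("  Memory bus width: %d bits\n", prop.memoryBusWidth);
        std::printf("  ECC enabled: %s\n", prop.ECCEnabled ? "yes" : "no");
    }
    return 0;
}
```

Built with something like `nvcc query.cu -o query` (the file name is arbitrary), a full Fermi part as described on the slide would be expected to report 16 SMs and a 384-bit bus. Other cards will report different values.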

Build Your Own
Component   | Type                       | Price
Motherboard | Asus P6T7 WS SuperComputer | $
GPU         | Tesla C1060 x 3            | $ x 3 = $
Display GPU | PNY Quadro FX 570          | $
CPU         | Intel Core i7 860          | $
RAM (DDR3)  | Kingston 4GB DDR3          | $ * 6 = $
Total       |                            | $

Programming Model
- Parallel functions run on the GPU
- Non-parallel code runs on the host CPU
- Parallel functions are organized into threads, thread blocks, and arrays of thread blocks
- Each thread has its own private memory
- Each thread block has memory shared by every thread in that block
- Arrays of thread blocks share memory for the whole application
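The three memory levels in this model can be seen in a single small kernel. The sketch below is illustrative only and is not taken from the presentation; the kernel name `blockSum` and the constant `kThreadsPerBlock` are made up for the example. Each thread keeps a private value, the threads of one block combine their values through `__shared__` memory, and each block writes its total to global memory, which the host CPU then reads back. It assumes the block size is a power of two.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical block size for this sketch; must be a power of two
// for the halving loop in the kernel to cover every element.
constexpr int kThreadsPerBlock = 256;

// Parallel function (kernel): runs on the GPU, one instance per thread.
__global__ void blockSum(const float* in, float* blockTotals, int n) {
    // Shared memory: one copy per thread block, visible to all its threads.
    __shared__ float tile[kThreadsPerBlock];

    int tid = threadIdx.x;                             // index within the block
    int gid = blockIdx.x * blockDim.x + threadIdx.x;   // index within the grid

    // Private memory: 'mine' lives in this thread's own registers.
    float mine = (gid < n) ? in[gid] : 0.0f;
    tile[tid] = mine;
    __syncthreads();  // wait until every thread in the block has written

    // Tree reduction inside the block using shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            tile[tid] += tile[tid + stride];
        }
        __syncthreads();
    }

    // Global memory: visible to every block and to the host.
    if (tid == 0) {
        blockTotals[blockIdx.x] = tile[0];
    }
}

int main() {
    const int n = 1 << 20;
    const int blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;

    // Non-parallel setup runs on the host CPU.
    std::vector<float> host(n, 1.0f);
    float* dIn = nullptr;
    float* dTotals = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dTotals, blocks * sizeof(float));
    cudaMemcpy(dIn, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Launch an array (grid) of thread blocks.
    blockSum<<<blocks, kThreadsPerBlock>>>(dIn, dTotals, n);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel launch failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // Copy the per-block totals back and finish the sum on the host.
    std::vector<float> totals(blocks);
    cudaMemcpy(totals.data(), dTotals, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    double sum = 0.0;
    for (float t : totals) {
        sum += t;
    }
    std::printf("sum = %.0f, input length = %d\n", sum, n);

    cudaFree(dIn);
    cudaFree(dTotals);
    return 0;
}
```

Because the input is all ones, the printed sum should equal the input length. Finishing the sum on the host keeps the kernel short; a second kernel launch or atomic operations would be alternatives.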

Threads and SMs

Development Tools
- OpenCL (already covered by Ryan)
- CUDA C
- DirectCompute (supported by DirectX 10 and DirectX 11)
- CUDA Fortran compiler
- Sadly, there is no NVIDIA support for Ada
- NVIDIA Nexus or NSight

Nexus or Parallel NSight?
- Both names refer to NVIDIA's tool kit for Visual Studio
- Currently in a private beta
- Planned to support OpenCL, CUDA C, DirectCompute, Direct3D and OpenGL
- It has 3 major components

Parallel NSight: 3 components
- Debugger
  - Breakpoints
  - Memory inspection
- Analyzer
  - Tool for viewing performance
  - CPU/GPU events such as waits and core allocation
- Graphics Inspector
  - Texture viewer, vertex buffers, API state

References
- s/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
- er2-CudaProgrammingModel.pdf