© NVIDIA Corporation 2009

Background
- Founded 2006 by NVIDIA Chief Scientist David Kirk
- Mission: long-term strategic research
- Discover & invent new markets
- Influence product roadmaps
- Follow, support, and focus academic research
- Improve parallel computing education

Topics
- Visual computing: real-time rendering, cinematic rendering, animation, modeling, visualization, computational photography
- Parallel computing: programming languages, compilers, numerics, HPC applications, architecture, circuit design, interconnects
- Mobile computing: low-power computing, networks, HCI

Personnel
- Currently 25 full-time researchers in CA, NC, MI, MN, VA, UT, Berlin, and Helsinki
- 2 National Academy members
- 1 Academy Award
- 5 recent former faculty

External Research Collaborations
- UC Berkeley: parallel programming
- UC Davis: parallel algorithms
- U British Columbia: imaging, architecture
- U North Carolina: ray tracing, hybrid rendering
- U Virginia: architecture, perceptual psychology
- UCLA: oceanography
- U Massachusetts: real-time rendering
- Chalmers University: real-time rendering
- U Utah: HPC, ray tracing
- NC State: rendering algorithms
- Johns Hopkins: data-intensive computing
- Brown: computer vision
- Saarland U: ray tracing
- U Illinois: parallel programming
- Weta: cinematic rendering
- Williams College: real-time rendering

Example: Skin Rendering
- Real-time subsurface scattering
- Multilayer translucent materials
- ~5 minutes → ~11 ms, with no precomputation
- Key insight: project diffusion profiles onto a sum-of-Gaussians basis
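
To make the key insight concrete: in the published NVIDIA skin-rendering work this slide appears to summarize (d'Eon & Luebke, GPU Gems 3), the radially symmetric diffusion profile R(r) is fit by a weighted sum of Gaussians. A sketch of that formula, not text from the slide itself:

    % Sum-of-Gaussians approximation of a diffusion profile (sketch)
    % w_i, v_i: fitted weights and variances; k: number of Gaussian terms
    R(r) \approx \sum_{i=1}^{k} w_i \, G(v_i, r),
    \qquad
    G(v, r) = \frac{1}{2 \pi v} \, e^{-r^2 / (2v)}

Because each Gaussian is separable, every term can be evaluated as two 1D blurs in texture space, which is the property that makes real-time evaluation practical.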

Raytracing

NVIRT: CUDA Ray Tracing API

Example: Programming Languages
Copperhead: Cu + Python
- Copperhead is a subset of Python, designed for data parallelism
- Python: an extant, well-accepted, high-level scripting language
- Already understands things like map and reduce
- Comes with a parser & lexer
- The current Copperhead compiler takes a subset of Python and produces CUDA code

Copperhead is not Pure Python
- Copperhead is not for arbitrary Python code; most features of Python are unsupported
- Connecting Python & Copperhead code will require bindings, similar to Python-C interaction
- Copperhead is compiled, not interpreted
- Copperhead is statically typed, where standard Python is dynamically typed

Saxpy: Hello World

    def saxpy(a, x, y):
        return map(lambda xi, yi: a*xi + yi, x, y)

Some things to notice:
- Types are implicit: the Copperhead compiler uses a Hindley-Milner type system with typeclasses, similar to Haskell; typeclasses are fully resolved in CUDA via C++ templates
- Functional programming: map and lambda (or their equivalents in list comprehensions); you can pass functions around to other functions
- Closure: the variable 'a' is free in the lambda, but bound to the 'a' in its enclosing scope
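
For intuition about what "produces CUDA code" means, here is a minimal hand-written CUDA kernel for the same saxpy computation. This is an illustrative sketch, not actual Copperhead compiler output; the kernel name and launch configuration are assumptions:

    // saxpy.cu -- illustrative CUDA analogue of the Copperhead saxpy above
    // (hand-written sketch, not Copperhead compiler output)
    __global__ void saxpy(int n, float a, const float* x, const float* y, float* out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per element
        if (i < n)
            out[i] = a * x[i] + y[i];
    }

    // launch with one thread per element, 256 threads per block:
    // saxpy<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y, d_out);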

Example: Parallel Programming
- thrust is a library of data-parallel algorithms & data structures with an interface similar to the C++ Standard Template Library, for CUDA
- C++ template metaprogramming automatically chooses the fastest code path at compile time

Data structures:
- thrust::device_vector
- thrust::host_vector
- thrust::device_ptr
- etc.

Algorithms:
- thrust::sort
- thrust::reduce
- thrust::exclusive_scan
- etc.

thrust::sort (sort.cu)

    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>
    #include <thrust/generate.h>
    #include <thrust/sort.h>
    #include <cstdlib>

    int main(void)
    {
        // generate random data on the host
        // (vector size was lost in the transcript; 32 << 20 is an assumed value)
        thrust::host_vector<int> h_vec(32 << 20);
        thrust::generate(h_vec.begin(), h_vec.end(), rand);

        // transfer to device and sort
        thrust::device_vector<int> d_vec = h_vec;

        // sort: 140M 32b keys/sec on GT200
        thrust::sort(d_vec.begin(), d_vec.end());

        return 0;
    }

thrust::reduce (reduce.cu)

    #include <thrust/host_vector.h>
    #include <thrust/device_vector.h>
    #include <thrust/generate.h>
    #include <thrust/reduce.h>
    #include <thrust/functional.h>
    #include <cstdlib>

    int main(void)
    {
        // generate random data on the host
        // (vector size was lost in the transcript; 32 << 20 is an assumed value)
        thrust::host_vector<int> h_vec(32 << 20);
        thrust::generate(h_vec.begin(), h_vec.end(), rand);

        // compute sum on the device
        thrust::device_vector<int> d_vec = h_vec;
        int x = thrust::reduce(d_vec.begin(), d_vec.end(), 0, thrust::plus<int>());

        return 0;
    }
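
The Thrust overview above also lists thrust::exclusive_scan, which gets no example in the deck. A minimal sketch in the same spirit (an addition, not from the original slides):

    // scan.cu -- minimal thrust::exclusive_scan sketch (not from the original deck)
    #include <thrust/device_vector.h>
    #include <thrust/scan.h>

    int main(void)
    {
        // input {1, 2, 3, 4} -> exclusive prefix sums {0, 1, 3, 6}
        thrust::device_vector<int> d_vec(4);
        d_vec[0] = 1; d_vec[1] = 2; d_vec[2] = 3; d_vec[3] = 4;

        // in-place exclusive scan: element i receives the sum of elements 0..i-1
        thrust::exclusive_scan(d_vec.begin(), d_vec.end(), d_vec.begin());

        return 0;
    }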

Thrust
- thrust.googlecode.com
- Open source (Apache 2 license)

Example: Sparse Matrix-Vector
[Chart: CPU results from "Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms", Williams et al., Supercomputing 2007]

Example: Sort
[Chart: radix sorting rate]

Example: Fluid Dynamics
Rayleigh-Bénard convection
[Figure: initial temperature field, hot and cold boundaries, circulating cells]

Rayleigh-Bénard Results
- Double precision, 384 × 384 × 192 grid (the maximum that fits in 4 GB)
- Vertical slice of temperature at y = 0
- Transition from stratified (left) to turbulent (right)
- Regime depends on the Rayleigh number: Ra = g α ΔT L³ / (κ ν)
- 8.5× speedup versus Fortran code running on an 8-core 2.5 GHz Xeon
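
For reference, the standard definition of the Rayleigh number behind that bullet (the transcript's formula had lost the L³ length factor; this is the textbook form, with symbols spelled out):

    % Rayleigh number for Rayleigh-Bénard convection
    % g: gravitational acceleration, \alpha: thermal expansion coefficient,
    % \Delta T: temperature difference across the layer, L: layer depth,
    % \kappa: thermal diffusivity, \nu: kinematic viscosity
    Ra = \frac{g \, \alpha \, \Delta T \, L^3}{\kappa \nu}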

Mission: Support Academic Research
- Serve as academic liaison
- Follow, inform, and influence external research
- Direct support: funding and equipment

- Sponsored research
- Donated and discounted equipment
- Professor Partnerships
- Ph.D. Fellowships
- CUDA Centers of Excellence
- New programs: CUDA Fellows, CUDA Research Awards

Mission: Support Parallel Computing Education
- Supporting courses & curricular efforts
- Creating & gathering online training materials
- Teaching courses (and putting them online)
- Writing textbooks

Final Thoughts – Education
- We should teach parallel computing in CS 1 or CS 2, early!
- Computers don't get faster, just wider
- Manycore is the future of computing
- Insertion sort, heap sort, merge sort: which goes faster on large data? All students need to understand this now!

NVIDIA Research Summit
Sept 30 – Oct 2, 2009 – The Fairmont, San Jose, California
Co-located with the GPU Technology Conference, a technical event focused on developers, engineers, researchers, senior executives, venture capitalists, press, and analysts.

A cross-disciplinary forum for researchers using GPUs across science and engineering. Join your colleagues, researchers in other fields, and the NVIDIA Research team for this valuable opportunity to gather, learn, and collaborate. Share your work with peers from many disciplines; learn from experts at NVIDIA and elsewhere. In-depth sessions on numeric computing, computational science, visual computing trends, and advanced CUDA programming & optimization.

Opportunities:
- Call for Posters is open: showcase your work and learn from your peers
- Research Roundtables: moderated discussions led by your peers; submit a roundtable to shape the hot topics in GPU computing!