GPU Based Sound Simulation and Visualization. Torbjorn Loken, Sergiu M. Dascalu, and Frederick C. Harris, Jr. Department of Computer Science and Engineering, University of Nevada, Reno.

Reno, Nevada

University of Nevada, Reno

Coprocessors
  Step back in time
    – Channel I/O

Channel I/O
  I/O processing was taking a significant amount of processor time
    – Just renting an IBM 709 could cost upwards of $55,000 a month

Channel I/O
  I/O channels hooked into data synchronizer units
    – Central processor free to continue
  Similar to the modern Southbridge chip found on motherboards

Coprocessors
  Step back in time
    – Channel I/O
    – Floating Point Units

Floating Point Math
  Before 1985 and the IEEE 754 standard, the implementation of floating point math varied greatly
    – Word sizes varied, accuracy varied

Floating Point Math
  IEEE Standard 754 and a consensus about word sizes (32 bits) helped greatly
  Hardware implementations almost required
    – Complex
    – Slow
    – Valuable

Floating Point Numbers
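As a reminder of what the 32-bit IEEE 754 format mentioned on the previous slide looks like, the layout and value of a normalized single-precision number can be written as below. The symbols s, e, and f (sign, biased exponent, fraction) are notation introduced here, not taken from the slides.

```latex
\underbrace{s}_{1\ \text{bit}} \;
\underbrace{e}_{8\ \text{bits}} \;
\underbrace{f}_{23\ \text{bits}}
\qquad
v = (-1)^{s} \times \left(1 + \frac{f}{2^{23}}\right) \times 2^{\,e - 127},
\quad 1 \le e \le 254
```

For example, s = 0, e = 127, f = 0 encodes the value 1.0.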

Floating Point Units
  Initially off chip
    – One of the most popular coprocessors
  Moore’s Law made room for them on chip

SSE and AVX
  New instruction sets
  Compute using bit wide registers
    – Multiple floating point numbers per register
  Non-blocking, CPU free to continue while computations run.
  Modern, widely available.

Coprocessors
  Step back in time
    – Channel I/O
    – Floating Point Units
  Graphics Processing Units

Games: FPU Powered

Games: GPU Powered

GPU Design
  Goal: floating point throughput
  SIMD to the core
  Hardware-accelerate common operations
    – Initially transformation and lighting calculations
    – Later transcendentals, texture sampling, etc.

The Original Pipeline

The evolution of the pipeline

Increasing Programmability
  Shaders
    – Intended for graphical use
    – They were also used to accelerate applications with a large amount of floating point math
      Image processing, simulations, etc.
      VFire

Enter CUDA

Changes
  Unified Processor Architecture
    – No more vertex or fragment processors
  Threading emphasized
    – Many cores running many, many threads

CUDA
  A subset of C with some extensions
    – Thread identifiers
    – Launching kernels
    – Some data types
      Mainly used for organizing thread numbering

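Example Kernel
__global__ void kernel(float* a, int N)
{
    // One thread per element; threadIdx.x is only unique within a single block.
    int idx = threadIdx.x;
    if (idx < N) {
        a[idx] = a[idx] * a[idx];  // square each element in place
    }
}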
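The slides show only the kernel, not how it is launched. The following is a minimal sketch of the host side, assuming the CUDA runtime API. The names `square_kernel` and `square_on_gpu` are hypothetical, and the global index (`blockIdx.x * blockDim.x + threadIdx.x`) is an addition so that more than one block can be used. It is not taken from the talk.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Same operation as the example kernel, but indexed globally so the launch
// may span several blocks (an assumption added for this sketch).
__global__ void square_kernel(float* a, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        a[idx] = a[idx] * a[idx];
    }
}

// Hypothetical host helper: copies the data to the GPU, squares every
// element there, and copies the result back into the same vector.
// Returns false if any CUDA call reports an error.
bool square_on_gpu(std::vector<float>& data) {
    const int n = static_cast<int>(data.size());
    if (n == 0) return true;  // nothing to do
    const size_t bytes = n * sizeof(float);

    float* d_a = nullptr;
    if (cudaMalloc(&d_a, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d_a, data.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;                         // threads per block
        const int blocks = (n + threads - 1) / threads;  // round up
        square_kernel<<<blocks, threads>>>(d_a, n);      // kernel launch
        // The blocking copy back also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(data.data(), d_a, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);
    return ok;
}
```

On a machine with a CUDA-capable GPU, passing a vector holding 1, 2, 3 should leave it holding 1, 4, 9.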

Now back to Sound Simulation

Sound Simulation
  What are the acoustic properties of a room?
  What acoustic phenomena will be produced in a room?

Wave Simulation Techniques
  Geometric
    – Ray tracing
  Numeric
    – Finite Element Methods (FEM)
      Breaks domain into many smaller domains
    – Finite Difference Time Domain (FDTD)
      Grid based

FDTD
  Decomposes the space into a rectilinear grid
  Visualization easy.
  Naturally very data parallel
    – Easy to fit to SIMD
  Computationally expensive
    – Increased frequency range -> increased grid resolution and decreased timestep size
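The cost bullet follows from two standard constraints on explicit FDTD schemes, sketched here in notation that is not from the slides: c is the speed of sound, f_max the highest frequency to be resolved, N the number of grid points per shortest wavelength (a rule-of-thumb value chosen by the user), and d the number of spatial dimensions.

```latex
\Delta x \;\le\; \frac{\lambda_{\min}}{N} \;=\; \frac{c}{N\, f_{\max}},
\qquad
\Delta t \;\le\; \frac{\Delta x}{c\,\sqrt{d}}
```

Under these assumptions, doubling f_max halves both the grid spacing and the time step. In 3D that means 8 times as many grid points and twice as many steps, or roughly 16 times the work for the same room and simulated duration.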

Radio Wave Propagation
  Simulating a cellular telephone in an elevator

FDTD Sound Simulation
  Solve for the acoustic pressure at each point in the grid at every time step
  Four different cases based on the boundaries at a grid point

Implementation
  One array encodes the boundaries for each point on the grid (the boundaries are constant for the entire simulation)
  A series of arrays holds the current simulation state
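To make the data layout above concrete, here is a minimal 2D sketch of one possible FDTD pressure update. It assumes the standard second-order leapfrog scheme for the scalar wave equation. The slides do not give the update equations or the four boundary cases, so this sketch substitutes a single simplified rule: grid edges and any cell flagged nonzero in the boundary array are held at zero pressure. The names `fdtd_step`, `run_fdtd`, and `courant2` are hypothetical.

```cuda
#include <vector>
#include <cuda_runtime.h>

// One leapfrog step on an nx-by-ny grid. courant2 = (c * dt / dx)^2 and
// should not exceed 0.5 in 2D for the scheme to remain stable.
__global__ void fdtd_step(const float* prev, const float* curr, float* next,
                          const unsigned char* boundary,
                          int nx, int ny, float courant2) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= nx || y >= ny) return;
    int i = y * nx + x;
    // Simplifying assumption: edges and flagged cells stay at zero pressure.
    if (x == 0 || y == 0 || x == nx - 1 || y == ny - 1 || boundary[i] != 0) {
        next[i] = 0.0f;
        return;
    }
    // Five-point discrete Laplacian of the current pressure field.
    float lap = curr[i - 1] + curr[i + 1] + curr[i - nx] + curr[i + nx]
              - 4.0f * curr[i];
    next[i] = 2.0f * curr[i] - prev[i] + courant2 * lap;
}

// Hypothetical driver: starts from `initial` with zero initial velocity,
// runs `steps` updates, and returns the final pressure field.
// CUDA error checking is omitted to keep the sketch short.
std::vector<float> run_fdtd(const std::vector<float>& initial,
                            const std::vector<unsigned char>& boundary,
                            int nx, int ny, int steps, float courant2) {
    const size_t n = static_cast<size_t>(nx) * ny;
    float *d_prev = nullptr, *d_curr = nullptr, *d_next = nullptr;
    unsigned char* d_boundary = nullptr;
    cudaMalloc(&d_prev, n * sizeof(float));
    cudaMalloc(&d_curr, n * sizeof(float));
    cudaMalloc(&d_next, n * sizeof(float));
    cudaMalloc(&d_boundary, n);
    cudaMemcpy(d_prev, initial.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_curr, initial.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_boundary, boundary.data(), n, cudaMemcpyHostToDevice);

    dim3 threads(16, 16);
    dim3 blocks((nx + 15) / 16, (ny + 15) / 16);
    for (int s = 0; s < steps; ++s) {
        fdtd_step<<<blocks, threads>>>(d_prev, d_curr, d_next, d_boundary,
                                       nx, ny, courant2);
        // Rotate the state arrays: the newest field becomes "current".
        float* tmp = d_prev;
        d_prev = d_curr;
        d_curr = d_next;
        d_next = tmp;
    }

    std::vector<float> out(n);
    cudaMemcpy(out.data(), d_curr, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_prev);
    cudaFree(d_curr);
    cudaFree(d_next);
    cudaFree(d_boundary);
    return out;
}
```

Keeping the boundary flags in a separate read-only array, as the slide describes, means the kernel never has to rebuild the geometry: each thread reads one flag and one small neighbourhood of the current state.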

System Architecture
  Three components
    – Simulation Manager
    – Memory Manager
    – Renderer

System Architecture
  Each component runs on its own thread
  Inter-thread communication done with thread-safe queues
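The talk does not show its queue implementation. As an illustration only, a blocking thread-safe queue of the kind described could look like the host-side sketch below, written in the C++ subset that compiles as CUDA host code. The class name `ThreadSafeQueue` and its methods are hypothetical.

```cuda
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical message queue shared between two component threads,
// for example the Simulation Manager and the Renderer.
template <typename T>
class ThreadSafeQueue {
public:
    // Adds an item and wakes one waiting consumer.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available, then removes and returns it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

    // Non-blocking variant: returns false if the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> queue_;
};
```

A render thread would typically use `try_pop` so that it can keep drawing the last frame when no new data has arrived, while a worker thread with nothing else to do can simply block in `pop`.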

[Diagram labels: CPU, GPU; Simulation Manager, Renderer, Memory Manager; Simulation Data, OpenGL Data, Frame Data]

Texture Mapped Volume Rendering
  Uses alpha blending to quickly render volumetric data.
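The slide does not spell out the blending rule. Assuming the usual slice-based approach, in which the volume is drawn as a stack of textured slices from back to front, each slice is composited over what is already in the framebuffer with the standard "over" operator. The symbols below are notation introduced here: C_src and α_src are the colour and opacity sampled from the slice, and C_dst is the colour already accumulated.

```latex
C_{\text{dst}} \;\leftarrow\; \alpha_{\text{src}}\, C_{\text{src}} \;+\; \left(1 - \alpha_{\text{src}}\right) C_{\text{dst}}
```

In OpenGL this corresponds to the blend factors `GL_SRC_ALPHA` and `GL_ONE_MINUS_SRC_ALPHA`, so the graphics hardware performs the accumulation with no extra shader work.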

Thank you!
