Finding Body Parts with Vector Processing Cynthia Bruyns Bryan Feldman CS 252.

Presentation transcript:


Introduction Take an existing algorithm for tracking human motion and speed it up by computing on the GPU. Demonstrate that many vision algorithms are prime candidates for vector processing.

Results after false candidates have been removed Demo

Vision Algorithms Often computationally expensive: searching over many pixels for objects at many orientations and scales. E.g. [(1024x768 pixels) x 3 colors] x [12 orientations] x [5 scales]. Very often highly parallelizable.
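To make the size of that example concrete, the product on the slide can be multiplied out. This is only the arithmetic from the bracketed expression; the variable names are illustrative.

```python
# Size of the example search space quoted on the slide.
width, height = 1024, 768   # image size in pixels
colors = 3                  # color channels
orientations = 12           # filter orientations searched
scales = 5                  # filter scales searched

pixel_values = width * height * colors
filter_responses = pixel_values * orientations * scales

print(pixel_values)      # 2359296
print(filter_responses)  # 141557760
```

Roughly 140 million independent per-pixel responses for a single frame is the kind of workload that maps well onto per-pixel parallel hardware.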

Limb Finding Goal: find candidate limbs. Limbs look like long dark rectangles on light backgrounds or long light rectangles on dark backgrounds.

Algorithm specifics 1. Convolution with filter: convolve using the FFT. The response indicates how strongly pixels go from low to high intensity. Convolve over all three color channels so as not to miss an edge between, e.g., red and blue of the same intensity.
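A minimal CPU-side sketch of step 1, assuming a circular convolution on grids whose sides are powers of two and a simple two-tap difference mask. The mask, the function names (`fft`, `fft2`, `fft_convolve`) and the tiny test image are illustrative assumptions, not the filter or code from the talk.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft2(grid, inverse=False):
    """2D FFT: transform every row, then every column."""
    rows = [fft(row, inverse) for row in grid]
    cols = [fft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def fft_convolve(image, mask):
    """Circular convolution of two equally sized grids via the FFT."""
    h, w = len(image), len(image[0])
    fi, fm = fft2(image), fft2(mask)
    product = [[fi[y][x] * fm[y][x] for x in range(w)] for y in range(h)]
    scale = h * w  # the inverse transform above is unnormalized
    return [[v.real / scale for v in row] for row in fft2(product, inverse=True)]

# Assumed mask: response[y][x] = image[y][x] - image[y][x - 1],
# which is positive where intensity goes from low to high.
mask = [[1, -1, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]

# Red steps up where blue steps down, so the summed intensity is flat.
red = [[0, 0, 1, 1] for _ in range(4)]
green = [[0, 0, 0, 0] for _ in range(4)]
blue = [[1, 1, 0, 0] for _ in range(4)]
gray = [[r + g + b for r, g, b in zip(*rows)] for rows in zip(red, green, blue)]

responses = [fft_convolve(channel, mask) for channel in (red, green, blue)]
gray_response = fft_convolve(gray, mask)
```

In this toy input the grayscale response is zero everywhere, while the red and blue channels each respond at the boundary (x = 2), which is the reason the slide gives for convolving all three channels.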

Algorithm specifics 2. For every pixel location, get resp conv from the “left” and from the “right”, and put the combined value into a new matrix, resp limb. [Figure: -resp conv and resp conv sampled at two marked locations, combined into resp limb]

Algorithm specifics 3. Find local maxima: replace every pixel with the maximum over its local neighborhood to get locMax. If resp limb = locMax at a pixel, that pixel is a local maximum.
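Step 3 can be sketched as a maximum filter followed by an equality test. The square neighborhood and its `radius` are assumptions, since the slide does not state the neighborhood size. The neighborhood includes the pixel itself, otherwise the equality test could never succeed.

```python
def local_max(grid, radius=1):
    """Replace every pixel with the maximum over its square neighborhood."""
    height, width = len(grid), len(grid[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            row.append(max(
                grid[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ))
        out.append(row)
    return out

def find_maxima(resp_limb, radius=1):
    """Return (y, x) positions where resp_limb equals its local maximum."""
    loc_max = local_max(resp_limb, radius)
    return [(y, x)
            for y in range(len(resp_limb))
            for x in range(len(resp_limb[0]))
            if resp_limb[y][x] == loc_max[y][x]]

resp_limb = [[0, 1, 0],
             [1, 5, 1],
             [0, 1, 0]]
print(find_maxima(resp_limb))  # [(1, 1)]
```

Note that with a plain equality test every pixel of a flat region counts as a maximum, so some thresholding or later pruning of candidates is still needed.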

GPU It’s a good choice because each operation is per pixel (SIMD-like). Data is stored in texture buffers, equivalent to a local cache. Clean instruction set and a developing interface language to exploit vector operations. Justify your gaming habits.

GPU dataflow model Hardware supports several data types for bandwidth optimization, e.g. 32-bit floating point, half, etc. Data is passed to the main memory stages via binding. [Pipeline diagram: Application, Vertex Processor, Assembly & Rasterization, Fragment Processor, Framebuffer Operations, Framebuffer; Textures]

Fragment processor has high resource limits: 1024 instructions; 512 constants or uniform parameters (each constant counts as one instruction); 16 texture units (reuse as many times as desired). No branching (but can do a lot with condition codes). No indexed reads from registers (use texture reads instead). No memory writes.
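The "no branching" limit is less restrictive than it sounds, because a per-pixel choice can be written without a jump. The sketch below uses an arithmetic select as a stand-in for condition-code predication: both outcomes are computed and a 0.0 or 1.0 comparison result picks one. The names `select` and `threshold_pass` are illustrative and this is not fragment-program code.

```python
def select(condition, if_true, if_false):
    """Branch-free choice: blend both values with a 0.0/1.0 mask."""
    mask = float(condition)  # comparison result as 1.0 or 0.0
    return mask * if_true + (1.0 - mask) * if_false

def threshold_pass(response, threshold):
    """Keep responses above the threshold and zero the rest, per pixel."""
    return [[select(v > threshold, v, 0.0) for v in row] for row in response]

print(threshold_pass([[0.2, 0.9], [1.5, 0.1]], 0.5))  # [[0.0, 0.9], [1.5, 0.0]]
```

The cost of this style is that both alternatives are always evaluated, which is acceptable when each alternative is cheap, as in the per-pixel tests used here.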

The algorithm Draw invokes the fragment programs. The texture becomes a data structure: use two as framebuffers to avoid RAW hazards. [Diagram, for each orientation to search: Image and Mask, FFT Fragment Program, Convolution Program, Cylinder Program, Find Max Program]
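The two-buffer idea can be illustrated on the CPU as a "ping-pong" loop: each pass reads only from one buffer and writes only to the other, and the two swap roles between passes. This is a sketch of the general technique under that assumption, not the authors' rendering code; `run_passes` and `shift_sum` are hypothetical names.

```python
def run_passes(image, passes):
    """Apply per-pixel passes, alternating between two buffers."""
    src = [row[:] for row in image]
    dst = [[0] * len(row) for row in image]
    for kernel in passes:
        for y in range(len(src)):
            for x in range(len(src[0])):
                # Read only from src and write only to dst, so no pixel
                # ever sees a value written earlier in the same pass.
                dst[y][x] = kernel(src, y, x)
        src, dst = dst, src  # swap roles for the next pass
    return src

def shift_sum(src, y, x):
    """Example pass: each pixel plus its left neighbor."""
    left = src[y][x - 1] if x > 0 else 0
    return src[y][x] + left

print(run_passes([[1, 1, 1, 1]], [shift_sum]))             # [[1, 2, 2, 2]]
print(run_passes([[1, 1, 1, 1]], [shift_sum, shift_sum]))  # [[1, 3, 4, 4]]
```

Running `shift_sum` in place on a single buffer would instead produce [[1, 2, 3, 4]] after one pass, because each pixel would read a neighbor that had already been overwritten. That is the read-after-write hazard the second buffer avoids.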

Results (CPU: 2.53 GHz P4; GPU: Nvidia FX5900) Mask size fixed (22x13), image size varied. *Additional GPU optimizations possible.

Results, log scale (CPU: 2.53 GHz P4; GPU: Nvidia FX5900) Mask size fixed (22x13), image size varied. 42.7 sec sec *Additional GPU optimizations possible.

Results Image size fixed (512x512), mask size varied. Varying the mask size allows limbs of varying sizes to be found in the same image.

Results

Comments GPUs and image processing are a good match. Moving data from CPU memory to the GPU is time-consuming, but this can be overcome. Installations and products are non-uniform, and exact specifications are hearsay.

Acknowledgements Kenneth Moreland Deva Ramanan Okan Arikan