Developing the Demosaicing Algorithm in GPGPU. Ping Xiang, Electrical Engineering and Computer Science.



Outline: Background, Algorithm, Implementation, Experiment Results, Future Work

Background: Color Filter Array (CFA). A mosaic of color filters placed in front of the image sensor.

Background: A demosaicing algorithm reconstructs a full-color image from the data collected through the color filter array.

Algorithm: Bilinear interpolation. The red value of a non-red pixel is computed as the average of the two or four adjacent red pixels, and similarly for blue and green.
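A minimal CPU reference sketch of the bilinear rule described above, in plain Python. It assumes an RGGB Bayer layout and an image of at least 2x2 samples, and at the image border it averages only the neighbours that exist; none of these choices is stated in the slides, and the names `bayer_color` and `demosaic_bilinear` are hypothetical.

```python
def bayer_color(y, x):
    """Channel (0=R, 1=G, 2=B) sampled at (y, x), assuming an RGGB mosaic."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2


def demosaic_bilinear(raw):
    """Bilinear demosaicing of a 2D list of CFA samples.

    Returns rows of (R, G, B) tuples. Assumes at least 2x2 samples.
    """
    h, w = len(raw), len(raw[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            pixel = []
            for c in range(3):
                if bayer_color(y, x) == c:
                    # This channel was measured here, so keep the sample.
                    pixel.append(float(raw[y][x]))
                    continue
                # Average the samples of channel c in the 3x3 neighbourhood.
                # In the interior these are the two or four adjacent samples.
                total, count = 0.0, 0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < h and 0 <= nx < w
                        if inside and bayer_color(ny, nx) == c:
                            total += raw[ny][nx]
                            count += 1
                pixel.append(total / count)
            row.append(tuple(pixel))
        out.append(row)
    return out
```

Every output pixel depends only on a fixed 3x3 window of the input, which is what makes the method a natural fit for one-thread-per-pixel GPU kernels.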

Algorithm: For Green Channels
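The slide's figure is not preserved in this transcript. As a sketch of what the bilinear rule implies for green, assume $G(i,j)$ denotes a measured green sample and $\hat{G}(i,j)$ the estimate at a red or blue site (this notation is an assumption, not the slide's). Each such interior site has four green neighbours, above, below, left and right:

```latex
\hat{G}(i,j) = \frac{1}{4}\Bigl[\, G(i-1,j) + G(i+1,j) + G(i,j-1) + G(i,j+1) \,\Bigr]
```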

Algorithm: For Red or Blue Channels
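The slide's figure is not preserved in this transcript. As a sketch of the two-neighbour and four-neighbour cases for red, assume $R(i,j)$ denotes a measured red sample and $\hat{R}(i,j)$ an estimate (assumed notation). Blue follows by swapping the roles of red and blue.

```latex
% Green site whose red neighbours lie to the left and right:
\hat{R}(i,j) = \frac{1}{2}\Bigl[\, R(i,j-1) + R(i,j+1) \,\Bigr]

% Green site whose red neighbours lie above and below:
\hat{R}(i,j) = \frac{1}{2}\Bigl[\, R(i-1,j) + R(i+1,j) \,\Bigr]

% Blue site, whose four red neighbours are on the diagonals:
\hat{R}(i,j) = \frac{1}{4}\Bigl[\, R(i-1,j-1) + R(i-1,j+1) + R(i+1,j-1) + R(i+1,j+1) \,\Bigr]
```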

Implementation: Optimization. 1. Vectorize the pixel data to be processed. 2. Use shared memory to reduce data transfer.

Implementation: 1. Vectorize the pixel data to be processed.
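The slides do not show how the data was vectorized. One common reading is that adjacent samples are grouped into 4-wide vectors, so that one fetch and one arithmetic instruction cover four pixels. The sketch below mimics that grouping in plain Python; the helper names `to_vec4` and `vec4_avg` are hypothetical, and the zero padding of the row tail is an assumption.

```python
def to_vec4(row, pad=0):
    """Group a row of samples into 4-element tuples, padding the tail with `pad`."""
    padded = list(row) + [pad] * (-len(row) % 4)
    return [tuple(padded[i:i + 4]) for i in range(0, len(padded), 4)]


def vec4_avg(a, b):
    """Component-wise average of two 4-wide vectors.

    With a = four samples from the row above and b = four samples from the
    row below, this interpolates four vertically flanked pixels at once.
    """
    return tuple((p + q) / 2 for p, q in zip(a, b))


above = to_vec4([10, 20, 30, 40])[0]
below = to_vec4([30, 40, 50, 60])[0]
interpolated = vec4_avg(above, below)
```

On a GPU the same idea would typically use the platform's built-in 4-component vector types rather than tuples.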

Implementation: 2. Use shared memory to reduce data transfer.
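The benefit of shared memory here comes from neighbouring pixels reading overlapping 3x3 windows. The sketch below stages one tile plus a one-pixel halo into a local buffer and counts the global-memory reads with and without that staging. The tile size, the clamping at the image border, and the function names are assumptions for illustration, and the naive count ignores any hardware caching.

```python
def load_tile(image, ty, tx, tile, radius=1):
    """Copy tile (ty, tx) plus its halo into a local buffer, clamping at borders."""
    h, w = len(image), len(image[0])
    y0, x0 = ty * tile - radius, tx * tile - radius
    size = tile + 2 * radius
    return [[image[min(max(y0 + y, 0), h - 1)][min(max(x0 + x, 0), w - 1)]
             for x in range(size)]
            for y in range(size)]


def global_reads_per_tile(tile, radius=1):
    """Return (naive, tiled) global-memory read counts for one square tile."""
    window = 2 * radius + 1
    naive = tile * tile * window * window  # each pixel re-reads its whole window
    tiled = (tile + 2 * radius) ** 2       # tile plus halo, loaded once
    return naive, tiled


naive, tiled = global_reads_per_tile(16)  # 16x16 tile: 2304 reads vs 324 reads
```

For a 16x16 tile this is roughly a sevenfold reduction in global reads, at the cost of a synchronization point between the load and compute phases.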

Experiment Results: Platform. ATI Radeon™ HD 4870 with Brook+ 1.4; Nvidia GeForce 8800 GTX with CUDA 2.1; dual-core AMD Opteron(tm) 2212 at 2.0 GHz.

Experiment Results: For small data sizes, the GPU is not always a good choice. a. Memory transfer time dominates the kernel execution time. b. The computation is not complex enough to offset that cost.
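One way to see why transfers dominate at small sizes is a simple cost model in which each host-device copy pays a fixed latency plus a per-byte cost, while the kernel cost scales with the data size. The model and all parameter names below are assumptions for illustration; the values in the usage lines are arbitrary and are not measurements from these experiments.

```python
def gpu_time_breakdown(num_bytes, bandwidth_bytes_per_s, latency_s, kernel_s_per_byte):
    """Return (transfer_time, kernel_time) in seconds under a linear cost model."""
    # One upload and one download, each paying a fixed latency.
    transfer = 2 * (latency_s + num_bytes / bandwidth_bytes_per_s)
    kernel = num_bytes * kernel_s_per_byte
    return transfer, kernel


def transfer_fraction(num_bytes, bandwidth_bytes_per_s, latency_s, kernel_s_per_byte):
    """Fraction of total GPU time spent moving data."""
    transfer, kernel = gpu_time_breakdown(
        num_bytes, bandwidth_bytes_per_s, latency_s, kernel_s_per_byte)
    return transfer / (transfer + kernel)


# Arbitrary illustrative values: the fixed latency weighs more on the small input.
small = transfer_fraction(4, 4.0, 1.0, 0.5)
large = transfer_fraction(400, 4.0, 1.0, 0.5)
```

Under this model the transfer share shrinks as the input grows, because the fixed latency is amortized over more bytes.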

Experiment Results: When the data size is small, CUDA performs better. When the data size increases to 4K, Brook+ performance catches up with CUDA.

Experiment Results: Explanation? Comparison of memory speed and stream processing units. HD: *5*10*2. GTX: (16*8).

Experiment Results: Shared register usage. Read data into shared registers and reuse it where possible.

Future Work: 1. Use shared memory for further optimization. 2. Integrate the code with a proper interface to import image data and export pixel data. 3. Report.

References: 1. Henrique S. Malvar, Li-wei He, and Ross Cutler, "High-Quality Linear Interpolation for Demosaicing of Bayer-Patterned Color Images." 2. Alexey Lukin and Denis Kubasov, "An Improved Demosaicing Algorithm."

Questions?