Jungwook Choi and Rob A. Rutenbar

Presentation transcript:

Configurable and Scalable Belief Propagation Accelerator for Computer Vision
Jungwook Choi and Rob A. Rutenbar

Belief Propagation FPGA for Computer Vision
A variety of pixel-labeling applications in computer vision map to probabilistic graphical models, which belief propagation (BP) solves effectively.
FPGA acceleration -> better performance per watt, plus reconfigurability.
Example applications: stereo matching, image denoising, object segmentation.
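As a minimal sketch of what "pixel labeling as a graphical model" means, the snippet below scores one labeling of a 4-connected pixel grid as a sum of per-pixel data costs and pairwise smoothness costs, the energy form used in benchmarks such as [1]. The function name `labeling_energy` and its argument layout are assumptions made for illustration; they do not come from the slides.

```python
def labeling_energy(labels, data_cost, smooth_cost):
    """Energy of one labeling on a 4-connected pixel grid (illustrative only).

    labels[y][x]        -> label index chosen for pixel (x, y)
    data_cost[y][x][l]  -> cost of assigning label l to pixel (x, y)
    smooth_cost(a, b)   -> penalty when neighbouring pixels take labels a and b
    """
    height, width = len(labels), len(labels[0])
    energy = 0.0
    for y in range(height):
        for x in range(width):
            l = labels[y][x]
            energy += data_cost[y][x][l]      # unary (data) term
            if x + 1 < width:                 # edge to the right neighbour
                energy += smooth_cost(l, labels[y][x + 1])
            if y + 1 < height:                # edge to the bottom neighbour
                energy += smooth_cost(l, labels[y + 1][x])
    return energy
```

BP searches for a labeling with low energy; in stereo matching the labels would be disparities, in denoising they would be intensities.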

Before: Point-Accelerators for BP [FPL 2012, FPGA 2013]
Very fast stereo matching, but not configurable to other BP problems.
Pipelined, but not scalable/parallel (a single PE consumes the entire memory bandwidth).
Figures: Pipelined Message Passing Architecture; Video Stereo Matching Benchmark.

New: Scalable/Configurable BP Architecture
No longer just a pipeline: truly parallel.
An efficient new memory subsystem overlaps bandwidth and computation, and checks for data conflicts.
P parallel processor elements (pixel streams).
A novel, configurable Factor-Evaluation unit removes the O(|Labels|^2) complexity.
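For context, the quadratic cost comes from the standard min-sum message update, where every output label is compared against every input label. The sketch below shows that exact update in software. The names `aggregate` and `exact_message` are hypothetical, and this is the textbook computation, not a description of the Factor-Evaluation unit.

```python
def aggregate(data_cost, incoming):
    """h[l] = data cost of label l plus all incoming messages for l
    (the message from the target neighbour is assumed to be excluded)."""
    return [d + sum(m[l] for m in incoming) for l, d in enumerate(data_cost)]


def exact_message(h, pairwise):
    """Exact min-sum message: m[lq] = min over lp of h[lp] + pairwise(lp, lq).

    Two nested loops over the label set give O(L^2) work per message.
    """
    num_labels = len(h)
    return [
        min(h[lp] + pairwise(lp, lq) for lp in range(num_labels))
        for lq in range(num_labels)
    ]
```

With 16 labels this is 256 cost evaluations per message; with 256 labels it is 65,536, which is why the label count dominates the cost of BP.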

Fast Configurable Message Passing: Jump Flooding
Problem: BP message computation is quadratic in L = |Labels|.
Solution: Jump Flooding* approximates the BP message in about L log(L) operations.
Analogy: like the FFT, it uses a smart ordering of label arithmetic and comparisons.
Figure labels: cost function for inference; Jump Flooding Message Passing Unit.
*[Rong, Tan, ACM Symp Int 3D, 2006]
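One way jump flooding could be applied to a min-sum message is sketched below. It is an assumption about the general idea, not the authors' hardware design: each label remembers its best source label ("seed") so far, and in each pass it also considers the seeds held by the labels `step` positions away, with `step` halving every pass. The function name `jump_flood_message` is hypothetical, and the sketch assumes `pairwise(l, l) == 0`.

```python
def jump_flood_message(h, pairwise):
    """Approximate m[l] = min over s of h[s] + pairwise(s, l) by jump flooding.

    About log2(L) passes, each testing at most 3 candidates per label,
    so roughly O(L log L) work instead of O(L^2). The result never falls
    below the exact message, because it minimises over a subset of seeds.
    """
    num_labels = len(h)
    seed = list(range(num_labels))       # every label starts as its own seed
    step = 1
    while step * 2 < num_labels:         # largest power of two below L
        step *= 2
    while step >= 1:
        new_seed = seed[:]               # double buffer: read old, write new
        for l in range(num_labels):
            best = seed[l]
            best_cost = h[best] + pairwise(best, l)
            for nb in (l - step, l + step):
                if 0 <= nb < num_labels:
                    cand = seed[nb]      # seed currently held by the neighbour
                    cost = h[cand] + pairwise(cand, l)
                    if cost < best_cost:
                        best, best_cost = cand, cost
            new_seed[l] = best
        seed = new_seed
        step //= 2
    return [h[s] + pairwise(s, l) for l, s in enumerate(seed)]
```

For example, `jump_flood_message([0, 10, 10, 10], lambda a, b: abs(a - b))` returns `[0, 1, 2, 3]`, which matches the exact message for that input. In general the output is an upper bound on the exact message rather than a guaranteed match.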

Results: Positive Scalability
2 and 4 PEs running in hardware (limited by Xilinx V5 size); simulations cover 1-16 PEs.
Results are parameterized by the "bandwidth needed to feed P processors."
If the architecture can be fed, scalability looks promising.
Figure: Execution Time vs. Normalized Memory Bandwidth to Feed P Processors (memory block size B = 4, fixed); plot labels: P = 2, P = 4.

Results: Configurable BP Architecture
12-40x faster than software (with 4 PEs); no loss of result quality.
First custom hardware ever to run more than one {Middlebury, OpenGM} benchmark.
Speed comparable to the "point-accelerator."
Table: Comparison of Execution Time (in sec) for {Middlebury [1], OpenGM [2]} Benchmarks.
Figure: Inference Results for {Middlebury, OpenGM}.

[1] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother, “A comparative study of energy minimization methods for Markov random fields with smoothness-based priors,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 6, pp. 1068–1080, 2008.
[2] J. H. Kappes, B. Andres, F. Hamprecht, C. Schnorr, S. Nowozin, D. Batra, S. Kim, B. X. Kausler, J. Lellmann, N. Komodakis et al., “A comparative study of modern inference techniques for discrete energy minimization problems,” in CVPR. IEEE, 2013, pp. 1328–1335.