Cuong Cao Pham and Jae Wook Jeon, Member, IEEE

Domain Transformation-Based Efficient Cost Aggregation for Local Stereo Matching. Cuong Cao Pham and Jae Wook Jeon, Member, IEEE. IEEE Transactions on Circuits and Systems for Video Technology, 2012.

Outline: Introduction; Framework; Proposed Algorithm (Compute Costs; Cost Aggregation: Domain Transformation; Optimization & Refinement); Experimental Results; Conclusion

Introduction

Background
Global stereo algorithms: high accuracy but low speed.
Local stereo algorithms: high speed but low accuracy. The key step is cost aggregation.
Adaptive support-weight [4]:
‧The most well-known local method
‧The state-of-the-art local algorithm
‧Reduces the gap between global and local methods
→ Excessive time consumption, which grows with the support window size
[4] K.-J. Yoon and I.-S. Kweon, “Adaptive Support-Weight Approach for Correspondence Search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 650-656, 2006.

Related Work
‧Adaptive Weight [4]: bilateral filter
‧Cost-volume filtering [21]: guided filter
‧Geodesic Diffusion [27]: anisotropic diffusion → geodesic diffusion
[21] C. Rhemann, A. Hosni, M. Bleyer, C. Rother, and M. Gelautz, “Fast Cost-Volume Filtering for Visual Correspondence and Beyond,” in Proc. IEEE Intl. Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 3017-3024, 2011.
[27] L. De-Maeztu, A. Villanueva, and R. Cabeza, “Near Real-Time Stereo Matching Using Geodesic Diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 2, pp. 410-416, 2012.

Objective. Present a cost aggregation technique that achieves high precision and fast execution by using domain transformation. Domain transformation: the aggregation of 2D cost data becomes a sequence of 1D filters, which lowers the computational requirements.

Framework

Framework

Proposed Algorithm

Pixel-wise Cost Computation. The matching cost combines the truncated absolute difference (TAD) of the color with the TAD of the gradient to form the final cost data. Ii(p): intensity value of the i-th color channel in the RGB color space at pixel p of the image I. Tc: user-defined truncation value.
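The slide's equations are not preserved in this transcript, so the following is only a minimal sketch of a truncated-absolute-difference cost for one pixel pair. It assumes a blend of the form λ·(color TAD) + (1 − λ)·(gradient TAD), a color difference averaged over the three channels, and a scalar horizontal gradient. The function name `tad_cost` and its argument layout are hypothetical. The default parameter values are the ones quoted later in the experiments.

```python
def tad_cost(left_rgb, right_rgb, left_grad, right_grad,
             lam=0.1, t_c=7 / 255, t_g=2 / 255):
    """Matching cost for one left/right pixel pair (illustrative only).

    left_rgb, right_rgb: (r, g, b) tuples with values in [0, 1].
    left_grad, right_grad: scalar image gradients at the two pixels.
    Assumption: channel-averaged color difference and a lam / (1 - lam) blend.
    """
    # Mean absolute color difference over the three channels, then truncate.
    color_diff = sum(abs(a - b) for a, b in zip(left_rgb, right_rgb)) / 3.0
    c_color = min(color_diff, t_c)
    # Absolute gradient difference, truncated at t_g.
    c_grad = min(abs(left_grad - right_grad), t_g)
    return lam * c_color + (1.0 - lam) * c_grad
```

Truncation caps the penalty that any single outlier pixel (for example, one in an occluded region) can contribute to the aggregated cost.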

Aggregation 1D Cost Data. Inspired by the domain transformation technique [14], a dimensionality reduction technique that defines a geodesic distance-preserving representation, on the real line, of a 2D image embedded in 5D (x, y, Ir, Ig, Ib). The aggregation of 2D cost data becomes a sequence of 1D filters, which reduces computation time.
[14] E. S. L. Gastal and M. M. Oliveira, “Domain Transform for Edge-Aware Image and Video Processing,” ACM Trans. Graph., vol. 30, no. 4, 2011.

Aggregation 1D Cost Data. 1D discrete signal: $C_{d,y}$, row y of the cost slice $C_d$. Feedback comb filter [32]: $C'_{d,y}[n] = (1 - a)\,C_{d,y}[n] + a\,C'_{d,y}[n-1]$, where $C_{d,y}$ is the input signal, $C'_{d,y}$ is the output signal, and $a \in [0, 1]$ is the feedback coefficient. With a constant a, this is a non-edge-aware filter.
[32] J. Smith, “Introduction to Digital Filters with Audio Applications,” W3K Publishing, 2007.
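A minimal sketch of such a first-order recursive filter with a constant coefficient, assuming the normalized form out[n] = (1 − a)·in[n] + a·out[n−1] used by the domain-transform recursive filter of [14]. The function name and the choice to pass the first sample through unchanged are assumptions made for illustration.

```python
def feedback_comb_filter(signal, a):
    """Causal first-order recursive smoothing of a 1D list of floats.

    a in [0, 1] is the feedback coefficient: a = 0 returns the input,
    and larger values of a propagate earlier samples further.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("feedback coefficient must lie in [0, 1]")
    out = []
    for n, x in enumerate(signal):
        # Assumption: the first sample has no predecessor and is copied as-is.
        out.append(x if n == 0 else (1.0 - a) * x + a * out[-1])
    return out
```

Because a is the same everywhere, the filter smooths across object boundaries as readily as inside objects, which is why it is not edge-aware.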


Aggregation 1D Cost Data. For two similar samples, set a high value of a; for two different samples, set a low value of a (at a discontinuity, this stops the propagation chain). Edge-aware feedback comb filter: g is the chosen metric representing the dissimilarity between two samples. g is computed as the distance between the two samples in the 1D domain (transformed from the corresponding row of the guidance image I).
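One way to realize this behaviour, sketched under the assumption that the effective coefficient is a raised to the power g (the form used by the recursive filter in [14]): a small g keeps the coefficient close to a, while a large g drives it toward zero and blocks propagation. The function name and the list layout of `dists` are hypothetical.

```python
def edge_aware_comb_filter(signal, dists, a):
    """Edge-aware causal pass over a 1D list of floats.

    dists[n] is the dissimilarity g between samples n-1 and n
    (dists[0] is unused). Assumption: effective coefficient is a ** g.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        w = a ** dists[n]          # near a for similar samples, near 0 at edges
        out.append((1.0 - w) * signal[n] + w * out[-1])
    return out
```

With `dists` set to 1 everywhere this reduces to the constant-coefficient filter, so edge awareness comes entirely from the distance term.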

Domain Transformation. I : Ω ⊂ R² → R³ is a 2D RGB color image, p = (xp, yp) is the spatial coordinate, and I(p) = (rp, gp, bp) are the range coordinates. Goal: find a transform t : R² → R which preserves the original distances between points on C (given by some metric).

Domain transformation. The distance between two corresponding samples in the new domain R must equal the L1 distance between two neighboring points in the original domain R². gt(x) = t(x, I(x)) is the transformation operator at point x.

Domain transformation. Dividing both sides by h and taking the limit as h → 0 gives the derivative gt′(x). The value at any point u in the transformed domain is then obtained by integrating gt′(x) from 0 to u.

Domain transformation. Given the value at any point u in the transformed domain, the distance between any two points u and v in the transformed domain corresponds to the arc length of the signal I from u to v.

Domain transformation. As with the bilateral filter, the influence of the spatial and intensity-range information on the distance between any two points u and v can be controlled by embedding the values of σs and σr.
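The steps outlined on the slides above can be written out for a 1D signal I, following the domain-transform formulation of [14]. This is a sketch based on that reference, not a copy of the slides' own equations. It assumes a 1D signal, a monotonically increasing g_t (so the absolute value on the left-hand side can be dropped), and the σs/σr scaling used in [14].

```latex
% Assumptions: 1D signal I, g_t monotonically increasing, scaling as in [14].
\begin{align}
  g_t(x+h) - g_t(x) &= h + \lvert I(x+h) - I(x) \rvert
      && \text{(distances must match)} \\
  g_t'(x) &= 1 + \lvert I'(x) \rvert
      && \text{(divide by } h,\ h \to 0) \\
  g_t(u) &= \int_0^u \bigl(1 + \lvert I'(x) \rvert\bigr)\,dx
      && \text{(integrate from 0 to } u) \\
  g_t(v) - g_t(u) &= \int_u^v \bigl(1 + \lvert I'(x) \rvert\bigr)\,dx,
      \quad u \le v
      && \text{(arc length under the } L_1 \text{ norm)} \\
  g_t(u) &= \int_0^u \Bigl(1 + \frac{\sigma_s}{\sigma_r}\,
      \lvert I'(x) \rvert\Bigr)\,dx
      && \text{(with } \sigma_s,\ \sigma_r \text{ embedded)}
\end{align}
```

In the last line a larger σs/σr ratio stretches the transformed domain more strongly at intensity changes, so edges separate neighboring samples further.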

Domain transformation. The maximum absolute difference is selected to define the distance between two points in the original domain, which gives the final distance g.
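The slide's formula for g is not preserved here, so the sketch below is an assumption about its shape. It takes the maximum absolute difference over the three color channels of two neighboring guidance pixels and inserts it into the discrete domain-transform increment 1 + (σs/σr)·difference from [14]. The function name is hypothetical.

```python
def neighbor_distance(pix_a, pix_b, sigma_s, sigma_r):
    """Distance g between two neighboring guidance pixels (illustrative).

    pix_a, pix_b: (r, g, b) tuples with values in [0, 1].
    Assumption: the maximum is taken over color channels, and the spatial
    step between neighbors contributes the constant 1.
    """
    max_diff = max(abs(a - b) for a, b in zip(pix_a, pix_b))
    return 1.0 + (sigma_s / sigma_r) * max_diff
```

Identical neighbors give g = 1, the plain spatial step, and any color change adds a term scaled by σs/σr.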

Domain transformation. Figure panels: left image; non-edge-aware filter; edge-aware filter.

Aggregation 2D Cost Data

Aggregation 2D Cost Data. Four passes: 1. Left → Right; 2. Right → Left; 3. Top → Bottom; 4. Bottom → Top.

Aggregation 2D Cost Data. Figure panels: L → R; R → L; T → B; B → T.

Aggregation 2D Cost Data. For the vertical passes, the input is the 1D discrete signal taken from each column, along the y direction, of the cost slice Cd. σH: kernel standard deviation (implicitly set to σs). Values of σs ∈ [10, 300] and σr ∈ [0.01, 0.3] can yield good results.

Aggregation 2D Cost Data ‧Algorithm:
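The algorithm listing from this slide is not preserved in the transcript. The following is a minimal sketch of four-pass aggregation over one cost slice, not the authors' listing. It assumes the feedback coefficient a = exp(−√2/σH) from [14] with σH set to σs, a neighbor distance g = 1 + (σs/σr)·max channel difference, and an effective coefficient a^g. All function names are hypothetical.

```python
import math


def _causal_pass(values, guide, a, ratio):
    """One edge-aware recursive pass along a line of costs.

    values: list of floats; guide: list of (r, g, b) tuples of equal length.
    """
    out = [values[0]]
    for n in range(1, len(values)):
        diff = max(abs(p - q) for p, q in zip(guide[n], guide[n - 1]))
        w = a ** (1.0 + ratio * diff)      # assumed coefficient a ** g
        out.append((1.0 - w) * values[n] + w * out[-1])
    return out


def aggregate_slice(cost, guide, sigma_s=25.0, sigma_r=0.1):
    """Aggregate one cost slice (list of rows) guided by an RGB image."""
    h, w = len(cost), len(cost[0])
    a = math.exp(-math.sqrt(2.0) / sigma_s)   # assumption: sigma_H = sigma_s
    ratio = sigma_s / sigma_r
    out = [row[:] for row in cost]
    for y in range(h):
        row = _causal_pass(out[y], guide[y], a, ratio)                   # L -> R
        out[y] = _causal_pass(row[::-1], guide[y][::-1], a, ratio)[::-1]  # R -> L
    for x in range(w):
        col = [out[y][x] for y in range(h)]
        gcol = [guide[y][x] for y in range(h)]
        col = _causal_pass(col, gcol, a, ratio)                          # T -> B
        col = _causal_pass(col[::-1], gcol[::-1], a, ratio)[::-1]        # B -> T
        for y in range(h):
            out[y][x] = col[y]
    return out
```

Each pass touches every pixel once, so the cost per slice is linear in the number of pixels and does not depend on a support window size. The same routine would be applied to every disparity slice of the cost volume.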

Optimization & Refinement. Winner-take-all: select disparities. Left-right consistency check: handle occluded regions. Weighted median filter: remove noise.

Winner-take-all. Winner-take-all (WTA) strategy: $d_p = \arg\min_{d \in S_d} C'_d(p)$, where $S_d$ is the set of all possible disparities and $C'_d$ is the aggregated cost.
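A minimal sketch of winner-take-all selection over an aggregated cost volume. It assumes the volume is stored as nested lists indexed `[d][y][x]`, a layout chosen here purely for illustration.

```python
def winner_take_all(cost_volume):
    """Return a disparity map holding, per pixel, the index of the lowest cost.

    cost_volume[d][y][x] is the aggregated cost of disparity index d at (x, y).
    Ties resolve to the smallest disparity index.
    """
    n_disp = len(cost_volume)
    h, w = len(cost_volume[0]), len(cost_volume[0][0])
    return [[min(range(n_disp), key=lambda d: cost_volume[d][y][x])
             for x in range(w)]
            for y in range(h)]
```

Because each pixel is decided independently, the quality of the result depends entirely on the aggregation step that precedes it.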

Left-right consistency check. The disparity maps obtained at this stage contain errors in the occluded regions, so a left-right consistency check is performed. A pixel in the left disparity map is marked as invalidated when its value differs from the value of the corresponding pixel in the right disparity map by more than one. Each invalidated pixel is then assigned the minimum of the values of the two closest validated pixels.
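The check and the fill step can be sketched as follows. The sketch assumes integer disparities, the usual rectified convention that left pixel (x, y) with disparity d matches right pixel (x − d, y), and a search for the closest validated pixels along the same row. Pixels whose match falls outside the image are treated as invalidated. All of these are illustrative assumptions, and the function names are hypothetical.

```python
def left_right_check(disp_left, disp_right, max_diff=1):
    """Return a boolean map: True where the left disparity is validated."""
    h, w = len(disp_left), len(disp_left[0])
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = disp_left[y][x]
            xr = x - d                      # assumed matching column in the right view
            if 0 <= xr < w and abs(d - disp_right[y][xr]) <= max_diff:
                valid[y][x] = True
    return valid


def fill_invalid(disp, valid):
    """Replace each invalidated pixel by the smaller of its two closest
    validated neighbors on the same row (left and right)."""
    filled = [row[:] for row in disp]
    for y, row in enumerate(disp):
        w = len(row)
        for x in range(w):
            if valid[y][x]:
                continue
            left = next((row[i] for i in range(x - 1, -1, -1) if valid[y][i]), None)
            right = next((row[i] for i in range(x + 1, w) if valid[y][i]), None)
            candidates = [v for v in (left, right) if v is not None]
            if candidates:                  # leave the pixel alone if the row has no valid value
                filled[y][x] = min(candidates)
    return filled
```

Taking the minimum favors the smaller disparity, that is, the background surface, which is usually what an occluded pixel belongs to.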

Weighted Median Filter. A weighted median filter is used to remove streak-like artifacts and the small amount of remaining noise. Bilateral filter weights are used to compute the weighted median. The validated pixels are not affected by this operation.
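A minimal sketch of the weighted-median selection itself. It assumes the caller has already gathered the disparities in a window around an invalidated pixel together with one non-negative weight per neighbor, for example bilateral weights. How those weights are computed is left out, since the slide does not give the formula.

```python
def weighted_median(values, weights):
    """Return the value at which the cumulative weight first reaches half
    of the total weight. values and weights are non-empty, equal-length lists."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    acc = 0.0
    for value, weight in pairs:
        acc += weight
        if acc >= half:
            return value
    return pairs[-1][0]   # only reached if rounding keeps acc below half
```

With equal weights this is the ordinary median. Larger weights on similar-colored neighbors pull the result toward disparities from the same surface.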

Consistency Map vs. Final disparity Invalidated pixels

Experimental Results

Experimental Results. Middlebury stereo evaluation: Middlebury dataset. Real-world images: camcorder data. Execution time. CUDA implementation.

Middlebury Evaluation - 1
Comparison with the best-performing algorithms inspired by well-known edge-aware filters:
‧Adaptive Weight [4]: 35×35 support window with γs = 17 and γr = 7.5
‧Cost-volume filtering [21]: 19×19 support window and ε = 0.0004
‧Geodesic Diffusion [27]: iterated n = 24 times with γc = 40 and l0 = 0.15
‧InfoPermeable [31]: exponential function with σ = 25
‧Proposed: σs = 25 and σr = 0.1
[31] C. Cigla and A. A. Alatan, “Efficient Edge-Preserving Stereo Matching,” in ICCV Workshop on LDRMV, 2011.

Middlebury Evaluation - 1. The comparison covers the performance of the raw cost aggregation. The same pixel-wise cost computation and disparity optimization steps were used for every method to ensure a fair comparison. The TAD of the color and of the gradient is used to compute the matching costs, with {λ, Tc, Tg} = {0.1, 7/255, 2/255}. The guidance image used for the aggregation stage is filtered with a 3×3 median filter, which reduces high-frequency information that is not actually useful.

Experimental Results Only non-occluded and discontinuity regions

Middlebury Evaluation - 2. Without refinement vs. with refinement, using {λ, Tc, Tg, σs, σr} = {0.1, 7/255, 2/255, 45, 0.006}. A 3×3 median filter is applied to the guidance image used for the aggregation stage. The weighted median filter used in the disparity refinement stage has r = 21, γs = 81, and γr = 0.04.

Experimental Results without refinement with refinement

Experimental Results

Experimental Results

Experimental Results

Real-world Image. Camcorder data: Cafe (640×360, 32 possible disparities); Newspaper (512×384, 32 possible disparities); Book_Arrival (512×384, 60 possible disparities).

Proposed vs. CostFilter[21]

Execution time. Implemented in C++ on a PC with an AMD Athlon 64 X2 Dual Core 3800+ at 2.00 GHz. Only the execution time of the aggregation performed on the left view is measured; no occlusion handling or post-processing times are included.

Execution time. Number of iterations: n. Window: (2n+1) × (2n+1). Plot axis: support window size / number of iterations.

CUDA Implementation
Algorithm | Time (s) | Graphics Card | Image
GeoDif | 0.06 | NVIDIA GeForce GTX 480 | Tsukuba stereo pair
CostFilter | 0.041 | | 400×300 image
Proposed | 0.0095 | NVIDIA GeForce GTX 460 | Tsukuba stereo pair

Conclusion

Conclusion. The method solves the excessive time consumption bottleneck of the adaptive-weight approach by integrating the appealing properties of domain transformation into cost aggregation. Using a sequence of 1D operations gives lower computational requirements and lower memory costs, yielding a fast and accurate local method.