Stereo Video
1. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos
2. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid
3. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering
4. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering

A. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos
Michael Bleyer and Margrit Gelautz
International Symposium on Image and Signal Processing and Analysis (ISPA)

B. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid
Christian Richardt, Douglas Orr, Ian Davies, Antonio Criminisi, and Neil A. Dodgson
The European Conference on Computer Vision (ECCV)

C. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering
Asmaa Hosni, Christoph Rhemann, Michael Bleyer, and Margrit Gelautz
The Pacific-Rim Symposium on Image and Video Technology (PSIVT)

D. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering
Cuong Cao Pham, Vinh Dinh Nguyen, and Jae Wook Jeon
International Conference on Image Processing (ICIP), 2012

Outline
Introduction
Related Works
Methods and Results
  A. Median Filter
  B. Temporal DCB Grid
  C. Spatial-temporal Weighted Smoothing
  D. Three-pass Aggregation
Comparison
Conclusion

INTRODUCTION
Introduction
Most stereo matching work focuses only on static image pairs.
Conventional methods estimate disparities from spatial and color information.
The main problem when extending these methods to video is flickering.
Solution:
  Build on local methods (for real-time performance).
  Enforce temporal consistency (to suppress flickering).

RELATED WORKS
Related Works
About local methods
The key to a local method lies in the cost aggregation step: the cost data from neighboring pixels within a finite-size window are aggregated.
The best-known approaches are edge-preserving filters:
  Adaptive support weight
  Geodesic diffusion
  Bilateral filter
  Guided filter

Related Works
Single-frame stereo matching

Related Works
Spatio-temporal stereo matching
The disparity difference between two successive frames is minimized to enforce temporal consistency.

METHODS AND RESULTS
A. Median filter
A. Median filter
Computing one disparity map takes about 1 second, but video runs at about 30 to 60 frames per second.
=> The method cannot run in real time.
No quantitative data or comparison is given.

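The temporal median idea can be illustrated with a minimal sketch. It assumes the disparity maps of the neighboring frames have already been aligned to the current frame (for example along optical flow, which is not shown), and it takes a per-pixel median over the aligned maps. The function name `temporal_median` and the list-of-lists layout are assumptions made for illustration, not details taken from the paper.

```python
from statistics import median


def temporal_median(aligned_maps):
    """Per-pixel median over disparity maps that are already aligned.

    aligned_maps: list of 2D lists (rows of disparities), one per frame,
    e.g. frames t-1, t, t+1 warped onto frame t (warping is not shown).
    """
    height = len(aligned_maps[0])
    width = len(aligned_maps[0][0])
    return [
        [median(m[y][x] for m in aligned_maps) for x in range(width)]
        for y in range(height)
    ]


# A one-frame outlier (9) at the second pixel is replaced by the stable value.
previous_map = [[5, 5]]
current_map = [[5, 9]]
next_map = [[5, 5]]
smoothed = temporal_median([previous_map, current_map, next_map])
```

With three frames, a disparity that jumps in a single frame only is suppressed, which is the flicker case the slides describe.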
B. Temporal DCB Grid
Bilateral Grid
  It runs faster and uses less memory as σ increases.
Dual-Cross-Bilateral Grid

B. Temporal DCB Grid
Dichromatic DCB Grid
Comparison (fps)

B. Temporal DCB Grid
Temporal DCB Grid
Weighted sum of the last n = 5 frames, each weighted by w_i
  i = 0: current frame
  i = 1: previous frame

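A minimal sketch of the weighted sum over the last n frames follows. Flat lists of numbers stand in for the grid cells of each frame. The class name `TemporalBlend`, the halving weights, and the normalization by the sum of the weights in use are all assumptions made for illustration; the slide only states that frame i is weighted by w_i.

```python
from collections import deque


class TemporalBlend:
    """Keeps the most recent frames and returns their weighted sum."""

    def __init__(self, weights):
        # weights[0] applies to the current frame, weights[1] to the previous one, ...
        self.weights = list(weights)
        self.history = deque(maxlen=len(self.weights))

    def push(self, values):
        """Add the newest frame (a flat list of values) and return the blend."""
        self.history.appendleft(list(values))  # the oldest frame drops off the right
        used = self.weights[: len(self.history)]
        norm = sum(used)  # assumption: normalize by the weights actually used
        return [
            sum(w * frame[k] for w, frame in zip(used, self.history)) / norm
            for k in range(len(values))
        ]


# Hypothetical weights for n = 5 frames, halving with each step into the past.
blend = TemporalBlend([1.0, 0.5, 0.25, 0.125, 0.0625])
first = blend.push([2.0, 4.0])   # only one frame so far
second = blend.push([4.0, 1.0])  # current frame plus one previous frame
```

Because only past frames are used (i = 0 is the current frame), the blend can run online without waiting for future frames.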
B. Temporal DCB Grid
14 fps

B. Temporal DCB Grid
Source data

B. Temporal DCB Grid
Uses only intensity information.
Only near real-time.

C. Spatial-temporal Weighted Smoothing
Cost initialization
  Construct a spatio-temporal cost volume for each disparity d.
Cost aggregation
  Smooth the cost volume with a spatio-temporal filter (guided filter [1]).
Disparity computation
  Select the disparity with the lowest cost (winner-take-all, WTA).
Refinement
  Weighted median filter
[1] Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast Cost-Volume Filtering for Visual Correspondence and Beyond. CVPR (2011) and PAMI (2013)

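The first and third steps of this pipeline can be sketched for a single frame as follows. The absolute intensity difference is used here only as a stand-in matching cost, the aggregation step is omitted, and the names `abs_diff_cost`, `winner_take_all`, and `border_cost` are hypothetical; none of these details are taken from the paper.

```python
def abs_diff_cost(left, right, max_disp, border_cost=255.0):
    """Build cost[d][y][x] = |left(x, y) - right(x - d, y)|.

    Pixels whose match would fall outside the right image get border_cost.
    """
    height, width = len(left), len(left[0])
    return [
        [
            [
                abs(left[y][x] - right[y][x - d]) if x - d >= 0 else border_cost
                for x in range(width)
            ]
            for y in range(height)
        ]
        for d in range(max_disp + 1)
    ]


def winner_take_all(cost_volume):
    """Pick, per pixel, the disparity index with the lowest cost."""
    num_disp = len(cost_volume)
    height, width = len(cost_volume[0]), len(cost_volume[0][0])
    return [
        [
            min(range(num_disp), key=lambda d: cost_volume[d][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


# One scanline in which the left image equals the right image shifted by 1 pixel.
left_image = [[99, 10, 20, 30]]
right_image = [[10, 20, 30, 40]]
costs = abs_diff_cost(left_image, right_image, max_disp=1)
disparity = winner_take_all(costs)
```

In the full method the cost volume would be smoothed before the winner-take-all step; the selection itself stays the same.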
C. Spatial-temporal Weighted Smoothing
Cost initialization
Cost aggregation
  w_k: window of size w_x × w_y × w_t
  ε: smoothness parameter

C. Spatial-temporal Weighted Smoothing
The guided filter weights can be implemented by a sequence of linear operations.
All summations are 3D box filters and can be computed in O(N) time.

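To see why a box filter costs O(N) regardless of the window size, here is a minimal sketch based on prefix sums, applied along x, then y, then t. The function names, the `vol[t][y][x]` layout, and the clamping of the window at the borders are assumptions made for illustration.

```python
def box_sum_1d(values, radius):
    """Sum over the window [i - radius, i + radius], clamped to the array."""
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    n = len(values)
    # Each output needs one subtraction, whatever the radius is.
    return [
        prefix[min(n, i + radius + 1)] - prefix[max(0, i - radius)]
        for i in range(n)
    ]


def box_sum_3d(vol, radius):
    """Separable 3D box sum over vol[t][y][x]; the input is left unchanged."""
    frames, height, width = len(vol), len(vol[0]), len(vol[0][0])
    # Pass 1: along x (builds a new nested list).
    out = [[box_sum_1d(row, radius) for row in frame] for frame in vol]
    # Pass 2: along y.
    for t in range(frames):
        for x in range(width):
            column = box_sum_1d([out[t][y][x] for y in range(height)], radius)
            for y in range(height):
                out[t][y][x] = column[y]
    # Pass 3: along t.
    for y in range(height):
        for x in range(width):
            line = box_sum_1d([out[t][y][x] for t in range(frames)], radius)
            for t in range(frames):
                out[t][y][x] = line[t]
    return out


line_sums = box_sum_1d([1, 2, 3, 4], radius=1)
ones = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
volume_sums = box_sum_3d(ones, radius=1)
```

Each voxel is touched a constant number of times per pass, so the total work grows with the number of voxels N and not with the window size.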
C. Spatial-temporal Weighted Smoothing
Disparity computation: winner-take-all
Refinement: weighted median filter
=> This step only reduces errors within a single frame.

C. Spatial-temporal Weighted Smoothing
Temporal vs. frame-by-frame processing.
2nd row: Disparity maps computed by a frame-by-frame implementation show flickering artifacts.
3rd row: The proposed method exploits temporal information and thus removes most of these artifacts.

D. Three-pass cost aggregation
Three-pass cost aggregation technique based on information permeability (adaptive support-weight) [2].
[2] Yoon, K.J., Kweon, I.S.: Locally Adaptive Support-Weight Approach for Visual Correspondence Search. In: CVPR (2005)

D. Three-pass cost aggregation
[Figure: frames i-1, i, and i+1]

D. Three-pass cost aggregation
Matching cost initialization
  v = (x, y, t) represents the spatial and temporal position of a voxel.
Similarity (weight) function
  Shows the effectiveness of using temporal information in addition to spatial information.

D. Three-pass cost aggregation
Spatial aggregation: horizontal and then vertical

D. Three-pass cost aggregation
Temporal aggregation: forward and backward
Disparity computation: WTA
Refinement: consistency check and 3 × 3 median filter

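Each of the three passes (horizontal, vertical, temporal) can be viewed as the same one-dimensional, two-direction recursion applied along a different axis. The sketch below shows one such pass on a single line of costs. The exponential weight on the difference of a grayscale guide signal, the parameter `sigma`, and the function names are assumptions made for illustration and may differ from the weights used in the paper.

```python
import math


def permeability(a, b, sigma):
    """Weight in (0, 1]: close to 1 for similar neighbors, near 0 across edges."""
    return math.exp(-abs(a - b) / sigma)


def two_way_pass(costs, guide, sigma):
    """Aggregate costs along one axis with a forward and a backward recursion."""
    n = len(costs)
    forward = list(costs)
    for i in range(1, n):
        forward[i] += permeability(guide[i], guide[i - 1], sigma) * forward[i - 1]
    backward = list(costs)
    for i in range(n - 2, -1, -1):
        backward[i] += permeability(guide[i], guide[i + 1], sigma) * backward[i + 1]
    # The center cost is part of both running sums (a choice of this sketch).
    return [f + b for f, b in zip(forward, backward)]


# Uniform guide: every sample sees the whole line.
flat = two_way_pass([1.0, 2.0, 3.0], [5, 5, 5], sigma=10.0)
# Strong edge before the last sample: support does not cross the edge.
edge = two_way_pass([1.0, 2.0, 3.0], [0, 0, 1000], sigma=1.0)
```

Per element, this pass uses two multiplications and three additions. Three such passes give six multiplications and nine additions, which is consistent with the operation count quoted for this method.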
D. Three-pass cost aggregation
Computational complexity
  Only six multiplications and nine additions per voxel.
  More efficient than the adaptive support-weight approach.
  No motion estimation is required.

COMPARISON
Comparison
Method
  A. Optical flow + median filter
  B. Weighted sum of the last 5 frames
  C. Temporal guided filter
  D. Three-pass aggregation
Drawback
  Too slow; over-smoothing
Reference frames
  A. 3 frames (-1 to 1)
  B. 5 frames (-4 to 0)
  C. 5 frames (-2 to 2)
  D. 3 frames (-1 to 1)

Comparison
Without post-processing
With post-processing: consistency check and 3 × 3 median filter

CONCLUSION
Conclusion
The methods are based on edge-preserving filtering and extend these concepts to the time dimension.
They only handle scenes with slow motion.
They do not perform well on dynamic scenes that contain large object motions.