Recent work in image-based rendering from unstructured image collections and remaining challenges
Sudipta N. Sinha
Microsoft Research, Redmond, USA
Image-based maps
Key Steps
– Structure from motion (SfM)
– Robust depth-map estimation
– Rendering
Recent results
– Structure from motion (SfM): A multi-stage linear approach to structure from motion. Sinha, Steedly & Szeliski, RMLE workshop, ECCV 2010
– Robust depth-map estimation: Piecewise planar stereo for image-based rendering. Sinha, Steedly & Szeliski, ICCV 2009
– Image-based navigation: Image-based walkthroughs from incremental and partial scene reconstructions. Kumar, Ahsan, Sinha & Jawahar, BMVC 2010
Sequential SfM
Fitzgibbon’98, Pollefeys’98, Nister’01, Schaffalitzky’02, Vergauwen’06, Snavely’06, Snavely’07, Agarwal’09, Gherardi’10
– Initial seed pair
– Pose estimation, triangulation
– Refinement
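The triangulation step of a sequential pipeline can be illustrated with the standard linear (DLT) two-view method. This is a minimal sketch, not code from any of the cited systems; the function name `triangulate_dlt`, the helper `project`, and the synthetic cameras are assumptions made for illustration.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 projection matrices; x1, x2: 2D image points (x, y).
    """
    # Each view contributes two linear equations in the homogeneous point X.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 matrix and dehomogenize."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Synthetic two-camera setup (hypothetical values): identity camera and a
# second camera translated along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])
X_est = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
```

In a sequential pipeline, points triangulated this way would normally be refined afterwards, which corresponds to the refinement step listed on the slide.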
Linear multi-stage approach to structure from motion (Sinha et al., ECCV-RMLE workshop)
Contributions
– Vanishing point (VP) constraints reduce drift in rotations
   – more accurate than [Govindu’04, Martinec’07] for urban scenes
   – faster pairwise matching + geometric verification
– New practical linear structure and translation estimation
   – more stable than the known linear method [Rother’03]
   – robust to outliers in 2D observations
   – easy to parallelize
   – faster than sequential SfM
   – much faster than L∞ methods
Linear multi-stage approach to structure from motion (Sinha et al., ECCV-RMLE workshop)
Pipeline stages and intermediate outputs, as recovered from the diagram:
– Images
– Feature Extraction and Vanishing Point (VP) Detection: VPs, interest pts
– Pair Matching (2-VP + 2 point RANSAC): VP tracks, relative rotations
– Global Rotation Estimation: global camera orientations
– Linear Reconstruction (2-view Reconstruction, Robust Alignment, Global Scale & Translation Estimation): relative pose estimates
– Full SfM initialization
– Final Bundle Adjustment
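To make the global rotation estimation stage concrete, here is a generic linear rotation-averaging sketch: each pairwise constraint R_j = R_ij R_i is linear in the entries of the unknown rotations, so all cameras can be solved for at once and then projected back onto valid rotations. It does not model the VP constraints mentioned on the slides, and the names `global_rotations`, `project_to_so3`, and `rodrigues` are assumptions introduced for illustration, not the authors' implementation.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix from an axis and an angle (radians)."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def project_to_so3(M):
    """Closest rotation matrix to M in the Frobenius sense."""
    U, _, Vt = np.linalg.svd(M)
    if np.linalg.det(U @ Vt) < 0:
        U[:, -1] *= -1
    return U @ Vt

def global_rotations(n_cams, rel):
    """Solve for all camera rotations from pairwise ones.

    rel: dict {(i, j): R_ij} with R_j ~= R_ij @ R_i.
    Camera 0 is fixed to the identity to remove the gauge freedom.
    """
    rows = 3 * (len(rel) + 1)
    A = np.zeros((rows, 3 * n_cams))
    b = np.zeros((rows, 3))
    for k, ((i, j), R_ij) in enumerate(rel.items()):
        r = 3 * k
        A[r:r + 3, 3 * j:3 * j + 3] = np.eye(3)   # + R_j
        A[r:r + 3, 3 * i:3 * i + 3] = -R_ij       # - R_ij R_i
    A[-3:, 0:3] = np.eye(3)                       # gauge: R_0 = I
    b[-3:] = np.eye(3)
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return [project_to_so3(X[3 * c:3 * c + 3]) for c in range(n_cams)]

# Synthetic example (hypothetical rotations); camera 0 is the identity.
R_true = [np.eye(3),
          rodrigues([0, 1, 0], 0.3),
          rodrigues([1, 1, 0], -0.5),
          rodrigues([0, 0, 1], 1.0)]
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
rel = {(i, j): R_true[j] @ R_true[i].T for i, j in edges}
R_est = global_rotations(4, rel)
```

Because every camera is solved for in one least-squares system rather than chained one after another, errors are spread over all pairwise constraints instead of accumulating along a sequence.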
Results
Timings
Breakdown of Timings
Comparison with sequential SfM
Sequences: STREET, HALLWAY
Panel labels:
– OURS (65 cams, 52K pts) before Bundle Adjustment
– BUNDLER (65 cams, 22K pts)
– BUNDLER (139 cams, 13K pts)
– OURS (184 cams, 27K pts)
Piecewise Planar Stereo for image-based rendering (Sinha et al., ICCV 2009)
– Feature matching
– Graph-cut based energy minimization
Piecewise Planar Stereo for image-based rendering (Sinha et al., ICCV 2009)
Planar Stereo Results
Piecewise Planar Stereo for image-based rendering
Also handles non-planar scenes now...
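The slides do not spell out how planar depth maps are used at render time. One standard piece of geometry that makes planar proxies convenient is the plane-induced homography: pixels lying on a known scene plane can be warped between two calibrated views with a single 3x3 matrix. The sketch below assumes the plane is written as n . X = d in the first camera's frame and that X2 = R X1 + t; the function name `plane_homography` and all numeric values are hypothetical.

```python
import numpy as np

def plane_homography(K1, K2, R, t, n, d):
    """Homography taking view-1 pixels to view-2 pixels for points on a plane.

    Plane: n . X = d in the view-1 camera frame. Relative pose: X2 = R @ X1 + t.
    """
    return K2 @ (R + np.outer(t, n) / d) @ np.linalg.inv(K1)

# Hypothetical calibration and relative pose.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.1
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([-0.5, 0.0, 0.1])

# Fronto-parallel plane z = 5 in the view-1 frame, and a 3D point on it.
n = np.array([0.0, 0.0, 1.0])
d = 5.0
X1 = np.array([0.7, -0.3, 5.0])

# Ground-truth projections into both views (homogeneous pixels).
x1 = K @ X1
x1 = x1 / x1[2]
x2 = K @ (R @ X1 + t)
x2 = x2 / x2[2]

# Warp the view-1 pixel with the plane-induced homography.
x2_warp = plane_homography(K, K, R, t, n, d) @ x1
x2_warp = x2_warp / x2_warp[2]
```

For a point that lies on the plane, the warped pixel coincides with the true projection in the second view; for points off the plane the two differ, which is why a per-pixel plane assignment matters.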
Image-based walkthroughs from incremental and partial scene reconstructions (Kumar et al., BMVC 2010)
– Skip the global scene reconstruction (SfM) step; generate several overlapping, partial reconstructions instead.
– During navigation, jump between local coordinate frames.
– Scales easily and is also parallelizable.
– Incremental matching & reconstruction (images appear over time)
– Fort sequence (~5800 images)
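The slides do not say how a jump between local coordinate frames is computed. One common option, shown here purely as an assumption, is to estimate a similarity transform (scale, rotation, translation) from 3D points that two overlapping partial reconstructions share, using the closed-form least-squares solution of Umeyama. The name `similarity_from_points` and the synthetic data are hypothetical.

```python
import numpy as np

def similarity_from_points(src, dst):
    """Least-squares similarity (s, R, t) with dst ~= s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding 3D points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    Xs, Xd = src - mu_s, dst - mu_d
    cov = Xd.T @ Xs / len(src)
    U, D, Vt = np.linalg.svd(cov)
    # Guard against a reflection in the recovered rotation.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / Xs.var(axis=0).sum()
    t = mu_d - s * (R @ mu_s)
    return s, R, t

# Shared points expressed in two local frames A and B (synthetic data).
rng = np.random.default_rng(0)
pts_a = rng.normal(size=(20, 3))
s_true = 2.5
c, sn = np.cos(0.4), np.sin(0.4)
R_true = np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([1.0, -2.0, 0.5])
pts_b = s_true * pts_a @ R_true.T + t_true

s, R, t = similarity_from_points(pts_a, pts_b)

# "Jumping" frames: re-express a camera center from frame A in frame B.
cam_center_a = np.array([0.0, 0.0, -3.0])
cam_center_b = s * (R @ cam_center_a) + t
```

Since each transform links only two overlapping reconstructions, such alignments can be computed independently per pair, which fits the slide's point that the approach parallelizes and never needs one global frame.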
Existing issues in unstructured SfM
– Accuracy vs. connectedness
– Reliable results from sparse, unstructured imagery
   – wide-baseline matching is still difficult
– Representations
   – metric vs. topological reconstructions? hybrid?
– Reconstructing indoors
   – bottlenecks: doorways, corridors
   – fewer features, non-Lambertian surfaces
Dynamic Image-based Maps: Challenges
– Acquisition
   – images vs. video
   – short-term dynamics vs. long-term dynamics
– Need truly incremental SfM
   – start from scratch but keep going … ?
   – interleaving matching, SfM and dense stereo
   – hybrid matching (2D-2D, 2D-3D, 3D-3D)
Dynamic Image-based Maps: Challenges
– Temporal appearance changes
   – illumination: day/night, seasons, weather, lights on/off (cyclic, predictable)
   – albedo changes: store-fronts, ads/billboards (irreversible)
– Geometric changes
   – temporary vs. permanent
– Mid-level features for higher-level recognition
Questions?