Published by Anissa Twilley. Modified over 9 years ago.
1
Building Rome in a Day Sameer Agarwal (1), Noah Snavely (2), Ian Simon (1), Steven M. Seitz (1), Richard Szeliski (3). (1) University of Washington, (2) Cornell University, (3) Microsoft Research
2
Outline 1. Introduction 2. System Design 3. Result 4. Conclusion
3
Introduction Entering the search term “Rome” on Flickr returns more than two million photographs. 3D reconstruction in Google Earth and Microsoft’s Virtual Earth.
4
Exploring Photo Collections in 3D
5
Outline 1. Introduction 2. System Design – 1. Pre-processing & feature extraction – 2. Matching – 3. Geometric estimation 3. Result 4. Conclusion
6
Scene reconstruction Automatically estimate the position, orientation, and focal length of each camera, and the 3D positions of the feature points.
7
Feature detection Detect features using SIFT [Lowe, IJCV 2004]
10
Feature matching Match features between each pair of images using approximate nearest neighbor matching.
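The matching step can be sketched as follows. This is a minimal illustration, not the system's implementation: it uses an exhaustive nearest-neighbor search where the slides call for an approximate one, and the ratio test (the `ratio` parameter) is an assumption borrowed from the SIFT paper rather than something stated on this slide. The function name `match_features` and the toy 2D descriptors are hypothetical.

```python
import math

def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the nearest neighbor of desc_a[i]
    and is clearly closer than the second-nearest neighbor (ratio test)."""
    matches = []
    for i, da in enumerate(desc_a):
        # Exhaustive distances; a large-scale system would query an
        # approximate nearest neighbor index here instead (assumption).
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches

# Toy 2D "descriptors" (real SIFT descriptors have 128 dimensions).
a = [(0.0, 0.0), (5.0, 5.0), (2.5, 2.5)]
b = [(0.1, 0.0), (5.0, 5.1), (9.0, 9.0)]
print(match_features(a, b))  # the ambiguous third descriptor is rejected
```

The third descriptor in `a` lies almost equally far from two candidates in `b`, so the ratio test drops it instead of keeping an unreliable match.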
11
Feature matching Refine the matches using RANSAC [Fischler & Bolles 1987] to estimate a fundamental matrix between each pair of images.
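The RANSAC loop itself is independent of the model being fitted. The sketch below shows the sample, fit, and count-inliers cycle on a deliberately simpler model: a 2D line through two sampled points, standing in for the fundamental matrix that the system estimates from sampled correspondences. The function name `ransac_line`, the threshold, and the iteration count are illustrative assumptions.

```python
import random

def ransac_line(points, threshold=0.1, iterations=200, seed=0):
    """Return the largest set of points consistent with a single line.
    The line model is a stand-in for the fundamental matrix (assumption)."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # 1. Draw a minimal sample (two points define a line).
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # 2. Fit the model: line a*x + b*y + c = 0 through the sample.
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue  # degenerate sample (identical points)
        c = -(a * x1 + b * y1)
        # 3. Count the points within `threshold` of the model.
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm < threshold]
        # 4. Keep the model with the most support.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

line_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(3.0, 40.0), (7.0, -20.0), (1.0, 15.0)]
inliers = ransac_line(line_points + outliers)
print(len(inliers), "inliers kept")
```

For image pairs, the minimal sample would be a set of feature correspondences and the inlier test a distance to the epipolar line, but the loop structure stays the same.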
12
Correspondence estimation Link up pairwise matches to form connected components of matches across several images. [Figure: matched features linked across Image 1, Image 2, Image 3, and Image 4]
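One way to form these connected components is a union-find (disjoint-set) structure over (image, feature) pairs. This is a sketch under that assumption; the slides do not say which algorithm the system uses, and the name `build_tracks` and the input layout are hypothetical.

```python
def build_tracks(pairwise_matches):
    """Group matched features into tracks.
    pairwise_matches: iterable of ((img_a, feat_a), (img_b, feat_b))."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the components of every matched pair.
    for a, b in pairwise_matches:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Collect the members of each component.
    tracks = {}
    for node in list(parent):
        tracks.setdefault(find(node), set()).add(node)
    return list(tracks.values())

matches = [
    ((1, 10), (2, 7)),   # image 1 feature 10 <-> image 2 feature 7
    ((2, 7), (3, 4)),
    ((3, 4), (4, 2)),
    ((1, 11), (2, 9)),
]
tracks = build_tracks(matches)
print(sorted(len(t) for t in tracks))
```

Here the first three matches chain into one track spanning four images, and the last match forms a separate two-image track.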
13
Structure from motion Structure from motion: automatic recovery of camera motion and scene structure from two or more images. It is a self-calibration technique, also called automatic camera tracking or match moving. [Figure: unknown camera viewpoints]
14
Structure from motion Cameras 1, 2, and 3 with poses $(R_1, t_1)$, $(R_2, t_2)$, $(R_3, t_3)$ observe 3D points $p_1, \dots, p_7$. Minimize $f(R, T, P)$: find the rotations $R$, positions $T = \{t_i\}$, and 3D point locations $P$ that minimize the sum of squared reprojection errors $f$.
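The objective can be written out as a short function. This is a sketch under stated assumptions: a simple pinhole camera with a single focal length, no lens distortion, and the convention that a world point X maps to camera coordinates R X + t. The slides do not specify the camera model, and the names `project` and `reprojection_error` are hypothetical.

```python
def project(R, t, f, X):
    """Project 3D point X into a camera with rotation R (3x3 nested lists),
    translation t, and focal length f. Assumes X_cam = R X + t."""
    xc = [sum(R[r][c] * X[c] for c in range(3)) + t[r] for r in range(3)]
    return (f * xc[0] / xc[2], f * xc[1] / xc[2])

def reprojection_error(cameras, points, observations):
    """Sum of squared distances between observed and predicted 2D positions.
    cameras: {cam_id: (R, t, f)}, points: {pt_id: (X, Y, Z)},
    observations: [(cam_id, pt_id, (u, v))]."""
    total = 0.0
    for cam_id, pt_id, (u, v) in observations:
        R, t, f = cameras[cam_id]
        pu, pv = project(R, t, f, points[pt_id])
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = {0: (identity, (0.0, 0.0, 0.0), 100.0)}
points = {0: (1.0, 2.0, 10.0)}
exact = [(0, 0, (10.0, 20.0))]      # observation agrees with the projection
off_by_one = [(0, 0, (11.0, 20.0))]  # observation is one pixel off in u
print(reprojection_error(cameras, points, exact))
print(reprojection_error(cameras, points, off_by_one))
```

Bundle adjustment searches over all camera parameters and point positions to drive this sum down; the function above only evaluates it for one candidate solution.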
15
Incremental structure from motion
16
1. Optimize parameters for two cameras and their common points. 2. Find the new image with the most matches to existing points. 3. Initialize the new camera using pose estimation. 4. Bundle adjust. 5. Add new points. 6. Bundle adjust. Repeat from step 2 for each remaining image.
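The image-selection step of this loop can be sketched in isolation. The example assumes each unregistered image is described by the set of track ids it observes, which is a hypothetical data layout; pose estimation and bundle adjustment are not shown.

```python
def next_image(candidate_tracks, reconstructed_tracks):
    """Pick the unregistered image that sees the most already-reconstructed
    3D points.
    candidate_tracks: {image_id: set of track ids observed in that image}
    reconstructed_tracks: set of track ids that already have a 3D point."""
    return max(
        candidate_tracks,
        key=lambda img: len(candidate_tracks[img] & reconstructed_tracks),
        default=None,  # no candidates left
    )

candidates = {
    "img_a": {1, 2, 3},
    "img_b": {2, 3, 4, 5},
    "img_c": {9},
}
reconstructed = {1, 2, 3}
print(next_image(candidates, reconstructed))
```

In this toy input `img_a` overlaps the existing reconstruction in three tracks, `img_b` in two, and `img_c` in none, so `img_a` would be registered next.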
21
Vocabulary trees (Nister & Stewenius, 2006) For computational efficiency, a hierarchical k-means tree is used to quantize the feature descriptors into visual words.
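Quantization with such a tree amounts to descending from the root and following the closest cluster center at each level. The sketch below uses a tiny hand-built tree with made-up centers; in practice the tree would be trained by recursive k-means on a large set of descriptors. The node layout (`center`, `children`, `word`) and the name `quantize` are hypothetical.

```python
import math

def quantize(node, descriptor):
    """Descend the tree, choosing the nearest child center at each level,
    and return the visual-word id stored at the leaf reached."""
    while node["children"]:
        node = min(node["children"],
                   key=lambda child: math.dist(child["center"], descriptor))
    return node["word"]

def leaf(center, word):
    return {"center": center, "children": [], "word": word}

# Branching factor 2, depth 2: four visual words (toy 2D descriptors).
tree = {
    "center": None, "word": None,
    "children": [
        {"center": (0.0, 0.0), "word": None,
         "children": [leaf((-1.0, 0.0), 0), leaf((1.0, 0.0), 1)]},
        {"center": (10.0, 10.0), "word": None,
         "children": [leaf((9.0, 10.0), 2), leaf((11.0, 10.0), 3)]},
    ],
}
print(quantize(tree, (0.9, 0.1)), quantize(tree, (10.6, 9.0)))
```

The efficiency gain comes from the descent: with branching factor k and depth L, a lookup compares against about k × L centers instead of all k^L leaves.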
22
TF-IDF (term frequency–inverse document frequency) Consider a document containing 100 words in which the word “cow” appears 3 times: TF = 3 / 100 = 0.03. Assume we have 10 million documents and “cow” appears in one thousand of them: IDF = log10(10,000,000 / 1,000) = 4.
23
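The TF-IDF score is the product of these quantities: 0.03 × 4 = 0.12. A word is important if its TF-IDF score is large. A high term frequency within a particular document, combined with a low document frequency of the term across the whole collection, yields a high TF-IDF weight. TF-IDF therefore tends to filter out common terms and keep the important ones.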
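The worked example can be checked with a few lines of code. The function name `tf_idf` and its parameter names are hypothetical; the base-10 logarithm is the one implied by the arithmetic in the example.

```python
import math

def tf_idf(term_count, doc_length, num_docs, docs_with_term):
    """TF-IDF weight of one term in one document (base-10 logarithm)."""
    tf = term_count / doc_length                   # 3 / 100 = 0.03
    idf = math.log10(num_docs / docs_with_term)    # log10(10,000) = 4
    return tf * idf

# "cow": 3 occurrences in a 100-word document,
# present in 1,000 of 10 million documents.
score = tf_idf(3, 100, 10_000_000, 1_000)
print(round(score, 2))
```

In the image-matching setting the "words" are visual words produced by quantizing feature descriptors, and each image plays the role of a document.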
24
Query expansion / Large-scale image matching Better approach than matching every pair of images: use the bag-of-words technique to find likely matches. For each image, find the top M scoring other images and do detailed SIFT matching with those.
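The candidate-selection step can be sketched as follows. This assumes each image has already been reduced to a non-empty list of visual-word ids, and it scores image pairs with an unnormalized dot product of TF-IDF vectors; the exact scoring function used by the system is not given on the slides, so treat this as one plausible choice. The name `top_m_candidates` is hypothetical.

```python
import math
from collections import Counter

def top_m_candidates(image_words, m):
    """For each image, return the m other images with the highest
    TF-IDF dot-product score.
    image_words: {image_id: non-empty list of visual-word ids}."""
    n = len(image_words)
    # Document frequency: in how many images each word occurs.
    doc_freq = Counter(w for words in image_words.values() for w in set(words))
    vectors = {}
    for img, words in image_words.items():
        counts = Counter(words)
        vectors[img] = {w: (c / len(words)) * math.log(n / doc_freq[w])
                        for w, c in counts.items()}
    result = {}
    for img, vec in vectors.items():
        scores = [(sum(v * other_vec.get(w, 0.0) for w, v in vec.items()), other)
                  for other, other_vec in vectors.items() if other != img]
        scores.sort(key=lambda pair: -pair[0])  # highest score first
        result[img] = [other for _, other in scores[:m]]
    return result

images = {
    "a": [1, 1, 2, 3],
    "b": [1, 2, 2, 4],
    "c": [5, 6, 7, 3],
    "d": [5, 6, 8, 8],
}
print(top_m_candidates(images, 1))
```

Only the pairs returned here would go on to detailed SIFT matching, which reduces the work from all image pairs to roughly M pairs per image.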
30
Outline 1. Introduction 2. System Design 3. Result 4. Conclusion
31
Matching and reconstruction statistics for the three data sets
32
Building Rome in a Day Rome, Italy. Reconstructed 150,000 images in 21 hours on 496 compute cores. Landmarks shown: Colosseum, St. Peter’s Basilica, Trevi Fountain.
34
Dubrovnik Dubrovnik, Croatia. 4,619 images (out of an initial 57,845). Total reconstruction time: 23 hours Number of cores: 352
35
San Marco Square San Marco Square and environs, Venice. 14,079 photos, out of an initial 250,000. Total reconstruction time: 3 days. Number of cores: 496.
36
Outline 1. Introduction 2. System Design 3. Result 4. Conclusion
37
Conclusion Our experimental results demonstrate that it is now possible to reconstruct cities consisting of 150K images in less than a day on a cluster with 500 compute cores. Large-scale image matching. 3D models. http://grail.cs.washington.edu/rome/ http://phototour.cs.washington.edu/applet/index.html