Published by Nigel Holt. Modified over 9 years ago.
1
Acquiring 3D Indoor Environments with Variability and Repetition
Young Min Kim, Stanford University
Niloy J. Mitra, UCL / KAUST
Dong-Ming Yan, KAUST
Leonidas Guibas, Stanford University
2
Data Acquisition via Microsoft Kinect
Raw data:
– Noisy point clouds
– Unsegmented
– Occlusion issues
Our tool: Microsoft Kinect
– Real-time
– Provides depth and color
– Small and inexpensive
3
Dealing with Pointcloud Data
Object-level reconstruction [Chang and Zwicker 2011]
Scene-level reconstruction [Xiao et al. 2012]
4
Mapping Indoor Environments
Mapping outdoor environments
– Roads to drive vehicles
– Flat surfaces
General indoor environments contain both objects and flat surfaces
– Diversity of objects of interest
– Objects are often cluttered
– Objects deform and move
Solution: Utilize semantic information
5
Nature of Indoor Environments
Man-made objects can often be well-approximated by simple building blocks
– Geometric primitives
– Low DOF joints
Many repeating elements
– Chairs, desks, tables, etc.
Relations between objects give good recognition cues
6
Indoor Scene Understanding with Pointcloud Data
Patch-based approach [Silberman et al. 2012] [Koppula et al. 2011]
Object-level understanding [Shao et al. 2012] [Nan et al. 2012]
7
Comparisons
[1] An Interactive Approach to Semantic Modeling of Indoor Scenes with an RGBD Camera
[2] A Search-Classify Approach for Cluttered Indoor Scene Understanding
Columns: [1] | [2] | ours
Prior model: 3D database | Learned
Deformation: Scaling | Part-based scaling | Learned
Matching: Classifier | Geometric
Segmentation: User-assisted | Iteration
Data: Microsoft Kinect | Mantis Vision | Microsoft Kinect
8
Contributions
Novel approach based on a learning stage
– The learning stage builds a model that is specific to the environment
Build an abstract model composed of simple parts and the relationships between parts
– Uniquely explains possible low DOF deformation
Recognition stage can quickly acquire large-scale environments
– About 200 ms per object
9
Approach
Learning: Build a high-level model of the repeating elements
Recognition: Use the model and relationship to recognize the objects
[Figure labels: translational, rotational]
10
Approach
Learning
– Build a high-level model of the repeating elements
11
Output Model: Simple, Lightweight Abstraction
Primitives
– Observable faces
Connectivity
– Rigid
– Rotational
– Translational
– Attachment
Relationship
– Placement information
[Figure labels: contact, translational, rotational]
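The abstraction on this slide (primitives, typed connections between them, and placement information) could be held in a small data structure. The following is only a sketch under stated assumptions, not the authors' representation: the class names, the box-extent parameterisation, the `placement` string, and the choice to count rigid and attachment joints as zero degrees of freedom are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


class JointType(Enum):
    # The four connectivity types named on the slide.
    RIGID = "rigid"
    ROTATIONAL = "rotational"
    TRANSLATIONAL = "translational"
    ATTACHMENT = "attachment"


@dataclass
class Primitive:
    name: str
    size: Vec3      # hypothetical: box extents in metres
    height: float   # hypothetical: height of the primitive above the ground


@dataclass
class Joint:
    parent: int                      # index into ObjectModel.primitives
    child: int                       # index into ObjectModel.primitives
    kind: JointType
    axis: Vec3 = (0.0, 0.0, 1.0)     # rotation or sliding axis, if any

    def dof(self) -> int:
        # Assumption: rotational and translational joints add one degree of
        # freedom each; rigid and attachment joints add none.
        movable = (JointType.ROTATIONAL, JointType.TRANSLATIONAL)
        return 1 if self.kind in movable else 0


@dataclass
class ObjectModel:
    name: str
    primitives: List[Primitive] = field(default_factory=list)
    joints: List[Joint] = field(default_factory=list)
    placement: str = "on ground"     # hypothetical placement descriptor

    def total_dof(self) -> int:
        return sum(j.dof() for j in self.joints)


def make_chair() -> ObjectModel:
    """Build an illustrative swivel chair: base, seat, and back."""
    base = Primitive("base", (0.5, 0.5, 0.05), 0.05)
    seat = Primitive("seat", (0.45, 0.45, 0.05), 0.45)
    back = Primitive("back", (0.45, 0.05, 0.40), 0.80)
    joints = [
        Joint(parent=0, child=1, kind=JointType.ROTATIONAL),  # seat swivels on base
        Joint(parent=1, child=2, kind=JointType.RIGID),       # back fixed to seat
    ]
    return ObjectModel("chair", [base, seat, back], joints)
```

With this layout, `make_chair().total_dof()` counts a single swivel joint, which matches the idea of a low DOF model built from a handful of parts.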
12
Joint Matching and Fitting
Individual segmentation
– Group by similar normals
Initial matching
– Focus on large parts
– Use size, height, relative positions
– Keep consistent match
Joint primitive fitting
– Add joints if necessary
– Incrementally complete the model
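The "group by similar normals" step can be illustrated with a greedy grouping of normal vectors by angle. This is a minimal sketch rather than the method used in the talk: the function names and the 15 degree threshold are assumptions, and the sketch ignores spatial adjacency between points, which a real segmentation would also need.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length normal")
    return (v[0] / length, v[1] / length, v[2] / length)


def group_by_normals(normals: Sequence[Vec3],
                     max_angle_deg: float = 15.0) -> List[List[int]]:
    """Greedily group indices whose normals lie within max_angle_deg of the
    first normal that opened the group."""
    cos_threshold = math.cos(math.radians(max_angle_deg))
    groups = []  # list of (representative unit normal, member indices)
    for index, normal in enumerate(normals):
        unit = normalize(normal)
        for representative, members in groups:
            cosine = sum(a * b for a, b in zip(representative, unit))
            if cosine >= cos_threshold:
                members.append(index)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((unit, [index]))
    return [members for _, members in groups]
```

For example, normals pointing roughly up and normals pointing roughly along the x axis end up in two separate groups, each a candidate planar part.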
13
Approach
Learning
– Build a high-level model of the repeating elements
14
Approach
Learning
– Build a high-level model of the repeating elements
Recognition
– Use the model and relationship to recognize the objects
15
Hierarchy
Ground plane and desk
Objects
– Isolated clusters
Parts
– Group by normals
The segmentation is approximate and will be corrected later
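The first two levels of this hierarchy (strip the supporting plane, then treat isolated clusters as objects) might be sketched as follows. This is an illustration under assumptions, not the talk's implementation: the ground is taken to be a known horizontal plane at `ground_z`, the tolerances are arbitrary, and the clustering is a brute-force region growing that is quadratic in the number of points.

```python
import math
from collections import deque
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def remove_ground(points: Sequence[Point], ground_z: float = 0.0,
                  tol: float = 0.02) -> List[Point]:
    """Drop points within tol of an assumed horizontal ground plane."""
    return [p for p in points if abs(p[2] - ground_z) > tol]


def euclidean_clusters(points: Sequence[Point],
                       radius: float = 0.1) -> List[List[int]]:
    """Group point indices into clusters connected by hops of at most radius."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        queue = deque([seed])
        cluster = [seed]
        while queue:
            current = queue.popleft()
            # Collect neighbours first, then update the set, so the set is
            # never modified while it is being iterated.
            near = [j for j in unvisited
                    if math.dist(points[current], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        clusters.append(sorted(cluster))
    return clusters
```

Each returned cluster is a list of indices into the ground-free point list. Because this segmentation is only approximate, clusters that merge neighbouring objects or split one object would still need the later refinement.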
16
Bottom-Up Approach
Initial assignment for parts vs. primitives
– Simple comparison of height, normal, size
– Robust to deformation
– Low false-negatives
Refined assignment for objects vs. models
– Iteratively solve for position, deformation and segmentation
– Low false-positives
[Figure label: parts]
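A permissive first-pass test of a segmented part against a model primitive, using only height, normal, and size, might look like the sketch below. Everything specific here is an assumption made for illustration: the `Patch` type, the tolerances, comparing only the normal's tilt from the vertical (so that a rotation about the vertical axis does not reject a match), and accepting parts smaller than the primitive (since occlusion can shrink the observed part).

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Patch:
    height: float                       # height above the ground plane, metres
    normal: Tuple[float, float, float]  # surface normal, must be non-zero
    size: Tuple[float, float]           # two in-plane extents, metres


def tilt_deg(normal: Tuple[float, float, float]) -> float:
    """Angle between the normal (sign ignored) and the vertical axis."""
    length = math.sqrt(sum(c * c for c in normal))
    cosine = min(1.0, abs(normal[2]) / length)
    return math.degrees(math.acos(cosine))


def is_candidate(part: Patch, prim: Patch, height_tol: float = 0.1,
                 angle_tol_deg: float = 20.0, size_tol: float = 0.3) -> bool:
    """Loose comparison intended to keep false negatives low."""
    if abs(part.height - prim.height) > height_tol:
        return False
    if abs(tilt_deg(part.normal) - tilt_deg(prim.normal)) > angle_tol_deg:
        return False
    # The observed part may be smaller than the primitive, but not much larger.
    part_size = sorted(part.size)
    prim_size = sorted(prim.size)
    return all(p <= q * (1.0 + size_tol) for p, q in zip(part_size, prim_size))


def candidate_matches(parts: Sequence[Patch],
                      prims: Sequence[Patch]) -> Dict[int, List[int]]:
    """Map each part index to the primitive indices it could correspond to."""
    return {i: [k for k, prim in enumerate(prims) if is_candidate(part, prim)]
            for i, part in enumerate(parts)}
```

A loose filter like this only proposes candidates. The refined assignment on the slide, which iteratively solves for position, deformation, and segmentation, is what would reject the false positives it lets through.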
17
Bottom-Up Approach
Initial assignment for parts vs. primitive nodes
Refined assignment for objects vs. models
[Figure stages: Input points, Initial objects, Models matched, Refined objects]
[Figure labels: objects, parts, matched]
18
Results
Data available:
http://www0.cs.ucl.ac.uk/staff/n.mitra/research/acquire_indoor/paper_docs/data_learning.zip
http://www0.cs.ucl.ac.uk/staff/n.mitra/research/acquire_indoor/paper_docs/data_recognition.zip
19
Synthetic Scene
Recognition speed: about 200 ms per object
20
Synthetic Scene
21
Synthetic Scene
22
[Figure labels: Different pair, Similar pair]
23
[Figure labels: Different pair, Similar pair]
24
25
Office 1
trash bin, 4 chairs, 2 monitors, 2 whiteboards
26
Office 2
27
Office 3
28
Deformations
[Figure labels: drawer deformations, monitor, laptop, missed monitor, chair]
29
Auditorium 1
[Figure label: Open table]
30
Auditorium 2
[Figure labels: Open table, Open chairs]
31
Seminar Room 1
[Figure label: missed chairs]
32
Seminar Room 2
[Figure label: missed chairs]
33
Limitations
Missing data
– Occlusion, material, …
Error in initial segmentation
– Cluttered objects are merged into a single segment
– The viewpoint sometimes separates a single object into pieces
34
Conclusion
We present a system that can recognize repeating objects in cluttered 3D indoor environments.
We use a purely geometric approach based on learned attributes and deformation modes.
The recognized objects provide high-level scene understanding and can be replaced with high-quality CAD models for visualization (as shown in the previous talks!)
35
Thank You
Qualcomm Corporation
Max Planck Center for Visual Computing and Communications
NSF grants 0914833 and 1011228
A KAUST AEA grant
Marie Curie Career Integration Grant 303541
Stanford Bio-X travel subsidy