Published by Amanda Barrows. Modified over 9 years ago.
1
3D Object Recognition Pipeline
Kurt Konolige, Radu Rusu, Victor Eruhimov, Suat Gedikli (Willow Garage)
Stefan Holzer, Stefan Hinterstoisser (TUM)
Morgan Quigley, Stephen Gould (Stanford)
Marius Muja (UBC)
2
3D and Object Recognition
● Provides more info than just visual texture
● Good for scale and segmentation
● Verification
● Need a good device for 3D info
3
3D Cameras

Technology        | Examples                    | Pro/Con
------------------|-----------------------------|--------------------------------------------------------------
Stereo            | Newcombe, Davison CVPR 2010 | Not dense, smearing; real-time, good resolution; registration + regularization
Stereo + texture  | WG device                   | Dense, real-time, good resolution; short range
Laser line scan   | STAIR Borg scanner          | Dense, most accurate; short range, not real time
Structured light  | PrimeSense                  | Dense, real-time, good resolution; short range, ambient light/scene texture
Phase shift       | SR4, PMD, Canesta           | Dense, real-time, medium range; low resolution, low accuracy, gross errors
Gated reflectance | 3DV                         | Dense, real-time; low resolution, low accuracy

Tabletop manipulation:
● Short range
● High resolution
● High range accuracy
● Real-time
4
WG Projected Texture Stereo Device
● Paint the scene with texture from a projector, vs. a single camera with structured light
● Advantages:
- Simple projector
- Standard algorithms
- Full frame rates (640x480)
- Dynamic scenes
5
WG projected texture device
● Projector
- Red LED
- Eye safe
- Synchronized to cameras
● 3D fly-through
6
Object Recognition Pipeline
● Textured objects via keypoints [Victor Eruhimov, Suat Gedikli]
● Untextured objects via DOT [Stefan Holzer, Stefan Hinterstoisser]
● Simple 3D model matching [Marius Muja]
● STAIR 2D/3D features [Stephen Gould]
Stages: Pre-filter, Detect, Verify
7
MOPED – Textured object recognition with pose
● Model: stereo view of an object at a known pose
● Extract keypoints and features
● For a new scene, match keypoints to each model
● Run an SfM geometric check to verify and recover pose
[Torres, Romea, Srinivasa, ICRA 2010]
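The matching step above can be illustrated with a small sketch. It is not MOPED's implementation: the descriptors are toy 2D tuples, the helper `match_keypoints` and the `max_dist` cutoff are hypothetical, and the SfM geometric check that verifies the match and recovers pose is left out entirely.

```python
import math

def match_keypoints(scene_desc, model_desc, max_dist):
    """Match each scene descriptor to its nearest model descriptor.

    Returns (scene_index, model_index) pairs whose Euclidean distance
    is at most max_dist (an assumed cutoff, not taken from the slides).
    """
    matches = []
    for i, d in enumerate(scene_desc):
        # Nearest model descriptor by Euclidean distance.
        j = min(range(len(model_desc)), key=lambda k: math.dist(d, model_desc[k]))
        if math.dist(d, model_desc[j]) <= max_dist:
            matches.append((i, j))
    return matches

# Toy descriptors for two hypothetical models and one scene.
models = {
    "mug": [(0.0, 0.0), (1.0, 1.0)],
    "box": [(5.0, 5.0), (6.0, 6.0)],
}
scene = [(0.1, 0.0), (1.0, 0.9), (9.0, 9.0)]

# Match the scene against every model and keep the one with most matches.
matches = {name: match_keypoints(scene, desc, max_dist=0.5)
           for name, desc in models.items()}
best = max(matches, key=lambda name: len(matches[name]))
```

In a full pipeline the candidate model chosen here would only be a hypothesis; the geometric check is what confirms it.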
8
- Need texture
- Need high-res camera
9
Dominant Orientation Templates (DOT)
Stefan Hinterstoisser, Stefan Holzer (TUM; CVPR 2010, ECCV 2010)
● DOT is a template-matching-based approach
[Figure: template and current scene]
- The template is slid over the image to compute the response for each image position
- If the response is above a threshold, it is considered a detection of the template
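The slide-and-threshold idea can be sketched in a few lines. The similarity measure below (fraction of cells that agree exactly) is a stand-in chosen for readability, not DOT's actual response function, and the function names are hypothetical.

```python
def match_score(window, template):
    """Fraction of cells where window and template agree (toy similarity)."""
    total = 0
    agree = 0
    for wrow, trow in zip(window, template):
        for w, t in zip(wrow, trow):
            total += 1
            agree += (w == t)
    return agree / total

def slide_template(image, template, threshold):
    """Slide the template over the image; report positions above threshold."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    detections = []
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            window = [row[x:x + tw] for row in image[y:y + th]]
            score = match_score(window, template)
            if score > threshold:
                detections.append((x, y, score))
    return detections

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]
detections = slide_template(image, template, threshold=0.9)
```

Here the only window that clears the threshold is the one at column 1, row 1, where the template matches exactly.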
10
DOT – Basic Principle
● DOT uses gradients instead of color or gray values
[Figure: template and current scene]
- Gradients are less sensitive to illumination changes
- Gradients have orientation and magnitude
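A small sketch can make the illumination point concrete. It assumes illumination change is modeled as a gain and offset on pixel values, and it uses plain central differences; neither choice comes from the slides. Under that model the gradient orientation is unchanged while the magnitude scales with the gain.

```python
import math

def gradient(image, x, y):
    """Central-difference gradient at (x, y): returns (magnitude, orientation in degrees)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    magnitude = math.hypot(gx, gy)
    orientation = math.degrees(math.atan2(gy, gx))
    return magnitude, orientation

# A diagonal intensity ramp: value = 10*x + 10*y.
image = [
    [0, 10, 20],
    [10, 20, 30],
    [20, 30, 40],
]
# Same scene under an assumed illumination change: gain 2, offset 5.
brighter = [[2 * v + 5 for v in row] for row in image]

mag_a, ori_a = gradient(image, 1, 1)
mag_b, ori_b = gradient(brighter, 1, 1)
```

The orientation is the same in both images, which is why matching on orientation rather than raw gray values tolerates this kind of brightness change.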
11
Offline Learning
● Good learning is necessary to reduce the false-positive rate
● We try to use all available information to segment the object:
- The point cloud from narrow stereo is used to detect the table and segment the point cloud of the object
- The object point cloud is used to create an initial mask
- The mask is refined using GrabCut (see OpenCV)
12
False-Positive Rejection
● Two more precise templates for validation:
- A more precise, non-discretized gradient template
- A disparity template to compare expected with real disparities
13
False-Positive Rejection
● Compute the error between the reference point cloud and the point cloud at the detected position
- Optimize the initial 3D point cloud pose given by the detection
- Directly gives the object pose if the model is associated with learned point clouds
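One simple way to score the agreement between two point clouds is the mean nearest-neighbor distance. The sketch below assumes that metric and a hypothetical acceptance threshold `max_error`; the slides do not say which error measure is used, and the pose optimization step is not shown.

```python
import math

def mean_nearest_distance(reference, observed):
    """Mean distance from each reference point to its nearest observed point."""
    total = 0.0
    for p in reference:
        total += min(math.dist(p, q) for q in observed)
    return total / len(reference)

def accept_detection(reference, observed, max_error):
    """Keep a detection only if the clouds agree within max_error (assumed threshold)."""
    return mean_nearest_distance(reference, observed) <= max_error

reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Observed cloud close to the reference (1 cm offset in z).
observed_good = [(x, y, z + 0.01) for x, y, z in reference]
# Observed cloud far from the reference (50 cm offset in z).
observed_bad = [(x, y, z + 0.5) for x, y, z in reference]

good_ok = accept_detection(reference, observed_good, max_error=0.05)
bad_ok = accept_detection(reference, observed_bad, max_error=0.05)
```

The brute-force nearest-neighbor search is quadratic in the number of points, which is fine for a toy example but would normally be replaced by a spatial index.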
19
STAIR Vision Library (SVL)
Stanford STAIR project [Andrew Ng, Stephen Gould]
● Initially developed to support the Stanford AI Robot (STAIR) project
● Builds on top of the OpenCV computer vision library and the Eigen matrix library
● Provides a range of software infrastructure for:
- computer vision
- machine learning
- probabilistic graphical models
● Hosted on SourceForge
20
Object Detection in SVL
● Sliding-window object detector
● Features are extracted from a local window
● A learned boosted decision-tree classifier scores each window
● The image is scanned at multiple resolutions to detect objects at different scales
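The multi-resolution scan can be sketched as follows. This is not SVL code: the nearest-neighbor downsampling, the integer scale factors, and the `mean_brightness` scorer (a stand-in for the learned boosted decision-tree classifier) are all assumptions made for illustration.

```python
def downsample(image, factor):
    """Nearest-neighbour downsampling by an integer factor."""
    return [row[::factor] for row in image[::factor]]

def mean_brightness(patch):
    """Stand-in for the learned classifier: mean pixel value of the window."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)

def scan(image, window, score_fn, threshold, factors=(1, 2)):
    """Score a fixed-size window at every position of every resolution."""
    detections = []
    for f in factors:
        level = downsample(image, f)
        h, w = len(level), len(level[0])
        for y in range(h - window + 1):
            for x in range(w - window + 1):
                patch = [row[x:x + window] for row in level[y:y + window]]
                s = score_fn(patch)
                if s > threshold:
                    # Map the box back to original-image coordinates.
                    detections.append((x * f, y * f, window * f, s))
    return detections

# 8x8 image containing a 4x4 bright square at columns and rows 2..5.
image = [[1 if 2 <= x <= 5 and 2 <= y <= 5 else 0 for x in range(8)]
         for y in range(8)]
detections = scan(image, window=2, score_fn=mean_brightness, threshold=0.9)
```

At full resolution the 2x2 window only covers parts of the square; at half resolution the same window covers the whole square, which is reported as a size-4 box at (2, 2). That is the sense in which scanning several resolutions finds objects at different scales.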
21
Image Channels
● Image decomposed into multiple channels
● Depth at each pixel, obtained from a laser scanner, can be thought of as an additional channel
[Figure panels: intensity image, edge map, depth map]
[Quigley et al., ICRA 2009]
22
Object Detection Features
● Learn a “patch” dictionary over intensity, edge and depth channels
● Patches encode localized templates for matching
● Depth patches capture shape; intensity and edge patches capture appearance
● Patch responses (over the entire dictionary) are combined to form the feature vector
[Quigley et al., ICRA 2009]
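A rough sketch of how patch responses might be assembled into a feature vector is given below. The response measure (best negative sum of squared differences over the window) and the tiny hand-written dictionary are assumptions for illustration; the slides do not specify the response function, and in the described system the dictionary is learned.

```python
def patch_response(channel, patch):
    """Best match of `patch` anywhere in `channel`, as negative SSD (0 is a perfect match)."""
    ph, pw = len(patch), len(patch[0])
    best = float("-inf")
    for y in range(len(channel) - ph + 1):
        for x in range(len(channel[0]) - pw + 1):
            ssd = sum(
                (channel[y + j][x + i] - patch[j][i]) ** 2
                for j in range(ph) for i in range(pw)
            )
            best = max(best, -ssd)
    return best

def feature_vector(channels, dictionary):
    """One response per dictionary entry, each computed on that entry's channel."""
    return [patch_response(channels[name], patch) for name, patch in dictionary]

# Toy 3x3 window with two channels.
channels = {
    "intensity": [[0, 0, 0], [0, 5, 5], [0, 5, 5]],
    "depth":     [[2, 2, 2], [2, 1, 1], [2, 1, 1]],
}
# Hypothetical dictionary of (channel name, patch) entries.
dictionary = [
    ("intensity", [[5, 5], [5, 5]]),
    ("depth",     [[1, 1], [1, 1]]),
    ("depth",     [[2, 2], [2, 2]]),
]
features = feature_vector(channels, dictionary)
```

The resulting vector has one entry per dictionary patch and is what a window classifier would consume.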
23
Results
● 150 images of cluttered indoor scenes
● 5-fold cross-validation
● Depth information provides significant improvement in area under the precision-recall curve
[Figure: three result panels labeled 8% improvement, 3% improvement, 38% improvement]
[Quigley et al., ICRA 2009]
24
Conclusions
● Real-time, accurate 3D devices are becoming available
● 3D can help in object detection for untextured objects
- A combination of visual and 3D features works best
● 3D is useful for verification
● Check out the PR2 Grasping Demo!