3-D Mapping With an RGB-D Camera

Presented by Xinyu Chang

3-D Mapping With an RGB-D Camera Felix Endres, Jürgen Hess, Jürgen Sturm, Daniel Cremers, Wolfram Burgard Department of Computer Science, University of Freiburg Department of Computer Science, Technische Universität München IEEE TRANSACTIONS ON ROBOTICS, VOL. 30, NO. 1, FEBRUARY 2014

Simultaneous Localization and Mapping An essential task for the autonomy of a robot. Three main areas: localization, mapping, and path planning. The robot builds a global map of the visited environment and, at the same time, uses this map to deduce its own location at any moment.

Sensors Exteroceptive sensors: sonar, range lasers, cameras, GPS. Proprioceptive sensors: encoders, accelerometers, gyroscopes. Visual SLAM refers to the problem of using images as the only source of external information.

Pipeline of Visual SLAM Visual sensor measurement, feature extraction, feature matching, visual odometry (localization), graph optimization, map representation (create map). Requirements: feature matching accuracy, robust and fast visual odometry, a globally consistent trajectory, and efficient 3-D mapping.

RGB-D RGB Image Depth Image Point Cloud Image

RGB-D coordinate system

RGB-D Coordinate Transformation Given a pixel (u,v) in the RGB image with depth value dep(u,v), the 3-D position (x,y,z) follows the pinhole model: z = dep(u,v)/s, x = (u − c_x)·z/f, y = (v − c_y)·z/f. dep(u,v): depth data; s: depth scaling factor; f: focal length; c: principal point; (u,v): pixel in the RGB image; (x,y,z): position in 3-D.
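The back-projection above can be sketched in a few lines of Python. The intrinsic values in the example (f = 525, c = (319.5, 239.5), s = 5000) are assumptions in the style of common Kinect defaults, not values given on the slide:

```python
import numpy as np

def depth_to_point(u, v, dep, f, cx, cy, s):
    """Back-project pixel (u, v) with raw depth value dep to a 3-D point.

    f: focal length in pixels; (cx, cy): principal point;
    s: depth scaling factor (e.g. 5000 for 16-bit depth PNGs).
    """
    z = dep / s                 # metric depth
    x = (u - cx) * z / f        # pinhole model, x axis
    y = (v - cy) * z / f        # pinhole model, y axis
    return np.array([x, y, z])

# Example with assumed Kinect-like intrinsics:
p = depth_to_point(u=320, v=240, dep=5000, f=525.0,
                   cx=319.5, cy=239.5, s=5000.0)
```

Applied per pixel over the whole depth image, this yields the point cloud shown on the earlier slide.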

System Architecture Overview Process the sensor data to extract geometric relationships. Construct a graph that represents these geometric relations and their uncertainties. Create a 3-D probabilistic occupancy map.

Egomotion Estimation Processes the sensor data to extract geometric relationships between the robot and landmarks at different points in time. In the case of an RGB-D camera, the input is an RGB image I_RGB and a depth image I_D. Landmarks are determined by extracting a high-dimensional descriptor vector d from I_RGB and storing it together with y, the landmark's location relative to the observation pose x.

Egomotion Estimation Landmark positions Geometric relations Robot state Keypoint descriptors

Features and Distances SIFT, SURF: Euclidean distance or Hellinger distance. ORB: Hamming distance.

Keypoints With SIFT

Match The raw feature correspondences contain mismatches.

RANSAC (RANdom SAmple Consensus) A data set with many outliers to which a line has to be fitted. The line fitted with RANSAC; outliers have no influence on the result.

RANSAC Steps Select a random subset of the original data. Call this subset the hypothetical inliers. A model is fitted to the set of hypothetical inliers. All other data are then tested against the fitted model. Those points that fit the estimated model well, according to some model-specific loss function, are considered as part of the consensus set. The estimated model is reasonably good if sufficiently many points have been classified as part of the consensus set. Afterwards, the model may be improved by reestimating it using all members of the consensus set.
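The steps above can be sketched as a minimal 2-D line-fitting RANSAC in Python. This is a toy illustration of the algorithm, not the transformation-estimation RANSAC used in the system; `n_iters` and `inlier_tol` are assumed parameters:

```python
import random
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.1):
    """Fit a line y = m*x + b to 2-D points with many outliers.

    Returns (m, b) refit by least squares on the largest consensus set.
    """
    pts = np.asarray(points, dtype=float)
    best_inliers = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        # 1. Random minimal subset: two points define a candidate line.
        i, j = random.sample(range(len(pts)), 2)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue  # skip vertical candidates in this toy version
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        # 2./3. Test all other points; small residual -> consensus set.
        resid = np.abs(pts[:, 1] - (m * pts[:, 0] + b))
        inliers = resid < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 4. Reestimate the model using all members of the consensus set.
    m, b = np.polyfit(pts[best_inliers, 0], pts[best_inliers, 1], 1)
    return m, b
```

In the system, the same scheme is applied to 3-D feature correspondences, with a rigid-body transformation in place of the line model.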

Match After RANSAC The mismatches are greatly reduced.

EMM (Environment Measurement Model) A low percentage of inliers does not necessarily indicate an unsuccessful transformation estimate; it could be a consequence of low overlap between the frames or of few visual features, e.g., due to motion blur, occlusions, or lack of texture. RANSAC on feature correspondences lacks a reliable failure detection. We developed a method to verify a transformation estimate, independent of the estimation method used. The EMM can be used to penalize pose estimates.

Different Cases of Associated Observations

Environment Measurement Model The probability for the observation y_i given an observation y_j from a second frame can be computed as (1). Since the observations are independent given the true obstacle location z, we can rewrite the right-hand side to (2) (3) (4)

Environment Measurement Model Exploiting the symmetry of Gaussians, we can write (5) The product of the two normal distributions contained in the integral can be rewritten so that we obtain (6) (7) (8)

Environment Measurement Model The first term in the integral in (6) is constant with respect to z, which allows us to move it out of the integral (9) Assume p(z) to be a uniform distribution (10)

Environment Measurement Model Expand the normalization factor (11) (12) We obtain (13) (14)

Environment Measurement Model Combining (10) and (14), we get the final result (15). Combining the aforementioned 3-D distributions of all data associations into a 3N-dimensional normal distribution and assuming independent measurements yields (16)

Different Cases of Associated Observations We use the hypothesis test on the distribution of the individual observations and compute the fraction of outliers as a criterion to reject a transformation.

Loop Closure Search Loop closures drastically reduce the accumulating error. A more efficient strategy is needed to select candidate frames for which to estimate the transformation.

The Accumulating Error [Figure: trajectory with robot poses x1–x6, a landmark, and growing per-edge errors e1–e6 over time t.] The pose estimate is in gross error (as is often the case following a transit around a long loop).

Loop Closure Search [Figure: the same trajectory with a loop-closure constraint correcting the error.] Loop closure is the act of correctly asserting that a vehicle has returned to a previously visited location. This reduces the gross error.

Combine Several Motion Estimates [Figure: trajectory with additional transformation estimates to frames other than the direct predecessor.] Combining several motion estimates, i.e., additionally estimating the transformation to frames other than the direct predecessor, substantially increases accuracy and reduces drift. However, the computational expense grows linearly with the number of estimates.

Loop Closure Search Requires a more efficient strategy to select candidate frames for which to estimate the transformation. Strategy with three different types of candidates: Apply the egomotion estimation to the n immediate predecessors. Search for loop closures in the geodesic (graph-) neighborhood of the previous frame. Remove the n immediate predecessors from the candidate set and randomly draw k frames from the tree with a bias toward earlier frames.

Large Loop Closure To find large loop closures, we randomly sample l frames from a set of designated keyframes. A frame is added to the set of keyframes, when it cannot be matched to the previous keyframe. This way, the number of frames for sampling is greatly reduced, while the field of view of the frames in between keyframes always overlaps with at least one keyframe.
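The keyframe rule above can be sketched as follows; `match_frames` is a hypothetical matching routine standing in for the system's feature matching plus RANSAC verification:

```python
def update_keyframes(keyframes, new_frame, match_frames):
    """Keyframe bookkeeping: a frame becomes a new keyframe only when it
    cannot be matched to the previous keyframe.

    match_frames(a, b) -> bool is an assumed routine that reports whether
    a valid transformation between the two frames was found.
    """
    if not keyframes or not match_frames(keyframes[-1], new_frame):
        keyframes.append(new_frame)
    return keyframes
```

Because every frame between two keyframes still matches at least one keyframe, sampling loop-closure candidates from the keyframe set alone keeps full coverage of the trajectory at a fraction of the cost.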

Comparison Between a Pose Graph Constructed Without and With Sampling of the Geodesic Neighborhood Top: n = 3 immediate predecessors, k = 6 randomly sampled keyframes. Bottom: n = 2 immediate predecessors, k = 3 randomly sampled keyframes, l = 2 frames sampled from the geodesic neighborhood. On the challenging “Robot SLAM” dataset, the average error is reduced by 26%.

Graph Optimization Transformation estimates between sensor poses form the edges of a pose graph. On their own, these pairwise estimates do not form a globally consistent trajectory. We use the general graph optimization (g2o) framework. Errors in motion estimation are handled by pruning edges after optimization.

An Example of Graph

Graph Optimization—g2o (general graph optimization) Using the g2o framework, we minimize an error function of the form F(x) = Σ_⟨i,j⟩∈C e(x_i, x_j, z_ij)ᵀ Ω_ij e(x_i, x_j, z_ij) to find the optimal trajectory x* = argmin_x F(x). Here, x = (x_1ᵀ, …, x_nᵀ)ᵀ is a vector of sensor poses; z_ij and Ω_ij represent the mean and the information matrix of a constraint relating the poses x_i and x_j; and e(x_i, x_j, z_ij) is a vector error function that measures how well the poses x_i and x_j satisfy the constraint z_ij.

Objective Function by A Graph

Rewrite F(x) Linearizing the error function around the current guess x̆, e(x̆_i + Δx_i, x̆_j + Δx_j) ≈ e_ij + J_ij Δx, we obtain the quadratic approximation F(x̆ + Δx) ≈ c + 2 bᵀ Δx + Δxᵀ H Δx, with b = Σ J_ijᵀ Ω_ij e_ij and H = Σ J_ijᵀ Ω_ij J_ij. Update step: solve the linear system H Δx* = −b and set x* = x̆ + Δx*. Apply nonlinear operator: for poses with rotational components, the increment is applied on the manifold, x* = x̆ ⊞ Δx*.
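A toy 1-D pose graph may make the update step concrete. This is a sketch under simplifying assumptions (scalar poses, unit information matrices), not the paper's implementation; one Gauss-Newton step distributes the loop-closure discrepancy over the trajectory:

```python
import numpy as np

# Toy 1-D pose graph: three poses, relative-motion constraints z_ij,
# error e(x_i, x_j, z_ij) = (x_j - x_i) - z_ij, unit information matrices.
constraints = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.3)]  # (i, j, z_ij)
x = np.array([0.0, 1.0, 2.0])  # initial guess from odometry

# One Gauss-Newton step: accumulate H and b, then solve H dx = -b.
H = np.zeros((3, 3))
b = np.zeros(3)
for i, j, z in constraints:
    e = (x[j] - x[i]) - z
    J = np.zeros(3)
    J[i], J[j] = -1.0, 1.0          # Jacobian of e w.r.t. the poses
    H += np.outer(J, J)             # J^T * Omega * J with Omega = 1
    b += J * e                      # J^T * Omega * e
H[0, 0] += 1e6                      # anchor the first pose (gauge freedom)
dx = np.linalg.solve(H, -b)
x = x + dx  # for poses on a manifold this would be the ⊞ operator
```

Here the loop-closure edge (0, 2, 2.3) disagrees with the chained odometry by 0.3; after the step the residual is spread evenly over the three edges instead of landing entirely on one.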

g2o (general graph optimization) Framework

Edge Pruning In some cases, graph optimization may also distort the trajectory. Increased robustness can be achieved by detecting transformations that are inconsistent with other estimates. We do this by pruning edges after optimization based on the Mahalanobis distance obtained from g2o.

Map Representation Project the original point measurements into a common coordinate frame.

Map Representation Point cloud Drawback: highly redundant and requires vast computational and memory resources.

Map Representation OctoMap: octree-based mapping framework Explicit representation of free space and unmapped areas, which is essential for collision avoidance and exploration tasks

Octree Octree: a hierarchical data structure for spatial subdivision in 3D Using Boolean occupancy states or discrete labels allows for compact representations of the octree: If all children of a node have the same state (occupied or free) they can be pruned.
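The pruning rule can be sketched as follows; this is a minimal illustration of the idea, not OctoMap's actual implementation:

```python
class OctreeNode:
    """Minimal occupancy-octree node; a leaf stores a Boolean state."""
    def __init__(self, state=None, children=None):
        self.state = state        # True = occupied, False = free, None = inner
        self.children = children  # list of 8 children, or None for a leaf

def prune(node):
    """Collapse children into the parent when all 8 share one leaf state."""
    if node.children is None:
        return node
    node.children = [prune(c) for c in node.children]
    states = {c.state for c in node.children}
    if all(c.children is None for c in node.children) and len(states) == 1:
        return OctreeNode(state=states.pop())  # merge into a single leaf
    return node
```

Large free or occupied volumes thus cost a single node regardless of the map resolution, which is what makes the explicit free-space representation affordable.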

Memory-Efficient Node Implementation Left: The first nodes of the octree example in memory connected by pointers. Data is stored as one float denoting occupancy. Right: The complete tree as compact serialized bitstream.

Experiments RGB-D benchmark: Several sequences captured with two Microsoft Kinect sensors and one Asus Xtion Pro Live sensor Synchronized ground truth data for the sensor trajectory Hardware: An Intel Core i7 CPU at 3.40 GHz An NVIDIA GeForce GTX 570 graphics card

Trajectory Error Metric The root-mean-square of the absolute trajectory error, ATE_RMSE = ((1/N) Σ_{i=1}^{N} ‖x̂_i − x_i‖²)^{1/2}, where x̂_i is the trajectory estimate and x_i the ground truth at time step i.
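A minimal sketch of the metric, assuming the two trajectories are already time-associated and expressed in a common frame (the benchmark tooling additionally performs that alignment):

```python
import numpy as np

def ate_rmse(est, gt):
    """Root-mean-square absolute trajectory error between estimated and
    ground-truth translations, given as N x 3 arrays of positions."""
    est, gt = np.asarray(est, float), np.asarray(gt, float)
    # Per-pose squared Euclidean error, averaged over the trajectory.
    return np.sqrt(np.mean(np.sum((est - gt) ** 2, axis=1)))
```

Because the errors are squared before averaging, a single grossly wrong pose (e.g. after a missed loop closure) dominates the score, which is why the metric is a good indicator of global consistency.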

Trajectory Estimate A 2-D projection of the ground truth trajectory for the “fr2/pioneer_slam” sequence and a corresponding estimate of our approach

Detailed Results Obtained With the Presented System

Visual Features Comparison The keypoint detectors and descriptors offer different tradeoffs between accuracy and processing times.

Detailed Results Per Sequence of the “fr1” dataset Using SIFT Features

The Number of Features Extracted Per Frame SIFT Increasing the number of features up to about 600 to 700 improves the accuracy. Using more features had no noticeable impact on accuracy.

With Hellinger Distance Instead of Euclidean Distance SIFT and SURF Improvement of up to 25.8% for some datasets. However, for most sequences in the used dataset, the improvement was not significant. The Hellinger distance noticeably increases neither the runtime nor the memory requirements, so we suggest its adoption.
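For nonnegative descriptor vectors such as SIFT histograms, the Hellinger distance can be computed by L1-normalizing, taking element-wise square roots, and then reusing an ordinary Euclidean matcher. A minimal sketch (the 1e-12 epsilon guarding against all-zero descriptors is an assumption):

```python
import numpy as np

def hellinger_distance(d1, d2):
    """Hellinger distance between two nonnegative descriptor vectors,
    computed as (1/sqrt(2)) * ||sqrt(p) - sqrt(q)|| on the L1-normalized
    vectors, so existing Euclidean matching code can be reused unchanged."""
    def root(d):
        d = np.asarray(d, float)
        d = d / (d.sum() + 1e-12)   # L1-normalize to a distribution
        return np.sqrt(d)
    return np.linalg.norm(root(d1) - root(d2)) / np.sqrt(2)
```

This is why the switch costs essentially nothing at runtime: descriptors can be transformed once after extraction, and all subsequent distance computations stay Euclidean.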

Evaluation of Proposed EMM Quality measure q = I / (I + O), where I is the number of inliers and O the number of outliers. The use of the EMM decreases the average error for thresholds on the quality measure between 0.25 and 0.9.

Evaluation of Graph Optimization

Summary 3-D SLAM system for RGB-D sensors Extract visual keypoints from the color images Localize visual keypoints using the depth images Estimate transformations: RANSAC Optimize the pose graph: nonlinear optimization Create the map: OctoMap EMM: improves the reliability of the transformation estimates

Thank You Q&A