1 Long-term image-based motion estimation Dennis Strelow
2 Problems (1) Micro air vehicle (MAV) navigation: AeroVironment Black Widow, AeroVironment Microbat
3 Problems (2) Mars rover navigation: Mars Exploration Rovers (MER), Hyperion
4 Problems (3) Robotic search and rescue: Rhex; Center for Robot-Assisted Search and Rescue, U. of South Florida
5 Problems (4) NASA ISS personal satellite assistant
6 Problems (5) Each of these problems requires: six degree of freedom motion; in unknown environments; without GPS or other absolute positioning; over the long term. …and some of the problems require: small, light, and cheap sensors
7 Existing image-based approaches (1) Monocular image-based motion estimation is a good candidate given these requirements. In particular, simultaneous estimation of multiframe motion and sparse scene structure is the most promising approach
8 Existing image-based approaches (2) Algorithms exist to estimate camera motion and sparse scene structure: SVD-based factorization (1992); bundle adjustment (1950’s); Kalman filtering (1990); variable state dimension filter (~1994)
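As a rough illustration of what these algorithms estimate, the sketch below evaluates the reprojection error that bundle adjustment minimizes over camera poses and 3D points. It assumes an ideal pinhole camera with a single focal length and no distortion; the names `project` and `reprojection_error` and the toy cameras and points are hypothetical and not taken from the talk.

```python
def project(point, rotation, translation, focal):
    """Project a 3D point into an ideal pinhole camera (assumed model)."""
    # Camera-frame coordinates: X_c = R * X + t
    xc = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
          for i in range(3)]
    return (focal * xc[0] / xc[2], focal * xc[1] / xc[2])


def reprojection_error(cameras, points, observations, focal=1.0):
    """Sum of squared image residuals over all (camera, point) observations."""
    total = 0.0
    for (cam_idx, pt_idx), (u, v) in observations.items():
        rotation, translation = cameras[cam_idx]
        pu, pv = project(points[pt_idx], rotation, translation, focal)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


# Hypothetical toy problem: two cameras related by a sideways translation.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(IDENTITY, [0.0, 0.0, 0.0]), (IDENTITY, [-1.0, 0.0, 0.0])]
points = [[0.0, 0.0, 4.0], [1.0, 1.0, 2.0]]

# Noise-free observations generated from the true cameras and points.
observations = {(c, p): project(points[p], *cameras[c], 1.0)
                for c in range(2) for p in range(2)}

perturbed_points = [[0.0, 0.0, 4.0], [1.0, 1.0, 2.5]]
error_at_truth = reprojection_error(cameras, points, observations)
error_perturbed = reprojection_error(cameras, perturbed_points, observations)
```

A batch method would adjust all cameras and points together to drive this error down; the error is zero at the true values and grows when any point or camera is perturbed.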
9 Existing image-based approaches (3) Hotel sequence here
10 Existing image-based approaches (4) These often generate good results:
11 Sensitivity of motion estimation (4) But the resulting estimates can be very sensitive to: incorrect or insufficient image feature tracking; camera modeling and calibration errors; poor prior assumptions on the motion; poor approximations in error modeling; outlier detection thresholds
12 Sensitivity of motion estimation (5) REL sequence here
13 Sensitivity of motion estimation (6)
14 Long-term motion estimation (1) For applications like micro air vehicles… …the situation is really desperate
15 Long-term motion estimation (2) Each tracked point is only visible in a small percentage of the image sequence (Example video here of going over the wall)
16 Long-term motion estimation (3) So, we’re no longer estimating our motion with respect to a single point… …the motion estimation essentially becomes integration, just as in odometry
17 Long-term motion estimation (4) But harder than odometry: odometry measurements are direct measurements of the incremental motion, whereas, as we’ve seen, sparse image measurements can produce very poor estimates of the incremental motion
19 Long-term motion estimation (5) And since we’re essentially integrating incremental motions: one gross error in the estimated motion finishes you; one mild qualitative error may quickly compound into a gross error, finishing you. Even with no gross or “mild qualitative” errors, the integrated motion will always drift
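The compounding effect can be seen in a small planar sketch: incremental motions are chained into a trajectory, and a tiny heading bias in every increment produces a position error that keeps growing. This is a simplified 2D stand-in for the 6 DOF case; the step length, the bias value, and the function names are assumptions chosen only for illustration.

```python
import math


def compose(pose, increment):
    """Apply a body-frame increment (dx, dy, dtheta) to a pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = increment
    return (x + math.cos(theta) * dx - math.sin(theta) * dy,
            y + math.sin(theta) * dx + math.cos(theta) * dy,
            theta + dtheta)


def integrate(increments, start=(0.0, 0.0, 0.0)):
    """Chain incremental motions into a list of poses, as in odometry."""
    poses = [start]
    for inc in increments:
        poses.append(compose(poses[-1], inc))
    return poses


STEPS = 100
HEADING_BIAS = 0.001  # radians per step; hypothetical small systematic error

true_path = integrate([(0.1, 0.0, 0.0)] * STEPS)               # straight line
estimated_path = integrate([(0.1, 0.0, HEADING_BIAS)] * STEPS)  # biased steps


def position_error(k):
    """Distance between estimated and true position after k increments."""
    return math.hypot(estimated_path[k][0] - true_path[k][0],
                      estimated_path[k][1] - true_path[k][1])
```

Comparing `position_error(10)`, `position_error(50)`, and `position_error(100)` shows the error increasing with path length even though no single increment is grossly wrong.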
20 Approach (1) Two areas: improved 6 DOF estimates; improved tracking
21 Approach (2) most of our work has been on improved 6 DOF estimates
22 Improved 6 DOF estimates (1) Batch estimation: uses all of the observations at once; all observations must be available before computation begins. Online estimation: observations are incorporated as they arrive; many reasons why the filter’s prior distribution might be inaccurate
23 Improved 6 DOF estimates (2) Image measurements only: often the only option. Image and inertial measurements: can disambiguate image-only estimates; more calibration, unknowns
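The batch/online distinction can be sketched on a deliberately simple problem: estimating one constant scalar from noisy measurements. The batch estimate uses all measurements at once, while the online estimate folds them in one at a time starting from a Gaussian prior. This is a toy linear example, not the nonlinear 6 DOF filter discussed in the talk; the measurement values, variances, and function names are assumptions.

```python
def batch_estimate(measurements):
    """Least-squares estimate of a constant using all measurements at once."""
    return sum(measurements) / len(measurements)


def online_estimate(measurements, prior_mean, prior_var, meas_var):
    """Scalar Kalman-style update, incorporating measurements as they arrive."""
    mean, var = prior_mean, prior_var
    for z in measurements:
        gain = var / (var + meas_var)   # weight given to the new measurement
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
    return mean, var


measurements = [2.1, 1.9, 2.0, 2.2, 1.8]  # hypothetical noisy readings

batch = batch_estimate(measurements)

# Diffuse prior: the online result matches the batch result closely.
diffuse_mean, _ = online_estimate(measurements, prior_mean=0.0,
                                  prior_var=1e6, meas_var=0.04)

# Overconfident, wrong prior: the estimate stays stuck near the prior.
overconfident_mean, _ = online_estimate(measurements, prior_mean=0.0,
                                        prior_var=1e-4, meas_var=0.04)
```

With the diffuse prior the two approaches agree; with the inaccurate, overconfident prior the online estimate remains far from the measurements, which is one simple way a poor prior can hurt a filter.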
24 Improved 6 DOF estimates (3) Conventional images: most common case. Omnidirectional images: requires a more complex projection model, additional calibration; generally better motion
25 Improved tracking (1) For long-term image-based motion estimation, high-quality feature tracking is critical: A. Robustness to harsh overall motion; B. Robustness to poor image texture; C. Uniform image coverage; D. Squeeze every feature for all it’s worth
26 Improved tracking (2) A. Robustness to harsh image motion: large overall image motions; highly discontinuous 2D motion fields from nonplanar scenes
27 Improved tracking (3) B. Robustness to poor image texture: low texture; repetitive texture; one-dimensional texture
28 Improved tracking (4) C. Uniform image coverage: features should span the entire image; features that have become clumped together are redundant
29 Improved tracking (5) D. Squeeze every feature for all it’s worth: features should be tracked despite changes in appearance due to distance, relative angle, and specularities, and even if they are near the image boundary
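One common way to get the uniform image coverage described in item C is to bucket candidate features into a grid and keep only the strongest few per cell. The sketch below shows that idea; it is an assumed, generic strategy rather than the tracker's actual selection rule, and the grid size, scores, and function name are hypothetical.

```python
def select_for_coverage(features, width, height, grid=4, per_cell=1):
    """Keep the `per_cell` highest-scoring features in each grid cell.

    `features` is a list of (x, y, score) tuples in pixel coordinates.
    """
    cells = {}
    for x, y, score in features:
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cells.setdefault((row, col), []).append((x, y, score))

    selected = []
    for members in cells.values():
        members.sort(key=lambda f: f[2], reverse=True)  # strongest first
        selected.extend(members[:per_cell])
    return selected


# Three clumped features in one corner and one isolated feature elsewhere.
candidates = [(10, 10, 0.9), (12, 11, 0.8), (11, 13, 0.7), (300, 200, 0.5)]
kept = select_for_coverage(candidates, width=320, height=240)
```

Here the three clumped features collapse to the single strongest one, while the isolated feature survives even though its score is lower.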
30 Improved tracking (6) smalls: correlation image feature tracker (Picture of l.s. here)
31 Improved tracking (7) Leonard Smalls vs. smalls tracker: lone biker of the apocalypse / correlation tracker; especially hard on the little things / safe for use with the little things; mama didn’t love him / not applicable; all the powers of hell at his command / maybe in 2.0
32 Improved tracking (8) Eliminates the heuristics normally used for… handling large motions; determining when a point has been mistracked, become occluded, or left the image; extracting features
33 Improved tracking (9) …and instead: constrains tracking to epipolar lines; uses only 3D geometric consistency for determining when a point has been mistracked, become occluded, or left the image; chooses features based on image coverage
34 Improved tracking (10) smalls uses… SIFT keypoint extraction and matching; RANSAC two-frame SFM …to determine… consistent matches between the images
35 Improved tracking (11) …and thence: epipolar geometry; center of search range along epipolar lines
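To make the epipolar constraint concrete, the sketch below computes the epipolar line in the second image for a point in the first image and measures how far a candidate match lies from that line. It assumes the two-frame geometry is already available as a fundamental matrix `F`; the example `F` (pure sideways translation with identity calibration), the pixel threshold, and the function names are illustrative assumptions, not the tracker's actual code.

```python
import math


def epipolar_line(F, point):
    """Line (a, b, c) in image 2, with a*u + b*v + c = 0, for a point in image 1."""
    u, v = point
    x = (u, v, 1.0)
    return tuple(sum(F[i][j] * x[j] for j in range(3)) for i in range(3))


def distance_to_line(line, point):
    """Perpendicular distance from an image point to a line (a, b, c)."""
    a, b, c = line
    u, v = point
    return abs(a * u + b * v + c) / math.hypot(a, b)


def is_consistent(F, p1, p2, threshold=1.0):
    """True if p2 lies within `threshold` pixels of the epipolar line of p1."""
    return distance_to_line(epipolar_line(F, p1), p2) <= threshold


# Hypothetical F for a camera translating along x: epipolar lines are
# horizontal, so a point keeps its row between the two images.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

line = epipolar_line(F, (100.0, 50.0))
near = distance_to_line(line, (140.0, 50.5))   # half a pixel off the line
far = distance_to_line(line, (140.0, 60.0))    # ten pixels off the line
```

A tracker constrained this way only needs to search along the line, and a candidate that lands far from it can be rejected on geometric grounds alone.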
36 Hyperion (1)
37 Hyperion (2)
38 Hyperion (3)
39 Hyperion (4)
40 Hyperion (4)
41 Hyperion (5)
42 Hyperion (6)
43 Experiments (7): CMU crane. Crane capable of translating a platform… …through x, y, z… …through a workspace of about 10 x 10 x 5 m
44 Experiments (8): CMU crane, cont. (x, y) translation ground truth (plot: x translation (meters) vs. y translation (meters))
45 Experiments (9): CMU crane, cont. z translation ground truth (plot: z (m) vs. time). No change in rotation
46 Experiments (10): CMU crane, cont.
47 Experiments (11): CMU crane, cont. Hard sequence: each image contains an average of 56.0 points; each point appears in an average of 62.3 images (4.4% of sequence). Image-and-inertial online algorithm applied; 40 images used in batch initialization
48 Experiments (12): CMU crane, cont.
49 Experiments (13): CMU crane, cont. Estimated z camera translations
50 Experiments (14): CMU crane, cont. 6 DOF errors, after scaled rigid alignment: rotation: 0.14 radians average; translation: 31.5 cm average (0.9% of distance traveled); global scale error: -3.4%
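Scaled rigid alignment means fitting the best scale, rotation, and translation between the estimated and ground-truth trajectories before measuring error, since image-based estimates are only defined up to such a transform. The sketch below shows a 2D analog using complex numbers, where the least-squares fit has a simple closed form. The experiment itself is 3D, so this is only an illustration of the idea; the function names and the toy trajectories are assumptions.

```python
def align_similarity_2d(estimated, truth):
    """Least-squares a, b (complex) such that a * z + b maps estimated to truth.

    abs(a) is the scale, the phase of a is the rotation, b is the translation.
    """
    z = [complex(x, y) for x, y in estimated]
    w = [complex(x, y) for x, y in truth]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def mean_error_after_alignment(estimated, truth):
    """Average residual distance once the best similarity has been applied."""
    a, b = align_similarity_2d(estimated, truth)
    residuals = [abs(a * complex(*e) + b - complex(*t))
                 for e, t in zip(estimated, truth)]
    return sum(residuals) / len(residuals), abs(a)


# Toy data: the estimate is the truth at half scale, rotated 90 degrees,
# and shifted, so a perfect alignment exists with scale factor 2.
truth = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
estimated = [(2.0, 3.0), (2.0, 3.5), (1.5, 3.5), (1.5, 3.0)]

error, scale = mean_error_after_alignment(estimated, truth)
```

In this toy case the residual is essentially zero and the recovered scale is 2; with real estimates, the residual left after alignment is what gets reported as trajectory error.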
51 Future work (1) Closing the loop to deal with drift: (1) recognizing revisited features; (2) exploiting revisited features in the estimation