Real-Time Tracking Axel Pinz Image Based Measurement Group EMT – Institute of Electrical Measurement and Measurement Signal Processing TU Graz – Graz University of Technology
Defining the Terms
Real-Time
– Task dependent, “in-the-loop”
– Navigation: “on-time”
– Video rate: 30 Hz
– High-speed tracking: several kHz
Tracking
– DoF: Degrees of Freedom
– 2D: images, videos → 2/3 DoF
– 3D: scenes, object pose → 6 DoF

Example: High-speed, 2D

Applications
Surveillance
Augmented reality
Surgical navigation
Motion capture (MoCap)
Autonomous navigation
Telecommunication
Many industrial applications

Example: Augmented Reality [ARToolkit, Billinghurst, Kato, Demo at ISAR2000, Munich]

Agenda: Structure of the SSIP Lecture
Intro, terminology, applications
2D motion analysis
Geometry
3D motion analysis
Practical considerations
Existing systems
Summary, conclusions

2D Motion Analysis
Change detection
– Can be anything (not necessarily motion)
Optical flow computation
– What is moving in which direction?
– Hard in real time
Data reduction required!
– Interest operators
– Points, lines, regions, contours
Modeling required
– Motion models, object models
– Probabilistic modeling, prediction

Change Detection [Pinz, Bildverstehen, 1994]
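Change detection in its simplest form is thresholded frame differencing; a minimal numpy sketch (the threshold of 25 gray levels is an illustrative assumption, not a value from the lecture):

```python
import numpy as np

def change_mask(frame_prev, frame_cur, thresh=25):
    """Binary change mask: pixels whose absolute gray-value difference
    between consecutive frames exceeds a threshold."""
    diff = np.abs(frame_cur.astype(int) - frame_prev.astype(int))
    return diff > thresh
```

As the previous slide warns, anything that changes gray values (illumination, noise) triggers this mask, not only motion.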

Optical Flow (1) [Brox, Bruhn, Papenberg, Weickert] ECCV04 best paper award
Estimating the displacement field
Assumptions:
– Gray value constancy
– Gradient constancy
– Smoothness …
Error minimization
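The gray-value constancy assumption linearizes to the classical constraint I_x·u + I_y·v + I_t = 0. A minimal least-squares sketch for a single global displacement on a synthetic image (a Lucas–Kanade-style solver, not the variational method of Brox et al. cited above):

```python
import numpy as np

def global_flow(I1, I2):
    """Estimate one global displacement (u, v) from the optical-flow
    constraint Ix*u + Iy*v + It = 0, solved by least squares over all pixels."""
    Iy, Ix = np.gradient(I1.astype(float))   # np.gradient: axis 0 is y
    It = I2.astype(float) - I1.astype(float)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    flow, *_ = np.linalg.lstsq(A, -It.ravel(), rcond=None)
    return flow  # (u, v)
```

Real images violate the single-motion assumption, which is why the dense methods above add smoothness terms and solve per pixel.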

Optical Flow (2) [Brox, Bruhn, Papenberg, Weickert] ECCV04 best paper award !! Not in real-time !!

Interest Operators
Reduce the amount of data
Track only salient features
Support region – ROI (region of interest)
Feature in ROI: edge / line, blob, corner, contour

2D Point Tracking [Univ. Erlangen, VAMPIRE, EU-IST ]
Corner detection → initialization
– Calculate “cornerness” c
– Threshold → sensitivity, # of corners
– E.g.: “Harris” / “Plessey” corners in ROI
Cross-correlation in ROI
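A minimal sketch of the “cornerness” computation as a Harris response with a 3×3 box window; the constant k = 0.04 is the customary choice, and the exact window and smoothing used in the lecture's system are assumptions here:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris cornerness c = det(M) - k*trace(M)^2, where M is the
    structure tensor of the image gradients, summed over a 3x3 window."""
    Iy, Ix = np.gradient(img.astype(float))

    def box3(a):
        # sum over a 3x3 neighbourhood (zero-padded at the border)
        p = np.pad(a, 1)
        h, w = a.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
```

Thresholding c then yields the corner candidates; the threshold trades sensitivity against the number of corners, as the slide notes.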

2D Point Tracking [Univ. Erlangen, VAMPIRE, EU-IST ]

Edge Tracking [Rapid 95, Harris, RoRapid 95, Armstrong, Zisserman]

Blob Tracking [Mean Shift 03, Comaniciu, Meer]

Contour Tracking [CONDENSATION 98-02, Isard, Toyama, Blake]

CONDENSATION (2)
CONditional DENSity propagATION
Requires a good initialization
Works with active contours
Maintains / adapts a contour model
Can keep more than one hypothesis
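A minimal sketch of the factored-sampling idea behind CONDENSATION, reduced to a 1D particle filter (resample–predict–weight). The static motion model, Gaussian likelihood, and all noise parameters are illustrative assumptions, not the actual contour model of Isard and Blake:

```python
import numpy as np

def condensation_step(particles, weights, z, rng, diffuse=0.2, meas_sigma=0.3):
    """One resample-predict-weight cycle of a 1D particle filter."""
    n = len(particles)
    # resample proportionally to the current weights (factored sampling)
    particles = particles[rng.choice(n, n, p=weights)]
    # predict: trivial static motion model plus diffusion noise
    particles = particles + rng.normal(0.0, diffuse, n)
    # re-weight by the measurement likelihood around observation z
    w = np.exp(-0.5 * ((particles - z) / meas_sigma) ** 2)
    return particles, w / w.sum()
```

Because the whole weighted particle set is propagated, several modes of the density (more than one hypothesis) survive until the measurements disambiguate them.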

Agenda: Structure of the SSIP Lecture
Intro, terminology, applications
2D motion analysis
Geometry
3D motion analysis
Practical considerations
Existing systems
Summary, conclusions

Geometry
Having motion in images:
– What does it mean?
– What can be measured?
Projective camera
Algebraic projective geometry
Camera calibration
Computer Vision
– Reconstruction from uncalibrated views
There are excellent textbooks [Faugeras 1994, Hartley+Zisserman 2001, Ma et al. 2003]

Projective Camera (1)
Pinhole camera model:
– p = (x,y)ᵀ is the image of P = (X,Y,Z)ᵀ
– (x,y) … image coordinates, (X,Y,Z) … scene coordinates
– o … center of projection
– (x,y,z) … camera coordinate system
– (x,y,−f) … image plane
(figure: pinhole projection geometry)

Projective Camera (2)
Pinhole camera model:
– If scene coordinate system = camera coordinate system: x = f·X/Z, y = f·Y/Z
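A minimal sketch of this projection, with the symbols as on the slide:

```python
import numpy as np

def project(P, f=1.0):
    """Pinhole projection: scene point P = (X, Y, Z) in camera
    coordinates maps to image point (x, y) = (f*X/Z, f*Y/Z)."""
    X, Y, Z = P
    return np.array([f * X / Z, f * Y / Z])
```

Depth Z is lost by the projection, which is why more than one view is needed for the reconstruction problems later in the lecture.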

Projective Camera (3)
Frontal pinhole camera model:
– (x,y,+f) … image plane
– Normalized camera: f = +1

Projective Camera (4)
“Real” camera:
– 5 intrinsic parameters (K)
– Lens distortion
– 6 extrinsic parameters (M: R, t)
– Projection p ≅ K·M·P … up to arbitrary scale

Algebraic Projective Geometry [Semple&Kneebone 52]
Homogeneous coordinates
Duality points ↔ lines
Homography H describes any projective transformation
– E.g.: image → image transform: x’ = Hx
– All transforms can be described by 3×3 matrices
– Combination of transformations: matrix product
– Special cases: translation, rotation
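A minimal sketch of an image-to-image transform x’ = Hx in homogeneous coordinates; the translation matrix below is one of the special cases mentioned:

```python
import numpy as np

def apply_homography(H, x):
    """Map a 2D point through a 3x3 homography via homogeneous coordinates."""
    xh = np.append(np.asarray(x, float), 1.0)  # lift to homogeneous
    yh = H @ xh
    return yh[:2] / yh[2]                      # back to inhomogeneous

# translation by (3, -1) written as a homography
H_t = np.array([[1.0, 0.0, 3.0],
                [0.0, 1.0, -1.0],
                [0.0, 0.0, 1.0]])
```

Combining transformations is just the matrix product, e.g. H2 @ H1 applies H1 first.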

Camera Calibration (1)
Recover the 11 camera parameters:
– 5 intrinsic parameters (K: fs_x, fs_y, fs_θ, u_0, v_0)
– 6 extrinsic parameters (M: R, t)
Calibration target:
– At least 6 point correspondences →
– System of linear equations
Direct (initial) solution for K and M

Camera Calibration (2)
Iterative optimization
– K, M, lens distortion
– E.g. Levenberg-Marquardt
Practical solutions require more points
– Many algorithms [Tsai 87, Zhang 98, Heikkilä 00]
Overdetermined systems
Robustness against outliers
– E.g. RANSAC
Refer to [Hartley, Zisserman, 2001]
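RANSAC, sketched on a toy 2D line-fitting problem rather than on calibration itself; the minimal sample size is 2 here, and the iteration count and inlier threshold are illustrative assumptions:

```python
import numpy as np

def ransac_line(pts, n_iter=200, thresh=0.1, seed=0):
    """Fit y = a*x + b robustly: sample minimal 2-point models and
    keep the one with the most inliers."""
    rng = np.random.default_rng(seed)
    best, best_count = None, 0
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = pts[rng.choice(len(pts), 2, replace=False)]
        if x1 == x2:
            continue  # degenerate sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        count = np.sum(np.abs(pts[:, 1] - (a * pts[:, 0] + b)) < thresh)
        if count > best_count:
            best, best_count = (a, b), count
    return best
```

In calibration the minimal sample is a set of point correspondences and the model is K, M, but the sample-score-keep loop is the same.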

What can be measured …
… with a calibrated camera
– Viewing directions
– Angles between viewing directions
– 3D reconstruction: more than 1 view required
… with uncalibrated camera(s)
– Computer Vision research of the past decade
– Hierarchy of geometries: projective – oriented projective – affine – similarity – Euclidean

Agenda: Structure of the SSIP Lecture
Intro, terminology, applications
2D motion analysis
Geometry
3D motion analysis
Practical considerations
Existing systems
Summary, conclusions

3D Motion Analysis: Location and Orientation
(figure: R, t relating the head coordinate system to the scene coordinate system)
6 DoF pose in real-time ↔ extrinsic parameters in real-time

3D Motion Analysis: Tracking technologies, terminology
Camera pose (PnP)
Stereo, E, F, epipolar geometry
Model-based tracking
– Confluence of 2D and 3D
Fusion
Kalman Filter

Tracking Technologies (1)
Mechanical tracking
“Magnetic tracking”
Acoustic – time of flight
“Optical” → vision-based
Compass
GPS, …
External effort required! No “self-contained” system
[Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]

Tracking Technologies (2) Examples [Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]

Research at EMT: Hybrid Tracking – HT
Combine 2 technologies:
Vision-based
+ Good results for slow motion
– Motion blur, occlusion, wrong matches
Inertial
+ Good results for fast motion
– Drift, noise, long term stability
Fusion of complementary sensors!
Mimics human cognition!

Vision-Based Tracking: More Terminology
Measure position and orientation in real-time
Obtain trajectories of object(s)
Moving observer, egomotion – “inside-out” tracking
Stationary observer – “outside-in” tracking
Combinations of the above
Degrees of Freedom – DoF
– 3 DoF (mobile robot)
– 6 DoF (head tracking in AR)

Inside-out Tracking
monocular
exterior parameters
6 DoF from ≥ 4 points
wearable, fully mobile
corners, blobs, natural landmarks

Outside-in Tracking
stereo-rig
IR-illumination
no cables
1 marker/device: 3 DoF
2 markers: 5 DoF
3 markers: 6 DoF
(figure: tracked devices)

Camera Pose Estimation
Pose estimation: estimate extrinsic parameters from known / unknown scene → find R, t
Linear algorithms [Quan, Lan, 1999]
Iterative algorithms [Lu et al., 2000]
Point-based methods
– No geometry, just 3D points
Model-based methods
– Object model, e.g. CAD

PnP (1): Perspective n-Point Problem
Calibrated camera K, C = (K·Kᵀ)⁻¹
n point correspondences scene ↔ image
Known scene coordinates of p_i, and known distances d_ij = ||p_i − p_j||
Each pair (p_i, p_j) defines an angle θ_ij
– θ_ij can be measured (2 lines of sight, calibrated camera)
– → constraint for the distance ||c − p_i||
(figure: camera center c, points p_i, p_j, distances x_i, x_j, d_ij)

PnP (2)
Law of cosines for each pair (p_i, p_j): d_ij² = x_i² + x_j² − 2·x_i·x_j·cos θ_ij, with unknown distances x_i = ||c − p_i||, x_j = ||c − p_j||

PnP (3)
P3P, 3 points: underdetermined, 4 solutions
P4P, 4 points: overdetermined, 6 equations, 4 unknowns
– 4 × P3P, then find a common solution
General problem: PnP, n points

PnP (4)
Once the x_i have been solved:
1) project image points → scene: p’_i = x_i · K⁻¹u_i
2) find a common R, t for p’_i → p_i (point correspondences → solve a simple system of linear equations)
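Step 2 — finding a common R, t between the two copies of the point set — is the classical absolute-orientation problem. A minimal SVD (Kabsch) sketch, which is one standard way to solve it; the lecture's linear-equation formulation may differ:

```python
import numpy as np

def rigid_align(P, Q):
    """Least-squares R, t with Q ~= R @ P + t.
    P, Q: 3xN arrays of corresponding 3D points."""
    Pc = P.mean(axis=1, keepdims=True)
    Qc = Q.mean(axis=1, keepdims=True)
    H = (P - Pc) @ (Q - Qc).T                    # 3x3 correlation matrix
    U, _, Vt = np.linalg.svd(H)
    # correct an accidental reflection so that det(R) = +1
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = Qc - R @ Pc
    return R, t
```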

Stereo Reconstruction
Elementary stereo geometry in “canonical configuration”
2h … “baseline” b
P_r − P_l … “disparity” d
There is just column disparity
Depth computation: z = b·f / d
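In the canonical configuration the depth computation reduces to similar triangles, z = b·f/d; a one-line sketch with illustrative numbers:

```python
def stereo_depth(d, b, f):
    """Depth from disparity in the canonical stereo configuration:
    z = b * f / d, with baseline b = 2h and focal length f."""
    return b * f / d
```

Since d is inversely proportional to z, depth resolution degrades quadratically with distance.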

Stereo (2)
2 cameras, general configuration: epipolar geometry
(figure: epipolar lines and epipoles e_l, e_r for camera centers C_l, C_r)

Stereo (3)
Uncalibrated cameras: Fundamental matrix F
Calibrated cameras: Essential matrix E
3×3, rank 2
Many algorithms
– Normalized 8-point [Hartley 97]
– 5-point (structure and motion) [Nister 03]
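For calibrated cameras the essential matrix has the closed form E = [t]×R, and corresponding normalized image points satisfy the epipolar constraint x_rᵀ E x_l = 0. A minimal numerical sketch (the example motion is invented for illustration):

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential(R, t):
    """Essential matrix E = [t]x R for the motion X_r = R @ X_l + t."""
    return skew(t) @ R

# example: small rotation about the y-axis, translation along x
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.3, 0.0, 0.0])
E = essential(R, t)
```

E has rank 2, and its left and right null spaces give the two epipoles.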

Model-Based Tracking Confluence of 2D and 3D [Deutscher, Davison, Reid 01]

3D Motion Analysis: Tracking technologies, terminology
Camera pose (PnP)
Stereo, E, F, epipolar geometry
Model-based tracking
– Confluence of 2D and 3D
Fusion
Kalman Filter
1. General considerations
2. Kalman Filter
3. → EMT HT project

General Considerations
We have:
– Several sensors (vision, inertial, …)
– Algorithms to deliver pose streams for each sensor (at discrete times; rates may vary depending on sensor, processor, load, …)
Thus, we need:
– Algorithms for sensor fusion (weighting the confidence in a sensor, sensor accuracy, …)
– Pose estimation including a temporal model
[Allen, Bishop, Welch. Tracking: Beyond 15 minutes of thought. SIGGRAPH’01]

Sensor Fusion
Dealing with ignorance:
– Imprecision, ambiguity, contradiction, …
Mathematical models
– Fuzzy sets
– Dempster-Shafer evidence theory
– Probability theory
Probabilistic modeling in Computer Vision
– The topic of this decade!
– Examples: CONDENSATION, mean shift

Kalman Filter (1) [Welch, Bishop. An Introduction to the Kalman Filter. SIGGRAPH’01]

Kalman Filter (2)
Estimate a process with a measurement
x ∈ ℝⁿ … state of the process
z ∈ ℝᵐ … measurement
p, v … process and measurement noise (zero mean)
A … n×n matrix, relates the previous with the current time step
B … n×l matrix, relates optional control input u to x
H … m×n matrix, relates state x to measurement z
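In standard state-space form, with the symbols defined on this slide (p and v as the zero-mean process and measurement noise), the model the filter assumes is:

```latex
x_k = A\,x_{k-1} + B\,u_{k-1} + p_{k-1}, \qquad
z_k = H\,x_k + v_k
```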

Kalman Filter (3) Definitions Then:

Kalman Filter (4) Compute with

Kalman Filter (5)

Kalman Filter (6)
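The predict/correct equations on slides (3)–(6) were shown as images and did not survive transcription; they follow the standard form in Welch & Bishop. A minimal scalar sketch (all variance values are illustrative assumptions):

```python
def kalman_1d(zs, a=1.0, h=1.0, q=1e-4, r=0.04, x0=0.0, p0=1.0):
    """Minimal scalar Kalman filter.
    Predict: x <- a*x,  P <- a*P*a + q
    Correct: K = P*h / (h*P*h + r);  x <- x + K*(z - h*x);  P <- (1 - K*h)*P"""
    x, P = x0, p0
    estimates = []
    for z in zs:
        x, P = a * x, a * P * a + q          # time update (predict)
        K = P * h / (h * P * h + r)          # Kalman gain
        x = x + K * (z - h * x)              # measurement update (correct)
        P = (1.0 - K * h) * P
        estimates.append(x)
    return estimates
```

In the hybrid tracker the same structure runs with vector states (pose and velocities) and per-sensor measurement models, which is where the fusion happens.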

EMT Hybrid Tracker HT Project
Ingredients of hybrid tracking:
Camera(s)
Inertial sensors
Feature extraction
Pose estimation
Structure estimation
Real-time
Synchronisation
Kalman filter
Sensor fusion

Hybrid Tracking
6 DoF vision-based tracker
6 DoF inertial tracker
Fusion by a Kalman filter

“Structure and Motion” “Tracking” + “Structure from Motion”

Research Prototype Tracking subsystem Visualization subsystem Sensors + HMD

HT Application Example

Agenda: Structure of the SSIP Lecture
Intro, terminology, applications
2D motion analysis
Geometry
3D motion analysis
Practical considerations
Existing systems
Summary, conclusions

Practical Considerations
There are critical configurations!
Projective geometry vs. discrete pixels
– Rays do not intersect!
– Error minimization algorithms required
Robustness (many points) vs. real-time
– Outlier detection can become difficult!
Precision (iterative) vs. real-time (linear)
Combination of diverse features
– Points, lines, curves
Jitter, lag
Debugging of a real-time system!

Existing Systems (1)
VR/AR
– Intersense, Polhemus, A.R.T.
MoCap
– Vicon, A.R.T.
Medical tracking
– MedTronic, A.R.T.
Fiducial tracker (Intersense)
Research systems
– KLT (Kanade, Lucas, Tomasi)
– ARToolkit (Billinghurst)
– XVision (Hager)

Existing Systems (2)

Open Issues
Tracking of natural landmarks
First success in online structure and motion
– [Nister CVPR03, ICCV03, ECCV04]
(Re-)Initialisation in highly complex scenes
Usability!

Future Applications
Can pose (position and orientation) be exploited?
– What is the user looking at?
– Architecture, city guide, museum, emergency, …
From bulky gear and HMD → PDA
– Wireless communication
– Camera(s)
– Inertial sensors (+ compass, + GPS, …)
Automotive!
– Driver assistance
– Autonomous vehicles, mobile robot navigation, …
Medicine!
– Surgical navigation
– Online fusion (temporal genesis, sensory modes, …)

Summary, Conclusions
Real-time pose (6 DoF)
2D and 3D motion analysis
Geometry
Probabilistic modeling
High potential for future developments

Acknowledgements
EU-IST Vampire – Visual Active Memory Processes and Interactive Retrieval
FWF P15748 Smart Tracking
FWF P14470 Mobile Collaborative AR
Christian Doppler Laboratory for Automotive Measurement Research
Markus Brandner, Harald Ganster, Bettina Halder, Jochen Lackner, Peter Lang, Ulrich Mühlmann, Miguel Ribo, Hannes Siegl, Christoph Stock, Georg Teichtmeister, Jürgen Wolf

Further Reading
R. Hartley, A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2nd ed., 2004.
Y. Ma, S. Soatto, J. Kosecka, S. Shankar Sastry. An Invitation to 3-D Vision. Springer, 2004.
B.D. Allen, G. Bishop, G. Welch. Tracking: Beyond 15 Minutes of Thought. SIGGRAPH 2001, Course 11.
G. Welch, G. Bishop. An Introduction to the Kalman Filter. SIGGRAPH 2001, Course 8.