Robot Vision SS 2005, Lesson 10: Object Tracking and Visual Servoing. Matthias Rüther.


1 ROBOT VISION Lesson 10: Object Tracking and Visual Servoing. Matthias Rüther

2 Contents
• Object Tracking
  – Appearance based tracking
    · Kalman filtering
    · Condensation algorithm
  – Model based tracking
    · Model fitting and tracking
• Visual Servoing
  – Principle
  – Servoing Types

3 Tracking

4 Definition of Tracking
• Tracking:
  – Generate some conclusions about the motion of the scene, objects, or the camera, given a sequence of images.
  – Knowing this motion, predict where things will project in the next image, so that less work is needed to search for them.

5 Why Track?

6 Tracking a Silhouette by Measuring Edge Positions
• Observations are positions of edges along normals to the tracked contour

7 Why not Wait and Process the Set of Images as a Batch?
• E.g. in a car system, detecting and tracking pedestrians in real time is important.
• Recursive methods require less computation

8 Implicit Assumptions of Tracking
• Physical cameras do not move instantly from one viewpoint to another.
• Objects do not teleport between places around the scene.
• Relative position between camera and scene changes incrementally.
• We can model motion

9 Related Fields
• Signal Detection and Estimation
• Radar technology

10 The Problem: Signal Estimation
• We have a system with parameters
  – Scene structure, camera motion, automatic zoom
  – System state is unknown (“hidden”)
• We have measurements
  – Components of stable “feature points” in the images.
  – “Observations”, projections of the state.
• We want to recover the state components from the observations

11 Necessary Models

12 A Simple Example of Estimation by Least Squares Method

13 Recursive Least Squares Estimation
• We don’t want to wait until all data have been collected to get an estimate of the depth.
• We don’t want to reprocess old data when we make a new measurement.
• Recursive method: the estimate at step i is obtained from the estimate at step i-1 and the new measurement
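The recursive idea can be illustrated for the simplest case, estimating a constant (for instance a depth) from noisy readings. The sketch below is an illustration under that assumption, not the lecture's own derivation; the function name `recursive_mean` and the sample readings are made up. Each new measurement corrects the previous estimate by 1/i of the difference, and the final value equals the batch least squares solution (the mean).

```python
def recursive_mean(measurements):
    """Yield the running least-squares estimate of a constant after each measurement."""
    estimate = 0.0
    for i, z in enumerate(measurements, start=1):
        # Correct the previous estimate by a fraction 1/i of the innovation (z - estimate).
        estimate += (z - estimate) / i
        yield estimate


# Hypothetical noisy depth readings, for illustration only.
depth_readings = [2.1, 1.9, 2.0, 2.2, 1.8]
estimates = list(recursive_mean(depth_readings))
batch = sum(depth_readings) / len(depth_readings)
print(estimates[-1], batch)
```

No old measurement is stored or reprocessed: only the current estimate and the step counter are carried from one step to the next.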

14 Recursive Least Squares Estimation 2

15 Recursive Least Squares Estimation 3

16 Least Squares Estimation of the State Vector of a Static System

17 Least Squares Estimation of the State Vector of a Static System 2

18 Dynamic System

19 Recursive Least Squares Estimation for a Dynamic System (Kalman Filter)
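The slide's equations did not survive in this transcript, so the following is only a minimal sketch of a standard scalar Kalman filter, assuming a linear model x_i = a·x_{i-1} + w and z_i = x_i + v. The names `kalman_1d`, `a`, `q`, `r`, `x0`, `p0` and the measurement values are assumptions chosen for illustration, not the lecture's notation.

```python
def kalman_1d(measurements, a=1.0, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_i = a*x_{i-1} + w, z_i = x_i + v.

    q and r are the variances of the process noise w and measurement noise v.
    Returns a list of (estimate, variance) pairs, one per measurement.
    """
    x, p = x0, p0
    history = []
    for z in measurements:
        # Predict: propagate the estimate and its variance through the dynamics.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p_pred / (p_pred + r)
        x = x_pred + k * (z - x_pred)
        p = (1.0 - k) * p_pred
        history.append((x, p))
    return history


# Hypothetical measurements of a quantity near 1.0.
track = kalman_1d([1.2, 0.9, 1.1, 1.0, 0.95])
print(track[-1])
```

With a = 1 and q = 0 this reduces to recursive least squares for a static state; the process noise q is what lets the filter follow a state that changes over time.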

20 Estimation when System Model is Nonlinear (Extended Kalman Filter)

21 Tracking Steps

22 Recursive Least Squares Estimation for a Dynamic System (Kalman Filter)

23 Tracking as a Probabilistic Inference Problem
• Find distributions for state vector a_i and for measurement vector x_i. Then we are able to compute the expectations â_i and x̂_i.
• Simplifying assumptions (same as for HMM)

24 Tracking as Inference

25 Model based tracking

26 MODEL-BASED 3-D TRACKING
IDEA: if motion is caused by a known 3-D object, we can track 3-D motion parameters, not just individual features!
ADVANTAGES:
- low dimensionality (3 rotations, 3 translations, independent of the number of features tracked)
- mutually constrained motion instead of independently moving points
LIMITATIONS:
- 6 parameters only with rigid objects! Not articulated, not deformable.
- assumes 3-D model known a priori

27 Example Algorithm [Wunsch, Hirzinger IEEE RA 1997]
SKETCH OF ALGORITHM:
0. Initialize 3-D pose R_0, t_0 (rotation, translation)
1. Extract features from image I_t
2. Match image features with features of the 3-D model positioned at R_{t-1}, t_{t-1}
3. Evaluate global error metric in 3-D space (notice, not in image space)
4. Estimate R_t, t_t by aligning image and model features
5. Next frame and go to 1.

28 Some Details
FEATURES: for instance, using image edges with a given orientation and offset d (and camera scale factors s_x, s_y), one obtains the normal n of the 3-D plane through the image edge. [Figure labels: corresponding model edge; 3-D plane through image edge]
ERROR METRIC: in 3-D space for efficiency (no back-projection): orthogonality of n and the model edge pq

29 Some Details
MINIMISATION: using, say, 3 types of features.
Trick 1: approximating R with differential rotations, all E terms can be linearized, a linear system obtained from the quadratic minimization, and a solution computed in closed form (e.g., for edges).
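The formulas on this slide were lost in the transcript. As a reminder of what "differential rotations" usually means, here is the standard small-angle approximation; the symbol ω for the vector of small rotation angles is an assumption and may differ from the lecture's notation.

```latex
R \approx I + [\omega]_\times
  = \begin{pmatrix}
      1 & -\omega_z & \omega_y \\
      \omega_z & 1 & -\omega_x \\
      -\omega_y & \omega_x & 1
    \end{pmatrix},
\qquad
R\,p + t \approx p + \omega \times p + t .
```

Because the transformed point is then linear in ω and t, an error term that is quadratic in the transformed points becomes a quadratic form in (t, ω), and its minimum follows from a linear system.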

30 Some Details
The resulting linear system A x = b, where x collects the translation t and the differential rotation, is (trick 2) applied iteratively at each time instant to reduce errors; a few iterations should suffice for small frame-to-frame displacements.
NOTICE ASSUMPTIONS MADE:
- rigid object
- model known a priori
- small frame-to-frame displacements
- image-model feature correspondences known (if displacements are small, by minimum distance)

31 Problems with Tracking
• Initial detection
  – If it is too slow we will never catch up
  – If it is fast, why not do detection at every frame?
• Even if raw detection can be done in real time, tracking saves processing cycles compared to raw detection.
• The CPU has other things to do.
• Detection is needed again if you lose tracking
• Most vision tracking prototypes use initial detection done by hand

32 Visual Servoing
• Vision system operates in a closed control loop.
• Better accuracy than “look and move” systems
Figures from S. Hutchinson: A Tutorial on Visual Servo Control

33 Visual Servoing
• Example: Maintaining relative object position
Figures from P. Wunsch and G. Hirzinger: Real-Time Visual Tracking of 3-D Objects with Dynamic Handling of Occlusion

34 Visual Servoing
• Camera Configurations: End-Effector Mounted, Fixed
Figures from S. Hutchinson: A Tutorial on Visual Servo Control

35 Visual Servoing
• Servoing Architectures
Figures from S. Hutchinson: A Tutorial on Visual Servo Control

36 Visual Servoing
• Position-based and image-based control
  – Position based:
    · Alignment in target coordinate system
    · The 3D structure of the target is reconstructed
    · The end-effector is tracked
    · Sensitive to calibration errors
    · Sensitive to reconstruction errors
  – Image based:
    · Alignment in image coordinates
    · No explicit reconstruction necessary
    · Insensitive to calibration errors
    · Only special problems solvable
    · Depends on initial pose
    · Depends on selected features
[Figure labels: target, end-effector, image of target, image of end-effector]

37 Visual Servoing
• EOL and ECL control
  – EOL: endpoint open-loop; only the target is observed by the camera
  – ECL: endpoint closed-loop; target as well as end-effector are observed by the camera

38 Visual Servoing
• Position Based Algorithm:
  1. Estimation of relative pose
  2. Computation of error between current pose and target pose
  3. Movement of robot
• Example: point alignment [Figure: points p_1 and p_2]

39 Visual Servoing
• Position based point alignment
• Goal: bring e to 0 by moving p_1
  e = |p_2m – p_1m|
  u = k*(p_2m – p_1m)
• p_xm is subject to the following measurement errors: sensor position, sensor calibration, sensor measurement error
• p_xm is independent of the following errors: end effector position, target position
[Figure labels: p_1m, p_2m, d]
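The control law u = k*(p_2m – p_1m) can be simulated in a few lines. This is a sketch under strong assumptions: measurements are noise-free, the robot executes each command u exactly as a displacement of p_1, and the gain `k`, tolerance `tol` and point coordinates are made-up values. The function name `servo_point` is hypothetical.

```python
import math


def servo_point(p1, p2, k=0.5, tol=1e-3, max_steps=100):
    """Move p1 toward p2 with the proportional law u = k * (p2 - p1).

    Returns the final position of p1 and the error e = |p2 - p1| at each step.
    """
    p1 = list(p1)
    errors = []
    for _ in range(max_steps):
        diff = [b - a for a, b in zip(p1, p2)]
        e = math.sqrt(sum(d * d for d in diff))
        errors.append(e)
        if e < tol:
            break
        # Control command; assumed to be applied exactly as a displacement.
        u = [k * d for d in diff]
        p1 = [a + du for a, du in zip(p1, u)]
    return p1, errors


final, errors = servo_point((0.0, 0.0, 0.0), (0.3, -0.1, 0.5))
print(len(errors), errors[-1])
```

Under these idealized assumptions the error shrinks by the factor (1 - k) per step. In a real position-based system both measured points carry the sensor position and calibration errors listed above, so the loop can converge to a wrong physical position.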

40 Visual Servoing
• Image based point alignment
• Goal: bring e to 0 by moving p_1
  e = |u_1m – v_1m| + |u_2m – v_2m|
• u_xm, v_xm are subject only to sensor measurement error
• u_xm, v_xm are independent of the following measurement errors: sensor position, end effector position, sensor calibration, target position
[Figure labels: p_1, p_2, c_1, c_2, u_1, u_2, v_1, v_2, d_1, d_2]
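A small sketch of the image-based error, assuming u_i and v_i are the images of p_1 and p_2 in camera i (the figure defining them is missing from this transcript). The two idealized pinhole cameras, their positions `cams` along the x axis and the focal length `f` are hypothetical and serve only to generate image points.

```python
import math


def project(point, cam_x, f=1.0):
    """Pinhole projection for a camera at (cam_x, 0, 0) looking along +z."""
    x, y, z = point
    return (f * (x - cam_x) / z, f * y / z)


def image_error(p1, p2, cams=(-0.1, 0.1)):
    """e = |u_1 - v_1| + |u_2 - v_2|, summed over the two cameras."""
    return sum(math.dist(project(p1, c), project(p2, c)) for c in cams)


# Two points on the same viewing ray of the first camera, at different depths.
a = (-0.1, 0.0, 1.0)
b = (-0.1, 0.0, 2.0)
print(image_error(a, a), image_error(a, b))
```

The two points coincide in the first camera's image, so a single camera would report zero error there; the second view is what makes e vanish only when the points actually coincide. The controller itself needs only the image coordinates, not the camera parameters used here to synthesize them.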

41 Visual Servoing
• Example: Laparoscopy
Figures from A. Krupa: Autonomous 3-D Positioning of Surgical Instruments in Robotized Laparoscopic Surgery Using Visual Servoing

42 Visual Servoing
• Example: Laparoscopy
Figures from A. Krupa: Autonomous 3-D Positioning of Surgical Instruments in Robotized Laparoscopic Surgery Using Visual Servoing

43 Tracking using CONDENSATION
CONditional DENSity PropagATION
M. Isard and A. Blake, CONDENSATION – Conditional density propagation for visual tracking, Int. J. Computer Vision 29(1), 1998, pp. 5-28.

44 Goal
• Model-based visual tracking in dense clutter at near video frame rates

45 Example

46 Approach
• Probabilistic framework for tracking objects such as curves in clutter using an iterative sampling algorithm.
• Model motion and shape of target
• Top-down approach
• Simulation instead of analytic solution

47 Probabilistic Framework
• Object dynamics form a temporal Markov chain
• Observations, z_t, are independent (mutually and w.r.t. the process)
• Use Bayes’ rule

48 Notation
X          State vector, e.g., curve’s position and orientation
Z          Measurement vector, e.g., image edge locations
p(X)       Prior probability of state vector; summarizes prior domain knowledge, e.g., by independent measurements
p(Z)       Probability of measuring Z; fixed for any given image
p(Z | X)   Probability of measuring Z given that the state is X; compares image to expectation based on state
p(X | Z)   Probability of X given that measurement Z has occurred; called state posterior

49 Tracking as Estimation
• Compute state posterior, p(X|Z), and select next state to be the one that maximizes this (Maximum a Posteriori (MAP) estimate)
• Measurements are complex and noisy, so posterior cannot be evaluated in closed form
• Particle filter (iterative sampling) idea:
  – Stochastically approximate the state posterior with a set of N weighted particles, (s, π), where s is a sample state and π is its weight
• Use Bayes’ rule to compute p(X|Z)

50 Factored Sampling
• Generate a set of samples that approximates the posterior p(X|Z)
• Sample set s = {s^(1), …, s^(N)} generated from p(X); each sample has a weight (“probability”)
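Factored sampling can be sketched for a one-dimensional state: draw samples from the prior p(X), weight each by the likelihood p(Z|X), and normalize. The Gaussian prior, the Gaussian likelihood, the measurement value `z` and the function name `factored_sampling` are all assumptions chosen so that the result can be checked against the known analytic posterior.

```python
import math
import random


def factored_sampling(prior_sampler, likelihood, n):
    """Approximate the posterior by n prior samples weighted by their likelihood."""
    samples = [prior_sampler() for _ in range(n)]
    weights = [likelihood(s) for s in samples]
    total = sum(weights)
    # Normalize so the weights sum to one and can be read as probabilities.
    weights = [w / total for w in weights]
    return samples, weights


rng = random.Random(0)
z = 2.0  # hypothetical measurement
samples, weights = factored_sampling(
    prior_sampler=lambda: rng.gauss(0.0, 3.0),                        # p(X) = N(0, 3^2)
    likelihood=lambda s: math.exp(-0.5 * ((z - s) / 0.5) ** 2),        # p(Z|X), sigma = 0.5
    n=2000,
)
posterior_mean = sum(w * s for s, w in zip(samples, weights))
print(posterior_mean)
```

For this Gaussian case the exact posterior mean is z·9/(9 + 0.25) ≈ 1.95, so the weighted sample mean should land close to that value. The same code works unchanged when the likelihood has no closed form, which is the situation the method is meant for.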

51 Factored Sampling
[Figure: N=15; CONDENSATION for one image]

52 Estimating Target State
[Figure labels: state samples; mean of weighted state samples]

53 Bayes’ Rule
• p(Z | X): this is what you can evaluate
• p(X): this is what you may know a priori, or what you can predict
• p(X | Z): this is what you want. Knowing p(X|Z) will tell us what is the most likely state X.
• p(Z): this is a constant for a given image
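The formula these annotations refer to is missing from the transcript. In the notation of the Notation slide, Bayes' rule reads:

```latex
p(X \mid Z) \;=\; \frac{p(Z \mid X)\, p(X)}{p(Z)} \;\propto\; p(Z \mid X)\, p(X)
```

Since p(Z) is constant for a given image, it only normalizes the result and does not affect which state X is most likely.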

54 CONDENSATION Algorithm
1. Select: Randomly select N particles from {s_{t-1}^(n)} based on weights π_{t-1}^(n); the same particle may be picked multiple times (factored sampling)
2. Predict: Move particles according to deterministic dynamics (drift), then perturb individually (diffuse)
3. Measure: Get a likelihood for each new sample by comparing it with the image’s local appearance, i.e., based on p(z_t | x_t); then update weights accordingly to obtain {(s_t^(n), π_t^(n))}
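The three steps can be sketched for a scalar state. This is a toy illustration, not the Isard and Blake implementation: the constant `drift`, the Gaussian `diffusion`, the Gaussian likelihood with width `sigma`, and the simulated target moving by one unit per frame are all assumptions.

```python
import math
import random


def condensation_step(particles, weights, z, rng, drift=1.0, diffusion=0.2, sigma=0.5):
    """One select / predict / measure iteration for a scalar state."""
    n = len(particles)
    # 1. Select: resample with replacement, proportionally to the old weights.
    selected = rng.choices(particles, weights=weights, k=n)
    # 2. Predict: deterministic drift plus an individual random perturbation.
    predicted = [s + drift + rng.gauss(0.0, diffusion) for s in selected]
    # 3. Measure: weight each particle by an assumed Gaussian likelihood p(z | x).
    likelihoods = [math.exp(-0.5 * ((z - s) / sigma) ** 2) for s in predicted]
    total = sum(likelihoods)
    return predicted, [l / total for l in likelihoods]


rng = random.Random(1)
n = 500
particles = [rng.uniform(-5.0, 5.0) for _ in range(n)]
weights = [1.0 / n] * n
true_x = 0.0
for _ in range(10):
    true_x += 1.0                      # simulated target motion
    z = true_x + rng.gauss(0.0, 0.5)   # simulated noisy measurement
    particles, weights = condensation_step(particles, weights, z, rng)

estimate = sum(w * s for s, w in zip(particles, weights))
print(true_x, estimate)
```

The weighted mean of the particles serves as the state estimate here. In the image setting the likelihood would come from comparing predicted contour positions with detected edges rather than from a single scalar measurement.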

55 CONDENSATION Scheme

56 Notes on Updating
• Enforcing plausibility: Particles that represent impossible configurations are discarded
• Diffusion modeled with a Gaussian
• Likelihood function: Convert “goodness of prediction” score to pseudo-probability
  – More markings closer to predicted markings -> higher likelihood

57 State Posterior

58 State Posterior Animation

59 Object Motion Model
• For video tracking we need a way to propagate probability densities, so we need a “motion model” such as X_{t+1} = A X_t + B W_t, where W is a noise term and A and B are state transition matrices that can be learned from training sequences
• The state, X, of an object, e.g., a B-spline curve, can be represented as a point in a 6D state space of possible 2D affine transformations of the object
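A short simulation of the motion model X_{t+1} = A X_t + B W_t, using a two-dimensional state (position, velocity) instead of the 6D affine state to keep it readable. The matrices `A` and `B` below are hand-picked for a constant-velocity example and are not learned from training sequences; the function name `propagate` is hypothetical.

```python
import random


def propagate(x, a, b, rng):
    """One step of X_{t+1} = A X_t + B W_t, with W_t drawn from a standard normal."""
    dim = len(x)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [
        sum(a[i][j] * x[j] for j in range(dim)) + sum(b[i][j] * w[j] for j in range(dim))
        for i in range(dim)
    ]


# Constant-velocity dynamics: position += velocity; noise enters only the velocity.
A = [[1.0, 1.0], [0.0, 1.0]]
B = [[0.0, 0.0], [0.0, 0.1]]

rng = random.Random(0)
state = [0.0, 1.0]  # position 0, velocity 1
trajectory = [state]
for _ in range(5):
    state = propagate(state, A, B, rng)
    trajectory.append(state)
print(trajectory[-1])
```

In a particle filter this function plays the role of the predict step: applying it to every particle with independent noise spreads the sample set according to the dynamics.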

60 Evaluating p(Z | X)
where φ_m = {true measurement is z_m} for m = 1,…,M, and q = 1 − Σ_m p(φ_m) is the probability that the target is not visible

61 Dancing Example

62 Hand Example

63 Pointing Hand Example

64 3D Model-based Example
• 3D state space: image position + angle
• Polyhedral model of object

65 Advantages of Particle Filtering
• Nonlinear dynamics, measurement model easily incorporated
• Copes with lots of false positives
• Multi-modal posterior okay (unlike Kalman filter)
• Multiple samples provide multiple hypotheses
• Fast and simple to implement

