
1 Statistical Models of Appearance for Computer Vision Speaker: 虞台文

2 Content Overview, Statistical Shape Models, Statistical Appearance Models, Active Shape Models, Active Appearance Models, Comparison: ASM vs. AAM, Conclusion

3 Statistical Models of Appearance for Computer Vision Overview

4 Computer Vision Goal – Image understanding Challenge – Deformability of objects Statistical model-based approach – Shape model – Appearance model – Model matching Image interpretation

5 Deformable Models General – capable of generating any plausible example of the class they represent Specific – only capable of generating legal examples

6 Statistical Models of Appearance for Computer Vision Statistical Shape Models

7 Shapes The shape of an object is represented by a set of n points in any dimension Invariance under some transformations – in 2 or 3 dimensions: translation, rotation, scaling – together called a similarity transformation

8 Hand-Annotated Training Set The training set typically comes from hand annotation of a set of training images

9 Suitable Landmarks Points on the object boundary, points of high curvature, T-junctions, and equally spaced intermediate points

10 Suitable Landmarks

11 Shape Vector A d-dim shape with n landmarks is represented by a vector with nd elements. In a 2D image with n landmark points, a shape vector x is then x = (x_1, …, x_n, y_1, …, y_n)^T

12 Training Set of Shape Vectors A d-dim shape with n landmarks is represented by a vector with nd elements. In a 2D image with n landmark points, a shape vector x is then x = (x_1, …, x_n, y_1, …, y_n)^T. The shape vectors in the training set should be in the same coordinate frame – alignment of the training set is required

13 Aligning the Training Set

14 Procrustes Analysis Minimize D = Σ_i |x_i − x̄|², the sum of squared distances of each aligned shape x_i to the mean shape x̄

15 Alignment : Iterative Approach

16

17

18 Possible Alignment Approaches: Center -> scale ( |x| =1) -> rotation Center -> (scale + rotation) Center -> (scale + rotation) -> projection onto tangent space of the mean

19 Alignment : Iterative Approach Possible Alignment Approaches: Center -> scale ( |x| =1) -> rotation Center -> (scale + rotation) Center -> (scale + rotation) -> projection onto tangent space of the mean

20 Alignment : Iterative Approach Possible Alignment Approaches: Center -> scale ( |x| = 1) -> rotation Center -> (scale + rotation) Center -> (scale + rotation) -> projection onto tangent space of the mean Tangent space to x_t – all vectors x s.t. (x_t − x) · x_t = 0, or x · x_t = 1 if x_t · x_t = 1 Obtained by aligning the shapes with the mean, allowing scaling and rotation, then projecting into the tangent space by scaling x by 1/(x · x̄)
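A minimal sketch of the first variant (center, scale to |x| = 1, then rotate to the current mean), assuming 2D shapes stored as n×2 NumPy arrays; the tangent-space projection described above is omitted here:

import numpy as np

def align_to(x, ref):
    # Centre the shape, scale to unit norm, then find the rotation that best maps it onto ref
    x = x - x.mean(axis=0)
    x = x / np.linalg.norm(x)
    u, _, vt = np.linalg.svd(x.T @ ref)       # orthogonal Procrustes solution
    return x @ (u @ vt)                       # (ignores the rare reflection case)

def align_training_set(shapes, n_iter=10):
    # Iteratively re-align all shapes to the current estimate of the mean shape
    shapes = [s - s.mean(axis=0) for s in shapes]
    mean = shapes[0] / np.linalg.norm(shapes[0])
    for _ in range(n_iter):
        shapes = [align_to(s, mean) for s in shapes]
        mean = np.mean(shapes, axis=0)
        mean = mean / np.linalg.norm(mean)    # re-normalise so the mean keeps unit scale
    return shapes, mean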

21 Alignment : Iterative Approach

22

23 Modeling Shape Variation Parametric Shape Model Estimate the distribution of b – Generating new shapes – Examining new shapes (plausibility)

24 PCA 1. Compute the mean x̄ = (1/N) Σ_i x_i 2. Compute the covariance matrix S = (1/(N−1)) Σ_i (x_i − x̄)(x_i − x̄)^T 3. Compute the eigenvectors φ_i and corresponding eigenvalues λ_i of S, sorted so that λ_1 ≥ λ_2 ≥ …

25 Shape Approximation A plausible shape can be approximated by x ≈ x̄ + Φb, where x̄ is the mean shape, Φ = (φ_1 | φ_2 | … | φ_t) contains the first t eigenvectors, and b is the parameter vector of the deformable model

26 Shape Approximation The parameter vector of the deformable model is b = Φ^T (x − x̄), with each b_i constrained (e.g., |b_i| ≤ 3√λ_i). The generated shape is then similar to those in the training set (plausible).

27 Choice of Number of Modes ( t = ? ) Let f_v be the proportion of the total variation one wishes to explain (e.g., 98%). The total variance V_T = Σ_i λ_i is the sum of all eigenvalues, assuming λ_1 ≥ λ_2 ≥ … ≥ λ_T. Then, we choose the smallest t s.t. Σ_{i=1}^{t} λ_i ≥ f_v V_T
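A minimal sketch of the PCA step (slide 24) together with this choice of t, assuming the aligned shape vectors are stacked as the rows of a NumPy array:

import numpy as np

def build_shape_model(X, f_v=0.98):
    # X: (N, nd) matrix of aligned shape vectors, one training example per row
    x_mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                        # (nd, nd) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)               # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                  # sort so lambda_1 >= lambda_2 >= ...
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # smallest t whose cumulative variance reaches f_v of the total
    t = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), f_v)) + 1
    return x_mean, eigvecs[:, :t], eigvals[:t]         # x_mean, Phi, retained eigenvalues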

28 Examples of Shape Models Some members in the training set (18 hands) Each is represented by 72 landmark points

29 Examples of Shape Models

30 Some members in the training set (300 faces) Each is represented by 133 landmark points

31 Examples of Shape Models

32 Generating Plausible Shapes x = x̄ + Φb, where b is the parameter vector of the deformable model Assumption : the b_i ’s are independent and Gaussian Options Hard limits on the independent b_i ’s (e.g., |b_i| ≤ 3√λ_i) Constrain b to lie in a hyperellipsoid (Σ_i b_i²/λ_i ≤ M_t)
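Both options take only a few lines; a sketch, assuming phi (the matrix of retained eigenvectors) and eigvals come from a model such as the one above, with illustrative limits k = 3 and m_max = 3:

import numpy as np

def constrain_hard(b, eigvals, k=3.0):
    # Clip each b_i independently to +/- k * sqrt(lambda_i)
    limit = k * np.sqrt(eigvals)
    return np.clip(b, -limit, limit)

def constrain_hyperellipsoid(b, eigvals, m_max=3.0):
    # Rescale b so that sum(b_i^2 / lambda_i) <= m_max^2
    m = np.sqrt(np.sum(b ** 2 / eigvals))
    return b if m <= m_max else b * (m_max / m)

def generate_shape(x_mean, phi, b):
    # x = x_mean + Phi b: a plausible shape whenever b is constrained as above
    return x_mean + phi @ b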

33 Generating Plausible Shapes Variation is modeled as a linear combination of eigenvectors How about for nonlinear cases?

34 Nonlinear Shape Variation

35 b 1 and b 2 are not independent.

36 Nonlinear Shape Variation

37 Nonlinear shape variations are often caused by – Rotating parts of objects – Viewpoint changes – Other special cases, e.g., only 2 valid positions (a linear x = f(b) fails) Plausible shapes cannot be generated using the aforementioned (linear) method.

38 Non-Linear Models for PDF Polar coordinates (Heap and Hogg) Modeling p(b) by a mixture of Gaussians Drawbacks Figuring out the number of Gaussians to be used Finding the nearest plausible shape

39 Similarity Transformation A shape x of the model is mapped to a shape X in the image by a similarity transformation with pose parameters (X_t, Y_t, s, θ): translation by (X_t, Y_t), scaling by s, and rotation by θ, i.e., X = T_{X_t,Y_t,s,θ}(x), where a single point (x, y) maps to (X_t + s(x cos θ − y sin θ), Y_t + s(x sin θ + y cos θ))

40 Fitting a Model to New Points in an Image Given a set of new image points Y, find b and the pose parameters (X_t, Y_t, s, θ) so as to minimize |Y − T_{X_t,Y_t,s,θ}(x̄ + Φb)|²

41 Fitting a Model to New Points in an Image Minimize |Y − T_{X_t,Y_t,s,θ}(x̄ + Φb)|², typically by alternating between estimating the pose and estimating the shape parameters (as sketched below)
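A rough sketch of that alternating fit, assuming 2D landmarks stored interleaved as (x_1, y_1, x_2, y_2, …) and a simple least-squares similarity fit; this is an illustrative simplification (it omits the tangent-space projection mentioned earlier), not the exact procedure from the slides:

import numpy as np

def similarity_fit(src, dst):
    # Least-squares similarity transform (scaled rotation R, translation t) mapping src onto dst;
    # src and dst are (n, 2) arrays of corresponding points
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    norm = (A ** 2).sum()
    a = (A * B).sum() / norm                                   # a = s*cos(theta)
    b = (A[:, 0] * B[:, 1] - A[:, 1] * B[:, 0]).sum() / norm   # b = s*sin(theta)
    R = np.array([[a, -b], [b, a]])
    t = mu_d - mu_s @ R.T
    return R, t

def fit_to_points(Y, x_mean, phi, eigvals, n_iter=20):
    # Alternate pose and shape-parameter updates so that T(x_mean + phi b) approximates Y (n x 2)
    n = Y.shape[0]
    b = np.zeros(phi.shape[1])
    for _ in range(n_iter):
        x = (x_mean + phi @ b).reshape(n, 2)          # current model instance (model frame)
        R, t = similarity_fit(x, Y)                   # pose update
        y = (Y - t) @ np.linalg.inv(R).T              # project the image points into the model frame
        b = phi.T @ (y.reshape(-1) - x_mean)          # shape-parameter update
        b = np.clip(b, -3 * np.sqrt(eigvals), 3 * np.sqrt(eigvals))   # keep the shape plausible
    return (R, t), b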

42 Statistical Models of Appearance for Computer Vision Statistical Models of Appearance

43 Appearance Shape Texture – Pattern of intensities

44 Appearance Model An appearance model captures both shape variation and texture variation; the texture is modeled after shape normalization

45 Shape Normalization Remove spurious texture variations due to shape differences Warp each image so its control points match those of the mean shape – e.g., using a triangulation algorithm

46 Intensity Normalization The gray-level vector of each shape-normalized image is scaled and offset to match the mean of the normalized data, which is itself scaled and offset s.t. the sum of its elements is 0 and the variance is 1. The mean is obtained using a recursive process.
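A small sketch of that recursive normalization, assuming the shape-normalized grey-level vectors are stacked as the rows of a NumPy array; the scale/offset formulas mirror the zero-sum, unit-variance convention stated above:

import numpy as np

def normalise_texture(g, g_mean):
    # Offset by beta (mean of g) and scale by alpha = g . g_mean so g best matches the current mean
    beta = g.mean()
    alpha = (g - beta) @ g_mean
    return (g - beta) / alpha

def build_texture_mean(G, n_iter=5):
    # G: (N, m) matrix of shape-normalised grey-level vectors, one per training image
    g_mean = G[0].astype(float).copy()
    for _ in range(n_iter):
        g_mean = (g_mean - g_mean.mean()) / g_mean.std()   # zero sum, unit variance
        G_norm = np.array([normalise_texture(g, g_mean) for g in G])
        g_mean = G_norm.mean(axis=0)                       # re-estimate the mean and iterate
    return g_mean, G_norm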

47 PCA Applying PCA to the normalized textures gives g = ḡ + P_g b_g, where ḡ is the mean normalized grey-level vector, P_g is a set of orthogonal modes of variation, and b_g is a set of grey-level parameters

48 Texture Reconstruction

49 Combined Appearance Model For each training example, compute the shape parameters b_s = P_s^T (x − x̄) and the appearance (grey-level) parameters b_g = P_g^T (g − ḡ)

50 Combined Appearance Models Concatenate b = (W_s b_s ; b_g), where W_s is a diagonal matrix of weights to accommodate the difference in units between the shape and grey models

51 PCA for Combined Vectors Applying a further PCA to the combined vectors gives b = Qc, where Q contains the eigenvectors from applying PCA on the b ’s and c is a vector of appearance parameters controlling both the shape and grey-levels of the model
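A sketch of this combined PCA, assuming the per-example b_s and b_g vectors are stacked as rows of NumPy arrays and Ws is the diagonal weight matrix from the previous slide:

import numpy as np

def build_combined_model(Bs, Bg, Ws):
    # Bs: (N, ts) shape parameters, Bg: (N, tg) grey-level parameters, Ws: (ts, ts) diagonal weights
    B = np.hstack([Bs @ Ws.T, Bg])                        # one combined vector b per example
    eigvals, Q = np.linalg.eigh(np.cov(B, rowvar=False))  # PCA on the combined vectors
    order = np.argsort(eigvals)[::-1]
    return Q[:, order], eigvals[order]                    # columns of Q are the combined modes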

52 Shape & Grey-Level Reconstruction Writing Q = (Q_s ; Q_g), the shape and grey-levels are reconstructed from c as x = x̄ + P_s W_s^{-1} Q_s c and g = ḡ + P_g Q_g c

53 Appearance Reconstruction Given the appearance parameters c: generate the shape-free gray-level image g, then warp g to the shape described by x

54 Review: Combined Appearance Models b = (W_s b_s ; b_g), where W_s is a diagonal matrix of weights to accommodate the difference in units between the shape and grey models. How is W_s obtained?

55 Choice of Shape Parameter Weights Method 1 – Displace each element of b_s from its optimum value on each training example and observe the change in g; the RMS change gives the corresponding element of W_s Method 2 – W_s = rI, where r² is the ratio of the total intensity variation to the total shape variation

56 Choice of Shape Parameter Weights Method 1 – Displace each element of b_s from its optimum value on each training example and observe the change in g; the RMS change gives the corresponding element of W_s Method 2 – W_s = rI, where r² is the ratio of the total intensity variation to the total shape variation The results are relatively insensitive to the choice of W_s
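Method 2 is essentially a one-liner; a minimal sketch, assuming the eigenvalue arrays come from the shape and grey-level PCAs above:

import numpy as np

def shape_weight_matrix(shape_eigvals, grey_eigvals):
    # Method 2: Ws = r * I, with r^2 the ratio of total intensity variation to total shape variation
    r = np.sqrt(grey_eigvals.sum() / shape_eigvals.sum())
    return r * np.eye(len(shape_eigvals))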

57 Example: Facial Appearance Model First two modes of shape variation (±3 s.d.) First two modes of grey-level variation (±3 s.d.)

58 Example: Facial Appearance Model First four modes of appearance variation (±3 s.d.)

59 Approximating a New Example Given a new image, labelled with a set of landmarks, generate an approximation with the model

60 Approximating a New Example Given a new image, labelled with a set of landmarks, generate an approximation with the model (see the sketch below): obtain b_s and b_g, obtain b, obtain c, apply the reconstruction x = x̄ + P_s W_s^{-1} Q_s c and g = ḡ + P_g Q_g c, invert the gray-level normalization, apply the pose to the points, and project the gray-level vector into the image
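A sketch of the linear-algebra part of these steps (the final pose and warping stages are omitted); the argument names simply mirror the symbols used on the previous slides:

import numpy as np

def approximate_example(x, g, x_mean, g_mean, Ps, Pg, Ws, Qs, Qg):
    # Project a labelled example (landmarks x, shape-normalised texture g) through the combined
    # model and back, returning the model's approximation of its shape and texture
    bs = Ps.T @ (x - x_mean)                           # shape parameters
    bg = Pg.T @ (g - g_mean)                           # grey-level parameters
    b = np.concatenate([Ws @ bs, bg])                  # combined vector b = (Ws bs ; bg)
    c = np.concatenate([Qs, Qg]).T @ b                 # appearance parameters c = Q^T b
    x_rec = x_mean + Ps @ np.linalg.solve(Ws, Qs @ c)  # x = x_mean + Ps Ws^-1 Qs c
    g_rec = g_mean + Pg @ (Qg @ c)                     # g = g_mean + Pg Qg c
    return x_rec, g_rec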

61 Statistical Models of Appearance for Computer Vision Active Shape Models

62 Goal Given a rough starting approximation, to fit an instance of a model to the image

63 Iterative Approach Iteratively improving the fit of the instance, X, to an image proceeds as follows: 1. Examine a region of the image around each point X_i to find the best nearby match for the point 2. Update the parameters (X_t, Y_t, s, θ, b) to best fit the newly found points X 3. Repeat until convergence

64 Iterative Approach Iteratively improving the fit of the instance, X, to an image proceeds as follows: 1. Examine a region of the image around each point X_i to find the best nearby match for the point 2. Update the parameters (X_t, Y_t, s, θ, b) to best fit the newly found points X 3. Repeat until convergence The above method is applicable if the model points lie on edges. The best approach is to examine the local structure around the model points (to be discussed).

65 Iterative Approach Iteratively improving the fit of the instance, X, to an image proceeds as follows: 1. Examine a region of the image around each point X_i to find the best nearby match for the point 2. Update the parameters (X_t, Y_t, s, θ, b) to best fit the newly found points X 3. Repeat until convergence

66 Modeling Local Structure Sample the derivative along a profile, k pixels on either side of a model point, to get a vector g_i of 2k+1 samples

67 Modeling Local Structure Sample the derivative along a profile, k pixels on either side of a model point, to get a vector g_i of 2k+1 samples Normalize g_i by dividing by the sum of the absolute values of its elements Repeat for each training image for the same model point to get {g_i} Estimate the mean ḡ and covariance S_g

68 Fitness Sample the derivative along a profile, k pixels on either side of a model point, to get a vector g_i of 2k+1 samples Normalize g_i by dividing by the sum of the absolute values of its elements Repeat for each training image for the same model point to get {g_i} Estimate the mean ḡ and covariance S_g The fit of a new sample g_s is measured by the Mahalanobis distance f(g_s) = (g_s − ḡ)^T S_g^{-1} (g_s − ḡ); lower is better

69 Using Local Structure Model Sample a profile m pixels on either side of the current point ( m > k ) Test the quality of fit at 2(m − k)+1 positions Choose the one which gives the best match (see the sketch below)
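A sketch of the profile model and the search along it, assuming the normalised derivative profiles for one landmark are stacked as rows of a NumPy array; the pseudo-inverse is a small robustness tweak of mine, not something stated on the slides:

import numpy as np

def profile_model(profiles):
    # profiles: (N, 2k+1) normalised derivative profiles for one landmark over the training set
    g_mean = profiles.mean(axis=0)
    S_inv = np.linalg.pinv(np.cov(profiles, rowvar=False))   # pinv guards against a singular S
    return g_mean, S_inv

def mahalanobis_fit(g, g_mean, S_inv):
    # f(g) = (g - g_mean)^T S^-1 (g - g_mean): smaller means a better match
    d = g - g_mean
    return d @ S_inv @ d

def best_profile_offset(long_profile, g_mean, S_inv, k, m):
    # Slide a (2k+1)-sample window over a (2m+1)-sample search profile and return the offset,
    # in samples relative to the current point, with the lowest Mahalanobis distance
    fits = []
    for j in range(2 * (m - k) + 1):
        window = long_profile[j:j + 2 * k + 1]
        window = window / np.abs(window).sum()               # same normalisation as in training
        fits.append(mahalanobis_fit(window, g_mean, S_inv))
    return int(np.argmin(fits)) - (m - k)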

70 Statistical Models of Grey-Level Profiles The same number of pixels is used at each level

71 MRASM Algorithm

72 Starting from the coarsest level

73 MRASM Algorithm Set up the initial position before search

74 MRASM Algorithm Search the best match position for each model point

75 MRASM Algorithm Search the best match position for each model point Ensure plausibility

76 MRASM Algorithm Repeat until convergence at the current level Then change to a finer level

77 MRASM Algorithm Report the result
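A purely structural sketch of the multi-resolution loop described on slides 71 to 77; image_pyramid, search_profiles, and update_pose_and_shape are hypothetical placeholders for the Gaussian pyramid, the per-point profile search, and the pose/shape update, and the convergence test (a fraction of points already at their best match) is one common choice rather than the slides' exact criterion:

import numpy as np

def mrasm_search(image, model, n_levels=3, max_iter=25, converged_frac=0.9):
    # image_pyramid, search_profiles, and update_pose_and_shape are hypothetical placeholders
    pyramid = image_pyramid(image, n_levels)
    pose, b = model.initial_pose(), np.zeros(model.n_modes)
    for level in reversed(range(n_levels)):               # start from the coarsest level
        for _ in range(max_iter):
            X = model.instance(pose, b, level)            # current model points in the image frame
            X_new, n_close = search_profiles(pyramid[level], X, model, level)
            pose, b = update_pose_and_shape(X_new, model) # re-fit pose and shape to the new points
            b = np.clip(b, -3 * np.sqrt(model.eigvals), 3 * np.sqrt(model.eigvals))  # plausibility
            if n_close / len(X) >= converged_frac:        # most points already at their best match
                break                                     # converged at this level; go finer
    return pose, b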

78 Summary of Parameters

79 Examples of Search

80 Poor Starting Point

81 Examples of Search Search using ASM of cartilage on an MR image of the knee

82 Statistical Models of Appearance for Computer Vision Active Appearance Models

83 Disadvantages of ASM Only uses shape constraints (together with some information about the image structure near the landmarks) for search Incapable of generating a synthetic image of the object

84 Goal of AAM Given a rough starting approximation of an appearance model, to fit it within an image

85 Review: Appearance Model x = x̄ + P_s W_s^{-1} Q_s c, g = ḡ + P_g Q_g c, where c is the model (appearance) parameter vector

86 Overview of AAM Search The difference vector is δg = g_i − g_m, where g_i is the grey-level vector sampled from the image and g_m is the grey-level vector synthesized by the model

87 Overview of AAM Search AAM Search – Minimize Δ = |δg|² by varying c; matching effectively becomes an optimization problem

88 Overview of AAM Search AAM Search – Minimize Δ = |δg|² by varying c; matching effectively becomes an optimization problem Given δg, how can δc be obtained deterministically?

89 Learning to Correct Model Parameters Given δg, how can δc be obtained deterministically? Linear Model: δc = A δg, with A obtained by multivariate regression on a sample of known model displacements δc and the corresponding δg

90 Extra Parameters for Pose Linear Model: δc = A δg, with A obtained by multivariate regression on a sample of known model displacements δc and the corresponding δg; the displacement vector also includes the pose parameters (s_x, s_y, t_x, t_y)

91 Training c_0 – a known appearance model fit in the current image Perturb by δc to get new parameters, i.e., c = c_0 + δc – including small displacements in position, scale, and orientation Compute δg = g_i − g_m – using the shape-normalized representation (warping) Record enough perturbations and image differences for the regression

92 Training c_0 – a known appearance model fit in the current image Perturb by δc to get new parameters, i.e., c = c_0 + δc – including small displacements in position, scale, and orientation Compute δg = g_i − g_m – using the shape-normalized representation (warping) Record enough perturbations and image differences for the regression Ideally, we want a model that holds over a large error range δg, and so over a large parameter range δc Experimentally, the optimal perturbation is around 0.5 standard deviations for each parameter
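A sketch of how such training data and the regression matrix could be assembled; sample_residual is a hypothetical helper standing in for the warp-sample-normalise step described above, and the perturbation scheme (independent Gaussian displacements of about 0.5 s.d. per mode) is an illustrative simplification:

import numpy as np

def learn_update_matrix(dC, dG):
    # Multivariate linear regression delta_c = A * delta_g, solved in the least-squares sense
    # dC: (N, n_params) parameter displacements; dG: (N, n_pixels) resulting texture residuals
    A_T, *_ = np.linalg.lstsq(dG, dC, rcond=None)
    return A_T.T                                          # A has shape (n_params, n_pixels)

def training_samples(model, fitted_examples, n_perturb=20, scale=0.5):
    # sample_residual is a hypothetical helper that warps, samples, and normalises the texture,
    # returning delta_g = g_image - g_model for the perturbed parameters
    dC, dG = [], []
    for image, c0 in fitted_examples:                     # (image, known appearance parameters)
        for _ in range(n_perturb):
            dc = scale * np.sqrt(model.c_eigvals) * np.random.randn(len(c0))  # ~0.5 s.d. per mode
            dG.append(sample_residual(image, model, c0 + dc))
            dC.append(dc)
    return np.array(dC), np.array(dG)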

93 Results For The Face Model

94 The weight attached to different areas of the sampled patch when estimating the displacement

95 Weights for Pose Parameters (weight images for s_x, s_y, t_x, t_y)

96 Pose-Parameter Displacement Weights (weight images for s_x, s_y, t_x, t_y)

97 First Mode and Displacement Weights

98 Second Mode and Displacement Weights

99 Performance of the Prediction The linear relation holds within about 4 pixels of displacement As long as the prediction has the same sign as the actual error, and does not over-predict too much, the search converges The range can be extended by building a multi-resolution model

100 Performance of the Prediction Multi-Resolution Model L0: 10000 pixels, L1: 2500 pixels, L2: 600 pixels.

101 AAM Search: Iterative Model Refinement 1. Evaluate the error vector δg_0 = g_s − g_m 2. Evaluate the current error E_0 = |δg_0|² 3. Compute the predicted displacement δc = A δg_0 4. Set k = 1 5. Let c_1 = c_0 − k δc 6. Sample the image at this new prediction, and calculate a new error vector δg_1 7. If |δg_1|² < E_0, then accept the new estimate c_1 8. Otherwise try at k = 1.5, k = 0.5, k = 0.25, etc.

102 AAM Search: Iterative Model Refinement
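The loop on slide 101 maps almost line-for-line onto code; in this sketch, residual(c) is a hypothetical callback that warps the model to the image, samples it, and returns δg for the given appearance parameters c:

import numpy as np

def aam_refine(c0, A, residual, max_iter=30, steps=(1.0, 1.5, 0.5, 0.25)):
    # residual(c) is a hypothetical callback returning delta_g for appearance parameters c
    c = np.asarray(c0, dtype=float).copy()
    dg = residual(c)
    E = dg @ dg                                           # current error |delta_g|^2
    for _ in range(max_iter):
        dc = A @ dg                                       # predicted parameter displacement
        for k in steps:                                   # try k = 1, then 1.5, 0.5, 0.25
            c_new = c - k * dc
            dg_new = residual(c_new)
            E_new = dg_new @ dg_new
            if E_new < E:                                 # accept the first step that improves the fit
                c, dg, E = c_new, dg_new, E_new
                break
        else:
            return c                                      # no step improved the fit: converged
    return c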

103 Examples of AAM Search Reconstruction (left) and original (right) given original landmark points

104 Examples of AAM Search Multi-Resolution search from displaced position

105 Examples of AAM Search First two modes of appearance variation of knee model Best fit of knee model to new image given landmarks

106 Examples of AAM Search Multi-Resolution search from displaced position

107 Statistical Models of Appearance for Computer Vision Comparison : ASM v/s AAM

108 Key Differences ASM only uses models of the image texture in small regions around each landmark point ASM searches around the current position ASM seeks to minimize the distance between model points and the corresponding image points AAM uses a model of the appearance of the whole region AAM only samples the image under the current position AAM seeks to minimize the difference between the synthesized image and the target image

109 Experiment Data Two data sets: – 400 face images, 133 landmarks each – 72 brain slices, 133 landmark points each Training data set – Faces: trained on 200, tested on the remaining 200 – Brain: 400, leave-one-brain-out experiments

110 Capture Range

111

112 Point Location Accuracy

113 ASM runs significantly faster for both models, and locates the points more accurately

114 Texture Matching

115 Statistical Models of Appearance for Computer Vision Conclusion

116 ASM searches around the current location, along profiles, so one would expect it to have a larger capture range ASM takes only the shape into account and is thus less reliable AAM can work well with a much smaller number of landmarks compared to ASM

