Cognitive Computer Vision Kingsley Sage and Hilary Buxton Prepared under ECVision Specific Action 8-3


Lecture 8: Gaussian Mixture Models. Outline: Gaussian mixture models; using mixture models in continuous-valued Hidden Markov Models; coursework.

So why are GMMs relevant to Cognitive CV? They provide a well-founded methodology for representing, for example, arbitrarily complex functions and images. Combined with HMMs (previous lecture), they provide one basis for a model of expectation capable of dealing with continuous-valued data.

The Gaussian function. In one dimension, the Gaussian function is the probability density function of the normal distribution:
p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)
where \mu is the mean, \sigma is the standard deviation and \sigma^2 is the variance.
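As a quick numerical check, here is a minimal Python/NumPy sketch of the 1-D Gaussian density above (the function name gaussian_pdf and the example values are illustrative, not from the lecture):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """1-D Gaussian (normal) probability density."""
    coeff = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    return coeff * np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Example: a standard normal evaluated at its mean (about 0.3989)
print(gaussian_pdf(0.0, mu=0.0, sigma=1.0))
```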

The Gaussian function. We can extend this idea to k dimensions (the multivariate normal distribution):
p(\mathbf{x}) = \frac{1}{(2\pi)^{k/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (\mathbf{x}-\mu)^{T} \Sigma^{-1} (\mathbf{x}-\mu) \right)
where k is the number of dimensions, \mathbf{x} and \mu are vectors of k data values, \Sigma is the covariance matrix and T denotes the matrix transpose.
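A matching sketch of the multivariate density using plain NumPy (function and variable names are illustrative; scipy.stats.multivariate_normal could be used instead):

```python
import numpy as np

def multivariate_gaussian_pdf(x, mu, sigma):
    """Multivariate normal density for a k-dimensional vector x."""
    k = x.shape[0]
    diff = x - mu
    norm_const = 1.0 / np.sqrt(((2.0 * np.pi) ** k) * np.linalg.det(sigma))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

# Example: 2-D standard normal evaluated at the origin (about 0.1592)
print(multivariate_gaussian_pdf(np.zeros(2), mu=np.zeros(2), sigma=np.eye(2)))
```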

Variance and covariance reminder. For a single dimension (k = 1), X with N observations \{x_1, x_2, \ldots, x_N\}:
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
For small samples (say N < 20), a better unbiased estimate of the variance is given by replacing the 1/N term with 1/(N-1).

Variance and covariance reminder. For k-dimensional X with N observations, the covariance between dimensions i and j is
\mathrm{cov}(x_i, x_j) = \frac{1}{N} \sum_{n=1}^{N} (x_{n,i} - \mu_i)(x_{n,j} - \mu_j)
and the covariance matrix \Sigma collects these values into a k \times k matrix.
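A minimal NumPy sketch of these estimates, assuming the observations are stored as an N x k array (the data values are arbitrary illustrations):

```python
import numpy as np

# N observations of a k-dimensional variable, one row per observation
X = np.array([[4.0, 60.0],
              [2.0, 80.0],
              [4.5, 55.0]])

mu = X.mean(axis=0)                                  # k-vector of means
diff = X - mu
sigma = (diff.T @ diff) / X.shape[0]                 # 1/N (biased) covariance estimate
sigma_unbiased = (diff.T @ diff) / (X.shape[0] - 1)  # 1/(N-1) for small samples

# np.cov with rowvar=False gives the 1/(N-1) estimate directly
print(sigma, sigma_unbiased, np.cov(X, rowvar=False), sep="\n")
```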

Variance and covariance reminder. The covariance matrix is symmetric, i.e. cov(x_1, x_k) = cov(x_k, x_1). The diagonal entries are the variances of the 1 to k individual parameters. The matrix encodes the variability between variables: uncorrelated dimensions will have low covariance.

Effect of covariance

Mixtures of Gaussians. We might represent a set of points (such as a trajectory) as a sum of a number of Gaussian components (here labelled m = 1, 2, 3). We decide how many components to use and fit them to the data. Each mixture component m has its own \mu(m) vector (length k = 2) and its own \Sigma matrix (a 2 x 2 matrix).
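To make the mixture idea concrete, here is a hedged sketch of evaluating a mixture density p(x) = \sum_m w_m N(x; \mu(m), \Sigma(m)); the weights and parameters are made-up illustrations (and, anticipating the next slide, a single shared covariance is used):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_pdf(x, weights, means, covs):
    """Density of a Gaussian mixture: sum_m w_m * N(x; mu_m, Sigma_m)."""
    return sum(w * multivariate_normal.pdf(x, mean=mu, cov=cov)
               for w, mu, cov in zip(weights, means, covs))

# Three components in k = 2 dimensions (illustrative parameters only)
weights = [0.5, 0.3, 0.2]
means = [np.array([2.0, 80.0]), np.array([4.0, 60.0]), np.array([3.0, 70.0])]
covs = [np.eye(2) * 10.0] * 3          # one shared covariance matrix for all components

print(gmm_pdf(np.array([3.5, 65.0]), weights, means, covs))
```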

Mixtures of Gaussians. Or, as a common trick with HMMs, we can use a single shared covariance matrix \Sigma across all of the mixture components.

Mixtures of Gaussians. Or we can treat a set of variables (such as the pixels in an image) as a set of Gaussian functions (one or more per pixel) to describe their variability over a set of images (e.g. over time). In this example, each pixel position has one Gaussian function, so we can view the mean and variance maps themselves as images.
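A minimal sketch of the per-pixel case, assuming the image set is a stack of T grey-scale frames held in a NumPy array (the shapes and the random stand-in data are illustrative):

```python
import numpy as np

# X: T grey-scale frames of size H x W, stacked along the first axis
T, H, W = 100, 120, 160
X = np.random.rand(T, H, W)        # stand-in for a real image sequence

mean_map = X.mean(axis=0)          # per-pixel mean, itself an H x W image
var_map = X.var(axis=0)            # per-pixel variance, also an H x W image

# Each pixel (i, j) now has a 1-D Gaussian with mean mean_map[i, j] and
# variance var_map[i, j] describing its variability over the sequence.
print(mean_map.shape, var_map.shape)
```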

The Hidden Markov Model (HMM): a reminder of the model structure from the previous lecture. [Figure: a chain of hidden states, each emitting an observable variable x_1, x_2, x_3.]

Hidden states are indexes to Gaussian mixtures. (Adapted from an example by Sam Roweis.) The data plots the time between eruptions of a geyser against the duration of each eruption; this is modelled as an HMM with 3 hidden states.

What is a Hidden Markov Model? Formally, a Hidden Markov Model is \lambda = (\pi, A, B), with the \pi vector and A matrix as before and N hidden states. The B (confusion) parameters: each state has a \mu (mean) vector of length k (2 in this case); there is one mixture per hidden state; and there is either a covariance matrix per state or (more usually) one overall covariance matrix \Sigma. NB: we now use N for the number of hidden nodes; the number of observations is now written as T.
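As a hedged sketch, the parameter set \lambda = (\pi, A, B) for a continuous-valued HMM with one Gaussian per state and a shared covariance can be held as plain NumPy arrays (all names and numerical values below are illustrative, not the lecture's fitted model):

```python
import numpy as np

N, k = 3, 2                              # N hidden states, k-dimensional observations

pi = np.full(N, 1.0 / N)                 # initial state probabilities
A = np.array([[0.8, 0.1, 0.1],           # A[i, j] = P(state j at t | state i at t-1)
              [0.2, 0.7, 0.1],
              [0.1, 0.2, 0.7]])
means = np.array([[4.0, 60.0],           # B: one mean vector per hidden state...
                  [2.0, 80.0],
                  [3.0, 70.0]])
cov = np.eye(k) * 10.0                   # ...plus one shared covariance matrix

assert np.allclose(A.sum(axis=1), 1.0)   # each row of A must sum to 1
```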

Forward evaluation (1). The observation sequence is O = \{o_1, o_2, \ldots, o_T\}. [Figure: trellis over the N hidden states linking the state at time t-1 to the state at time t, annotated with the model parameters \pi, A, the per-state means \mu(k) for k = 1, 2, 3 and the shared covariance \Sigma.]

Forward evaluation (2). Here O = \{o_1 = [4.0, 60.0]\}. At t = 1 each state i is initialised as \alpha_1(i) = \pi_i \cdot N(o_1; \mu(i), \Sigma) with \pi_i = 1/3, and the three values are then normalised to sum to 1, giving \alpha_1 \approx (0.9698, 0.0302, 0.0000), the values carried forward on the next slide. Normalisation solves the problem of the Gaussian functions' scaling constants (the raw densities can be extremely small).

Forward evaluation (3). Here O = \{o_1 = [4.0, 60.0], o_2 = [2.0, 80.0]\}. Moving from the state at time t-1 to the state at time t, each product below is \alpha_1(i) \times a_{ij}, and Z_2 is the normalisation constant:
\alpha_2(1) = \frac{1}{Z_2} \{(0.9698 \times 0) + (0.0302 \times 0) + (0.00 \times 0.9835)\} \cdot N(o_2; \mu(1), \Sigma)
\alpha_2(2) = \frac{1}{Z_2} \{(0.9698 \times 0.3047) + (0.0302 \times 0.6167) + (0.00 \times 0.0165)\} \cdot N(o_2; \mu(2), \Sigma)
\alpha_2(3) = \frac{1}{Z_2} \{(0.9698 \times 0.6953) + (0.0302 \times 0.3833) + (0.00 \times 0)\} \cdot N(o_2; \mu(3), \Sigma)
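Here is a compact sketch of this scaled forward recursion for a Gaussian-emission HMM, assuming NumPy and SciPy; the parameters reuse the illustrative layout from the earlier sketch, not the lecture's actual geyser model:

```python
import numpy as np
from scipy.stats import multivariate_normal

def forward(O, pi, A, means, cov):
    """Scaled forward evaluation: returns normalised alphas and log P(O | model)."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    log_prob = 0.0
    for t, o in enumerate(O):
        emis = np.array([multivariate_normal.pdf(o, mean=means[i], cov=cov)
                         for i in range(N)])
        if t == 0:
            alpha[t] = pi * emis
        else:
            alpha[t] = (alpha[t - 1] @ A) * emis   # sum_i alpha_{t-1}(i) * a_ij
        z = alpha[t].sum()                         # normalisation constant Z_t
        alpha[t] /= z
        log_prob += np.log(z)
    return alpha, log_prob

# Illustrative 3-state, 2-D model and a two-step observation sequence
pi = np.full(3, 1.0 / 3.0)
A = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]])
means = np.array([[4.0, 60.0], [2.0, 80.0], [3.0, 70.0]])
cov = np.eye(2) * 10.0
O = np.array([[4.0, 60.0], [2.0, 80.0]])
alphas, logp = forward(O, pi, A, means, cov)
print(alphas, logp)
```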

Coursework. The first coursework deals with using HMMs. The lab seminar dealt with practical issues around the discrete HMM case; these ideas all apply equally to continuous-valued HMMs. The coursework requires you to write forward and backward evaluation routines and use them to measure statistics for a simple visual control task.
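Since the coursework asks for both directions, here is a matching hedged sketch of the backward pass (same illustrative parameter layout as the forward sketch above; scaling conventions vary, and this version simply renormalises each beta vector):

```python
import numpy as np
from scipy.stats import multivariate_normal

def backward(O, A, means, cov):
    """Backward evaluation with per-step renormalisation (illustrative scaling)."""
    T, N = len(O), A.shape[0]
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        emis = np.array([multivariate_normal.pdf(O[t + 1], mean=means[j], cov=cov)
                         for j in range(N)])
        beta[t] = A @ (emis * beta[t + 1])   # sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        beta[t] /= beta[t].sum()             # renormalise to keep values in range
    return beta
```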

Further reading. Variable length Markov Models: D. Ron, Y. Singer and N. Tishby, "The Power of Amnesia", in Advances in Neural Information Processing Systems (NIPS), vol. 6, 1994. Extending VLMMs to work with continuous-valued data: "Learning temporal structure for task-based control"; check for latest version at

Summary. We can combine multi-dimensional Gaussian mixture models with simple HMMs to interpret continuous data. The hidden states then index the Gaussian mixtures. Forward and backward evaluation perform the same functions as in the discrete case.

Next time … Dynamic Bayes nets for recognising visual behaviour …