Perceptual and Sensory Augmented Computing – Machine Learning, Summer '09
Machine Learning – Lecture 15: Applications of Bayesian Networks
Bastian Leibe, RWTH Aachen

Course Outline
- Fundamentals (2 weeks): Bayes Decision Theory; Probability Density Estimation
- Discriminative Approaches (5 weeks): Linear Discriminants, SVMs, Boosting; Decision Trees, Random Forests, Model Selection
- Graphical Models (5 weeks): Bayesian Networks & Applications; Markov Random Fields & Applications; Exact Inference; Approximate Inference
- Regression Problems (2 weeks): Gaussian Processes

Recap: MRF Structure for Images
Basic structure: two components ("true" image content linked to noisy observations).
- Observation model
  - How likely is it that node x_i has label L_i given observation y_i?
  - This relationship is usually learned from training data.
- Neighborhood relations
  - Simplest case: 4-neighborhood.
  - Serve as smoothing terms: they discourage neighboring pixels from having different labels.
  - These can either be learned or be set to fixed "penalties".

Recap: How to Set the Potentials?
Unary potentials
- E.g. a color model, modeled with a Mixture of Gaussians.
- Learn the color distributions for each label.
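As a rough illustration (my own sketch, not from the slides): a unary potential for a label can be taken as the negative log-likelihood of a pixel's color under that label's Gaussian mixture. The function names are hypothetical, and the use of scikit-learn's GaussianMixture is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # assumption: scikit-learn is available

def learn_color_models(pixels_per_label, n_components=5):
    """Fit one Gaussian mixture per label from training pixels (N x 3 RGB arrays)."""
    return {label: GaussianMixture(n_components).fit(pixels)
            for label, pixels in pixels_per_label.items()}

def unary_potentials(image_pixels, color_models):
    """Return a (num_pixels x num_labels) matrix of unary energies:
    the negative log-likelihood of each pixel under each label's color model."""
    labels = sorted(color_models)
    U = np.stack([-color_models[l].score_samples(image_pixels) for l in labels], axis=1)
    return U, labels
```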

Recap: How to Set the Potentials?
Pairwise potentials
- Potts model
  - Simplest discontinuity-preserving model.
  - Discontinuities between any pair of labels are penalized equally.
  - Useful when the labels are unordered or the number of labels is small.
- Extension: "contrast-sensitive Potts model", which discourages label changes except in places where there is also a large change in the observations.
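A minimal sketch of both pairwise terms (my own illustration; the exponential weighting of the contrast-sensitive variant is one common choice, and the `penalty` and `sigma` values are placeholders):

```python
import numpy as np

def potts(label_i, label_j, penalty=1.0):
    """Plain Potts pairwise energy: a fixed cost whenever neighboring labels differ."""
    return 0.0 if label_i == label_j else penalty

def contrast_sensitive_potts(label_i, label_j, obs_i, obs_j, penalty=1.0, sigma=10.0):
    """Contrast-sensitive variant: the cost of a label change shrinks where the
    observations (e.g. pixel intensities) already differ strongly."""
    if label_i == label_j:
        return 0.0
    diff2 = np.sum((np.asarray(obs_i, float) - np.asarray(obs_j, float)) ** 2)
    return penalty * np.exp(-diff2 / (2.0 * sigma ** 2))
```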

Recap: Graph Cuts for Binary Problems
(Figure: s-t graph with n-links between pixels, t-links to source s and sink t, and a cut separating the two terminals.)
- EM-style optimization: the "expected" intensities of object and background can be re-estimated. [Boykov & Jolly, ICCV'01]
Slide credit: Yuri Boykov

Recap: s-t-Mincut Equivalent to Maxflow
(Figure: flow network with source, sink, and intermediate nodes v1, v2; initially Flow = 0.)
Augmenting-path based algorithms:
1. Find a path from source to sink with positive capacity.
2. Push the maximum possible flow through this path.
3. Repeat until no such path can be found.
These algorithms assume non-negative capacities.
Slide credit: Pushmeet Kohli
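A compact sketch of this augmenting-path idea (Edmonds-Karp style, my own illustration rather than the algorithm used in the lecture; the dict-of-dicts graph format is an assumption):

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Repeatedly find a shortest augmenting path (BFS) in the residual graph
    and push the bottleneck flow along it. capacity[u][v] = edge capacity."""
    # Residual graph: copy forward edges and add reverse edges with capacity 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for an augmenting path with positive residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:           # no augmenting path left -> done
            return flow
        # Collect the path, find its bottleneck, and update the residual graph.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck
```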

Recap: When Can s-t Graph Cuts Be Applied?
- s-t graph cuts can only globally minimize binary energies that are submodular.
- Submodularity is the discrete equivalent of convexity.
  - It implies that every local energy minimum is a global minimum.
  - The solution will be globally optimal.
(Figure: energy E(L) with a regional term on the t-links and a boundary term on the n-links; E(L) can be minimized by s-t graph cuts if the pairwise terms satisfy the submodularity ("convexity") condition.)
[Boros & Hammer, 2002; Kolmogorov & Zabih, 2004]
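The standard submodularity condition for a binary pairwise term is E(0,0) + E(1,1) ≤ E(0,1) + E(1,0). A tiny check, as a sketch (the table layout is my own convention):

```python
def is_submodular(pairwise):
    """Check submodularity of a binary pairwise term given as a 2x2 table
    pairwise[a][b] = E(x_i = a, x_j = b). The energy can be minimized exactly
    by an s-t graph cut iff E(0,0) + E(1,1) <= E(0,1) + E(1,0) holds for
    every pairwise term."""
    return pairwise[0][0] + pairwise[1][1] <= pairwise[0][1] + pairwise[1][0]

# Example: a Potts term with penalty 1 is submodular.
potts_term = [[0.0, 1.0], [1.0, 0.0]]
assert is_submodular(potts_term)
```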

Recap: α-Expansion Move
Basic idea:
- Break the multi-way cut computation into a sequence of binary s-t cuts (in each binary step, every pixel either keeps its current label or switches to the label α).
- The result is no longer globally optimal, but comes with guaranteed approximation quality and typically converges in a few iterations.
(Figure: the α region expanding against the other labels.)
Slide credit: Yuri Boykov

Recap: Simple Binary Image Denoising Model
MRF structure
- Example: simple energy function ("Potts model")
  - Smoothness term: fixed penalty β if neighboring labels disagree.
  - Observation term: fixed penalty η if the label and the observation disagree.
(Figure: "true" image content linked to noisy observations; "smoothness constraints" between neighboring labels, observation process between label and measurement; the energy decomposes into prior, smoothness, and observation terms.)
Image source: C. Bishop, 2006
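A minimal sketch of this energy for 0/1 labels (my own illustration; the slide itself uses ±1 labels and its formula was on an image, so the penalty values below are placeholders):

```python
import numpy as np

def denoising_energy(labels, observations, beta=1.0, eta=2.0):
    """Energy of a binary labeling for the simple denoising MRF above:
    a fixed penalty beta for every pair of 4-connected neighbors that disagree,
    plus a fixed penalty eta wherever the label disagrees with the observation.
    `labels` and `observations` are 2D arrays with values in {0, 1}."""
    labels = np.asarray(labels)
    observations = np.asarray(observations)
    smoothness = beta * (np.sum(labels[:, 1:] != labels[:, :-1]) +
                         np.sum(labels[1:, :] != labels[:-1, :]))
    observation = eta * np.sum(labels != observations)
    return smoothness + observation
```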

Converting an MRF into an s-t Graph
Conversion of the energy:
- Unary potentials are straightforward to set.
  - How? Just insert x_i = 1 and x_i = -1 into the unary terms above.
(Figure: s-t graph with source, sink, and nodes x_i, x_j.)

Converting an MRF into an s-t Graph
Conversion of the energy:
- Unary potentials are straightforward to set (see above).
- Pairwise potentials are trickier, since we don't know x_i!
  - Trick: the pairwise energy only has an influence if x_i ≠ x_j.
  - (Only!) in this case, the cut goes through the edge {x_i, x_j}.
(Figure: s-t graph with source, sink, and nodes x_i, x_j. A sketch of the construction follows below.)
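A rough sketch of this construction (my own illustration with 0/1 labels instead of the slide's ±1 convention, assuming non-negative unary costs): connect each pixel node to the source with the cost of label 1 and to the sink with the cost of label 0, and connect neighbors with the pairwise penalty, which is paid exactly when the cut separates them. The resulting dict can be fed to the max_flow sketch above; nodes remaining on the source side of the minimum cut take label 0.

```python
def mrf_to_st_graph(unary, pairwise_weight, neighbors):
    """Build the s-t graph of a binary MRF with non-negative costs.
    unary[i] = (cost of label 0, cost of label 1) for pixel i,
    neighbors = list of (i, j) pairs, pairwise_weight = Potts penalty."""
    graph = {'s': {}, 't': {}}
    for i, (cost0, cost1) in unary.items():
        graph.setdefault(i, {})
        graph['s'][i] = cost1      # this t-link is cut if i ends up with label 1
        graph[i]['t'] = cost0      # this t-link is cut if i ends up with label 0
    for i, j in neighbors:
        graph[i][j] = graph[i].get(j, 0) + pairwise_weight   # n-links, paid only
        graph[j][i] = graph[j].get(i, 0) + pairwise_weight   # when the cut separates i and j
    return graph
```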

Topics of This Lecture
Learning Bayesian Networks
- Learning with known structure, full observability
- Learning with known structure, partial observability
- Structure learning
Modelling and Applying Bayesian Networks
- Case study: a Bayes net for pedestrian detection
- Defining the network structure
- Defining the variables
- Modelling the CPTs
- Results & discussion

Bayesian Networks
What we've learned so far…
- We know they are directed graphical models.
- Their joint probability factorizes into conditional probabilities.
- We know how to convert them into undirected graphs.
- We know how to perform inference for them:
  - Sum-/Max-Product BP for exact inference in (poly)tree-shaped BNs.
  - Loopy BP for approximate inference in arbitrary BNs.
  - The Junction Tree algorithm for converting arbitrary BNs into trees.
But what are they actually good for?
- How do we apply them in practice?
- And how do we learn their parameters?
Image source: C. Bishop, 2006

Parameter Learning in Bayesian Networks
We need to specify two things:
- the structure of the Bayesian network (graph topology),
- the parameters of each conditional probability table (CPT).
It is possible to learn both from training data.
- But learning structure is much harder than learning parameters.
- Also, learning when some nodes are hidden is much harder than when everything is observable.
Four cases:
  Structure | Observability | Method
  Known     | Full          | Maximum Likelihood Estimation
  Known     | Partial       | EM (or gradient ascent)
  Unknown   | Full          | Search through model space
  Unknown   | Partial       | EM + search through model space

Learning Parameters
Example:
- Assume each variable x_i is discrete and can take K_i values.
- The parameters of this model can be represented with 4 tables (called conditional probability tables, CPTs):
  - p(x_1 = k) = θ_{1,k}; θ_1 has K_1 entries.
  - p(x_2 = k' | x_1 = k) = θ_{2,k,k'}; θ_2 has K_1 × K_2 entries.
  - p(x_3 = k' | x_1 = k) = θ_{3,k,k'}
  - p(x_4 = k' | x_2 = k) = θ_{4,k,k'}
  - Note that each table must be normalized over the child's values for every parent configuration, e.g. Σ_{k'} θ_{2,k,k'} = 1.
Slide credit: Zoubin Ghahramani
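A minimal illustration of such CPTs as arrays (all names and cardinalities are my own placeholders): each table's last axis ranges over the child variable and must sum to one.

```python
import numpy as np

K1, K2, K3, K4 = 2, 3, 2, 2          # assumed cardinalities of x1..x4
rng = np.random.default_rng(0)

def random_cpt(*shape):
    """Random table normalized over the last axis (the child variable)."""
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)

theta1 = random_cpt(K1)              # p(x1)
theta2 = random_cpt(K1, K2)          # p(x2 | x1)
theta3 = random_cpt(K1, K3)          # p(x3 | x1)
theta4 = random_cpt(K2, K4)          # p(x4 | x2)

assert np.allclose(theta2.sum(axis=-1), 1.0)   # normalization constraint
```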

Case 1: Known Structure, Full Observability
- Assume a training data set D of complete observations.
- How do we learn θ from D?
- Maximum likelihood: maximize p(D | θ) = Π_n Π_i p(x_i^(n) | x_pa(i)^(n), θ_i).
- Maximum log-likelihood: maximize Σ_n Σ_i log p(x_i^(n) | x_pa(i)^(n), θ_i).
Slide credit: Zoubin Ghahramani

Case 1: Known Structure, Full Observability
Maximum log-likelihood:
- This decomposes into a sum of functions of the θ_i.
- Each θ_i can be optimized separately: θ*_{i,k,k'} = n_{i,k,k'} / Σ_{k''} n_{i,k,k''}, where n_{i,k,k'} is the number of times in D that x_i = k' and x_pa(i) = k.
ML solution
- Simply calculate frequencies! (A small counting sketch follows below.)
Slide credit: Zoubin Ghahramani
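A sketch of this frequency-counting estimate for one CPT (my own illustration; the optional pseudo-counts are an addition, not part of the slide):

```python
import numpy as np

def ml_cpt(child_values, parent_values, K_child, K_parent, alpha=0.0):
    """Maximum-likelihood CPT estimate by counting: theta[k, k'] estimates
    p(child = k' | parent = k) as n_{k,k'} / sum_{k''} n_{k,k''}.
    `alpha` optionally adds pseudo-counts (Laplace smoothing)."""
    counts = np.full((K_parent, K_child), alpha, dtype=float)
    for k, k_prime in zip(parent_values, child_values):
        counts[k, k_prime] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

# Usage: estimate p(x2 | x1) from paired observations of x1 and x2.
x1 = [0, 0, 1, 1, 1, 0]
x2 = [2, 0, 1, 1, 2, 0]
theta2_hat = ml_cpt(x2, x1, K_child=3, K_parent=2)
```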

Case 2: Known Structure, Hidden Variables
ML learning with hidden variables
- Assume a model parameterized by θ with observed variables X and hidden (latent) variables Z.
Goal
- Maximize the parameter log-likelihood given the observed data.
EM algorithm: iterate between two steps.
- E-step: fill in the hidden / missing variables.
- M-step: apply complete-data learning to the filled-in data.
(Figure: a network with hidden nodes Z1, Z2, Z3 and observed node X.)
Slide adapted from Zoubin Ghahramani

Learning with Hidden Variables: EM Algorithm
Goal:
- Maximize the parameter log-likelihood given the observed data.
EM algorithm: derivation
- We do not know the values of the latent variables in Z, but we can express their posterior distribution given X and (an initial guess for) θ.
- E-step: evaluate the posterior p(Z | X, θ^(old)).
- Since we cannot use the complete-data log-likelihood directly, we maximize its expected value under the posterior distribution of Z.
- M-step: maximize this expected complete-data log-likelihood with respect to θ.

Learning with Hidden Variables: EM Algorithm
Note on the E-step:
- The E-step requires solving the inference problem, i.e. finding the distribution over the hidden variables given the current model parameters.
- This can be done using belief propagation or the junction tree algorithm.
- As inference becomes a subroutine of the learning procedure, fast inference algorithms are crucial!
Slide adapted from Bernt Schiele

Example Application
Mixture-of-Gaussians fitting with EM
- Standard application of EM.
- Corresponding Bayesian network: (figure)
Important point here
- Bayesian networks can be treacherous!
- They hide the true complexity in a very simple-looking diagram.
- E.g. the diagram here only encodes the information that there is a latent variable θ on which the observed variables x_i depend.
  - The information that p(x_i | θ) is represented by a mixture of Gaussians needs to be communicated additionally!
  - On the other hand, this general framework can also be used to apply EM for other types of distributions or latent variables.
(A minimal EM sketch for a Gaussian mixture follows below.)
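A compact 1-D mixture-of-Gaussians EM sketch (my own illustration, not the lecture's code; the initialization scheme and iteration count are arbitrary choices):

```python
import numpy as np

def gmm_em(x, K=2, n_iter=50, seed=0):
    """Minimal EM for a 1-D mixture of Gaussians.
    Returns mixing weights, means, and variances."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Crude initialization: random means, global variance, uniform weights.
    means = rng.choice(x, size=K, replace=False)
    variances = np.full(K, x.var() + 1e-6)
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] = p(z_n = k | x_n, theta).
        log_prob = (-0.5 * ((x[:, None] - means) ** 2) / variances
                    - 0.5 * np.log(2 * np.pi * variances) + np.log(weights))
        log_prob -= log_prob.max(axis=1, keepdims=True)   # numerical stability
        r = np.exp(log_prob)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters from the expected assignments.
        Nk = r.sum(axis=0)
        weights = Nk / N
        means = (r * x[:, None]).sum(axis=0) / Nk
        variances = (r * (x[:, None] - means) ** 2).sum(axis=0) / Nk + 1e-6
    return weights, means, variances
```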

Summary: Learning with Known Structure
ML learning with complete data (no hidden variables)
- The log-likelihood decomposes into a sum of functions of the θ_i.
- Each θ_i can be optimized separately.
- ML solution: simply calculate frequencies.
ML learning with incomplete data (hidden variables)
- Iterative EM algorithm.
- E-step: compute the expected counts E[n_{i,j,k} | D, θ^(t)] given the previous parameter setting θ^(t).
- M-step: re-estimate the parameters θ using the expected counts.
Slide credit: Bernt Schiele

Cases 3+4: Unknown Structure
Goal
- Learn a directed acyclic graph (DAG) that best explains the data.
Constraint-based learning
- Use statistical tests of marginal and conditional independence.
- Find the set of DAGs whose d-separation relations match the results of the conditional independence tests.
Score-based learning
- Use a global score such as BIC (Bayesian Information Criterion); see the small scoring sketch below.
- Find a structure that maximizes this score.
Slide adapted from Zoubin Ghahramani
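A minimal sketch of the BIC score as it is typically used for comparing candidate structures (one common sign convention; other texts use −2× this quantity — the numbers in the example are made up):

```python
import numpy as np

def bic_score(log_likelihood, num_free_params, num_samples):
    """BIC score of a candidate network structure: the maximized log-likelihood
    minus a complexity penalty that grows with the number of free CPT parameters."""
    return log_likelihood - 0.5 * num_free_params * np.log(num_samples)

# Example: comparing two hypothetical structures on N = 1000 samples.
print(bic_score(-1250.0, num_free_params=12, num_samples=1000))
print(bic_score(-1240.0, num_free_params=40, num_samples=1000))
```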

Cases 3+4: Unknown Structure
Extremely hard problem
- NP-hard.
- The number of DAGs on N variables is super-exponential in N.
  - 4 nodes: 543 DAGs.
  - 10 nodes: O(10^18) DAGs.
- Need to use heuristics to prune down the search space and efficient methods to evaluate hypotheses.
Additional problem: often not enough data is available.
- Need to make decisions about statistical conditional independence.
- Typically only feasible if the structure is relatively simple and a lot of data is available…

Example Application
Analyzing gene expression from micro-array data
- 1000s of measurement spots (probes) on a micro-array, each sensitive to a specific DNA marker (e.g. a section of a gene).
- The probes measure whether the corresponding gene is expressed (= active).
- Collect samples from patients with a certain disease or condition.
- Monitor 1000s of genes simultaneously.
Interesting questions
- Is there a statistical relationship between certain gene expressions?
- If so, can we derive the structure by which they influence each other?
(Figure: micro-array with ~40k probes. Image source: Wikipedia)

Topics of This Lecture
Learning Bayesian Networks
- Learning with known structure, full observability
- Learning with known structure, partial observability
- Structure learning
Modelling and Applying Bayesian Networks
- Case study: a Bayes net for pedestrian detection
- Defining the network structure
- Defining the variables
- Modelling the CPTs
- Results & discussion

Case Study: Improving Object Detection
Goal
- Detect objects of a certain category in real-world images, e.g. pedestrians.
Typical results: (figure: example images with true detections, missed detections, and false positives)
Slide credit: Derek Hoiem

Idea: Use Scene Geometry
Idea
- If we know something about the scene geometry, we can use those constraints to improve the detection results.
- Objects of interest occur on the ground plane.
- Objects have a class-specific size distribution.
(Figure: image view vs. world view. Image source: Derek Hoiem)

Idea: Use Scene Geometry
Procedure
- Take the footpoint of the object's bounding box in the image.
- Cast a sight ray through this point and intersect it with the ground plane → distance to the object in 3D.
- Do the same for the top point → object size in 3D.
Image source: Derek Hoiem

Important Caveat
Important to avoid hard decisions
- If we restrict detections to lie on the ground plane and our estimate for the ground plane is wrong, then everything will fail!
- Integrate the information probabilistically.
- This can be done with a Bayesian network.
In the following, we will analyze the approach by
- A. Ess, B. Leibe, L. Van Gool, Depth and Appearance for Mobile Scene Analysis, ICCV'07, 2007,
and see
- how such a problem can be modeled in a Bayesian network framework,
- how the model can be learned,
- and what level of complexity can be hidden in seemingly simple graphical models.

Approach
Combine several different cues:
- ground plane measurements from stereo depth,
- appearance-based object detections,
- depth measurements to verify the detections.
Goal
- Find the best combined solution for the ground plane and its supporting detections.
(Figure: ground plane measurements, object detections, depth verification.)

Bayesian Network
Elements
- Ground plane: π
- Ground plane measurements: π_D
- Object detections: o_i
- Depth verification: d_i
What would be a good structure?

Bayesian Network
Elements
- Ground plane: π
- Ground plane measurements: π_D
- Object detections: o_i
- Depth verification: d_i
What about this one?

What Should This Network Express?
Interpretation
- The entire network is parameterized over the ground plane π.
- For a certain state of π, we can measure how consistent this state is with the ground plane measurements π_D.
- We have a set of object detections (bounding boxes) {o_i} whose validity depends on the ground plane parameters π.
- Each object detection also has a depth measurement d_i.

What Should This Network Express?
How exactly should the {o_i} depend on π and d_i?
Idea 1:
- We get a distance estimate from the ground plane (through the bounding box footpoint) and a distance estimate from the stereo depth map.
- Check whether those two distance estimates are consistent.
- Dependency: (figure: distance from stereo vs. distance from the ground plane)

What Should This Network Express?
How exactly should the {o_i} depend on π and d_i?
Idea 2:
- From the ground plane and the object's bounding box, we also get a height estimate for the object in the world.
- Evaluate this under a population distribution.
- Dependency: (figure: distance from the ground plane, height distribution of pedestrians)

What Should This Network Express?
How exactly should the {o_i} depend on π and d_i?
Idea 3:
- We can also analyze the depth values inside the detection bounding box and check whether their distribution is typical for a real pedestrian.
- Train a classifier on that (logistic regression).
- Dependency: (figure: distance from stereo; % depth inliers for correct and incorrect detections)

What Should This Network Express?
How exactly should the {o_i} depend on π and d_i?
- Now we have three different dependencies we could model.
- We cannot take all three in a Bayesian network!
- We need to redesign…

What Should This Network Express?
Idea
- Split up o_i into a consistency variable c_i and a validity variable v_i.
- c_i checks whether the distances are consistent.
- v_i verifies the object size and depth distribution.

What Should This Network Express?
Factorization of the joint
- Given this network structure, the joint probability factorizes into the ground plane prior p(π), the measurement term p(π_D | π), and per-detection terms over d_i, c_i, and v_i.
- Next, we need to define those probabilities and find ways to learn them.
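As a rough illustration of what evaluating this factorized joint means in code (the exact per-detection factorization below is my assumption — I treat each v_i as depending on π, c_i, and d_i — and all function names are hypothetical):

```python
import math

def log_joint(pi, pi_D, detections, cpts):
    """Log of the factorized joint for one ground-plane state `pi`.
    `detections` is a list of dicts with keys 'c', 'v', 'd';
    `cpts` is a dict of callables, one per factor (hypothetical interface)."""
    logp = math.log(cpts['prior_pi'](pi))
    logp += math.log(cpts['pi_D_given_pi'](pi_D, pi))
    for det in detections:
        logp += math.log(cpts['prior_d'](det['d']))
        logp += math.log(cpts['c_given_pi_d'](det['c'], pi, det['d']))
        logp += math.log(cpts['v_given_pi_c_d'](det['v'], pi, det['c'], det['d']))
    return logp
```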

What Should This Network Express?
Necessary check
- Is this network complete and feasible?
- In order to verify that, we need to check whether we can define the variable states and CPTs.
- If this is not possible yet, we need to reconsider…

Defining the Variables…
Ground plane π
- In order to specify a ground plane, we need to define a normal vector n and a distance d according to the plane equation n^T x = d.
- Since n is normalized, we only need to store 2 parameters for it: represent it in terms of the polar angles θ and φ.
- Three parameters in total.

Defining the Variables…
Ground plane π (cont'd)
- Discretize each of the 3 parameters to get a 3D grid with 6 × 6 × 20 = 720 cells.
- Those grid cells represent the possible states the ground plane can take on.
- This is quite a large state space!
  - Necessary to make detailed decisions about different ground planes.
  - But many parameters need to be learned.
Learning p(π)
- Process several sequences with a moving camera setup and annotated ground plane coordinates (from Structure-from-Motion).
  - Here: 1,600 frames.
(Figure: the learned prior.)
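A small sketch of such a discretized state space (the parameter ranges below are my own placeholders, not the values used in the lecture):

```python
import numpy as np

theta_bins = np.linspace(-0.1, 0.1, 6)      # first polar angle of the normal (rad)
phi_bins   = np.linspace(-0.1, 0.1, 6)      # second polar angle (rad)
d_bins     = np.linspace(0.8, 1.6, 20)      # distance of the plane (m)

# Every combination of bin values is one state pi; the prior p(pi) is then a
# 720-entry table estimated from annotated training sequences.
states = [(t, p, d) for t in theta_bins for p in phi_bins for d in d_bins]
prior = np.full(len(states), 1.0 / len(states))   # uniform until learned
assert len(states) == 6 * 6 * 20 == 720
```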

Defining the Variables…
Ground plane measurements π_D
- We want to verify the consistency of each ground plane hypothesis π with the stereo depth measurements.
- Therefore, also define π_D over the same discretized state space as π.
(Figure: ground plane measurements.)

Defining the Variables…
Depth verification d_i
- We can only use the depth verification for an object bounding box if we get confident stereo depth measurements there.
- Define d_i as a binary variable with values {0, 1} = {unavailable, available}.
(Figure: stereo depth verification.)

Defining the Variables…
Object distance consistency c_i
- This should determine whether the two distance estimates from the two cues are consistent.
- Define c_i as a binary variable with values {0, 1} = {inconsistent, consistent}.
(Figure: distance from stereo vs. distance from the ground plane.)

Defining the Variables…
Object validity v_i
- This should include the additional verification by the inferred object height and the stereo depth-based classifier.
- If p(v_i | π) is low but p(v_i | d_i) is high, then the current ground plane estimate may be wrong…
- Define v_i as a binary variable with values {0, 1} = {invalid, valid}.
(Figure: depth-based classifier; height from the ground plane.)

Defining the CPTs…
For each link, we need to specify the dependency.
- It is defined by the corresponding conditional probability table (CPT).
- The dimensionality directly follows from the variable dimensions.
- We need to think about what we want to model here and how…
In addition, we need to define how to set the observed variables.

Defining the CPTs…
What should the observations π_D contain?
- Measure for each ground plane hypothesis the consistency with the depth measurements (robust least-median-of-squares residual).
- Convert this into the value range [0, 1].
- This defines π_D.
(Figure: ground plane measurements.)

Defining the CPTs…
Now define p(π_D | π)
- The CPT has extremely high dimensionality! Ooops!
- But we can assume that we only have dependencies between corresponding cells.
- Set p(π_D | π) to the identity matrix.

Defining the CPTs…
Next, learn the prior p(d_i)
- Use a training set with hand-annotated pedestrian bounding boxes.
- Measure how often the bounding box contains useful depth values.
- Experimentally found: p(d_i = 1) ≈ …

Defining the CPTs…
Define the consistency measure p(c_i | π, d_i)
- Set p(c_i | π, d_i = 0) to uniform, since an inaccurate depth map gives no information about an object's presence.
- Learn p(c_i | π, d_i = 1) from an annotated training set: model the acceptable distance deviation between the two cues by a Gaussian.
- Evaluate this for each value of π!

Defining the CPTs…
Define the first validity measure p(v_i | π, c_i)
- For each ground plane hypothesis, evaluate the object height under a population distribution (modeled by a Gaussian, parameters learned from training data).
(Figure: height distribution of pedestrians.)

Defining the CPTs…
Define the second validity measure p(v_i | d_i)
- Measure the number of pixels in the bounding box that can be considered uniform in depth.
- Train a classifier for p(v_i | d_i = 1) using logistic regression.
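A small sketch of such a logistic-regression CPT entry (my own illustration: the single depth-inlier feature, the training values, and the use of scikit-learn's LogisticRegression are all assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # assumption: scikit-learn available

# Feature: fraction of bounding-box pixels whose depth is consistent with a
# compact object in front of the camera. Labels: valid vs. invalid detection.
depth_inlier_fraction = np.array([[0.05], [0.10], [0.30], [0.55], [0.70], [0.85]])
is_valid_detection    = np.array([  0,      0,      0,      1,      1,      1  ])

clf = LogisticRegression().fit(depth_inlier_fraction, is_valid_detection)

def p_valid_given_depth(inlier_fraction):
    """Return p(v_i = 1 | d_i = 1) for a new detection's depth-inlier fraction."""
    return clf.predict_proba([[inlier_fraction]])[0, 1]
```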

Defining the CPTs…
Now everything is defined.
- Was more effort than you thought, wasn't it?
This again emphasizes an important point:
- A BN's simple structure can hide the true complexity!
- It is possible to encode powerful effects.
- But you need to look closely at what's happening inside…

Results: Object & Ground-Plane Reasoning
Effect:
- Reliable detections from scene context.
- Accurate 3D positioning from the depth map.
(Figure: result sequences and the mobile recording setup. [Ess, Leibe, Van Gool, ICCV'07])

Discussion

References and Further Reading
The Bayes net we discussed is described in detail in the following paper:
- A. Ess, B. Leibe, L. Van Gool, Depth and Appearance for Mobile Scene Analysis, ICCV 2007.

Case 2: Known Structure, Hidden Variables
ML learning with hidden variables
- Assume a model parameterized by θ with observed variables Y and hidden (latent) variables X.
Goal
- Maximize the parameter log-likelihood given the observed data.
EM algorithm: iterate between two steps.
- E-step: fill in the hidden / missing variables.
- M-step: apply complete-data learning to the filled-in data.
Slide credit: Bernt Schiele

Learning with Hidden Variables: EM Algorithm
Goal:
- Maximize the parameter log-likelihood given the observed data.
EM algorithm: derivation
- The Kullback-Leibler divergence KL(p ‖ q) measures the additional information needed to specify the value of x using q(x) when p(x) represents the true distribution of the data.
- By minimizing the KL divergence (= maximizing the negative KL), we can force q(X) to become close to the posterior p(X | Y, θ).
Slide credit: Bernt Schiele
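For completeness, the standard decomposition behind this argument (reconstructed here from the textbook form of the derivation, not copied from the slide images) is

```latex
\ln p(Y \mid \theta) \;=\; \mathcal{F}(q, \theta) \;+\; \mathrm{KL}\!\left(q(X)\,\|\,p(X \mid Y, \theta)\right),
\qquad
\mathcal{F}(q, \theta) \;=\; \sum_{X} q(X)\,\ln \frac{p(Y, X \mid \theta)}{q(X)} .
```

Since the KL term is non-negative, F(q, θ) is a lower bound on the log-likelihood: the E-step closes the gap by choosing q(X) = p(X | Y, θ^(t)), and the M-step then raises the bound with respect to θ.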

Learning with Hidden Variables: EM Algorithm
EM algorithm: derivation
E-step:
- Maximize F(q(X) | θ^(t)) w.r.t. q(X) holding θ^(t) fixed; the maximum is attained at q(X) = p(X | Y, θ^(t)).
M-step:
- Maximize F(q(X) | θ^(t)) w.r.t. θ holding q(X) fixed.
Slide credit: Bernt Schiele