Data-Driven Markov Chain Monte Carlo

Data-Driven Markov Chain Monte Carlo. Presented by Tomasz Malisiewicz for Advanced Perception, 3/1/2006. This is a presentation of the paper "Image Segmentation by Data-Driven Markov Chain Monte Carlo" by Tu and Zhu, PAMI 2002. Presenter: Tomasz Malisiewicz (tomasz@cmu.edu), PhD student at CMU's Robotics Institute.

Overview of Talk
- What is image segmentation? How do we find a good segmentation? DDMCMC results
- Image segmentation in a Bayesian statistical framework
- Markov Chain Monte Carlo for exploring the space of all segmentations
- Data-driven methods for exploiting image data and speeding up MCMC

DDMCMC Motivation
- Iterative approach: consider many different segmentations and keep the good ones.
- Few tunable parameters; e.g., the number of segments is encoded in the prior rather than fixed by hand.
- DDMCMC vs. Ncuts.

Berkeley Segmentation Database, image 326038. [Figure: side-by-side comparison of the Berkeley human segmentation, Ncuts with K=30, and DDMCMC.] You have to set K for Ncuts; this Ncuts result was generated using the Berkeley Segmentation Engine.

Why a rigorous formulation? It allows us to define precisely what we want the segmentation algorithm to return, by assigning a score to every candidate segmentation.

Formulation #1 (and you thought you knew what image segmentation was). Image lattice: $\Lambda$ (the set of pixel sites); image: $\mathbf{I}$ defined on $\Lambda$. The lattice is partitioned into $K$ disjoint regions: $\Lambda = \cup_{i=1}^{K} R_i$ with $R_i \cap R_j = \emptyset$ for $i \neq j$, so every point belongs to exactly one region. A region $R_i$ is a discrete label map; its boundary $\Gamma_i = \partial R_i$ is continuous. An image partition into disjoint regions is not yet an image segmentation: region contents are key! Regions are treated as discrete label maps (easier for maintaining topology), while boundaries are treated as continuous curves (easier for diffusion).

Formulation #2 (and you thought you knew what image segmentation was). Each image region $\mathbf{I}_{R_i}$ is a realization from a probabilistic model $p(\mathbf{I}_{R_i} \mid \Theta_i)$, where $\Theta_i$ are the parameters of a model indexed by $\ell_i$. A segmentation is denoted by a vector of hidden variables $W = (K, \{(R_i, \ell_i, \Theta_i) : i = 1, \dots, K\})$, where $K$ is the number of regions. Bayesian framework: posterior $\propto$ likelihood $\times$ prior, i.e. $p(W \mid \mathbf{I}) \propto p(\mathbf{I} \mid W)\, p(W)$, defined over the space of all segmentations. (We will characterize the space of all segmentations in a later slide.)

Prior over segmentations (do you like exponentials?). The prior is a product of exponential terms, one factor per region, expressing what we want:
- fewer regions
- small regions
- round-ish regions
- less complex models (fewer model parameters $|\Theta_i|$)
The prior over model types is roughly uniform. The product form corresponds to the assumption that the priors for the regions are independent. $|\Theta_i|$ is the number of parameters of $\Theta_i$; the exponent $c$ is hard-coded to 0.9; $\gamma$ is the only free parameter that the authors actually vary. The constants $\lambda_0$, $\mu$, and $\nu$ also appear in the prior but are not defined in the paper.
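Assembling the penalty terms named above into one formula gives a hedged sketch of the prior's general shape; the paper's exact grouping of the constants $\lambda_0$, $\mu$, $\nu$, $\gamma$ may differ (and, as noted, some are not defined there):

\[
p(W) \;\propto\; \underbrace{e^{-\lambda_0 K}}_{\text{fewer regions}} \;\prod_{i=1}^{K}\; \underbrace{e^{-\gamma |R_i|^{c}}}_{\text{small regions},\; c = 0.9} \;\underbrace{e^{-\mu |\partial R_i|}}_{\text{round-ish regions}} \;\underbrace{e^{-\nu |\Theta_i|}}_{\text{simpler models}}
\]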

Likelihood for Images. Visual patterns are independent stochastic processes: $p(\mathbf{I} \mid W) = \prod_{i=1}^{K} p(\mathbf{I}_{R_i} \mid \ell_i, \Theta_i)$, where $\ell_i$ is the model-type index, $\Theta_i$ is the model parameter vector, and $\mathbf{I}_{R_i}$ is the image appearance in the i-th region. There are separate model families for grayscale and color images.

Four Gray-level Models. The gray-level model space pairs each visual pattern with a model family:
- Uniform → Gaussian
- Clutter → intensity histogram
- Texture → filter-bank (FB) response histogram
- Shading → B-spline
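As a concrete illustration, here is a minimal sketch (not the authors' code; the function and variable names are mine) of the simplest of these likelihoods: a "uniform" region modeled as i.i.d. Gaussian intensities.

```python
import numpy as np

def uniform_region_loglik(intensities, mu, sigma):
    """Log-likelihood of a 'uniform' region: every pixel intensity is
    modeled as an independent draw from N(mu, sigma^2)."""
    z = (intensities - mu) / sigma
    return np.sum(-0.5 * z**2 - np.log(sigma * np.sqrt(2.0 * np.pi)))

# Toy usage: a nearly-flat region scores well under its fitted Gaussian.
region = np.array([100.0, 101.0, 99.0, 100.5, 100.2])
print(uniform_region_loglik(region, region.mean(), region.std() + 1e-6))
```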

Three Color Models. Color is represented in (L*, u*, v*) space; the color model space contains:
- Gaussian
- mixture of 2 Gaussians
- Bezier spline

Calibration. Likelihoods are calibrated using an empirical study. Calibration is required to make likelihoods for different model families comparable, which is necessary for model competition. Principled, or a hack? (The presenter's view: this is a hack.)

What did we just do? A summary of the formulation slides: a segmentation $W$ is defined by a $K$-partition of the lattice plus a model $(\ell_i, \Theta_i)$ for each region; the score (probability) of a segmentation is its posterior $p(W \mid \mathbf{I})$; and the likelihood of the image is the product of the region likelihoods.

What do we do with scores? Search for segmentations that score well.

Search through what? Anatomy of the solution space: the space of all segmentations (the scene space) is the union over $K$ of all $K$-segmentation spaces; a $K$-segmentation space is the product of one $K$-partition space (a subset of the general partition space) and $K$ spaces for the image models; and each model space is itself made up of cue spaces.

Searching through segmentations:
- Exhaustive enumeration of all segmentations: takes too long!
- Greedy search (gradient ascent): gets stuck in local optima!
- Stochastic search: takes too long.
- MCMC-based exploration: described in the rest of this talk!

Why MCMC? What is it, and what does it do?
- A clever way of searching through a high-dimensional space.
- A general-purpose technique for generating samples from a probability distribution.
- It iteratively searches through the space of all segmentations by constructing a Markov chain that converges to the desired stationary distribution, here the posterior $p(W \mid \mathbf{I})$.

Designing Markov Chains. Three requirements on the chain:
- Ergodic: from an initial segmentation $W_0$, any other state $W$ can be visited in finite time (no greedy algorithms); ensured by the jump-diffusion dynamics.
- Aperiodic: ensured by the random choice of dynamics.
- Detailed balance: every move is reversible.
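The slide does not spell out the acceptance rule, but detailed balance is exactly what the standard Metropolis-Hastings acceptance probability enforces: in the notation above, a proposed move from $W$ to $W'$ is accepted with probability

\[
\alpha(W \to W') \;=\; \min\!\left(1,\; \frac{q(W' \to W)\, p(W' \mid \mathbf{I})}{q(W \to W')\, p(W \mid \mathbf{I})}\right),
\]

where $q(\cdot \to \cdot)$ are the proposal probabilities that the data-driven methods of the later slides are designed to sharpen.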

5 Dynamics:
1. Boundary diffusion
2. Model adaptation
3. Split region
4. Merge region
5. Switch region model
At each iteration, we choose a dynamic at random with probabilities q(1), q(2), q(3), q(4), q(5), as sketched below.
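A minimal sketch of that outer loop, assuming hypothetical `propose` and `posterior` callables; all names and the selection probabilities here are mine, not the paper's:

```python
import random

# Assumed (made-up) probabilities of picking each dynamic at an iteration.
DYNAMIC_PROBS = {"boundary_diffusion": 0.3, "model_adaptation": 0.3,
                 "split": 0.15, "merge": 0.15, "switch_model": 0.1}

def mcmc_step(W, propose, posterior):
    """One DDMCMC-style iteration: pick a dynamic, propose W', accept/reject.

    propose[name](W) -> (W_new, q_fwd, q_bwd): hypothetical proposal function
        returning the proposed state plus forward/backward proposal probs.
    posterior(W) -> p(W | I) up to a normalizing constant.
    """
    names = list(DYNAMIC_PROBS)
    name = random.choices(names, weights=[DYNAMIC_PROBS[n] for n in names])[0]
    W_new, q_fwd, q_bwd = propose[name](W)
    # Metropolis-Hastings acceptance: this is what gives detailed balance.
    alpha = min(1.0, (q_bwd * posterior(W_new)) / (q_fwd * posterior(W)))
    return W_new if random.random() < alpha else W
```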

Dynamics 1: Boundary Diffusion. This is diffusion, i.e. movement within the partition space: the boundary curve $\{x(s), y(s)\}$ between regions $i$ and $j$ moves by the steepest-ascent equation of $\log p(W \mid \mathbf{I})$, plus Brownian motion along the curve normal. The Brownian term is normally distributed, and its temperature decreases over time.
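As a toy illustration, one Langevin-style update of the boundary points might look like the following sketch; the step size, temperature, and gradient callable are stand-ins I introduce, not quantities from the paper:

```python
import numpy as np

def diffuse_boundary(points, normals, grad_logp, step, temperature):
    """One Langevin-style diffusion step for a region boundary.

    points:    (N, 2) boundary coordinates {x(s), y(s)}
    normals:   (N, 2) unit normals to the curve at each point
    grad_logp: callable; grad_logp(points) gives the (N,) derivative of
               log p(W | I) with respect to motion along each normal
    """
    drift = step * grad_logp(points)                    # steepest ascent term
    noise = np.sqrt(2.0 * step * temperature) * np.random.randn(len(points))
    return points + (drift + noise)[:, None] * normals  # move along normals
```

Annealing the temperature toward zero makes late moves closer to pure gradient ascent, matching the slide's "temperature decreases over time".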

Dynamics 2: Model Adaptation. Fit the parameters of a region's model by steepest ascent; this is movement within the cue space. (Presenter's aside: how is this not greedy? It appears to be greedy.)

Dynamics 3-4: Split and Merge. Split one region into two; the remaining variables are unchanged. The probability of the proposed split is the conditional probability of how likely the chain is to propose moving from W to W'; q(3) is just the probability of choosing dynamic 3. The data-driven speedup (described later) enters through this proposal.

Dynamics 3-4: Split and Merge. Merge two regions into one; the remaining variables are unchanged. The probability of the proposed merge is defined analogously, again with a data-driven speedup; q(4) is just the probability of choosing dynamic 4.

Dynamics 5: Model Switching. Change the model type of a region. The proposal probabilities again include a data-driven speedup; q(5) is just the probability of choosing dynamic 5.

Motivation of DD (data-driven methods):
- Region splitting: how do we decide where to split a region?
- Model switching: once we switch to a new model, what parameters do we jump to?
- Model adaptation: required some initial parameter vector.

Data-Driven Methods. Focus on boundaries and model parameters derived from the data, computed before MCMC starts:
- Cue particles: clustering in model space.
- K-partition particles: edge detection.
Particles encode probabilities Parzen-window style.

Cue Particles in Action: clustering in color space. [Figure: a few color clusters in (L*, u*, v*) space; the size of each ball represents the weight of the particle.] Each cluster has an associated saliency map.
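A hedged sketch of how such cue particles could be computed; I use plain k-means as a stand-in clustering method (this slide does not say which clustering algorithm the paper uses), with cluster populations as particle weights:

```python
import numpy as np

def color_cue_particles(pixels_luv, k=5, iters=20, seed=0):
    """Cluster pixel colors in (L*,u*,v*) space; return (centers, weights).

    Plain k-means: each cluster center acts as a 'cue particle', and its
    weight is the fraction of pixels assigned to it.
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels_luv, dtype=float)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # Assign every pixel to its nearest center.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its cluster (skip empty clusters).
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    weights = np.bincount(labels, minlength=k) / len(pixels)
    return centers, weights
```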

K-partition Particles in Action Edge detection gives us a good idea of where we expect a boundary to be located

Particles, or Parzen window locations? What is this particle business about? A particle is just the position of a Parzen window (kernel) used for density estimation. (Parzen windowing is also known as kernel density estimation, a form of non-parametric density estimation.) [Figure: 1D particles and the resulting density estimate.]
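To make the idea concrete, here is a minimal weighted kernel density estimate in 1D; the bandwidth and particle positions below are illustrative only:

```python
import numpy as np

def parzen_density(x, particles, weights=None, bandwidth=0.1):
    """Weighted Gaussian kernel density estimate at points x.

    Each particle contributes a Gaussian bump centered at its position;
    the density is the weighted sum of the bumps.
    """
    x = np.asarray(x, dtype=float)
    particles = np.asarray(particles, dtype=float)
    if weights is None:
        weights = np.full(len(particles), 1.0 / len(particles))
    z = (x[:, None] - particles[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return bumps @ weights

# Toy usage: density from three 1D particles.
xs = np.linspace(0.0, 1.0, 101)
print(parzen_density(xs, particles=[0.2, 0.5, 0.8]).round(2))
```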

Are you awake? What did we just do? Scores (the probability of a segmentation), search via 5 MCMC dynamics, and data-driven speedup (the key to making MCMC work in finite time). So what type of answer does the Markov chain return? What can we do with this answer? How many answers do we want?

Multiple Solutions. MAP gives us one solution; the output of MCMC sampling can give more. How do we get multiple solutions? Parzen windows again, this time over scene particles (segmentations). This requires a distance between two segmentations W1 and W2, which the last paragraph of the paper defines in a rather unprincipled way: they simply accumulate the difference in the number of regions of W1 and W2 and the difference in the types of image models used at each pixel by W1 and W2, as sketched below.
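A schematic sketch of that distance as the slide describes it; the representation of W and the equal weighting of the two terms are my assumptions:

```python
import numpy as np

def segmentation_distance(W1, W2):
    """Schematic distance between two segmentations, as described above:
    difference in region counts plus per-pixel disagreement in model type.

    Each W is assumed to be a dict with:
      'K'         : number of regions
      'model_map' : 2D array giving the model-type index at each pixel
    """
    region_term = abs(W1["K"] - W2["K"])
    model_term = np.mean(W1["model_map"] != W2["model_map"])
    return region_term + model_term
```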

Why multiple solutions? Segmentation is often not the final stage of computation: a higher-level task such as recognition can utilize a segmentation, and we don't want to make any hard decisions before recognition. Hence multiple segmentations are a good idea.

K-adventurers. We want to keep a fixed number K of segmentations, but we don't want to keep trivially different ones. Goal: keep the K segmentations that best preserve the posterior probability in the KL sense. Greedy algorithm: add the new particle, then remove the worst particle, as sketched below. (The paper shows no results on K-adventurers; maybe it doesn't work too well.)
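A hedged sketch of the add-then-remove-worst shape of that greedy algorithm; the `diversity_score` callable is my stand-in for the paper's KL-based criterion:

```python
def k_adventurers_update(particles, new_particle, k, diversity_score):
    """Greedy K-adventurers-style update: tentatively add the new particle,
    then drop whichever particle's removal hurts the set's score least.

    diversity_score(particle_set) is a stand-in for the paper's KL-based
    criterion measuring how well the set preserves the posterior.
    """
    pool = particles + [new_particle]
    if len(pool) <= k:
        return pool
    # Keep the size-(len(pool)-1) subset that scores best, i.e. remove the
    # particle whose absence costs the least.
    best_set = max(
        ([p for j, p in enumerate(pool) if j != i] for i in range(len(pool))),
        key=diversity_score,
    )
    return best_set
```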

Results (Multiple Solutions). These results come from running at different scales; they are not K-adventurers results.

Results

Results (Color Images) http://www.stat.ucla.edu/~ztu/DDMCMC/benchmark_color/benchmark_color.htm

Conclusions. DDMCMC combines generative (top-down) and discriminative (bottom-up) approaches, traversing the space of all segmentations via Markov chains. Does your head hurt? Questions?

References
- DDMCMC paper: http://www.cs.cmu.edu/~efros/courses/AP06/Papers/tu-pami-02.pdf
- DDMCMC website: http://www.stat.ucla.edu/%7Eztu/DDMCMC/DDMCMC_segmentation.htm
- MCMC tutorial by the authors: http://civs.stat.ucla.edu/MCMC/MCMC_tutorial.htm
There are some nice segmentation results on the DDMCMC website.
Funny anecdote: Reading this paper was similar to performing some type of MCMC. After the first iteration I understood only 15% of the paper; after a few iterations I got to 40%; then another careful reading and I was back down to 30%. Finally, my understanding of this paper converged to a stable high-80% (I believe). It took many hours to understand this work and present it to a broad audience of 1st- and 2nd-year Robotics PhD/Masters students at CMU.