Integration and Graphical Models

Presentation transcript:

Integration and Graphical Models. Derek Hoiem, CS 598, Spring 2009 (April 14, 2009).

Why? The goal of vision is to make useful inferences about the scene. In most cases, this requires integrative reasoning about many types of information.

Example: 3D modeling

Object context. From Divvala et al., CVPR 2009.

How? Two approaches: feature passing and graphical models.

Class today:
- Feature passing
- Graphical models
- Example
- Bayesian networks
- Markov networks
- Various inference and learning methods
- Example

Properties of a good mechanism for integration:
- Modular: different processes/estimates can be improved independently
- Symbiotic: each estimate improves
- Robust: mistakes in one process are not fatal for others that partially rely on it
- Feasible: training and inference are fast and easy

Feature passing: compute features from one estimated scene property to help estimate another. Pipeline on the slide: Image → X features → X estimate → Y features → Y estimate.

Feature passing example: use features computed from "geometric context" confidence images to improve object detection. Features: average confidence within each region (above, within, and below the object window). Hoiem et al., ICCV 2005.
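
As a rough illustration of the idea (not the original implementation), the sketch below averages a per-pixel confidence map above, inside, and below a detection window; the region split and the (x, y, w, h) box convention are assumptions.

```python
import numpy as np

def context_features(confidence, box):
    """Average confidence above, inside, and below a detection window.

    confidence: 2D array of per-pixel confidences in [0, 1]
    box: (x, y, w, h) detection window in pixel coordinates
    Returns a 3-vector that could be appended to a detector's feature vector.
    """
    x, y, w, h = box
    above = confidence[:y, x:x + w]
    inside = confidence[y:y + h, x:x + w]
    below = confidence[y + h:, x:x + w]
    return np.array([r.mean() if r.size else 0.0 for r in (above, inside, below)])

# Toy usage: a random "geometric context" confidence map and one candidate window.
conf = np.random.rand(240, 320)
print(context_features(conf, (100, 80, 60, 90)))
```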

Feature passing: pros and cons
- Simple training and inference
- Very flexible in modeling interactions
- Not modular: if we get a new method for the first estimates, we may need to retrain
- Requires iteration to be symbiotic, which complicates things
- Robust in expectation but not for each instance

Probabilistic graphical models explicitly model uncertainty and dependency structure. Three common forms: directed graphs, undirected graphs, and factor graphs (illustrated on the slide with nodes a, b, c, d). Key concept: Markov blanket.

Directed acyclic graph (Bayes net): arrow directions matter. With a → b → {c, d}: c is independent of a given b, and d is independent of a given b, so P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a). With the arrows reversed so that a, c, d all point into b: a, c, d become dependent when conditioned on b, and P(a,b,c,d) = P(b|a,c,d) P(a) P(c) P(d).
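
A small numerical check of the first factorization, assuming binary variables and made-up CPT values (not from the lecture): it builds the joint for the a → b → {c, d} graph and verifies that c is independent of a given b.

```python
import numpy as np

# Toy Bayes net a -> b -> {c, d}; the CPT entries are arbitrary illustrative numbers.
P_a = np.array([0.6, 0.4])                      # P(a)
P_b_a = np.array([[0.9, 0.1], [0.3, 0.7]])      # P(b | a), rows indexed by a
P_c_b = np.array([[0.8, 0.2], [0.25, 0.75]])    # P(c | b), rows indexed by b
P_d_b = np.array([[0.5, 0.5], [0.1, 0.9]])      # P(d | b), rows indexed by b

# Joint via the factorization P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a)
joint = np.einsum('a,ab,bc,bd->abcd', P_a, P_b_a, P_c_b, P_d_b)
assert np.isclose(joint.sum(), 1.0)

# Check "c is independent of a given b": P(c | a, b) should not depend on a.
P_abc = joint.sum(axis=3)                              # marginalize out d
P_c_given_ab = P_abc / P_abc.sum(axis=2, keepdims=True)
print(np.allclose(P_c_given_ab[0], P_c_given_ab[1]))   # True
```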

Directed acyclic graph (Bayes net):
- Can model causality
- Parameter learning decomposes: learn each term separately (ML)
- Inference: simple exact inference if tree-shaped (belief propagation)
Example (tree-shaped): P(a,b,c,d) = P(c|b) P(d|b) P(b|a) P(a)

Directed acyclic graph (Bayes net), continued: loops require approximate inference, e.g. loopy BP, tree-reweighted BP, or sampling. Example with a loop: P(a,b,c,d) = P(c|b) P(d|a,b) P(b|a) P(a)

Directed graph example: places and scenes. A place node (office, kitchen, street, etc.) is the parent of object-presence nodes (fire hydrant, car, person, toaster, microwave): P(place, car, person, toaster, microwave, hydrant) = P(place) P(car | place) P(person | place) … P(hydrant | place)
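
A minimal sketch of inference in this kind of model, assuming made-up priors and conditionals: given which objects were detected, it computes the posterior over place directly from the factorization above.

```python
import numpy as np

# Toy "places and scenes" net; priors and conditionals are illustrative numbers only.
places = ['office', 'kitchen', 'street']
P_place = np.array([0.4, 0.3, 0.3])

# P(object present | place), one row per place.
objects = ['car', 'person', 'toaster', 'microwave', 'hydrant']
P_obj_given_place = np.array([
    [0.05, 0.9, 0.01, 0.02, 0.01],   # office
    [0.02, 0.8, 0.60, 0.70, 0.01],   # kitchen
    [0.70, 0.6, 0.01, 0.01, 0.30],   # street
])

# Observed object presences (1 = present, 0 = absent).
observed = np.array([1, 1, 0, 0, 1])

# Posterior over place: P(place | objects) is proportional to P(place) * prod_i P(obj_i | place)
lik = np.prod(np.where(observed, P_obj_given_place, 1 - P_obj_given_place), axis=1)
post = P_place * lik
post /= post.sum()
for name, p in zip(places, post):
    print(f'{name}: {p:.3f}')
```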

Directed graph Example: “Putting Objects in Perspective”

Undirected graph (Markov network):
- Does not model causality
- Often pairwise
- Parameter learning is difficult
- Inference is usually approximate

Markov network example: "label smoothing" grid with binary nodes. The pairwise potential penalizes neighboring nodes that take different labels:

            label 0   label 1
  label 0      0         K
  label 1      K         0
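
A small sketch of the energy this potential defines on a 4-connected grid, assuming 0/1 labels, per-pixel unary costs, and a single smoothing constant K; the unary values in the usage example are random placeholders.

```python
import numpy as np

def grid_energy(labels, unary, K):
    """Energy of a binary labeling on a 4-connected grid MRF.

    labels: (H, W) array of 0/1 labels
    unary:  (H, W, 2) array, unary[i, j, l] = cost of assigning label l at (i, j)
    K:      penalty paid whenever two 4-neighbors take different labels
    """
    h, w = labels.shape
    e = unary[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    e += K * (labels[:, 1:] != labels[:, :-1]).sum()   # horizontal neighbor pairs
    e += K * (labels[1:, :] != labels[:-1, :]).sum()   # vertical neighbor pairs
    return e

# Toy usage with random unaries; values are illustrative only.
unary = np.random.rand(4, 4, 2)
labels = (unary[..., 0] > unary[..., 1]).astype(int)   # greedy unary-only labeling
print(grid_energy(labels, unary, K=0.5))
```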

Factor graphs: a general representation. A Bayes net over a, b, c, d can be redrawn as an equivalent factor graph, with one factor node per conditional probability term.

Factor graphs: a general representation. A Markov net over a, b, c, d can likewise be redrawn as a factor graph, with one factor node per potential.

Factor graphs: exercise, write the model shown on the slide as a factor graph.

Inference: belief propagation
- Very general
- Approximate, except for tree-shaped graphs
- Generalized variants of BP can have better convergence for graphs with many loops or strong potentials
- Standard packages available (BNT toolbox, my website)
To learn more: Yedidia, J.S.; Freeman, W.T.; Weiss, Y., "Understanding Belief Propagation and Its Generalizations", Technical Report, 2001: http://www.merl.com/publications/TR2001-022/
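
As a concrete illustration (not any particular package), here is a minimal sum-product BP sketch for a chain, where it is exact because a chain is a tree; the potentials in the usage example are arbitrary toy numbers.

```python
import numpy as np

def chain_bp_marginals(unaries, pairwise):
    """Sum-product belief propagation on a chain MRF (exact on a chain).

    unaries:  list of non-negative potential vectors, one per node
    pairwise: (S, S) non-negative potential shared by every edge (i, i+1)
    Returns normalized marginals for every node.
    """
    n = len(unaries)
    fwd = [np.ones(len(unaries[0]))]            # messages passed left to right
    for i in range(n - 1):
        m = pairwise.T @ (fwd[i] * unaries[i])
        fwd.append(m / m.sum())                 # normalize for numerical stability
    bwd = [np.ones(len(unaries[-1]))]           # messages passed right to left
    for i in range(n - 1, 0, -1):
        m = pairwise @ (bwd[0] * unaries[i])
        bwd.insert(0, m / m.sum())
    beliefs = [fwd[i] * bwd[i] * unaries[i] for i in range(n)]
    return [b / b.sum() for b in beliefs]

# Toy smoothing chain: 2 states, illustrative potentials only.
unaries = [np.array([0.9, 0.1]), np.array([0.5, 0.5]), np.array([0.2, 0.8])]
pairwise = np.array([[0.8, 0.2], [0.2, 0.8]])
for b in chain_bp_marginals(unaries, pairwise):
    print(np.round(b, 3))
```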

Inference: graph cuts
- Associative: edge potentials penalize different labels
- Associative binary networks can be solved optimally (and quickly) using graph cuts
- Multilabel associative networks can be handled by alpha-expansion or alpha-beta swaps
To learn more: http://www.cs.cornell.edu/~rdz/graphcuts.html
Classic paper: "What Energy Functions Can Be Minimized via Graph Cuts?" (Kolmogorov and Zabih, ECCV '02 / PAMI '04)
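
A minimal sketch of the standard s-t construction for an associative binary network, using networkx's min-cut rather than a dedicated graph-cut library; the unary costs, neighborhood edges, and K below are illustrative assumptions.

```python
import networkx as nx

# Energy: sum of unary costs plus K per pair of neighbors that disagree.
unary = {               # unary[i] = (cost of label 0, cost of label 1)
    0: (0.2, 0.8), 1: (0.3, 0.7), 2: (0.9, 0.1), 3: (0.6, 0.4),
}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
K = 0.5                 # pairwise penalty for disagreeing neighbors

G = nx.DiGraph()
for i, (c0, c1) in unary.items():
    G.add_edge('s', i, capacity=c1)   # this edge is cut if i takes label 1
    G.add_edge(i, 't', capacity=c0)   # this edge is cut if i takes label 0
for i, j in edges:
    G.add_edge(i, j, capacity=K)      # cut if i and j take different labels
    G.add_edge(j, i, capacity=K)

cut_value, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
labels = {i: (0 if i in source_side else 1) for i in unary}
print('min energy:', cut_value, 'labels:', labels)
```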

Inference: sampling (MCMC), the Metropolis-Hastings algorithm
- Define transitions and transition probabilities
- Make sure you can get from any state to any other (ergodicity)
- Make a proposal and accept if rand(1) < [P(new state) / P(old state)] * [P(backward transition) / P(forward transition)]
- Note: if P(state) decomposes, this ratio is easy to compute
Example: "Image Parsing" by Tu and Zhu, used to find a good segmentation.
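
A generic Metropolis-Hastings sketch of the acceptance rule above, applied to a toy 1D target with a symmetric random-walk proposal (so the backward/forward transition ratio cancels); all constants are illustrative.

```python
import numpy as np

def metropolis_hastings(log_p, propose, x0, n_steps, rng):
    """Generic Metropolis-Hastings sketch.

    log_p:   unnormalized log-probability of a state
    propose: function x -> (x_new, log_hastings), where log_hastings is
             log P(backward transition) - log P(forward transition)
    """
    x, samples = x0, []
    for _ in range(n_steps):
        x_new, log_hastings = propose(x)
        if np.log(rng.random()) < log_p(x_new) - log_p(x) + log_hastings:
            x = x_new
        samples.append(x)
    return np.array(samples)

# Toy target: unnormalized mixture of two 1D Gaussians. The random-walk
# proposal is symmetric, so its log Hastings correction is 0.
rng = np.random.default_rng(0)
log_p = lambda x: np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)
propose = lambda x: (x + rng.normal(scale=1.0), 0.0)
samples = metropolis_hastings(log_p, propose, x0=0.0, n_steps=5000, rng=rng)
print(samples.mean(), samples.std())
```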

Learning parameters: maximize likelihood
- Bayes network with discrete variables: simply count
- Markov network: run BP and do gradient descent
- Often we do not care about the full likelihood
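
A sketch of the "simply count" case for one node of a discrete Bayes net, assuming complete data; the add-one smoothing and toy samples are assumptions, not part of the slide.

```python
import numpy as np

def estimate_cpt(data, child, parents, arities):
    """Maximum-likelihood-style CPT for one node of a discrete Bayes net by counting.

    data:    (N, num_vars) integer array of complete observations
    child:   column index of the child variable
    parents: list of column indices of its parents
    arities: number of states of each variable
    Returns counts normalized to P(child | parents); add-one smoothing avoids zeros.
    """
    shape = [arities[p] for p in parents] + [arities[child]]
    counts = np.ones(shape)                        # Laplace smoothing (assumption)
    for row in data:
        idx = tuple(row[p] for p in parents) + (row[child],)
        counts[idx] += 1
    return counts / counts.sum(axis=-1, keepdims=True)

# Toy usage: variables (a, b); estimate P(b | a) from fake samples.
data = np.array([[0, 0], [0, 0], [0, 1], [1, 1], [1, 1], [1, 0]])
print(estimate_cpt(data, child=1, parents=[0], arities=[2, 2]))
```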

Learning parameters: maximize an objective. SPSA (simultaneous perturbation stochastic approximation) algorithm:
- Take two trial steps in a random direction, one forward and one backward
- Compute the loss (or objective) for each and form a pseudo-gradient
- Take a step according to the result
References: Li and Huttenlocher, "Learning for Optical Flow Using Stochastic Optimization", ECCV 2008; various papers by Spall on SPSA.
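
A bare-bones SPSA sketch of the three steps above, applied to a noisy toy objective; the step-size schedule and constants are illustrative defaults, not values from the cited papers.

```python
import numpy as np

def spsa_minimize(loss, theta, n_iters=200, a=0.1, c=0.1, rng=None):
    """Minimal SPSA sketch: perturb all parameters at once in a random +/-1
    direction, evaluate the loss at both perturbed points, and step along the
    implied pseudo-gradient."""
    rng = rng or np.random.default_rng(0)
    for k in range(1, n_iters + 1):
        ak, ck = a / k ** 0.602, c / k ** 0.101      # common SPSA decay exponents
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        g_hat = (loss(theta + ck * delta) - loss(theta - ck * delta)) / (2 * ck * delta)
        theta = theta - ak * g_hat
    return theta

# Toy usage: minimize a noisy quadratic whose optimum is at (1, -2).
loss = lambda t: np.sum((t - np.array([1.0, -2.0])) ** 2) + 0.01 * np.random.randn()
print(spsa_minimize(loss, theta=np.zeros(2)))
```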

Learning parameters: structured learning See also Tsochantaridis et al.: http://jmlr.csail.mit.edu/papers/volume6/tsochantaridis05a/tsochantaridis05a.pdf Szummer et al. 2008

How to get the structure?
- Set by hand (most common)
- Learn it (mostly for Bayes nets): maximize a score (greedy search), use independence tests, or use logistic regression with L1 regularization to find each node's Markov blanket
For more: www.autonlab.org/tutorials/bayesstruct05.pdf
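
A sketch of the last option, assuming scikit-learn is available: regress one binary variable on all others with an L1 penalty and keep the variables with nonzero weights as its estimated Markov blanket; the toy data and regularization strength are made up.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def markov_blanket(data, target, C=0.5):
    """Estimate the Markov blanket of one binary variable via L1-regularized
    logistic regression on all remaining variables; C is an illustrative default."""
    X = np.delete(data, target, axis=1)
    y = data[:, target]
    clf = LogisticRegression(penalty='l1', solver='liblinear', C=C).fit(X, y)
    others = [i for i in range(data.shape[1]) if i != target]
    return [others[j] for j in np.flatnonzero(clf.coef_[0])]

# Toy data: variable 0 depends on variables 1 and 2; variables 3 and 4 are noise.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=(500, 5))
z[:, 0] = ((z[:, 1] & z[:, 2]) | (rng.random(500) < 0.1)).astype(int)
print(markov_blanket(z, target=0))
```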

Graphical models: pros and cons
- Very powerful if the dependency structure is sparse and known
- Modular (especially Bayesian networks)
- Flexible representation (but not as flexible as "feature passing")
- Many inference methods
- Recent developments in learning Markov network parameters, but still tricky

Which techniques have I used? Almost all of them:
- Feature passing (ICCV 2005, CVPR 2008)
- Bayesian networks (CVPR 2006); in factor graph form (ICCV 2007); semi-naïve Bayes (CVPR 2004)
- Markov networks (ECCV 2008, CVPR 2007, CVPR 2005: HMM)
- Belief propagation (CVPR 2006, ICCV 2007)
- Structured learning (ECCV 2008)
- Graph cuts (CVPR 2008, ECCV 2008)
- MCMC (IJCV 2007… didn't work well)
- Learning Bayesian structure (2002-2003, not published)

Example: faces, skin, cloth