Contextual models for object detection using boosted random fields by Antonio Torralba, Kevin P. Murphy and William T. Freeman.

Quick Introduction What is this? Now can you tell?

Belief Propagation (BP) Network (pairwise Markov random field): observed nodes (y_i) and hidden nodes (x_i). The statistical dependency between x_i and y_i is called the local evidence, φ_i(x_i, y_i), with the shorthand φ_i(x_i).

Belief Propagation (BP) Statistical dependencies: the local evidence φ_i(x_i) (shorthand for φ_i(x_i, y_i)) between a hidden node and its observation, and the compatibility function ψ_ij(x_i, x_j) between neighbouring hidden nodes.

Belief Propagation (BP) Joint probability of the pairwise MRF:
P(x, y) = (1/Z) ∏_(i,j) ψ_ij(x_i, x_j) ∏_i φ_i(x_i, y_i)
[Slide figure: the graph with hidden nodes x_1, …, x_5 and their observed nodes y_1, y_2, …]

Belief Propagation (BP) The belief b at a node i combines the local evidence of the node with all the messages coming in from its neighbours N_i:
b_i(x_i) ∝ φ_i(x_i) ∏_{j ∈ N_i} m_ji(x_i)

Belief Propagation (BP) Messages m between hidden nodes: m_ji(x_i) expresses how likely node j thinks it is that node i is in each of its states. The message update is
m_ji(x_i) = ∑_{x_j} φ_j(x_j) ψ_ji(x_j, x_i) ∏_{k ∈ N_j \ i} m_kj(x_j)
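
As a concrete illustration of these updates, here is a minimal sum-product sketch on a three-node chain. The potentials and graph are made-up toy values, not anything from the paper.

```python
import numpy as np

# Minimal sum-product BP on a 3-node chain x0 - x1 - x2 of binary variables.
# phi[i] is the local evidence phi_i(x_i) (y_i already folded in);
# psi is the pairwise compatibility psi(x_i, x_j), shared by both edges.
# All numbers are illustrative, not taken from the paper.
phi = {0: np.array([0.7, 0.3]),
       1: np.array([0.4, 0.6]),
       2: np.array([0.5, 0.5])}
psi = np.array([[1.0, 0.5],
                [0.5, 1.0]])
neighbors = {0: [1], 1: [0, 2], 2: [1]}

# messages m[(j, i)] = m_{j->i}(x_i), initialised uniformly
msgs = {(j, i): np.ones(2) for j in neighbors for i in neighbors[j]}

for _ in range(10):  # iterate the update until (near) convergence
    new = {}
    for (j, i) in msgs:
        # m_{j->i}(x_i) = sum_{x_j} phi_j(x_j) psi(x_j, x_i) prod_{k in N_j \ i} m_{k->j}(x_j)
        prod_in = np.ones(2)
        for k in neighbors[j]:
            if k != i:
                prod_in *= msgs[(k, j)]
        m = psi.T @ (phi[j] * prod_in)
        new[(j, i)] = m / m.sum()  # normalise for numerical stability
    msgs = new

# belief: b_i(x_i) proportional to phi_i(x_i) * prod_{j in N_i} m_{j->i}(x_i)
for i in neighbors:
    b = phi[i].copy()
    for j in neighbors[i]:
        b *= msgs[(j, i)]
    print(f"belief at node {i}: {b / b.sum()}")
```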

Conditional Random Field Distribution of the form:
P(x | y) = (1/Z(y)) ∏_i φ_i(x_i, y) ∏_(i,j) ψ_ij(x_i, x_j, y)
i.e. the potentials are conditioned directly on the observed data y.

Boosted Random Field Basic Idea:
- Use BP to estimate P(x | y)
- Use boosting to maximize the log-likelihood of each node with respect to the local evidence and compatibility functions

Algorithm: BP Minimize the negative log-likelihood of the training data (y_i). Label loss function to minimize: the negative log of the belief each node assigns to its true label.
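
Read this way, the loss can be computed from the current beliefs and the training labels as below; the belief values and labels are made up for illustration.

```python
import numpy as np

# beliefs[i] is the current belief b_i over the states of node i,
# labels[i] is node i's true state (illustrative values only)
beliefs = np.array([[0.9, 0.1],
                    [0.3, 0.7],
                    [0.6, 0.4]])
labels = np.array([0, 1, 1])

# negative log-likelihood of the training labels under the beliefs
loss = -np.log(beliefs[np.arange(len(labels)), labels]).sum()
print(loss)  # small when the beliefs put high probability on the true labels
```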

Algorithm: BP
[Slide figures: the belief and message updates illustrated on the graph — node x_i with its observed node y_i, its neighbours N_i, and the product ∏ of incoming messages.]

Algorithm: BP The local evidence at a hidden node x_i is produced by F, a function of the input data y_i.

Function F Boosting! F is learned additively by boosting; f is the weak learner: weighted decision stumps.
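
A "weighted decision stump" weak learner of the kind mentioned here can be sketched as a regression stump fit by weighted least squares; the features, targets and weights below are random placeholders, not the paper's detector features.

```python
import numpy as np

def fit_regression_stump(X, z, w):
    """Weighted regression stump h(x) = a*[x_d > theta] + b.
    Chooses (d, theta, a, b) to minimise sum_i w_i (z_i - h(x_i))^2.
    X: (n, D) features, z: (n,) targets, w: (n,) non-negative weights.
    A generic sketch, not the paper's exact weak learner."""
    best_err, best = np.inf, None
    for d in range(X.shape[1]):
        for theta in np.unique(X[:, d]):
            above = X[:, d] > theta
            wa, wb = w[above].sum(), w[~above].sum()
            if wa == 0 or wb == 0:
                continue
            # optimal constants are the weighted means on each side of the split
            mean_above = np.dot(w[above], z[above]) / wa
            mean_below = np.dot(w[~above], z[~above]) / wb
            pred = np.where(above, mean_above, mean_below)
            err = np.dot(w, (z - pred) ** 2)
            if err < best_err:
                best_err, best = err, (d, theta, mean_above - mean_below, mean_below)
    return best  # (feature index, threshold, a, b)

def stump_predict(params, X):
    d, theta, a, b = params
    return a * (X[:, d] > theta) + b

# toy usage with random data (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
z = np.sign(X[:, 2] - 0.1)      # targets happen to depend on feature 2
w = np.full(200, 1.0 / 200)     # uniform boosting weights to start
params = fit_regression_stump(X, z, w)
print("chosen feature and threshold:", params[0], params[1])
```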

Minimization of the loss L: each weak learner is chosen to give the largest decrease in L. [Slide equations: the loss derivation and the accompanying "where" definitions.]

Local Evidence: algorithm
For t = 1..T:
- Iterate N_boost times:
  - find the best basis function h
  - update the local evidence with h
  - update the beliefs
  - update the weights
- Iterate N_BP times:
  - update the messages
  - update the beliefs
A schematic version of this loop is sketched in the code below.
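
The sketch below mirrors the structure of that loop. The inner helper functions and the residual/weight formulas are placeholders standing in for the steps named on the slide; they are not the paper's exact update equations.

```python
import numpy as np

def train_local_evidence(feats, labels, neighbors, T=10, n_boost=1, n_bp=1):
    """Skeleton of the local-evidence training loop from the slide.
    feats: (n, D) per-node features, labels: (n,) binary labels,
    neighbors: dict node -> list of neighbouring nodes.
    The two inner helpers are placeholders, not the paper's equations."""
    n = len(labels)
    F = np.zeros(n)                 # additive local-evidence function F
    beliefs = np.full(n, 0.5)       # P(x_i = 1), initialised uniformly
    weights = np.full(n, 1.0 / n)   # per-node boosting weights

    def fit_weak_learner(X, residuals, w):
        # placeholder: would fit a weighted regression stump to the residuals
        return lambda X_: np.zeros(len(X_))

    def run_bp(F_values, nbrs):
        # placeholder: would pass messages on the graph and return new beliefs
        return 1.0 / (1.0 + np.exp(-F_values))

    for t in range(T):
        for _ in range(n_boost):
            residuals = labels - beliefs          # what the evidence must still explain
            h = fit_weak_learner(feats, residuals, weights)
            F = F + h(feats)                      # update the local evidence additively
            beliefs = run_bp(F, neighbors)        # update the beliefs
            weights = beliefs * (1.0 - beliefs)   # re-weight the uncertain nodes
            weights = weights / weights.sum()
        for _ in range(n_bp):
            beliefs = run_bp(F, neighbors)        # update messages and beliefs
    return F, beliefs

# toy usage (illustrative only)
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 4))
labels = (feats[:, 0] > 0).astype(float)
F, b = train_local_evidence(feats, labels, neighbors={i: [] for i in range(50)}, T=3)
print(b[:5])
```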

Function G By assuming that the graph is densely connected, we can approximate the incoming messages as a function of the neighbours' beliefs; G then becomes a non-linear additive function of the beliefs.

Function G Instead of learning the compatibility functions directly, G can be learnt with an additive model of weighted regression stumps.

Function G The weak learner is chosen by minimizing the loss.
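
Taken at face value, a "non-linear additive function of the beliefs" built from weighted regression stumps can be sketched as follows; the stump parameters and belief values are placeholders, not learned quantities from the paper.

```python
import numpy as np

class AdditiveG:
    """G(b) = sum_t ( a_t * [b[d_t] > theta_t] + c_t ): an additive model of
    regression stumps over the vector b of neighbouring beliefs.
    A generic sketch of the slide's description, not the paper's exact form."""
    def __init__(self):
        self.stumps = []  # list of (d, theta, a, c)

    def add_stump(self, d, theta, a, c):
        self.stumps.append((d, theta, a, c))

    def __call__(self, beliefs):
        out = 0.0
        for d, theta, a, c in self.stumps:
            out += a * (beliefs[d] > theta) + c
        return out

# toy usage: G over the beliefs of three neighbouring nodes (illustrative values)
G = AdditiveG()
G.add_stump(d=0, theta=0.5, a=1.2, c=-0.1)  # fires when neighbour 0 believes "object"
G.add_stump(d=2, theta=0.8, a=0.7, c=0.0)
print(G(np.array([0.9, 0.2, 0.4])))         # contextual contribution to a node's evidence
```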

The Boosted Random Field Algorithm
For t = 1..T:
- find the best basis function h for f
- find the best basis function for g (the compatibilities)
- compute the local evidence
- compute the compatibilities
- update the beliefs
- update the weights

Final classifier
For t = 1..T:
- update the local evidence F
- update the compatibilities G
- compute the current beliefs
Output classification: take each node's label from its final belief.

Multiclass Detection U: dictionary of ~2000 image patches. V: the same number of image masks. At each round t, for each class c and each dictionary entry d there is a weak learner.
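
One plausible reading of such a weak learner — a sketch under that assumption, not the paper's exact formulation — is: correlate the image with a dictionary patch U_d, threshold the response, and spread the firing locations through the associated mask V_d.

```python
import numpy as np
from scipy.signal import correlate2d

def patch_weak_learner(image, patch, mask, threshold, alpha):
    """Sketch of a patch-dictionary weak learner: template-match a dictionary
    patch, threshold the response, then vote for object presence through the
    patch's mask.  'threshold' and 'alpha' stand in for per-round learned
    parameters; this is an interpretation of the slide, not the paper's form."""
    response = correlate2d(image, patch, mode='same')   # template matching
    fired = (response > threshold).astype(float)        # where the patch matched
    votes = correlate2d(fired, mask, mode='same')       # spread the match via the mask
    return alpha * votes

# toy usage with random data (illustrative only)
rng = np.random.default_rng(0)
img = rng.random((64, 64))
u = rng.random((7, 7))        # one dictionary patch
v = np.ones((5, 5)) / 25.0    # its associated mask
score_map = patch_weak_learner(img, u, v, threshold=20.0, alpha=0.5)
print(score_map.shape)
```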

Function f To take different object sizes into account, we first downsample the image, then upsample the per-scale responses and OR the scales together; this is our function for computing the local evidence.
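
A minimal sketch of this multi-scale step, assuming a generic per-scale detector and nearest-neighbour resampling (both placeholders):

```python
import numpy as np

def multiscale_or(detect_fn, image, scales=(1, 2, 4)):
    """Run the detector on downsampled copies of the image, upsample each
    binary response back to full resolution, and OR the scales together.
    detect_fn maps an image to a boolean map of the same size; the scales
    and the crude resampling here are illustrative choices."""
    h, w = image.shape
    combined = np.zeros((h, w), dtype=bool)
    for s in scales:
        small = image[::s, ::s]                                # downsample by striding
        resp = detect_fn(small)                                # detector at this scale
        up = np.repeat(np.repeat(resp, s, axis=0), s, axis=1)  # nearest-neighbour upsample
        combined |= up[:h, :w]                                 # OR into the full-size map
    return combined

# toy usage: a fake "detector" that just thresholds intensity (illustrative only)
rng = np.random.default_rng(1)
img = rng.random((32, 32))
mask = multiscale_or(lambda im: im > 0.95, img)
print(mask.sum(), "pixels fired across the scales")
```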

Function g The compatibility function has a similar form: W are kernels that gather all the messages directed to node (x, y, c).

Kernels W Example of incoming messages [figure on slide].

Function G The overall incoming-messages function sums the kernel responses over classes and locations.
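
One way to read this: the contextual message arriving at class c is the sum, over source classes, of each class's belief map filtered by a kernel W. The sketch below assumes that reading; the kernels are uniform placeholders, not learned ones.

```python
import numpy as np
from scipy.signal import correlate2d

def contextual_messages(belief_maps, kernels):
    """For each target class c, sum over source classes c2 the correlation of
    class c2's belief map with the kernel W[(c, c2)].
    belief_maps: dict class -> 2-D belief map; kernels: dict (target, source) -> 2-D kernel.
    An interpretation of the slide, with placeholder kernels."""
    G = {}
    for c in belief_maps:
        total = np.zeros_like(belief_maps[c])
        for c2, b in belief_maps.items():
            total += correlate2d(b, kernels[(c, c2)], mode='same')
        G[c] = total
    return G

# toy usage: two classes, random beliefs, uniform smoothing kernels (illustrative)
rng = np.random.default_rng(2)
beliefs = {'car': rng.random((16, 16)), 'road': rng.random((16, 16))}
W = {(a, b): np.ones((3, 3)) / 9.0 for a in beliefs for b in beliefs}
msgs = contextual_messages(beliefs, W)
print(msgs['car'].shape)
```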

Learning… Labeled dataset of office and street scenes, each with ~100 images.
- In the first 5 rounds only the local evidence is updated
- After the 5th iteration the compatibility functions are updated as well
- At each round only the F and G of the single object class that most reduces the multiclass cost are updated

Learning… The biggest objects are detected first because they reduce the error of all classes the fastest.

The End

Introduction Observed: Picture Dictionary: Dog P(Dog|Pic)

Introduction P(Head | Pic_i), P(Tail | Pic_i), P(Front Legs | Pic_i), P(Back Legs | Pic_i)

Introduction Comp(Head, Legs) Comp(Head, Tail) Comp(F. Legs, B. Legs) Comp(Tail, Legs) Dog!

Introduction P(Piranha | Pic_i), Comp(Piranha, Legs)

Graphical Models Observation nodes y_i ∈ Y; y_i can be a pixel or a patch.

Graphical Models Hidden nodes x_i ∈ X, taking values from the dictionary. Local evidence: φ_i(x_i, y_i), shorthand φ_i(x_i).

Graphical Models Compatibility function: ψ_ij(x_i, x_j) between hidden nodes in X.