A Graphical Model For Simultaneous Partitioning And Labeling. Philip Cowans & Martin Szummer. AISTATS, January 2005, Cambridge.

Presentation transcript:

Motivation – Interpreting Ink. (Figure: a hand-drawn diagram and its machine interpretation.)

Graph Construction. Vertices are grouped into parts, and each part is assigned a label. (Figure: graph G with vertices V and edges E.)

Labeled Partitions. We assume that parts are contiguous and that the graph is triangulated. We are interested in probability distributions over labeled partitions, conditioned on observed data.
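The slides do not give a data structure for labeled partitions; as a purely illustrative sketch (the names and types below are hypothetical, not from the paper), a labeled partition can be represented as a collection of disjoint vertex sets, each carrying a label:

# Hypothetical illustration, not the authors' code: a labeled partition
# stored as (part, label) pairs, where the parts are disjoint and cover
# the vertex set. Contiguity of each part in the graph G is assumed here
# but not checked.
from typing import FrozenSet, Hashable, List, Set, Tuple

Part = FrozenSet[Hashable]
LabeledPartition = List[Tuple[Part, str]]

def is_valid(y: LabeledPartition, vertices: Set[Hashable]) -> bool:
    """Check that the parts are pairwise disjoint and cover `vertices`."""
    seen: Set[Hashable] = set()
    for part, _label in y:
        if part & seen:        # overlapping parts are not allowed
            return False
        seen |= part
    return seen == vertices

# Example: four ink fragments split into a container and a connector.
y = [(frozenset({1, 2}), "container"), (frozenset({3, 4}), "connector")]
assert is_valid(y, {1, 2, 3, 4})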

Conditional Random Fields. CRFs (Lafferty et al.) provide a joint labeling of the graph vertices. Idea: define parts to be contiguous regions with the same label. But a large number of labels is needed, and there are symmetry problems / bias.

A Better Approach… Extend the CRF framework to work directly with labeled partitions. Complexity is improved – we don't need to deal with so many labels. No symmetry problem – we're working directly with the representation in which the problem is posed.

Consistency. Let G, H ⊆ V. Y(G) and Y(H) are consistent if and only if: for any vertex in G ∩ H, Y(G) and Y(H) agree on its label; and for any pair of vertices in G ∩ H, Y(G) and Y(H) agree on their part membership. This relation is denoted Y(G) ∼ Y(H).

Projection. Projection maps labeled partitions onto smaller subgraphs. If G ⊆ V, then the projection of Y onto G is the unique labeled partition of G which is consistent with Y.
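Continuing the illustrative sketch above (again hypothetical code, not the authors' implementation), projection restricts each part to a vertex subset, and consistency compares the projections of two labeled partitions on their common vertices:

# Hypothetical sketch, not the authors' code: projection of a labeled
# partition onto a vertex subset, and the consistency test defined above.
from typing import FrozenSet, Hashable, List, Set, Tuple

Part = FrozenSet[Hashable]
LabeledPartition = List[Tuple[Part, str]]  # disjoint labeled parts

def project(y: LabeledPartition, subset: Set[Hashable]) -> LabeledPartition:
    """Restrict each part to `subset`, dropping parts that become empty."""
    return [(frozenset(part & subset), label)
            for part, label in y if part & subset]

def consistent(y_g: LabeledPartition, y_h: LabeledPartition) -> bool:
    """True iff the two labeled partitions agree, on their common vertices,
    both on every vertex's label and on which vertices share a part."""
    verts_g = {v for part, _ in y_g for v in part}
    verts_h = {v for part, _ in y_h for v in part}
    common = verts_g & verts_h
    a, b = project(y_g, common), project(y_h, common)
    labels_a = {v: lab for part, lab in a for v in part}
    labels_b = {v: lab for part, lab in b for v in part}
    parts_a = {part for part, _ in a}
    parts_b = {part for part, _ in b}
    return labels_a == labels_b and parts_a == parts_b

# Example: both partitions keep vertices 2 and 3 together and label them '+'.
y1 = [(frozenset({1}), "-"), (frozenset({2, 3}), "+")]
y2 = [(frozenset({2, 3, 4}), "+")]
assert consistent(y1, y2)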

Notation:
Y – labeled partition of G.
Y(A) – labeled partition of the induced subgraph of A ⊆ V.
Y_A – projection of Y onto A ⊆ V.
Y_i – projection of Y onto vertex i.
Y_ij – projection of Y onto vertices i and j.

Potentials

The Model – unary potentials and pairwise potentials (defining equations shown on the slide).

The Model
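The equations on the two model slides are not preserved here. As a hedged sketch of the general form such a conditional model takes (the symbols φ, θ and Z below are illustrative, not the authors' exact notation), the distribution over labeled partitions is built from unary and pairwise potentials:

P(Y \mid x, \theta) \;=\; \frac{1}{Z(x, \theta)}
  \prod_{i \in V} \phi^{(1)}\!\big(Y_i, x; \theta\big)
  \prod_{(i,j) \in E} \phi^{(2)}\!\big(Y_{ij}, x; \theta\big),
\qquad
Z(x, \theta) \;=\; \sum_{Y}
  \prod_{i \in V} \phi^{(1)}\!\big(Y_i, x; \theta\big)
  \prod_{(i,j) \in E} \phi^{(2)}\!\big(Y_{ij}, x; \theta\big).

Here Y_i is the projection onto vertex i (its label) and Y_ij is the projection onto the edge (i, j) (the two labels, plus whether i and j lie in the same part), as in the notation slide above; the sum in Z runs over all labeled partitions of G.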

Training. Train by finding MAP weights on example data, with a Gaussian prior (using BFGS). We require the value and gradient of the log posterior; computing these involves normalization and marginalization.
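The slide's equation is lost from the transcript. Assuming log-linear potentials and a zero-mean Gaussian prior with variance σ² (an assumption made here for illustration), the required quantities take the generic form:

\log p(\theta \mid \mathcal{D}) \;=\;
  \sum_{n} \Big[\, \theta^{\top} f\big(Y^{(n)}, x^{(n)}\big) \;-\; \log Z\big(x^{(n)}, \theta\big) \Big]
  \;-\; \frac{\lVert \theta \rVert^{2}}{2\sigma^{2}} \;+\; \mathrm{const},

\nabla_{\theta} \log p(\theta \mid \mathcal{D}) \;=\;
  \sum_{n} \Big[\, f\big(Y^{(n)}, x^{(n)}\big)
  \;-\; \mathbb{E}_{P(Y \mid x^{(n)}, \theta)}\, f\big(Y, x^{(n)}\big) \Big]
  \;-\; \frac{\theta}{\sigma^{2}}.

Evaluating log Z is the normalization step; the expected feature counts come from the clique marginals, which is the marginalization step. Both are computed by the message passing described below.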

Prediction. New data is processed by finding the most probable labeled partition. This is the same as normalization, with the summation replaced by a maximization.
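In the illustrative notation used above, prediction computes

\hat{Y} \;=\; \operatorname*{arg\,max}_{Y}\; P(Y \mid x, \theta)
\;=\; \operatorname*{arg\,max}_{Y}\;
  \prod_{i \in V} \phi^{(1)}\!\big(Y_i, x; \theta\big)
  \prod_{(i,j) \in E} \phi^{(2)}\!\big(Y_{ij}, x; \theta\big),

which is obtained by running the same message-passing recursions with sums replaced by maximizations (max-product).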

Inference. These operations require summation or maximization over all possible labeled partitions. The number of terms grows super-exponentially with the size of G. Efficient computation is possible using message passing, because the distribution factors. The proof is based on Shenoy & Shafer (1990).

Factorization. A distribution factors if it can be written as a product of potentials for cliques on the graph. This is the case for the (un-normalized) model, and it allows efficient computation using message passing.
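The slide's formula is not preserved; in illustrative notation (not necessarily the authors' symbols), factorization means

P(Y \mid x, \theta) \;\propto\; \prod_{C \in \mathcal{C}(G)} \psi_{C}\big(Y_{C}, x; \theta\big),

where 𝒞(G) is the set of cliques of G and Y_C is the projection of Y onto clique C. The unary and pairwise model sketched above is of this form, since vertices and edges are contained in cliques.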

Message Passing

(Figure: a junction tree constructed from the cliques of the original graph, with cliques such as {1,2,3,4}, {2,3,4,5}, {4,5,6}, {2,9} and {1,7,8}.) A message summarizes the contribution from 'upstream' to the sum, for a given configuration of the separator.

Message Passing. (Figure: example messages on the junction tree, shown as tables over labeled partitions of the separators. One table lists the labeled partitions of {2,3,4}, e.g. (2)(3)(4), (2,3)(4), (2,3,4), with their label combinations and values; two single-vertex tables give, for vertex 2, values 0.43 for label + and 0.72 for label –, and for vertex 1, values 0.23 for + and 0.57 for –.)

Message Update Rule. Messages (for summation) are updated according to the rule shown on the slide. Marginals are then found from the messages, and Z can be found explicitly.
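The equations themselves are lost from this transcript. In generic Shafer–Shenoy notation (illustrative only: ψ_C is a clique potential, ne(C) the set of neighbouring cliques, and S the separator between adjacent cliques C and D), the updates take a form like:

\mu_{C \to D}\big(Y_{S}\big) \;=\;
  \sum_{Y_{C} \,:\, (Y_{C})_{S} = Y_{S}}
  \psi_{C}\big(Y_{C}\big)
  \prod_{B \in \mathrm{ne}(C) \setminus \{D\}} \mu_{B \to C}\big((Y_{C})_{S_{BC}}\big),

p\big(Y_{C} \mid x\big) \;\propto\;
  \psi_{C}\big(Y_{C}\big) \prod_{B \in \mathrm{ne}(C)} \mu_{B \to C}\big((Y_{C})_{S_{BC}}\big),
\qquad
Z \;=\; \sum_{Y_{C}} \psi_{C}\big(Y_{C}\big) \prod_{B \in \mathrm{ne}(C)} \mu_{B \to C}\big((Y_{C})_{S_{BC}}\big)
\quad \text{(for any clique } C\text{)}.

The sums run over labeled partitions of the clique; replacing them with maximizations gives the prediction (max-product) variant.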

Complexity. (Table: number of distinct configurations per clique as a function of clique size, 2–5, comparing the CRF encoding with labeled partitions. Most values are lost in this transcript; the CRF counts apparently reach the order of 10^5, while the labeled-partition counts remain far smaller.)
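As a point of reference that does not appear on the slides: assuming any subset of a clique can form a part, the number of labeled partitions of a clique of n vertices with L available labels is

\sum_{k=1}^{n} S(n, k)\, L^{k},

where S(n, k) is the Stirling number of the second kind. For n = 5 and L = 2 (containers vs. connectors) this gives 2 + 60 + 200 + 160 + 32 = 454 configurations, which stays small compared with the label space a CRF would need to emulate partitioning.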

Experimental Results. We tested the algorithm on hand-drawn ink collected using a Tablet PC. The task is to partition the ink fragments into perceptual objects and to label them as containers or connectors. The data set was 40 diagrams from 17 subjects, with a total of 2157 fragments, used in 3 random splits (20 training and 20 test examples each).

Example 1

Example 2

Example 3

Labeling Results

Model                   Labeling Error   Grouping Error
Independent Labeling    8.5%             –
Joint Labeling          4.5%             –
Labeled Partitions      2.6%             8.5%

Labeling error: fraction of fragments labeled incorrectly. Grouping error: fraction of edges locally incorrect.

Conclusions. We have presented a conditional model defined over labeled partitions of an undirected graph. Efficient exact inference is possible in our model using message passing. Labeling and grouping simultaneously can improve labeling performance. Our model performs well when applied to the task of parsing hand-drawn ink diagrams.

Acknowledgements. Thanks to Thomas Minka, Yuan Qi and Michel Gagnet for useful discussions and for providing software, and to Hannah Pepper for collecting our ink database.