Learning Equivalence Classes of Bayesian-Network Structures David M. Chickering Presented by Dmitry Zinenko.

Heuristic Search
- We are looking for the best state in the search space. Naively:
  - state = a particular DAG
  - search space = all possible DAGs over our variables
- Move between related states using search operators. Naively: edge addition/removal/reversal
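
The naive operator set on this slide can be sketched as a neighbor generator over DAGs. This is only an illustration: the function names `is_acyclic` and `neighbors` and the edge-set representation (pairs `(u, v)` meaning u → v) are assumptions, not something taken from the presentation.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; fail if none exists."""
    remaining = set(nodes)
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in edges)}
        if not sources:
            return False  # every remaining node has a parent, so there is a cycle
        remaining -= sources
    return True

def neighbors(nodes, edges):
    """Yield every DAG one edge addition, removal, or reversal away."""
    edges = frozenset(edges)
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                      # removal
            flipped = (edges - {(u, v)}) | {(v, u)}     # reversal
            if is_acyclic(nodes, flipped):
                yield flipped
        elif (v, u) not in edges:
            added = edges | {(u, v)}                    # addition
            if is_acyclic(nodes, added):
                yield added

chain = {("X", "Y"), ("Y", "Z")}
print(len(list(neighbors("XYZ", chain))))
```

Several of these neighbors belong to the same equivalence class, which is the redundancy the later slides set out to avoid.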

Heuristic Search Challenges
- Search space graph should be well-connected
  - To reach good states quickly
  - To avoid local maxima
- Search space graph should not be too dense
  - Computationally efficient scoring and transformations

Equivalence
- G_1 and G_2 are equivalent if the set of distributions that can be represented by them is identical
- Equivalence is an equivalence relation!
[Figure: graphs over X and Y]

Score Equivalence
- If all we care about is the probability distribution, all we need is the equivalence class
- The scoring function should give equal scores to structures from the same class
  - Such a function is called score equivalent
- Why prefer one representation of the class to another?

Equivalence Classes Are Good For You
- We are ultimately looking for a probability representation, not a particular DAG
- Searching individual DAGs is bad:
  - Some operators lead to the same class
    - Efficiency
    - Bad state connectivity for greedy search

Theorem 1 (Verma & Pearl 1990)
- Two DAGs are equivalent if and only if they have the same skeleton and the same v-structures
[Figure: example DAGs over X, Y, Z]
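
Theorem 1 translates directly into an equivalence test. The sketch below assumes both DAGs are given as sets of `(parent, child)` pairs over the same node set; the helper names are illustrative only.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """All triples (a, child, b) with a -> child <- b and a, b non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def equivalent(edges1, edges2):
    """Same skeleton and same v-structures (Theorem 1)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

chain = {("X", "Y"), ("Y", "Z")}        # X -> Y -> Z
reverse = {("Y", "X"), ("Z", "Y")}      # X <- Y <- Z
collider = {("X", "Y"), ("Z", "Y")}     # X -> Y <- Z
print(equivalent(chain, reverse), equivalent(chain, collider))
```

The two chains share a skeleton and have no v-structures, so they are equivalent; the collider has the same skeleton but introduces a v-structure at Y.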

Partially Directed Acyclic Graph
- A directed edge is called compelled in G if, for every G' equivalent to G, that edge has the same direction
- Otherwise we call it reversible
- Partially Directed Acyclic Graph (PDAG)
  - Contains both directed and undirected edges
  - Does not contain any directed cycles
- Theorem 1 extends naturally to PDAGs
  - A DAG is also a PDAG

CPDAG and Consistent Extension
- The completed PDAG (CPDAG) for Class(G) contains
  - directed edges for the compelled edges of G
  - undirected edges for the reversible edges of G
- A DAG G is a consistent extension of a PDAG P if
  - G has the same skeleton and v-structures as P
  - Every directed edge in P has the same orientation in G
[Figure: example graphs over X, Y, Z, W]

CPDAGs And Equivalence
- Every consistent extension of P is in Class(P)
- If P_c is a completed PDAG, then every DAG G in Class(P_c) is a consistent extension of P_c
- If P_1 and P_2 are completed PDAGs that admit consistent extensions, then P_1 = P_2 if and only if Class(P_1) = Class(P_2)
  - A completed PDAG uniquely represents its equivalence class

DAG to CPDAG (Meek 1995)
- Undirect all edges except those that are in v-structures
- Direct (mark as compelled) undirected edges that match particular patterns
[Figure: orientation patterns over X, Y, Z, W]
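
A minimal sketch of this two-step conversion follows. The slide's pattern figure is not recoverable, so the three orientation rules used here (the ones commonly attributed to Meek) are an assumption about what the slide showed, and the function name `dag_to_cpdag` is illustrative.

```python
from itertools import combinations

def dag_to_cpdag(edges):
    """Return (directed, undirected) edge sets of the CPDAG of a DAG."""
    edges = set(edges)

    def adjacent(a, b):
        return (a, b) in edges or (b, a) in edges

    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)

    # Step 1: keep a direction only for edges that take part in a v-structure.
    directed = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if not adjacent(a, b):
                directed |= {(a, child), (b, child)}
    undirected = {frozenset(e) for e in edges - directed}

    def should_orient(x, y):
        """True if one of the assumed patterns forces x -> y."""
        into_x = {a for a, v in directed if v == x}
        into_y = {a for a, v in directed if v == y}
        out_x = {v for a, v in directed if a == x}
        nbr_x = {n for e in undirected if x in e for n in e if n != x}
        if any(not adjacent(a, y) for a in into_x):   # a -> x, a and y non-adjacent
            return True
        if out_x & into_y:                            # x -> w -> y
            return True
        both = sorted(nbr_x & into_y)                 # x - c -> y and x - d -> y
        return any(not adjacent(c, d) for c, d in combinations(both, 2))

    # Step 2: apply the patterns until no undirected edge can be oriented.
    changed = True
    while changed:
        changed = False
        for e in list(undirected):
            x, y = tuple(e)
            for s, t in ((x, y), (y, x)):
                if should_orient(s, t):
                    undirected.discard(e)
                    directed.add((s, t))
                    changed = True
                    break
    return directed, undirected

print(dag_to_cpdag({("X", "Z"), ("Y", "Z"), ("Z", "W")}))
```

In the example, X → Z ← Y is a v-structure, and the first pattern then compels Z → W, so every edge of the CPDAG is directed.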

Constructing Consistent Extension (I)
- “Theorem 26”: The undirected components of a CPDAG are chordal
  - In any chordless cycle of length > 3 in a DAG, there must be a v-structure!
- Let {K_i} be the set of undirected components of a completed PDAG P_c. Let {G_i} be consistent extensions of {K_i}
  - The graph G that results from replacing each reversible edge in K_i with the corresponding directed edge from G_i is a consistent extension of P_c

Constructing Consistent Extension (II)
- Use decreasing maximum cardinality search (dMCS) to direct edges in each of the chordal components
  - Property of dMCS: every path between any pair of non-adjacent nodes x, y contains a node numbered higher than x or y
- The resulting graph is a consistent extension of P_c
- Works only on completed PDAGs
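
One way to picture this step is plain maximum cardinality search on a single chordal component, orienting each edge from the earlier-visited node to the later-visited one. This is a sketch under assumptions: the slide numbers nodes in decreasing order, whereas this code uses increasing visit rank, and the name `mcs_orient` is illustrative.

```python
def mcs_orient(nodes, undirected_edges):
    """Orient a chordal undirected graph without creating v-structures."""
    nbrs = {n: set() for n in nodes}
    for a, b in undirected_edges:
        nbrs[a].add(b)
        nbrs[b].add(a)

    visited = []
    unvisited = list(nodes)
    while unvisited:
        # Pick the node with the most already-visited neighbors (ties: first).
        nxt = max(unvisited, key=lambda n: len(nbrs[n] & set(visited)))
        unvisited.remove(nxt)
        visited.append(nxt)

    rank = {n: i for i, n in enumerate(visited)}
    # Direct every edge from the earlier-visited endpoint to the later one.
    return {(a, b) if rank[a] < rank[b] else (b, a)
            for a, b in undirected_edges}

edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(sorted(mcs_orient(["A", "B", "C", "D"], edges)))
```

On a chordal graph the earlier-visited neighbors of each node form a clique, so no node ends up with two non-adjacent parents.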

PDAG-to-DAG (Dor & Tarsi 1992)
- Select a node x in P such that
  - x has no outgoing edges
  - Vertices adjacent to x form a clique
- Direct all edges (x―y) toward x
  - x becomes a sink
- Remove x from P and repeat until no nodes remain
- Works on any PDAG
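
The procedure can be sketched as a loop that peels off one sink at a time. The code uses the usual statement of the Dor and Tarsi condition (each undirected neighbor of x must be adjacent to every other node adjacent to x), which is slightly weaker than the slide's "form a clique" wording; that choice, the function name `pdag_to_dag`, and the `None` return for PDAGs with no consistent extension are assumptions.

```python
def pdag_to_dag(nodes, directed, undirected):
    """Return the edge set of a consistent extension, or None if none exists."""
    nodes = set(nodes)
    directed = set(directed)
    undirected = {frozenset(e) for e in undirected}
    result = set(directed)

    def adjacent(a, b):
        return (frozenset((a, b)) in undirected
                or (a, b) in directed or (b, a) in directed)

    while nodes:
        for x in sorted(nodes):
            if any(u == x for u, _ in directed):
                continue  # x has an outgoing directed edge, so it is not a sink
            und_nbrs = {n for e in undirected if x in e for n in e if n != x}
            all_adj = und_nbrs | {u for u, v in directed if v == x}
            if all(adjacent(y, z) for y in und_nbrs for z in all_adj if z != y):
                break  # x qualifies
        else:
            return None  # no node qualifies: no consistent extension
        result |= {(y, x) for y in und_nbrs}   # direct x's undirected edges toward x
        directed = {(u, v) for u, v in directed if x not in (u, v)}
        undirected = {e for e in undirected if x not in e}
        nodes.remove(x)
    return result

print(pdag_to_dag("XYZ", set(), [("X", "Y"), ("Y", "Z")]))
```

A chordless undirected 4-cycle makes the function return `None`, since any acyclic orientation of it would introduce a v-structure.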

Applying the Operators

Operators
- The set of operators should:
  - Ensure global connectivity (completeness) and good connectivity in general
  - Be easy to check for applicability (validity)
  - Avoid redundancy
  - Allow for efficient scoring
    - Local scoring: local changes in G cause “local” changes in score(G)

Score Decomposability
- A scoring function S is decomposable if it is a product (or sum) of factors s, each depending only on one node and its parents
- For example: [Figure: example graph over X, Y, Z]
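
A small sketch of what decomposability buys, in its additive form. The local score `toy_local_score` is a hypothetical stand-in (a real search would use something like a BIC or BDeu term computed from data); the point is only that changing one node's parent set changes exactly one term of the sum.

```python
def total_score(parents, local_score):
    """Decomposable score: a sum of one local term per node."""
    return sum(local_score(node, frozenset(pa)) for node, pa in parents.items())

def toy_local_score(node, pa):
    # Hypothetical placeholder: penalize each parent by one unit.
    return -1.0 * len(pa)

before = {"X": set(), "Y": {"X"}, "Z": set()}     # X -> Y, Z isolated
after = {"X": set(), "Y": {"X"}, "Z": {"Y"}}      # add the edge Y -> Z

full_difference = total_score(after, toy_local_score) - total_score(before, toy_local_score)
local_difference = (toy_local_score("Z", frozenset({"Y"}))
                    - toy_local_score("Z", frozenset()))
print(full_difference, local_difference)
```

Both differences agree, so a search procedure only needs to re-evaluate the term for the node whose parents changed.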

Used Operators

Operator Scoring
- Chickering 1996a
  - Apply the operator and score the consistent extension (DAG)
- Drawbacks:
  - Need to apply PDAG-to-DAG for every operator
  - Local operators may cause non-local changes when applied to a CPDAG
    - Cannot benefit from local scoring

Local Operator Scoring

InsertU Operator – “Theorem 34”
- Let P_c be any completed PDAG in which nodes x and y are not adjacent
- If the PDAG that results from adding an edge between x and y to P_c admits a consistent extension, then the edge x―y is reversible if and only if x and y have exactly the same parents in the original PDAG

InsertU Operator – “Theorem 6”
- The insertion of the undirected edge x―y in a CPDAG P_c is valid if and only if:
  - x and y have the same parents in P_c
  - every undirected path between x and y contains at least one of their common neighbors
- Only if (+ Theorem 34):
  - Take the shortest undirected path from x to y in P_c that does not include any common neighbor of x and y
    - It has length at least 3 and has no chord
    - After adding x―y it becomes a chordless cycle of length at least 4, so the undirected component is no longer chordal
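
The two validity conditions can be checked without building any consistent extension. The sketch below assumes x and y are non-adjacent, takes "neighbors" to mean nodes joined by an undirected edge, and represents the CPDAG as a set of directed pairs plus a collection of undirected pairs; the name `insert_u_valid` is illustrative.

```python
def insert_u_valid(x, y, directed, undirected):
    """Check the two conditions of Theorem 6 for inserting x - y."""
    undirected = {frozenset(e) for e in undirected}

    def parents(n):
        return {u for u, v in directed if v == n}

    def nbrs(n):
        return {m for e in undirected if n in e for m in e if m != n}

    # Condition 1: identical parent sets.
    if parents(x) != parents(y):
        return False

    # Condition 2: no undirected path from x to y that avoids all common neighbors.
    common = nbrs(x) & nbrs(y)
    stack, seen = [x], {x}
    while stack:
        n = stack.pop()
        for m in nbrs(n) - common - seen:
            if m == y:
                return False
            seen.add(m)
            stack.append(m)
    return True

print(insert_u_valid("x", "y", set(), [("x", "a"), ("a", "y")]))              # path via common neighbor a
print(insert_u_valid("x", "y", set(), [("x", "a"), ("a", "b"), ("b", "y")]))  # path avoiding common neighbors
```

The first call is valid because the only undirected path runs through the common neighbor a; the second is not, because inserting the edge would close a chordless 4-cycle.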

InsertU Operator – “Lemma 32”
- Let P_c be any completed PDAG, and let x and y be any pair of nodes that are not adjacent
- There exists a consistent extension of P_c in which
  - all the reversible edges adjacent to x are directed away from x
  - all the reversible edges between y and the common neighbors of x and y are directed toward y
  - all the other reversible edges adjacent to y are directed away from y
- if and only if every undirected path between x and y passes through a common neighbor of x and y

InsertU Operator – Theorem 6 “If” proof outline
- Use the consistent extension from Lemma 32 as G
- Add a directed edge x → y to G to get G' (the other direction is symmetric)
- Show that G' is a consistent extension of P' (P with the addition of the undirected edge x―y)
  - G' is acyclic
  - Same skeleton
  - Same v-structures

InsertU Operator – Theorem 6: G' is a DAG
- Assume for contradiction that there is a directed path from y to x in G
- All the reversible edges are directed away from x, so the last edge in that path, w → x, is compelled
- Then w is a parent of x in P, and it must also be a parent of y
- So G contains a cycle y → ... → w → y, contradicting the fact that G is a DAG
[Figure: nodes X, Y, W]

InsertU Operator – “Lemma 24”
- Let P_c be a completed PDAG, and let P' denote the PDAG that results from adding a single edge between x and y to P_c
- Consider any consistent extension G of P_c, and the graph G' that results from inserting a directed edge between x and y in G
- Then any v-structure in G' but not in P', or in P' but not in G', must include the edge between x and y

InsertU Operator – Theorem 6: G' is a consistent extension of P'
- By Lemma 24, any v-structure that differs between G' and P' must include the edge x―y
- The v-structure must be in G', because in P' this edge is undirected
- The other edge in the v-structure cannot be reversible in G'
  - x does not have reversible parents
  - y's reversible parents are adjacent to x
- But any compelled parent of x or y is a parent of both
- Q.E.D.

Local Operator Evaluation
- Since the only difference between G and G' is the edge x → y, we can use score decomposability to compute the score of P' in O(1) time
  - $s(P') = s(P_c) + s(y, N_{x,y} \cup \{x\} \cup \Pi_y) - s(y, N_{x,y} \cup \Pi_y)$
  - Here $N_{x,y}$ is the set of common neighbors of x and y, and $\Pi_y$ is the set of parents of y in P_c
- In general we do not need to transform the CPDAG to compute neighbor scores:
  - Calculate scores for all the neighbor states (locally!)
  - Check operator validity (efficiently!) starting from the highest score
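
The local evaluation above amounts to two local-score lookups for node y. In this sketch the parent set of y in the Lemma 32 extension is taken to be the common neighbors of x and y together with y's parents in the CPDAG; `insert_u_delta` and the toy `local_score` are hypothetical names, and a real local score would be computed from data.

```python
def insert_u_delta(x, y, directed, undirected, local_score):
    """Score change from inserting x - y, using only local terms for y."""
    undirected = {frozenset(e) for e in undirected}

    def nbrs(n):
        return {m for e in undirected if n in e for m in e if m != n}

    common = nbrs(x) & nbrs(y)                    # common neighbors of x and y
    pa_y = {u for u, v in directed if v == y}     # parents of y in the CPDAG
    base = frozenset(common | pa_y)
    return local_score(y, base | {x}) - local_score(y, base)

def local_score(node, pa):
    # Hypothetical placeholder: penalize each parent by one unit.
    return -1.0 * len(pa)

delta = insert_u_delta("x", "y", {("p", "x"), ("p", "y")},
                       [("x", "a"), ("a", "y")], local_score)
print(delta)
```

Because only y's term is re-evaluated, all candidate insertions can be scored first and validity checked afterwards, starting from the best-scoring candidate.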