1 Acceleration of Inductive Inference of Causal Diagrams. Olexandr S. Balabanov, Institute of Software Systems of NAS of Ukraine.

2 Generic task. Our goal: to speed up model induction. We follow a constraint-based approach to model induction: from statistical data (a sample) to the structure of the data-generating process (a causal model). No prior knowledge and no temporal order of the variables is assumed.

3 Phases of causal inference. Starting from data: (1) identification of the model skeleton by searching for separators (separators yield the edges); (2) edge orientation (yielding the structure of the model); (3) calculating parameter values.

4 A constraint-based algorithm deletes an edge X — Y when it finds that the variables X and Y are conditionally independent given some conditioning set. The algorithm tries to find such a separator for each pair of variables. The key idea of the PC algorithm is to include in a tentative separator for the pair (X, Y) only those variables which are supposedly adjacent to X or to Y.
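A minimal sketch of this edge-deletion loop follows. It is not the authors' implementation; ci_test(x, y, s, data) is a hypothetical stand-in for any conditional-independence test (chi-square, G^2, partial correlation, etc.), and variable names are illustrative.

```python
from itertools import combinations

def pc_skeleton(variables, data, ci_test, max_rank):
    """PC-style skeleton search: delete edge x - y once a separator is found.

    ci_test(x, y, s, data) -> True if x and y are judged conditionally
    independent given the set s (a stand-in for a real statistical test).
    """
    adj = {v: set(variables) - {v} for v in variables}  # start from the complete graph
    separators = {}
    for rank in range(max_rank + 1):            # rank of the test = |separator|
        for x in variables:
            for y in list(adj[x]):
                # Key PC idea: draw tentative separators only from
                # variables currently adjacent to x (symmetrically, to y).
                candidates = adj[x] - {y}
                if len(candidates) < rank:
                    continue
                for s in combinations(candidates, rank):
                    if ci_test(x, y, set(s), data):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        separators[frozenset((x, y))] = set(s)
                        break
    return adj, separators
```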

5 Still, the task of searching for separators remains computationally very expensive, even for networks of moderate density. The worst situation: when the edge X — Y actually exists, PC keeps attempting to find a separator for (X, Y), examining all subsets of Adj(X) and all subsets of Adj(Y) as tentative separators.
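To see why this blows up, one can count the candidate separators PC may try for a single stubborn edge. A back-of-the-envelope count; the adjacency sizes below are assumptions for illustration, not figures from the slides.

```python
from math import comb

def worst_case_tests(deg_x, deg_y, max_rank):
    """Candidate separators PC may try for one existing edge x - y:
    all subsets (up to max_rank) of Adj(x)\\{y} plus those of Adj(y)\\{x}."""
    return sum(comb(deg_x - 1, k) + comb(deg_y - 1, k)
               for k in range(max_rank + 1))

# E.g. with |Adj(X)| = |Adj(Y)| = 9 and tests up to rank 8:
print(worst_case_tests(9, 9, 8))  # prints 512
```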

6 It is especially desirable to recognize the presence of an edge as early as possible. It is also very useful to get by with tests of low rank whenever possible, which means finding minimal separators. The idea for achieving this goal: exploit the pairwise Markov properties of the ADG model, the concept of a locally-minimal separator, and their logical consequences.
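One natural reading of "locally-minimal separator" is a separator from which no single variable can be dropped without breaking the independence. A sketch of greedy shrinking under that reading (ci_test is the same hypothetical helper as above; this is an illustration, not the construction from the cited papers):

```python
def shrink_to_locally_minimal(x, y, separator, data, ci_test):
    """Greedily drop members of a known separator for (x, y) as long as
    the remaining set still renders x and y conditionally independent.
    The result is locally minimal: no single removal preserves separation."""
    sep = set(separator)
    changed = True
    while changed:
        changed = False
        for z in sorted(sep):
            trial = sep - {z}
            if ci_test(x, y, trial, data):
                sep = trial      # z was redundant; keep shrinking
                changed = True
                break
    return sep
```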

7 We have developed several rules for accelerating inductive inference. These rules perform: 1) recognition of edge presence; 2) recognition of edge absence; 3) deletion of some variables from the list of candidates for a supposed separator; 4) recognition of some variables as obligatory members of the respective separator (if it exists at all).

8 One of the most effective rules is the rule of 'placing aside': if Ds(Z;X;Y) & ¬Ds(Z;∅;Y) holds in model G (that is, Z and Y are d-separated by X, but are not d-separated by the empty set), then vertex Z is not a member of any locally-minimal d-separator for the pair (X, Y) in G.
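Read operationally, the rule lets an algorithm strike Z off the candidate list for a separator of (X, Y) after two cheap tests. A sketch under that reading (Ds(·) answered by the same hypothetical ci_test, with the empty set as the empty conditioning set; the reconstruction of the rule's two conditions is an assumption based on the slide's notation):

```python
def place_aside(x, y, candidates, data, ci_test):
    """Prune separator candidates for (x, y) by the 'placing aside' rule:
    if z _||_ y given {x} while z and y are marginally dependent, then z
    cannot belong to any locally-minimal separator for (x, y)."""
    kept = set()
    for z in candidates:
        if ci_test(z, y, {x}, data) and not ci_test(z, y, set(), data):
            continue  # z placed aside: its dependence on y runs through x
        kept.add(z)
    return kept
```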

9 If we equip an algorithm (like PC) with just two rules (the placing-aside rule and the "lack of separator's pivot" rule), then the algorithm recovers a forest (or poly-forest) by executing tests of zero and first rank only. In particular, algorithm Razor-1.1 would identify the forest presented on the slide using tests of rank 0 and rank 1 only, whereas the basic PC algorithm would work out tests up to rank 9 for the same model.

10 Algorithm Razor-1.1 (or an even simpler one, as long as it has the two rules) requires tests of first rank at most; the PC algorithm requires tests of rank 8 at most.

11 A more complicated and realistic example: a structure consisting of 15 vertices and 30 edges. Razor-1.1 requires tests of rank 4 at most; the PC algorithm requires tests of rank 8 at most.

12 Below are results of inference from data samples. ADG structures were generated randomly with 20 vertices (variables) and 40–70 edges. The variables are binary and ternary, and the models' parameters were also randomly generated. Sample size =

13 Experimental results: performance of PC vs. Razor (20 vertices/variables, 50 edges).

14 Inference errors of PC vs. Razor (20 vertices/variables, 50 edges). Notice: these results represent unfavourable cases (binary and ternary variables with random parameters).

15 Conclusion. As demonstrated, an algorithm equipped with the proposed rules learns Bayesian networks (of moderate density) several times faster than the PC algorithm, while the number of errors grows much more slowly. Thus the acceleration rules facilitate fast identification of the skeleton of a causal model. Extension. Most of the acceleration rules can be extended to the case of causal diagrams with latent variables (with some corrections to the algorithm); the algorithm needs to be upgraded for the case of causally-insufficient models.

16 Thanks for attention.
Recent publications:
Balabanov A.S. Minimal separators in dependency structures: Properties and identification. Cybernetics and Systems Analysis, Vol. 44, No. 6, 2008, pp. 803–815. Springer, N.Y.
Balabanov A.S. Construction of minimal d-separators in a dependency system. Cybernetics and Systems Analysis, Vol. 45, No. 5, 2009, pp. 703–713.
Balabanov O.S. Accelerating algorithms for Bayesian network recovery. Adaptation to structures without cycles (in Ukrainian). Problems in Programming, No. 1, pp. 63–69. Kiev, Ukraine.
Balabanov O.S., Gapyeyev O.S., Gupal A.M., Rzhepetskyy S.S. Fast algorithm for learning Bayesian networks from data. Journal of Automation and Information Sciences, Vol. 43, No. 10, 2011, to appear.