Modelling Relational Statistics With Bayes Nets
Tianxiang Gao, Yuke Zhu
School of Computing Science, Simon Fraser University, Vancouver, Canada


2/12 Class-Level and Instance-Level Queries

Classic AI research distinguished two types of probabilistic relational queries (Halpern 1990; Bacchus 1990).

Class-level queries (relational statistics, Type 1 probabilities), each with its reference class:
What is the percentage of flying birds? Reference class: birds.
What is the percentage of friendship pairs where both are women? Reference class: pairs of friends.
What is the percentage of A grades awarded to highly intelligent students? Reference class: student-course pairs where the student is registered in the course.

Instance-level queries (ground facts, Type 2 probabilities):
Given that Tweety is a bird, what is the probability that Tweety flies?
Given that Sam and Hilary are friends, and given the genders of their other friends, what is the probability that Sam and Hilary are both women?
What is the probability that Jack is highly intelligent given his grades?

References:
Halpern, “An analysis of first-order logics of probability”, AI Journal.
Bacchus, “Representing and reasoning with probabilistic knowledge”, MIT Press.

3/12 Visualizing Class-Level Probability

Percentage of flying birds = 90%.
Halpern: the probability that a typical or random bird flies is 90%.

Syntactic distinction:
Class-level query: contains some free variables, e.g. P(Flies(B)) = ?
Instance-level query: contains no free variables, e.g. P(Flies(tweety)) = ?

4/12 Applications of Class-Level Modelling

1st-order rule learning (e.g., “intelligent students take difficult courses”).
Strategic planning (e.g., “increase SAT requirements to decrease student attrition”).
Query optimization (Getoor, Taskar, Koller 2001): class-level queries support selectivity estimation → optimal evaluation order for an SQL query.

Reference:
Getoor, Lise, Taskar, Benjamin, and Koller, Daphne. Selectivity estimation using probabilistic models. ACM SIGMOD Record, 30(2):461–472, 2001.

5/12 No Grounding Semantics for Class-level Queries

“Unrolling” a network yields a model of individual entities. That model contains no classes, so class-level queries cannot be asked of it.

Class-level template with 1st-order variables: intelligence(S), diff(C), Registered(S,C).

Instance-level model with domain(S) = {jack, jane} and domain(C) = {100, 200}: intelligence(jack), intelligence(jane), diff(100), diff(200), Registered(jack,100), Registered(jack,200), Registered(jane,100), Registered(jane,200).
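
The unrolling step on this slide can be sketched in a few lines of Python. This is only an illustration of grounding a template, not the authors' code: the names `TEMPLATE_NODES`, `DOMAINS`, and `ground` are hypothetical, and edges between nodes are left out because the transcript does not preserve them.

```python
from itertools import product

# Class-level template: (functor, first-order variables), as listed on the slide.
TEMPLATE_NODES = [
    ("intelligence", ("S",)),
    ("diff", ("C",)),
    ("Registered", ("S", "C")),
]

# Domains from the slide.
DOMAINS = {"S": ["jack", "jane"], "C": [100, 200]}


def ground(template_nodes, domains):
    """Replace each first-order variable by every constant in its domain."""
    ground_nodes = []
    for functor, variables in template_nodes:
        for constants in product(*(domains[v] for v in variables)):
            args = ",".join(str(c) for c in constants)
            ground_nodes.append(f"{functor}({args})")
    return ground_nodes


GROUND_NODES = ground(TEMPLATE_NODES, DOMAINS)
print(GROUND_NODES)
```

Every node in the result names specific individuals, which is why the grounded model has no way to express a query about a class such as "students" in general.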

6/12 Previous Work: Probabilistic Queries in Statistical-Relational Learning

Class-level: Statistical-Relational Models (Getoor, Taskar, Koller 2001).
Instance-level: many model types, including Probabilistic Relational Models, Markov Logic Networks, Bayes Logic Programs, Logical Bayesian Networks, …

7/12 New Unified Approach

Class-level: Parametrized Bayes Nets + new class-level semantics.
Instance-level: Parametrized Bayes Nets + combining rules (Poole 2003) + log-linear model (Khosravi, Schulte et al. 2010; Schulte and Khosravi 2012).

References:
David Poole, “First-Order Probabilistic Inference”, IJCAI.
H. Khosravi, O. Schulte, T. Man, X. Xu, and B. Bina, “Structure learning for Markov logic networks with many descriptive attributes”, in AAAI.
O. Schulte and H. Khosravi, “Learning graphical models for relational data via lattice search”, Machine Learning.

8/12 Random Selection Semantics: Example

Apply the random selection semantics for probabilistic 1st-order logic (Halpern 1990; Bacchus 1990).

Template nodes: intelligence(S), diff(C), Registered(S,C).

P(intelligence(S) = hi, diff(C) = hi, Registered(S,C) = true) = 20% means: “if we randomly select a student and a course, then the probability is 20% that the student is registered in the course, and that the intelligence of the student and the difficulty of the course are high.”

References:
Halpern, “An analysis of first-order logics of probability”, AI Journal.
Bacchus, “Representing and reasoning with probabilistic knowledge”, MIT Press.
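
As a rough illustration of the random selection semantics, the class-level probability can be computed as the fraction of all student-course pairs that satisfy the condition. The toy database below (`STUDENTS`, `COURSES`, `REGISTERED`) and the function name are invented for this sketch; the data were chosen so that the result matches the 20% on the slide.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy database: attribute values per entity.
STUDENTS = {"jack": "hi", "jane": "hi", "kim": "lo", "lee": "lo", "max": "lo"}
COURSES = {100: "hi", 200: "lo"}
REGISTERED = {("jack", 100), ("jane", 100), ("kim", 100), ("jack", 200)}


def class_level_probability(students, courses, registered, intel, diff, reg):
    """Probability that a uniformly selected (student, course) pair matches."""
    pairs = list(product(students, courses))
    hits = sum(
        1
        for s, c in pairs
        if students[s] == intel
        and courses[c] == diff
        and ((s, c) in registered) == reg
    )
    return Fraction(hits, len(pairs))


P_JOINT = class_level_probability(STUDENTS, COURSES, REGISTERED, "hi", "hi", True)
print(P_JOINT)
```

The denominator is the full cross product of students and courses, not the number of registrations, because the student and the course are selected independently of whether a registration exists.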

9/12 Computing Parameter Estimates (I)

Use conditional database probabilities as Bayes net parameters. This maximizes the random selection pseudo-likelihood (Schulte 2011).
For database probabilities with all true relationships, use SQL or Virtual Join (Yin, Han et al. 2004).

[Figure: relationship tables R1 and R2]

References:
Schulte, O. “A tractable pseudo-likelihood function for Bayes nets applied to relational data.” SIAM SDM.
Yin, X., Han, J. et al. “CrossMine: Efficient Classification Across Multiple Database Relations”. Constraint-Based Mining and Inductive Databases.
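
The SQL route for true relationships might look like the sketch below, which uses Python's built-in `sqlite3` module. The schema, the data, and the choice of `diff(C)` as child with `intelligence(S)` and `Registered(S,C) = true` as parents are all assumptions made for illustration; the slides do not specify them.

```python
import sqlite3
from fractions import Fraction

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student(sid TEXT PRIMARY KEY, intelligence TEXT);
CREATE TABLE Course(cid INTEGER PRIMARY KEY, diff TEXT);
CREATE TABLE Registered(sid TEXT, cid INTEGER);
INSERT INTO Student VALUES ('jack','hi'), ('jane','hi'), ('kim','lo');
INSERT INTO Course VALUES (100,'hi'), (200,'lo');
INSERT INTO Registered VALUES ('jack',100), ('jack',200), ('jane',100), ('kim',200);
""")

# Joint counts over attribute values, restricted to Registered = true via the join.
rows = conn.execute("""
SELECT s.intelligence, c.diff, COUNT(*)
FROM Registered r
JOIN Student s ON s.sid = r.sid
JOIN Course c ON c.cid = r.cid
GROUP BY s.intelligence, c.diff
""").fetchall()

# Normalise by the parent configuration to get conditional frequencies.
parent_totals = {}
for intelligence, _, n in rows:
    parent_totals[intelligence] = parent_totals.get(intelligence, 0) + n

# CPT[(diff, intelligence)] = P(diff | intelligence, Registered = true)
CPT = {
    (diff, intelligence): Fraction(n, parent_totals[intelligence])
    for intelligence, diff, n in rows
}
print(CPT)
```

Only positive relationships appear here, so a join over the relationship table is enough; the counts for false relationships are the harder case.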

10/12 Computing Parameter Estimates (II)

How do we compute database probabilities for negated relations, e.g., the number of U.S. users who are not friends?
Materializing complement tables does not scale.
For a single false relation, use the “1-minus trick” (Getoor et al. 2007).
General case: a new application of the fast Möbius transform (Kennes and Smets 1990).

References:
Getoor, Lise, Friedman, Nir, Koller, Daphne, Pfeffer, Avi, and Taskar, Benjamin. Probabilistic relational models.
Kennes, Robert and Smets, Philippe. Computational aspects of the Möbius transformation. In UAI, 1990.
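
One way to read the 1-minus trick for a single relation is sketched below: the probability that the relation is false is one minus the probability that it is true, so the complement table never has to be materialized. The toy data are hypothetical, and the brute-force enumeration is included only to check the shortcut on a small example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy database.
STUDENTS = ["jack", "jane", "kim"]
COURSES = [100, 200]
REGISTERED = {("jack", 100), ("jack", 200), ("jane", 100), ("kim", 200)}

total_pairs = len(STUDENTS) * len(COURSES)

# Positive statistic: needs only the size of the relationship table.
P_TRUE = Fraction(len(REGISTERED), total_pairs)

# 1-minus trick: no enumeration of unregistered pairs is needed.
P_FALSE = 1 - P_TRUE

# Brute-force check that materializes the complement (infeasible at scale).
BRUTE_FORCE = Fraction(
    sum(1 for pair in product(STUDENTS, COURSES) if pair not in REGISTERED),
    total_pairs,
)
print(P_FALSE, BRUTE_FORCE)
```

The shortcut costs one table-size lookup, whereas the brute-force check grows with the product of the entity table sizes.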

11/12 The Möbius Parametrization

For two link types R1 and R2:

[Figure: two count tables. Joint probabilities: Count(*) for each combination of values of R1 and R2. Möbius parameters: Count(*) for conditions on R1 and R2 together, on R1 alone, on R2 alone, and with no condition.]
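
The relationship between the two tables can be sketched with a standard subset-style Möbius inversion. This is a generic version written for illustration, and whether it matches the authors' implementation is an assumption. The input counts are hypothetical; index bit `i` set means "relation i is required to be true", and an unset bit means "no condition" on input and "false" on output.

```python
def mobius_to_joint(positive_counts, num_relations):
    """Turn counts with only positive conditions into full joint counts.

    positive_counts[mask]: count where every relation in mask is true and
    the remaining relations are unconstrained.
    Result[mask]: count where relations in mask are true and all others false.
    """
    joint = list(positive_counts)
    for i in range(num_relations):
        bit = 1 << i
        for mask in range(len(joint)):
            if not mask & bit:
                # count(R_i false) = count(R_i unconstrained) - count(R_i true)
                joint[mask] -= joint[mask | bit]
    return joint


# Hypothetical Möbius parameters for two link types, R1 = bit 0 and R2 = bit 1:
# [no condition, R1 true, R2 true, R1 and R2 true]
POSITIVE = [100, 30, 20, 10]
JOINT = mobius_to_joint(POSITIVE, 2)
print(JOINT)  # [neither, R1 only, R2 only, both]
```

Each pass over one relation applies the 1-minus idea to every entry at once, so the work is proportional to the number of relations times the table size rather than to the size of any complement table.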

12/12 Evaluation

1. Fast: parameters in minutes or less.
2. Accurate queries/estimates.
3. Try it yourself in our demo!