ABSTRACT: We examine how to detect hidden variables when learning probabilistic models. This problem is crucial for improving our understanding of the domain and as a preliminary step that guides the learning procedure. A natural approach is to search for "structural signatures" of hidden variables. We make this basic idea concrete, and show how to integrate it with structure-search algorithms. We evaluate this method on several synthetic and real-life datasets, and show that it performs surprisingly well.

Presentation transcript:

Detecting Hidden Variables: A Structure-Based Approach
Gal Elidan, Noam Lotner, Nir Friedman (Hebrew University) and Daphne Koller (Stanford University)

Why hidden variables?
- Representation: the minimal I-map (the structure which implies only independencies that hold in the marginal distribution) is typically complex.
- Improve learning: detecting the approximate position of a hidden variable is crucial pre-processing for the EM algorithm.
- Understanding: a true hidden variable improves the quality and order of the explanation.
[Figure: a network in which a hidden variable H separates X1, X2, X3 from Y1, Y2, Y3, shown next to the denser minimal I-map over the observed variables alone (not introducing new independencies).]

The FindHidden algorithm
Search for semi-cliques by expanding 3-clique seeds. For each semi-clique S with N nodes, propose a candidate network:
(1) Introduce H as a parent of all nodes in S.
(2) Replace all incoming edges to S by edges to H.
(3) Remove all inter-S edges.
(4) Make all children of S children of H, if the result is acyclic.
(A code sketch of this construction appears at the end of this section.)

Learning: Structural EM
E-step: compute expected counts such as N(X1), N(X2), N(X3), N(H, X1, X2, X3), ... from the training data under the current network. M-step: score and parameterize the candidate structures. Re-iterate with the best candidate. (A sketch of the expected-count computation also appears below.)

Example: the Alarm network
When HR is hidden and the structure is learned from data, a dense clique appears among HR's former neighbors; FindHidden breaks this clique, and EM then adapts the structure.
[Figure: four versions of the Alarm network (nodes include PCWP, CO, HRBP, HREKG, HRSAT, ERRCAUTER, HR, HISTORY, CATECHOL, SAO2, EXPCO2, ARTCO2, ANAPHYLAXIS, PVSAT, TPR, LVFAILURE, HYPOVOLEMIA, CVP, BP): the original network, the structure learned with HR hidden, and the structures recovered by FindHidden and by EM.]

Real-life example: stock data
On stock data, FindHidden discovers a hidden market-trend variable (strong vs. stationary) as a parent of all the stock nodes (Microsoft, Dell, 3Com, Compaq, and others).

Results
- Reference: the network with no hidden variable.
- Original: the golden model for the artificial datasets; best on test data.
- Naive: a hidden variable as a parent of all nodes; acts as a straw man.
- Hidden: the best FindHidden network; it outperforms Naive and Reference, and even surpasses Original on the training data.
- The efficient Frozen EM performs as well as the inefficient Flexible EM.

Summary and Future Work
We introduced the importance of hidden variables and implemented a natural idea for detecting them. FindHidden performed surprisingly well and proved extremely useful as a preliminary step to a learning algorithm. Further extensions:
- Experiment with multi-valued hidden variables
- Explore additional structural signatures
- Use additional information such as edge confidence
- Detect hidden variables when the data is sparse
- Explore hidden variables in Probabilistic Relational Models
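The candidate construction in steps (1)-(4) is mechanical enough to sketch in code. The following is a minimal illustration, not the authors' implementation: the DAG encoding as a dict of parent sets, the helper is_acyclic, and the choice to redirect the edges from S to its children onto H in step (4) are all our assumptions.

```python
# A minimal sketch of FindHidden's candidate construction, steps (1)-(4) above.

def propose_candidate(parents, S, H="H"):
    """Build the candidate network for a semi-clique S and a new hidden node H.
    parents: {node: set of parents}; every node appears as a key."""
    S = set(S)
    cand = {v: set(ps) for v, ps in parents.items()}

    # (2) Replace all incoming edges to S by edges to H.
    cand[H] = {p for v in S for p in cand[v] if p not in S}

    # (1) + (3) H becomes the sole parent of every node in S,
    # which also removes all inter-S edges.
    for v in S:
        cand[v] = {H}

    # (4) Make all children of S children of H instead (the S -> child
    # edges are redirected to H; the slide leaves this detail open).
    for v in cand:
        if v not in S and cand[v] & S:
            cand[v] = (cand[v] - S) | {H}

    # Accept the candidate only if it is still acyclic.
    return cand if is_acyclic(cand) else None

def is_acyclic(parents):
    """Kahn-style check that the candidate is still a DAG."""
    pending = {v: set(ps) for v, ps in parents.items()}
    while pending:
        roots = [v for v, ps in pending.items() if not ps]
        if not roots:
            return False  # a cycle remains
        for r in roots:
            del pending[r]
        for ps in pending.values():
            ps.difference_update(roots)
    return True
```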
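Likewise, a minimal sketch of the E-step's expected counts for the hidden variable, assuming a helper posterior(h, case) that stands in for P(H = h | case) under the current network; any exact inference routine would do, and the helper is our placeholder, not part of the slides.

```python
# A minimal sketch of the E-step's expected counts, e.g. N(H, X1, X2, X3).

from collections import Counter

def expected_counts(data, family, h_values, posterior):
    """data: iterable of dicts {observed variable: value};
    family: the observed variables that share the count table with H."""
    counts = Counter()
    for case in data:
        key = tuple(case[x] for x in family)
        for h in h_values:
            # H is never observed, so each case is spread across H's
            # values in proportion to its posterior probability.
            counts[(h,) + key] += posterior(h, case)
    return counts

# e.g. expected_counts(data, ("X1", "X2", "X3"), (0, 1), posterior)
```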
What is a Bayesian network?
A Bayesian network represents a joint probability distribution over a set of random variables using a DAG. Example (the Asia network): Visit to Asia (V), Smoking (S), Tuberculosis (T), Lung Cancer (L), Bronchitis (B), Abnormality in Chest (A), X-Ray (X), Dyspnea (D). Each variable carries a conditional probability table, e.g.
P(D|A,B) = 0.8, P(D|¬A,B) = 0.1, P(D|A,¬B) = 0.1, P(D|¬A,¬B) = 0.01,
and the joint distribution factorizes over the DAG as
P(X1,...,Xn) = P(V) P(S) P(T|V) ... P(X|A) P(D|A,B).
Candidate structures are compared with a Bayesian scoring metric. (A small code sketch of this factorization follows below.)

Characterizing hidden variables
The following theorem helps us detect structural signatures of the presence of hidden variables: when a hidden variable H is marginalized out, the minimal I-map of the resulting distribution contains a clique over the children of H, with all parents of H connected to all children; the parents of H preserve the I-map (no new independencies are introduced). Dense substructures of this kind are therefore candidate locations for a hidden variable. (A sketch of the semi-clique search follows below.)

Applying the algorithm
EM was applied with a Fixed structure, a Frozen structure (modifying only the semi-clique neighborhood), and a Flexible structure. We choose the best-scoring candidate produced by Structural EM.
[Figure: the pipeline on the running example: the original network over X1, X2, X3, Y1, Y2, Y3 with hidden H, the structure learned without H, the FindHidden candidate, and the network after Structural EM.]
[Figure: bar charts comparing the Original, Hidden, and Naive networks by score on the training data and by log-loss on the test data, for the Stock, Tuberculosis, Insurance 1k, Alarm 1k, and Alarm 10k datasets.]
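To make the factorization concrete, here is a minimal sketch of the Asia network's joint probability as a product of local conditionals over boolean variables. Only the P(D|A,B) row above comes from the slide; every other number is an invented placeholder.

```python
# Each entry: variable -> (parents, {parent assignment: P(variable=True | parents)})
CPTS = {
    "V": ((),         {(): 0.01}),                              # placeholder
    "S": ((),         {(): 0.50}),                              # placeholder
    "T": (("V",),     {(True,): 0.05, (False,): 0.01}),         # placeholder
    "L": (("S",),     {(True,): 0.10, (False,): 0.01}),         # placeholder
    "B": (("S",),     {(True,): 0.60, (False,): 0.30}),         # placeholder
    "A": (("T", "L"), {(True, True): 1.0, (True, False): 1.0,
                       (False, True): 1.0, (False, False): 0.0}),  # placeholder
    "X": (("A",),     {(True,): 0.98, (False,): 0.05}),         # placeholder
    "D": (("A", "B"), {(True, True): 0.80, (True, False): 0.10,  # from slide
                       (False, True): 0.10, (False, False): 0.01}),
}

def joint(assignment):
    """P(full assignment) = product of the local conditionals."""
    p = 1.0
    for var, (parents, table) in CPTS.items():
        p_true = table[tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

# e.g. joint({"V": False, "S": True, "T": False, "L": False,
#             "B": True, "A": False, "X": False, "D": True})
```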
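The semi-clique search ("expansion of 3-clique seeds" above) can be sketched as greedy growth over the undirected skeleton. This assumes the definition that every node of a semi-clique is adjacent to at least half of the other nodes in it, and the expansion order is an illustrative choice; neither detail is spelled out on the slides.

```python
# A minimal sketch of semi-clique detection by expanding 3-clique seeds.

from itertools import combinations

def is_semi_clique(S, adj):
    """Assumed definition: every node of S is adjacent to
    at least half of the other nodes of S."""
    return all(len(adj[v] & (S - {v})) * 2 >= len(S) - 1 for v in S)

def expand(seed, adj):
    """Greedily absorb neighbors while the set stays a semi-clique."""
    S = set(seed)
    grew = True
    while grew:
        grew = False
        frontier = set().union(*(adj[u] for u in S)) - S
        for v in frontier:
            if is_semi_clique(S | {v}, adj):
                S.add(v)
                grew = True
    return S

def find_semi_cliques(adj):
    """adj: undirected skeleton as {node: set of neighbors}."""
    seeds = [c for c in combinations(adj, 3)
             if all(b in adj[a] for a, b in combinations(c, 2))]
    return {frozenset(expand(c, adj)) for c in seeds}
```

Each detected semi-clique then becomes the set S handed to the candidate construction sketched earlier.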