Rafi Bojmel supervised by Dr. Boaz Lerner Automatic Threshold Selection for conditional independence tests in learning a Bayesian network.


Overview
 Machine Learning (ML) investigates the mechanisms by which knowledge is acquired through experience.
 Hard-core ML-based applications: web search engines, on-line help services, document processing (text classification, OCR), biological data analysis, military applications.
 The Bayesian network (BN) has become one of the most studied machine learning models for knowledge representation, probabilistic inference and, recently, also classification.

BN Example (1): Chest Clinic (Asia) Problem
Nodes: Recent visit to Asia, Tuberculosis, Smoker, Lung cancer, Positive X-ray, Either Tuberculosis or Lung cancer, Bronchitis, Dyspnea (shortness-of-breath)
Example conditional probability tables:
  P(A=yes) = 50%,           P(A=no) = 50%
  P(D=yes | B=yes) = 90%,   P(D=no | B=yes) = 10%
  P(D=yes | B=no)  =  5%,   P(D=no | B=no)  = 95%
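The two conditional probability tables above can be written down directly. A minimal Python sketch follows; the dictionary layout and the helper name `p_dyspnea` are illustrative choices, not from the slides:

```python
# CPTs from the Chest Clinic (Asia) example, stored as plain dictionaries.
P_A = {"yes": 0.50, "no": 0.50}          # P(A): recent visit to Asia
P_D_given_B = {                          # P(D | B): dyspnea given bronchitis
    "yes": {"yes": 0.90, "no": 0.10},    # B = yes
    "no":  {"yes": 0.05, "no": 0.95},    # B = no
}

def p_dyspnea(bronchitis, dyspnea):
    """Look up P(D = dyspnea | B = bronchitis) in the table."""
    return P_D_given_B[bronchitis][dyspnea]

print(p_dyspnea("yes", "yes"))  # 0.9
```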

BN Example (2): Chest Clinic (Asia) Problem
Same network as above; the highlighted nodes form the Markov Blanket of Lung cancer (its parents, its children, and its children's other parents).

Bayesian Networks
 Learning Bayesian networks:
   Structure learning: search-and-score, constraint-based
   Parameter learning
 Inference (e.g., classification)
[Diagram: database, structure/graph, Bayesian network]

BN Structure Learning
 Database: split into a training set (model construction) and a test set (Bayesian inference, i.e., classification).
 Two main approaches to BN structure learning:
   Search-and-Score: uses a heuristic search method.
   Constraint-Based (CB): analyzes dependency relationships among nodes using conditional independence (CI) tests. The PC algorithm is a CB algorithm.
[Table: sample database over the Asia-network variables Asia, Smoker, Tuberculosis, Lung cancer, Bronchitis, Either, X-ray, Dyspnea]

PC algorithm (1)
 Inputs:
   V: set of variables (and corresponding database)
   I*(Xi, Xj | {S}) <> ε: a test of conditional independence, where I*(Xi, Xj | {S}) is the normalized conditional mutual information, Xi, Xj are any two nodes in the graph, and {S} is a subset of variables (other than Xi, Xj)
   ε: threshold
   Order{V}: an ordering of V
 Output: Directed Acyclic Graph (DAG)
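The CI test I*(Xi, Xj | {S}) <> ε can be sketched with a plug-in (empirical-frequency) estimate of conditional mutual information. The normalization step is omitted here, and this standard estimator is an assumption, not necessarily the exact statistic used in the talk:

```python
import math
from collections import Counter

def cond_mutual_info(data, i, j, S):
    """Plug-in estimate of I(Xi; Xj | S) from a list of discrete sample tuples:
        I(X;Y|S) = sum_{x,y,s} p(x,y,s) * log2[ p(x,y,s) p(s) / (p(x,s) p(y,s)) ]
    """
    n = len(data)
    xys = Counter((r[i], r[j], tuple(r[k] for k in S)) for r in data)
    xs = Counter((r[i], tuple(r[k] for k in S)) for r in data)
    ys = Counter((r[j], tuple(r[k] for k in S)) for r in data)
    ss = Counter(tuple(r[k] for k in S) for r in data)
    mi = 0.0
    for (x, y, s), c in xys.items():
        p_xys = c / n
        mi += p_xys * math.log2(p_xys * (ss[s] / n) / ((xs[x, s] / n) * (ys[y, s] / n)))
    return mi

def independent(data, i, j, S, eps):
    """CI test used by PC: declare Xi independent of Xj given S when I* < eps."""
    return cond_mutual_info(data, i, j, S) < eps
```

For perfectly correlated binary variables the estimate is 1 bit; for independent ones it is 0, so any small positive ε separates the two cases.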

PC algorithm (2)
 The algorithm contains three stages:
   Stage I: start from the complete graph and find an undirected graph using conditional independence tests.
   Stage II: find the head-to-head links (V-structures): X - Y - Z becomes X → Y ← Z.
   Stage III: orient all remaining links that can be oriented.
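Stage I can be sketched as an edge-thinning loop over condition sets of growing size. The adjacency-set representation and the `ci_test` callback are my own framing; `ci_test(x, y, S)` is assumed to wrap the thresholded I* test from the previous slide:

```python
from itertools import combinations

def pc_stage1(variables, ci_test, max_order=2):
    """Stage I sketch: start from the complete undirected graph and remove
    edge (X, Y) whenever some conditioning set S drawn from X's other
    neighbours makes X and Y conditionally independent."""
    adj = {v: set(variables) - {v} for v in variables}
    for order in range(max_order + 1):            # size of conditioning set S
        for x in variables:
            for y in list(adj[x]):
                neighbours = adj[x] - {y}
                for S in combinations(neighbours, order):
                    if ci_test(x, y, set(S)):     # conditionally independent
                        adj[x].discard(y)         # drop the edge X - Y
                        adj[y].discard(x)
                        break
    return adj
```

With an oracle that reports A and C independent given {B}, the sketch recovers the chain skeleton A - B - C from the complete graph on three nodes.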

PC Algorithm Simulation: Chest Clinic (Asia) Problem
[Slide animation: Stage I yields the undirected skeleton; Stage II adds the V-structures; Stage III orients the remaining edges to give the precise structure]

Threshold Selection: Existing Methods
 Arbitrary (trial-and-error) selection. Disadvantages: haphazardness, inaccuracy, time.
 Likelihood- or classifier-accuracy-based selection. Disadvantage: exponential run-time.
 The "risk" in selecting the wrong threshold:
   Too small: too many edges (spurious causal links, increased run-time)
   Too large: lose important edges (inaccuracy)

Threshold Selection: Novel Technique (1)
 Based on the probability density function (PDF) of mutual information (MI):
   Calculate the MI values, I*(Xi, Xj | {S}), for different sizes (orders) of the condition set S.
   Create histograms (a PDF estimation technique).
 Techniques to define the best threshold automatically:
   Zero-Crossing-Decision (ZCD)
   Best-Candidate (BC)

Threshold Selection: Novel Technique (2)

Zero-Crossing-Decision (ZCD)
[Figure: MI histograms for condition-set order 0 and order 1, each with its ZCD-selected threshold]
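The slides do not spell ZCD out, so the following is only one plausible reading: histogram the I* values and place the threshold at the first empty bin after the peak of (near-)independent pairs. The bin count and the exact zero-crossing rule are assumptions, not the authors' specification:

```python
def zcd_threshold(mi_values, bins=20):
    """Hedged ZCD sketch: threshold = left edge of the first empty histogram
    bin after the mode of the MI distribution (assumed reading of ZCD)."""
    lo, hi = min(mi_values), max(mi_values)
    width = (hi - lo) / bins or 1.0           # guard against a degenerate range
    counts = [0] * bins
    for v in mi_values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    peak = counts.index(max(counts))          # mode: the near-independent mass
    for b in range(peak + 1, bins):
        if counts[b] == 0:                    # first zero-crossing bin
            return lo + b * width             # use its left edge as epsilon
    return hi                                 # fallback: no gap found
```

On a sample where most pairs have MI near 0 and a few dependent pairs have MI near 1, the returned threshold falls in the gap between the two groups.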

Experiment and Results
 Classification experiments with 8 real-world databases (UCI Repository) have been performed.
 Database sizes: ,200 cases.
 Graph sizes: nodes.
 Dimension of class variable:

Summary
 The PC algorithm requires selecting a threshold for structure learning, a time-consuming process that also undermines automatic structure learning.
 Initial examination of our novel techniques indicates the potential both to keep the process automatic and to improve performance.
 Further research is being conducted to validate and improve the proposed techniques.