MML, inverse learning and medical data-sets Pritika Sanghi Supervisors: A./Prof. D. L. Dowe Dr P. E. Tischer.



2 Overview
What is this project about?
Bayesian Networks and their limitations
Some techniques: Factor Analysis, Minimum Message Length (MML), Decision Trees & Graphs, Logistic Regression
Improving Bayesian Networks
What is being done in this project?

3 What is this project about? The aim of the project is to enhance Bayesian Networks in general and then apply them to certain medical data-sets. These data-sets have a large number of attributes and a small number of cases, which makes them difficult to model with Bayesian Networks.

4 Bayesian Networks A popular tool for data mining. They model data in order to infer the probability of a certain outcome, representing the frequency distribution of the values each attribute can take as Conditional Probability Distributions. [Figure: an example network with priors P(WS) = 0.75 and P(GO) = 0.50, a conditional probability table P(S | WS, GO) over the four combinations of WS and GO, and a table P(A | S).]
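The idea of storing a conditional probability distribution as a table keyed by parent values can be sketched as follows. Only P(WS) = 0.75 and P(GO) = 0.50 come from the slide; the entries of P(S | WS, GO) below are made-up values for illustration, since the original table is not recoverable from the transcript.

```python
P_WS = 0.75  # prior from the slide
P_GO = 0.50  # prior from the slide

# Conditional probability distribution of S given its two parents,
# stored as a table keyed by the parents' truth values (values assumed).
P_S_given = {
    (True, True): 0.90,
    (True, False): 0.60,
    (False, True): 0.40,
    (False, False): 0.05,
}

def prob_S():
    """Marginal P(S=True), summing over every parent configuration."""
    total = 0.0
    for ws in (True, False):
        for go in (True, False):
            p_ws = P_WS if ws else 1 - P_WS
            p_go = P_GO if go else 1 - P_GO
            total += p_ws * p_go * P_S_given[(ws, go)]
    return total
```

With the assumed table this gives P(S=True) = 0.61875; the point is that inference reduces to weighted lookups in the CPD.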

5 Bayesian Networks - Limitations When a child node depends on a large number of parent attributes, the conditional probability distribution (CPD) becomes very complex: 2^n rows in the CPD for n binary parent attributes. This makes both creating the CPD and inferring from it very time consuming. A more compact representation for CPDs is required.
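The exponential blow-up is easy to see by enumerating the rows directly; this small sketch just makes the 2^n count concrete.

```python
from itertools import product

def cpd_rows(n_parents):
    """Enumerate every truth assignment of n binary parents:
    each assignment is one row of the child's CPD."""
    return list(product([True, False], repeat=n_parents))

# The table doubles with every extra parent: 2**n rows for n binary parents.
# 3 parents -> 8 rows; 10 parents -> 1024 rows; 30 parents -> over a billion.
```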

6 Factor Analysis Multiple attributes may be explained by a common factor. The Wallace and Freeman model for Single Factor Analysis will be implemented; this serves as dimensionality reduction. The validity of the program built will be checked using the data-sets specified in the Wallace and Freeman paper. Example: attributes A and B have a common factor F1; attributes C, D and E have a common factor F2.

7 Factor Analysis

8 The equation for Single Factor Analysis as defined by Wallace and Freeman is:

x_nk = μ_k + a_k ν_n + σ_k r_nk

where x_nk is the data, μ_k is the mean, a_k is the attribute-related term, ν_n is the record-related term, σ_k is the standard deviation and the r_nk are random variates from N(0,1).

Example data:
Size    Height   Weight
Large   Tall     Average
Large   Short    Heavy
Medium  Average
Small   Short    Light
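The generative model behind that equation can be simulated directly: each record n gets one shared factor score ν_n, and each attribute k adds its own noise. The parameter values in the test below are purely illustrative.

```python
import random

def sample_record(mu, a, sigma, rng):
    """Draw one record from the single factor model
    x_nk = mu_k + a_k * v_n + sigma_k * r_nk,
    where v_n and every r_nk are independent N(0,1) variates."""
    v = rng.gauss(0, 1)  # record-related factor score, shared by all attributes
    return [mu_k + a_k * v + sigma_k * rng.gauss(0, 1)
            for mu_k, a_k, sigma_k in zip(mu, a, sigma)]
```

Because one ν_n feeds every attribute of a record, the attributes come out correlated, which is exactly the structure a single common factor is meant to capture.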

9 The Minimum Message Length (MML) Principle Models the data as a two-part message consisting of a hypothesis H followed by the data D encoded using that hypothesis. The best model is the one with minimum message length. This is equivalent to maximising the posterior probability of the hypothesis given the data, since message length is the negative log of probability: minimising -log Pr(H) - log Pr(D|H) minimises -log Pr(H|D). Message is represented as: Hypothesis | Data
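The two-part message length can be sketched in a few lines. The candidate priors and likelihoods in the usage below are invented numbers, just to show the trade-off: a complex hypothesis may fit the data better but costs more bits to state.

```python
import math

def message_length(prior_H, likelihood_D_given_H):
    """Two-part MML message length in bits:
    -log2 P(H) (cost of stating the hypothesis)
    -log2 P(D|H) (cost of the data encoded under it)."""
    return -math.log2(prior_H) - math.log2(likelihood_D_given_H)

def best_model(candidates):
    """Pick the hypothesis with the shortest two-part message.
    candidates: dict mapping model name -> (P(H), P(D|H))."""
    return min(candidates, key=lambda h: message_length(*candidates[h]))
```

For example, with a simple model (P(H) = 0.5, P(D|H) = 0.01) against a complex one (P(H) = 0.01, P(D|H) = 0.4), the simple model wins despite its worse fit, because its statement cost is so much lower.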

10 Decision Trees and Graphs A graphical way of representing the output attribute in terms of the input attributes. Used to model the Conditional Probability Distribution of the Bayesian Network. Decision graphs are generalisations of decision trees: they merge similar sub-trees.
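A tree-structured CPD is compact when the child does not depend on every parent in every context. The tree below is hypothetical (the attribute names echo the earlier example, and the leaf probabilities are made up): S depends on GO only when WS is true, so the WS = False branch collapses to a single leaf instead of two rows.

```python
# Each internal node is (attribute, true_branch, false_branch);
# each leaf is a probability. Structure and values are illustrative only.
tree = ("WS",
        ("GO", 0.90, 0.60),  # WS=True: split further on GO
        0.05)                # WS=False: one merged leaf covers both GO values

def lookup(node, assignment):
    """Walk the tree under the given attribute assignment
    until a leaf probability is reached."""
    while not isinstance(node, float):
        attr, true_branch, false_branch = node
        node = true_branch if assignment[attr] else false_branch
    return node
```

Three leaves stand in for the four rows of the full table; merging similar sub-trees (turning the tree into a graph) pushes this saving further.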

11 Logistic Regression A mathematical modelling approach used for describing the dependence of a variable on other attributes. It will be used to define the probability of a discrete target attribute as a function of continuous attributes, via the logistic function: f(z) = 1 / (1 + e^(-z))
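A minimal sketch of that function and how it yields a probability from continuous attributes. The weights and bias here are placeholders, not fitted values.

```python
import math

def logistic(z):
    """The logistic function f(z) = 1 / (1 + e^(-z)),
    mapping any real input to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, xs):
    """P(target = 1 | xs): a linear combination of the continuous
    attributes xs, squashed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, xs))
    return logistic(z)
```

At z = 0 the output is exactly 0.5; large positive or negative z saturates towards 1 or 0, which is what makes the function suitable for a binary target.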

12 Improving Bayesian Networks Comley and Dowe (2003, 2004), building on ideas from Dowe and Wallace (1998), commenced the work of enhancing Bayesian Networks and introduced Generalised Bayesian Networks. This project extends their work by applying some of the techniques described above to Bayesian Networks.

13 What is being done in this project? Refinements to Generalised Bayesian Networks. Specifically: first, MML Single Factor Analysis will be added to Bayesian Networks; then Logistic Regression will be looked into. The Generalised Bayesian Networks will then be used to infer models from medical data-sets such as breast cancer data-sets. If time permits (which it almost definitely won't), other methods of dimensionality reduction and/or decision graphs will be pursued.

14 References
J. W. Comley and D. L. Dowe: General Bayesian Networks and Asymmetric Languages, Proceedings of the 2003 Hawaii International Conference on Statistics and Related Fields (HICS 2003), Honolulu, Hawaii, USA, 5-8 June 2003.
J. W. Comley and D. L. Dowe: Minimum Message Length and Generalised Bayesian Nets with Asymmetric Languages, in P. D. Grunwald, I. J. Myung and M. A. Pitt (eds), Advances in Minimum Description Length: Theory and Applications, MIT Press. To be published.
D. L. Dowe and C. S. Wallace: Kolmogorov complexity, minimum message length and inverse learning, in W. Robb (ed), Proceedings of the Fourteenth Biennial Australian Statistical Conference (ASC-14), Queensland, Australia, 6-10 July 1998, p. 144.
C. S. Wallace and P. R. Freeman: Single factor analysis by MML estimation, J. Royal Stat. Soc. B, 54(1), 1992.

15 More Information

16 Thank You Any questions?