Lecture 16: Wrap-Up. COMP 538: Introduction to Bayesian Networks.



Phylogeny / Slide 2, Nevin L. Zhang, HKUST

Recap
- Latent class models
  - Clustering
  - Clustering criterion: conditional independence
  - Drawback: the assumption is too strong
- Hierarchical latent class (HLC) models
  - Identifiability issues: regularity, equivalence
  - Hill-climbing algorithm

Today
- Phylogenetic (evolution) trees
  - Closely related to HLC models
  - An example of viewing existing models in the framework of BNs (another example: HMMs)
  - Interesting because it eases understanding and lets techniques from one field be applied to another
- Structural EM for phylogenetic trees
- Dynamic BNs for speech understanding
  - Development of general-purpose algorithms
- Bayesian networks for classification
  - Hand-waving only

Phylogenetic Trees: Outline
- Introduction to phylogenetic trees
- Probabilistic models of evolution
- Tree reconstruction

Phylogenetic Trees
- Assumption
  - All organisms on Earth have a common ancestor.
  - This implies that any set of species is related.
- Phylogeny
  - The relationship among any set of species.
- Phylogenetic tree
  - Usually the relationship can be represented by a tree, called a phylogenetic (evolution) tree (this is not always true).

Phylogenetic Trees
[Figure: example phylogenetic tree with current-day species at the bottom (giant panda, lesser panda, moose, goshawk, vulture, duck, alligator); the vertical axis is time.]

Phylogenetic Trees
- TAXA (sequences) identify species.
- Edge lengths represent evolution time.
- Assumption: bifurcating tree topology.
[Figure: tree over time with current-day sequences AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT at the leaves and ancestral sequences such as AAGACTT, AGCACTT, AAGGCCT, AAGGCAT at internal nodes.]

Probabilistic Models of Evolution
- Characterize the relationship between taxa using substitution probabilities:
  - P(x | y, t): the probability that ancestral sequence y evolves into sequence x along an edge of length t.
  - The joint distribution factorizes over the tree: P(X7), P(X5 | X7, t5), P(X6 | X7, t6), P(S1 | X5, t1), P(S2 | X5, t2), ...
[Figure: tree with leaves s1..s4, internal nodes x5, x6, x7, and edge lengths t1..t6.]

Probabilistic Models of Evolution
- What should P(x | y, t) be?
- Two assumptions of commonly used models:
  - There are only substitutions, no insertions/deletions (the sequences are aligned), so there is a one-to-one correspondence between sites in different sequences.
  - Each site evolves independently and identically:
    P(x | y, t) = ∏_{i=1}^{m} P(x(i) | y(i), t), where m is the sequence length.
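Under these two assumptions, the sequence-level substitution probability is just a product of per-site terms. A minimal sketch, where the per-site model `p_site` is a placeholder and the toy numbers are illustrative only:

```python
def seq_substitution_prob(x, y, t, p_site):
    """P(x | y, t) for aligned sequences x, y of equal length,
    assuming sites evolve independently and identically."""
    assert len(x) == len(y), "sequences must be aligned (same length)"
    prob = 1.0
    for xi, yi in zip(x, y):
        prob *= p_site(xi, yi, t)
    return prob

# Toy per-site model (hypothetical numbers, ignoring t): a base stays the
# same with probability 0.7, changes to any other base with probability 0.1.
def toy_p_site(x, y, t):
    return 0.7 if x == y else 0.1

p = seq_substitution_prob("AGC", "AGT", 1.0, toy_p_site)  # 0.7 * 0.7 * 0.1
```

The per-site model would normally come from a character evolution model such as Jukes-Cantor, introduced on the next slide.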

Probabilistic Models of Evolution
- What should P(x(i) | y(i), t) be?
  - Jukes-Cantor (character evolution) model [1969], with substitution rate α (constant or parameter?):

            A    C    G    T
        A  r_t  s_t  s_t  s_t
        C  s_t  r_t  s_t  s_t
        G  s_t  s_t  r_t  s_t
        T  s_t  s_t  s_t  r_t

    where r_t = 1/4 (1 + 3 e^{-4αt}) and s_t = 1/4 (1 - e^{-4αt}).
  - Limit values when t = 0 or t = infinity?
- Multiplicativity (lack of memory)
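The Jukes-Cantor entries can be computed directly. A small sketch (`alpha` is the substitution rate); the asserts answer the limit-value question on the slide: at t = 0 a site surely stays the same, and as t grows every base becomes equally likely:

```python
import math

def jc_site_prob(x, y, t, alpha=1.0):
    """Jukes-Cantor (1969) per-site substitution probability P(x | y, t):
    r_t = 1/4 (1 + 3 e^{-4*alpha*t}) if x == y,
    s_t = 1/4 (1 - e^{-4*alpha*t})   otherwise."""
    if x == y:
        return 0.25 * (1.0 + 3.0 * math.exp(-4.0 * alpha * t))
    return 0.25 * (1.0 - math.exp(-4.0 * alpha * t))

# Limit behaviour: certainty at t = 0, uniform (1/4 each) as t -> infinity.
assert jc_site_prob("A", "A", 0.0) == 1.0
assert jc_site_prob("A", "C", 0.0) == 0.0
assert abs(jc_site_prob("A", "A", 1e6) - 0.25) < 1e-9
```

Multiplicativity (lack of memory) can be checked numerically: chaining the model over edge lengths t1 and t2 gives the same probabilities as a single edge of length t1 + t2.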

Tree Reconstruction
- Given: a collection of current-day taxa, e.g. AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT.
- Find: a tree
  - Tree topology T
  - Edge lengths t
- Maximum likelihood: find the tree that maximizes P(data | tree).
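Evaluating P(data | tree) for one candidate tree is typically done site by site with Felsenstein's pruning algorithm (standard in this literature, though not spelled out on the slide). A sketch under the Jukes-Cantor model; the tuple-based tree encoding is my own:

```python
import math

BASES = "ACGT"

def jc(x, y, t, alpha=1.0):
    """Jukes-Cantor per-site substitution probability P(x | y, t)."""
    if x == y:
        return 0.25 * (1 + 3 * math.exp(-4 * alpha * t))
    return 0.25 * (1 - math.exp(-4 * alpha * t))

def site_likelihood(node, site):
    """Felsenstein pruning. Returns a dict a -> P(leaves below node | node = a).
    A node is ('leaf', taxon_index) or ('internal', [(child, edge_length), ...])."""
    if node[0] == "leaf":
        obs = site[node[1]]
        return {a: (1.0 if a == obs else 0.0) for a in BASES}
    partial = {a: 1.0 for a in BASES}
    for child, t in node[1]:
        child_l = site_likelihood(child, site)
        for a in BASES:
            partial[a] *= sum(jc(b, a, t) * child_l[b] for b in BASES)
    return partial

def tree_log_likelihood(root, taxa):
    """log P(taxa | tree): sites i.i.d., uniform root distribution
    (the Jukes-Cantor stationary distribution)."""
    total = 0.0
    for i in range(len(taxa[0])):
        site = [s[i] for s in taxa]
        root_l = site_likelihood(root, site)
        total += math.log(sum(0.25 * root_l[a] for a in BASES))
    return total

# Toy example: two taxa joined at a root, both edges of length 0.1.
taxa = ["AGG", "AGC"]
root = ("internal", [(("leaf", 0), 0.1), (("leaf", 1), 0.1)])
ll = tree_log_likelihood(root, taxa)
```

A useful sanity check is that, for a fixed tree, the site probabilities sum to one over all possible leaf observations.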

Tree Reconstruction
- When restricted to one particular site, a phylogenetic tree is an HLC model where:
  - The structure is a binary tree and all variables share the same state space.
  - The conditional probabilities come from the character evolution model, parameterized by edge lengths instead of the usual parameterization.
  - The model is the same for all sites.

Tree Reconstruction
- Current-day taxa: AGGGCAT, TAGCCCA, TAGACTT, AGCACAA, AGCGCTT.
- These provide samples for the HLC model, one sample per site; the samples are i.i.d.
  - 1st site: (A, T, T, A, A)
  - 2nd site: (G, A, A, G, G)
  - 3rd site: (G, G, G, C, C)
  - ...
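Turning the aligned taxa into one i.i.d. sample per site is just reading off the columns of the alignment; a minimal sketch:

```python
taxa = ["AGGGCAT", "TAGCCCA", "TAGACTT", "AGCACAA", "AGCGCTT"]

# One sample per site: the i-th sample is the tuple of i-th characters,
# one character per taxon.
samples = [tuple(seq[i] for seq in taxa) for i in range(len(taxa[0]))]
```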

Tree Reconstruction
- Finding the ML phylogenetic tree == finding the ML HLC model.
- Model space:
  - Model structures: binary trees where all variables share the same state space, which is known.
  - Parameterization: one parameter per edge (in general, P(x|y) has (|x| - 1)|y| free parameters).

Bayesian Networks for Classification
- The problem:
  - Given data:

        A1  A2  ...  An  C
        0   1   ...  0   T
        1   0   ...  1   F
        ...

  - Find a mapping (A1, A2, ..., An) |- C
- Possible solutions:
  - ANN
  - Decision tree (Quinlan)
  - ...

Bayesian Networks for Classification
- Naive Bayes model
  - From data, learn P(C) and P(Ai | C).
  - Classification: arg max_c P(C=c | A1=a1, ..., An=an).
  - Very good in practice.
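As a concrete illustration (not from the slides), a minimal naive Bayes classifier over binary attributes; the toy data set and the add-one (Laplace) smoothing are my own choices:

```python
from collections import defaultdict

def train_nb(rows):
    """rows: list of (attributes, label) pairs with binary attributes.
    Learns P(C) and P(Ai | C) with add-one smoothing; returns a classifier."""
    class_count = defaultdict(int)
    attr_count = defaultdict(int)   # (class, attr_index, value) -> count
    for attrs, c in rows:
        class_count[c] += 1
        for i, v in enumerate(attrs):
            attr_count[(c, i, v)] += 1

    def classify(attrs):
        # arg max_c P(C=c) * prod_i P(Ai=ai | C=c)
        best, best_p = None, -1.0
        total = len(rows)
        for c, nc in class_count.items():
            p = (nc + 1) / (total + len(class_count))
            for i, v in enumerate(attrs):
                p *= (attr_count[(c, i, v)] + 1) / (nc + 2)
            if p > best_p:
                best, best_p = c, p
        return best

    return classify

rows = [((0, 1), "T"), ((0, 1), "T"), ((1, 0), "F"), ((1, 1), "F")]
classify = train_nb(rows)
```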

Bayesian Networks for Classification
- Drawback of NB:
  - It assumes the attributes are mutually independent given the class variable.
  - This is often violated, leading to double counting.
- Fixes:
  - General BN classifiers
  - Tree-augmented naive Bayes (TAN) models
  - Hierarchical NB
  - ...

Bayesian Networks for Classification
- General BN classifier:
  - Treat the class variable just as another variable.
  - Learn a BN.
  - Classify the next instance based on the values of the variables in the Markov blanket of the class variable.
  - Pretty bad in practice because it does not utilize all available information.

Bayesian Networks for Classification
- TAN model
  - Friedman, N., Geiger, D., and Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29.
  - Captures dependence among attributes using a tree structure.
  - Learning:
    - First learn a tree among the attributes using the Chow-Liu algorithm.
    - Then add the class variable and estimate the parameters.
  - Classification: arg max_c P(C=c | A1=a1, ..., An=an).
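The tree-learning step can be sketched as: weight each attribute pair by its mutual information conditioned on the class (as in Friedman et al.), then take a maximum-weight spanning tree. A minimal sketch using empirical counts and Prim's algorithm; the toy data set is my own:

```python
import math
from collections import Counter
from itertools import combinations

def cond_mutual_info(data, i, j):
    """I(Ai; Aj | C) estimated from rows of (attributes, class)."""
    n = len(data)
    c_ij = Counter((c, a[i], a[j]) for a, c in data)
    c_i = Counter((c, a[i]) for a, c in data)
    c_j = Counter((c, a[j]) for a, c in data)
    c_c = Counter(c for _, c in data)
    mi = 0.0
    for (c, ai, aj), n_ij in c_ij.items():
        p_ij_c = n_ij / c_c[c]
        p_i_c = c_i[(c, ai)] / c_c[c]
        p_j_c = c_j[(c, aj)] / c_c[c]
        mi += (n_ij / n) * math.log(p_ij_c / (p_i_c * p_j_c))
    return mi

def tan_tree(data, n_attrs):
    """Maximum-weight spanning tree over attributes (Prim's algorithm),
    with edge weight I(Ai; Aj | C)."""
    w = {(i, j): cond_mutual_info(data, i, j)
         for i, j in combinations(range(n_attrs), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < n_attrs:
        best = max(((i, j) for i, j in w
                    if (i in in_tree) != (j in in_tree)),
                   key=lambda e: w[e])
        edges.append(best)
        in_tree |= set(best)
    return edges

# Toy data: A0 and A1 agree within each class; A2 is constant.
data = [((0, 0, 0), "T"), ((1, 1, 0), "T"), ((0, 0, 0), "F"), ((1, 1, 0), "F")]
edges = tan_tree(data, 3)
```

With this data the tree should link A0 and A1, since their class-conditional mutual information is positive while A2 carries none.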

Bayesian Networks for Classification
- Hierarchical naive Bayes models
  - N. L. Zhang, T. D. Nielsen, and F. V. Jensen (2002). Latent variable discovery in classification models. Artificial Intelligence in Medicine, to appear.
  - Capture dependence among attributes using latent variables.
  - Can detect interesting latent structures besides doing classification.
  - Currently slow.