Probabilistic Prediction Algorithms Jon Radoff Biophysics 101 Fall 2002

Bayesian Decision Theory Originally developed by Thomas Bayes and published posthumously in 1763.* The general idea is that the likelihood of a future event occurring is based on the past probability that it occurred. *Thomas Bayes. An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53, 1763.

Bayes Theorem: Basic Example In its simplest form, Bayes Theorem tells us that in the absence of other evidence, the likelihood of an event is equal to its past likelihood. It assumes that the consequences of an incorrect classification are always the same (unlike, for example, classifying a patient as "infected with HIV" vs. "uninfected with HIV"). Let's say we have a basket full of apples and oranges. We remove 10 pieces of fruit from the basket and observe that 8 are apples and 2 are oranges. A friend comes along and picks a fruit, but hides it behind their back, challenging us to guess what the fruit is. What's our guess?

Bayes Theorem: Basic Example In the terminology of Bayes Theorem, we typically give each possibility a "class." We'll designate these with the set w: w = {w_1, w_2}, where w_1 = apple and w_2 = orange. Based on prior information, P(w_1) = 0.8 and P(w_2) = 0.2. Since P(w_1) > P(w_2), we guess that it is an apple.

Bayes: Multiple Evidence You might have multiple pieces of evidence. For example, in addition to knowing the likelihood of drawing a random fruit from our basket, perhaps we had previously used a colorimeter that let us associate the wavelength of light emitted from a fruit with the type of fruit it was. Informally, you can think of this as: posterior = likelihood × prior / evidence

Example: Continued Let's say we had taken those light readings earlier, and generated a graph of the probability density of detecting a particular wavelength of light given a particular fruit. We'll call the probability density p(x | w_j). The wavelength of light will be represented by the random variable x. Note that since this is a probability density, the area under either curve is always 1.

Bayes: Formal Definition Let P(x) be the probability mass of an event, and let p(x) (lowercase p) be the probability density of an event. Bayes Theorem states: P(w_j | x) = p(x | w_j) × P(w_j) / p(x)

Bayes: Example, Continued Let’s redo our original experiment. Our friend is going to take a fruit from the basket again, but he’s also going to tell us that the wavelength of light detected by his colorimeter was 575nm. What is our prediction now?

Bayes: Example, Continued Previous probability densities from our light readings: p(575 | w_1) = 0.05, p(575 | w_2) = 0.25. P(w_1 | 575) = (0.05 × 0.8) / ((0.05 × 0.8) + (0.25 × 0.2)) ≈ 0.44 P(w_2 | 575) = (0.25 × 0.2) / ((0.05 × 0.8) + (0.25 × 0.2)) ≈ 0.56 In this case, the additional evidence from the colorimeter leads us to guess that it is an orange (0.56 > 0.44).
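A minimal Python sketch of this calculation (the class names and the numbers in `priors` and `likelihoods` are simply the values from the fruit example above):

```python
# Bayes Theorem with two classes: apple (w1) and orange (w2).
priors = {"apple": 0.8, "orange": 0.2}           # P(w_j) from the basket counts
likelihoods = {"apple": 0.05, "orange": 0.25}    # p(575 nm | w_j) from the colorimeter curves

def posterior(cls):
    """P(w_j | x) = p(x | w_j) * P(w_j) / p(x), where p(x) sums over all classes."""
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return likelihoods[cls] * priors[cls] / evidence

for c in priors:
    print(c, round(posterior(c), 2))    # apple 0.44, orange 0.56 -> guess orange
```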

Bayesian Belief Networks A belief network consists of nodes labeled by their discrete states. The links between nodes represent conditional probabilities. Links are directional: when A points at B, A is said to influence B. [Diagram: a five-node network in which A and B each point at C, and C points at D and E, with the probabilities P(a), P(b), P(c|a), P(c|b), P(d|c), and P(e|c).]

Bayesian Belief Network: Example This over-simplified network illustrates how lung cancer is influenced by other states, and how the presence of particular symptoms might be influenced by lung cancer. [Diagram: the same network with A = smoker, B = asbestos exposure, C = lung cancer, D = chest pain, E = coughing; A and B point at C, and C points at D and E, with the probabilities P(a), P(b), P(c|a), P(c|b), P(d|c), and P(e|c).]

Bayesian Belief Networks: Example A human expert might provide us with the matrices of all the probabilities in the network. For example, P(cancer|smoking) would be a table with a row for each smoking history (never, former, light, heavy) and a column each for "cancer" and "healthy", and so forth for the 4 other nodes.

Bayesian Belief Networks If you have complete matrices for your belief network, you can make predictions for any state in the network given a set of input variables. According to Bayes Theorem, if we don’t know a particular state, we can simply default to the overall prior probability.

Bayesian Belief Networks: Example Continued In our example, we could now answer questions such as: What is the likelihood a person will have lung cancer given that they are a heavy smoker and have been exposed to asbestos? A person has severe coughing, chest pain and is a smoker: what is the likelihood of a cancer diagnosis? What is the likelihood of past asbestos exposure given that a person has been diagnosed with cancer?

Bayesian Belief Networks The joint probability of a particular configuration of the network is the product of the probabilities of all the states, each conditioned on its parent states: P(x_1, …, x_n) = ∏_i P(x_i | parents(x_i)). Remember that p(x) is for probability density, and P(x) is for probability mass.
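As a sketch, the joint probability of one full assignment in the lung-cancer network could be computed as below. The network structure follows the slides, but every numeric probability is a hypothetical placeholder, not a value from the original lecture:

```python
# Hypothetical conditional probability tables (CPTs) for the A..E lung-cancer network.
# All numbers are illustrative placeholders.
P_a = {True: 0.3, False: 0.7}                      # P(smoker)
P_b = {True: 0.1, False: 0.9}                      # P(asbestos exposure)
P_c = {(True, True): 0.20, (True, False): 0.10,    # P(cancer | smoker, asbestos)
       (False, True): 0.05, (False, False): 0.01}
P_d = {True: 0.6, False: 0.1}                      # P(chest pain | cancer)
P_e = {True: 0.7, False: 0.2}                      # P(coughing | cancer)

def joint(a, b, c, d, e):
    """P(a, b, c, d, e) = P(a) P(b) P(c|a,b) P(d|c) P(e|c)."""
    pc = P_c[(a, b)] if c else 1 - P_c[(a, b)]
    pd = P_d[c] if d else 1 - P_d[c]
    pe = P_e[c] if e else 1 - P_e[c]
    return P_a[a] * P_b[b] * pc * pd * pe

# Probability that a smoker with no asbestos exposure has cancer, chest pain, and coughing:
print(joint(a=True, b=False, c=True, d=True, e=True))
```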

Markov Chain A Markov chain is a type of belief network in which you have a sequence of states (x_1, x_2, …, x_i) and the probability of each state depends only on the previous state, i.e.: P(x_{i+1} | x_1, x_2, …, x_i) = P(x_{i+1} | x_i), so that P(x_1, x_2, …, x_i, x_{i+1}) = P(x_{i+1} | x_i) P(x_1, x_2, …, x_i). Some content in this section is from Matthew Wright, Hidden Markov Models.

Markov Chain: Example Let's say we know the probability of any particular nucleotide following another nucleotide in a DNA sequence. For example, the probability that a C follows an A might be written as P(AC). If we wanted to find the probability of finding the sequence ACGTC, it would be expressed as follows: P(ACGTC) = P(A) P(AC) P(CG) P(GT) P(TC)
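A minimal Python sketch of this calculation; the values in `initial` and `trans` are made-up placeholders, not real DNA statistics:

```python
# Hypothetical initial and transition probabilities for a DNA Markov chain.
initial = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
trans = {                                  # trans[x][y] = P(y follows x), placeholder values
    "A": {"A": 0.3, "C": 0.2, "G": 0.3, "T": 0.2},
    "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "G": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "T": {"A": 0.3, "C": 0.2, "G": 0.3, "T": 0.2},
}

def chain_probability(seq):
    """P(seq) = P(seq[0]) * product over i of P(seq[i] | seq[i-1])."""
    p = initial[seq[0]]
    for prev, nxt in zip(seq, seq[1:]):
        p *= trans[prev][nxt]
    return p

print(chain_probability("ACGTC"))   # P(A) P(A->C) P(C->G) P(G->T) P(T->C)
```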

Hidden Markov Models In a Hidden Markov Model (HMM), the Markov chain is expanded to include the idea of hidden states. Given a set of observations x_1, x_2, …, x_n and a set of hidden underlying states s_1, s_2, …, s_n, there is now a transition probability for moving between the hidden states: a_{k,l} = P(s_i = l | s_{i-1} = k) …where l and k are the states at positions i and i-1.

Emission Probabilities At each state, there is a probability of "emitting" a particular observation. We define this as: e_k(b) = P(x_i = b | s_i = k) …where e_k(b) is the probability that the state k at position i emits observation b, s_i is the state at position i, and x_i is the observation at that point.

Probability of a Path The probability of an individual path through a sequence of hidden states is a restatement of Bayes theorem: P(x_1, …, x_n, s_1, …, s_n) = ∏_i a_{s_{i-1}, s_i} e_{s_i}(x_i) In other words, the probability that we observe the sequence of visible states along a given path is the product, at each position, of the conditional probability that the system has made that particular transition multiplied by the probability that it emitted the observation in our target sequence.
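A sketch of this computation in Python for a generic two-state model; the numbers in `start`, `a`, and `e` are hypothetical placeholders:

```python
# Hypothetical two-state HMM over DNA observations; all numbers are placeholders.
start = {"+": 0.5, "-": 0.5}                              # P(s_1)
a = {"+": {"+": 0.8, "-": 0.2},                           # a[k][l] = P(s_i = l | s_{i-1} = k)
     "-": {"+": 0.1, "-": 0.9}}
e = {"+": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},   # e[k][b] = P(x_i = b | s_i = k)
     "-": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}}

def path_probability(states, observations):
    """Joint probability of one hidden-state path and the observed sequence."""
    p = start[states[0]] * e[states[0]][observations[0]]
    for i in range(1, len(states)):
        p *= a[states[i - 1]][states[i]] * e[states[i]][observations[i]]
    return p

print(path_probability(["+", "+", "+", "-", "-"], list("ACGTC")))
```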

HMM Example: CpG Island Let's go back to our earlier example of the sequence ACGTC, but now we'll introduce two hidden states: we may either be "in" or "out" of a CpG island. Let's designate in by + and out by -. Here is how you would depict all the transitions visually: [Trellis diagram: the observed sequence A C G T C above two rows of hidden states, A+ C+ G+ T+ C+ and A- C- G- T- C-, with transitions from each state at one position to both states at the next.]

HMM Example: CpG Island What is the probability that we're in a CpG island at position 3? (G+) We could depict all of the potential paths to this as follows: [Trellis diagram as before, highlighting every path that ends at G+ in position 3.]

Simplified Total Probability of a Path In this example, we exclude emission probabilities because there won't be a case of something such as a C- state emitting an A observation. Algebraically, the recursion of our formula for determining the probability of reaching a state simplifies to: P(state l at position i) = ∑_k P(state k at position i-1) a_{k,l} …where the sum runs over the states k available at position i-1.

HMM Example: CpG Island Since this is a recursive algorithm, we need to find position 2 before we can find position 3. The probability of C- at position 2 may be expressed as: P(C- at 2) = P(A+ at 1) a_{A+,C-} + P(A- at 1) a_{A-,C-} [Trellis diagram highlighting the transitions from A+ and A- into C-.]

HMM Example: CpG Island Likewise, the likelihood of C+ in position 2 is: P(C+ at 2) = P(A+ at 1) a_{A+,C+} + P(A- at 1) a_{A-,C+} [Trellis diagram highlighting the transitions from A+ and A- into C+.]

HMM Example: CpG Island The probability of G+ in position 3 is thus: P(G+ at 3) = P(C+ at 2) a_{C+,G+} + P(C- at 2) a_{C-,G+} [Trellis diagram highlighting the transitions from C+ and C- into G+.]
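A sketch of this recursion in Python. The full model has a transition probability a_{k,l} between every pair of (nucleotide, island flag) states; as a simplification, the placeholder values below depend only on whether we stay inside (+) or outside (-) the island, and the start probabilities are likewise hypothetical:

```python
# Forward-style recursion for the +/- CpG model over the sequence ACGTC.
# Only the state whose nucleotide matches the observation can occur at each
# position, so the emission terms are all 1 and drop out of the recursion.
seq = "ACGTC"
start = {"+": 0.5, "-": 0.5}                      # placeholder start probabilities
a = {("+", "+"): 0.7, ("+", "-"): 0.3,            # placeholder a_{k,l}, keyed by island flags
     ("-", "+"): 0.2, ("-", "-"): 0.8}

# prob[i][flag] = total probability, over all paths, of being in state (seq[i], flag)
prob = [{"+": start["+"], "-": start["-"]}]
for i in range(1, len(seq)):
    prob.append({flag: sum(prob[i - 1][prev] * a[(prev, flag)] for prev in ("+", "-"))
                 for flag in ("+", "-")})

print(prob[2]["+"])   # P(G+ at position 3), matching the slide's recursion
```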