Hidden Markov Chains
February 2007
Magnos Martinello
Universidade Federal do Espírito Santo - UFES
Departamento de Informática - DI
Laboratório de Pesquisas em Redes Multimídia - LPRM

History

- Andrey (Andrei) Andreyevich Markov (Russian: Андрей Андреевич Марков) (June 14, 1856 N.S. – July 20, 1922) was a Russian mathematician. He is best known for his work on the theory of stochastic processes; his research later became known as Markov chains.
- His son, also named Andrey Andreevich Markov, was also a notable mathematician.

Markov Chains

- In mathematics, a Markov chain, named after Andrey Markov, is a discrete-time or continuous-time stochastic process with the Markov property.
- A Markov chain is a series of states of a system that has the Markov property.
- A series with the Markov property is a sequence of states for which the conditional probability distribution of a future state can be deduced using only the current state.

Formal Definition

- A Markov chain is a sequence of random variables X1, X2, X3, ... with the Markov property, namely that, given the present state, the future and past states are independent. Formally,

  P(X_{n+1} = x | X_1 = x_1, X_2 = x_2, ..., X_n = x_n) = P(X_{n+1} = x | X_n = x_n)

- The possible values of Xi form a countable set S called the state space of the chain. Markov chains are often described by a directed graph whose edges are labeled by the probabilities of going from one state to the other states (a small simulation sketch follows below).
- A finite state machine is an example of a Markov chain.
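
A minimal sketch in Python of this idea, assuming a hypothetical two-state chain stored as a dictionary of transition probabilities; the simulator picks each next state using only the current one, which is exactly the Markov property. The state names and numbers are illustrative and not taken from the slides.

```python
import random

# Hypothetical two-state chain; the numbers are illustrative only.
transitions = {
    "sunny": {"sunny": 0.9, "rainy": 0.1},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}

def simulate(chain, start, steps, seed=0):
    """Sample a trajectory; the next state depends only on the current state."""
    rng = random.Random(seed)
    state, path = start, [start]
    for _ in range(steps):
        next_states = list(chain[state])
        weights = [chain[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path

print(simulate(transitions, "sunny", 10))
```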

Properties

- Reducibility: a Markov chain is said to be irreducible if its state space is a single communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state (a reachability check is sketched below).
- Periodicity: a state i has period k if any return to state i must occur in some multiple of k time steps and k is the largest number with this property. If k = 1, the state is said to be aperiodic.
- Recurrence: a state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return to i.
  - If a state i is not transient (it has finite hitting time with probability 1), it is said to be recurrent or persistent.
  - A state i is called absorbing if it is impossible to leave this state.
- Ergodicity: a state i is said to be ergodic if it is aperiodic and positive recurrent.
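
A small sketch of the reducibility check, assuming the chain is given as a row-stochastic numpy array P with states indexed 0..n-1: the chain is irreducible exactly when every state can reach every other state through the directed graph of non-zero transition probabilities.

```python
import numpy as np

def is_irreducible(P):
    n = P.shape[0]
    adjacency = (P > 0).astype(int)          # edge i -> j iff P[i, j] > 0
    reach = np.linalg.matrix_power(np.eye(n, dtype=int) + adjacency, n - 1)
    return bool((reach > 0).all())           # all states communicate

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(is_irreducible(P))   # True: each state can be reached from the other
```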

Scientific Applications

- Markovian systems appear extensively in physics, particularly in statistical mechanics.
- Markov chains can also be used to model various processes in queueing theory and statistics. Claude Shannon's famous 1948 paper "A Mathematical Theory of Communication", which at a single stroke created the field of information theory, opens by introducing the concept of entropy through Markov modeling (enabling effective data compression through entropy coding techniques). Markov models also allow effective state estimation and pattern recognition.

Scientific Applications

- The PageRank of a webpage as used by Google is defined by a Markov chain: it is the probability of being at page i in the stationary distribution of a Markov chain over all (known) webpages.
- Markov models have also been used to analyze the web navigation behavior of users. A user's web link transitions on a particular website can be modeled using first- or second-order Markov models.
- Markov chain methods have also become very important for generating sequences of random numbers that accurately reflect very complicated desired probability distributions, a process called Markov chain Monte Carlo (MCMC). In recent years this has revolutionised the practicality of Bayesian inference methods.
- Markov parody generator (Emacs, M-x dissociated-press).

Weather Prediction Model

- The probabilities of weather conditions, given the weather on the preceding day, can be represented by a transition matrix P (see the sketch below).
- Pij is the probability that, if a given day is of type i, it will be followed by a day of type j.
- Note that the rows of P sum to 1: this is because P is a stochastic matrix.
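
A sketch of this transition matrix in numpy. The sunny row (0.9, 0.1) follows from the 90% sunny-after-sunny figure quoted on the next slide; the rainy row (0.5, 0.5) is an assumption, chosen because it reproduces the 83% long-run result given in the conclusion.

```python
import numpy as np

states = ["sunny", "rainy"]
# P[i, j] = probability that a day of type i is followed by a day of type j
P = np.array([[0.9, 0.1],    # from a sunny day
              [0.5, 0.5]])   # from a rainy day (assumed values)

# P is a stochastic matrix: every row sums to 1.
assert np.allclose(P.sum(axis=1), 1.0)
```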

Predicting the Weather

- The weather on day 0 is known to be sunny. This is represented by a vector x(0) in which the "sunny" entry is 100% and the "rainy" entry is 0%:

  x(0) = (1, 0)

- The weather on day 1 can be predicted by multiplying this vector by the transition matrix:

  x(1) = x(0) P

- Thus, there is a 90% chance that day 1 will also be sunny.
- The weather on day 2 can be predicted in the same way:

  x(2) = x(1) P = x(0) P^2

- The general rule for day n is:

  x(n) = x(n-1) P = x(0) P^n
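
Continuing the sketch above (same assumed matrix), the day-by-day predictions can be reproduced with a few matrix-vector products:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
x0 = np.array([1.0, 0.0])                 # day 0: sunny with probability 1

x1 = x0 @ P                               # [0.9, 0.14-0.1] -> [0.9, 0.1]: 90% chance day 1 is sunny
x2 = x1 @ P                               # [0.86, 0.14]
xn = x0 @ np.linalg.matrix_power(P, 30)   # general rule: x(n) = x(0) P^n
print(x1, x2, xn)                         # xn is already close to the steady state
```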

Steady State

- In this example, predictions for the weather on more distant days are increasingly inaccurate and tend towards a steady state vector.
- The steady state vector q is defined as the limit of the day-n prediction:

  q = lim_{n -> infinity} x(n)

- Since q is independent of the initial conditions, it must be unchanged when transformed by P, i.e. q P = q.

Steady State

- Writing out the first component of q P = q gives:

  -0.1 q1 + 0.5 q2 = 0

Conclusion

- Since q is a probability vector, we also know that q1 + q2 = 1.
- Solving this pair of simultaneous equations gives the steady state distribution:

  q = (q1, q2) = (5/6, 1/6) ≈ (0.833, 0.167)

- In conclusion, in the long term, about 83% of days are sunny.
- For the most prolific example of the use of Markov chains, see Google: the PageRank algorithm is basically a Markov chain over the graph of the web. It is described in the seminal paper "The PageRank Citation Ranking: Bringing Order to the Web" by Larry Page, Sergey Brin, R. Motwani, and T. Winograd.
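
A sketch that solves the two steady-state equations numerically, under the same assumed transition matrix, and recovers the roughly 83% / 17% split:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Stack the balance equations (P^T - I) q = 0 with the normalisation sum(q) = 1
# and solve the resulting system by least squares.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
q, *_ = np.linalg.lstsq(A, b, rcond=None)
print(q)    # approximately [0.833, 0.167]
```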

Definition

- A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable parameters. The extracted model parameters can then be used to perform further analysis, for example for pattern recognition applications.
- An HMM can be considered the simplest dynamic Bayesian network.

Hidden Markov Chain

[Figure: state transitions in a hidden Markov model (example)]
- x: hidden states
- y: observable outputs
- a: transition probabilities
- b: output probabilities

Intuition / Application

- In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but variables influenced by the state are visible.
- Hidden Markov models are especially known for their applications in temporal pattern recognition such as speech, handwriting and gesture recognition, musical score following, and bioinformatics.

HMMs and Their Usage

- HMMs are very common in computational linguistics:
  - Speech recognition (observed: acoustic signal; hidden: words)
  - Handwriting recognition (observed: image; hidden: words)
  - Machine translation (observed: foreign words; hidden: words in the target language)

Architecture of a Hidden Markov Model

- The usual HMM diagram shows the general architecture of the model. Each oval represents a random variable that can adopt a number of values. The random variable x(t) is the value of the hidden variable at time t; the random variable y(t) is the value of the observed variable at time t. The arrows in the diagram denote conditional dependencies.
- From the diagram, it is clear that the value of the hidden variable x(t) (at time t) depends only on the value of the hidden variable x(t − 1) (at time t − 1). This is called the Markov property. Similarly, the value of the observed variable y(t) depends only on the value of the hidden variable x(t) (both at time t).

A Capricious Modern Madame

- Assume you have a friend who lives far away and to whom you talk daily over the telephone.
- Your friend is interested in only three activities: walking in the park, shopping, and cleaning her apartment.
- The choice of what to do is determined exclusively by the weather on a given day.
- Based on what she tells you she did each day, you try to guess what the weather must have been like.

A Capricious Modern Madame

- You believe that the weather operates as a discrete Markov chain. There are two states, "Rainy" and "Sunny", but you cannot observe them directly; that is, they are hidden from you.
- On each day, there is a certain chance that your friend will perform one of the following activities, depending on the weather: "walk", "shop", or "clean". Since your friend tells you about her activities, those are the observations.
- The entire system is that of a hidden Markov model (HMM).
- You know the general weather trends in the area, and what your friend likes to do on average. In other words, the parameters of the HMM are known (a sketch with illustrative numbers follows).
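
A sketch of this example as concrete HMM parameters in Python. The state names, observation names, and the fact that all parameters are known come from the slides; the specific probability values below are illustrative assumptions only.

```python
states = ("Rainy", "Sunny")
observations = ("walk", "shop", "clean")

# Illustrative (assumed) parameters: initial, transition and emission probabilities.
start_probability = {"Rainy": 0.6, "Sunny": 0.4}

transition_probability = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}

emission_probability = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
```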

Probability of an Observed Sequence

- The probability of observing a sequence Y = y(0), y(1), ..., y(L − 1) of length L is given by:

  P(Y) = Σ_X P(Y | X) P(X)

  where the sum runs over all possible hidden node sequences X = x(0), x(1), ..., x(L − 1).
- A brute-force calculation of P(Y) is intractable for realistic problems, as the number of possible hidden node sequences is typically extremely high. The calculation can, however, be sped up enormously using an algorithm called the forward-backward procedure (sketched below).
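
A sketch of the forward half of that procedure, reusing the hypothetical parameter dictionaries (states, start_probability, transition_probability, emission_probability) defined in the sketch above. It computes P(Y) in O(L * |S|^2) time instead of enumerating every hidden state sequence.

```python
def forward_probability(obs_seq, states, start_p, trans_p, emit_p):
    # alpha[s] = P(y(0), ..., y(t), x(t) = s)
    alpha = {s: start_p[s] * emit_p[s][obs_seq[0]] for s in states}
    for y in obs_seq[1:]:
        alpha = {
            s: emit_p[s][y] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())   # P(Y), marginalised over the final hidden state

print(forward_probability(("walk", "shop", "clean"), states,
                          start_probability, transition_probability,
                          emission_probability))
```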

Using Hidden Markov Models

- There are three canonical problems associated with HMMs:
  1. Given the parameters of the model, compute the probability of a particular output sequence. This problem is solved by the forward-backward algorithm.
  2. Given the parameters of the model, find the most likely sequence of hidden states that could have generated a given output sequence. This problem is solved by the Viterbi algorithm (sketched below).
  3. Given an output sequence or a set of such sequences, find the most likely set of state transition and output probabilities; in other words, train the parameters of the HMM given a dataset of sequences. This problem is solved by the Baum-Welch algorithm.
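
As an illustration of problem 2, here is a sketch of the Viterbi algorithm, again reusing the hypothetical parameters from the example above; it returns the most probable hidden weather sequence for a given sequence of observed activities.

```python
def viterbi(obs_seq, states, start_p, trans_p, emit_p):
    # best[s] = (probability of the best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][obs_seq[0]], [s]) for s in states}
    for y in obs_seq[1:]:
        new_best = {}
        for s in states:
            prob, prev = max(
                (best[r][0] * trans_p[r][s] * emit_p[s][y], r) for r in states
            )
            new_best[s] = (prob, best[prev][1] + [s])
        best = new_best
    return max(best.values())    # (probability, most likely hidden state sequence)

prob, path = viterbi(("walk", "shop", "clean"), states, start_probability,
                     transition_probability, emission_probability)
print(path, prob)
```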