1
Introduction to Sequence Models
2
Sequences
Many types of information involve sequences:
- Financial data
- DNA
- Robot motion
- Text: “Jack Flash sat on a candle stick.”
3
Sequence Models
Sequence models try to describe how an element of a sequence depends on previous (or sometimes following) elements. For instance, a financial model might try to predict a stock’s price tomorrow, given its prices over the past few weeks. As another example, a robot motion model tries to predict where a robot will be, given its current location and the commands given to its motors.
4
Types of Sequence Models
“Continuous-time” models try to describe situations where things change continuously, or smoothly, as a function of time.
- For instance: weather models, models from physics and engineering describing how gases or liquids behave over time, some financial models, …
- These typically involve differential equations.
- We won’t be talking about these.
5
Types of Sequence Models
“Discrete-time” models try to describe situations where the environment provides information periodically, rather than continuously.
- For instance, if stock prices are quoted once per day, or once per hour, or once per time period T, then the result is a discrete sequence of data.
- The price of a stock as it fluctuates continuously throughout the day is a continuous sequence of data.
We’ll cover two examples of discrete-time sequence models:
- Hidden Markov Models (used in NLP and machine learning)
- Particle Filters (used primarily in robotics)
6
Hidden Markov Models
How students spend their time (observed once per time interval T):
[Diagram: three states – Sleep, Study, and Video games – with a transition probability labeling each edge.]
A Markov model consists of:
- a set of states,
- a set of transitions (edges) from one state to the next,
- a conditional probability P(destination state | source state) for each transition.
7
Quiz: Markov Models
How students spend their time (observed once per time interval T):
[Diagram: the Sleep/Study/Video games chain from the previous slide.]
Suppose a student starts in the Study state.
- What is P(Study) in the next time step?
- What about P(Study) after two time steps?
- And P(Study) after three time steps?
8
Answer: Markov Models
How students spend their time (observed once per time interval T):
[Diagram: the Sleep/Study/Video games chain from the previous slide.]
Suppose a student starts in the Study state.
- What is P(Study) in the next time step? 0.4
- What about P(Study) after two time steps? 0.4*0.4 + 0.3*0.2 + 0.3*0.1 = 0.16 + 0.06 + 0.03 = 0.25
- And P(Study) after three time steps? … it gets complicated.
9
Simpler Example
[Diagram: two states, Sleep and Study. From Sleep: 0.5 back to Sleep, 0.5 to Study. From Study: 1 to Sleep, 0 to Study.]
Suppose the student starts asleep.
- What is P(Sleep) after 1 time step?
- What is P(Sleep) after 2 time steps?
- What is P(Sleep) after 3 time steps?
10
Answer: Simpler Example
[Diagram: two states, Sleep and Study. From Sleep: 0.5 back to Sleep, 0.5 to Study. From Study: 1 to Sleep, 0 to Study.]
Suppose the student starts asleep.
- What is P(Sleep) after 1 time step? 0.5
- What is P(Sleep) after 2 time steps? 0.5*0.5 + 0.5*1 = 0.75
- What is P(Sleep) after 3 time steps? 0.5*0.5*0.5 + 0.5*1*0.5 + 0.5*0.5*1 + 0.5*0*1 = 0.625
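The path sums above are easy to check mechanically. Here is a minimal Python sketch that propagates the starting distribution through the two-state chain one step at a time; the dictionary-based representation is just one convenient choice, not anything from the slides.

```python
# Two-state Markov chain from the slide.
# From Sleep: stay asleep 0.5, start studying 0.5.
# From Study: fall asleep 1.0, keep studying 0.0.
T = {
    "Sleep": {"Sleep": 0.5, "Study": 0.5},
    "Study": {"Sleep": 1.0, "Study": 0.0},
}

def step(dist, T):
    """Advance a probability distribution over states by one time step."""
    out = {s: 0.0 for s in T}
    for src, p in dist.items():
        for dst, q in T[src].items():
            out[dst] += p * q
    return out

dist = {"Sleep": 1.0, "Study": 0.0}   # the student starts asleep
for t in range(1, 4):
    dist = step(dist, T)
    print(t, dist["Sleep"])           # 1 0.5, 2 0.75, 3 0.625
```

Summing over all predecessor states in `step` is exactly the same bookkeeping as enumerating the paths by hand, so the three printed values match the slide’s answers.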
11
Stationary Distribution
What happens after many, many time steps?
We’ll make three assumptions about the transition probabilities:
1. It’s possible to get from any state to any other state.
2. On average, the number of time steps it takes to get from one state back to itself is finite.
3. There are no cycles (or periods).
Any Markov chains in this course will have these properties; in practice, most do anyway.
[Diagram: the two-state Sleep/Study chain.]
12
Stationary Distribution
What happens after many, many time steps? If those assumptions are true, then:
- After enough time steps, the probability of each state converges to a stationary distribution.
- This means that the probability at one time step is the same as the probability at the next time step, and the one after that, and the one after that, …
[Diagram: the two-state Sleep/Study chain.]
13
Stationary Distribution
Let’s compute the stationary distribution for this Markov chain. Let P_t be the probability distribution over states at time step t. For big enough t, P_t(Sleep) = P_{t-1}(Sleep).
P_t(Sleep) = P_{t-1}(Sleep)*0.5 + P_{t-1}(Study)*1
x = 0.5x + 1*(1-x)
1.5x = 1
x = 2/3
[Diagram: the two-state Sleep/Study chain.]
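As a sanity check on the algebra, repeatedly applying the update P_t(Sleep) = 0.5*P_{t-1}(Sleep) + 1*P_{t-1}(Study) should converge to 2/3 from any starting distribution. A minimal sketch:

```python
# Iterate the Sleep/Study chain; the fixed point should match x = 2/3.
p_sleep = 1.0                                # start asleep (any start works)
for _ in range(100):
    # P_t(Sleep) = 0.5 * P_{t-1}(Sleep) + 1.0 * P_{t-1}(Study)
    p_sleep = 0.5 * p_sleep + 1.0 * (1 - p_sleep)
print(abs(p_sleep - 2 / 3) < 1e-9)           # → True
```

The update is a contraction (each iteration halves the distance to the fixed point), which is why a hundred iterations is far more than enough.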
14
Quiz: Stationary Distribution
Compute the stationary distribution for this Markov chain.
[Diagram: two states, A and B; P(A|A) = 0.75, P(B|A) = 0.25, P(A|B) = 0.6, P(B|B) = 0.4.]
15
Answer: Stationary Distribution
Compute the stationary distribution for this Markov chain.
P_t(A) = P_{t-1}(A)
P_t(A) = P_{t-1}(A)*0.75 + P_{t-1}(B)*0.6
x = 0.75x + 0.6(1-x)
0.85x = 0.6
x = 0.6 / 0.85 ≈ 0.71
[Diagram: two states, A and B; P(A|A) = 0.75, P(B|A) = 0.25, P(A|B) = 0.6, P(B|B) = 0.4.]
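The same check works for the A/B chain: take the closed-form solution of 0.85x = 0.6 and confirm that iterating the chain converges to the same value. A quick sketch:

```python
# Closed-form stationary probability of state A: 0.85x = 0.6.
x = 0.6 / 0.85                        # ≈ 0.7059

# Confirm by iterating P_t(A) = 0.75*P_{t-1}(A) + 0.6*P_{t-1}(B).
p_a = 0.5                             # any starting distribution works
for _ in range(200):
    p_a = 0.75 * p_a + 0.6 * (1 - p_a)
print(abs(p_a - x) < 1e-9)            # → True
```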
16
Learning Markov Model Parameters
There are six probabilities associated with this two-state Markov model:
1. Initial state probabilities P_0(A) and P_0(B).
2. Transition probabilities P(A|A), P(B|A), P(A|B), and P(B|B).
[Diagram: two states, A and B, with all four transition probabilities and both initial-state probabilities unknown.]
17
Learning Markov Model Parameters
Here is a sequence of observations from our Markov model: BAAABABBAAA
Use maximum likelihood to estimate these parameters.
1. P_0(A) = 0/1, P_0(B) = 1/1.
2. P(A|A) = 4/6 = 2/3, P(B|A) = 2/6 = 1/3.
3. P(A|B) = 3/4, P(B|B) = 1/4.
[Diagram: the two-state A/B chain with unknown parameters.]
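The counting above can be automated: the maximum-likelihood estimate of each transition probability is just a transition count divided by the total number of transitions out of the source state. A minimal sketch for the sequence on this slide:

```python
from collections import Counter

# Count consecutive pairs (source state, destination state) in the sequence.
seq = "BAAABABBAAA"
counts = Counter(zip(seq, seq[1:]))

from_a = counts[("A", "A")] + counts[("A", "B")]   # transitions out of A
from_b = counts[("B", "A")] + counts[("B", "B")]   # transitions out of B

print(counts[("A", "A")] / from_a)   # P(A|A) = 4/6 ≈ 0.667
print(counts[("B", "A")] / from_b)   # P(A|B) = 3/4 = 0.75
```

`zip(seq, seq[1:])` pairs each state with its successor, so `counts[("B", "A")]` is the number of B→A transitions, matching the hand counts on the slide.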
18
Quiz: Learning Markov Model Parameters
Here is a sequence of observations from our Markov model: AAABBBBBABBBA
Use maximum likelihood to estimate these parameters.
[Diagram: the two-state A/B chain with unknown parameters.]
19
Answer: Learning Markov Model Parameters
Here is a sequence of observations from our Markov model: AAABBBBBABBBA
Use maximum likelihood to estimate these parameters.
1. P_0(A) = 1/1, P_0(B) = 0/1.
2. P(A|A) = 2/4, P(B|A) = 2/4.
3. P(A|B) = 2/8 = 1/4, P(B|B) = 6/8 = 3/4.
[Diagram: the two-state A/B chain with unknown parameters.]
20
Restrictions on Markov Models
[Diagram: the Sleep/Study/Video games transition diagram.]
- The probability of the next state depends only on the previous state, not on any of the states before that (the Markov assumption).
- Transition probabilities cannot change over time (the stationary assumption).
21
Observations and Latent States
Markov models don’t get used much in AI. The reason is that Markov models assume you know exactly what state you are in at each time step. This is rarely true for AI agents. Instead, we will say that the agent has a set of possible latent states – states that are not directly observed by, or known to, the agent. In addition, the agent has sensors that allow it to sense some aspects of the environment – to take measurements, or observations.
22
Hidden Markov Models
Suppose you are the parent of a college student, and would like to know how studious your child is. You can’t observe them at all times, but you can periodically call and see if your child answers.
[Diagram: hidden states H1, H2, H3, … form a Sleep/Study Markov chain; each hidden state Ht emits an observation Ot – whether the call is answered or not.]
23
Hidden Markov Models
Here’s the same model, with probabilities in tables.
[Diagram: hidden states H1 → H2 → H3 → …, each with an observation Ot.]
- Initial state: P(H1=Sleep) = 0.5, P(H1=Study) = 0.5.
- Transitions (same at every step): P(Ht=Sleep | Ht-1=Sleep) = 0.6, P(Ht=Sleep | Ht-1=Study) = 0.5.
- Observations (same at every step): P(Ot=Answer | Ht=Sleep) = 0.1, P(Ot=Answer | Ht=Study) = 0.8.
24
Hidden Markov Models
HMMs (and MMs) are a special type of Bayes Net. Everything you have learned about BNs applies here.
[Diagram: the HMM graph and probability tables repeated from the previous slide.]
25
Quick Review of BNs for HMMs
[Diagram: the two basic fragments of the HMM graph – an observation edge H1 → O1 and a transition edge H1 → H2.]
26
Hidden Markov Models
[Diagram: the first time step of the HMM, H1 → O1, with the initial, transition, and observation probability tables from before.]
27
Hidden Markov Models
[Diagram: the first two time steps of the HMM, H1 → H2, with observations O1 and O2 and the probability tables from before.]
28
Quiz: Hidden Markov Models
[Diagram: H1 → H2 with observations O1 and O2, and the probability tables from before.]
Suppose a parent calls twice, once at time step 1 and once at time step 2. The first time, the child does not answer, and the second time the child does. Now what is P(H2=Sleep)?
29
Answer: Hidden Markov Models
[Diagram: H1 → H2 with observations O1 and O2, and the probability tables from before.]
We want P(H2=Sleep | O1=no answer, O2=answer).
First, fold in the evidence at time 1: P(H1=Sleep, no answer) = 0.5 * 0.9 = 0.45, and P(H1=Study, no answer) = 0.5 * 0.2 = 0.1.
Then transition and fold in the evidence at time 2:
P(H2=Sleep, evidence) = (0.45*0.6 + 0.1*0.5) * 0.1 = 0.032
P(H2=Study, evidence) = (0.45*0.4 + 0.1*0.5) * 0.8 = 0.184
Normalizing: P(H2=Sleep | evidence) = 0.032 / (0.032 + 0.184) ≈ 0.148.
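This two-step calculation is an instance of the forward algorithm for HMMs. A minimal Python sketch using the probabilities from the tables (the dictionary names are my own; only the numbers come from the slides):

```python
# Forward algorithm on the two-observation HMM from the slides.
# Hidden states: Sleep, Study. Observation: whether the child answers.
P_H1 = {"Sleep": 0.5, "Study": 0.5}                 # initial distribution
P_trans = {"Sleep": {"Sleep": 0.6, "Study": 0.4},   # P(next | current)
           "Study": {"Sleep": 0.5, "Study": 0.5}}
P_ans = {"Sleep": 0.1, "Study": 0.8}                # P(answer | state)

# Evidence: no answer at t=1, answer at t=2.
alpha1 = {h: P_H1[h] * (1 - P_ans[h]) for h in P_H1}
alpha2 = {}
for h2 in P_trans:
    total = sum(alpha1[h1] * P_trans[h1][h2] for h1 in alpha1)
    alpha2[h2] = total * P_ans[h2]

p = alpha2["Sleep"] / sum(alpha2.values())
print(round(p, 3))   # → 0.148
```

Each `alpha` value is the joint probability of the evidence so far and the hidden state; normalizing at the end turns the joint values into the conditional probability asked for in the quiz.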