CHAPTER 9 Hidden Markov Models (cont.) Markov Decision Processes
Markov Models
Conditional Independence
Weather Example
Mini-Forward Algorithm
Example
Stationary Distributions
If we simulate the chain long enough, what happens? Uncertainty accumulates; eventually, we have no idea what the state is!
For most chains, the distribution we end up in is independent of the initial distribution. This is called the stationary distribution of the chain.
Usually, we can only predict a short time out.
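The long-run behavior described above can be sketched with a hypothetical two-state weather chain (the transition matrix here is made up for illustration): repeatedly applying the transition matrix drives any initial distribution to the same stationary distribution.

```python
import numpy as np

# Hypothetical 2-state weather chain: state 0 = sun, state 1 = rain.
# T[i, j] = P(next state = j | current state = i).
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def simulate_distribution(T, steps=1000):
    """Push a distribution through the chain for many steps."""
    p = np.array([1.0, 0.0])  # start certain it is sunny
    for _ in range(steps):
        p = p @ T             # one step of the chain
    return p

# For this chain the result is (approximately) [0.75, 0.25],
# and the same limit is reached from any starting distribution.
pi = simulate_distribution(T)
```

Starting instead from `p = [0.0, 1.0]` gives the same limit, which is exactly the "independent of the initial distribution" property on the slide.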
Example: Web Link Analysis
Mini-Viterbi Algorithm
Hidden Markov Models
HMM Applications
Filtering: Forward Algorithm
Filtering Example
MLE: Viterbi Algorithm
Viterbi Properties
Markov Decision Processes
MDP Solutions
Example Optimal Policies
Stationarity
How (Not) to Solve an MDP
The inefficient way: enumerate all policies; for each one, calculate the expected utility (discounted rewards) starting from the start state, e.g. by simulating a bunch of runs; then choose the best policy.
We'll return to a (better) idea like this later.
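A minimal sketch of this brute-force method, using a made-up two-state, two-action MDP (all transitions, rewards, and names below are illustrative assumptions, not from the slides):

```python
import random

# Hypothetical toy MDP for illustration.
# mdp[state][action] -> list of (probability, next_state, reward)
mdp = {
    0: {'a': [(0.8, 0, 1.0), (0.2, 1, 0.0)],
        'b': [(1.0, 1, 0.0)]},
    1: {'a': [(1.0, 1, 2.0)],
        'b': [(0.5, 0, 0.0), (0.5, 1, 2.0)]},
}

def simulate_return(policy, start=0, gamma=0.9, horizon=50):
    """One sampled run: accumulate discounted rewards along a trajectory."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(horizon):
        r = random.random()
        for p, next_state, reward in mdp[state][policy[state]]:
            r -= p
            if r <= 0:                    # sample an outcome by its probability
                total += discount * reward
                state = next_state
                break
        discount *= gamma
    return total

def estimate_utility(policy, runs=2000):
    """Average many simulated runs, as the slide's brute-force method does."""
    return sum(simulate_return(policy) for _ in range(runs)) / runs

# Enumerate every deterministic policy and keep the best estimate.
policies = [{0: a0, 1: a1} for a0 in 'ab' for a1 in 'ab']
best = max(policies, key=estimate_utility)
```

The cost is what makes this inefficient: the number of deterministic policies grows as |A|^|S|, and each one needs many simulated runs to estimate its utility.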
Utilities of States
Infinite Utilities?
The Bellman Equation
Example: Bellman Equations
Value Iteration
Policy Iteration
Alternate approach:
Policy evaluation: calculate utilities for a fixed policy
Policy improvement: update the policy based on the resulting utilities
Repeat until convergence. This is policy iteration; it can converge faster under some conditions.
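The evaluate/improve loop can be sketched as follows on a hypothetical two-state, two-action MDP (transition matrices and rewards are made up for illustration). Because the policy is fixed during evaluation, the utilities solve a linear system rather than a nonlinear Bellman equation:

```python
import numpy as np

# Hypothetical MDP: T[a][s, s2] = transition probability under action a,
# R[s] = reward for being in state s. All values are illustrative.
T = {0: np.array([[0.9, 0.1], [0.4, 0.6]]),
     1: np.array([[0.2, 0.8], [0.1, 0.9]])}
R = np.array([1.0, -0.5])
gamma = 0.9
n_states = 2

def evaluate(policy):
    """Policy evaluation: solve U = R + gamma * T_pi @ U for fixed policy."""
    T_pi = np.array([T[policy[s]][s] for s in range(n_states)])
    return np.linalg.solve(np.eye(n_states) - gamma * T_pi, R)

def improve(U):
    """Policy improvement: act greedily with respect to current utilities."""
    return [max(T, key=lambda a: T[a][s] @ U) for s in range(n_states)]

policy = [1, 1]                    # arbitrary initial policy
while True:
    U = evaluate(policy)           # evaluation step
    new_policy = improve(U)        # improvement step
    if new_policy == policy:       # repeat until convergence
        break
    policy = new_policy
```

With finitely many deterministic policies and a strict improvement at every step, this loop must terminate.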
Comparison
In value iteration: every pass (or "backup") updates both the policy (based on current utilities) and the utilities (based on the current policy).
In policy iteration: several passes update utilities; occasional passes update the policy.
Hybrid approaches (asynchronous policy iteration): any sequence of partial updates to either policy entries or utilities will converge if every state is visited infinitely often.
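For contrast with the policy-iteration loop, the value-iteration backup described above can be sketched on a hypothetical two-state, two-action MDP (the transitions and rewards below are made up for illustration); each pass maxes over actions, so it implicitly updates both the utilities and the greedy policy at once:

```python
import numpy as np

# Hypothetical MDP: T[a][s, s2] = transition probability under action a,
# R[s] = reward for being in state s. All values are illustrative.
T = {0: np.array([[0.9, 0.1], [0.4, 0.6]]),
     1: np.array([[0.2, 0.8], [0.1, 0.9]])}
R = np.array([1.0, -0.5])
gamma = 0.9

U = np.zeros(2)
for _ in range(200):
    # One backup: each state takes its best one-step lookahead value.
    U = R + gamma * np.array([max(T[a][s] @ U for a in T)
                              for s in range(2)])

# The greedy policy can be read off the converged utilities.
policy = [max(T, key=lambda a: T[a][s] @ U) for s in range(2)]
```

The contraction factor gamma bounds the error after each backup, so the fixed iteration count here is just a simple convergence criterion for the sketch.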