Krishnendu Chatterjee, Formal Methods Class: MARKOV CHAINS


Slide 1: MARKOV CHAINS

Slide 2: Markov Chains
- Markov chain model: G = ((S, E), δ).
- Finite set S of states.
- Probabilistic transition function δ: S → D(S).
- E = { (s, t) | δ(s)(t) > 0 }.
- The graph (S, E) is useful for the analysis.
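The model above can be sketched in a few lines of code. The 3-state chain below is a hypothetical example (not the one in the slides); the point is how E is derived from δ and that every row of δ must be a probability distribution.

```python
# Minimal sketch of the Markov chain model G = ((S, E), delta):
# delta maps each state to a distribution over successors, and
# E = {(s, t) | delta(s)(t) > 0} is the derived edge set.
delta = {
    "a": {"a": 0.5, "b": 0.5},
    "b": {"b": 0.5, "c": 0.5},
    "c": {"c": 1.0},
}

# Each row of delta must be a probability distribution over S.
for s, row in delta.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9, f"row {s} does not sum to 1"

# The graph (S, E) underlying the chain.
S = set(delta)
E = {(s, t) for s, row in delta.items() for t, p in row.items() if p > 0}

print(sorted(E))
```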

Slides 3-5: Markov Chain: Example (figures only: an example chain, with transition probabilities such as 1/3, built up over three slides).

Slide 6: Markov Chain
- Properties of interest:
  - Target set T: the probability to reach T (reachability).
  - Target set B: the probability to visit B infinitely often (liveness).

Slide 7: Average Frequency
- Let the states be {1, 2, …, n} and let the Markov chain be strongly connected.
- The frequencies are given by the solution of the following linear equations: for every state i we write
  x_i = Σ_{1≤j≤n} x_j · δ(j)(i),
  together with the normalization Σ_{1≤i≤n} x_i = 1 (needed to pin down a unique solution).
- The solution x_i of this system gives the frequency of state i.
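The equations above can be solved directly. A sketch, assuming a made-up 2-state strongly connected chain and a small hand-rolled Gaussian-elimination helper (any linear solver would do):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Example chain (illustrative assumption): delta[j][i] = delta(j)(i).
delta = [[0.5, 0.5],
         [0.25, 0.75]]
n = len(delta)

# Equations x_i - sum_j delta(j)(i) * x_j = 0, with the last equation
# replaced by the normalization sum_i x_i = 1.
A = [[(1.0 if i == j else 0.0) - delta[j][i] for j in range(n)] for i in range(n)]
A[-1] = [1.0] * n
b = [0.0] * (n - 1) + [1.0]

x = solve(A, b)
print(x)  # the frequencies x_1, ..., x_n
```

For this chain the frequencies come out as 1/3 and 2/3, which one can verify by hand from 0.5·x_1 = 0.25·x_2 and x_1 + x_2 = 1.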

Slide 8: Questions
- Qualitative question: compute the set of states where the property holds with probability 1 (qualitative analysis).
- Quantitative question: what is the precise probability that the property holds? (quantitative analysis).

Slide 9: Qualitative Analysis of Markov Chains
- Consider the graph of the Markov chain.
- Closed recurrent set: a bottom strongly connected component (SCC).
  - Closed: no probabilistic transition leaves the set.
  - Strongly connected.

Slide 10: Qualitative Analysis of Markov Chains
- Theorem: The union of the closed recurrent sets is reached with probability 1.
- Proof:
  - Consider the DAG of the SCC decomposition of the graph.
  - Consider an SCC C of the graph that is not bottom.
  - Let α be the minimum positive transition probability.
  - C is left within n steps with probability at least β = α^n.
  - Staying in C for at least k·n steps has probability at most (1-β)^k.
  - As k goes to infinity, this goes to 0.
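The graph step behind this theorem is finding the bottom SCCs (the closed recurrent sets). A sketch using Kosaraju's two-pass algorithm; the 4-state graph at the end is a made-up example in which {2, 3} is the only closed recurrent set:

```python
def sccs(graph):
    """Strongly connected components via Kosaraju's two-pass algorithm."""
    order, seen = [], set()

    def dfs1(u):  # first pass: record finish order
        seen.add(u)
        for v in graph[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)

    for u in graph:
        if u not in seen:
            dfs1(u)

    rev = {u: [] for u in graph}  # reversed graph for the second pass
    for u in graph:
        for v in graph[u]:
            rev[v].append(u)

    comps, seen = [], set()
    for u in reversed(order):
        if u not in seen:
            comp, stack = set(), [u]
            seen.add(u)
            while stack:
                x = stack.pop()
                comp.add(x)
                for y in rev[x]:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
            comps.append(comp)
    return comps

def bottom_sccs(graph):
    """An SCC is bottom iff no edge leaves it."""
    return [c for c in sccs(graph)
            if all(v in c for u in c for v in graph[u])]

graph = {0: [0, 1], 1: [2], 2: [3], 3: [2]}
print(bottom_sccs(graph))
```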

Slide 11: Qualitative Analysis of Markov Chains
- Theorem: The union of the closed recurrent sets is reached with probability 1.
- Proof (restated):
  - Some path leaves the non-bottom SCC with probability at least β.
  - The probability that this path is never executed in k attempts is (1-β)^k. Now let k go to infinity.

Slide 12: Qualitative Analysis of Markov Chains
- Theorem: Given a closed recurrent set C and any starting state in C, every state of C is reached with probability 1, and hence every state is visited infinitely often with probability 1.
- Proof: a very similar argument to the one before.

Slide 13: Quantitative Reachability Analysis
- Let C denote the union of the bottom SCCs (there the values are 0 or 1). We define a set of linear equalities, with a variable x_s for every state s:
  - x_s = 0 if s is in C, in a bad bottom SCC.
  - x_s = 1 if s is in C, in a good bottom SCC.
  - x_s = Σ_{t∈S} δ(s)(t) · x_t otherwise.
- Brief proof idea: the remaining Markov chain is transient; by matrix algebra, det(I - δ') ≠ 0 for δ restricted to the transient states, so the system has a unique solution.
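The same equations can be sketched in code. Instead of Gaussian elimination, this sketch solves them by fixed-point iteration, which converges because the remaining chain is transient; the 3-state chain is a hypothetical example with one good and one bad bottom SCC.

```python
# x_s = 0 in bad bottom SCCs, x_s = 1 in good ones, and
# x_s = sum_t delta(s)(t) * x_t for the transient states.
delta = {
    "s": {"good": 0.5, "bad": 0.25, "s": 0.25},
    "good": {"good": 1.0},   # good bottom SCC (contains the target)
    "bad": {"bad": 1.0},     # bad bottom SCC
}
good, bad = {"good"}, {"bad"}

x = {s: (1.0 if s in good else 0.0) for s in delta}
for _ in range(200):  # plenty of iterations for this tiny example
    for s in delta:
        if s not in good and s not in bad:
            x[s] = sum(p * x[t] for t, p in delta[s].items())

print(x["s"])
```

Here the exact solution of x_s = 0.5 + 0.25·x_s is 2/3, and the iteration converges to it.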

Slide 14: Qualitative and Quantitative Analysis
- The previous two results are the basis. Example: the liveness objective.
  - Compute the maximal SCC decomposition.
  - The bottom SCCs are reached with probability 1.
  - A bottom SCC containing a target state is a good bottom SCC; otherwise it is a bad bottom SCC.
  - Qualitative: if there is a path to a bad bottom SCC, the objective does not hold with probability 1; otherwise it holds with probability 1.
  - Quantitative: the reachability probability to the good bottom SCCs.

Slide 15: Markov Chain: Example (the colors refer to states of a figure not reproduced in the transcript)
- Reachability, with the blue state as start:
  - Red: probability is less than 1.
  - Blue: probability is 1.
  - Green: probability is 1.
- Liveness (visit infinitely often):
  - Red: probability is 0.
  - Blue: probability is 0.
  - Green: probability is 1.

Slide 16: Markov Chain: Example (figure: states with priorities 0, 1, 2)
- Parity: blue is visited infinitely often, or 1 is visited only finitely often.
- In general, if the priorities are 0, 1, …, 2d, then we require that for some 0 ≤ i ≤ d, priority 2i is visited infinitely often and all priorities less than 2i are visited finitely often.

Slide 17: Objectives
- Objectives are subsets of infinite paths, i.e., φ ⊆ S^ω.
- Reachability: the set of paths that visit the target T at least once.
- Liveness (Büchi): the set of paths that visit the target B infinitely often.
- Parity: given a priority function p: S → {0, 1, …, d}, the objective is the set of infinite paths on which the minimum priority visited infinitely often is even.
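For Markov chains the parity condition reduces to a one-line check: a run that enters a bottom SCC visits all of its states infinitely often (slide 12), so a bottom SCC satisfies the parity objective iff its minimum priority is even. A sketch with made-up priorities:

```python
def good_for_parity(scc, priority):
    """A bottom SCC satisfies parity iff its minimum priority is even."""
    return min(priority[s] for s in scc) % 2 == 0

priority = {"a": 1, "b": 2, "c": 0}
print(good_for_parity({"a", "b"}, priority))  # minimum priority 1: odd
print(good_for_parity({"c"}, priority))       # minimum priority 0: even
```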

Slide 18: Markov Chain Summary
- Qualitative analysis: linear time (reachability, liveness, parity).
- Quantitative analysis: linear equalities, solved by Gaussian elimination (reachability, liveness, parity).

Slide 19: MARKOV DECISION PROCESSES

Slide 20: Markov Decision Processes
- Markov decision processes (MDPs) combine:
  - non-determinism, and
  - probability.
- They generalize non-deterministic systems and Markov chains.
- An MDP G = ((S, E), (S_1, S_P), δ):
  - δ: S_P → D(S).
  - For s ∈ S_P, the edge (s, t) ∈ E iff δ(s)(t) > 0.
  - E(s) denotes the out-going edges from s; we assume E(s) is non-empty for all s.

Slides 21-25: MDP: Example (figures only: an example MDP, with transition probabilities such as 1/3 and 1/2, built up over five slides; slide 25 labels a player choice with probabilities y and 1-y).

Slide 26: MDP
- Model.
- Objectives.
- How non-determinism is resolved: the notion of strategies. At each stage it can be resolved differently and also probabilistically.

Slide 27: Strategies
- A strategy is a recipe for how to move the token, i.e., how to extend plays. Formally, given a history of the play (a finite sequence of states), it chooses a probability distribution over the out-going edges.
- σ: S*·S_1 → D(S).

Slide 28: MDP: Strategy Example (figure only)
- On holding the token for the k-th time: choose left with probability 1/k and right with probability 1 - 1/k.

Slide 29: Strategies
- A strategy is a recipe for how to extend plays: given a history of the play (a finite sequence of states), it chooses a probability distribution over the out-going edges.
  - History-dependent and randomized: σ: S*·S_1 → D(S).
- History-independent: depends only on the current state (memoryless or positional).
  - σ: S_1 → D(S).
- Deterministic: no randomization (pure strategies).
  - σ: S*·S_1 → S.
- Deterministic and memoryless: no memory and no randomization (pure memoryless; the simplest class).
  - σ: S_1 → S.

Slide 30: Example: Cheating Lovers
- Objective: visit green and red infinitely often.
- Pure memoryless strategies are not good enough.
- A strategy with memory: alternate between the two.
- A randomized memoryless strategy: choose between the two with uniform probability.
- Certainty vs. probability 1.

Slide 31: Values in MDPs
- The value at a state s for an objective φ:
  Val(φ)(s) = sup_σ Pr_s^σ(φ).
- Qualitative analysis: compute the set of almost-sure (probability 1) winning states, i.e., the set of states with value 1.
- Quantitative analysis: compute the value of every state.

Slide 32: Qualitative and Quantitative Analysis
- Qualitative analysis: liveness (Büchi), with reachability as a special case.
- Reduction of quantitative analysis to quantitative reachability.
- Quantitative reachability.

Slide 33: Qualitative Analysis for Liveness
- Given an MDP G with a target set B:
- Compute the set of states from which there is a strategy ensuring that B is visited infinitely often with probability 1.
- We will show that pure memoryless strategies suffice.
- The generalization to parity is left as an exercise.

Slide 34: Attractor
- Random attractor of a set U of states:
  - U_0 = U.
  - U_{i+1} = U_i ∪ { s ∈ S_1 | E(s) ⊆ U_i } ∪ { s ∈ S_P | E(s) ∩ U_i ≠ ∅ }.
- From U_{i+1}, no matter what the choice is, U_i is reached with positive probability. By induction, U is reached with positive probability.

Slide 35: Attractor
- Attr_P(U) = ∪_{i≥0} U_i.
- Attractor lemma: from Attr_P(U), no matter what the strategy of the player is (history-dependent, randomized), the set U is reached with positive probability.
- Attr_P(U) can be computed in O(m) time, where m is the number of edges.
- Thus, if U is not in the almost-sure winning set, then Attr_P(U) is also not in the almost-sure winning set.
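The random attractor can be sketched as a fixed-point loop over the rule from slide 34. This naive version is O(n·m) rather than the O(m) of the slides (the linear-time version keeps a counter of remaining successors per player state); the small MDP at the end is a made-up example.

```python
def random_attractor(U, S1, SP, E):
    """Attr_P(U): iterate U_{i+1} = U_i ∪ {s ∈ S_1 | E(s) ⊆ U_i}
    ∪ {s ∈ S_P | E(s) ∩ U_i ≠ ∅} until nothing is added."""
    attr = set(U)
    changed = True
    while changed:
        changed = False
        for s in S1 | SP:
            if s in attr:
                continue
            succ = E[s]
            if (s in S1 and set(succ) <= attr) or \
               (s in SP and any(t in attr for t in succ)):
                attr.add(s)
                changed = True
    return attr

# Example: player states p, q; probabilistic state r; sink u.
S1 = {"p", "q"}
SP = {"r"}
E = {"p": ["q", "r"], "q": ["u"], "r": ["p", "u"], "u": ["u"]}
print(random_attractor({"u"}, S1, SP, E))
```

In the example, q joins the attractor because all of its edges lead into it, r joins because one of its probabilistic edges does, and then p joins as well.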

Slide 36: Iterative Algorithm
- Compute graph reachability to B: the set of states from which there exists a path to B in the graph (S, E) of the MDP. Call this set A.

Slide 37: Iterative Algorithm
- Let U = S \ A. Then there is not even a path from U to B, so clearly U is not in the almost-sure winning set.
- By the attractor lemma we can take Attr_P(U) out and iterate.

Slide 38: Iterative Algorithm
- Attr_P(U) may or may not intersect B.

Slide 39: Iterative Algorithm
- Iterate on the remaining sub-graph.
- In every iteration, what is removed is not part of the almost-sure winning set.
- What happens when the iteration stops?

Slide 40: Iterative Algorithm
- The iteration stops. Let Z be the set of states removed over all iterations.
- Two key properties.

Slide 41: Iterative Algorithm
- The iteration stops. Let Z be the set of states removed over all iterations.
- Two key properties:
  - No probabilistic edge from outside into Z.

Slide 42: Iterative Algorithm
- The iteration stops. Let Z be the set of states removed over all iterations.
- Two key properties:
  - No probabilistic edge from outside into Z.
  - From everywhere in A (the remaining graph) there is a path to B.

Slide 43: Iterative Algorithm
- Two key properties:
  - No probabilistic edge from outside into Z.
  - From everywhere in A (the remaining graph) there is a path to B.
- Fix a memoryless strategy as follows:
  - In A \ B: shorten the distance to B (consider the BFS tree and choose an edge that decreases the BFS distance).
  - In B: stay in A.

Slide 44: Iterative Algorithm
- Fix a memoryless strategy as follows:
  - In A \ B: shorten the distance to B (choose an edge that decreases the BFS distance).
  - In B: stay in A.
- Argue that all bottom SCCs of the resulting Markov chain intersect B. Then we are done by the Markov chain theorem.

Slide 45: Iterative Algorithm
- Argue that all bottom SCCs intersect B; then we are done by the Markov chain theorem.
- Towards a contradiction, suppose some bottom SCC does not intersect B.
- Consider the state of that SCC with minimum BFS distance to B.

Slide 46: Iterative Algorithm
- Argue that all bottom SCCs intersect B; then we are done by the Markov chain theorem.
- Towards a contradiction, suppose some bottom SCC does not intersect B.
- Consider the state of that SCC with minimum BFS distance to B.
  - Case 1: if it is a state in S_P, all its edges must remain in the SCC, in particular the one leading to a state with shorter distance.
  - Case 2: if it is a state in S_1, the chosen successor has shorter distance and is in the SCC.
- In both cases we have a contradiction to the minimality of the distance.
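The whole iterative algorithm of slides 36-46 can be sketched end to end: repeatedly compute the states A with a path to B, remove the random attractor of U = S \ A, and stop when U is empty. The 5-state MDP at the end is a made-up example; the state names and helper structure are illustrative assumptions.

```python
def almost_sure_buchi(S1, SP, E, B):
    """Almost-sure winning set for visiting B infinitely often in an MDP
    with player states S1, probabilistic states SP, and edge map E."""
    S = set(S1) | set(SP)
    while True:
        # A: states with a path to B inside the current sub-MDP (backward BFS).
        A = {t for t in B if t in S}
        frontier = list(A)
        while frontier:
            t = frontier.pop()
            for s in S:
                if s not in A and t in E[s]:
                    A.add(s)
                    frontier.append(s)
        U = S - A
        if not U:
            return S  # iteration stops: the remaining states win almost surely
        # Remove the random attractor of U and iterate.
        attr = set(U)
        changed = True
        while changed:
            changed = False
            for s in S - attr:
                succ = [t for t in E[s] if t in S]
                if (s in S1 and all(t in attr for t in succ)) or \
                   (s in SP and any(t in attr for t in succ)):
                    attr.add(s)
                    changed = True
        S -= attr

# Example: t is a trap; probabilistic state q can fall into t,
# so q and the player state w (which only reaches q) are removed.
S1 = {"s0", "b", "t", "w"}
SP = {"q"}
E = {"s0": ["b", "q"], "b": ["s0"], "t": ["t"], "q": ["s0", "t"], "w": ["q"]}
print(almost_sure_buchi(S1, SP, E, {"b"}))
```

In the example, only s0 and b survive: from s0 the player can always move to b, visiting B = {b} infinitely often with probability 1.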