Eager Markov Chains
Parosh Aziz Abdulla, Noomene Ben Henda, Richard Mayr, Sven Sandberg



Informationsteknologi | Institutionen för informationsteknologi

Outline
- Introduction
- Expectation Problem
- Algorithm Scheme
- Termination Conditions
- Subclasses of Markov Chains
  - Examples
- Conclusion

Introduction
- Model: infinite-state Markov chains
  - used to model programs with unreliable channels, randomized algorithms, ...
- Interest: conditional expectations
  - expected execution time of a program
  - expected resource usage of a program

Introduction
- An infinite-state Markov chain consists of:
  - an infinite set of states,
  - a target set,
  - probability distributions over successor states.
(Example figure omitted.)

Introduction
- Reward function: defined over the paths reaching the target set.
(Example figure omitted.)

Expectation Problem
- Instance:
  - a Markov chain,
  - a reward function.
- Task: compute or approximate the conditional expectation of the reward function.
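In symbols (my notation, not from the slides): writing f for the reward function, F for the target set, and Pi_F for the set of paths that reach F, the quantity to compute is

```latex
\mathbb{E}(f \mid \Diamond F)
  \;=\;
  \frac{\sum_{\pi \in \Pi_F} \mathbb{P}(\pi)\, f(\pi)}{\mathbb{P}(\Diamond F)} ,
```

i.e. the weighted sum over the target-hitting paths divided by the reachability probability, as in the worked example on the next slide.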

Expectation Problem
Example:
- the weighted sum: 0.8 * 4 + 0.1 * (-5) = 2.7
- the reachability probability: 0.8 + 0.1 = 0.9
- the conditional expectation: 2.7 / 0.9 = 3

Expectation Problem
- Remark: the problem has in general been studied for finite-state Markov chains.
- Contribution:
  - an algorithm scheme that computes it for infinite-state Markov chains,
  - sufficient conditions for termination.

Algorithm Scheme (path exploration)
At each iteration n:
- compute the paths up to depth n,
- consider only those ending in the target set,
- update the expectation accordingly.
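The iteration above can be sketched in Python. The chain below is a hypothetical example of my own (state names, probabilities and rewards are assumptions chosen to match the numbers on the expectation-problem slide, not part of the original model):

```python
def approximate_expectation(initial, transitions, target, reward, depth):
    """Explore all paths up to `depth` steps, keep only those that end in
    the target set, and accumulate the weighted sum and the reachability
    probability, as in the path-exploration scheme."""
    weighted_sum = 0.0   # sum over target-hitting paths of P(path) * reward(path)
    reach_prob = 0.0     # probability mass of paths that have hit the target
    # Frontier: (state, probability of the path so far, the path itself),
    # restricted to paths that have not yet entered the target set.
    frontier = [(initial, 1.0, (initial,))]
    for _ in range(depth):
        next_frontier = []
        for state, prob, path in frontier:
            for succ, p in transitions.get(state, []):
                extended = path + (succ,)
                if succ in target:
                    weighted_sum += prob * p * reward(extended)
                    reach_prob += prob * p
                else:
                    next_frontier.append((succ, prob * p, extended))
        frontier = next_frontier
    cond_exp = weighted_sum / reach_prob if reach_prob > 0 else None
    return weighted_sum, reach_prob, cond_exp

# Hypothetical chain: from s0, reach a "good" target (reward 4) w.p. 0.8,
# a "bad" target (reward -5) w.p. 0.1, or get stuck forever w.p. 0.1.
transitions = {"s0": [("t_good", 0.8), ("t_bad", 0.1), ("stuck", 0.1)]}
target = {"t_good", "t_bad"}
reward = lambda path: 4 if path[-1] == "t_good" else -5

ws, p, ce = approximate_expectation("s0", transitions, target, reward, depth=5)
# ws ~ 2.7, p ~ 0.9, ce ~ 3.0 (weighted sum, reach. prob., cond. expectation)
```

Each call computes one iteration of the scheme; increasing `depth` refines the lower bounds, which is exactly why termination needs the extra conditions discussed next.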

Algorithm Scheme
- Correctness: the algorithm computes/approximates the correct value.
- Termination: not guaranteed in general; the iterations yield lower bounds but no upper bounds.

Termination Conditions
Exponentially bounded reward function:
- intuition: a limit on the growth of the reward function;
- remark: the limit is reasonable; for example, all polynomial functions are exponentially bounded.

Termination Conditions
(Figure: the absolute value of the reward stays below an exponential bound.)

Termination Conditions
Eager Markov chain:
- intuition: long paths contribute little to the expectation value;
- remark: the condition is reasonable; for example, PLCS, PVASS and NTM all induce eager Markov chains.

Termination Conditions
(Figure: the probability of reaching the target in more than n steps stays below an exponentially decreasing bound.)
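Making the two conditions precise (the constants and notation are my own sketch of the definitions): the reward f is exponentially bounded if |f(pi)| <= c^|pi| for some c and all sufficiently long paths pi, and the chain is eager if the probability of first reaching the target after more than n steps is at most b * d^n for some b and some d < 1. Assuming additionally c * d < 1, the contribution of the paths longer than n is dominated by a geometric series, which supplies the missing upper bounds:

```latex
\Bigl|\sum_{m > n}\ \sum_{\substack{\pi \text{ first hits } F \\ \text{at step } m}}
      \mathbb{P}(\pi)\, f(\pi)\Bigr|
  \;\le\; \sum_{m > n} c^{\,m}\, b\, d^{\,m-1}
  \;=\; \frac{b}{d} \sum_{m > n} (c\,d)^{m}
  \;\xrightarrow[n \to \infty]{}\; 0 .
```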

Termination Conditions
(Figure: as the exploration depth grows, the reachability probability (Pf) and the weighted sum (Ws) converge, and with them the conditional expectation (Ce).)

Subclasses of Markov Chains
(Diagram:)
- Eager Markov chains
  - Markov chains with a finite eager attractor (induced by PLCS)
  - Markov chains with the bounded coarseness property (induced by PVASS and NTM)

Finite Eager Attractor
- Attractor: almost surely reached from every state.
- Finite eager attractor:
  - almost surely reached,
  - unlikely to stay "too long" outside of it.

Finite Eager Attractor
(Figure: the probability of returning to the attractor in more than n steps stays below an exponentially decreasing bound b.)
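In symbols (notation assumed): a finite attractor A is eager if there are beta and delta < 1 such that, from every state s of A, the probability of returning to A in more than n steps is exponentially small:

```latex
\mathbb{P}_s(\text{return time to } A > n) \;\le\; \beta\,\delta^{\,n}
  \qquad \text{for all } s \in A,\ \delta < 1 .
```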

Finite Eager Attractor
Does a finite eager attractor imply an eager Markov chain?
- Reminder: in an eager Markov chain, the probability of reaching the target in more than n steps decreases exponentially in n.

Finite Eager Attractor
(Figure: the paths of length n classified by the number t of visits to the attractor.)

Finite Eager Attractor
Proof idea: identify two sets of paths among those not reaching the target set:
- paths that visit the attractor often,
- paths that visit the attractor rarely.

Finite Eager Attractor
Paths visiting the attractor rarely: t less than n/c.
(Figure: Pr_n denotes the probability of these paths.)

Finite Eager Attractor
Paths visiting the attractor often: t greater than n/c.
(Figure: Po_n denotes the probability of these paths.)

Probabilistic Lossy Channel Systems (PLCS)
Motivation:
- finite-state processes communicating through unbounded and unreliable channels;
- widely used to model systems with unreliable channels (e.g. link protocols).

PLCS
(Figure: a control automaton with locations q0..q3 and operations nop, c!a, c!b, c?b, together with a FIFO channel c containing "aba". A send step c!a appends message a to the channel; a receive step c?b consumes message b from the head of the channel.)

PLCS
(Figure: a loss step; any message currently in channel c may be lost spontaneously.)

PLCS
Configuration:
- a control location,
- the content of the channel.
Example: [q3, "aba"]

PLCS
A PLCS induces a Markov chain:
- States: configurations.
- Transitions: loss steps combined with discrete steps.

PLCS
Example: [q1, "abb"] goes to [q2, "a"] by losing one of the two messages "b" and firing the marked step.
Probability: P = P_loss * 2/3.

PLCS
Result: each PLCS induces a Markov chain with a finite eager attractor.
- Proof hint: when the channel content is large enough, it is more likely (with probability greater than 1/2) that a message is lost.

Bounded Coarseness
The probability of reaching the target within K steps is bounded from below by a constant b, uniformly over the states.

Bounded Coarseness
Does bounded coarseness imply an eager Markov chain?
- Reminder: in an eager Markov chain, the probability of reaching the target in more than n steps decreases exponentially in n.

Bounded Coarseness
(Figure: the run is split into consecutive blocks of K steps; P_n denotes the probability of avoiding the target for nK steps.)
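The picture can be turned into a short calculation. If from every state the target is reached within K steps with probability at least b, then avoiding the target for nK steps means failing in each of n consecutive blocks of K steps, so

```latex
P_n \;\le\; (1-b)^{\,n},
\qquad\text{hence}\qquad
\mathbb{P}(\text{first hit of the target} > m) \;\le\; (1-b)^{\lfloor m/K \rfloor},
```

an exponentially decreasing bound in m: boundedly coarse Markov chains are eager.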

Probabilistic Vector Addition Systems with States (PVASS)
Motivation:
- PVASS are generalizations of Petri nets;
- widely used to model parallel processes, mutual exclusion programs, ...

PVASS
Configuration:
- a control location,
- the values of the variables x and y.
Example: [q1, x=2, y=0]
(Figure: a control automaton with locations q0..q3 and operations nop, ++x, ++y, --x, --y.)

PVASS
A PVASS induces a Markov chain:
- States: configurations.
- Transitions: discrete steps.

PVASS
Example: [q1,1,1] goes to [q2,1,0] by taking the marked step.
Probability: P = 2/3.

PVASS
Result: each PVASS induces a Markov chain with the bounded coarseness property.

Noisy Turing Machines (NTM)
Motivation:
- NTM are Turing machines augmented with a noise parameter;
- used to model systems operating in "hostile" environments.

NTM
An NTM is fully described by a Turing machine and a noise parameter.
(Figure: a Turing machine with states q1..q4 and tape content ab#b#aab.)

NTM
Discrete step: the machine fires an ordinary transition.
(Figure: the tape ab#b#aab becomes bb#b#aab.)

NTM
Noise step: a tape cell is corrupted by noise.
(Figure: the tape ab#b#aab becomes #b#b#aab.)

NTM
Result: each NTM induces a Markov chain with the bounded coarseness property.

Conclusion
Summary:
- an algorithm scheme for approximating expectations of reward functions;
- sufficient conditions that guarantee termination:
  - exponentially bounded reward functions,
  - eager Markov chains.

Conclusion
Directions for future work:
- extend the results to Markov decision processes and stochastic games;
- find more concrete applications.

Thank you

PVASS
Order on configurations (<=):
- same control location,
- ordered values of the variables.
Example: [q0,3,4] <= [q0,3,5]

PVASS
- The probability of each step is greater than 1/10.
- Hence the chain is boundedly coarse, with parameters K and 1/10^K.
(Figure: the target set is reachable within K iterations.)