Published by Angela Blankenship. Modified over 9 years ago.
1
PRISM
A Probabilistic Model Checker, Birmingham
Supports 3 models:
1. Discrete-time Markov chain (DTMC)
2. Markov decision process (MDP)
3. Continuous-time Markov chain (CTMC)
2
Supports 2 languages:
1. PCTL (Probabilistic Computation Tree Logic)
2. CSL (Continuous Stochastic Logic)
3
[Figure: structure of the PRISM tool. Labels: System description (DTMC, MDP, CTMC); Properties file (PCTL formulas, CSL formulas); PCTL Model Checker; CSL Model Checker; Results.]
4
Our Model: Web server Model
[Figure: web server model. Labels: Detect_and_restart, application, hardware, application.]
5
System Parameters
Application failure rate: 0.001
Hardware failure rate: 0.0001
Time to detect app. failure: 1/12
Time to detect hardware failure: 1/30
Time to restart app.: 1/20
Time to restart hardware: 1/12
Probability of unsuccessful restart: 2%
Time unit: hour
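The "time to ..." entries can be read as mean durations of exponentially distributed delays, in which case each rate is the reciprocal of its mean. That reading is an assumption, although it is consistent with the rates 12.0, 30.0, 20.0 and 12.0 that appear in the detect-and-restart module. A minimal sketch of the conversion, with hypothetical dictionary keys:

```python
from fractions import Fraction

# Mean durations in hours, copied from the parameter list.
# The key names are illustrative and do not come from the model.
MEAN_HOURS = {
    "detect_app_failure": Fraction(1, 12),
    "detect_hw_failure": Fraction(1, 30),
    "restart_app": Fraction(1, 20),
    "restart_hw": Fraction(1, 12),
}


def rate_per_hour(mean_hours):
    """Rate of an exponential delay with the given mean duration."""
    return 1 / mean_hours


RATES = {name: rate_per_hour(mean) for name, mean in MEAN_HOURS.items()}

for name, rate in RATES.items():
    minutes = float(60 / rate)
    print(f"{name}: {float(rate):.1f} per hour (mean {minutes:.1f} minutes)")
```

Exact fractions are used so that 1/12 hour maps to a rate of exactly 12 per hour without rounding.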
6
Some modeling code
Software and hardware crash with some rate:
[] (s1 = 1) -> gamma_p : (s1' = 0);
[] (h1 = 1) -> gamma_m : (h1' = 0)&(s1' = 0);
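In a CTMC, a command with rate r fires after an exponentially distributed delay, so the chance that it has fired within t hours is 1 - exp(-r t) and the mean delay is 1/r. The sketch below illustrates this for the two crash commands. It assumes that gamma_p is the application failure rate (0.001) and gamma_m the hardware failure rate (0.0001); the slides do not state this mapping explicitly.

```python
import math

# Assumed mapping, not stated on the slides:
# gamma_p = application failure rate, gamma_m = hardware failure rate (per hour).
GAMMA_P = 0.001
GAMMA_M = 0.0001


def prob_fired_within(rate, hours):
    """P(an exponential delay with this rate has elapsed within `hours`)."""
    return 1.0 - math.exp(-rate * hours)


def mean_time_to_fire(rate):
    """Mean of an exponential delay with this rate, in hours."""
    return 1.0 / rate


for label, rate in (("application", GAMMA_P), ("hardware", GAMMA_M)):
    print(
        f"{label}: mean time to crash {mean_time_to_fire(rate):.0f} h, "
        f"P(crash within 1000 h) = {prob_fired_within(rate, 1000.0):.3f}"
    )
```

This looks at a single crash command in isolation and ignores detection and restart.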
7
Continued...
Primary Web server module
module primary
  …
  [inspect_p_soft] (sr1 = 0)&(s1 = 0)&(h1 = 1)
                   &!((s2 = 1)&(h2 = 1))&(sr2 = 0)
                   -> 1: (sr1' = 1);
  [restart_p_soft] (sr1 = 1)&(s1 = 0)
                   -> 0.98:(s1' = 1)&(sr1' = 0)
                    + 0.02:(sr1' = 0);
  [inspect_p_hard] (hr1 = 0)&(h1 = 0)
                   -> 1:(hr1' = 1);
  [restart_p_hard] (hr1 = 1)&(h1 = 0)
                   -> 1:(h1' = 1)&(hr1' = 0);
  …
endmodule
8
Continued...
Detect and restart module
f:[0..1] init 0;
[inspect_p_soft] (f=0) -> 12.0:(f'=1);
[inspect_p_hard] (f=0) -> 30.0:(f'=1);
[inspect_s_soft] (f=0) -> 12.0:(f'=1);
[inspect_s_hard] (f=0) -> 30.0:(f'=1);
[restart_p_soft] (f=1) -> 20.0:(f'=0);
[restart_p_hard] (f=1) -> 12.0:(f'=0);
[restart_s_soft] (f=1) -> 20.0:(f'=0);
[restart_s_hard] (f=1) -> 12.0:(f'=0);
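When commands in different modules share an action label, PRISM's CTMC semantics take the rate of the joint transition to be the product of the individual rates. The values 1, 0.98 and 0.02 in the primary module therefore act as weights on the rates given here. The sketch below works out the resulting rates for the primary server's four actions. The outcome names in the keys are illustrative, and the secondary server is left out because its module is not shown.

```python
# Factors attached to each labelled command of the primary module.
# Keys are (action, outcome); the outcome names are illustrative only.
PRIMARY_FACTORS = {
    ("inspect_p_soft", "start_restart"): 1.0,
    ("restart_p_soft", "success"): 0.98,
    ("restart_p_soft", "failure"): 0.02,
    ("inspect_p_hard", "start_restart"): 1.0,
    ("restart_p_hard", "success"): 1.0,
}

# Rates (per hour) of the matching commands in the detect-and-restart module.
DETECT_RESTART_RATES = {
    "inspect_p_soft": 12.0,
    "inspect_p_hard": 30.0,
    "restart_p_soft": 20.0,
    "restart_p_hard": 12.0,
}


def synchronised_rates(factors, rates):
    """Multiply the rates of commands that share an action label."""
    return {key: factor * rates[key[0]] for key, factor in factors.items()}


COMBINED = synchronised_rates(PRIMARY_FACTORS, DETECT_RESTART_RATES)

for (action, outcome), rate in COMBINED.items():
    print(f"{action:15s} {outcome:14s} {rate:.2f} per hour")
```

The two restart_p_soft outcomes add up to 20 per hour, so the restart still takes 1/20 hour on average and succeeds with probability 0.98.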
9
Properties verified to be valid:
// (1). Availability: for at least 99% of the time, the
// system is operational in the long run.
S>=0.99[(s1 = 1 & h1 = 1)|(s2 = 1 & h2 = 1)]
// (2). Instantaneous availability: the probability that
// the system is operational at a given point in time
// (100 hours) is at least 0.99.
P>0.99[true U[100,100] (s1 = 1 & h1 = 1)|(s2 = 1 & h2 = 1)]
// (3). The system will eventually be operational.
P>=1[true U (s1 = 1 & h1 = 1)|(s2 = 1 & h2 = 1)]
// (4). The probability that the system will go down within
// 1000 hours is less than 0.01.
P<0.01[true U<=1000 (s1 = 0)&(s2 = 0)]
// (5). If the primary software is down, it will eventually
// recover.
s1 = 0 -> P>=1[true U s1 = 1 & h1 = 1]
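To give a feel for what the S operator in property (1) measures, the sketch below computes the long-run fraction of time spent in each state of a much smaller chain. It covers a single server with software failures only, so it is a simplification and not the model that PRISM checked: hardware failures and the secondary server are ignored. The three state names are hypothetical, and the failure rate 0.001 is assumed to be the value of gamma_p.

```python
def steady_state(fail, detect, restart_ok, restart_retry):
    """Long-run fractions of time (up, down, restarting).

    States: up -> down (software crashed, not yet detected) -> restarting,
    then back to up on success or back to down on an unsuccessful restart.
    Solved directly from the balance equations of this 3-state chain.
    """
    up = 1.0  # unnormalised weight of the "up" state
    # Flow into "up" (successful restarts) equals flow out of "up" (crashes).
    restarting = up * fail / restart_ok
    # Flow out of "down" equals flow in from "up" and from failed restarts.
    down = (up * fail + restarting * restart_retry) / detect
    total = up + down + restarting
    return up / total, down / total, restarting / total


UP, DOWN, RESTARTING = steady_state(
    fail=0.001,                 # assumed gamma_p, per hour
    detect=12.0,                # inspect_p_soft rate
    restart_ok=0.98 * 20.0,     # restart_p_soft, successful outcome
    restart_retry=0.02 * 20.0,  # restart_p_soft, unsuccessful outcome
)

print(f"long-run availability of the simplified chain: {UP:.6f}")
```

Even this single-server chain stays above the 0.99 bound, which is consistent with property (1) holding once a second server is added, although it does not prove it.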
10
Continued...
// (6). The probability that the whole system is not
// working at time 0.1, 1, 10, 100, 1000 and 10000 time units
// is less than 0.001.
P<0.001[true U[0.1,0.1] s1 = 0 & s2 = 0]
P<0.001[true U[1,1] s1 = 0 & s2 = 0]
P<0.001[true U[10,10] s1 = 0 & s2 = 0]
P<0.001[true U[100,100] s1 = 0 & s2 = 0]
P<0.001[true U[1000,1000] s1 = 0 & s2 = 0]
P<0.001[true U[10000,10000] s1 = 0 & s2 = 0]
// (7). There's at least a 99% chance that the system
// will stabilize such that it is up more than 90% of the
// time in the long run.
P>0.99[true U S>0.90[(s1 = 1 & h1 = 1)|(s2 = 1 & h2 = 1)]]
// (8). When the software of one machine is down, the
// probability that the system will be available within 0.6
// time units is at least 0.99. This is an extension of
// property (5).
s1 = 0 & h1 = 1 -> P>0.99[(s1 = 0 & h1 = 1) U<=0.6
    (s1 = 1 & h1 = 1)|(s2 = 1 & h2 = 1)]
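Properties of the form U[t,t], as in (6), ask for the state distribution at one instant t. One standard way to compute such a distribution for a CTMC is uniformisation, sketched below in pure Python. The chain is a simplified single-server version of the model (software failures only, no secondary server, assumed gamma_p = 0.001), so the printed numbers illustrate the technique and are not the results PRISM reported. The function name and state numbering are hypothetical.

```python
import math

# Simplified single-server chain: 0 = up, 1 = down (undetected), 2 = restarting.
FAIL = 0.001  # assumed value of gamma_p, per hour
Q = [
    [-FAIL, FAIL, 0.0],
    [0.0, -12.0, 12.0],
    [0.98 * 20.0, 0.02 * 20.0, -20.0],
]


def transient_distribution(generator, start, t):
    """State distribution at time t > 0, computed by uniformisation."""
    n = len(generator)
    # Uniformisation rate: slightly above the largest exit rate.
    q = 1.02 * max(-generator[i][i] for i in range(n))
    # Jump matrix of the uniformised discrete-time chain: P = I + Q / q.
    step = [
        [(1.0 if i == j else 0.0) + generator[i][j] / q for j in range(n)]
        for i in range(n)
    ]
    qt = q * t
    log_qt = math.log(qt)
    # Truncate the Poisson sum about ten standard deviations above its mean.
    k_max = int(qt + 10.0 * math.sqrt(qt) + 20.0)
    vec = list(start)
    result = [0.0] * n
    for k in range(k_max + 1):
        # Poisson(qt) probability of k jumps, in log space to avoid underflow.
        weight = math.exp(k * log_qt - qt - math.lgamma(k + 1))
        result = [r + weight * v for r, v in zip(result, vec)]
        vec = [sum(vec[i] * step[i][j] for i in range(n)) for j in range(n)]
    return result


for t in (0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0):
    dist = transient_distribution(Q, [1.0, 0.0, 0.0], t)
    print(f"t = {t:>7}: P(software not up) = {dist[1] + dist[2]:.3e}")
```

The number of terms grows with the largest rate times t, which is why the later time points take noticeably longer than the early ones.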
11
Ongoing research
1. Problems:
   scalability and state-space explosion
2. Solutions:
   (1). parallel/distributed implementation of the model checking algorithm
   (2). exploiting high-level properties of the system, such as symmetry and compositionality
From Dave Parker@cs.bham.ac.uk