Adviser: Frank, Yeong-Sung Lin
Presented by Wayne Hsiao
Introduction
Network Survivability
Network Survivability under Disaster Propagation
Numerical Result
Conclusion
Telecommunication networks have become one of the critical infrastructures. It is critically important that the network is survivable. Survivability is the ability of the network to deliver the required services in the face of various disastrous events. Disaster propagation is one of the most common characteristics of disastrous events and has a serious impact on communication networks.
Disaster propagation is a dynamic area-based event in which the affected area can evolve spatially and temporally. For example, Hurricane Katrina in 2005 caused outages in approximately 8% of all customarily routed networks in Louisiana. The March 2011 earthquake and tsunami in east Japan, which cascaded from the center to the Tohoku and Tokyo areas, damaged 1.9 million fixed lines and 29 thousand wireless base stations.
Network design and operation need to consider survivability. This requires an understanding of the dynamic network recovery behavior under failure patterns. Many mathematical models have been considered to analyze the impact of disasters on the network and to estimate the benefits of alternative survivable network proposals. However, not much is known so far about network survivability under the propagation of disastrous events.
The present paper develops a network survivability modeling method that takes into consideration the propagation dynamics of disastrous events. The analysis is exemplified for three repair strategies. The results are helpful not only in estimating survivability quantitatively, but also in providing insights for choosing among different repair strategies.
We focus on survivability as the ability of a networked system to continuously deliver services in compliance with the given requirements in the presence of failures and other undesired events. Network survivability, as defined by the ANSI T1A1.2 committee, is quantified as the transient performance from the instant when an undesirable event occurs until a steady state with an acceptable performance level is attained.
The measure of interest M has the value m_0 before a failure occurs.
m_a is the value of M just after the failure occurs.
m_u is the maximum difference between the value of M and m_a after the failure.
m_r is the restored value of M after some time t_r.
t_R is the relaxation time for the system to restore the value of M.
The goal is to develop a survivability model, particularly for networked systems where disastrous events may propagate across geographical areas. A network can be viewed as a directed graph consisting of nodes and directed edges. Nodes represent the network infrastructures. The directed edges denote the directions of transitions. The network is vulnerable to all sorts of disasters, which may start on some network nodes and propagate to other nodes after a random time.
Suppose the number of nodes in the networked system is n. We consider a disastrous event that occurs on these nodes in successive steps. The propagation is assumed to have the "memoryless" property: the probability of the disastrous event spreading from one given node to another depends only on the current system state, not on the history of the system. An affected node can be repaired (or replaced by a new one) after a random period. All disaster propagation times and repair times are exponentially distributed.
The state of each node of the system at time t lies within the set {0, 1}. At the initial time t = 0, a disastrous event affects the first node and the system is in the state (0, 1, ..., 1). The disaster propagates from node i − 1 to node i according to a Poisson process with rate λ_i. A disastrous event can occur on only one node at a time. Each node has its own repair process, which restores the node all at once; the repair time of node i is exponentially distributed with rate μ_i (mean 1/μ_i).
The state of the system at any time t can be completely described by the collection of the states of all nodes, X(t) = (X_1(t), ..., X_n(t)), where X_i(t) = 0 (1 ≤ i ≤ n) if the event has occurred on the i-th node at time t, and X_i(t) = 1 if the event has not occurred on the i-th node at time t.
With the above assumptions, the transient process X(t) can be mathematically modeled as a continuous-time Markov chain (CTMC) with state space Ω = {(X_1, ..., X_n) : X_1, ..., X_n ∈ {0, 1}}. The state space Ω consists of N = 2^n states in total. The process X(t) starts in the state (0, 1, ..., 1) and finishes in the absorbing state (1, 1, ..., 1).
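The state space described above can be enumerated directly. The following is a minimal sketch, not taken from the original slides; the function name `state_space` and the enumeration order are assumptions chosen for illustration.

```python
from itertools import product

def state_space(n):
    """Return all 2**n node-state vectors (0 = affected, 1 = not affected)."""
    return list(product((0, 1), repeat=n))

n = 3
states = state_space(n)
initial = (0,) + (1,) * (n - 1)   # the disaster has hit node 1 only
absorbing = (1,) * n              # every node is operational again

print(len(states), initial in states, absorbing in states)
```

Any fixed ordering of the states works, as long as the same ordering is used for the transition rate matrix and the probability vector.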
Suppose that the system states are ordered so that in states 1, 2, ..., N_f (N_f < N) the system has failure propagation and in states N_f + 1, N_f + 2, ..., N the system is only in the restoration phase. Then the transition rate matrix Q = [q_ij] of the process {X(t), t ≥ 0} can be written in partitioned form, where q_ij denotes the rate of transition from state i to state j.
Let π(t) = {π_i(t), i ∈ Ω} denote the row vector of transient state probabilities of X(t) at time t. With Q, the dynamic behavior of the CTMC can be described by the Kolmogorov differential-difference equation dπ(t)/dt = π(t) Q. The transient state probability vector can then be obtained as π(t) = π(0) e^{Qt}.
NETWORK SURVIVABILITY UNDER DISASTER PROPAGATION (CONT.)
Let Υ_i be the reward rate associated with state i. In our model, the performance is considered as the reward. The network survivability performance is measured by the expected instantaneous reward rate E[M(t)] = Σ_{i ∈ Ω} Υ_i π_i(t).
An infrastructure wireless network example
The state space of the chain is defined as S = {S_0, ..., S_Φ} (Φ = 2^3 − 1). Each state is described by a triple (X_1, X_2, X_3), where X_i ∈ {0, 1} refers to the affected state of cell i, i = 1, 2, 3. The set of possible states is {(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)}.
Two repair strategies:
Scheme 1: each cell has its own repair facility.
Scheme 2: all cells share a single repair facility.
Each cell i has its own repair facility with repair rate μ_i. Fig. 3 shows the 8-state transition diagram of the CTMC model of the network example. The transition matrix is of size 8 × 8 and the initial probability vector is π(0) = (1, 0, 0, 0, 0, 0, 0, 0).
Suppose a disaster occurs and destroys BS1, so that all the users in cell 1 are disconnected from the network. The initial state is (0, 1, 1). The transition to state (0, 0, 1) occurs with rate λ_2 and accounts for the impact of disaster propagation from cell 1 to cell 2. The CTMC may also jump to the original normal state (1, 1, 1) with repair rate μ_1.
From state (0, 0, 1), the CTMC may jump to three possible states:
it may jump back to state (0, 1, 1) if BS2 is repaired (this occurs with rate μ_2);
it may jump to state (1, 0, 1) if BS1 is repaired (this occurs with rate μ_1);
it may jump to state (0, 0, 0) if the disaster propagates to cell 3 (this occurs with rate λ_3).
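The transitions listed above can be generated mechanically. The sketch below is an illustration under a stated assumption, not the authors' Fig. 3: it assumes that cell i fails at rate λ_i whenever cell i − 1 is affected and cell i is working, and that every affected cell is repaired at its own rate μ_i. This rule reproduces the transitions described for states (0, 1, 1) and (0, 0, 1), but it does not include the simplification of Fig. 3 under which a recovered cell is not destroyed again. The function name `scheme1_generator` is hypothetical.

```python
from itertools import product

def scheme1_generator(lam, mu):
    """Build an assumed generator matrix for n cells, one repair facility per cell.

    lam[i] is the propagation rate into cell i+1 (lam[0] is unused because
    cell 1 is hit at t = 0); mu[i] is the repair rate of cell i+1.
    """
    n = len(mu)
    states = list(product((0, 1), repeat=n))
    index = {s: k for k, s in enumerate(states)}
    gen = [[0.0] * len(states) for _ in states]
    for s in states:
        row = gen[index[s]]
        for i in range(n):
            if s[i] == 0:
                # Repair brings the affected cell back to state 1.
                target = s[:i] + (1,) + s[i + 1:]
                row[index[target]] += mu[i]
            elif i > 0 and s[i - 1] == 0:
                # Assumed rule: the disaster spreads from an affected cell
                # to its working successor.
                target = s[:i] + (0,) + s[i + 1:]
                row[index[target]] += lam[i]
        # Diagonal entry makes each row sum to zero.
        row[index[s]] = -sum(row)
    return states, gen

states, gen = scheme1_generator([0.0, 5.0, 5.0], [0.04, 0.08, 0.12])
```

With this construction the state (1, 1, 1) has an all-zero row, which makes it absorbing, in line with the model description.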
Let π(t) = [π_(0,0,0)(t) ··· π_(X1,X2,X3)(t) ··· π_(1,1,1)(t)] denote the row vector of transient state probabilities at time t. The infinitesimal generator matrix of this CTMC is denoted Λ and is depicted in Fig. 4.
With Λ, the dynamic behavior of the CTMC can be described by the Kolmogorov differential-difference equation in matrix form, dπ(t)/dt = π(t) Λ. π(t) can be solved using the uniformization method. Let q_ii be the diagonal elements of Λ and I be the identity matrix; then the transient state probability vector is obtained as
π(t) = Σ_{k=0}^{∞} π(0) P^k e^{−βt} (βt)^k / k!
where β ≥ max_i |q_ii| is the uniformization rate parameter and P = I + Λ/β. When the summation is truncated at a large number K, the controllable error ε can be computed from
ε = 1 − e^{−βt} Σ_{k=0}^{K} (βt)^k / k!
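The uniformization method can be sketched in a few lines of plain Python. This is an illustrative implementation, not the authors' code; the function name `uniformization`, the stopping rule based on the accumulated Poisson weight, and the `max_terms` guard are assumptions. It is only suitable for moderate values of βt, because e^{−βt} underflows for very large arguments.

```python
import math

def uniformization(pi0, gen, t, eps=1e-10, max_terms=100000):
    """Approximate pi(t) for a CTMC with generator `gen` and initial vector `pi0`."""
    n = len(pi0)
    beta = max(abs(gen[i][i]) for i in range(n))
    if beta == 0.0 or t == 0.0:
        return list(pi0)
    # P = I + gen / beta is the transition matrix of the uniformized chain.
    P = [[(1.0 if i == j else 0.0) + gen[i][j] / beta for j in range(n)]
         for i in range(n)]
    weight = math.exp(-beta * t)        # Poisson weight for k = 0
    term = list(pi0)                    # holds pi(0) P^k
    result = [weight * x for x in term]
    cumulative = weight                 # sum of Poisson weights used so far
    for k in range(1, max_terms + 1):
        if 1.0 - cumulative <= eps:     # truncation error is below eps
            break
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= beta * t / k
        cumulative += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result

# Two-state check: one affected node repaired at rate 0.5, then absorbed.
check = uniformization([1.0, 0.0], [[-0.5, 0.5], [0.0, 0.0]], t=2.0)
```

For the two-state check the exact answer is known, π_0(t) = e^{−0.5 t}, which gives a quick way to validate the truncation before applying the routine to the 8-state example.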
Under this repair strategy (Scheme 2), all cells share the same repair facility. The repair sequence is the same as the propagation path cell 1 → cell 2 → cell 3. This restricts the set of possible states; accordingly, the transition diagram of the CTMC reduces to the six states illustrated in Fig. 5.
The probability that the system is in state k at time t is denoted by π_k(t), k = 0, ..., 5. These probabilities can be obtained in closed form by the convolution integration approach: inserting Eq. (8) into Eq. (2) and continuing by induction yields the closed-form expressions.
We remark that a simplification has been made in the transition diagrams in Fig. 3 and Fig. 5: a cell that has recovered from a hurricane is unlikely to be destroyed again by the same hurricane.
The expected instantaneous reward rate E[M(t)] captures the impact on the users of the system at time t. Given the number of users N_i of each cell i, the reward rate for each state is easily found.
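One plausible way to derive the reward rates from the user counts is sketched below. It assumes that the reward of a state is the fraction of users whose cell is working, which matches the "fraction of active users" reported in the numerical results but is not spelled out on the slides. The function names are hypothetical, and the user counts are the ones used in the numerical example.

```python
from itertools import product

def reward_rates(users):
    """Assumed reward of each state: fraction of users in working cells."""
    total = sum(users)
    states = product((0, 1), repeat=len(users))
    return {s: sum(n for n, x in zip(users, s) if x == 1) / total
            for s in states}

def expected_reward(pi, rewards):
    """E[M(t)] as the probability-weighted sum of the state rewards."""
    return sum(rewards[s] * p for s, p in pi.items())

rewards = reward_rates([3000, 5000, 2000])

# Right after the disaster all probability mass sits on state (0, 1, 1).
pi_start = {s: 0.0 for s in rewards}
pi_start[(0, 1, 1)] = 1.0
print(expected_reward(pi_start, rewards))
```

Keying both dictionaries by the state tuple avoids any dependence on the ordering of the states.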
The coverage radius of one BS is 1 km. For the three cells, we assume N_1 = 3000, N_2 = 5000, N_3 = 2000. For the setting of the propagation rates, we refer to the data from the Hurricane Katrina situation report. The peak wind speed was reported to be as high as 115 mph (184 km/h). The repair times of a BS are measured in hours. It is reasonable to assume that the disaster propagation rates are more than two orders of magnitude higher than the repair rates.
In Fig. 6, the chosen repair strategy is Scheme 1. Consider the scenario in which the fault propagation rates are high (λ_2 = 5, λ_3 = 5) and the repair rates are low (μ_1 = 0.04, μ_2 = 0.08, μ_3 = 0.12). In this scenario, the fraction of active users is low (roughly 0.07 two hours after the failure). If the repair rates are higher (μ_1 = 0.36, μ_2 = 0.72, μ_3 = 1.08), the fraction of active users increases sharply. The effect of the fault propagation rate is less evident over longer observation times (after 10 hours).
The plus-marked and dashed (blue) curves cross each other at time t ≈ 2 in Fig. 6. Over roughly the first two hours after the disaster, the fault propagation rates affect the service performance more than the repair rates do. In contrast, over longer periods, higher repair rates yield more benefit than lower fault propagation rates.
In the following, we compare three repair schemes:
Scheme 1
Scheme 2
Scheme 3: same as Scheme 1 but with doubled repair rates 2μ_1, 2μ_2, 2μ_3
We have modeled the survivability of an infrastructure-based wireless network by a CTMC that incorporates the correlated failures caused by disaster propagation. The focus has been on computing the transient reward measures of the model. Numerical results have been presented to study the impact of the underlying parameters and of different repair strategies on network survivability.
Thanks for listening!