E E 681 - Module 18 M.H. Clouqueur and W. D. Grover TRLabs & University of Alberta © Wayne D. Grover 2002, 2003 Analysis of Path Availability in Span-Restorable.


1 Analysis of Path Availability in Span-Restorable Mesh Networks
M.H. Clouqueur and W. D. Grover, TRLabs & University of Alberta
E E 681 - Module 18, © Wayne D. Grover 2002, 2003

2 Review of Mesh Design
Motivation: Something must be done to reduce the impact of network element failures on service availability.
Solution: Mesh restoration mechanism (requires extra capacity).
Capacity planning methods:
- Max Latching
- Herzberg
- Modular capacity placement
- Joint working-spare capacity placement
These methods are more and more capacity-efficient, but what about availability?
[Figure: intuitive sketch relating Availability and Capacity]
Questions to answer:
- How much does mesh restoration improve the availability of service?
- How does the availability depend on the total capacity?

3 What Causes Unavailability?
- Single span failures
- Multiple span failures
- Node failures
- Span maintenance services
- Combinations of the above
Which are the most important? What we need to compare:
- Number of such events
- Probability of bringing the system under study into the down state
Example: [not preserved in transcript]
From this comparison, the major contributor to unavailability appears to be the combination of a span failure and a span maintenance service (equivalent to a dual span failure in the worst case).

4 Impact of Failures
For the previous comparison we could only guess or make assumptions about the impact of each failure category. Examples:
- Single span failure: Impact = 0 (network fully restorable to single failures)
- Dual span failure: Impact = 0.5 (at least half of the traffic on average should be restorable)
Determination of availability of service paths: we need to know the exact impact of each failure scenario on the availability of that service path.
Availability analysis of path p:
- Failure of (S_1, S_2) → Impact = 0.3321
- Failure of (S_1, S_3) → Impact = 0.0000
- Failure of (S_i, S_j) → Impact = 0.5243
We need a tool that determines the probability of path p being down for any given set of failed spans...

5 Problem of Independence of Span Failures
The contribution of a failure event to the unavailability is: [equation not preserved]
For a dual span failure: [equation not preserved]
This is based on the assumption that failures of S_i and S_j are independent.
Special case: common cable sheaths.
[Figure: spans S_1, S_2 and S_3, with the annotation "This span does not really exist but rather"]
In that case: [equation not preserved]
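The equations on this slide are not shown, so the following is only a sketch of the idea. It assumes that the contribution of a failure event is the probability of the event multiplied by its impact, and that for two independent spans the joint probability is the product of the two span unavailabilities. The shared-sheath case is modelled, as a further assumption, by a single unavailability value `u_common` for the common cable. The function names and `u_common` are hypothetical; the numeric values are taken from the examples elsewhere in the module.

```python
def dual_failure_contribution(u_i: float, u_j: float, impact: float) -> float:
    """Unavailability contribution of spans i and j being down together,
    assuming (as the slide does) that their failures are independent."""
    return u_i * u_j * impact


def common_cause_contribution(u_common: float, impact: float) -> float:
    """Same event when both spans share one cable sheath (assumption): a single
    cut takes both down, so the joint probability is roughly the sheath's own
    unavailability rather than a product of two small numbers."""
    return u_common * impact


U_S = 3e-4       # span unavailability assumed in the module's numeric example
IMPACT = 0.5243  # illustrative impact value from the path-p example

independent = dual_failure_contribution(U_S, U_S, IMPACT)
shared = common_cause_contribution(U_S, IMPACT)
print(f"independent: {independent:.3e}, shared sheath: {shared:.3e}")
```

Under these assumptions the shared-sheath event contributes a factor of 1/U_s more unavailability than the independent dual failure, which is why the independence assumption matters.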

6 Path Availability Calculation
Exact expression: [equation not preserved]
Simplified approach: [equation not preserved]
where U*_link,i is the equivalent link unavailability on span i.
Advantage: we only need to compute one value for each span and then use those values for the calculation of end-to-end availability.
Drawback of the simplified approach: some failure events contribute to the unavailability of links on several spans in a neighbourhood and can therefore be counted several times when summing the U*_link,i values.
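As a minimal sketch of the simplified approach, the path unavailability can be approximated by summing the equivalent link unavailabilities of the spans the path crosses. For comparison, the code also evaluates the standard series-system formula, which treats the links as independent; that formula is an assumption used for illustration and is not necessarily the exact expression on the slide. The function names are hypothetical.

```python
from math import prod


def path_unavailability_sum(link_unavail):
    """Simplified approach: add up the equivalent link unavailabilities
    U*_link,i of every span crossed by the path."""
    return sum(link_unavail)


def path_unavailability_series(link_unavail):
    """Series-system formula, assuming independent links:
    the path is up only if every link on it is up."""
    return 1.0 - prod(1.0 - u for u in link_unavail)


# Five hops, each with the U*_link value quoted in the module's example.
hops = [9.18e-7] * 5
approx = path_unavailability_sum(hops)
series = path_unavailability_series(hops)
print(f"sum: {approx:.4e}  series: {series:.4e}")
```

For unavailabilities this small the two values agree to many digits, so the double counting described in the drawback, rather than the summation itself, is the main source of error.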

7 Link Equivalent Unavailability
Concept of equivalent unavailability:
- Non-restorable network: U_link = U_span (physical unavailability) → when the span is down, the link is down.
- Restorable network: U_link = U*_link (equivalent link unavailability). U*_link is different from U_span because of the restoration mechanism.
We will see that U*_link is on the order of U_span^2, therefore U*_link << U_span.

8 Derivation of U*_link
It can easily be shown that the expected number of failed and non-restored links in the network at any time is: [equation not preserved]
where
- S: total number of spans
- U_s: average physical unavailability of spans
- w: average working capacity of spans
- R_2: average dual-failure restorability of links (the only unknown)
In general U*_link,i can be defined as: [equation not preserved]
where N_ab is the number of non-restored working units in the case of failure of span a and span b.
A span-specific average U*_link(i) can be obtained using the span-specific average R_2(i), calculated over the S-1 dual-failure scenarios involving span i.
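The equation images for this slide are not available, so the following is a reconstruction from the symbol definitions rather than the authors' own notation. It assumes independent span failures, so that any given pair of spans is down simultaneously with probability U_s^2, and that each dual failure affects 2w working links on average. With S = 37, U_s = 3 × 10^-4 and R_2 = 0.716735, the second line gives 9.18 × 10^-7, which matches the value quoted in the module's EuroNetA example.

```latex
\begin{align}
E[N_{\text{failed}}]
  &= \underbrace{\frac{S(S-1)}{2}}_{\text{span pairs}} \cdot U_s^{2} \cdot 2w \cdot (1 - R_2)
   = S(S-1)\, w\, U_s^{2}\, (1 - R_2) \\
U^{*}_{\text{link}}
  &= \frac{E[N_{\text{failed}}]}{S\, w}
   = (S-1)\, U_s^{2}\, (1 - R_2) \\
R_2
  &= 1 - \frac{\sum_{a<b} N_{ab}}{\sum_{a<b} (w_a + w_b)} \\
U^{*}_{\text{link}}(i)
  &= (S-1)\, U_s^{2}\, \bigl(1 - R_2(i)\bigr)
\end{align}
```

Here w_a and w_b denote the working capacities of spans a and b, an assumed notation. The first line counts failed, non-restored links over all span pairs; dividing by the S·w links in the network gives the average equivalent unavailability of one link.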

9 Determination of R_2
There is no closed-form model for R_2, as the impact of each failure scenario depends on several factors specific to the failure case. However, failure events can be divided into a few main categories (unavailability sequences):
- Case 0: Span failure and w_i > feasible spare paths → not possible by definition in a restorable network
- Case 1: Two failures but no spatial interactions → no outage
- Case 2: Two failures and spatial interactions (competition for spare capacity) → may be outage
- Case 3: Two failures with second failure hitting the first restoration pathset → may be outage
- Case 4: Two failures isolating a degree-2 node → certain outage

10 Impact of Dual Failures
Example #1, no spatial interaction:
[Figure: two failed spans with W = 3 and W = 2, and restoration paths carrying 3 and 2 units. W: working capacity, S: spare capacity]
The two restoration paths do not interfere → NO OUTAGE

11 Impact of Dual Failures
Example #2, spatial interaction - capacity dependency:
[Figure: two failed spans with W = 3 and W = 2 whose restoration paths share spare capacity S. W: working capacity, S: spare capacity]
S < 5 or S > 5? Is there enough spare capacity to restore both failures?
→ POSSIBLE OUTAGE depending on the value of S
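The capacity question in Example #2 can be sketched as a simple count. The code below assumes the worst case in which every restoration path for both failed spans must cross one shared span holding S spare units; the function name and that single-bottleneck assumption are illustrative and not taken from the slide.

```python
def unrestored_units(failed_working, shared_spare):
    """Working units left unrestored when all failed spans compete for the
    same pool of spare capacity (single-bottleneck assumption)."""
    demand = sum(failed_working)
    return max(0, demand - shared_spare)


# Two failed spans with W = 3 and W = 2, as in the example.
for spare in (3, 5, 6):
    lost = unrestored_units([3, 2], spare)
    print(f"S = {spare}: {lost} working unit(s) not restored")
```

With fewer than 5 spare units some working units stay down, which is the "possible outage" outcome; with 5 or more both failures are fully restored.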

12 Impact of Dual Failures
Example #3, spatial interaction - special case:
[Figure: failed span with W = 2 and its restoration path carrying 2 units. W: working capacity, S: spare capacity]
The second failure hits the restoration path set deployed for the first failed span. The outcome in this situation depends on the adaptability of the restoration mechanism and on the amount of remaining spare capacity.
→ POSSIBLE OUTAGE

13 Impact of Dual Failures
Example #4, isolated node:
[Figure: two failed spans with W = 3 and W = 2. W: working capacity, S: spare capacity]
Nothing can be done to restore either of the two failures → OUTAGE

14 Adaptability of the restoration mechanism
[Figure: three panels showing spans S_1 and S_2 under each behaviour]
Static behaviour:
- The restoration preplan says: "S_2 is to be restored through S_1"
Partly adaptive behaviour:
- S_2 is restored via another route where spare capacity is available
- S_1 is left unrestored
Fully adaptive behaviour:
- S_2 is restored via another route where spare capacity is available
- S_1 is restored again (if possible) with release of spare capacity previously used for restoration of span S_1 (similar to path restoration's stub release)
Optional: The spare capacity used on span S_2 gets "working status" and benefits from the restoration effort for S_2.

15 Results of Case Studies
Typical test network: [figure not preserved]
R_2 results for 5 test networks: [table not preserved]
* Designed with Optimal Modular Spare Capacity Placement
With a fully adaptive behaviour in a modular environment, the working units enjoy almost full restorability to any dual span failure.

16 Improvement over a non-restorable network
Path availability improvement example:
- Test network: EuroNetA (19 nodes, 37 spans)
- Reference path: 5 hops
- Assumption: U_s = 3 × 10^-4
If the network is non-restorable: U_link = U_span, U_path = 15 × 10^-4 = 13 hrs/year
If the network is restorable, the simulation with fully adaptive behaviour gives: R_2 = 0.716735 → U*_link = 9.18 × 10^-7 → U_path = 4.59 × 10^-6 = 2.4 min/year
Making a network restorable to single span failures brings a considerable improvement in the average availability of service paths. For specific services it might still not be enough...
How can we make service paths even more available?
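The figures in this example can be re-derived in a few lines. The sketch assumes the equivalent link unavailability takes the form U*_link = (S - 1) · U_s^2 · (1 - R_2); the slide does not show the formula, but this form reproduces the quoted 9.18 × 10^-7. It also assumes the path unavailability is the hop count times the per-link value. Variable names are illustrative.

```python
HOURS_PER_YEAR = 8760
MINUTES_PER_YEAR = HOURS_PER_YEAR * 60

S = 37          # spans in EuroNetA
U_S = 3e-4      # assumed physical span unavailability
R2 = 0.716735   # simulated average dual-failure restorability
HOPS = 5        # length of the reference path

# Non-restorable network: every span on the path contributes its full U_s.
u_path_plain = HOPS * U_S
hours_down = u_path_plain * HOURS_PER_YEAR

# Restorable network: only unrestored dual failures contribute (assumed form).
u_link_eq = (S - 1) * U_S**2 * (1 - R2)
u_path_restorable = HOPS * u_link_eq
minutes_down = u_path_restorable * MINUTES_PER_YEAR

print(f"non-restorable: U_path = {u_path_plain:.2e} ({hours_down:.1f} h/year)")
print(f"restorable:     U*_link = {u_link_eq:.3e}, "
      f"U_path = {u_path_restorable:.3e} ({minutes_down:.1f} min/year)")
```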

17 Design for High Availability
The idea is to provision a network from an availability standpoint. Two integer programming formulations were developed:
- Dual Failure Minimum Capacity (DFMC): finds the minimum capacity assignment for full restorability to dual failures (R_2 = 100%). Note: cannot be used for networks with any degree-2 graph cut.
- Dual Failure Max Restorability (DFMR): finds the spare capacity placement that maximizes the average restorability to dual failures for a given spare capacity budget.

18 R_2 Design - Experimental Results
Cost of improving the R_2 restorability (total working capacity: 405 units):
- Spare capacity for R_1 = 100% → 223 units (55% redundancy)
- Spare capacity for R_2 = 100% → 628 units (155% redundancy)
To go from R_2 = 80% to R_2 = 100% we need to almost TRIPLE the spare capacity.
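As a quick check of these figures, redundancy here is taken to mean spare capacity divided by working capacity, which is an assumption consistent with the quoted percentages. The helper name is hypothetical.

```python
WORKING = 405    # total working capacity (units)
SPARE_R1 = 223   # spare capacity for R1 = 100%
SPARE_R2 = 628   # spare capacity for R2 = 100%


def redundancy(spare: int, working: int) -> float:
    """Spare-to-working capacity ratio."""
    return spare / working


r1 = redundancy(SPARE_R1, WORKING)
r2 = redundancy(SPARE_R2, WORKING)
growth = SPARE_R2 / SPARE_R1

print(f"R1 design: {r1:.0%} redundancy")
print(f"R2 design: {r2:.0%} redundancy")
print(f"spare capacity grows by a factor of {growth:.2f}")
```

The growth factor of roughly 2.8 between the two designs is the "almost triple" cost noted on the slide.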

19 Conclusion of R_2 studies
It is very costly to guarantee R_2 restorability to all service paths in the network. However, in most cases of dual span failures the restoration mechanism is able to restore part or all of the failed working units.
Idea: For little or no extra capacity it should be possible to guarantee full restorability to dual failures to selected network connections.
[Figure: restoration of a higher-priority connection; failed spans with W = 3 (including 1 with higher priority) and W = 2, spare capacity S = 3]
Constraint modification for the Dual Failure Minimum Capacity formulation (DFMC): instead of "For any dual failure, restoration paths must be found for all failed working units" we now have "For any dual failure, restoration paths must always be found for all working units requiring R_2 restorability".

20 Multi-Priority Mesh Design
Subject to: [constraint equations not preserved; their annotations follow]
- In case of single failure of span i, restore all working units that require R_1 restorability
- In case of failure of spans i and j, restore all working units (x_i) that require R_2 restorability
- Route p cannot be used if it crosses one of the two failed spans
- Spare capacity is needed to support restoration of single span failures
- Spare capacity is needed to support restoration of dual span failures
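The constraint equations are missing from this slide, so what follows is only one plausible way to write constraints matching the five annotations; it should not be read as the authors' formulation. Apart from x_i, every symbol is an assumption introduced for illustration: P_i is the set of eligible restoration routes for span i, w_i^{R1} the working units on span i needing at least R_1 restorability, c_k the cost of a spare unit on span k, s_k the spare capacity on span k, f_i^p and f_{i,j}^p the restoration flows on route p for span i under a single failure and under the dual failure (i, j), and δ_{i,k}^p equals 1 when route p for span i crosses span k.

```latex
\begin{align}
\min \; & \sum_{k} c_k\, s_k \\
\text{s.t.}\;
& \sum_{p \in P_i} f_i^{p} \;\ge\; w_i^{R1}
  && \forall i \\
& \sum_{p \in P_i} f_{i,j}^{p} \;\ge\; x_i
  && \forall i \ne j \\
& \delta_{i,j}^{p}\, f_{i,j}^{p} \;=\; 0
  && \forall i \ne j,\; p \in P_i \\
& s_k \;\ge\; \sum_{p \in P_i} \delta_{i,k}^{p}\, f_i^{p}
  && \forall i \ne k \\
& s_k \;\ge\; \sum_{p \in P_i} \delta_{i,k}^{p}\, f_{i,j}^{p}
        + \sum_{p \in P_j} \delta_{j,k}^{p}\, f_{j,i}^{p}
  && \forall i \ne j,\; k \notin \{i, j\}
\end{align}
```

The five constraint rows follow the five annotations in order: single-failure restoration, dual-failure restoration of the x_i priority units, exclusion of routes that cross the other failed span, and spare capacity sizing for single and for dual failures.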

21 Network Availability Simulator
The objective of the simulator is to obtain information about the availability of end-to-end network connections by generating span failures at random times and analyzing the restoration depending on connection priorities.
The simulator determines times of span failures and repairs according to given statistical distributions.
[Figure: time axis t divided into stages 1 to 6 by failure and repair events on spans 12, 5, 2, 5 and 8]
At each stage: Set of failed spans → Restoration analyzer → Set of lost connections → Connections outage recorder
Characteristics of a network connection:
- Origin node
- Destination node
- Size (STS-1, STS-3, STS-12, ...)
- Restorability requirement (R0, R1, R2)
- Routing between O and D

22 Network Availability Simulator
Advantages of the simulator:
- Confirm results obtained with theoretical availability expressions based on R_2.
- Obtain information about the distribution of outage times (1000 outages of 0.1 sec have a different impact than 10 outages of 10 sec).
- Possibility to use different distributions of time-to-repair and time-between-failures for each span.
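The simulator's stage-by-stage loop can be sketched as a small event-driven program. This is not the authors' tool: it assumes exponentially distributed times to failure and to repair with the same means for every span, and it replaces the restoration analyzer with a caller-supplied predicate `is_lost`, which is a hypothetical stand-in. The example predicate makes the pessimistic assumption that an R1 connection is lost whenever a span on its route is down while any second span is also down.

```python
import heapq
import random


def simulate(num_spans, mttf, mttr, horizon, is_lost, seed=0):
    """Generate span failures and repairs at random times and record the
    outage durations of one connection.

    is_lost(failed_spans) -> bool stands in for the restoration analyzer.
    Returns a list with one duration per outage.
    """
    rng = random.Random(seed)
    # One pending event per span: (time, span, kind).
    events = [(rng.expovariate(1.0 / mttf), s, "fail") for s in range(num_spans)]
    heapq.heapify(events)
    failed = set()
    outages = []
    outage_start = None

    while events:
        t, span, kind = heapq.heappop(events)
        if t > horizon:
            break  # earliest pending event is already past the horizon
        if kind == "fail":
            failed.add(span)
            heapq.heappush(events, (t + rng.expovariate(1.0 / mttr), span, "repair"))
        else:
            failed.discard(span)
            heapq.heappush(events, (t + rng.expovariate(1.0 / mttf), span, "fail"))

        # A new stage begins: re-evaluate the connection for this failure set.
        lost = is_lost(failed)
        if lost and outage_start is None:
            outage_start = t
        elif not lost and outage_start is not None:
            outages.append(t - outage_start)
            outage_start = None

    if outage_start is not None:
        outages.append(horizon - outage_start)  # outage still open at the end
    return outages


ROUTE = {0, 1, 2}  # hypothetical spans crossed by the connection


def r1_connection_lost(failed):
    """Pessimistic stand-in: lost if a route span and any second span are down."""
    return bool(failed & ROUTE) and len(failed) >= 2


HORIZON = 1e6
outages = simulate(10, mttf=1000.0, mttr=10.0, horizon=HORIZON,
                   is_lost=r1_connection_lost)
unavailability = sum(outages) / HORIZON
print(f"{len(outages)} outages, unavailability = {unavailability:.2e}")
```

Keeping the individual durations, rather than only their sum, is what lets a simulator of this kind report the distribution of outage times. Per-span distributions could be supported by passing a sampler per span instead of the shared `mttf` and `mttr` values.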

23 Mesh/Ring Availability Comparison
Single span failures:
- Mesh: Full protection
- Rings: Full protection
Dual span failures with no spatial interaction:
- Mesh: Full protection
- Rings: Full protection
Dual span failures with spatial interaction:
- Mesh: Protection from 0% to 100% of the working units, depending on available spare capacity and adaptability of the restoration mechanism
- Rings (2 span failures on same ring): Protection of about 2/3 of the traffic (demands that are not isolated by the 2 span failures)
For connections requiring R1 restorability, the ring-based solution and the mesh solution provide similar levels of availability.

24 Mesh/Ring Availability Comparison
Possibility of guaranteeing R2 restorability:
- Mesh networks: Yes, with adequate design and an adaptive restoration mechanism.
- Ring networks: No, in certain cases restorability to dual failures cannot be guaranteed.
Example: [Figure: ring with a connection between its origin and exit nodes] The connection is lost whatever its priority level is.
Conclusion: The mesh architecture seems to be more appropriate than rings to serve demands with high availability requirements.

