High availability survivable networks Wayne D. Grover, Anthony Sack 9 October 2007 High Availability Survivable Networks: When is Reducing MTTR Better.

Presentation transcript:

High Availability Survivable Networks: When is Reducing MTTR Better than Adding Protection Capacity?
Wayne D. Grover, Anthony Sack
9 October 2007
Presented at: DRCN 2007, 7-10 October 2007, La Rochelle, France

Segue: back to DRCN 2005 in beautiful Ischia, at the closing Panel Discussion.
Question for Dr. Grover (paraphrasing): “In a network that is already designed for single-failure restorability, to get yet higher availability of services, would you think it is better to add still more spare capacity to increase the dual-failure restorability, or to invest at that point in MTTR reduction to enhance availability?”
Good question! ...and it led to this study.

Why this question is so insightful
To minimize the impact of dual-failure events, we can:
- Reduce physical MTTR values for network spans
  - Physical repairs will happen faster.
  - Time spent in an overlapping dual-failure repair state will go down as MTTR^2.
  - As MTTR -> 0 there are no dual failures.
- Increase the network restorability to dual span failures
  - By adding more spare capacity, fewer dual-failure pairs will be outage-causing.
  - R2 = 1 means a triple failure will be needed to cause outage!
Which is the best approach? Is there an optimal investment strategy?

In a survivable network, MTTR takes on new importance
The key point is that in an R1 = 1 survivable network, outage requires two failures that interact in the network (restoration-wise) and whose repair processes overlap in time. Otherwise the two failures are simply time-successive single failures.
This means that in a survivable network, unavailability drops as the square of the MTTR!

Illustrating the principle
[Figure: timelines of two repair intervals, TTR_Failure_1 and TTR_Failure_2, shown for decreasing and for increasing time to repair. While only one repair is in progress there is no risk of outage (R1 = 1). While the two repairs overlap and R2 < 1 there is a risk of outage, with duration proportional to the repair overlap time, which reduces as the time to repair decreases.]
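The overlap idea in the figure can be made concrete with a small calculation. The sketch below is only an illustration: the function name `repair_overlap` and the failure times are hypothetical, and it assumes each repair occupies one continuous interval starting when the failure occurs.

```python
def repair_overlap(start_1, ttr_1, start_2, ttr_2):
    """Return the time (same units as the inputs) during which both repairs are in progress."""
    end_1 = start_1 + ttr_1
    end_2 = start_2 + ttr_2
    # The overlap is the intersection of the two repair intervals, or zero if they are disjoint.
    return max(0.0, min(end_1, end_2) - max(start_1, start_2))


# Hypothetical example: the second failure occurs 4 hours after the first.
for ttr in (12.0, 6.0, 3.0):
    print(ttr, repair_overlap(0.0, ttr, 4.0, ttr))
```

With the failures 4 hours apart, a 12-hour repair time leaves 8 hours of exposure, a 6-hour repair time leaves 2 hours, and a 3-hour repair time leaves none, so the two failures become time-successive single failures.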

Study mandate and design
Explore the trade-off between availability improvement through R2 enhancement and through MTTR reduction.
What is the most cost-effective strategy for combined investment in capacity additions and repair time improvements to maximize availability?
Framework: a total availability investment that can be split anywhere between 100% to dual-failure restorability (R2) and 100% to physical MTTR reduction.
This is interesting because the response of both variables to increasing investment is not linear.

Some terminology
- “R1” is the level of single-failure restorability (range 0 to 1).
- “R2” is the level of dual-failure restorability (range 0 to 1 as well).
Examples:
- R1 = 1 indicates a network fully restorable to all single failures.
- R2 = 0.60 means that, on average, 60% of failed working capacity units (or service paths) are restorable under dual failures.

Typical capacity cost profile of enhancing R2
Any network designed for 100% R1 will always have a non-zero R2 level as well. This is true even if the R1 network design is optimal.
The R2 vs. cost curve then asymptotically approaches unity: there is always a diminishing return to further capacity investment.
This characteristic curve shape is well known in the literature. Exact shapes for this curve have been found for different networks in, for example, the Ph.D. thesis by Clouqueur.

Properties of MTTR versus expenditure
The shape of the MTTR vs. cost curve is much less certain, but a plausible parametric model seems defensible.
For example, initial investments lead to large MTTR reductions, with diminishing returns thereafter.
Conceivably, however, this curve could also be convex, with initial investments leading to only small reductions, and larger investments required for larger changes.
Both scenarios will be tested in our experimental calculations.

Theoretical model development
Consider the availability of an individual reference path in an R1 survivable network. (For conceptual investigation, all spans are taken as identical; S: number of spans in the network, N: number of spans on the path.)
Terms annotated in the path availability expression:
- Number of dual-failure scenarios that may cause service path outage (one failure on the path; both failures on the path)
- Probability of any dual-failure event actually occurring
- Dual-span failure restorability of the network
This expression excludes contributions due to triple (or higher-order) independent failure scenarios.
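The expression that these labels annotate can be sketched as follows. This is a reconstruction from the slide's term labels, not the authors' exact notation: the symbol U_s (unavailability of a single span) is introduced here for illustration, and the dual-failure probability is assumed to be U_s squared because all spans are taken as identical and independent.

```latex
% One failure on the path:   N(S-N) span pairs
% Both failures on the path: N(N-1)/2 span pairs
A_{path} \approx 1 - \underbrace{\left[ N(S-N) + \frac{N(N-1)}{2} \right]}_{\text{dual-failure scenarios}}
  \; \underbrace{U_s^{2}}_{\text{dual-failure probability}}
  \; \underbrace{(1 - R_2)}_{\text{unrestored fraction}}
```

For N = 6 and S = 20 the bracketed count is 6 x 14 + 15 = 99 span pairs.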

Relation of unavailability to MTTR
We can now express the span unavailability in terms of MTTR, using the well-known expression and its approximation (where λ is the failure rate).
Also:
- Switch to an unavailability orientation.
- Algebraically simplify the “number of scenarios” term from before.
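The steps named on this slide can be written out as below. This is a sketch under stated assumptions: U_s for span unavailability is a symbol introduced here, λ is taken as 1/MTTF, and the scenario count is assumed to be the sum of the N(S-N) one-failure-on-path pairs and the N(N-1)/2 both-failures-on-path pairs.

```latex
\begin{align}
U_s &= \frac{MTTR}{MTTF + MTTR} \approx \lambda \cdot MTTR
      && (MTTF \gg MTTR,\ \lambda = 1/MTTF) \\
U_{path} &= 1 - A_{path} \\
N(S-N) + \frac{N(N-1)}{2} &= \frac{N(2S - N - 1)}{2}
\end{align}
```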

Unavailability as a function of MTTR and R2
We now have the operative expression that relates unavailability to both MTTR and R2.
Note that unavailability responds linearly to R2, but to the square of the MTTR or of the failure rate (λ).
Are availability improvements best gained through MTTR improvements alone, or through some type of combined strategy with R2 enhancement?
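The linear-versus-square behaviour can be checked numerically. The sketch below assumes the operative expression has the form U_path = [N(2S - N - 1)/2] (λ MTTR)^2 (1 - R2), which is a reconstruction from the slide's description rather than a quotation. N = 6 and S = 20 are the values given later in the deck; the failure rate, the MTTR values, and the function name are hypothetical.

```python
def path_unavailability(n_path, n_spans, failure_rate, mttr, r2):
    """Dual-failure path unavailability under the assumed expression.

    U_path = [N(2S - N - 1)/2] * (lambda * MTTR)^2 * (1 - R2)
    """
    scenarios = n_path * (2 * n_spans - n_path - 1) / 2  # dual-failure span pairs touching the path
    span_unavailability = failure_rate * mttr            # U_s ~ lambda * MTTR
    return scenarios * span_unavailability ** 2 * (1.0 - r2)


N, S = 6, 20      # reference path length and span count from the slides
LAMBDA = 1e-4     # hypothetical failure rate (failures per hour per span)

base = path_unavailability(N, S, LAMBDA, mttr=12.0, r2=0.5)
half_mttr = path_unavailability(N, S, LAMBDA, mttr=6.0, r2=0.5)
half_unrestored = path_unavailability(N, S, LAMBDA, mttr=12.0, r2=0.75)

print("halving MTTR divides U_path by", base / half_mttr)
print("halving (1 - R2) divides U_path by", base / half_unrestored)
```

Halving the MTTR cuts the unavailability by a factor of about four, while halving the unrestored fraction (1 - R2) cuts it by a factor of about two.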

MTTR and R2 as functions of cost
To study the economic trade-off, we define MTTR and R2 as functions of cost (C_m spent on MTTR reduction, C_r spent on R2 enhancement).
If C_m + C_r is a constant total budget amount, then an optimum split of total investment must exist which minimizes U_path.
What is this optimum split for each of the two MTTR characteristic curve shapes postulated earlier?

Experimental calculations
For experimental calculations, we set a total of 100 “arbitrary cost units” to be divided between MTTR and R2 (i.e. total investment remains constant; only the allocation changes).
Data points from the characteristic curves for both variables were used in the equation just presented to generate new curves for the unavailability of a typical reference path.
Other assumed parameters:
- N = 6 (length of the reference path)
- S = 20 (number of spans in the network)
- λ = (failure rate per span)
Test one: concave MTTR
Test two: convex MTTR
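The budget sweep described here can be sketched in a few lines. Everything numerical below except N = 6, S = 20, and the 100-unit budget is an assumption made for illustration: the deck's actual curve data points and failure rate are not reproduced in this transcript, so the curve functions, their parameters, and `LAMBDA` are hypothetical stand-ins with the qualitative shapes the slides describe. The unavailability formula is the same reconstructed form, U_path = [N(2S - N - 1)/2] (λ MTTR)^2 (1 - R2).

```python
import math

N, S = 6, 20          # from the slides
TOTAL_BUDGET = 100    # arbitrary cost units, from the slides
LAMBDA = 1e-4         # hypothetical failure rate per span (per hour)


def r2_curve(cost):
    """Hypothetical R2 vs. cost: non-zero at zero spend, approaching 1 asymptotically."""
    return 1.0 - 0.5 * math.exp(-cost / 20.0)


def mttr_test_one(cost):
    """Hypothetical 'test one' shape: large early MTTR reductions, then diminishing returns."""
    return 2.0 + 10.0 * math.exp(-cost / 15.0)


def mttr_test_two(cost):
    """Hypothetical 'test two' shape: early spending buys little, large spending buys more."""
    return 12.0 - 10.0 * (cost / TOTAL_BUDGET) ** 3


def path_unavailability(mttr, r2):
    scenarios = N * (2 * S - N - 1) / 2
    return scenarios * (LAMBDA * mttr) ** 2 * (1.0 - r2)


def best_mttr_spend(mttr_curve):
    """Return the MTTR spend (0..100, balance to R2) that minimizes path unavailability."""
    return min(
        range(TOTAL_BUDGET + 1),
        key=lambda c_m: path_unavailability(mttr_curve(c_m), r2_curve(TOTAL_BUDGET - c_m)),
    )


print("test one, best MTTR spend:", best_mttr_spend(mttr_test_one))
print("test two, best MTTR spend:", best_mttr_spend(mttr_test_two))
```

With these assumed curves the first shape gives an interior optimum, with roughly a third of the budget going to MTTR reduction, and the second puts the whole budget on R2. The specific numbers depend entirely on the assumed curve parameters; only the qualitative contrast between the two shapes is the point.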

Experimental R2 curve (both tests)
[Figure: average R2(i,j) achieved vs. % of total budget spent on R2 enhancement]

Experimental MTTR curve: concave (test one)
[Figure: MTTR (hours) vs. % of total budget spent on MTTR reduction]

Test one result
[Figure: reference path unavailability vs. % of total budget spent on MTTR reduction (balance goes to R2 enhancement), with Region 1, Region 2, and Region 3 marked]

Test one: discussion
Three distinct regions are evident.
In Region 1, availability greatly benefits from relatively easily obtained initial reductions in MTTR.
In Region 3, MTTR reduction is a matter of diminishing returns. It would have been better to add more capacity with the same money, to enhance R2.
In Region 2, the overall unavailability is lowest and is not very sensitive to exactly how the budget is split between R2 and MTTR.

Experimental MTTR curve: convex (test two)
[Figure: MTTR (hours) vs. % of total budget spent on MTTR reduction]

Test two result
[Figure: reference path unavailability vs. % of total budget spent on MTTR reduction (balance goes to R2 enhancement)]

Test two discussion
This is the curve portraying the case where it is very difficult (costly) to obtain any MTTR reductions.
We think this a less plausible shape, but it is worthwhile as a “what if” to show the range of strategies on which this analysis can inform an operator.
In this case, the preferred strategy leans strongly toward capacity addition to enhance R2.

Concluding comments
This suffices to show that, at least conceptually, an optimum combined strategy in MTTR and R2 investment exists.
It is a unique and interesting phenomenon arising specifically in the context of networks that are already “R1=1” survivable by design. (Note MTTR has no special role for R1=1 design.)
Once a network is R1=1, however, MTTR takes on new importance because thereafter U ~ O(MTTR^2).
Other factors to consider:
- MTTR improvements are probably annual expenses (manpower).
- R2 improvement is, however, probably a capital investment.
- Added capacity never hurts in a network (throughput, flexibility, growth).
- But fast repairs will be directly appreciated by users too.

Thank you
(And thanks again for the great question from the DRCN 2005 Panel Discussion!)