Dependability & Maintainability Theory and Methods
Part 2: Repairable systems: Availability
Andrea Bobbio, Dipartimento di Informatica, Università del Piemonte Orientale “A. Avogadro”, Alessandria (Italy)
IFOA, Reggio Emilia, June 17-18, 2003

Repairable systems
$X_1, X_2, \ldots, X_n$: successive UP times
$Y_1, Y_2, \ldots, Y_n$: successive DOWN times
[Figure: system state over time t, alternating between UP intervals $X_1, X_2, X_3$ and DOWN intervals $Y_1, Y_2$]

Repairable systems
The usual hypothesis in modeling repairable systems is that:
- The successive UP times $X_1, X_2, \ldots, X_n$ are i.i.d. random variables, i.e. samples from a common cdf F(t).
- The successive DOWN times $Y_1, Y_2, \ldots, Y_n$ are i.i.d. random variables, i.e. samples from a common cdf G(t).

Repairable systems
The dynamic behaviour of a repairable system is characterized by:
- the r.v. X of the successive up times
- the r.v. Y of the successive down times
[Figure: system state over time t, alternating between UP intervals $X_1, X_2, X_3$ and DOWN intervals $Y_1, Y_2$]

Maintainability
Let Y be the r.v. of the successive down times:
$G(t) = \Pr\{Y \le t\}$ (maintainability)
$g(t) = \frac{dG(t)}{dt}$ (density)
$h_g(t) = \frac{g(t)}{1 - G(t)}$ (repair rate)
$\mathrm{MTTR} = \int_0^{\infty} t \, g(t) \, dt$ (Mean Time To Repair)

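The definitions above can be illustrated with a minimal sketch. It assumes, purely for illustration, an exponentially distributed down time with a hypothetical repair rate `MU` of 0.5 repairs per hour; the function names are not from the slides. In this special case the repair rate is constant and the MTTR should come out close to 1/MU = 2 hours.

```python
import math

MU = 0.5  # hypothetical repair rate (repairs per hour), chosen for illustration


def G(t):
    """Maintainability: probability that the repair is completed by time t."""
    return 1.0 - math.exp(-MU * t)


def g(t):
    """Density of the down time, g(t) = dG(t)/dt."""
    return MU * math.exp(-MU * t)


def repair_rate(t):
    """Repair rate h_g(t) = g(t) / (1 - G(t))."""
    return g(t) / (1.0 - G(t))


def mttr(upper=100.0, steps=200_000):
    """Approximate MTTR = integral of t * g(t) over [0, inf).

    Uses the trapezoidal rule and truncates the integral at `upper`,
    where the exponential tail is negligible for this choice of MU.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t * g(t)
    return total * h
```

For a non-exponential G(t), only the bodies of `G` and `g` would change; the repair rate would then generally depend on t.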
Availability
The availability A(t) of an item at time t is the probability that the item is correctly working at time t.

Availability
The measure to characterize a repairable system is the availability (unavailability):
$A(t) = \Pr\{\text{system is UP at time } t\}$
$U(t) = \Pr\{\text{system is DOWN at time } t\}$
$A(t) + U(t) = 1$

Definition of Availability
An important difference between reliability and availability is:
- reliability refers to failure-free operation during an interval (0, t);
- availability refers to failure-free operation at a given instant of time t (the time when a device or system is accessed to provide a required function), independently of the number of failure/repair cycles.

Definition of Availability
[Figure: system failure and restoration process. The indicator function I(t) is plotted against time t; it equals 1 while the system is operating and providing a required function, and 0 while it is failed and being restored.]
I(t) = 1 if the system is working at time t, 0 if it is failed.

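Since A(t) is the probability that the indicator I(t) equals 1, it can be estimated by averaging I(t) over many simulated histories. The sketch below assumes a system that starts UP at time 0 and uses hypothetical exponential UP and DOWN times (means of 10 and 1 time units); the function names and all numeric values are illustrative assumptions.

```python
import random


def indicator_at(t, sample_up, sample_down, rng):
    """Return I(t): 1 if the system is working at time t, 0 if it is failed.

    The system starts UP at time 0 and alternates UP and DOWN periods
    drawn from the two sampling functions.
    """
    clock = 0.0
    while True:
        clock += sample_up(rng)      # end of the current UP period
        if clock > t:
            return 1
        clock += sample_down(rng)    # end of the current DOWN period
        if clock > t:
            return 0


def estimate_availability(t, sample_up, sample_down, runs=20_000, seed=1):
    """Estimate A(t) as the fraction of simulated histories with I(t) = 1."""
    rng = random.Random(seed)
    hits = sum(indicator_at(t, sample_up, sample_down, rng) for _ in range(runs))
    return hits / runs


# Hypothetical example: mean UP time 10, mean DOWN time 1 (both exponential).
up_time = lambda rng: rng.expovariate(0.1)
down_time = lambda rng: rng.expovariate(1.0)
a_estimate = estimate_availability(50.0, up_time, down_time)
```

With these values the estimate should fall near 10/11, up to sampling noise.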
Availability evaluation
In the special case when times to failure and times to restoration are both exponentially distributed, the alternating process can be viewed as a two-state homogeneous Continuous Time Markov Chain (CTMC) with:
- a time-independent failure rate $\lambda$
- a time-independent repair rate $\mu$

2-State Markov Availability Model
[Figure: state diagram with two states, UP (1) and DN (0)]
Transient availability analysis: for each state, we apply a flow balance equation:
- rate of buildup = rate of flow IN - rate of flow OUT

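A minimal sketch of the flow balance rule for the two states, assuming a constant failure rate `lam` (UP to DN) and a constant repair rate `mu` (DN to UP). Both names and the numeric values in the example are illustrative assumptions, and forward Euler is simply an easy integration scheme chosen here, not necessarily the method used in the course.

```python
def transient_state_probabilities(lam, mu, t_end, dt=1e-3):
    """Integrate the flow balance equations of the two-state model.

    For each state: rate of buildup = rate of flow IN - rate of flow OUT.
    Returns (P_up, P_dn) at time t_end, starting from the UP state.
    """
    p_up, p_dn = 1.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        flow_up_to_dn = lam * p_up * dt   # probability leaving UP (failures)
        flow_dn_to_up = mu * p_dn * dt    # probability leaving DN (repairs)
        p_up += flow_dn_to_up - flow_up_to_dn
        p_dn += flow_up_to_dn - flow_dn_to_up
    return p_up, p_dn


# Hypothetical rates: one failure per 10 time units, one repair per time unit.
p_up, p_dn = transient_state_probabilities(lam=0.1, mu=1.0, t_end=20.0)
```

The two probabilities always sum to 1, and for large `t_end` the UP probability settles at the value where the two flows balance, mu / (lam + mu).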
2-State Markov Availability Model
[Figure: plot of A(t) versus t, starting from 1 at t = 0 and levelling off at the steady-state value $A_{ss}$]

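2-State Markov Model
With failure rate $\lambda$, repair rate $\mu$, and the system UP at t = 0:
1) Pointwise availability A(t):
$A(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu} e^{-(\lambda + \mu) t}$
2) Steady-state availability: limiting value as $t \to \infty$:
$A_{ss} = \frac{\mu}{\lambda + \mu}$
3) If there is no restoration ($\mu = 0$) the availability becomes the reliability:
$A(t) = R(t) = e^{-\lambda t}$
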
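The three cases listed above can be checked with a short sketch. It assumes the standard closed-form solution of a two-state CTMC with failure rate `lam` and repair rate `mu`, and a system that is UP at t = 0; the symbol and function names are assumptions, since the formulas did not survive in the slide text.

```python
import math


def availability(t, lam, mu):
    """Pointwise availability A(t) of the two-state model (UP at t = 0)."""
    total = lam + mu
    if total == 0.0:
        return 1.0  # no failures and no repairs: the system simply stays UP
    return mu / total + (lam / total) * math.exp(-total * t)


def steady_state_availability(lam, mu):
    """Limiting value of A(t) as t grows; requires lam + mu > 0."""
    return mu / (lam + mu)


def reliability(t, lam):
    """Reliability R(t) for a constant failure rate lam."""
    return math.exp(-lam * t)
```

Setting `mu = 0` in `availability` gives the same value as `reliability`, which is case 3) above.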
Steady-state Availability
In many system models, the limit
$A_{ss} = \lim_{t \to \infty} A(t)$
exists and is called the steady-state availability.
The steady-state availability represents the probability of finding a system operational after many fail-and-restore cycles.

Steady-state Availability
[Figure: system state over time t, alternating between 1 (UP) and 0 (DOWN)]
Expected UP time: E[U(t)] = MUT = MTTF
Expected DOWN time: E[D(t)] = MDT = MTTR

Availability: Example (I)
Let a system have a steady-state availability $A_{ss} = 0.95$.
This means that, given a mission time T, the system is expected to work correctly for a total time of $0.95 \cdot T$.
Alternatively, the system is expected to be out of service for a total time of $U_{ss} \cdot T = (1 - A_{ss}) \cdot T$.

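As a small worked example of this split, the sketch below assumes a mission time of one year expressed in hours (8760 h); the mission time and the function name are illustrative choices, not values from the slides.

```python
def expected_up_down(a_ss, mission_time):
    """Split a mission time into expected UP and DOWN totals."""
    up = a_ss * mission_time
    down = (1.0 - a_ss) * mission_time
    return up, down


# Assumed mission time: one year in hours.
up_hours, down_hours = expected_up_down(0.95, 8760.0)
```

With $A_{ss} = 0.95$ this gives roughly 8322 hours of correct operation and 438 hours out of service over the year.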
Availability: Example (II)
Let a system have a rated productivity of W $/year.
The loss due to the system being out of service can be estimated as $U_{ss} \cdot W = (1 - A_{ss}) \cdot W$.
The availability (unavailability) is an index to estimate the real productivity, given the rated productivity.
Alternatively, if the goal is a net productivity of W $/year, the plant must be designed so that its rated productivity W' satisfies $A_{ss} \cdot W' = W$, since only the fraction $A_{ss}$ of the rated productivity is actually delivered.

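A minimal sketch of both estimates, taking the net productivity to be the rated productivity scaled by the steady-state availability. The function names and the figure of 1,000,000 $/year are hypothetical.

```python
def productivity_loss(a_ss, rated):
    """Expected yearly loss due to down time: (1 - A_ss) * W."""
    return (1.0 - a_ss) * rated


def required_rated_productivity(a_ss, net_target):
    """Rated productivity W' needed so that A_ss * W' equals the net target."""
    if not 0.0 < a_ss <= 1.0:
        raise ValueError("a_ss must be in (0, 1]")
    return net_target / a_ss


# Hypothetical plant: A_ss = 0.95 and 1,000,000 $/year.
loss = productivity_loss(0.95, 1_000_000.0)
rated_needed = required_rated_productivity(0.95, 1_000_000.0)
```

Here the loss is about 50,000 $/year, and a rated productivity of about 1,052,632 $/year is needed to net 1,000,000 $/year.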
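Availability
We can show that:
$A_{ss} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$
This result is valid without making any assumptions on the form of the distributions of times to failure and times to repair.
Also, since A(t) + U(t) = 1:
$U_{ss} = 1 - A_{ss} = \frac{\mathrm{MTTR}}{\mathrm{MTTF} + \mathrm{MTTR}}$
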
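The distribution-independence claim can be checked numerically. The sketch below assumes the relation in question is the standard one, A_ss = MTTF / (MTTF + MTTR), and deliberately uses non-exponential (uniform) UP and DOWN times. The distributions, their parameters, and the function name are illustrative assumptions.

```python
import random


def long_run_availability(sample_up, sample_down, cycles=100_000, seed=7):
    """Fraction of total time spent UP over many fail-and-restore cycles."""
    rng = random.Random(seed)
    total_up = 0.0
    total_down = 0.0
    for _ in range(cycles):
        total_up += sample_up(rng)
        total_down += sample_down(rng)
    return total_up / (total_up + total_down)


# Hypothetical non-exponential distributions:
# UP times uniform on [50, 150] (MTTF = 100), DOWN times uniform on [1, 9] (MTTR = 5).
simulated = long_run_availability(
    lambda rng: rng.uniform(50.0, 150.0),
    lambda rng: rng.uniform(1.0, 9.0),
)
predicted = 100.0 / (100.0 + 5.0)
```

The simulated fraction should agree with the predicted value up to sampling noise, even though neither distribution is exponential.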
Motivation – High Availability

Maintainability
MDT (Mean Down Time, or MTTR: mean time to restoration).
The total down time (Y) consists of:
- Failure detection time
- Alarm notification time
- Dispatch and travel time of the repair person(s)
- Repair or replacement time
- Reboot time

Maintainability
The total down time (Y) consists of:
- Logistic (passive) time
  - Administrative times
  - Dispatch and travel time of the repair person(s)
  - Waiting time for spares, tools …
- Effective restoration (active) time
  - Access and diagnosis time
  - Repair or replacement time
  - Test and reboot time

Logistics
Logistic times depend on the organization of the assistance service:
- Number of crews;
- Location of tools and storehouses;
- Number of spare parts.

The number of spares

Maintenance Costs
The total cost of a maintenance action consists of:
- Cost of spares and replaced parts
- Cost of person-hours for repair
- Down-time cost (loss of productivity)
The down-time cost (due to a loss of productivity) can be the most relevant cost factor.

Maintenance Policy
The maintenance policy is the sequence of actions that minimizes the total cost related to a down time:
- Reactive maintenance: the maintenance action is triggered by a failure.
- Proactive maintenance: a preventive maintenance policy.

Life Cycle Cost