Reliability Analysis using Reliability Block Diagram (RBD)

Presentation transcript:

Reliability Analysis using Reliability Block Diagram (RBD)
ASQ RD Webinar Series
Reliability Works Incorporated, 830-1100 Melville St, Vancouver, B.C., Canada V6E 4A6
Presented by:
Frank Thede, P.Eng, Principal Reliability Engineer. E: fthede@reliabilityworks.com C: 780 722 0302
Roselia Moreno, Manager, Reliability Engineering. E: rmoreno@reliabilityworks.com C: 778 987 7959
Copyright Reliability Works Inc. 2018

Reliability Block Diagrams
Webinar Outline:
General overview of reliability terms and definitions
Introducing the Reliability Block Diagram (RBD)
RBD vs. Fault Tree
RBD analysis: Inputs, Outputs, Application
Do I need a reliability analysis?
Examples (Case Studies)

Reliability Basics: Terminology, Definitions and Measures
Reliability in its simplest form: reliability is the ability of equipment or a system to operate without interruption for a desired period of time (mission time), under a given set of conditions.
Make the point that TIME is the key variable in reliability calculations: performance over a time period.

Definitions
Failures: Any unplanned interruption to operating equipment or systems in delivering the desired performance is a failure. Reliability methods aim to forecast failures through understanding the likelihood of failure occurrence in a given time.
Costs/Risks: Failures cost businesses money in lost production, repairs, safety hazards, environmental incidents, downtime, quality impacts and customer complaints. The business of reliability is to reduce the losses caused by failures.
Measures such as MTTF and MTBF are regularly used.

Definitions
Reliability Engineering:
High reliability costs money.
Reliability Engineering aims to identify practical solutions to business issues.
Understanding the needs of the business allows an affordable level of reliability through design, maintenance and support.
Make the point that some maintenance is done without assessing whether or not there is a significant impact to the business.

Definitions Reliability and Availability are functions in time. The aspect of time is critical in their measurement and the key variables are: Mission Time Mean Time To Failure (MTTF) Mean Time Between Failures (MTBF) Mean Time To Repair (MTTR)

MTTR vs. MTBF
[Figure: timeline with intervals labelled MTBF, MTTF and MTTR]
When MTTR is small compared to MTTF, MTTF can be assumed to be the same as MTBF.

Definitions - Availability
“Is it available and functioning when I need it?”
Availability is the fraction of time that an item (component, equipment, or system) can perform its required function. It is used when working with repairable systems. Availability is an important measure when system failure can be tolerated and repair can be carried out.
It is represented by the expression: A = MTTF / (MTTF + MTTR)
The complement of availability is the unavailability, represented by Q: Q = 1 - A
Make reference to the impact of repair time on the availability.
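The availability and unavailability expressions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the MTTF/MTTR values are assumptions chosen to show how repair time moves the result, not figures from the webinar.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability A = MTTF / (MTTF + MTTR)."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def unavailability(mttf_hours: float, mttr_hours: float) -> float:
    """Unavailability Q = 1 - A."""
    return 1.0 - availability(mttf_hours, mttr_hours)


# Illustrative values only: the same MTTF with three different repair times.
for mttr in (8.0, 24.0, 72.0):
    a = availability(8760.0, mttr)
    q = unavailability(8760.0, mttr)
    print(f"MTTR = {mttr:5.1f} h -> A = {a:.5f}, Q = {q:.5f}")
```

Holding MTTF fixed and lengthening the repair time lowers A, which is the point the slide's note on repair time is making.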

Basic Reliability
The relationship between reliability and MTTF is given by the expression: R(t) = e^(-λt), where λ = 1/MTTF, so R(t) = e^(-t/MTTF).
Emphasise that this is suitable only when the failure rate does not change with time.

Basic Reliability
Suppose a level transmitter must operate for one year between turnarounds and has a known MTBF of 8760 hours (taken here as its MTTF). What is its reliability over that year?
R(t) = e^(-t/MTTF)
R(8760) = e^(-8760/8760) = e^(-1) = 0.36788, a 36.8% chance of making it to the next turnaround.
Illustrate how the formula can be used. See if anyone is surprised by the result!

Reliability calculations
Suppose the same turnaround schedule, and the transmitter has an MTTF of 87600 hours. What is the probability of making it to the next turnaround without a failure?
R(t) = e^(-t/MTTF)
R(8760) = e^(-8760/87600) = e^(-0.1) = 0.90, a 90% chance of making it to the next turnaround.
What changed? Not the mission time, which is still one year on the same turnaround schedule, but the MTTF, which rose from 8760 hours on the previous slide to 87600 hours.
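The two transmitter calculations can be reproduced with a short Python sketch. It assumes a constant failure rate, as the slides do; the function name is illustrative.

```python
import math


def reliability(mission_hours: float, mttf_hours: float) -> float:
    """R(t) = exp(-t / MTTF), valid for a constant failure rate."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return math.exp(-mission_hours / mttf_hours)


ONE_YEAR_HOURS = 8760.0

# Same one-year mission, two different MTTF values from the slides.
print(round(reliability(ONE_YEAR_HOURS, 8760.0), 5))   # 0.36788
print(round(reliability(ONE_YEAR_HOURS, 87600.0), 2))  # 0.9
```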

Reliability calculations
Suppose the target for turnaround-to-turnaround reliability is 95%. What MTTF is required for the transmitter?
R(t) = e^(-t/MTTF)
0.95 = e^(-8760/MTTF)
1/0.95 = e^(8760/MTTF)
ln(1/0.95) = 8760/MTTF
MTTF ≈ 171000 hours ≈ 19.5 years
Again, we can do the reverse and calculate the MTBF, assuming steady state.
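The inverse calculation can be sketched the same way. The function name is an assumption; the formula is the rearrangement shown on the slide, MTTF = t / ln(1/R).

```python
import math


def required_mttf(mission_hours: float, target_reliability: float) -> float:
    """Solve R = exp(-t / MTTF) for MTTF, assuming a constant failure rate."""
    if not 0.0 < target_reliability < 1.0:
        raise ValueError("target reliability must be strictly between 0 and 1")
    return mission_hours / math.log(1.0 / target_reliability)


mttf = required_mttf(8760.0, 0.95)
print(f"Required MTTF: {mttf:,.0f} hours, about {mttf / 8760.0:.1f} years")
```

The unrounded result is slightly below the slide's 171000 hours, which is a rounded figure.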

Reliability and Availability
Reliability ≠ Availability
Reliability: used when the system cannot be repaired; calculates the probability the system will operate without failure; the probability the system will operate for its defined lifetime/mission.
Availability: used when the system can be repaired; calculates the fraction of time the system is available to perform its required function; the probability the system will operate on demand.
Reliability Engineering uses/calculates either or both.
Reliability Analysis is a general term to describe the process of estimating System Reliability and/or System Availability.

Reliability Engineering tools
FMEA/FMECA: Failure Mode, Effect and Criticality Analysis
Fault Tree Analysis
RBDs: Reliability Block Diagrams
RCM: Reliability Centered Maintenance
Weibull data analysis and failure prediction
RBI: Risk Based Inspections
RCA: Root Cause Analysis
LCC: Lifecycle Costs
These basic tools provide a framework for making decisions about failures that affect the business. They are simple and systematic, and require an understanding of the principles of reliability engineering. They can be used by anyone in the organization and are extremely powerful where there is widespread buy-in. They provide a language for talking about problems, causes of failures, and their impacts on the business.

Reliability Block Diagram (RBD)
A tool to map the probable component failures and describe their relationship to each other and to the functionality of the overall system.
It is drawn as a series of blocks connected in a parallel or series configuration.
Each block represents a potential component failure within the system.
In a series path, any failure along the path will result in system failure.
Parallel paths show redundancy, meaning that all of the parallel paths must fail for the parallel network to fail.

Reliability Block Diagrams (RBD)
Consist of blocks and nodes connected in parallel or series.
Connections are used to represent success paths.
Nodes are used to represent voting relationships.
Blocks represent equipment failure modes, operator errors and environmental factors.
Predict the system's real-life capacity, availability and reliability by considering: failure rates; spares availability; redundancy; labour availability; equipment required; preventive and inspection programs.

Reliability Block Diagrams
In the simplest system, the system is down if component A fails, because there is then no open path between input and output. If A has an availability of 95%, then the system has an availability of 95%.
RBDs consist of blocks connected in parallel or series. Connections are used to represent success relationships, and nodes are used to represent voting relationships. All blocks must be connected.

Reliability Block Diagrams
Let's say our system has 100 blocks in series and each block has an availability of 0.99.
[Figure: blocks 1, 2, ..., 100 connected in series between input and output]
What would the overall availability of this system be?
A_S = 0.99^100 = 0.366, or 36.6%
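The series calculation is just a product of block availabilities, which a minimal Python sketch makes explicit. It assumes the blocks fail independently; the function name is illustrative.

```python
import math
from typing import Iterable


def series_availability(block_availabilities: Iterable[float]) -> float:
    """Availability of blocks in series: every block must be up, so multiply."""
    return math.prod(block_availabilities)


# 100 identical blocks in series, each with availability 0.99.
print(round(series_availability([0.99] * 100), 3))  # 0.366
```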

Reliability Block Diagrams
Let's try our system with 3 components in parallel. In this case, if one or two of the components fail, the system is still up because there is still a success path from input to output. System failure requires all three components to be failed at the same time.

Reliability Block Diagrams
Availability of a simple parallel system: A = 1 - (Q1 × Q2 × Q3 × ... × QN)
(Unavailability Q is equal to 1 - Availability.)

Reliability Block Diagrams
If the availability of each block is 0.9 (Q = 1 - 0.9 = 0.1), what is the availability of the system?
A = 1 - (0.1 × 0.1 × 0.1) = 1 - 0.001 = 0.999
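The parallel formula can be sketched in Python in the same spirit. Independence between blocks is assumed, and the function name is illustrative.

```python
import math
from typing import Iterable


def parallel_availability(block_availabilities: Iterable[float]) -> float:
    """Availability of redundant blocks: the system is down only if all are down."""
    product_of_q = math.prod(1.0 - a for a in block_availabilities)
    return 1.0 - product_of_q


# Three blocks in parallel, each with availability 0.9.
print(round(parallel_availability([0.9, 0.9, 0.9]), 3))  # 0.999
```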

Reliability Block Diagrams
Most systems are more complex. What is the system availability now?
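The diagram on this slide is not recoverable from the transcript, so the sketch below uses a hypothetical layout: block A in series with a redundant pair B1/B2, followed by block C. It shows how series and parallel rules nest for a simple system, assuming independent blocks and no voting nodes or shared components. Dedicated RBD software handles the cases this sketch does not.

```python
import math
from typing import Union

# A diagram is either a block availability (float) or a tuple
# ("series" | "parallel", [sub-diagrams]). This encoding is an assumption
# made for illustration, not a format used in the webinar.
Diagram = Union[float, tuple]


def evaluate(diagram: Diagram) -> float:
    """Recursively evaluate the availability of a nested series/parallel diagram."""
    if isinstance(diagram, (int, float)):
        return float(diagram)
    kind, children = diagram
    values = [evaluate(child) for child in children]
    if kind == "series":
        return math.prod(values)
    if kind == "parallel":
        return 1.0 - math.prod(1.0 - v for v in values)
    raise ValueError(f"unknown connection type: {kind!r}")


# Hypothetical system: A (0.99), then B1 || B2 (0.9 each), then C (0.95).
system = ("series", [0.99, ("parallel", [0.9, 0.9]), 0.95])
print(f"System availability: {evaluate(system):.4f}")
```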

Reliability Block Diagrams RBD Software Solution

Reliability Block Diagrams (RBD) vs. Fault Tree Diagrams (FTD)
Reliability Block Diagrams (RBD) and Fault Tree Diagrams (FTD) represent the logical relationship between sub-system and component failures and how they combine to cause system failures. The most fundamental difference between the two tools is that when building RBDs you work in the “success space”, while when building FTDs you work in the “failure space”. The RBD looks at success combinations while the FTD looks at failure combinations. Fault trees have traditionally been used to analyze fixed probabilities (i.e. each event that composes the tree has a fixed probability of occurring), while RBDs may include time-varying distributions for the blocks' success or failure, as well as other properties such as repair/restoration distributions.

Reliability Block Diagrams (RBD) vs. Fault Tree Diagrams (FTD) RBD looks similar to a process diagram or a schematic

Reliability Block Diagrams - Inputs
Quantitative inputs for each block can include:
Failure rate (Q, MTTF, MTBF)
Failure type (Rate, Normal, Weibull, Dormant…)
Mean time to repair (MTTR)
Common Cause Failure (CCF)
System functional requirements
Data sources:
Existing failure histories: failure rates, Weibull analysis
Industry failure histories
Operations, field forces
External databases: OREDA, NPRD

Reliability Block Diagrams - Outputs
Estimate system unavailability (Q): Q = 1.3 × 10^-4, i.e. an availability of 99.987%
Pareto chart analysis (failure mode importance): sub-systems with the largest contribution to unavailability
Sensitivity analysis: manual intervention success rate
Assess high-level design decisions: refurbish vs. replace
Choose mitigation strategy: redundancy; hardened design; proactive maintenance; testing frequency

Begin with existing design – Pareto chart Existing designs Identify areas requiring improvement using Importance results from Reliability Model

Evaluate Improved Design – Pareto chart Proposed new designs What opportunities are there to further improve performance?

Estimating unavailability
Provide options; assess alternate designs.
System model predicts performance (availability and capacity).
System model provides high-level resource requirements (maintenance, labour and parts).
Modeling may uncover design solutions that are not viable.
Sensitivity analysis is performed to understand the impacts of design options.
“What If”: new solutions can be identified and tested (modeled) before implementation begins.

Optimized design outcomes (Q)
The model shows improvements in system unavailability (Q) for both assumed intervention success rates (ISRs) of 98% and 60%.

Reliability Myths (why do an analysis?)
Redundant systems always perform better
Increased flexibility for deploying back-up systems = greater availability
System reliability should be independent of operational requirements
Repair time for backup system is less important than for primary system
Component failure rates are equipment specific
Operating under design capacity = improved reliability
The better design becomes obvious with more experience

MYTH: Redundant Systems always perform better

MYTH: Flexible/Configurable Systems Perform Better Which is better?

MYTH: System reliability should be independent of operational requirements
Functional requirements must be defined before the success path can be defined.

MYTH: Repair time for backup system is less important than for primary system
Availability = MTTF/(MTTF+MTTR)
Availability = 1 - [(Q1 × Q2) + Qco]
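A short Python sketch shows why the backup's repair time matters in these two expressions. The MTTF and MTTR values are illustrative assumptions. Qco is not defined in the transcript, so it is treated here as a plain input that defaults to zero.

```python
def unavailability(mttf_hours: float, mttr_hours: float) -> float:
    """Q = 1 - MTTF / (MTTF + MTTR) = MTTR / (MTTF + MTTR)."""
    return mttr_hours / (mttf_hours + mttr_hours)


def redundant_pair_availability(q_primary: float, q_backup: float,
                                q_co: float = 0.0) -> float:
    """Availability = 1 - [(Q1 x Q2) + Qco], as written on the slide."""
    return 1.0 - (q_primary * q_backup + q_co)


MTTF_HOURS = 8760.0  # assumed equal for primary and backup (illustrative)
q_primary = unavailability(MTTF_HOURS, 24.0)

# Same primary, but the backup takes 1 day vs. 10 days to repair.
for backup_mttr in (24.0, 240.0):
    q_backup = unavailability(MTTF_HOURS, backup_mttr)
    a = redundant_pair_availability(q_primary, q_backup)
    print(f"backup MTTR = {backup_mttr:5.0f} h -> system Q = {1.0 - a:.2e}")
```

Because Q1 and Q2 enter the product symmetrically, lengthening the backup's repair time raises system unavailability just as much as lengthening the primary's would.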

MYTH: Component failure rates are equipment specific
Reliability in its simplest form: reliability is the ability of equipment or a system to operate without interruption for a desired period of time (mission time), under a given set of conditions.

MYTH: The better design becomes obvious with more experience Which is better?

Do the analysis
If you want to know:
How available is the system?
How reliable is the system?
How likely is a system failure?
What design changes will yield the best performance?
How much will it cost to test and maintain the system?
How important is having spares on site?
What level of performance can I guarantee?
What's the risk of environmental damage?
What's the safety risk?
Do the analysis.

How does reliability assessment change the process to find a solution for an under-performing system?
Traditional approach
[Flowchart; box labels: System under-performing? (YES/NO); Identify problem; Initiate capital project; Is the system performing?; Re-assess performance; Identify & select solutions; Implement solution; System performs.]

Reliability based approach
[Flowchart; box labels: System under-performing? (YES/NO); Identify major contributors; Understand system requirements and performance gaps; Model system performing?; Identify & select solutions; Assess performance; Initiate capital project; Implement solution; System performs.]

Reliability Analysis using RBD – Examples:

Case Study: Spillway System
Site condition: full remote operation; 6 hours response time
Staffing: business hours
Analysis impact: redundant gate (safety objective)
Optioneering: simplified power configuration ($750K); eliminated an automatic transfer switch ($600K); simplified control ($500K)

Case Study: Telescope Observatory System
Challenge: Confirm the reliability targets proposed in the conceptual design.
Method: RBD was used to model the system for two modes of operation: Normal Operation and Degraded Operation.
Results: The overall unavailability of the system from the RBD confirmed the targets proposed by the conceptual design. However, the unavailability of individual subsystems varied significantly, so efforts to improve the design were realigned.

THANK YOU