Idaho RISE System Reliability and Designing to Reduce Failure ENGR204 19 Sept 2005.

Reliability Analysis
Let $R$ = probability that the system (or instrument) will operate without failure for time $t$ (success probability):
$R = e^{-\lambda t}$
Note: $\lambda$ = failure rate (failures/second), in units of $\mathrm{s}^{-1}$; $\lambda = \tau^{-1}$, where $\tau$ = average seconds per failure.
Failure probability = $1 - R$
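As a minimal sketch of the exponential model above, assuming a constant failure rate: the function name `reliability` and the numbers (a 10,000 s mean time between failures and a one-hour mission) are illustrative assumptions, not values from the slides.

```python
import math

def reliability(failure_rate: float, t: float) -> float:
    """Probability of operating without failure for time t: R = exp(-lambda * t)."""
    if failure_rate < 0 or t < 0:
        raise ValueError("failure_rate and t must be non-negative")
    return math.exp(-failure_rate * t)

# Hypothetical instrument (assumed values, for illustration only)
mean_seconds_per_failure = 10_000.0
lam = 1.0 / mean_seconds_per_failure   # failure rate in 1/s
mission_seconds = 3_600.0              # one-hour mission

r = reliability(lam, mission_seconds)
print(f"R = {r:.4f}, failure probability = {1 - r:.4f}")
```

With these assumed numbers the success probability is roughly 0.70, which shows how quickly reliability falls once the mission time becomes a sizeable fraction of the mean time between failures.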

If a system comprises $n$ nonredundant systems, all equally essential for mission success, then the total system reliability is
$R_s = R_1 R_2 R_3 \cdots R_n = e^{-\lambda_1 t}\, e^{-\lambda_2 t}\, e^{-\lambda_3 t} \cdots e^{-\lambda_n t}$
where $\lambda_i$ is the failure rate of the $i$th system.
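Because the exponentials multiply, the failure rates of a series system simply add. The sketch below checks that numerically; the function name and the three failure rates are hypothetical values chosen for illustration.

```python
import math

def series_reliability(failure_rates, t):
    """Series system: R_s = exp(-(lambda_1 + ... + lambda_n) * t)."""
    return math.exp(-sum(failure_rates) * t)

# Hypothetical failure rates in 1/s (assumed, not from the slides)
rates = [1e-5, 2e-5, 5e-6]
t = 3_600.0

# Multiply the individual reliabilities, then compare with the summed-rate form
r_product = math.prod(math.exp(-lam * t) for lam in rates)
r_summed = series_reliability(rates, t)

print(f"product of R_i: {r_product:.6f}")
print(f"summed rates:   {r_summed:.6f}")
```

Both forms give the same number, so a series system behaves like a single unit whose failure rate is the sum of its parts. Every added essential element lowers the total reliability.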

If a system comprises $n$ redundant systems in parallel, each of which can satisfy the mission requirements individually, then the parallel (redundant) system reliability is
$R_p = 1 - (1 - R_1)(1 - R_2)(1 - R_3) \cdots (1 - R_n) = 1 - F_1 F_2 F_3 \cdots F_n$
where $F_i = 1 - R_i$ is the failure probability of the $i$th system.
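A small sketch of the parallel formula, assuming independent failures and identical units with an illustrative reliability of 0.9 each (the function name and the value are assumptions):

```python
import math

def parallel_reliability(reliabilities):
    """Parallel system: R_p = 1 - product of the failure probabilities F_i = 1 - R_i."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

unit_reliability = 0.9  # hypothetical value for each redundant unit

# Show how each added redundant unit changes the system reliability
results = {n: parallel_reliability([unit_reliability] * n) for n in range(1, 5)}
for n, r_p in results.items():
    print(f"{n} unit(s): R_p = {r_p:.4f}")
```

Each added unit multiplies the remaining failure probability by 0.1 in this example, so the gain per unit shrinks in absolute terms while cost and weight keep growing.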

Series Reliability (A, B, and C in series)
$R_{tot} = R_A R_B R_C$
Full Redundancy (A, B, and C in parallel)
$R_{tot} = 1 - (1 - R_A)(1 - R_B)(1 - R_C)$

Partial Redundancy (A & B are redundant, C is essential)
$R_{tot} = R_C \left[ 1 - (1 - R_A)(1 - R_B) \right]$
Non-Identical Full Redundancy (A & B are essential, C is redundant)
$R_{tot} = 1 - (1 - R_A R_B)(1 - R_C)$
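The four configurations above can be built from two helpers, one for series blocks and one for parallel blocks. This is a sketch under the assumption of independent failures; the helper names and the block reliabilities of 0.9 are illustrative only.

```python
import math

def series(*rs):
    """All blocks are essential: reliabilities multiply."""
    return math.prod(rs)

def parallel(*rs):
    """Any one block suffices: failure probabilities multiply."""
    return 1.0 - math.prod(1.0 - r for r in rs)

# Hypothetical block reliabilities (assumed values)
r_a, r_b, r_c = 0.9, 0.9, 0.9

r_series = series(r_a, r_b, r_c)                    # A, B, C all essential
r_full = parallel(r_a, r_b, r_c)                    # A, B, C fully redundant
r_partial = series(r_c, parallel(r_a, r_b))         # A & B redundant, C essential
r_non_identical = parallel(series(r_a, r_b), r_c)   # A & B essential, C redundant

for name, value in [
    ("series", r_series),
    ("full redundancy", r_full),
    ("partial redundancy", r_partial),
    ("non-identical full redundancy", r_non_identical),
]:
    print(f"{name:<32} {value:.3f}")
```

With every block at 0.9, the four cases come out at about 0.729, 0.999, 0.891, and 0.981. The essential block C caps the partial-redundancy case below its own reliability of 0.9.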

Designing for Reliability
1. Keep It Simple!
2. Design Margin - Assure adequate strength of all mechanical and electrical parts, including allowance for unusual loads due to environmental extremes. This includes environmental shielding.
3. Redundancy - Provide alternative means of accomplishing required functions where design for excess strength is not suitable or reasonable. This includes most electronics.

Notes on Redundancy
Same Design Redundancy: two or more identical components or systems
- Switching allows only one system to be active
- Outputs can be combined so switching is not necessary (e.g., power distribution systems)
- Voting combines the outputs of redundant units; it requires three or more units (e.g., accelerometer activation of a critical sequence)
- Offers high protection against random failures
- Not effective against design deficiencies
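A minimal sketch of the voting idea, assuming each redundant unit reports a simple true/false trigger (for example, three accelerometers each deciding whether to start a critical sequence). The function name and the tie-handling rule are assumptions for illustration, not taken from the slides.

```python
def majority_vote(signals):
    """Return True only when more than half of the redundant units report True.

    With an even number of units a tie counts as False (an assumed, conservative choice).
    """
    signals = list(signals)
    if len(signals) < 3:
        raise ValueError("voting requires three or more units")
    return sum(bool(s) for s in signals) > len(signals) / 2

# One unit has failed "off": the two healthy units still trigger the sequence
print(majority_vote([True, True, False]))

# One unit has failed "on": the two healthy units out-vote the false trigger
print(majority_vote([True, False, False]))
```

Three units are the minimum because two units that disagree give no way to tell which one has failed.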

Notes on Redundancy, cont.
Diverse Design Redundancy: utilize two or more systems of different design
- High protection against failures due to design deficiencies
- Can offer lower cost if the backup is a “lifeboat” with lesser accuracy and functionality, but still adequate for minimum mission needs

Notes on Redundancy, cont.
Functional (Analytic) Redundancy: addressing requirements by different techniques. For example, determination of spacecraft attitude by gyroscope or by star tracker.
- Avoids cost and weight penalties of physical redundancy
- Provides protection against design faults
- Disadvantage: backup usually provides reduced performance
Temporal Redundancy: repetition of an unsuccessful operation (i.e., retry after failure)
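Temporal redundancy can be sketched as a bounded retry loop. The helper `retry` and the simulated `flaky_read` operation below are hypothetical, written only to illustrate retry after failure.

```python
def retry(operation, attempts=3):
    """Call operation() until it succeeds or the attempt budget is used up."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
    raise RuntimeError(f"operation failed after {attempts} attempts") from last_error

# Simulated operation that fails twice (a transient fault) and then succeeds
calls = {"n": 0}

def flaky_read():
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient fault")
    return "ok"

result = retry(flaky_read, attempts=3)
print(result, "after", calls["n"], "calls")
```

Retrying helps only against transient faults. A permanent hardware failure fails on every attempt, which is why the loop is bounded and raises an error once the budget is spent.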

Apollo Design Principles
The primary consideration governing the design of the Apollo system was that, if it could be made so, no single failure should cause the loss of any crewmember, prevent the successful continuation of the mission, or, in the event of a second failure in the same area, prevent a successful abort of the mission.
To implement this policy, the following specific principles were established:
1. Use established technology
2. Stress hardware reliability
3. Comply with safety standards
4. Minimize in-flight maintenance and testing for failure isolation
5. Simplify operations
6. Minimize interfaces
7. Make maximum use of experience gained from previous manned-space missions
Reference: NASA SP-287

Qualification and Acceptance Testing
Assume:
- Engineering data is complete and exact
- Engineering data completely controls manufacture
- All items manufactured to the same engineering data are identical
Therefore:
- The results of qualification tests for one component are considered valid for all components
- If a representative component passes a sequence of qualification tests, all other components built to the same engineering specifications should also pass
The design is then said to be “Qualified”.
Acceptance testing is less severe, and is for the purpose of certifying workmanship.

Failure Mode Definitions
- Catastrophic failure: complete loss of mission, including flight hardware. (Examples: loss of GPS; parachute failure)
- Major failure: significant loss of mission primary goals; significant degradation expected. (Example: power supply failure)
- Minor failure: minor loss of data or of ability to achieve mission goals; a system failure that is overcome by other flight systems. (Examples: loss of primary temperature sensor, with temperature data still retrieved from the backup sensor; loss of a single GPS)
- Negligible failure: negligible impact on achieving mission goals.

Team Assignment
Consider Catastrophic and Single Point failure possibilities.
1. Initiate a list of potential Catastrophic, Major, and Minor Failures.
2. How can Catastrophic and Major failure possibilities be prevented? Consider simplifying design, redundancy, and design margins.
3. Which failures are Single Point (i.e., if a failure occurs there is no viable means of recovery)?
Example of a Catastrophic Single Point Failure: heat shield on an atmospheric entry probe