CS 5150 1 CS 5150 Software Engineering Lecture 19 Reliability 1.

Slides:

Advertisements

Similar presentations

ASYCUDA Overview … a summary of the objectives of ASYCUDA implementation projects and features of the software for the Customs computer system.

Advertisements

INSE - Lecture 16  Documentation  Configuration Management  Program Support Environments  Choice of Programming Language.

McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 20 Systems Operations and Support.

Chapter 19: Network Management Business Data Communications, 4e.

Overview Lesson 10,11 - Software Quality Assurance

1 CS 501 Spring 2005 CS 501: Software Engineering Lecture 19 Reliability 1.

CS CS 5150 Software Engineering Lecture 20 Reliability 1.

SIM5102 Software Evaluation

70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 12: Managing and Implementing Backups and Disaster Recovery.

1 CS 501 Spring 2008 CS 501: Software Engineering Lecture 19 Reliability 1.

CS CS 5150 Software Engineering Lecture 21 Reliability 3.

Soft. Eng. II, Spr. 2002Dr Driss Kettani, from I. Sommerville1 CSC-3325: Chapter 9 Title : Reliability Reading: I. Sommerville, Chap. 16, 17 and 18.

CS 501: Software Engineering Fall 2000 Lecture 21 Dependable Systems I Reliability.

1 CS 501 Spring 2004 CS 501: Software Engineering Lecture 19 Reliability 1.

1 Case Study: Starting the Student Registration System Chapter 3.

1 CS 501 Spring 2005 CS 501: Software Engineering Lecture 26 Delivering the System.

CS CS 5150 Software Engineering Lecture 21 Reliability 1.

CS CS 5150 Software Engineering Lecture 19 Reliability 1.

Maintaining and Updating Windows Server 2008

Design, Implementation and Maintenance

70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 12: Managing and Implementing Backups and Disaster Recovery.

©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 17 Slide 1 Extreme Programming.

CS 4310: Software Engineering

Software Reliability Categorising and specifying the reliability of software systems.

Software Testing Verification and validation planning Software inspections Software Inspection vs. Testing Automated static analysis Cleanroom software.

INFO 637Lecture #81 Software Engineering Process II Integration and System Testing INFO 637 Glenn Booker.

Hands-On Microsoft Windows Server 2003 Administration Chapter 2 Managing Windows Server 2003 Hardware and Software.

By Anthony W. Hill & Course Technology1 Common End User Problems.

70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 12: Managing and Implementing Backups and Disaster Recovery.

1 Software Engineering II Software Reliability. 2 Dependable and Reliable Systems: The Royal Majesty From the report of the National Transportation Safety.

CS 360 Lecture 3.  The software process is a structured set of activities required to develop a software system.  Fundamental Assumption:  Good software.

 To explain the importance of software configuration management (CM)  To describe key CM activities namely CM planning, change management, version management.

Royal Latin School. Spec Coverage: a) Explain the advantages of networking stand-alone computers into a local area network e) Describe the differences.

1 Maintain System Integrity Maintain Equipment and Consumables ICAS2017B_ICAU2007B Using Computer Operating system ICAU2231B Caring for Technology Backup.

Event Management & ITIL V3

Extreme/Agile Programming Prabhaker Mateti. ACK These slides are collected from many authors along with a few of mine. Many thanks to all these authors.

System Security Chapter no 16. Computer Security Computer security is concerned with taking care of hardware, Software and data The cost of creating data.

Rapid software development 1. Topics covered Agile methods Extreme programming Rapid application development Software prototyping 2.

©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 22 Slide 1 Software Verification, Validation and Testing.

CE Operating Systems Lecture 3 Overview of OS functions and structure.

1 CS 501 Spring 2003 CS 501: Software Engineering Lecture 7 Business Aspects of Software Engineering.

1 CS 501 Spring 2002 CS 501: Software Engineering Lecture 24 Delivering the System.

Managing Change 1. Why Do Requirements Change?  External Factors – those change agents over which the project team has little or no control.  Internal.

1 CS 501 Spring 2002 CS 501: Software Engineering Lecture 22 Reliability II.

CS 5150 Software Engineering Lecture 20 Reliability 1 (and a little Privacy)

Lecture 4 – XP and Agile 17/9/15. Plan-driven and agile development Plan-driven development A plan-driven approach to software engineering is based around.

CS 360 Lecture 17.  Software reliability:  The probability that a given system will operate without failure under given environmental conditions for.

CS 5150 Software Engineering Lecture 2 Software Processes 1.

1 Business Aspects of Software Engineering SWE 513.

Requirements Engineering Requirements Engineering in Agile Methods Lecture-28.

1 CS 501 Spring 2003 CS 501: Software Engineering Lecture 26 Delivering the System.

CS223: Software Engineering Lecture 2: Introduction to Software Engineering.

Requirements Engineering Requirements Validation and Management Lecture-24.

Unit 17: SDLC. Systems Development Life Cycle Five Major Phases Plus Documentation throughout Plus Evaluation…

Software Development Process CS 360 Lecture 3. Software Process The software process is a structured set of activities required to develop a software.

CS223: Software Engineering Lecture 16: The Agile Methodology.

T EST T OOLS U NIT VI This unit contains the overview of the test tools. Also prerequisites for applying these tools, tools selection and implementation.

CS223: Software Engineering Lecture 18: The XP. Recap Introduction to Agile Methodology Customer centric approach Issues of Agile methodology Where to.

1 CS 501 Spring 2003 CS 501: Software Engineering Lecture 21 Reliability II.

Software Design and Architecture

1 CS 501 Spring 2004 CS 501: Software Engineering Lecture 20 Reliability 2.

Getting Ready for the NOCTI test April 30, Study checklist #1 Analyze Programming Problems and Flowchart Solutions Study Checklist.

 System Requirement Specification and System Planning.

MANAGEMENT INFORMATION SYSTEM

CS223: Software Engineering

Software Metrics and Reliability

Software Reliability Definition: The probability of failure-free operation of the software for a specified period of time in a specified environment.

Software testing strategies 2

CS 5150 Software Engineering

Systems Operations and Support

Presentation transcript:

CS CS 5150 Software Engineering Lecture 19 Reliability 1

CS Dependable and Reliable Systems: The Royal Majesty From the report of the National Transportation Safety Board: "On June 10, 1995, the Panamanian passenger ship Royal Majesty grounded on Rose and Crown Shoal about 10 miles east of Nantucket Island, Massachusetts, and about 17 miles from where the watch officers thought the vessel was. The vessel, with 1,509 persons on board, was en route from St. George’s, Bermuda, to Boston, Massachusetts." "The Raytheon GPS unit installed on the Royal Majesty had been designed as a standalone navigation device in the mid- to late 1980s,...The Royal Majesty’s GPS was configured by Majesty Cruise Line to automatically default to the Dead Reckoning mode when satellite data were not available."

CS The Royal Majesty: Analysis The ship was steered by an autopilot that relied on position information from the Global Positioning System (GPS). If the GPS could not obtain a position from satellites, it provided an estimated position based on Dead Reckoning (distance and direction traveled from a known point). The GPS failed one hour after leaving Bermuda. The crew failed to see the warning message on the display (or to check the instruments). 34 hours and 600 miles later, the Dead Reckoning error was 17 miles.

CS The Royal Majesty: Software Lessons All the software worked as specified (no bugs), but... Since the GPS software had been specified, the requirements had changed (stand alone system now part of integrated system). The manufacturers of the autopilot and GPS adopted different design philosophies about the communication of mode changes. The autopilot was not programmed to recognize valid/invalid status bits in message from the GPS (NMEA 0183). The warnings provided by the user interface were not sufficiently conspicuous to alert the crew. The officers had not been properly trained on this equipment.

CS Key Factors for Reliable Software Organization culture that expects quality Software design and implementation that hides complexity (e.g., structured design, object-oriented programming) Precise, unambiguous specification Software tools that restrict or detect errors (e.g., strongly typed languages, source control systems, debuggers) Programming style that emphasizes simplicity, readability, and avoidance of dangerous constructs Incremental validation (e.g., reviews, unit testing) Systematic testing (test plan, test cases, etc.) Change management (regression testing)

CS Building Dependable Systems: Organizational Culture Good organizations create good systems: Acceptance of the group's style of work (e.g., meetings, preparation, support for juniors) Visibility – tell people your problems Completion of a task before moving to the next (e.g., documentation, comments in code)

CS Building Dependable Systems: Three Principles For a software system to be dependable: Each stage of development must be done well. Changes should be incorporated into the structure as carefully as the original system development. Testing and correction do not ensure quality, but dependable systems are not possible without systematic testing.

CS Building Dependable Systems: Quality Management Processes Assumption: Good software is impossible without good processes The importance of routine: Standard terminology (requirements, specification, design, etc.) Software standards (coding standards, naming conventions, etc.) Regular builds of complete system Internal and external documentation Reporting procedures

CS Building Dependable Systems: Quality Management Processes When time is short... Pay extra attention to the early stages of the process: feasibility, requirements, design. There will be little time to redo mistakes in the requirements. Experience shows that taking extra time on the early stages will usually reduce the total time to release.

CS Building Dependable Systems: Involve the Client Specifications are of no value if they do not meet the client's needs The client must understand and review the requirements in detail. Appropriate members of the client's staff must review relevant areas of the design (including operations, training materials, system administration). The acceptance tests must belong to the client

CS Building Dependable Systems: Changes Requirements System design Testing Operation & maintenance Program design Implementation (coding) Acceptance & release Feasibility study Changes

CS Building Dependable Systems: Change Change management: Source code management and version control Tracking of change requests and bug reports Procedures for changing requirements specifications, designs and other documentation Regression testing Release control

CS Building Dependable Systems: Complexity The human mind can encompass only limited complexity: Comprehensibility Simplicity Partitioning of complexity A simple component is easier to get right than a complex one.

CS Reliability Metrics Reliability Probability of a failure occurring in operational use. Perceived reliability Depends upon: user behavior set of inputs pain of failure

CS Reliability Metrics Traditional measures for online systems Mean time between failures Availability (up time) Mean time to repair Market measures Complaints Customer retention User perception is influenced by Statistical distribution of failures

CS Metrics: User Perception of Reliability 1. A personal computer that crashes frequently, or a machine that is out of service for two days. 2. A database system that crashes frequently but comes back quickly with no loss of data, or a system that fails once in three years but data has to be restored from backup. 3. A system that does not fail but has unpredictable periods when it runs very slowly.

CS Reliability Metrics for Distributed Systems Traditional metrics are hard to apply in multi-component systems: A system that has excellent average reliability might give terrible service to certain users. In a big network, at any given moment something will be giving trouble, but very few users will see it. When there are many components, system administrators rely on automatic reporting systems to identify problem areas.

CS Requirements Metrics for System Reliability Example: ATM card reader Failure class ExampleMetric (requirement) Permanent System fails to operate1 per 1,000 days non-corrupting with any card -- reboot Transient System can not read1 in 1,000 transactions non-corrupting an undamaged card Corrupting A pattern ofNever transactions corrupts database

CS Metrics: Cost of Improved Reliability Cost $ Reliability metric 95% 100% Example. Many supercomputers average 10 hours productive work per day. How do you spend your money to improve availability?

CS Example: Central Computing System A central computer system (e.g., a server farm) is vital to an entire organization. Any failure is serious. Step 1: Gather data on every failure Many years of data in a data base Every failure analyzed: hardware software (default) environment (e.g., power, air conditioning) human (e.g., operator error)

CS Example: Central Computing System Step 2: Analyze the data Weekly, monthly, and annual statistics Number of failures and interruptions Mean time to repair Graphs of trends by component, e.g., Failure rates of disk drives Hardware failures after power failures Crashes caused by software bugs in each component

CS Example: Central Computing System Step 3: Invest resources where benefit will be maximum, e.g., Priority order for software improvements Changed procedures for operators Replacement hardware Orderly shut down after power failure