Lecture 4b: Risks and Liabilities of Computer-based Systems

Lecture 4b: Risks and Liabilities of Computer-based Systems CSCI102 - Introduction to Information Technology B ITCS905 - Fundamentals of Information Technology

Overview: historical examples of software risks; implications of software complexity; risk assessment and management

Historical Examples Software errors: can KILL; cost MONEY; indirectly, cause loss of equipment and loss of business

Software Aids and Abets Murder: 1992 A New Jersey inmate escaped from computer-monitored house arrest in the spring of 1992. He simply removed the rivets holding his electronic anklet together and went off to commit a murder.

Software Aids and Abets Murder: 1992 A computer detected the tampering. However, when it called a second computer to report the incident, the first computer received a busy signal and never called back.

Radiation Machine Kills Four: 1985 to 1987 Faulty software in a Therac-25 radiation-treatment machine resulted in several cancer patients receiving lethal overdoses of radiation

Radiation Machine Kills Four: 1985 to 1987 Four patients died

Radiation Machine Kills Four: 1985 to 1987 When their families sued, all the cases were settled out of court. There were several errors, among them the failure of the programmer to detect a race condition (i.e., miscoordination between concurrent tasks). The Therac-25 was a computer-controlled radiation-therapy machine made by Atomic Energy of Canada Limited (AECL). In 1986, two cancer patients at the East Texas Cancer Center in Tyler received fatal radiation overdoses from it.

Radiation Machine Kills Four: 1985 to 1987 A later investigation by independent scientists Nancy Leveson and Clark Turner found that accidents occurred even after AECL thought it had fixed particular bugs. "A lesson to be learned from the Therac-25 story is that focusing on particular software bugs is not the way to make a safe system," they wrote in their report. "The basic mistakes here involved poor software-engineering practices and building a machine that relies on the software for safe operation."
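The race condition named in the Therac-25 slides can be made concrete with a small sketch. This is not the Therac-25 code: the function names, the shared counter, and the round-robin scheduler are all assumptions chosen for illustration. Two tasks each perform a read-then-write update; when their steps are interleaved, one update is silently lost.

```python
# Hypothetical illustration of a race condition (a lost update).
# None of these names come from the Therac-25 software.

def read_modify_write(shared, key, delta):
    """A non-atomic update, split into two steps so tasks can interleave."""
    value = shared[key]           # step 1: read the current value
    yield                         # another task may run at this point
    shared[key] = value + delta   # step 2: write back a possibly stale value


def run_interleaved(tasks):
    """Round-robin scheduler: advance every task one step at a time."""
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


def run_sequential(tasks):
    """Run each task to completion before starting the next one."""
    for task in tasks:
        for _ in task:
            pass


# Interleaved: both tasks read 0 before either one writes.
shared = {"count": 0}
run_interleaved([read_modify_write(shared, "count", 1),
                 read_modify_write(shared, "count", 1)])
racy_result = shared["count"]     # 1, one update was lost

# Sequential: the second task sees the first task's write.
shared = {"count": 0}
run_sequential([read_modify_write(shared, "count", 1),
                read_modify_write(shared, "count", 1)])
safe_result = shared["count"]     # 2

print(racy_result, safe_result)
```

The explicit scheduler keeps the failure reproducible. With real threads the same bug appears only under particular timing, which is part of why such errors are hard to find by testing alone.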

Hyphen Costs $80 Million: 1962 A probe launched from Cape Canaveral was set to go to Venus. After takeoff, the unmanned rocket carrying the probe went off course. NASA had to blow up the rocket to avoid endangering lives on Earth.

Hyphen Costs $80 Million: 1962 NASA later attributed the error to a faulty line of Fortran code. “Somehow a hyphen had been dropped from the guidance program loaded aboard the computer, allowing the flawed signals to command the rocket to veer left and nose down ... Suffice it to say, the first U.S. attempt at interplanetary flight failed for want of a hyphen.”

Hyphen Costs $80 Million: 1962 The vehicle cost more than $80 million, prompting Arthur C. Clarke to refer to the mission as “the most expensive hyphen in history”

AT&T Long Distance Service Fails: 1991 In the summer of 1991, telephone outages occurred in local telephone systems in California and along the Eastern seaboard. These breakdowns were all the fault of an error in signalling software.

AT&T Long Distance Service Fails: 1991 Right before the outages, DSC Communications introduced a bug when it changed three lines of code in the several-million-line signalling program. After this tiny change, nobody thought it necessary to retest the program.

AT&T Long Distance Service Fails: 1991 These switching errors in AT&T's call-handling computers caused the company's long-distance network to go down for nine hours. The meltdown affected thousands of services and was eventually traced to a single faulty line of code.
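The retesting step that was skipped after the three-line change can be sketched as a small regression suite: a fixed set of known-good cases that is rerun after every change, however small. The function below is a made-up stand-in, not the actual signalling program, and the thresholds and case table are assumptions for illustration only.

```python
# Hypothetical example: classify_load and its thresholds are invented
# for illustration and are unrelated to the real signalling software.

def classify_load(queue_length, capacity):
    """Original version: classify how congested a message queue is."""
    if queue_length * 10 >= capacity * 9:   # at or above 90% full
        return "overload"
    if queue_length * 2 >= capacity:        # at or above 50% full
        return "busy"
    return "normal"


def classify_load_patched(queue_length, capacity):
    """A 'tiny' change: >= became > on two lines."""
    if queue_length * 10 > capacity * 9:
        return "overload"
    if queue_length * 2 > capacity:
        return "busy"
    return "normal"


# Known-good behaviour, including the boundary values.
REGRESSION_CASES = [
    ((0, 100), "normal"),
    ((49, 100), "normal"),
    ((50, 100), "busy"),
    ((89, 100), "busy"),
    ((90, 100), "overload"),
    ((100, 100), "overload"),
]


def run_regression(func, cases):
    """Return a list of (args, expected, actual) for every failing case."""
    failures = []
    for args, expected in cases:
        actual = func(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


print(run_regression(classify_load, REGRESSION_CASES))          # no failures
print(run_regression(classify_load_patched, REGRESSION_CASES))  # boundary cases fail
```

Rerunning the same cases against the patched version flags the two boundary inputs whose behaviour changed, which is exactly the kind of signal a skipped retest never produces.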

There’s a Hole in the Bucket Small systems form part of larger systems. A fault within a small part could result in a catastrophe later on.

There’s a Hole in the Bucket Designers have an ethical responsibility to design the best system possible

Bugs Bugs exist because humans aren't perfect. Since humans design and program hardware and software, mistakes are inevitable. That's what computer and software vendors tell us, and it's partly true. What they don't say is that software is buggier than it has to be.

Bugs Why? Because time is money, especially in the software industry

Bugs This is how bugs are born

Bugs A software or hardware company sees a business opportunity and starts building a product to take advantage of it. Long before development is finished, the company announces that the product is on the way. Because the public is (the company hopes) now anxiously awaiting this product, the marketing department fights to get the goods out the door before that deadline

Bugs All the while pressuring the software engineers to add more and more features

Bugs Shareholders and venture capitalists clamour for quick delivery because that's when the company will see the biggest surge in sales. Meanwhile, the quality-assurance division has to battle for sufficient bug-testing time.

Bugs "The simple fact is that you get the most revenues at the release of software," says Bruce Brown, the founder of BugNet, a newsletter that has chronicled software bugs and fixes since 1994. "The faster you bring it out, the more money you make. You can always fix it later, when people howl. It's a fine line when to release something, and the industry accepts defects."

What Is Risk Assessment and Management? Risk and uncertainty are fundamental elements of modern life. They are ever present in the actions of human beings, and they are frequently magnified in large-scale technological systems. Risk and uncertainty must be managed effectively to protect people from injury and to permit the development of reliable, high-quality products.

What Is Risk Assessment and Management? Risk is often defined as a measure of the probability and severity of adverse effects

What Is Risk Assessment and Management? In risk assessment, the analyst often attempts to answer the following triplet of questions: What can go wrong? What is the likelihood that it would go wrong? What are the consequences?

What Is Risk Assessment and Management? Answers to these questions help risk analysts identify, measure, quantify, and evaluate risks and their consequences and impacts
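One way to turn the three assessment questions into something comparable is to record an answer to each and combine likelihood and consequence into a single score. The sketch below assumes the score is the expected loss (probability multiplied by cost), which is one common simplification rather than the only definition of risk. The scenario names and all numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    what_can_go_wrong: str   # answer to question 1
    likelihood: float        # answer to question 2: probability per year (hypothetical)
    consequence: float       # answer to question 3: loss in dollars (hypothetical)

    def expected_loss(self) -> float:
        """Assumed risk measure: probability times severity."""
        return self.likelihood * self.consequence


def rank_by_risk(scenarios):
    """Order scenarios from highest to lowest expected loss."""
    return sorted(scenarios, key=lambda s: s.expected_loss(), reverse=True)


# Made-up figures, loosely themed on the historical examples.
scenarios = [
    Scenario("Signalling software outage", 0.10, 5_000_000),
    Scenario("Monitoring call gets a busy signal", 0.50, 200_000),
    Scenario("Guidance code transcription error", 0.01, 80_000_000),
]

ranked = rank_by_risk(scenarios)
for s in ranked:
    print(f"{s.what_can_go_wrong}: {s.expected_loss():,.0f}")
```

With these figures the least likely scenario ranks first because its consequence is so large. A single expected-loss number can also hide that difference between rare catastrophic events and frequent minor ones, so it is best read alongside the separate likelihood and consequence values.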

What Is Risk Assessment and Management? Risk management builds on the risk assessment process by seeking answers to a second set of three questions: What can be done? What options are available and what are their associated trade-offs in terms of all costs, benefits, and risks? What are the impacts of current management decisions on future options?

What Is Risk Assessment and Management? To be effective and meaningful, risk management must be an integral part of the overall management of a system. This is particularly important in the management of technological systems, where the failure of the system can be caused by the failure of the hardware, the software, the organization, or the humans.