COMS W3156: Software Engineering, Fall 2001 Lecture #2: The Open Class Janak J Parekh


Important terminology (I)
NEW: Different colors from previous version.
ALL NEW: Software is not compatible with previous version.
UNMATCHED: Almost as good as the competition.
ADVANCED DESIGN: Upper management doesn't understand it.
NO MAINTENANCE: Impossible to fix.

Important terminology (II)
BREAKTHROUGH: It finally booted on the first try.
DESIGN SIMPLICITY: Developed on a shoestring budget.
UPGRADED: Did not work the first time.
UPGRADED AND IMPROVED: Did not work the second time.

Some leftover points from last class
Plagiarism: I was being cute last time; you will get into trouble if you are caught.
Books: They’re available from Papyrus, 114th and Broadway.
Office hours: Sorry about this week…
Questionnaire: finally done, see
C/C++ students: talk to me.

Next class – course “begins”
Read chapters 1 and 4 of Schach, if you have the book
The first one should be a breeze (introduction); the fourth isn’t that bad (teams)
We will also start discussing the project in detail next class
Recitations will begin next week

Why Software Engineering?
We started discussing this last class
Mythical Man-Month: start reading it when you get a chance; we’ll go over it later
In the meantime, let’s discuss some case studies of how software engineering (or the lack thereof) changed certain operations

Success/Failure: Mars Rover (I)
http://catless.ncl.ac.uk/Risks/19.49.html#subj1
In 1997, the public was told that “software glitches” and “too many things trying to be done at once” caused the Pathfinder’s failures
In reality, “priority inversion” was at fault

Success/Failure: Mars Rover (II)
There were three main threads, scheduled preemptively
–Information bus data-moving: high priority, frequent
–Meteorological data-gathering: low priority, occasional
–Communications task: medium priority, occasional
Occasionally, the communications task would be scheduled while the information bus task was blocked, waiting for the meteorological data-gathering to finish

Success/Failure: Mars Rover (III)
The communications task, being higher priority, would prevent the meteorological work from being done
A watchdog timer would then fire because the info bus was “dead”, resetting the entire system
The low-priority meteorological task upended the system: “priority inversion”

Success/Failure: Mars Rover (IV)
Good news
–They had left debugging mode on
–The Rover was running VxWorks, a small real-time OS that has tracing capabilities
–They managed to trace the source
–Lastly, VxWorks has priority inheritance: a lower-priority process holding a resource inherits the priority of a higher-priority process blocked on it
–As a consequence, they were able to upload a small change to solve the crash
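
The scenario on the Mars Rover slides can be replayed with a toy scheduler. The sketch below is only an illustration, not the Pathfinder or VxWorks code: the task names, tick counts, watchdog limit and the `simulate` function are all assumptions made up for this example. It runs three tasks under a tick-based preemptive scheduler, once without and once with priority inheritance on the shared mutex.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # larger number = more urgent
    release: int      # tick at which the task becomes ready
    work: int         # ticks of CPU time the task needs
    needs_lock: bool  # True if the whole job runs while holding the mutex
    remaining: int = 0
    finished_at: int = -1


def simulate(inherit, watchdog_limit=6, horizon=50):
    """Tick-based preemptive scheduler; every number here is made up."""
    tasks = [
        Task("met", 1, release=0, work=3, needs_lock=True),     # low
        Task("bus", 3, release=1, work=2, needs_lock=True),     # high
        Task("comms", 2, release=2, work=6, needs_lock=False),  # medium
    ]
    for t in tasks:
        t.remaining = t.work
    holder = None  # task currently holding the shared mutex
    timeline = []

    for tick in range(horizon):
        if all(t.remaining == 0 for t in tasks):
            break
        ready = [t for t in tasks if t.release <= tick and t.remaining > 0]
        # A task that needs the mutex while another task holds it is blocked.
        blocked = [t for t in ready
                   if t.needs_lock and holder is not None and holder is not t]
        runnable = [t for t in ready if all(t is not b for b in blocked)]

        def effective(t):
            # Priority inheritance: the lock holder temporarily runs at the
            # priority of the most urgent task it is blocking.
            if inherit and t is holder and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if not runnable:
            timeline.append("idle")
            continue
        current = max(runnable, key=effective)
        if current.needs_lock and holder is None:
            holder = current  # acquire the mutex
        current.remaining -= 1
        timeline.append(current.name)
        if current.remaining == 0:
            current.finished_at = tick + 1
            if holder is current:
                holder = None  # release the mutex

    bus = tasks[1]
    response = bus.finished_at - bus.release
    return {"timeline": timeline, "bus_response": response,
            "watchdog_reset": response > watchdog_limit}


for mode in (False, True):
    print("inherit =", mode, simulate(mode))
```

With these made-up numbers, the high-priority bus task takes 10 ticks to complete without inheritance (the medium-priority task runs while the low-priority lock holder is starved), which exceeds the assumed watchdog limit of 6. With inheritance it completes in 4 ticks and the watchdog stays quiet.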

Lessons: Mars Rover
Black box testing would have been impossible; they had to see interrupts, etc.
Therefore, leaving debugging facilities on afterwards was a big win here
–Designing for maintenance
The fact that the data bus maintenance task ran frequently and was short means nothing

Failure: Therac-25 (I)
http://sunnyday.mit.edu/papers/therac.pdf (don’t read it if you are squeamish)
Therac-25 was a linear accelerator released in 1982 that treated cancer by delivering limited doses of radiation
This new model was software-controlled as opposed to hardware-controlled; previous units had software merely for convenience

Failure: Therac-25 (II)
Controlled by a PDP-11 computer; the software controlled safety
In case of error, the software was designed to prevent harmful effects
However, in case of software error, cryptic codes were given back to the operator: “MALFUNCTION xx”, where 1 ≤ xx ≤ 64

Failure: Therac-25 (III)
Operators became desensitized to the errors: they happened often, and the operators were told it was impossible to overdose a patient
However, six people received massive overdoses of radiation; several of them died

Failure: Therac-25 (IV)
Main cause:
–A race condition often occurred when the operator entered data quickly, then hit the UP arrow key to correct it, and the values weren’t reset properly
–AECL (the company) never noticed the problem with quick data entry; their people didn’t do this on a daily basis
–Apparently the problem existed in previous units, but they had a hardware interlock mechanism to prevent it; here, they trusted the software and took out the hardware interlock
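
To make the race concrete, here is a deliberately tiny model of the failure mode. It is not AECL's code and does not reproduce the real PDP-11 software; the `Console` class, the mode names, the 8-tick setup time and the `run` helper are all hypothetical. It only shows how an edit that arrives while a slow setup is still in progress can leave the screen and the hardware disagreeing, and how a check before firing (standing in for the removed interlock) catches the mismatch.

```python
class Console:
    """Hypothetical, heavily simplified treatment console."""

    SETUP_TICKS = 8  # assumed time the hardware needs to reconfigure

    def __init__(self):
        self.entered_mode = None   # what the operator sees on screen
        self.hardware_mode = None  # what the machine is actually set to
        self._pending = None
        self._setup_left = 0

    def enter(self, mode):
        self.entered_mode = mode
        if self._setup_left == 0:
            self._pending = mode
            self._setup_left = self.SETUP_TICKS
        # BUG (intentional): an edit that arrives while setup is running
        # updates the screen but is never passed on to the hardware.

    def tick(self):
        if self._setup_left > 0:
            self._setup_left -= 1
            if self._setup_left == 0:
                self.hardware_mode = self._pending

    def fire_unchecked(self):
        # Trusts that screen and hardware agree.
        return f"beam on in {self.hardware_mode} mode"

    def fire_checked(self):
        # Defensive check standing in for an interlock.
        if self._setup_left > 0 or self.hardware_mode != self.entered_mode:
            raise RuntimeError("interlock: screen and hardware disagree")
        return f"beam on in {self.hardware_mode} mode"


def run(edit_after_ticks):
    """Enter a mode, correct it after the given delay, then wait for setup."""
    console = Console()
    console.enter("x-ray")
    for _ in range(edit_after_ticks):
        console.tick()
    console.enter("electron")  # the operator's correction
    for _ in range(Console.SETUP_TICKS):
        console.tick()
    return console


slow = run(edit_after_ticks=10)  # correction made after setup finished
fast = run(edit_after_ticks=3)   # correction made during setup
print(slow.entered_mode, slow.hardware_mode)
print(fast.entered_mode, fast.hardware_mode)
```

In this model a slow operator never triggers the bug, which is consistent with the slide's point that the company's own people did not enter data quickly enough to see it.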

Lessons from Therac-25 (I)
Overconfidence in software, especially for embedded systems
Reliability != safety
No defensive design, bizarre error messages
They just “bugfixed”, didn’t look for root causes
Complacency

Lessons from Therac-25 (II)
Improper software engineering practices
–Most testing, in reality, was done in a simulated environment and on a complete unit; little if any unit and software testing
–They claimed 2700 hours of testing; it was really 2700 hours “of use”
–Overly complex, poorly organized design
–Blind software reuse

Is there a “successful” way?
Hard to say: software engineering is an imprecise field
There’s always “room to improve”
Nevertheless, there are many examples of million-dollar savings after initial investments that seemed large but were quickly offset by the cost savings
See the book