B. Todd, AB/CO/MI, 30th January 2008
Safety in Mind…
LHC Beam Interlock System

Interesting Times…

Safety systems are all around us: designed by engineers, to a specification, like any other system.
- We must be careful! It is not acceptable to ‘put it together and see if it works’.
- We must be vigilant!

Things can go wrong…
1. Software Failure
2. Hardware Failure
3. Incomplete Procedures
4. Human Error

Human error is special, since it is us, humans, who build the systems in the first place…

Software Safety

It is difficult to quantify ‘safe software’…
- A typical mobile phone can have 2 million lines of code
- A car can have 100 million lines
How on earth can these be tested?
- Complicated verification tools and mathematical proofs can be used
- These cost $$$$ & Time & People & Experience…
When faults cost $$$$ we hear about them:

Software Failures

IEEE (reliable source)
- Software Error - USDOD: software reset badly written. COST: 1 helicopter, 4 marines
- Airbus A320 Crash at Airshow: the pilot claims he was misled on the aircraft's true height by a bug in the software. COST: 3 lives, one aircraft
- Ariane 5 Rocket Failure: software error in the inertial reference system. COST: $500 million

Hardware Safety

It’s easier to quantify ‘safe hardware’…
- Reduce the critical function
- Use military handbooks
- Use tried and tested methods
- Redundancy and testing
But it still takes some energy: $$ & Time & People & Experience…
It takes extra effort to build safe systems… and MUCH more effort to correct an existing system to be safe.
And it can still go wrong…

Hardware Failures

- Titan 4: exploded after takeoff. Hardware failure. COST: $1 billion
- Buncefield oil fire: two safety interlocks failed

Procedural Safety

Using the safety equipment… needs PROCEDURES!
- Components degrade
- Safety must be verified by checking and testing
- Maintenance has to be carried out to make something as good as new
Two good examples of bad procedures causing loss are:
- Chernobyl: a ‘special’ procedure was being followed
- Piper Alpha: safety maintenance was underway

Human Error

Using the safety equipment… needs operators!
Humans are… ABSOLUTELY… the weakest link
- Human Error - CNN: engineers mis-converted English units to metric. COST: $125 million
- 1998 USS Yorktown - GCN: managed to enter zero for a setting, which crashed the systems
- 2004 Thunderbird Crash: pilot miscalculated height above sea level

Why are we the weakest link?

A couple of fun examples…
- Change blindness, from UBC in Canada
- Inattention blindness, from the University of Illinois

And so…

There is no magic bullet to make us ‘safe engineers’. We are, after all, just human. This presentation is only intended to illustrate that.
- Less software means more provable safety
- Hardware can be designed to be safe
- Procedures must be complete so safety can be verified
- We are just human
- Everyone is entitled to make a mistake
AB/CO/MI has gone a considerable way towards developing a safety culture.
We’ve learned from our mistakes and those of others.
The time is now to expand this safety culture!

Rules for VHDL Design

But there ARE rules for the VHDL realisation:
1. Specification has to be complete
2. Add safety rules and recommendations to specification
3. Describe how you will check that those rules are met
4. Use lots of Asserts in VHDL
5. Use complete Testbenches that PROVE you tested them
6. Design small blocks of code that can be completely tested
7. Build a real-life test bench to prove your design
8. Document anything which is ‘dangerous’
These are the minimum. They all assume you have safe hardware as a basis.
We accept no compromise here.
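
Rules 4 to 6 could look something like the sketch below. It is only an illustration, not code from the Beam Interlock System: the entity permit_and, its ports user_a, user_b and permit, and the testbench permit_and_tb are all hypothetical names chosen for the example. The block is kept small enough that its testbench can apply every input combination, check each result with an assert, and leave a report line in the simulation log as a record that the test ran.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical small block: permit is '1' only when both inputs are '1'.
entity permit_and is
  port (
    user_a : in  std_logic;
    user_b : in  std_logic;
    permit : out std_logic
  );
end entity permit_and;

architecture rtl of permit_and is
begin
  permit <= user_a and user_b;

  -- Rule 4: simulation-time check that flags undefined inputs
  -- ('U', 'X', 'Z', 'W' or '-') whenever an input changes.
  assert not (is_x(user_a) or is_x(user_b))
    report "permit_and: input is not a defined logic level"
    severity error;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking testbench for permit_and.
entity permit_and_tb is
end entity permit_and_tb;

architecture sim of permit_and_tb is
  signal user_a : std_logic := '0';
  signal user_b : std_logic := '0';
  signal permit : std_logic;
begin
  dut : entity work.permit_and
    port map (user_a => user_a, user_b => user_b, permit => permit);

  stimulus : process
    constant levels   : std_logic_vector(0 to 1) := "01";
    variable expected : std_logic;
  begin
    -- Rule 6: the block is small, so all 4 input combinations are applied.
    for i in levels'range loop
      for j in levels'range loop
        user_a <= levels(i);
        user_b <= levels(j);
        wait for 10 ns;
        expected := levels(i) and levels(j);
        assert permit = expected
          report "permit mismatch for a=" & std_logic'image(levels(i)) &
                 " b=" & std_logic'image(levels(j))
          severity failure;
      end loop;
    end loop;
    -- Rule 5: leave evidence in the log that the full test completed.
    report "permit_and_tb: all 4 input combinations checked" severity note;
    wait;
  end process stimulus;
end architecture sim;
```

Because a failed check uses severity failure, a simulator would normally stop at the first mismatch, so the final note only appears in the log when every combination has passed. Exhaustive checking like this only scales to small blocks, which is one reading of why rule 6 asks for them.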

FIN