SUMMARY: EXTERNAL REVIEW RESULTS FOR THE LHC BLM SYSTEM Christos Zamantzas for the BLM team. Machine Protection Panel 24/06/2011.

Slides:



Advertisements
Similar presentations
Testing Relational Database
Advertisements

Configuration Management
Verification and Validation
SOFTWARE TESTING. INTRODUCTION  Software Testing is the process of executing a program or system with the intent of finding errors.  It involves any.
HP Quality Center Overview.
Information Risk Management Key Component for HIPAA Security Compliance Ann Geyer Tunitas Group
Auditing Concepts.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 20 Slide 1 Critical systems development.
Continuous Value Enhancement Process
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 30 Slide 1 Security Engineering.
The Australian/New Zealand Standard on Risk Management
Fundamentals of Information Systems, Second Edition
Computer Security: Principles and Practice
© 2006 Pearson Addison-Wesley. All rights reserved2-1 Chapter 2 Principles of Programming & Software Engineering.
DITSCAP Phase 2 - Verification Pramod Jampala Christopher Swenson.
Principle of Functional Verification Chapter 1~3 Presenter : Fu-Ching Yang.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 30 Slide 1 Security Engineering.
Purpose of the Standards
Vegard Joa Moseng BI - BL Student meeting Reliability analysis summary for the BLEDP.
Protection Against Occupational Exposure
Technical review on UPS power distribution of the LHC Beam Dumping System (LBDS) Anastasia PATSOULI TE-ABT-EC Proposals for LBDS Powering Improvement 1.
Codex Guidelines for the Application of HACCP
Internal Auditing and Outsourcing
Software Testing Verification and validation planning Software inspections Software Inspection vs. Testing Automated static analysis Cleanroom software.
Introduction to ISO New and modified requirements.
CS 501: Software Engineering Fall 1999 Lecture 16 Verification and Validation.
Association for Biblical Higher Education February 13, 2013 Lori Jo Stanfield Evaluator Team Training for Business Officers.
Project Tracking. Questions... Why should we track a project that is underway? What aspects of a project need tracking?
Chapter 6 : Software Metrics
Learning Objectives LO5 Illustrate how business risk analysis is used to assess the risk of material misstatement at the financial statement level and.
S7: Audit Planning. Session Objectives To explain the need for planning To explain the need for planning To outline the essential elements of planning.
Audit Planning. Session Objectives To explain the need for planning To outline the essential elements of planning process To finalise the audit approach.
ISM 5316 Week 3 Learning Objectives You should be able to: u Define and list issues and steps in Project Integration u List and describe the components.
SPS policy – Information Presentation Presentation to ROS June 16, 2004.
Beam Loss Monitor System – Preliminary Feedback November 11 th, 2010.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 22 Slide 1 Software Verification, Validation and Testing.
1 Introduction to Software Engineering Lecture 1.
Programme Objectives Analyze the main components of a competency-based qualification system (e.g., Singapore Workforce Skills) Analyze the process and.
Process Improvement. It is not necessary to change. Survival is not mandatory. »W. Edwards Deming Both change and stability are fundamental to process.
[Hayes, Dassen, Schilder and Wallage, Principles of Auditing An Introduction to ISAs, edition 2.1] © Pearson Education Limited 2007 Slide 7.1 Internal.
Safety-Critical Systems 7 Summary T V - Lifecycle model System Acceptance System Integration & Test Module Integration & Test Requirements Analysis.
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc., All Rights Reserved. 6-1 Chapter 6 CHAPTER 6 INTERNAL CONTROL IN A FINANCIAL STATEMENT AUDIT.
1 CSCD 326 Data Structures I Software Design. 2 The Software Life Cycle 1. Specification 2. Design 3. Risk Analysis 4. Verification 5. Coding 6. Testing.
Auditing Internal Control Studies & Risk Assessment Chapter 9 Internal Control Studies & Risk Assessment Chapter 9.
Process Improvement. It is not necessary to change. Survival is not mandatory. »W. Edwards Deming.
© 2006 Pearson Addison-Wesley. All rights reserved 2-1 Chapter 2 Principles of Programming & Software Engineering.
Vegard Joa Moseng BI - BL Student meeting Reliability analysis of the Input Monitor in the BLEDP.
Kathy Corbiere Service Delivery and Performance Commission
Slide 1 Security Engineering. Slide 2 Objectives l To introduce issues that must be considered in the specification and design of secure software l To.
ANALYSIS PHASE OF BUSINESS SYSTEM DEVELOPMENT METHODOLOGY.
Report Performance Monitor & Control Risk Administer Procurement MONITORING & CONTROLLING PROCESS.
Copyright © 2007 Pearson Education Canada 9-1 Chapter 9: Internal Controls and Control Risk.
Continual Service Improvement Methods & Techniques.
TE/TM 30 th March - 0v1 CERN MPP SMP 3v0 - Introduction 3 *fast *safe *reliable *available generates flags & values.
Company LOGO. Company LOGO PE, PMP, PgMP, PME, MCT, PRINCE2 Practitioner.
Eva Barbara Holzer MPP, CERN July 31, Eva Barbara Holzer, CERN MPP CERN, July 31, 2009 BLM System Audit Sequel.
LECTURE 5 Nangwonvuma M/ Byansi D. Components, interfaces and integration Infrastructure, Middleware and Platforms Techniques – Data warehouses, extending.
CS223: Software Engineering Lecture 25: Software Testing.
Lecturer: Eng. Mohamed Adam Isak PH.D Researcher in CS M.Sc. and B.Sc. of Information Technology Engineering, Lecturer in University of Somalia and Mogadishu.
 System Requirement Specification and System Planning.
Computer Security: Principles and Practice First Edition by William Stallings and Lawrie Brown Lecture slides by Lawrie Brown Chapter 17 – IT Security.
Auditing Concepts.
Internal Control Principles
2007 IEEE Nuclear Science Symposium (NSS)
Software Testing.
Security Engineering.
SCSC April 2018 A model for including cyber threat in safety cases
Software testing strategies 2
J. Emery, B. Dehning, E. Effinger, G. Ferioli, C
How to conduct Effective Stage-1 Audit
Presentation transcript:

SUMMARY: EXTERNAL REVIEW RESULTS FOR THE LHC BLM SYSTEM Christos Zamantzas for the BLM team. Machine Protection Panel 24/06/2011

External Auditors Auditors:  Dr. Jeffrey Joyce  Dr. Laurent Fabre  Dr. Naghmeh Ghafari  Dr. Lesley Shannon (external)  Dr. Wolfgang Strigel (observer) 24/06/20112 Critical Systems Labs, Inc. # Howe Street Vancouver, B.C. Canada V6C 2B3

Scope of the external audit CERN internal reviews have been performed in the past and these reviews focused primarily on the analogue parts of the system design. Therefore the CSL review will focus more on the digital parts and particularly on the program- mable parts of this system. The essential and foremost question that will drive the CSL technical review is: “Are the digital and programmable parts of the BLM system going to perform as they are intended to in the context of the BLM system?” 24/06/2011

LHC Beam Loss Monitoring System 4/12 Beam Loss Monitors Acquisition & Digitisation Processing, Analysis & Decision Combiner & Survey around the LHC ring Front-end CPU Beam Interlock System Interface x4000 x700 x400 x25 x16 TunnelSurface Settings and ThresholdsDatabases, fixed displays… 24/06/2011 grouped in 8 points Review participants: B. Dehning, E. Effinger, J. Emery, C. Zamantzas

Additional things checked The propagation of a signal from the monitor input up to the interlock output Existing documentation Choice of technical options System verification techniques Performance of the Successive Running Sums Possible future changes 24/06/2011

Successive Running Sums Investigated: Dynamic range, accuracy and speed requirements Simulation and calculation models we’ve used for validation Verified: Formal methods simulation model Collaborated with Cambridge university  Results will be presented this August at the “16 th International Workshop on Formal Methods for Industrial Critical Systems” 24/06/2011

Executive Summary “ With one minor reservation, we have found no reason to be concerned that the current configuration of the BLMS might fail to request a beam dump in response to a dangerous beam loss. This conclusion is based on the following assumptions: (1) appropriate threshold settings are used; (2) the current operating procedures, including regular execution of the “connectivity test”, are maintained; (3) no element of the BLMS, such as an individual detector, is disabled without a decision by the appropriate authority that the safety risk is acceptable; (4) known limitations of the BLMS are adequately addressed before the machine is used at higher energies; and (5) the risk associated with the dependency of the BLMS on other systems for an accurate indication of bean energy is deemed to be acceptable by the appropriate authority. Changes to any aspect of the design may invalidate this conclusion. ” 24/06/2011

The Reservation “ The one minor reservation associated with the above conclusion is the fact that the critical real- time datapath responsible for generation of a beam dump request is not fully redundant. Hence, there are single points of failure in the design of this critical portion of the BLMS. In this regard, we are concerned mostly about non-redundant elements of this datapath within the Threshold Comparator Field Programmable Gate Array (FPGA). ” 24/06/2011

SUMMARY OF FINDINGS All comments from CSL – emphasis mine Machine Protection Panel 24/06/2011

Summary of findings 1/4 1. The design of the BLMS is conservative, except where novel solutions are needed to overcome limitations of known solutions. 2. The design of the BLMS includes very substantial provision for error detection. Fault tolerance is used appropriately. 3. The critical path for beam dump requests includes some single points of failure. 4. The BLMS serves other purposes besides machine protection. This might bring other interests into conflict with the safety objective of the BLMS. 24/06/201110

Summary of findings 2/4 5. Separation of critical parts of the design from non-critical parts is not an explicit feature of the design. 6. Very high utilization of the logic elements in the BLECF and BLMTC FPGAs could be problematic as it evolves over the lifetime of the LHC. 7. The algorithm and circuit design for the BLMTC Successive Running Sums (SRS) calculation is very complex. 24/06/201111

Summary of findings 3/4 8. The lack of a comprehensive specification of the BLMS requirements will likely have an adverse effect on the maintainability of the system. 9. System maintainers could have difficulties understanding the VHDL code for several reasons, especially the lack of meaningful comments. 10. While impressively diverse, it is difficult to objectively assess the adequacy of the verification results. 11. Proof testing is very substantial, but operational procedures concerned with proof testing are unclear. 24/06/201112

Summary of findings 4/4 12. There are known limitations of the BLMS that need to be addressed, e.g., insufficient dynamic range of the detectors near the injection sites. 13. There is a lack of safeguards to protect against human error when critical data such as threshold settings is modified. 14. There are a number of ways in which the maintainability of the system could be improved. 15. It is unclear how operators and system maintainers will become aware of problems that arise while the system is operating. 16. The safety of the BLMS depends on other systems including the BETS and GMT link. 24/06/201113

RECOMMENDATIONS All comments from CSL – emphasis mine Machine Protection Panel 24/06/2011

Recommendations 1/4 #1: There should be a clear policy statement by the Director for Accelerators and Technology that changes to the BLMS design for the purpose of enhancing or modifying the capability of the BLMS to provide measurements of beam loss must not compromise the safety of the BLMS, i.e., its ability to reliably detect a dangerous loss. #2: The engineering documentation for the BLMS design should be enhanced to distinguish critical from non-critical elements of the design, and to record an evidence-based argument that the critical functionality is sufficiently isolated from the non-critical functionality (if such an argument can be made). #3: Any proposal to modify the BLMS in a manner that would increase the amount of utilization of the FPGAs should be strongly resisted unless some previous change to the BLMS has reduced the utilization to a level that easily accommodates additional functionality without increasing the utilization back to a very high level. 24/06/201115

Recommendations 2/4 #4: Options to reduce the utilization of the FPGAs should be identified and evaluated. #5: A comprehensive specification of the required functional behaviour of the BLMS should be created and maintained using a style and format that is amenable to modification as changes are made over its lifetime. #6: The understandability of the VHDL code should be significantly improved with the addition of meaningful in-line comments. 24/06/201116

Recommendations 3/4 #7: A Master Verification plan should be created to document the verification approach from a top-down perspective, e.g., what tests have been allocated to each component of the system. This document will guide system maintainers when changes are made to the system. #8: A formal policy should be established to impose limits on the ability of a single individual to modify the procedure for starting the accelerator in any way that would override, inhibit or otherwise interfere with the performance of the connectivity test. #9: A schedule for each form of proof testing should be defined along with a means by which operators would know if some form of proof testing has not been performed according to schedule, or if the results indicate an unresolved problem. 24/06/201117

Recommendations 4/4 #10: Safeguards should be added to the tools used to modify critical data to reduce the likelihood of simple human errors such as incorrect keystrokes. #11: A comprehensive procedure for monitoring the status of the BLMS should be defined to ensure that operators and system support maintainers will become aware of faults and other problems in a systematic and timely manner. This procedure should include review activities to be performed after each LHC run. #12: The LHC Machine Protection Committee should determine whether the risk associated with the dependency of the BLMS on other systems has been adequately controlled. 24/06/201118

CONCLUSIONS AND FUTURE Machine Protection Panel 24/06/2011

Conclusions Lengthy and hectic procedure but provided excellent experience. CSL had multiple people (each having his own expertise) “attacking” the system from all fronts (specifications, documentation, testing, procedures, implementation,...). Helped to organise our multiple priorities. Accumulated knowledge that we can transfer on new projects. 24/06/201120

Modifications List Long-term: (complete) the Master Verification Plan Short-term: Improve documentation Add functionality to inhibit the beam dump request on injection Add dedicated data for the automatic collimator beam-based alignment As soon as possible: Full deployment of the “Study” buffer (part of the Capture function) Deploy the continuous high voltage check Add VME block transfer (DMA memory) 24/06/2011

THANK YOU