Performance monitoring framework for the technical infrastructure

Slides:



Advertisements
Similar presentations
Making the System Operational
Advertisements

MAINTENANCE MANAGEMENT Module 9
ITIL: Service Transition
All content in this presentation is protected – © 2008 American Power Conversion Corporation Rael Haiboullin System Engineer Change Manager.
Intervention Priority Management This talk will show the CERN priority list, the corresponding check list and the tools used by operators to diagnose a.
Management of IT Environment (5) LS 2012/ Martin Sarnovský Department of Cybernetics and AI, FEI TU Košice ITIL:Service Design IT Services Management.
Chapter 19: Network Management Business Data Communications, 4e.
Information about system status Piquet backup call Supervision of fire alarm systems Control room backup Computer support for HCI’s/terminals Confirmation.
DITSCAP Phase 2 - Verification Pramod Jampala Christopher Swenson.
Overview of Data Management solutions for the Control and Operation of the CERN Accelerators Database Futures Workshop, CERN June 2011 Zory Zaharieva,
CS370 Spring 2007 CS 370 Database Systems Lecture 2 Overview of Database Systems.
CV activities on LHC complex during the long shutdown Serge Deleval Thanks to M. Nonis, Y. Body, G. Peon, S. Moccia, M. Obrecht Chamonix 2011.
Maintenance Specificities in the CERN Cooling and Ventilation Group Asset and Maintenance Management Workshop CERN, November 2013 I. El Hardouz,
Creating a common understanding on Adverse events information requirements CCC-TI Isabelle Laugier Nov 2 nd 2012.
 To explain the importance of software configuration management (CM)  To describe key CM activities namely CM planning, change management, version management.
Barriere O, Le Roux P
1v1 Availability Tracking as a Means to Increase LHC Physics Production B. Todd 1, A. Apollonio 1 and L. Ponce 1 1 CERN – European Organisation for Nuclear.
Database Administration
1 Kenneth Osborne, 9/14/07 Inter Group communications at the Advanced Light Source. Overview of the different methods of communication between different.
RBM and Asset Strategy.
C6 Databases. 2 Traditional file environment Data Redundancy and Inconsistency: –Data redundancy: The presence of duplicate data in multiple data files.
Control in ATLAS TDAQ Dietrich Liko on behalf of the ATLAS TDAQ Group.
 by Banner Software Harris Herman Jan Peterson Banner Software April 2004 California Performance Review Transforming California.
CLIC Implementation Studies Ph. Lebrun & J. Osborne CERN CLIC Collaboration Meeting addressing the Work Packages CERN, 3-4 November 2011.
Copyright © 2002 OSI Software, Inc. All rights reserved. PI Application Framework Example Applying the Application Framework.
Review of the operation scenarios and required manning of the activities P. Schnizer and L. Serio.
Overview of the main events related to TS equipment during 2007 Definition Number and category of the events Events and measures taken for each machine.
ABOC/ATC days Electrical Network: Selectivity and Protection Plan José Carlos GASCON TS–EL Group 22 January 2007.
David Foster LCG Project 12-March-02 Fabric Automation The Challenge of LHC Scale Fabrics LHC Computing Grid Workshop David Foster 12 th March 2002.
CERN Availability Working Group & Accelerator Fault Tracker Availability Working Group & Accelerator Fault Tracker - Where do we.
Quality assurance - documentation and diagnostics during interventions Corrective maintenance seen from the Technical Infrastructure operation Peter Sollander,
AB/CO Review, Interlock team, 20 th September Interlock team – the AB/CO point of view M.Zerlauth, R.Harrison Powering Interlocks A common task.
1v2 LHC Availability Tracking: Past and Future B. Todd, A. Apollonio, L. Ponce on behalf of the LHC AWG.
Cryogenics Fault Tree A. Niemi & E. Rogova. Contents 1.Introduction of the current tree structure 2.Failure rates observed in 2015 failure data 3.Unsure.
IST 210 Database Design Process IST 210, Section 1 Todd S. Bacastow January 2004.
TCR Remote Monitoring for the LHC Technical Infrastructure 6th ST Workshop, Thoiry 2003U. Epting, M.C. Morodo Testa, S. Poulsen1 TCR Remote Monitoring.
Conventional Facilities integration: Approach and Issues Daniel Piso Fernández WP Leader (WP13 Conventional Facilities Integration Support) November 5,
Support for Technical Infrastructure operations P. Sollander, AB/OP/TI.
ITIL: Service Transition
Chapter 19: Network Management
Software Project Configuration Management
Manufacturing Productivity Solutions
Technical Services: Unavailability Root Causes, Strategy and Limitations Data and presentation in collaboration with Ronan LEDRU and Luigi SERIO.
Machine operation and daily maintenance management in SOLEIL
Summary of Changes to the
PSS Plans for Improved Reliability and Availability
Software Configuration Management
Improving the Defect Life Cycle Management Process
Quality Assurance applied to Accelerator Safety
Developing Information Systems
ITIL:Service Design IT Services Management Martin Sarnovský
the CERN Electrical network protection system
Planning Phase: Project Control and Deliverables
Chapter 6 Database Design
Managing infrastructure faults to minimize accelerator down time
Automation Topics: Elements of an Automated System
Chapter 9 – Software Evolution and Maintenance
PREVENTIVE MAINTENANCE
1 Stadium Company Network. The Stadium Company Project Is a sports facility management company that manages a stadium. Stadium Company needs to upgrade.
Infor EAM in the Handling Engineering Group at CERN
PANS-AIM (Doc 10066) Air Navigation Procedures for AIM Seminar
Leading Practice Implementation Guide
Case management as logbook for the cryogenic installations at CERN
Leading Practice Implementation Guide
Database System Concepts and Architecture
Conceptual Design of the Central Process Systems
5/12/2019 2:57 PM © Microsoft Corporation. All rights reserved.
AV 13 Avantis Capabilities for Effective Asset Management
Asset condition and priority in Maximo
CHAPTER 5 THE DATA RESOURCE
Presentation transcript:

Performance monitoring framework for the technical infrastructure Ugo Gentile on behalf of TIOC

Content Context Project goals and work plan Activities: Conclusions SBS Alarm discovery Integration with AFT and Infor EAM DB Conclusions

Monitoring and analysis of the technical infrastructure Control Center TI operators are in charge of monitoring and record failures TIOC meets weekly to perform: post-mortem analysis and coordinate interventions identify root causes propose consolidation actions to minimize impact on the machines complex Informs TI operator CCC Goals and issues: Clear and representative KPIs Automate the analysis Downtimes calculations (TI logbook) Models showing functional dependencies Major event Reports on investigations Monitors the TI Failure Technical Infrastructure Support Record root cause Get failure information Analyses events and produces recommendations TIOC

Project goals Performance and Monitoring Framework Analysis Modelling Develop a supporting framework for monitoring and analysis of the TI, guide intervention and troubleshooting Performance and Monitoring Framework Analysis Use of AFT for analysis representation Dependability Performances Reliability analyzing analyzing analyzing analyzing Modelling (Descriptive Models) (Predictive Models) Data mining Mining Processing Store discovered dependencies Extraction Provides system downtimes Equipment information

Workplan Functional analysis: Definition of a hierarchical structure to record failures in the logbook: the System Breakdown Structure (SBS) based on systems and data analysis The SBS is necessary since: Allows to understand system functionalities and dependencies Allows to organize failure data for the modelling phase Provides a common interface to integrate the TI Logbook with other existing systems (e.g. the AFT) Analysis existing data and systems Functional analysis and data organization Definition of procedures and tools for data recording and mining Modelling for predictive analysis Build, release and testing of the framework Integration with other CERN existing systems

SBS definition process Top-down approach, following the functional structure of the system How? Machine To collide beams To provide cooling To distribute primary water To distribute chilled water More.. To provide power To provide access To provide controls Providing network The breakdown structure follows the functional decomposition of the system The SBS identify the system responsible for the failure Simple top down approach primary functions Further levels to be developed as required with the equipment groups

Root cause field definition Failure category should be specified for analysis purpose Used has an attribute of the failure A hierarchical organization to simplify the analysis at different level The failure attribute specify why the responsible system has failed

Alarm data mining To help CCC operators to identify failure causes Based on alarm systems used by the CCC Operator (LASER, PSEN..) To help CCC operators to identify failure causes Reader 18 kV Problem! Filters Learning PSEN, LASER.. To discover dependency Loss of 18kV @ SEM4 Loss of installation SU4 CV Loss of Chilled Water Start of 400 V backup Additional dependency identification between Air compressed (SU4) and 400V

Integration with the AFT (1/2) The integration allows to have a common and unique tool for assessment of availability and reliability for the whole accelerator complex Open Issue: Need for a synchronization mechanism between the two systems! AFT Event n. 26140 Assigned to System Technical Services » Electrical Network Current State OP Ended Started 07-04-2016 03:34:19 OP Ended 07-04-2016 03:56:24 OP Duration 22m 05s Faulty Element RCO/RCD.A56B2 Description : tripped during the ramp, problem on a DC cable - Initially assigned to TI - Then rejected and assigned to Power Convertor - Then assigned again to TI - But there isn’t a major event in the TI Logbook

Integration with the AFT (2/2) Different breakdown structures in AFT and TI logbook increase the discrepancies. Data synchronization only in a manual – and error prone – way Functional incoherencies: Example: Ventilation Doors and Access system at the same functional level than the whole Technical service TI Logbook AFT

Integration with the InforEAM DB The System breakdown structure provide a high-level view InforDB allow to implement and structure the detailed dependency of all assets of the TI The InforEAM DB functional positions provide: the storage of the functional dependencies information the maintenance and consistency of models and the equipment TI Models TI Record of change Models information update Infrastructure change This integration is currently being defined with the InforEAM DB group

First release of the framework Work plan schedule System Breakdown Structure End 2017 System analysis Data structuring Development of procedures and tools for data mining; Definition of interfaces for the integration with the AFT Modelling for predictions Build, release of the framework; Framework and integration testing 2018 -2019 1st semester 2017 2nd semester 2016 Machine learning to self-adapt models to the changes of the TI to guide operation and interventions in order to reduce MTTR and recovery 2nd semester 2016 First release of the framework

Conclusion Monitoring and analysis of TI will be supported by a computer-based framework to: Provide representative performance indicators Provide tools for the analysis and monitoring Reduce analysis time Reduce manual – and error prone! – activities Streamline operation and intervention activities Maintain consistency with infrastructure evolution (by relying on data) Integration with existing CERN tools is a key activity in order to provide coherent analysis and interpretation of data