Published by Rolf Rose. Modified over 6 years ago.
1
Data quality, or how to keep afloat in the growing data flood
P. Sollander, CERN 16/4/2013 ARW2013
2
Outline
Control system architecture and data flows/floods
Data quality problems and consequences
  False negatives
  Unknown state
  False positives
System and software strategies
Processes and procedures
Summary
3
Control system architecture
[Diagram: 100+ DAQ units, middleware (~1M / day), application server, logging]
What could possibly go wrong?
4
Control system architecture
[Diagram: the same architecture, with possible failure points marked ✗]
What could possibly go wrong???
5
False negatives
No alarm ≠ no problem
11/1/11 – Big power cut at LHC
  No network → no alarms → no problem?
Broken PLC-SCADA connection
  Monitoring OK → operator confident → hours spent looking elsewhere
  April , inundation alarm on LHC P5. Pumps stopped, but no alarm. The PLC to SCADA connection was not monitored…
Must be minimized; zero is impossible?
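One way to avoid reading silence as "no problem" is to monitor the link itself. The sketch below is a minimal illustration, not the system described in the talk: the class `ConnectionWatchdog`, the heartbeat idea, the link name and the timeout are all assumptions. A link that has not been heard from within the timeout is reported as UNKNOWN rather than OK.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ConnectionWatchdog:
    """Hypothetical watchdog: remembers when each link last sent a heartbeat."""
    timeout_s: float
    last_seen: Dict[str, float] = field(default_factory=dict)

    def heartbeat(self, link: str, now: float) -> None:
        # Record that the link (e.g. a PLC-to-SCADA connection) is alive.
        self.last_seen[link] = now

    def state(self, link: str, now: float) -> str:
        # A link never seen, or silent for longer than the timeout,
        # is UNKNOWN: the absence of alarms from it proves nothing.
        seen = self.last_seen.get(link)
        if seen is None or now - seen > self.timeout_s:
            return "UNKNOWN"
        return "OK"


watchdog = ConnectionWatchdog(timeout_s=10.0)
watchdog.heartbeat("plc-p5", now=100.0)   # "plc-p5" is an illustrative name
print(watchdog.state("plc-p5", now=105.0))
print(watchdog.state("plc-p5", now=111.0))
```

An UNKNOWN state can then be raised as an alarm of its own, so the operator is told that the monitoring is blind instead of being reassured by an empty alarm list.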
6
Monitoring the system
Data Tag: Value, Timestamp, Quality
[Diagram: 100+ DAQ units, middleware (~1M / day)]
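The data tag on this slide carries a value, a timestamp and a quality. A minimal sketch of such a record is shown below. The field `name`, the `Quality` enum and the method `is_trustworthy` are assumptions for illustration; the OK and NOK labels follow the wording used later in the talk.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class Quality(Enum):
    OK = "OK"
    NOK = "NOK"


@dataclass(frozen=True)
class DataTag:
    """Hypothetical data tag: every value travels with its timestamp and quality."""
    name: str          # assumed identifier, not given on the slide
    value: Any
    timestamp: float   # seconds since the epoch, as an assumption
    quality: Quality

    def is_trustworthy(self) -> bool:
        # Consumers should check quality before acting on the value.
        return self.quality is Quality.OK


tag = DataTag(name="valve-1", value="closed", timestamp=0.0, quality=Quality.NOK)
print(tag.is_trustworthy())
```

Keeping the quality next to the value means every consumer downstream of the middleware can tell a confirmed reading from a doubtful one.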
7
Indicating quality on alarms
Active alarms get a [?] prefix
New alarm on faulty controls component
Help Alarm
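The [?] prefix can be sketched as a small formatting step applied when the alarm list is built. The function name `alarm_label` and its parameters are hypothetical; only the prefix itself comes from the slide.

```python
def alarm_label(text: str, quality_ok: bool) -> str:
    """Return the alarm text, prefixed with [?] when its source is in doubt."""
    return text if quality_ok else f"[?] {text}"


print(alarm_label("Pump stopped", quality_ok=True))
print(alarm_label("Pump stopped", quality_ok=False))
```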
8
Indicating quality on synoptics
9
Indicating quality on applications
10
Acting on bad quality data
Indicate to operator
What about other applications using the data? Software interlocks, for example?
11
Panicky software interlocks
[Diagram: Data Tag (Value: closed, Timestamp, Quality: OK) and Data Tag (Value: closed?, Timestamp, Quality: NOK) reaching the Software Interlock System through the middleware (~1M / day, 100+); reboot of an element; LHC beam dump]
Software Interlock System tolerance for doubtful data
Reduce false positives by waiting a reasonable amount of time before taking action
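The tolerance described here can be sketched as a grace period: a confirmed bad value triggers at once, while doubtful data only triggers if it stays doubtful for longer than a chosen time. The class `TolerantInterlock`, its parameters and the 30-second figure are assumptions for illustration, not the logic of the actual Software Interlock System.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TolerantInterlock:
    """Hypothetical interlock that tolerates doubtful data for grace_s seconds."""
    grace_s: float
    bad_since: Optional[float] = None

    def update(self, value_ok: bool, quality_ok: bool, now: float) -> bool:
        """Return True if the interlock should trigger."""
        if quality_ok:
            # Good quality: trust the value and forget any earlier doubt.
            self.bad_since = None
            return not value_ok
        # Doubtful data: start (or continue) the grace period.
        if self.bad_since is None:
            self.bad_since = now
        return now - self.bad_since >= self.grace_s


interlock = TolerantInterlock(grace_s=30.0)
print(interlock.update(value_ok=True, quality_ok=False, now=0.0))   # element rebooting
print(interlock.update(value_ok=True, quality_ok=True, now=12.0))   # element back
```

The grace period is a trade-off: too short and a routine reboot still causes a trip, too long and a real fault hidden behind bad quality goes unanswered.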
12
False positives: false alarms
1% of technical infrastructure alarms are real!
Easy to miss an important one
24/1/2007 – Constant false alarms mask one real alarm
  400 kV breaker trips, 7 hours to switch everything back
13
Software strategies
Software Interlock System tolerance for doubtful data
  Reduce false positives by waiting a reasonable amount of time before taking action
Add indications of bad quality: [?] prefix and color
14
Operator strategies
Wait to see if the alarm stays?
Check the trend
  A poor reading gives a brief 0 reading.
Diagnose with good tools
  Worth investing in good tools
  1% real alarms for CERN’s technical infrastructure
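Checking the trend for a brief 0 reading can be partly automated in a diagnostic tool. The sketch below is one possible approach under stated assumptions: the function name `brief_zero_glitches`, the `max_len` threshold and the sample values are all illustrative. It flags short runs of zeros that sit between non-zero readings, which is the signature of a poor reading rather than a real stop.

```python
from typing import List, Sequence


def brief_zero_glitches(samples: Sequence[float], max_len: int = 1) -> List[int]:
    """Return indices of zero readings in runs of at most max_len samples
    that have a non-zero reading on both sides."""
    glitches: List[int] = []
    i, n = 0, len(samples)
    while i < n:
        if samples[i] != 0:
            i += 1
            continue
        start = i
        while i < n and samples[i] == 0:
            i += 1
        # A run touching either end of the window is not judged:
        # we cannot tell how long it really lasted.
        bounded = start > 0 and i < n
        if bounded and i - start <= max_len:
            glitches.extend(range(start, i))
    return glitches


print(brief_zero_glitches([5.1, 5.0, 0.0, 5.2, 5.1]))
```

A longer run of zeros is deliberately left alone, since a sustained 0 may be a genuine stop that the operator needs to see.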
15
Processes to improve quality
Alarm and data configuration process
  Every alarm checked by operation
  Long and tedious
  Cannot work without it
Test procedures
Correction procedures
Operating instructions
  HelpAlarm
  Diagnostic tools in system
16
Data integration process
Create request: Equipment group
System check: Computerized
Data check: Operators
Data validation
Tests: Equipment group and Operators
17
Summary
CERN technical infrastructure system is huge, a million alarms per year!
Control system is event based
False negatives – reduced by thorough monitoring of the system itself, diagnostic tools
False positives – reduced mainly by procedure
  Strict integration rules, testing, correction, etc.