Data quality, or how to keep afloat in the growing data flood


1 Data quality, or how to keep afloat in the growing data flood
P. Sollander, CERN 16/4/2013 ARW2013

2 Outline
Control system architecture and data flows/floods
Data quality problems and consequences
False negatives
Unknown state
False positives
System and software strategies
Processes and procedures
Summary

3 Control system architecture
[Diagram: logging, application server, middleware (~1M / day), DAQ units (100+)]
What could possibly go wrong?

4 Control system architecture
[Diagram: logging, middleware (~1M / day), 100+]
What could possibly go wrong???

5 False negatives
No alarm ≠ no problem
11/1/11 – Big power cut at LHC
No network → no alarms → no problem?
Broken PLC-SCADA connection
Monitoring OK → operator confident → hours spent looking elsewhere
April , inundation alarm on LHC P5. Pumps stopped, but no alarm. The PLC to SCADA connection was not monitored…
Must be minimized, zero is impossible?
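The "no alarm ≠ no problem" point can be illustrated with a small heartbeat check. This is only a sketch, not the CERN implementation: `LinkWatchdog`, `alarm_state`, the timeout value and the state names `OK` / `ALARM` / `UNKNOWN` are all hypothetical names chosen for the example. The idea is that a silent source is reported as unknown instead of being read as "no alarm".

```python
from dataclasses import dataclass


@dataclass
class LinkWatchdog:
    """Tracks the last heartbeat seen from a data source (hypothetical helper)."""
    timeout_s: float        # assumed limit; a real value depends on the link
    last_heartbeat: float   # time of the last heartbeat, in seconds

    def heartbeat(self, now: float) -> None:
        # Record that the source was heard from at time `now`.
        self.last_heartbeat = now

    def is_alive(self, now: float) -> bool:
        # The link counts as alive only if a heartbeat arrived recently enough.
        return (now - self.last_heartbeat) <= self.timeout_s


def alarm_state(alarm_active: bool, link: LinkWatchdog, now: float) -> str:
    """Return 'UNKNOWN' when the link is silent, otherwise 'ALARM' or 'OK'."""
    if not link.is_alive(now):
        return "UNKNOWN"    # absence of an alarm proves nothing here
    return "ALARM" if alarm_active else "OK"
```

With a 10 s timeout, a source last heard at t = 0 reports `OK` at t = 5 but `UNKNOWN` at t = 30, even though no alarm was ever raised.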

6 Monitoring the system
Data Tag: Value, Timestamp, Quality
[Diagram: middleware (~1M / day), 100+]
What could possibly go wrong?
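A data tag carrying value, timestamp and quality could be modelled roughly as follows. This is a minimal sketch under assumptions: the class and method names (`DataTag`, `Quality`, `invalidated`, `is_trustworthy`) and the `name` field are hypothetical, and only the three fields and the `OK` / `NOK` quality values come from the slides.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from enum import Enum


class Quality(Enum):
    OK = "OK"
    NOK = "NOK"


@dataclass(frozen=True)
class DataTag:
    name: str            # hypothetical identifier, not shown on the slide
    value: object        # last value received from the source
    timestamp: datetime  # when the value or quality last changed
    quality: Quality     # whether the value can currently be trusted

    def invalidated(self, when: datetime) -> "DataTag":
        # Keep the last known value but mark it as doubtful.
        return replace(self, timestamp=when, quality=Quality.NOK)

    def is_trustworthy(self) -> bool:
        return self.quality is Quality.OK
```

Keeping the last value when the quality turns `NOK` lets a display show "closed?" rather than nothing at all.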

7 Indicating quality on alarms
Active alarms get [?] prefix
New alarm on faulty controls component
Help Alarm
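The two display rules on this slide can be sketched as one function. The function name, the `(source, text)` tuple layout and the "Controls fault:" wording are assumptions made for the example; only the `[?]` prefix and the extra alarm on the faulty component come from the slide.

```python
def render_alarm_list(active_alarms, faulty_sources):
    """Build the operator's alarm list.

    active_alarms:  list of (source, text) tuples (hypothetical layout)
    faulty_sources: set of source names whose controls component is faulty
    """
    lines = []
    # Rule 1: each faulty controls component raises its own alarm.
    for source in sorted(faulty_sources):
        lines.append(f"Controls fault: {source}")
    # Rule 2: active alarms coming from a faulty source get the [?] prefix.
    for source, text in active_alarms:
        prefix = "[?] " if source in faulty_sources else ""
        lines.append(f"{prefix}{text}")
    return lines
```

For example, with `PLC-A` faulty, an active "Pump stopped" alarm from `PLC-A` is shown as `[?] Pump stopped`, while alarms from healthy sources are shown unchanged.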

8 Indicating quality on synoptics

9 Indicating quality on applications

10 Acting on bad quality data
Indicate to operator
What about other applications using the data? Software Interlocks for example?

11 Panicky software interlocks
LHC Beam dump
Data Tag: Value: closed, Timestamp, Quality: OK
Data Tag: Value: closed?, Timestamp, Quality: NOK
[Diagram: 100+, middleware (~1M / day), Software Interlock System]
Reboot of an element
Software Interlock System tolerance for doubtful data
Reduce false positives by waiting a reasonable amount of time before taking action
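The "wait a reasonable amount of time" idea could look something like the sketch below. It is not the actual Software Interlock System logic: the class name, the `PERMIT` / `DUMP` decisions and the grace period are hypothetical. Doubtful data only triggers action once it has stayed doubtful for the whole grace period, so a brief reboot of an element does not dump the beam.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TolerantInterlock:
    grace_s: float                     # assumed tolerance for doubtful data
    bad_since: Optional[float] = None  # when the quality first went NOK

    def update(self, quality_ok: bool, value_safe: bool, now: float) -> str:
        if quality_ok:
            # Good data: forget any doubtful period and act on the value.
            self.bad_since = None
            return "PERMIT" if value_safe else "DUMP"
        if self.bad_since is None:
            self.bad_since = now       # start of the doubtful period
        if now - self.bad_since >= self.grace_s:
            return "DUMP"              # doubtful for too long: act
        return "PERMIT"                # still within tolerance: wait
```

Choosing `grace_s` is the real design question: too short and reboots cause false dumps, too long and a genuine loss of data goes unprotected.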

12 False positives
False alarms
1% of technical infrastructure alarms are real!
Easy to miss an important one
24/1/2007 – Constant false alarms mask one real alarm → 400 kV breaker trips, 7 hours to switch everything back

13 Software strategies
Software Interlock System tolerance for doubtful data
Reduce false positives by waiting a reasonable amount of time before taking action
Add indications of bad quality, [?] and color

14 Operator strategies
Wait to see if the alarm stays?
Check the trend
A poor reading shows up as a brief 0 value.
Diagnose with good tools
Worth investing in good tools
1% real alarms for CERN’s technical infrastructure
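"Wait to see if the alarm stays" and "check the trend" amount to a persistence check. The sketch below is one assumed way to express it; the function name, threshold and sample count are hypothetical. A single 0 in an otherwise normal trend is ignored, while a run of low readings is treated as real.

```python
def persistent_low(readings, threshold, min_samples):
    """Return True only if the last `min_samples` readings are all below `threshold`.

    A single brief 0 caused by a poor reading does not satisfy this,
    so it is not treated as a real low-value condition.
    """
    if len(readings) < min_samples:
        return False   # not enough history to judge the trend
    return all(r < threshold for r in readings[-min_samples:])
```

For instance, `[5.1, 5.0, 0.0]` with a threshold of 1.0 and three required samples is not persistent, whereas `[5.0, 0.0, 0.0, 0.0]` is.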

15 Processes to improve quality
Alarm and data configuration process
Every alarm checked by operation
Long and tedious
Cannot work without it
Test procedures
Correction procedures
Operating instructions
HelpAlarm
Diagnostic tools in system

16 Data integration process
Create request: Equipment group
System check: Computerized
Data check: Operators
Data validation, Tests: Equipment group and Operators

17 Summary
CERN technical infrastructure system is huge, a million alarms per year!
Control system is event based
False negatives – reduced by thorough monitoring of the system itself, diagnostic tools
False positives – reduced mainly by procedure
Strict integration rules, testing, correction, etc.
