QC-specific database(s) vs aggregated data database(s) Outline


0 QC database requirements and tools in Run 3
B. von Haller, CERN

1 Outline
Scope
  Run 3 vs Run 2
  Database vs client
  QC-specific database(s) vs aggregated data database(s)
Architecture
Requirements
Possible solutions
B. von Haller | WP7 |

2 Reminder

3 QC repository
Actually repositories?
Clients
  Generic client (shifters, experts)
  Specific clients (experts)
Repositories, each behind an interface
  “ALICE Aggregated data” (QC, logbook, CCDB, …)
  “Raw QC” (~histos + metadata*), from sync and async QC tasks
  “Derived QC” (trending and correlation)

4 Requirements
Types of data
Amount of data
Sources
Access
Preservation and backups
Is there something missing?
Review of each point in next slides

5 Requirements: Types of data
MonitorObjects (MO): TObject (mostly histos) with metadata (e.g. quality, source...), already merged
Trending: derived data in the form of histos, graphs or trees (to be clarified), with metadata (e.g. source MO)
Correlations: derived data in the form of histos (to be confirmed). Does it actually need to be stored, or could it be generated in memory on the fly?
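As a rough illustration of the data types listed above, the following Python sketch models a MonitorObject and a trending entry as plain records. The class and field names (`MonitorObject`, `TrendingEntry`, `payload`, `source_mos`) and the `make_trend` helper are hypothetical and chosen only for illustration: the slides do not define a schema, and the real objects are ROOT TObjects rather than byte strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MonitorObject:
    """Hypothetical record for one version of a MonitorObject (MO)."""
    name: str
    payload: bytes  # stand-in for a serialized TObject (mostly histos)
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. quality, source


@dataclass
class TrendingEntry:
    """Hypothetical derived object that remembers which MOs it came from."""
    name: str
    values: List[float]
    source_mos: List[str] = field(default_factory=list)  # metadata: source MO names


def make_trend(name: str, mos: List[MonitorObject]) -> TrendingEntry:
    """Toy trending: follow the payload size across successive MO versions."""
    return TrendingEntry(
        name=name,
        values=[float(len(mo.payload)) for mo in mos],
        source_mos=[mo.name for mo in mos],
    )
```

Keeping the source MO names on the derived record mirrors the "metadata (e.g. source MO)" requirement. Whether correlations need a stored record at all is left open, as on the slide.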

6 Requirements: Amount of data
25000 MOs updated every minute (i.e. a new version comes in every minute)
From survey: (but we don’t believe it)
  50% to be kept for 1 month, 40% for 1 year max, 10% forever
  1 MO between 550 B and 50 MB, average 250 kB (online)
Trending: at least 15 detectors * 10 objects/detector = 150
  To be kept forever
Correlation: ?
Inserts/updates: > 400 Hz, 100 MB/s
6 GB per run initially (3 GB after 1 month, 0.6 GB after 1 year)
2016: 2300 global runs with recording, 2500 standalone runs with recording
2.8 TB per year to be kept forever, 3 TB for last month, 5 TB for last 6 months
  (Actually a lot less: standalone runs have 1/15 of the data of a global run)
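Several of the figures on this slide can be rederived from the others. The sketch below does that arithmetic in Python. It assumes decimal units (1 MB = 1000 kB), and it assumes that the "6 GB per run" figure counts one stored version of every MO; the slide does not state that, it is simply the reading under which the numbers agree. All function names are made up for this example.

```python
# Inputs taken from the slide.
MOS_PER_MINUTE = 25_000
AVG_MO_SIZE_KB = 250
GLOBAL_RUNS = 2300
STANDALONE_RUNS = 2500
GB_PER_RUN_AFTER_1_YEAR = 0.6


def insert_rate_hz(mos_per_minute: float = MOS_PER_MINUTE) -> float:
    """One new version of every MO per minute, expressed as inserts per second."""
    return mos_per_minute / 60


def throughput_mb_per_s(mos_per_minute: float = MOS_PER_MINUTE,
                        avg_kb: float = AVG_MO_SIZE_KB) -> float:
    """Average write bandwidth, using 1 MB = 1000 kB."""
    return insert_rate_hz(mos_per_minute) * avg_kb / 1000


def snapshot_size_gb(n_mos: float = MOS_PER_MINUTE,
                     avg_kb: float = AVG_MO_SIZE_KB) -> float:
    """Size of one version of every MO (assumed meaning of '6 GB per run')."""
    return n_mos * avg_kb / 1e6


def kept_forever_tb(standalone_fraction: float = 1.0) -> float:
    """Yearly volume kept forever; standalone runs are weighted by the given fraction."""
    equivalent_runs = GLOBAL_RUNS + STANDALONE_RUNS * standalone_fraction
    return equivalent_runs * GB_PER_RUN_AFTER_1_YEAR / 1000
```

With the slide's inputs this gives about 417 Hz and about 104 MB/s, matching "> 400 Hz, 100 MB/s", and 6.25 GB for one version of every MO. `kept_forever_tb()` gives about 2.9 TB per year, in line with the quoted 2.8 TB, while `kept_forever_tb(1 / 15)` gives about 1.5 TB, which is the "a lot less" case where standalone runs carry 1/15 of the data of a global run.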

7 Requirements: Sources
Mergers running inside the O2 farm (getting their data from QC tasks and other O2 devices)
Processes running outside the O2 farm when offloading synchronous processing and asynchronous final processing

8 Requirements: Access
The results of the QC must be available worldwide
A well-defined and stable interface hides the underlying technology
Access limited to the members of the Collaboration. Public access shall be granted to a selected set of interesting data and results for Public Relations (PR).
Data should be queryable (filters at least)
SWAN support (?)
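To make the "stable interface hides the underlying technology" and "queryable (filters at least)" points concrete, here is a minimal Python sketch. The `QcRepository` interface, its `store` and `query` methods, and the `InMemoryRepository` backend are hypothetical names invented for this example; the slides do not specify an API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class QcRepository(ABC):
    """Hypothetical stable interface; clients never see the backend behind it."""

    @abstractmethod
    def store(self, name: str, payload: bytes, metadata: Dict[str, str]) -> None:
        ...

    @abstractmethod
    def query(self, filters: Optional[Dict[str, str]] = None) -> List[dict]:
        ...


class InMemoryRepository(QcRepository):
    """Toy backend; a file-based, SQL or noSQL backend would implement the same methods."""

    def __init__(self) -> None:
        self._rows: List[dict] = []

    def store(self, name: str, payload: bytes, metadata: Dict[str, str]) -> None:
        self._rows.append({"name": name, "payload": payload, "metadata": dict(metadata)})

    def query(self, filters: Optional[Dict[str, str]] = None) -> List[dict]:
        # A row matches when every requested metadata key has the requested value.
        filters = filters or {}
        return [
            row for row in self._rows
            if all(row["metadata"].get(key) == value for key, value in filters.items())
        ]
```

Client code written against `QcRepository` would not change if the backend were swapped, which is the property the requirement asks for. Access control and public access are outside the scope of this sketch.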

9 Requirements: Preservation and backups
QC data is to be kept forever or for a limited duration, depending on the detectors and the tasks. It should therefore support schema evolution.
Backups must ensure that data can be recovered at any time in case of major failure
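The "forever or for a limited duration, depending on the detectors and the tasks" rule can be pictured as a lookup table keyed by detector and task. The Python sketch below is only an illustration under that assumption: the policy table layout, the `is_expired` function and the choice of "keep forever" as the default are not from the slides.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

# Hypothetical policy table: (detector, task) -> retention period.
# None means "keep forever".
RetentionPolicy = Dict[Tuple[str, str], Optional[timedelta]]


def is_expired(detector: str, task: str, stored_at: datetime, now: datetime,
               policy: RetentionPolicy,
               default: Optional[timedelta] = None) -> bool:
    """Return True when the object has outlived its retention period."""
    period = policy.get((detector, task), default)
    if period is None:
        # No limit configured: the object is kept forever.
        return False
    return now - stored_at > period
```

Defaulting to "keep forever" for unknown detector and task pairs is a conservative choice made here so that a missing entry never causes data loss; a real system might decide otherwise.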

10 Solutions
File-based database
SQL database
noSQL database
CCDB
HDFS
Is there something missing?
Review of each point in next slides

11 Solutions: File-based
Data in (ROOT) files [and metadata in a DB on top]
Current scheme for Run 2
  FXS, OCDB, offline QA
Used in Overwatch prototype
Concerns
  Non-atomic operations
  Scaling (number of files and load on metadata server)
  Archiving not trivial

12 Solutions: SQL database
Metadata and data (blob) are stored in an SQL database
Currently used in: DQM (MySQL)
A prototype for QC exists and has been benchmarked
Concerns:
  Backing up is not trivial on a large DB in continuous use
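The "metadata and data (blob) in one SQL database" layout can be sketched as follows. This example uses Python's built-in `sqlite3` module purely so that it runs anywhere; the slide names MySQL for the existing DQM system, and the table name, column names and helper functions here are hypothetical rather than the prototype's schema.

```python
import sqlite3
from typing import Optional, Tuple


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS monitor_objects (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               name TEXT NOT NULL,
               quality TEXT,
               source TEXT,
               data BLOB NOT NULL
           )"""
    )
    return conn


def insert_mo(conn: sqlite3.Connection, name: str, quality: str,
              source: str, data: bytes) -> int:
    """Store one MO version; metadata columns and the blob share one row."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO monitor_objects (name, quality, source, data) "
            "VALUES (?, ?, ?, ?)",
            (name, quality, source, data),
        )
    return cur.lastrowid


def latest_mo(conn: sqlite3.Connection, name: str) -> Optional[Tuple[str, str, bytes]]:
    """Return (quality, source, data) of the newest version, or None."""
    return conn.execute(
        "SELECT quality, source, data FROM monitor_objects "
        "WHERE name = ? ORDER BY id DESC LIMIT 1",
        (name,),
    ).fetchone()
```

Because each insert is a single transaction, a version is either fully stored or not stored at all. The backup concern raised on the slide is not addressed by this sketch.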

13 Solutions: noSQL database
Store metadata and data
Maybe not for the Raw QC but for the Derived QC
Prototype exists for trending (ElasticSearch)
Concerns:
  Querying can be complex
  Data format conversion needed for storing and retrieving
  Size on disk (?)
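The "data format conversion" concern can be illustrated with a small round trip between a histogram-like object and a JSON document, which is the kind of payload a document store accepts. This is a sketch under assumptions: the document layout and the function names are invented here and are not taken from the ElasticSearch prototype.

```python
import json
from typing import Dict, List


def histogram_to_document(name: str, bin_edges: List[float],
                          counts: List[float], metadata: Dict[str, str]) -> str:
    """Convert a 1D histogram into a JSON document (hypothetical layout)."""
    if len(bin_edges) != len(counts) + 1:
        raise ValueError("a 1D histogram needs one more bin edge than counts")
    return json.dumps({
        "name": name,
        "bin_edges": bin_edges,
        "counts": counts,
        "metadata": metadata,
    })


def document_to_histogram(document: str) -> dict:
    """Inverse conversion, needed on every retrieval."""
    return json.loads(document)
```

Both directions have to be written and maintained for each object type, and a text encoding of the bin contents is typically larger than a binary one, which relates to the "size on disk" question on the slide.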

14 Solutions: CCDB
Do not solve the problem ourselves but rely on the CCDB
Either the main CCDB or a dedicated QC CCDB
Concerns
  Are our requirements possible with the CCDB as envisaged?
  How to use it on a development system (detector expert station)?

15 HDFS
I don’t know this system but it seems promising.

16 Summary
Clarification on the various subsystems involved
Requirements for the Raw and Derived Database(s)
List of possible solutions
Discussion
Then: interface definition, more prototyping with possibility to switch between backends

17 If you want to participate in these discussions but were not involved until now: send me an email!
Requirements work document:

