QC database requirements and tools in Run 3
B. von Haller, CERN, 21.07.2017

Outline
- Scope
  - Run 3 vs Run 2
  - Database vs client
  - QC-specific database(s) vs aggregated data database(s)
- Architecture
- Requirements
- Possible solutions
B. von Haller | WP7 | 21.07.2017

Reminder

[Architecture diagram; labels recovered from the slide]
- Clients: generic client (shifters, experts), specific clients (experts)
- Interfaces between the clients and the repositories
- QC repository (annotated: "Actually repositories ?")
- “Raw QC” (~Histos + metadata*), from sync and async QC tasks
- “Derived QC” (trending and correlation)
- “ALICE Aggregated data” (QC, logbook, CCDB, …)

Requirements
- Types of data
- Amount of data
- Sources
- Access
- Preservation and backups
Is there something missing? Each point is reviewed in the next slides.

Requirements: Types of data
- MonitorObjects (MO): TObject (mostly histograms) with metadata (e.g. quality, source...), already merged
- Trending: derived data in the form of histograms, graphs or trees (to be clarified), with metadata (e.g. source MO)
- Correlations: derived data in the form of histograms (to be confirmed). Do they actually need to be stored, or could they be generated in memory on the fly?

Requirements: Amount of data
- 25000 MOs updated every minute (i.e. a new version comes in every minute)
  - From survey: 10000 (but we don’t believe it)
  - 50% to be kept for 1 month, 40% for 1 year max, 10% forever
  - 1 MO between 550b and 50MB, average 250 kB (online)
- Trending: at least 15 detectors * 10 objects/detector = 150
  - To be kept forever
- Correlation: ?
- Inserts/updates: > 400 Hz, 100 MB/s
- 6 GB per run initially (3 GB after 1 month, 0.6 GB after 1 year)
- 2016: 2300 global runs with recording, 2500 standalone runs with recording
- 2.8 TB per year to be kept forever, 3 TB for last month, 5 TB for last 6 months
  - (Actually a lot less: standalone runs have 1/15 of the data of a global run)
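The rates and volumes on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check, not the author's calculation. It assumes decimal units, and it assumes that "6 GB per run" means one version of every MO at the average size. The 3 TB and 5 TB figures are not reproduced because the slide does not say how they were obtained.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
# Assumption: decimal units (1 MB = 1000 kB, 1 GB = 1e6 kB, 1 TB = 1000 GB).

MOS_PER_MINUTE = 25_000   # MOs updated every minute (from the slide)
AVG_MO_SIZE_KB = 250      # average MO size, online (from the slide)

# Insert rate and throughput: roughly 417 Hz and 104 MB/s,
# consistent with the quoted "> 400 Hz, 100 MB/s".
insert_rate_hz = MOS_PER_MINUTE / 60
throughput_mb_s = insert_rate_hz * AVG_MO_SIZE_KB / 1000

# Assumption: "6 GB per run" is one version of every MO at the average size.
per_run_gb = MOS_PER_MINUTE * AVG_MO_SIZE_KB / 1e6        # 6.25 GB

# Retention split from the slide: 50% for 1 month, 40% for 1 year max, 10% forever.
KEEP_1_MONTH, KEEP_1_YEAR, KEEP_FOREVER = 0.5, 0.4, 0.1
after_1_month_gb = per_run_gb * (KEEP_1_YEAR + KEEP_FOREVER)   # about 3 GB
after_1_year_gb = per_run_gb * KEEP_FOREVER                    # about 0.6 GB

# Yearly volume kept forever, using the 2016 run counts and the slide's rounded 0.6 GB.
GLOBAL_RUNS, STANDALONE_RUNS = 2300, 2500
forever_tb_per_year = (GLOBAL_RUNS + STANDALONE_RUNS) * 0.6 / 1000   # about 2.9 TB

# Same figure if a standalone run carries 1/15 of the data of a global run.
forever_tb_corrected = (GLOBAL_RUNS + STANDALONE_RUNS / 15) * 0.6 / 1000   # about 1.5 TB

print(f"{insert_rate_hz:.0f} Hz, {throughput_mb_s:.0f} MB/s, {per_run_gb:.2f} GB per run")
print(f"{forever_tb_per_year:.2f} TB/year forever, {forever_tb_corrected:.2f} TB/year corrected")
```

Under these assumptions the rounded results agree with the slide's 400 Hz, 100 MB/s, 6 GB, 3 GB, 0.6 GB and roughly 2.8 TB figures, and the corrected value illustrates the "actually a lot less" remark.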

Requirements: Sources
- Mergers running inside the O2 farm (getting their data from QC tasks and other O2 devices)
- Processes running outside the O2 farm when offloading synchronous processing and asynchronous final processing

Requirements: Access
- The results of the QC must be available worldwide
- A well-defined and stable interface hides the underlying technology
- Access is limited to the members of the Collaboration. Public access shall be granted to a selected set of interesting data and results for Public Relations (PR).
- Data should be queryable (filters at least)
- SWAN support (?)
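One way to picture "a stable interface hides the underlying technology" together with "queryable (filters at least)" is the minimal sketch below. All names in it (MonitorObject, QcRepository, InMemoryRepository, the metadata keys and values) are hypothetical and chosen for illustration; they are not the actual QC interface, which the slides leave to be defined.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class MonitorObject:
    """Hypothetical stand-in for an MO: a serialized payload plus metadata."""
    name: str
    payload: bytes
    metadata: dict = field(default_factory=dict)


class QcRepository(ABC):
    """Hypothetical client-facing interface; clients never see the storage technology."""

    @abstractmethod
    def store(self, mo: MonitorObject) -> None: ...

    @abstractmethod
    def query(self, **filters) -> list: ...


class InMemoryRepository(QcRepository):
    """Toy backend used only to exercise the interface."""

    def __init__(self) -> None:
        self._objects: list = []

    def store(self, mo: MonitorObject) -> None:
        self._objects.append(mo)

    def query(self, **filters) -> list:
        # Keep the objects whose metadata matches every requested filter.
        return [
            mo for mo in self._objects
            if all(mo.metadata.get(key) == value for key, value in filters.items())
        ]


repo: QcRepository = InMemoryRepository()
repo.store(MonitorObject("TPC/clusters", b"...", {"detector": "TPC", "quality": "good"}))
repo.store(MonitorObject("ITS/occupancy", b"...", {"detector": "ITS", "quality": "bad"}))

tpc_objects = repo.query(detector="TPC")
print([mo.name for mo in tpc_objects])
```

Client code written against QcRepository would not need to change if the toy backend were replaced by a file-based, SQL or noSQL one.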

Requirements: Preservation and backups
- QC data is to be kept forever or for a limited duration, depending on the detectors and the tasks. It should therefore support schema evolution.
- Backups must ensure that data can be recovered at any time in case of major failure
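The "forever or for a limited duration" rule could be expressed as a per-object retention class. The sketch below is an assumption-laden illustration: the class names are hypothetical, the durations follow the 1 month / 1 year / forever split quoted in the amount-of-data requirements, and a month is approximated as 30 days.

```python
from datetime import datetime, timedelta

# Hypothetical retention classes; None means "keep forever".
RETENTION = {
    "short": timedelta(days=30),     # "1 month", approximated as 30 days
    "medium": timedelta(days=365),   # "1 year max"
    "forever": None,
}


def is_expired(created: datetime, retention_class: str, now: datetime) -> bool:
    """Return True if an object of the given class may be deleted at time `now`."""
    lifetime = RETENTION[retention_class]
    return lifetime is not None and now - created > lifetime


created = datetime(2017, 1, 1)
now = datetime(2017, 7, 21)
for retention_class in RETENTION:
    print(retention_class, is_expired(created, retention_class, now))
```

A cleanup job could apply such a check periodically. How the class is assigned per detector and task is left open in the slides.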

Solutions
- File-based database
- SQL database
- noSQL database
- CCDB
- HDFS
Is there something missing? Each point is reviewed in the next slides.

Solutions: File-based
- Data in (ROOT) files [and metadata in a DB on top]
- Current scheme for Run 2 FXS, OCDB, offline QA
- Used in Overwatch prototype
- Concerns
  - Non-atomic operations
  - Scaling (number of files and load on metadata server)
  - Archiving not trivial

Solutions: SQL database
- Metadata and data (blob) are stored in an SQL database
- Currently used in: DQM (MySQL)
- A prototype for QC exists and has been benchmarked
- Concerns
  - Backing up a large, continuously used DB is not trivial
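The "metadata and data (blob) in one SQL database" layout can be sketched as follows. This is not the schema of the existing DQM database or of the QC prototype. SQLite from the Python standard library is used here as a stand-in for MySQL so that the example is self-contained, and the table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE monitor_object (
        name    TEXT    NOT NULL,   -- object name, e.g. detector/task/histogram
        version INTEGER NOT NULL,   -- incremented at every update
        quality TEXT,               -- metadata column
        source  TEXT,               -- metadata column
        data    BLOB    NOT NULL,   -- serialized object
        PRIMARY KEY (name, version)
    )
    """
)


def store(name: str, quality: str, source: str, data: bytes) -> int:
    """Insert a new version of an object and return its version number."""
    with conn:  # commits on success, rolls back on error
        (version,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) + 1 FROM monitor_object WHERE name = ?",
            (name,),
        ).fetchone()
        conn.execute(
            "INSERT INTO monitor_object VALUES (?, ?, ?, ?, ?)",
            (name, version, quality, source, data),
        )
    return version


def latest(name: str):
    """Return (version, quality, data) of the most recent version, or None."""
    return conn.execute(
        "SELECT version, quality, data FROM monitor_object "
        "WHERE name = ? ORDER BY version DESC LIMIT 1",
        (name,),
    ).fetchone()


store("TPC/clusters", "good", "merger-1", b"\x00\x01")
store("TPC/clusters", "bad", "merger-1", b"\x02")
print(latest("TPC/clusters"))
```

Keeping every version as a separate row matches the "a new version comes in every minute" requirement, and it is also what makes such a table grow quickly and hard to back up while in use.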

Solutions: noSQL database
- Store metadata and data
- Maybe not for the Raw QC but for the Derived QC
- Prototype exists for trending (ElasticSearch)
- Concerns
  - Querying can be complex
  - Data format conversion needed for storing and retrieving
  - Size on disk (?)

Solutions: CCDB
- Do not solve the problem ourselves but rely on the CCDB
- Either the main CCDB or a dedicated QC CCDB
- Concerns
  - Are our requirements possible with the CCDB as envisaged?
  - How to use it on a development system (detector expert station)?

HDFS
- I don’t know this system but it seems promising.

Summary
- Clarification on the various subsystems involved
- Requirements for the Raw and Derived Database(s)
- List of possible solutions
- Discussion
- Then: interface definition, more prototyping with possibility to switch between backends
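The last point, prototyping "with possibility to switch between backends", could look roughly like the sketch below: backends share the same two methods and are selected by name, for instance from a configuration value. The class names, the registry and the key format are hypothetical, not part of any existing prototype.

```python
import tempfile
from pathlib import Path


class MemoryBackend:
    """Toy backend keeping payloads in a dictionary."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, payload: bytes) -> None:
        self._data[key] = payload

    def get(self, key: str) -> bytes:
        return self._data[key]


class FileBackend:
    """Toy backend writing one file per key into a temporary directory."""

    def __init__(self) -> None:
        self._root = Path(tempfile.mkdtemp())

    def _path(self, key: str) -> Path:
        # Flatten the key so that it maps to a single file name.
        return self._root / key.replace("/", "__")

    def put(self, key: str, payload: bytes) -> None:
        self._path(key).write_bytes(payload)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


# Hypothetical registry: the backend is chosen by name.
BACKENDS = {"memory": MemoryBackend, "file": FileBackend}


def make_backend(name: str):
    return BACKENDS[name]()


def roundtrip(backend_name: str) -> bytes:
    """Store and retrieve one payload through the selected backend."""
    backend = make_backend(backend_name)
    backend.put("TPC/clusters", b"histogram bytes")
    return backend.get("TPC/clusters")


for name in BACKENDS:
    print(name, roundtrip(name))
```

The same calling code runs against either backend, which is the property that makes it possible to benchmark the candidate solutions side by side.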

If you want to participate in these discussions but were not involved until now: send me an email!
Requirements work document: https://docs.google.com/document/d/1npTUMwosuwHnvdD3Mb9hZRTBwG5TWAgsGUlUvp22jAk