ATLAS Offline Database Architecture for Time-varying Data, with Requirements for the Common Project David M. Malon LCG Conditions Database Workshop CERN,


ATLAS Offline Database Architecture for Time-varying Data, with Requirements for the Common Project David M. Malon LCG Conditions Database Workshop CERN, Geneva, Switzerland 8 December 2003

David M. Malon, ANL; LCG Conditions Database Workshop

Architectural principles
- All data with a time interval (or run interval) of validity are managed via the same temporal database infrastructure.
  - Sometimes people distinguish between conditions and configurations and other kinds of detector description, but (offline) users see no difference in the machinery one uses to get the conditions or the configuration in effect when an event was taken.
- We refer to the underlying database infrastructure as an interval of validity database (IOV database) rather than a conditions database for two reasons:
  - so as not to prejudge the types of data accessible via this means, and
  - because it is principally a temporal database: conditions data may not reside within this database, but rather may be stored externally to the database hosting the interval-of-validity infrastructure.

Architectural principles
- It must be possible to assign an interval of validity to any data object accessible to standard execution frameworks (Athena, for ATLAS), independent of the technology used to store that object.
- Storing an object and assigning a validity interval to it may be (widely) separated in time.
  - Example 1: Online, a configuration may be chosen from a portfolio of stored configurations, each with no inherent interval of validity.
  - Example 2: The expert who updates the muon geometry does not know the range of simulation runs for which it will be used.

Architectural principles
- It must not be necessary to copy an object in order to assign an interval of validity to it.
  - Example 1: A reference to the configuration selected online, which may reside in a relational database, is registered with a range of test beam runs as the interval of validity.
  - Example 2: A reference to the muon geometry, which may be described in an XML file, is registered with a range of simulation runs as the interval of validity.
  - Example 3: If I use the same configuration as in Example 1 or the same geometry as in Example 2 for a later range of runs, I should not need to write it a second time.
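
The "register a reference, do not copy the object" principle can be sketched as follows. This is only an illustration: the class names, the reference format, and the in-memory list standing in for the IOV database are all assumptions, not part of any ATLAS or common-project interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Registration:
    """One IOV entry: a reference string plus the runs it is valid for."""
    ref: str          # reference to the stored object, not the object itself
    first_run: int    # inclusive
    last_run: int     # inclusive
    tag: str


@dataclass
class IOVRegistry:
    """Hypothetical in-memory stand-in for an IOV database."""
    entries: list = field(default_factory=list)

    def register(self, ref, first_run, last_run, tag="HEAD"):
        # Only the reference is recorded; the payload stays where it was written.
        entry = Registration(ref, first_run, last_run, tag)
        self.entries.append(entry)
        return entry


registry = IOVRegistry()
geometry_ref = "xml:/geometry/muon_version_P.xml"   # made-up reference format

# The same stored geometry is registered for two separate run ranges.
registry.register(geometry_ref, 1000, 1999, tag="simulation")
registry.register(geometry_ref, 5000, 5999, tag="simulation")

distinct_payloads = {e.ref for e in registry.entries}
print(len(registry.entries), "registrations,", len(distinct_payloads), "stored object")
```

Reusing the geometry for a later run range (Example 3) costs one more registration and no second write of the geometry itself.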

Registration and mediation
- The IOV service is, principally, a registration service and a mediator.
- An object may be stored in any supported technology (ROOT, POOL ROOT, MySQL{NOVA}, plain ASCII or XML files, strings, …) and later registered in the IOV database.
  - This does not mean that all technology choices are equally sensible for all purposes.
  - Storing the data object in the temporal database itself is one important possibility, but it is an optimization choice; it must not be a design limitation.
- LHC experiments already know how to store complex objects
  - In ROOT directly, via POOL, …
  - Should not be required to solve this problem again in order to use an IOV service
- Registration means associating an interval of validity, a tag/version, …, with (a reference to) the object.

Mediation
- On input, the IOV service mediates access to data, helping to choose the correct instance of a data object: the one with the correct timestamp and tag.
- In Athena, the transient IOV service
  1. checks the current event timestamp as appropriate
  2. consults the IOV database to get a reference to the correct version of the data
  3. invokes standard Athena conversion services to put conditions objects in the transient store
- “Correct” for non-specialists usually means “the endorsed one corresponding to this event”
  - Version/tag information is likely supplied in standard job options
- Both mediated and unmediated access are possible: if one has a “direct” reference to the object of interest, it is not necessary to pass through the IOV database (mediator) to retrieve it.
  - One can get Version P of the ATLAS muon geometry without dealing with interval of validity databases; on the other hand, the IOV database would be used to discover that Version P was used for simulation runs [m,n]
  - Similar calibration example…
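
The mediated and unmediated access paths described above might look like this in outline. The folder names, tags, reference strings, and the `convert` stand-in are hypothetical; a real framework would call its own conversion services at that step.

```python
from typing import NamedTuple, Optional


class IOVEntry(NamedTuple):
    folder: str
    tag: str
    since: int    # start of validity, inclusive
    until: int    # end of validity, exclusive
    ref: str


# Hypothetical IOV database contents.
IOV_DB = [
    IOVEntry("/muon/geometry", "production", 0, 100, "file:geom_v1.xml"),
    IOVEntry("/muon/geometry", "production", 100, 200, "file:geom_v2.xml"),
    IOVEntry("/muon/geometry", "test", 0, 200, "file:geom_experimental.xml"),
]


def find_reference(folder: str, timestamp: int, tag: str) -> Optional[str]:
    """Step 2: return the reference valid at `timestamp` for this folder and tag."""
    for entry in IOV_DB:
        if (entry.folder == folder and entry.tag == tag
                and entry.since <= timestamp < entry.until):
            return entry.ref
    return None


def convert(ref: str) -> dict:
    """Step 3 stand-in: build a transient object from a reference."""
    return {"loaded_from": ref}


def conditions_for_event(folder, event_timestamp, tag):
    ref = find_reference(folder, event_timestamp, tag)   # mediated access
    if ref is None:
        raise LookupError(f"no {tag!r} entry in {folder} at {event_timestamp}")
    return convert(ref)


# Mediated: the event timestamp and the job's tag select the version.
print(conditions_for_event("/muon/geometry", 150, "production"))
# Unmediated: with a direct reference in hand, the IOV lookup is skipped.
print(convert("file:geom_v1.xml"))
```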

[Diagram: writing time-varying data. Components: Conditions Data Writer; store for conditions or other time-varying data; IOV Database.]
1. Store an instance of data that may vary with time or run or …
2. Return reference to data
3. Register reference, assigning interval of validity, tag, …

[Diagram: reading time-varying data. Components: Conditions data client; Athena Transient Conditions Store; IOV Database; Conditions data.]
1. Folder (data type), timestamp, tag
2. Ref to data (string)
3. Dereference via standard conversion services
4. Build transient conditions object

ATLAS development strategy to date
- Employ common solutions wherever possible.
  - ATLAS has contributed requirements and feedback to the common CERN IT/DB, ATLAS, LHCb, HARP, COMPASS, … project
  - Lisbon TDAQ group has implemented this interface in MySQL: this is what ATLAS offline uses for its IOV database
- Athena transient interval-of-validity service
  - checks current event timestamp,
  - compares to validity intervals of already-loaded time-varying objects,
  - triggers loading of the referenced time-valid objects when needed
- IOVDbSvc does the loading, allowing standard conversion services to build the transient object from the persistent data
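
The check-compare-load cycle of the transient service can be sketched as a small cache. This is not the Athena or IOVDbSvc interface; the class, the lookup callable, and the 100-unit validity blocks are assumptions chosen to make the example self-contained.

```python
class TransientIOVCache:
    """Keeps one loaded object per folder and reloads only when it expires."""

    def __init__(self, lookup):
        self._lookup = lookup      # callable: (folder, timestamp) -> (since, until, payload)
        self._loaded = {}          # folder -> (since, until, payload)
        self.loads = 0             # counts trips to the (fake) database

    def get(self, folder, timestamp):
        cached = self._loaded.get(folder)
        if cached is not None:
            since, until, payload = cached
            if since <= timestamp < until:
                return payload     # already-loaded object is still valid
        # Outside the loaded interval (or nothing loaded yet): go to the database.
        self._loaded[folder] = self._lookup(folder, timestamp)
        self.loads += 1
        return self._loaded[folder][2]


def fake_iov_lookup(folder, timestamp):
    """Hypothetical database: each payload is valid for a block of 100 time units."""
    since = (timestamp // 100) * 100
    return since, since + 100, f"{folder}@{since}"


cache = TransientIOVCache(fake_iov_lookup)
payloads = [cache.get("/pixel/calib", t) for t in (10, 20, 99, 100, 150, 205)]
print(payloads)
print("database loads:", cache.loads)
```

Six events trigger three loads here, because a load happens only when the event timestamp leaves the validity interval of the object already in memory.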

Notes on architecture
- The architecture lets one register conditions data stored in ASCII files, ROOT files, MySQL databases, …, in an IOV database.
  - Is this all that it takes to be “in” the conditions database: do whatever you want, but register your objects in the IOV database? (I hope we’ll do better.)
  - We still need to manage all those files coherently, and catalog them.
- One can imagine
  - Configuration data written to their own files or databases
  - DCS data written to their own files or databases in a possibly different way
  - Subdetector-specific conditions written to their own files or databases
  - Different simulation geometries in different ASCII (XML?) files
  - …other partitioning by domain…
  - …all registered in the same IOV service
- Possible as long as one can represent a “reference” to the data object as a string
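
One way to picture "a reference is just a string" is a dispatcher that reads the technology from a prefix. The `technology:location` convention, the reader names, and the in-memory stores below are invented for illustration; the slides do not specify a reference format.

```python
import json

# In-memory stand-ins so the sketch runs without real files or databases.
FAKE_FILES = {"dcs/pressure.txt": "1013.2"}
FAKE_TABLES = {"config/run_setup": {"trigger": "cosmic"}}


def read_ascii(location):
    return float(FAKE_FILES[location])


def read_table(location):
    return FAKE_TABLES[location]


def read_inline(location):
    # Small payloads can live in the string itself instead of behind a reference.
    return json.loads(location)


READERS = {"ascii": read_ascii, "table": read_table, "inline": read_inline}


def dereference(ref):
    """Split 'technology:location' and hand the location to the matching reader."""
    technology, _, location = ref.partition(":")
    try:
        reader = READERS[technology]
    except KeyError:
        raise ValueError(f"unsupported technology in reference {ref!r}") from None
    return reader(location)


print(dereference("ascii:dcs/pressure.txt"))
print(dereference("table:config/run_setup"))
print(dereference('inline:{"gain": 1.5}'))
```

Supporting another storage technology then means adding one reader, with no change to the IOV side.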

Technologies
- What about storage technologies for conditions objects themselves? Anything readable by our frameworks is okay in theory, but what are good choices?
- For complex objects, an obvious alternative is to use the same technology that is used for event data: POOL infrastructure, with ROOT as the storage layer
- For small amounts of data, one can imagine storing the data, rather than a reference to the data, in the string (blob) managed by the IOV database
- ATLAS offline has used
  - IOV+NOVA (an ATLAS relational-database-hosted product)
  - IOV+POOL: expect the common project to support this
  - IOV+{XML strings}

Storage technologies
- Since we are using a relational IOV database implementation, using the same relational database is another “obvious” and attractive alternative
  - Are schema equally obvious? Perhaps this project can find consensus
- One can imagine using the LCG SEAL dictionary for conditions object definitions, and POOL/ROOT (or POOL/{relational database}) as a storage layer
  - This has the advantage that users would describe event and conditions data using exactly the same tools
- Conversely, it is easy to imagine a standard transient mapping (via POOL?) of simple relational table structures; with reasonable “reference” conventions, these could easily be used for data managed by the IOV database
- Sometimes transient object definition has primacy; sometimes persistent table schema has primacy: we should support both cases

Beyond intervals of validity
- Is it obvious that intervals of validity are the right model for all temporal LHC data?
- What about alarms, and periodic measurements?
  - If I measure pressure at times t0, t1, t2, …, it is entirely artificial to say that the pressure at t1 has an interval of validity [t1,t2)
  - At time t in [t1,t2), I am more likely to want the previous and next pressure measurements, or all the pressure measurements in (t-d,t+d)
  - No reason to say that the pressure at t1 is the valid one
- Need an extended API: We would like the common project to think about this
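
The two queries asked for above (previous and next measurement, and everything in an open window around t) could be served by something like the following. The function names and sample values are assumptions; this is not a proposed API for the common project.

```python
from bisect import bisect_left, bisect_right

# Hypothetical periodic pressure readings: sorted times and matching values.
times = [0.0, 10.0, 20.0, 30.0]
pressures = [1.01, 1.02, 1.05, 1.03]


def neighbours(t):
    """Return the (time, value) pairs at-or-before t and strictly after t."""
    i = bisect_right(times, t)
    previous = (times[i - 1], pressures[i - 1]) if i > 0 else None
    following = (times[i], pressures[i]) if i < len(times) else None
    return previous, following


def window(t, d):
    """Return all measurements strictly inside the open interval (t - d, t + d)."""
    lo = bisect_right(times, t - d)   # first index with time > t - d
    hi = bisect_left(times, t + d)    # first index with time >= t + d
    return list(zip(times[lo:hi], pressures[lo:hi]))


print(neighbours(12.0))
print(window(12.0, 9.0))
```

Neither query assigns a validity interval to any reading; the measurements stay a plain time series and the client decides how to interpolate or average.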

Appendix: tagging extensions
- Several people (myself included) expressed concern about the limitations of the current tagging model in the common project interval of validity (“conditions”) software at the 4-5 February 2003 ATLAS database workshop
- The following slides describe a modest proposal to change/extend the tagging interface, beginning with a simplified scenario that motivates this proposal
- Agreed (ATLAS, LHCb: Pere Mato), but extensions not yet implemented

Calibration scenario: Phase 1
- A calibration expert is experimenting with a variety of algorithms and algorithm parameters. After a calibration run, she produces calibration constants using three different algorithms, with an interval of validity that lets her apply them to a range of runs and compare the results.
[Diagram labels: Algorithm 1, Algorithm 2, Algorithm 3; axes: time, “version”]

Calibration scenario: Phase 2
- After looking at the results, she believes that Algorithm 2 is pretty good, but Algorithm 3 is the best
- After the next calibration run, she therefore computes calibration constants using Algorithm 3, and assigns an interval of validity corresponding to a new range of runs
[Diagram labels: Algorithm 1, Algorithm 2, Algorithm 3; axes: time, “version”]

Calibration scenario: Phase 3
- Just to be certain before announcing anything for collaboration-wide use, she computes calibration constants from this latest calibration run using Algorithms 1 and 2, and compares the results when these are applied to the recent runs
[Diagram labels: Algorithm 1, Algorithm 2, Algorithm 3, Algorithm 1, Algorithm 2; axes: time, “version”]

Calibration scenario: Phase 4
- She is slightly surprised when it appears that Algorithm 2 is a better choice, and, after looking at her results from the first calibration run, she decides that the Algorithm 2 results are what should be tagged for Production
- … but the two Algorithm 2 objects were NEVER the HEAD: there is nothing she could have done (unless she were prescient) with tools that tag only the HEAD
[Diagram labels: Algorithm 1, Algorithm 2, Algorithm 3, Algorithm 1, Algorithm 2; axes: time, “version”]

Things could have been easy
- What she would have liked to do was this:
  - When she inserted an object produced by Algorithm N, she wanted to label (tag) it “Algorithm N” at insertion time
  - She may not be a C++ expert, but she could certainly have added the string “Algorithm N” to her argument list inside her Algorithm N code
- How would this work with overlapping intervals?
  - Easy: an interval added with a tag splits only intervals with the same tag (and the HEAD, if you like, for you folks who like to trust the HEAD)
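
The proposed rule, that an interval inserted with a tag splits only intervals carrying the same tag, can be sketched with a per-tag interval list. The function names and the half-open run ranges are assumptions, and the optional HEAD handling mentioned above is left out for brevity.

```python
def insert(store, tag, since, until, payload):
    """Add [since, until) under `tag`, trimming overlapped intervals of that tag only."""
    kept = []
    for s, u, p in store.get(tag, []):
        if u <= since or s >= until:      # no overlap: keep unchanged
            kept.append((s, u, p))
            continue
        if s < since:                     # part before the new interval survives
            kept.append((s, since, p))
        if u > until:                     # part after the new interval survives
            kept.append((until, u, p))
    kept.append((since, until, payload))
    store[tag] = sorted(kept)


def lookup(store, tag, t):
    """Return the payload valid at t under `tag`, or None."""
    for s, u, p in store.get(tag, []):
        if s <= t < u:
            return p
    return None


store = {}
# First calibration run: all three algorithms, same run range.
for n in (1, 2, 3):
    insert(store, f"Algorithm {n}", 0, 100, f"A{n} from calibration run 1")
# Second calibration run: Algorithm 3 first, then 1 and 2, in any order.
for n in (3, 1, 2):
    insert(store, f"Algorithm {n}", 100, 200, f"A{n} from calibration run 2")

# Insertion order no longer matters: both Algorithm 2 objects are reachable by tag.
print(lookup(store, "Algorithm 2", 50))
print(lookup(store, "Algorithm 2", 150))

# An overlapping insert under one tag leaves the other tags untouched.
insert(store, "Algorithm 2", 80, 120, "A2 rerun")
print([(s, u) for s, u, _ in store["Algorithm 2"]])
print([(s, u) for s, u, _ in store["Algorithm 3"]])
```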

Comments on robustness
- It is needlessly risky to build a database infrastructure that relies on order of insertion into the database
  - What could the calibration expert have done differently: waited until the Nth calibration run to begin her comparison, making sure she always ran her algorithms in the same order?
  - If someone makes a mistake, do we need to be so unforgiving?
- Internal versions do not help. Even if she kept a log of everything she did, including the order in which she ran her algorithms, she might be able to guess the version numbers when intervals do not overlap; when they do, the situation is hopeless
  - …and it’s worse if she has a colleague exploring alternatives (but people assure me that this will never happen …)
- Are we willing to bet our database on this?

A question with no context
- For some conditions, run ranges are the most natural intervals of validity; for others, time ranges are more natural
- With some work, “real” runs can be associated with time intervals, but for simulation, this requires applying some rather arbitrary and artificial conventions (retroactively, in our case)
- Query to other experiments: Would it be useful to have the project support more than one kind of validity “key,” e.g., timestamps and {run,event} ranges, or {run,event}-to-time mapping services?
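
A run-to-time mapping service of the kind asked about might, in its simplest form, look like this. The run numbers, timestamps, and function names are invented, and event-level granularity is omitted; it is a sketch of the idea rather than a proposal.

```python
from bisect import bisect_right

# Hypothetical run catalogue: (run number, start time, end time), sorted by start.
RUNS = [(101, 1000, 1600), (102, 1700, 2500), (103, 2500, 3100)]
START_TIMES = [start for _, start, _ in RUNS]


def run_at(timestamp):
    """Return the run that was active at `timestamp`, or None between runs."""
    i = bisect_right(START_TIMES, timestamp) - 1
    if i >= 0:
        run, start, end = RUNS[i]
        if start <= timestamp < end:
            return run
    return None


def time_range(run):
    """Return the (start, end) time interval of a run."""
    for number, start, end in RUNS:
        if number == run:
            return start, end
    raise KeyError(run)


def run_range_to_time(first_run, last_run):
    """Convert a run-based validity interval into a time-based one."""
    return time_range(first_run)[0], time_range(last_run)[1]


print(run_at(1800))
print(run_at(1650))
print(run_range_to_time(101, 102))
```

For simulation there is no natural catalogue like `RUNS`, which is where the arbitrary conventions mentioned above would have to come in.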