The new AliRoot DB access classes


The new AliRoot DB access classes
Alberto Colla (Alice off-line Calibration and Alignment group)
Alice off-line meeting, CERN, October 3, 2005

Summary
- History
- Underlying principles
- New features (with respect to the first publication, June 2005)
- Description of the CDB access classes
- Examples of use cases

“History” of DB access classes
- Original idea and first implementation by T. Kuhr (late 2004)
- Since February 2005: work on the framework implementation is carried out in the core offline group
- First presentation of the prototype to the off-line community: June 2005 Alice off-line meeting
- Development has taken into account the many useful discussions with software and detector experts that followed the publication of the prototype

Underlying philosophy
- The Alice offline calibration and alignment framework provides the software infrastructure for storage of and access to the experiment's condition data
- Calibration and alignment objects are Root TObjects stored in Root files
- Calibration and alignment objects must be run-dependent objects
- The database is read-only (automatic versioning tools)
- The framework provides storage and access in both Grid and local environments
- The storage and retrieval technique is transparent to the user

New features (Introduction)
- Implementation of the Grid storage access class AliCDBGrid
- New manager class AliCDBManager
  - Handles activation and deactivation of one or more storage systems
  - Owns the instances of the active storages
- “Factory” and “Parameter” classes associated with each specific storage
  - Used by the manager to activate storage locations
  - Storage systems are identified by a string (“URI”) or a set of parameters
- New versioning schema introduced
  - Two version numbers: “Grid” version and “local” (sub)version
- The object's container class (AliCDBEntry) and the object's metadata classes have been redesigned

Software requirements
- AliRoot HEAD
- Root v5-04-00
- For Grid access: Alice VO registration, AliEn client (gShell)
- AliCDB* classes are in STEER:
  - AliCDBManager
  - AliCDBStorage
  - AliCDBGrid, AliCDBLocal, AliCDBDump
  - AliCDBEntry
  - AliCDBId
  - AliCDBMetaData

CDB access classes schema
(Class diagram; framework proposed and mainly developed by Boyko Yordanov)
- AliCDBManager <<singleton>>
- AliCDBStorage: AliCDBGrid, AliCDBLocal, AliCDBDump
- AliCDBFactory: AliCDBGridFactory, AliCDBLocalFactory, AliCDBDumpFactory
- AliCDBParam: AliCDBGridParam, AliCDBLocalParam, AliCDBDumpParam
- AliCDBEntry: TObject, AliCDBId, AliCDBMetaData
- AliCDBId: AliCDBPath, AliCDBRunRange, version, subVersion

CDB access classes relationships
(Diagram)
- Storage activation: AliCDBManager (GetStorage), AliCDBXXXFactory, AliCDBXXXParam, AliCDBStorage
- Object storage/retrieval: AliCDBStorage (Put/Get), AliCDBGrid/Local/Dump (PutEntry/GetEntry), AliCDBEntry, DataBase (Root files)


AliCDBEntry
Container class. It has:
- The calibration or alignment object (anything inheriting from TObject)
- The object's identifier (AliCDBId)
- The object's metadata (AliCDBMetaData)
Remember: each AliCDBEntry contains a single object (which can be a container of more objects). It is identified by a name (path), and its validity is specified by a run range and a version.
Some public AliCDBEntry methods:
    SetObject(TObject*), TObject* GetObject()
    SetId(const AliCDBId&), AliCDBId& GetId()
    SetMetaData(AliCDBMetaData*), AliCDBMetaData* GetMetaData()
    SetOwner(Bool_t), Bool_t IsOwner()
SetOwner sets the AliCDBEntry object as the owner of the TObject and AliCDBMetaData objects (so that they are deleted with the AliCDBEntry).

AliCDBId
Contains the subset of the object's metadata that uniquely identifies it (path, run validity range, versions).
It has two purposes:
- During storage it is used to build the location (e.g. directory path, file name) where the object will be stored
- During retrieval it is used to identify the object and, if needed, to specify the required version
Data members:
- AliCDBPath fPath: the object's path
- AliCDBRunRange fRunRange: the object's validity range
- Int_t fVersion, Int_t fSubVersion: the object's Grid and local versions
- TString fLastStorage: “previous” storage location of the object (new, grid, local, dump). It is set at first storage and during the object's retrieval, and helps to “backtrace” the object's history.

AliCDBPath, AliCDBRunRange
- AliCDBPath contains the object's path name (TString fPath)
  - The path must have a three-level directory structure: “level0/level1/level2”
  - Example: “ZDC/Calib/Pedestals”
  - The wildcard character * is allowed if the path is used to specify selection criteria or for multiple object retrieval (e.g. “ZDC/*” or “TPC/Calib/*” ...)
- AliCDBRunRange contains the run validity range of the object (Int_t fFirstRun, Int_t fLastRun)
- AliCDBId contains public getters/setters for path, run numbers, versions ...
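The three-level path rule and the wildcard matching described above can be illustrated with a minimal standalone sketch. It does not use the AliRoot classes: `SplitLevels`, `IsThreeLevelPath` and `MatchesPattern` are hypothetical helper names, and treating a short pattern such as “ZDC/*” as “ZDC/*/*” is an assumption, not something the slide states.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not an AliRoot function): split "a/b/c" on '/'.
std::vector<std::string> SplitLevels(const std::string& path) {
    std::vector<std::string> levels;
    std::stringstream stream(path);
    std::string level;
    while (std::getline(stream, level, '/')) levels.push_back(level);
    return levels;
}

// True when the path has exactly three non-empty levels.
bool IsThreeLevelPath(const std::string& path) {
    if (path.empty() || path.back() == '/') return false;
    const std::vector<std::string> levels = SplitLevels(path);
    if (levels.size() != 3) return false;
    for (const std::string& level : levels)
        if (level.empty()) return false;
    return true;
}

// Match a concrete path against a pattern in which "*" replaces a whole level.
// Assumption: levels missing at the end of the pattern ("ZDC/*") count as "*".
bool MatchesPattern(const std::string& pattern, const std::string& path) {
    if (!IsThreeLevelPath(path)) return false;
    std::vector<std::string> wanted = SplitLevels(pattern);
    if (wanted.size() > 3) return false;
    wanted.resize(3, "*");
    const std::vector<std::string> levels = SplitLevels(path);
    for (std::size_t i = 0; i < 3; ++i)
        if (wanted[i] != "*" && wanted[i] != levels[i]) return false;
    return true;
}
```

With these helpers, `MatchesPattern("ZDC/*", "ZDC/Calib/Pedestals")` is true, while `MatchesPattern("TPC/Calib/*", "ZDC/Calib/Gains")` is false.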

AliCDBMetaData
Contains the subset of the object's metadata not used for storage/retrieval.
Data members and getters/setters for:
- Object's class name (TString)
- Responsible's name (TString)
- AliRoot version used for the object (TString)
- Beam period number (UInt_t)
- Comment string (TString)
- TMap of any additional set of “properties”:
  - TMap format: (const char* property, TObject* object)
  - Getter function to get the metadata object associated with “property”: TObject* GetProperty(const char* property)
  - See also: RemoveProperty(…), PrintMetaData()

AliCDBManager
- Singleton (AliCDBManager::Instance())
- Owner of the activated storage object instances. It holds:
  - The list of the registered factories (3 available storage factories: Dump, Local, Grid)
  - The list (TMap) of active storages (storage object instances created with AliCDBManager::GetStorage())
- Factory registration is hard-coded; it is done at the first call of AliCDBManager::Instance()
- AliCDBGridFactory is registered only if Root is enabled for AliEn access
- If the Grid factory is not registered, the corresponding storage cannot be activated (a null AliCDBStorage pointer is returned)

AliCDBManager (2)
- To activate a new storage instance, use the AliCDBManager method GetStorage:
    AliCDBStorage* GetStorage(const char* dbString);        // storage type “URI”
    AliCDBStorage* GetStorage(const AliCDBParam* param);    // set of parameters identifying the storage
  Both return a pointer to the active instance of AliCDBStorage
- GetActiveStorages() returns the list of active storages
- Public methods have been added to select a single “default storage” and “drain storage” (see later)
- The Destroy() method deletes the AliCDBManager instance and all the active storages
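The manager behaviour described on the two slides above (singleton, hard-coded factory registration, one cached instance per activated storage, null pointer when no factory is registered) can be sketched without AliRoot as follows. `ToyStorage` and `ToyManager` are hypothetical stand-ins, and keying the registry on the URI scheme is an assumption made for the illustration; the real classes may organise this differently.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a storage: it only remembers its URI.
struct ToyStorage {
    std::string uri;
};

// Toy manager: singleton that owns the active storages.
class ToyManager {
public:
    using Factory = std::function<std::unique_ptr<ToyStorage>(const std::string&)>;

    static ToyManager& Instance() {
        static ToyManager manager;  // created, and factories registered, on first call
        return manager;
    }

    // Returns the active storage for this URI, creating it on first request.
    // Returns nullptr when no factory is registered for the URI scheme.
    ToyStorage* GetStorage(const std::string& uri) {
        auto active = fActive.find(uri);
        if (active != fActive.end()) return active->second.get();
        const std::string::size_type sep = uri.find("://");
        if (sep == std::string::npos) return nullptr;
        auto factory = fFactories.find(uri.substr(0, sep));
        if (factory == fFactories.end()) return nullptr;
        auto inserted = fActive.emplace(uri, factory->second(uri));
        return inserted.first->second.get();
    }

    std::size_t NumberOfActiveStorages() const { return fActive.size(); }

private:
    ToyManager() {
        // Hard-coded registration. The "alien" scheme is deliberately left out
        // here to mimic a Root build without AliEn support.
        const Factory make = [](const std::string& uri) {
            return std::unique_ptr<ToyStorage>(new ToyStorage{uri});
        };
        fFactories["local"] = make;
        fFactories["dump"] = make;
    }

    std::map<std::string, Factory> fFactories;
    std::map<std::string, std::unique_ptr<ToyStorage>> fActive;
};
```

Requesting the same URI twice returns the same pointer, and a request for an `alien://` URI returns a null pointer in this toy set-up because no Grid factory is registered.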

AliCDBStorage
- Interface for the concrete storage types (Dump, Local, Grid)
- Public virtual functions to store/retrieve objects:
    Bool_t Put(AliCDBEntry* entry);
    Bool_t Put(object, Id, MetaData)
  Single request:
    AliCDBEntry* Get(const AliCDBId& query)
    AliCDBEntry* Get(“path”, runNumber, version, subVersion)
  Multiple request:
    TList* GetAll(const AliCDBId& query)
    TList* GetAll(“path”, runNumber, …)

AliCDBStorage (2)
- During retrieval, the AliCDBId query is used to specify:
  - The path of the requested object (wildcards allowed for multiple requests)
  - The run number
  - Optionally, the version and subversion (the highest version is searched for if not specified)
- The possibility to specify a list of “selection criteria” has been maintained:
    void AddSelection(const AliCDBId& selection)
    void AddSelection(“path”, firstRun, lastRun, version, subVersion)
  See also: RemoveSelection(...), RemoveAllSelections(), PrintSelectionList()
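The “highest version search” can be pictured with a minimal standalone sketch. It is not the AliRoot implementation: `EntryId` and `SelectEntry` are hypothetical names, and using -1 to mean “version not specified” is an assumption made for the example.

```cpp
#include <vector>

// Hypothetical, simplified identifier: run range plus the two version numbers.
struct EntryId {
    int firstRun;
    int lastRun;
    int version;
    int subVersion;
};

// Pick the entry valid for `run`. A negative version or subVersion means
// "not specified" (assumption), in which case the highest one is chosen.
// Returns nullptr when nothing matches.
const EntryId* SelectEntry(const std::vector<EntryId>& ids, int run,
                           int version = -1, int subVersion = -1) {
    const EntryId* best = nullptr;
    for (const EntryId& id : ids) {
        if (run < id.firstRun || run > id.lastRun) continue;
        if (version >= 0 && id.version != version) continue;
        if (subVersion >= 0 && id.subVersion != subVersion) continue;
        const bool higher =
            !best || id.version > best->version ||
            (id.version == best->version && id.subVersion > best->subVersion);
        if (higher) best = &id;
    }
    return best;
}
```

The scan is linear in the number of stored entries, which is the cost that the storage schema proposed at the end of the talk aims to reduce.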

AliCDBGrid
- Access class to an object stored in a Grid database
- Based on the Root TGrid/TAlien plugin, uses gliteUI libraries
- URI pattern: “alien://host:port;user;DBPath;SE”
  - Example: “alien://aliendb4.cern.ch:9000;colla;DBFolder;ALICE::CERN::se01”
  - If DBFolder is not a full path, it is created from the home directory
- AliCDBGridParam members: fHost, fPort (UInt_t), fUser, fDBPath, fSE (TString)
- One single AliCDBEntry is stored in each TAlienFile: DBFolder/level1/level2/level3/Run#fr_#lr_v#gv.root
(Diagram: Grid DBFolder tree with directories such as TPC and ZDC, Calib and Align, Pedestals and Gains, containing files Run1_10_v0.root, Run11_20_v0.root, Run21_v0.root, Run22_30_v0.root, ...)
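As an illustration of the URI pattern above, the following standalone sketch splits an “alien://host:port;user;DBPath;SE” string into its five fields. It is not the AliRoot parser: `GridParam` and `ParseAlienUri` are hypothetical names, and the validation rules (exactly four `;`-separated fields, numeric port of at most five digits) are assumptions.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical plain struct mirroring the fields listed for AliCDBGridParam.
struct GridParam {
    std::string host;
    unsigned port = 0;
    std::string user;
    std::string dbPath;
    std::string se;
};

// Parse "alien://host:port;user;DBPath;SE". Returns false on malformed input.
bool ParseAlienUri(const std::string& uri, GridParam& out) {
    const std::string scheme = "alien://";
    if (uri.compare(0, scheme.size(), scheme) != 0) return false;

    // Split everything after the scheme on ';'.
    std::vector<std::string> fields;
    std::stringstream stream(uri.substr(scheme.size()));
    std::string field;
    while (std::getline(stream, field, ';')) fields.push_back(field);
    if (fields.size() != 4) return false;

    // The first field is "host:port"; the SE field may itself contain colons.
    const std::string::size_type colon = fields[0].find(':');
    if (colon == std::string::npos) return false;
    const std::string portText = fields[0].substr(colon + 1);
    if (portText.empty() || portText.size() > 5 ||
        portText.find_first_not_of("0123456789") != std::string::npos)
        return false;

    out.host = fields[0].substr(0, colon);
    out.port = static_cast<unsigned>(std::stoul(portText));
    out.user = fields[1];
    out.dbPath = fields[2];
    out.se = fields[3];
    return true;
}
```

Applied to the example URI from the slide, this yields host `aliendb4.cern.ch`, port 9000, user `colla`, DB path `DBFolder` and storage element `ALICE::CERN::se01`.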

AliCDBLocal
- Access class to an object stored in a local database
- URI pattern: “local://DBPath”
  - If DBPath is not a full path, it is created from the working directory
- AliCDBLocalParam member: fDBPath
- One single AliCDBEntry is stored in each local Root file: DBFolder/level1/level2/level3/Run#fr_#lr_v#gv_s#lv.root
(Diagram: local DBFolder tree with directories such as TPC and ZDC, Calib and Align, Pedestals and Gains, containing files Run1_10_v0_s0.root, Run1_10_v0_s1.root, Run11_20_v1_s0.root, Run11_20_v1_s1.root, ...)
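The file naming patterns shown for the Grid and local storages can be reproduced with a few lines of standalone C++. The function names below are hypothetical and the sketch only builds strings; it assumes that #fr and #lr are the first and last run and that #gv and #lv are the Grid version and local subversion.

```cpp
#include <string>

// Hypothetical helper: "Run#fr_#lr_v#gv.root", as used in the Grid storage.
std::string GridFileName(int firstRun, int lastRun, int version) {
    return "Run" + std::to_string(firstRun) + "_" + std::to_string(lastRun) +
           "_v" + std::to_string(version) + ".root";
}

// Hypothetical helper: "Run#fr_#lr_v#gv_s#lv.root", as used in the local storage.
std::string LocalFileName(int firstRun, int lastRun, int version, int subVersion) {
    return "Run" + std::to_string(firstRun) + "_" + std::to_string(lastRun) +
           "_v" + std::to_string(version) + "_s" + std::to_string(subVersion) +
           ".root";
}

// Full location of a local entry: DBFolder/level1/level2/level3/<file name>.
std::string LocalFilePath(const std::string& dbFolder, const std::string& path,
                          int firstRun, int lastRun, int version, int subVersion) {
    return dbFolder + "/" + path + "/" +
           LocalFileName(firstRun, lastRun, version, subVersion);
}
```

For instance, `LocalFilePath("DBFolder", "ZDC/Calib/Pedestals", 1, 10, 0, 0)` gives `DBFolder/ZDC/Calib/Pedestals/Run1_10_v0_s0.root`.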

AliCDBDump
- Access class to an object stored in a “dump” local file
- URI pattern: “dump://fileName(;ReadOnly)”
  - If fileName is not a full path, the file is created/opened in the working directory
  - If ReadOnly is specified, the file is opened in read-only mode
- AliCDBDumpParam members: fDBPath, Bool_t fReadOnly
- All the AliCDBEntry objects are stored in the dump Root file
(Diagram: local DumpFile.root, with TDirectory levels such as TPC and ZDC, Calib and Align, Pedestals and Gains, and TKey names Run1_10_v0_s0, Run1_10_v0_s1, Run11_20_v1_s0, Run11_20_v1_s1, ...)

New versioning schema
- The object version is automatically set during storage
- Two version numbers: the first one stands for “Grid version”, the second (subVersion) stands for “Local version”
- Example (objects moving between Local and Grid):
  - New object stored (Local): Run1_10_v0_s0.root
  - Local updates: Run1_10_v0_s1.root, Run1_10_v0_s2.root
  - Grid transfer: Run1_10_v1.root
  - Grid updates: Run1_10_v2.root, Run1_10_v3.root
  - Local transfer: Run1_10_v3_s0.root, followed by Run1_10_v3_s1.root ...
- An error is returned if someone tries to transfer the same object from Grid to local more than once (protection against mess)
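The numbering rules suggested by the example above can be written down as three small functions. This is a sketch inferred from that single example sequence, not the AliRoot code: the names are hypothetical, -1 is used here to mark “no local subversion”, and the protection against repeated Grid-to-local transfers is not modelled.

```cpp
// Hypothetical version pair: Grid version and local subversion.
// local == -1 marks an object that lives on the Grid (assumption).
struct Version {
    int grid;
    int local;
};

// A local update keeps the Grid version and increments the subversion.
Version LocalUpdate(Version v) { return {v.grid, v.local + 1}; }

// Storing on the Grid (transfer from local or Grid update) increments
// the Grid version and drops the subversion.
Version StoreOnGrid(Version v) { return {v.grid + 1, -1}; }

// A transfer from Grid to local keeps the Grid version and starts at s0.
Version TransferToLocal(Version v) { return {v.grid, 0}; }
```

Replaying the slide's sequence from a new local object (v0_s0) gives v0_s1, v0_s2, then v1, v2, v3 on the Grid, and finally v3_s0 and v3_s1 after the transfer back to local.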


Activation of new storage locations
Using the storage's URI:
    AliCDBManager *man = AliCDBManager::Instance();
    AliCDBStorage *storGrid = man->GetStorage("alien://aliendb4.cern.ch:9000;colla;DBFolder;ALICE::CERN::se01");
    AliCDBStorage *storLoc = man->GetStorage("local:///work/DBFolder");
Using the AliCDBParam class:
    AliCDBGridParam param("aliendb4.cern.ch", 9000, "colla", "DBFolder", "ALICE::CERN::se01");
    AliCDBStorage *storGrid = AliCDBManager::Instance()->GetStorage(&param);

Object storage
    // some code to create the TObject *object ...
    // Set the object's path and run validity range in AliCDBId (path, runRange)
    AliCDBId id("ZDC/Calib/Pedestals", 1, 10);
    // Set additional object's metadata
    AliCDBMetaData md;
    md.Set... // fill metadata using AliCDBMetaData setters
    // Put object into the database
    storLoc->Put(object, id, &md);
Object stored into local file: DBFolder/ZDC/Calib/Pedestals/Run1_10_v0_s0.root

Object retrieval
Single object retrieval (the second argument is the run number):
    AliCDBEntry *entry;
    entry = storLoc->Get("ZDC/Calib/Pedestals", 5);        // look for highest version and subVersion
    entry = storLoc->Get("ZDC/Calib/Pedestals", 5, 2);     // look for version 2 and highest subVersion
    entry = storLoc->Get("ZDC/Calib/Pedestals", 5, 2, 4);  // look for version 2 and subVersion 4
    // Get Id, metaData, object from entry
    AliCDBId id = entry->GetId();
    AliCDBMetaData *md = entry->GetMetaData();
    ObjClass *obj = (ObjClass*) entry->GetObject();        // GetObject() returns a TObject*
Multiple object retrieval:
    TList *list;  // list will contain AliCDBEntry objects
    list = storLoc->GetAll("ZDC/Calib/*", 5);
    entry = (AliCDBEntry*) list->At(0);  // the list element must be cast to AliCDBEntry*

Object retrieval (2)
Object retrieval using the AliCDBStorage “selection criteria” methods:
    // I want version 2 for all "ZDC/Calib/*" objects for runs 1 to 100
    storLoc->AddSelection("ZDC/Calib/*", 1, 100, 2);
    // and version 1_0 for "ZDC/Calib/Pedestals" objects for runs 5-10
    storLoc->AddSelection("ZDC/Calib/Pedestals", 5, 10, 1, 0);
    TList *list = storLoc->GetAll("ZDC/*", 5);
“General” selection criteria (e.g. “ZDC/*”) should be added before more specific ones!

Default and Drain storages
Among the active AliCDBStorage objects collected by AliCDBManager, one can choose two as the “default” and “drain” storages:
    AliCDBManager::Instance()->SetDefaultStorage(const char* uri);
    AliCDBManager::Instance()->SetDefaultStorage(AliCDBParam* param);
    AliCDBManager::Instance()->SetDefaultStorage(AliCDBStorage* sto);
    AliCDBManager::Instance()->SetDrain(const char* uri);
    AliCDBManager::Instance()->SetDrain(AliCDBParam* param);
    AliCDBManager::Instance()->SetDrain(AliCDBStorage* sto);
- If the storage instance is not present in the collection, it is created and added to it
- The first created storage instance is automatically set as the default storage
Removal of the default and drain storages (the objects aren't removed from the list of active storages!):
    AliCDBManager::Instance()->RemoveDefaultStorage();
    AliCDBManager::Instance()->RemoveDrain();

Use of default and drain storages
To check the activation of the default and drain storage pointers:
    (Bool_t) AliCDBManager::Instance()->IsDefaultStorageSet();
    (Bool_t) AliCDBManager::Instance()->IsDrainSet();
The pointer to the default storage is returned by:
    AliCDBManager::Instance()->GetDefaultStorage();
If the drain storage is activated, each entry retrieved from any storage is put into it:
    AliCDBManager *man = AliCDBManager::Instance();
    man->GetStorage("alien://...");        // this is the default storage
    man->SetDrain("dump://DBDrain.root");  // this is the drain storage
    AliCDBEntry *entry;
    entry = man->GetDefaultStorage()->Get("ZDC/Calib/Pedestals", 5);
    // The retrieved entry is drained into the dump file!
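The drain mechanism amounts to copying every successfully retrieved entry into a second storage. A minimal standalone sketch of that idea, with in-memory maps instead of Root files, is given below; `ToyStore` and `GetAndDrain` are hypothetical names, and entries are reduced to a path and a text payload.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical in-memory storage: maps an object path to a payload.
class ToyStore {
public:
    void Put(const std::string& path, const std::string& payload) {
        fEntries[path] = payload;
    }
    // Returns nullptr when the path is not stored.
    const std::string* Get(const std::string& path) const {
        auto it = fEntries.find(path);
        return it == fEntries.end() ? nullptr : &it->second;
    }
    std::size_t Size() const { return fEntries.size(); }

private:
    std::map<std::string, std::string> fEntries;
};

// Retrieve from `source`; when a drain is set (non-null), copy every
// successfully retrieved entry into it.
const std::string* GetAndDrain(const ToyStore& source, ToyStore* drain,
                               const std::string& path) {
    const std::string* payload = source.Get(path);
    if (payload && drain) drain->Put(path, *payload);
    return payload;
}
```

A failed retrieval leaves the drain untouched, and passing a null drain pointer corresponds to the case where no drain storage has been set.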

For further examples...
- Run the tutorial macro macros/DBAccessTutorial.C
  - It requires AliEn access! If AliEn is not enabled in Root, replace the alien storage activation with a local “dummy” one ...
- Follow today's “live” tutorial!

Proposal of a new storage schema
- New storage schema proposed by Boyko Yordanov, which optimizes the time efficiency of the storage and retrieval processes
- Current implementation: every set of objects with the same name (e.g. “ZDC/Calib/Pedestals”) is stored in the same location (“linearly”), regardless of their versions:
    ZDC/Calib/Pedestals:
        Run0_10_v1.root
        Run0_10_v2.root
        Run11_20_v1.root
        ...
- To store a modified object (new version), it is necessary to iterate over the already existing ones to get the version number
- The same number of iterations is needed for automatic data retrieval (highest version)

New data storage idea
- For a new object (e.g. ZDC/Calib/Pedestals): a new branch, and for every version a new sub-branch with the same name as the version number
- The “leaves” are the objects (files, Root keys, etc.), with names determined by the run range (and possibly the version still appended for clarity)
- Since for a given version there are no overlapping run ranges, the leaves can be ordered
- This structure allows for fewer iterations in most cases, thanks to the additional version branch and the “Binary Tree” optimization
- Example:
    ZDC/Calib/Pedestals/1/
        Run0_10_v1.root
        Run11_20_v1.root
        ...
    ZDC/Calib/Pedestals/2/
        Run0_10_v2.root
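Why ordered, non-overlapping run ranges help can be seen in a short standalone sketch. It uses a binary search over a sorted vector rather than the “Binary Tree” mentioned on the slide, which gives the same logarithmic lookup; `RunRange` and `FindRange` are hypothetical names and the vector stands in for the leaves of one version branch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical leaf of one version branch: a run validity range.
struct RunRange {
    int firstRun;
    int lastRun;
};

// `ranges` must be sorted by firstRun and must not overlap.
// Returns the index of the range containing `run`, or -1 if there is none.
int FindRange(const std::vector<RunRange>& ranges, int run) {
    // First range that starts after `run`; the candidate is the one before it.
    auto next = std::upper_bound(
        ranges.begin(), ranges.end(), run,
        [](int value, const RunRange& range) { return value < range.firstRun; });
    if (next == ranges.begin()) return -1;
    auto candidate = next - 1;
    if (run > candidate->lastRun) return -1;
    return static_cast<int>(candidate - ranges.begin());
}
```

The lookup needs a number of comparisons that grows with the logarithm of the number of leaves, instead of one iteration per stored file.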

Performance tests
Sequential run range storage (no overlapping):
- This test stores and retrieves values for a particular object, increasing the run range by one every time. There is only one version number.
- With the “Current” method, put/get time depends on the number of files.
- With the “New” method, put/get time is constant.

Performance tests (2)
Random run range storage (overlapping):
- This test stores and retrieves values of a particular object with random run ranges. The run ranges overlap and the version number increases.
- With the “Current” method, put/get time depends on the size of the DB.
- With the “New” method, put/get time is constant.