Event Storage GAUDI - Data access/storage Framework related issues


Event Storage
GAUDI - Data access/storage
- Framework related issues
- Data management aspects
- Should we go for a Data Challenge?

Structure of the Data Store
- Tree, similar to a file system
- Identification by logical addresses: "/Event/Mc/MCParticles"
- A tree node
  - has data members (payload)
  - contains other node objects (directory structure)
GAUDI M.Frank LHCb/CERN
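
The tree structure described on this slide can be sketched roughly as follows. This is a minimal illustration, not the GAUDI implementation: `Node`, `DataStore`, `registerObject` and `find` are hypothetical names, and a plain string stands in for the data members of a real data object.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical node type: a payload plus named child nodes.
struct Node {
    std::string payload;                                    // stands in for the data members
    std::map<std::string, std::unique_ptr<Node>> children;  // directory structure
};

// Hypothetical store that registers and retrieves nodes by logical address.
class DataStore {
public:
    // Create every missing node along the path, then set the leaf payload.
    Node& registerObject(const std::string& path, const std::string& payload) {
        assert(!path.empty() && path[0] == '/');  // logical addresses are absolute
        Node* node = &root_;
        std::istringstream parts(path);
        std::string name;
        while (std::getline(parts, name, '/')) {
            if (name.empty()) continue;  // skip the empty token before the leading '/'
            auto& child = node->children[name];
            if (!child) child = std::make_unique<Node>();
            node = child.get();
        }
        node->payload = payload;
        return *node;
    }

    // Return the node at the path, or nullptr if any path component is missing.
    const Node* find(const std::string& path) const {
        const Node* node = &root_;
        std::istringstream parts(path);
        std::string name;
        while (std::getline(parts, name, '/')) {
            if (name.empty()) continue;
            auto it = node->children.find(name);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node;
    }

private:
    Node root_;
};
```

Registering "/Event/Mc/MCParticles" in this sketch creates the intermediate nodes "/Event" and "/Event/Mc" on the way, which mirrors the file-system analogy: every node can carry a payload and act as a directory at the same time.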

Layout of the Data Object
[Diagram: two data objects, /node1 and /node2, each with data members and node objects. /node1 contains /node1/A (with /node1/A/A1 and /node1/A/A2) and /node1/B. /node2 contains /node2/C and /node2/D. Symbolic links labeled C and D are also shown.]

Generic Model: Ingredients
[Diagram labels: Data Store; ConversionSvc; class ID, object path; storage type, DB name, container name, item ID; disk storage; database; container; objects.]

Generic Model: Extended Object ID
- Storage type
- Class ID
- Storage type dependent part
  - Objectivity: OID
  - OR, for ZEBRA, ROOT, RDBMS: Link ID, Record ID
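
One way to picture the extended object ID is as a small record whose first field selects which storage-specific loader handles it. The sketch below is only an illustration under assumptions: `ExtendedObjectId`, `StorageType`, `ConversionDispatcher` and the field widths are hypothetical, and a string stands in for the loaded object.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Storage technologies named on the slide.
enum class StorageType { Objectivity, Zebra, Root, Rdbms };

// Hypothetical layout: a common part plus a storage type dependent part.
struct ExtendedObjectId {
    StorageType storageType = StorageType::Root;
    std::uint32_t classId = 0;
    std::uint64_t oid = 0;       // used for Objectivity
    std::uint32_t linkId = 0;    // used for ZEBRA, ROOT, RDBMS
    std::uint32_t recordId = 0;  // used for ZEBRA, ROOT, RDBMS
};

// Hypothetical dispatcher: picks a loader by storage type.
class ConversionDispatcher {
public:
    using Loader = std::function<std::string(const ExtendedObjectId&)>;

    void addLoader(StorageType type, Loader loader) {
        assert(loader != nullptr);  // an empty std::function cannot load anything
        loaders_[type] = std::move(loader);
    }

    // Forward the ID to the loader registered for its storage type.
    std::string load(const ExtendedObjectId& id) const {
        auto it = loaders_.find(id.storageType);
        if (it == loaders_.end())
            throw std::runtime_error("no loader registered for this storage type");
        return it->second(id);
    }

private:
    std::map<StorageType, Loader> loaders_;
};
```

The point of the design is that client code only ever sees the common part; the storage type dependent fields are interpreted by whichever loader was registered for that technology.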

Data Serialization
- Object serialization a la ROOT/MFC/Java to a byte stream
- Machine and technology independent data format
- Store object as BLOB in database

    StreamBuffer& MCVertex::serialize( StreamBuffer& s ) {
      // Stream the base-class part first, then the members in a fixed order.
      ContainedObject::serialize(s);
      s >> m_position
        >> m_timeOfFlight
        >> m_motherMCParticle(this)
        >> m_daughterMCParticles(this);
      return s;
    }
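
To make the "machine independent byte stream" idea concrete, here is a minimal sketch of a stream that always writes integers in a fixed byte order, so the resulting bytes could be stored as a BLOB and read back on any platform. `ByteStream` and `Hit` are hypothetical names for illustration and are not the GAUDI `StreamBuffer` or an LHCb event class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical byte stream with a fixed (little-endian) on-disk byte order.
class ByteStream {
public:
    // Append the value byte by byte, least significant byte first.
    ByteStream& operator<<(std::uint32_t v) {
        for (int shift = 0; shift < 32; shift += 8)
            bytes_.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
        return *this;
    }

    // Rebuild the value from the next four bytes, independent of host byte order.
    ByteStream& operator>>(std::uint32_t& v) {
        assert(readPos_ + 4 <= bytes_.size());
        v = 0;
        for (int shift = 0; shift < 32; shift += 8)
            v |= static_cast<std::uint32_t>(bytes_[readPos_++]) << shift;
        return *this;
    }

    // The raw bytes, i.e. what would be stored as the BLOB.
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t readPos_ = 0;
};

// Hypothetical data object that streams its members in a fixed order.
struct Hit {
    std::uint32_t channel = 0;
    std::uint32_t adc = 0;

    ByteStream& serialize(ByteStream& s) const { return s << channel << adc; }
    ByteStream& deserialize(ByteStream& s) { return s >> channel >> adc; }
};
```

Because the database only sees the opaque byte vector, it needs a single persistent type for all objects; the price is that the member order in `serialize` and `deserialize` becomes the only description of the data layout.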

Questions
- Are the BLOBs a problem?
- Is a data dictionary necessary?
  - Generation of converters
  - C++ and Java interoperability
  - Handling of schema updates could possibly be automated
- Is schema evolution sufficiently supported?
  - Persistent schema + transient schema
- Does this model also work for the online?

Binary Large Objects (BLOB)
+ Only one type of persistent object
+ No updates to the persistent schema (Objectivity/DB)
- Low-level access to object properties is lost
- Knowledge about data interpretation must not be lost

Language Independent Data Description
+ Automatic generation of C++/Java stubs
+ Automatic generation of converters
+ Store dictionary with the data in the database
- Argghh! Another pre-processor????
- Can only describe data, no "real" behaviour

Event Tags and Collections
- N-tuple like quantities with reference to the event
- We simply need them
- Content must be configurable
  - "Official" tags from online, collaboration-wide data processing
  - Group tags
  - User tags
- Must support queries
  - SQL?

Schema Updates
- Is our class ID/version mechanism sufficient?
  - Class identifier equivalent: major match
  - Class version equivalent: minor match
- Objectivity's mechanism is not sufficient
- How do we supply data to "improved" objects?
  - Is this possible at all?
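
The class ID/version check can be read as a two-level comparison between what was written and what the running program expects. The sketch below encodes one possible interpretation and is an assumption, not the GAUDI mechanism: equal class IDs give a major match, and equal versions on top of that give a minor match. `ClassStamp`, `SchemaMatch` and `compareSchema` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical stamp stored with every persistent object.
struct ClassStamp {
    std::uint32_t classId;
    std::uint32_t version;
};

// Result of comparing the persistent stamp with the transient class.
enum class SchemaMatch {
    None,           // different class: cannot be read as this type
    MajorOnly,      // same class, different version: needs a conversion step
    MajorAndMinor   // same class and version: read directly
};

// Compare class ID first (major), then version (minor).
constexpr SchemaMatch compareSchema(ClassStamp persistent, ClassStamp transient) {
    return persistent.classId != transient.classId ? SchemaMatch::None
         : persistent.version != transient.version ? SchemaMatch::MajorOnly
                                                   : SchemaMatch::MajorAndMinor;
}
```

A `MajorOnly` result is exactly the case the slide asks about: the data belongs to the right class, but some rule is still needed to fill the members of the "improved" object.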

Data Storage Hierarchy
[Diagram: two layered stacks below the physics algorithm and the framework. (a) Full blown: transient data representation, persistent data representation, database internal representation, database managed disk files, optional network, tertiary storage (tape). (b) Light weight: transient data representation, persistent data representation, database internal representation, database managed disk files. Side labels: database technology, HSM.]

What to do? Data Challenges
- "Never believe something will 'scale', you've been there or not"
- "Individually all components work fine, when put together only then the problems show."
- Data Challenges
  - Identify missing components
  - Check out persistency model
  - Possibility to stress the database technology

Data Challenges: BaBar, ALICE, CMS
- BaBar
  - 1st: Test the data processing chain
  - 2nd: Test of Objectivity
- ALICE
  - 1st/2nd: Test of ROOT I/O + data management
- CMS
  - Simulation/analysis environment using Objectivity

First Data Challenge (1)
- What should be tested?
- Processing scenarios
  - Simulation (online like?)
  - Reprocessing
  - Physics analysis
- Use of "the" complete database technology
  - GAUDI is open
  - Test one database or several?
- Focus on the "usability" questions
  - Does the database fulfill basic performance needs?

First Data Challenge (2)
- Identify missing components
  - Data management
  - Farm management
  - HSM
- "Dummy" data processing software
  - Emulate the software's behavior
  - Intelligent guess of object access
- Small dedicated facility
  - A few disconnected boxes
  - No interference with other users (reboot, ...)

First Data Challenge: Setup
[Diagram labels: disk, hub, database server, client(s).]

Second Data Challenge
- Test full processing chain
  - Test missing components from DC1
- Complete data processing software
  - Simulation, reconstruction, analysis programs
- Small dedicated facility
  - Cannot be a "dedicated" LHCb facility
  - IT farm project
  - GRID?

Second Data Challenge
[Diagram labels: DB servers, disk, tape robot, switch.]

Conclusions
- There are open questions to be solved
- Questions about the framework
  - Data modeling
  - Schema handling
- Questions about technologies to be used
  - Database technology
  - Data management
- Once these are addressed: should we go for a Data Challenge?

Conclusions
- I only gave the discussion input
- We have to define the TODO list together
- Let's go for the discussion