CHEP 2003 March 22-28, 2003 POOL Data Storage, Cache and Conversion Mechanism Motivation Data access Generic model Experience & Conclusions D.Düllmann,

Presentation transcript:

POOL Data Storage, Cache and Conversion Mechanism
CHEP 2003, March 22-28, 2003
D. Düllmann, M. Frank, G. Govi, I. Papadopoulos, S. Roiser

Outline:
- Motivation
- Data access
- Generic model
- Experience & Conclusions

M. Frank, CERN/LHCb, CHEP 2003, March 22-28, 2003

Motivation
- Physics software should be independent of the underlying data storage technology
- Data of different natures has to be accessed
  - Event data, detector data, statistical data, …
- The data sizes: O(10^6) to O(10^13) bytes/experiment/year
- The access patterns differ
- It is unclear how these data will be stored
- Locking into one technology may be a disadvantage
Need for a technology-free data storage and data access mechanism

Strategy
- Hide any technology details from the clients
- Clients deal with objects or object references
- Hide all cache/persistency-specific details
- No compromise on transient data representation due to technology details
- Each technology can be handled transparently
- Transient representation sufficient for persistency
- Ensure independence of experiment framework
- Run-time binding of transient data to the underlying technology
- Need for object description: "dictionary"

Client Data Access
- Clients access data through references (Ref<T>)
- The Data Service manages the object cache
- Different contexts (event data, detector data, other) each have their own Data Service and object cache
[Diagram: several clients, each holding a Ref<T> into an Object Cache managed by a Data Service]

Cache Access Through References
- References know about the data cache
- 2 operation modes:
  - Clear at checkpoint
  - Auto-clear with reference count
- References are implemented as smart pointers
- Use cache manager for "load-on-demand"
- Use the object key of the cache manager
[Diagram labels: Object; Reference in Cache Manager; Reference to Cache Manager; Ref<T>; Pointer to object; Dereference]
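The smart-pointer idea on this slide can be illustrated with a minimal sketch. Everything named here (CacheManager, its loader callback, the Track type, the string keys) is an assumption made for illustration and is not the POOL interface; the sketch only shows a reference that knows its cache manager and object key and resolves the object on first dereference.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical cache manager: maps an object key to a cached object and
// calls a loader when the key is not yet present ("load-on-demand").
template <typename T>
class CacheManager {
public:
    using Loader = std::function<std::shared_ptr<T>(const std::string&)>;
    explicit CacheManager(Loader loader) : loader_(std::move(loader)) {}

    std::shared_ptr<T> get(const std::string& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // cache hit
        std::shared_ptr<T> obj = loader_(key);       // cache miss: load
        ++loads_;
        cache_[key] = obj;
        return obj;
    }
    void clear() { cache_.clear(); }  // e.g. called at a checkpoint
    int loads() const { return loads_; }

private:
    Loader loader_;
    std::map<std::string, std::shared_ptr<T>> cache_;
    int loads_ = 0;
};

// Hypothetical reference: holds the cache manager and the object key,
// and asks the cache for the object only when it is dereferenced.
template <typename T>
class Ref {
public:
    Ref(CacheManager<T>& cache, std::string key)
        : cache_(&cache), key_(std::move(key)) {}
    T& operator*() { return *resolve(); }
    T* operator->() { return resolve(); }

private:
    T* resolve() {
        if (!ptr_) ptr_ = cache_->get(key_);
        return ptr_.get();
    }
    CacheManager<T>* cache_;
    std::string key_;
    std::shared_ptr<T> ptr_;
};

struct Track { double pt; };

// Two references to the same key cause a single load.
int loads_for_two_refs() {
    CacheManager<Track> cache([](const std::string&) {
        return std::make_shared<Track>(Track{1.5});
    });
    Ref<Track> a(cache, "event/track0");
    Ref<Track> b(cache, "event/track0");
    double sum = a->pt + (*b).pt;
    assert(sum == 3.0);
    return cache.loads();
}
```

Using std::shared_ptr here is one possible way to approximate the reference-counted mode: an object stays alive as long as some reference still points to it, even after the cache is cleared.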

Cache Access by Smart Pointer
[Diagram labels: Data Service; object cache; Token; Object; Persistency Service; Object type; Storage type; Persistent Reference; Cache Ref; Data Service Pointer; Ref; File Catalog]

Generic Persistent Model
- Transient: objects & pointers
- Persistent: objects, object IDs, collections & DBs
- C++ pointer >> object ID

Access to the Data
(1) read(…): the client tries to access an object through a Ref<T>
(2) Look-up in the data cache (Data Service): will be unsuccessful, the requested object is not present
(3) Load request to the Persistency Service (technology dispatcher)
(5) Register
  - Object
  - References
[Diagram labels: Client; Ref<T>; Data Service; Data Cache; Persistency Service; Conversion Service; Storage Service; Common Handling; data]
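A rough sketch of the read path (look-up, load request through a technology dispatcher, registration in the cache) follows. The class names mirror the slide's boxes, but their members, the integer storage-type codes and the string payloads are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical per-technology back end (stands in for a conversion and
// storage service pair); objects are represented as plain strings here.
struct ConversionService {
    virtual ~ConversionService() = default;
    virtual std::string load(const std::string& key) = 0;
};
struct RootLikeService : ConversionService {
    std::string load(const std::string& key) override { return "root:" + key; }
};
struct RdbmsLikeService : ConversionService {
    std::string load(const std::string& key) override { return "rdbms:" + key; }
};

// Persistency service acting as technology dispatcher: it selects the
// back end from the storage type and forwards the load request.
class PersistencyService {
public:
    void add(int storageType, std::unique_ptr<ConversionService> svc) {
        services_[storageType] = std::move(svc);
    }
    std::string load(int storageType, const std::string& key) {
        auto it = services_.find(storageType);
        if (it == services_.end())
            throw std::runtime_error("unknown storage type");
        return it->second->load(key);
    }

private:
    std::map<int, std::unique_ptr<ConversionService>> services_;
};

// Data service: looks in the cache first, issues a load request on a
// miss, then registers the loaded object in the cache.
class DataService {
public:
    explicit DataService(PersistencyService& p) : persistency_(p) {}
    const std::string& read(int storageType, const std::string& key) {
        auto it = cache_.find(key);                                  // look-up
        if (it == cache_.end()) {
            std::string obj = persistency_.load(storageType, key);   // load request
            it = cache_.emplace(key, std::move(obj)).first;          // register
        }
        return it->second;
    }
    std::size_t cached() const { return cache_.size(); }

private:
    PersistencyService& persistency_;
    std::map<std::string, std::string> cache_;
};

std::string read_example() {
    PersistencyService p;
    p.add(1, std::unique_ptr<ConversionService>(new RootLikeService));
    p.add(2, std::unique_ptr<ConversionService>(new RdbmsLikeService));
    DataService d(p);
    d.read(1, "hits");  // first access: cache miss, loaded via dispatcher
    assert(d.cached() == 1);
    return d.read(1, "hits") + "," + d.read(2, "geometry");
}
```

The client code never names a technology class; only the storage-type code carried with the request decides which back end is used.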

Storing Objects
- Start transaction
- Mark references for write
- Commit transaction: map objects and write
[Diagram labels: Client; Object Cache; Data Service; Persistency Service (technology dispatcher); Conversion Service; Storage Service; Common handling]

cache.startTransaction(...)
Ref.mark_write(placement)
...
Ref.mark_write(placement)
cache.endTransaction(...,COMMIT)
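The transaction pattern in the slide's pseudocode might be sketched as below. The Cache class, the Placement fields, the markWrite name and the ROLLBACK mode are assumptions chosen for the example; the slide itself only shows startTransaction, mark_write(placement) and endTransaction with COMMIT, and its elided arguments are not reconstructed here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical placement hint: where an object should be written.
struct Placement {
    std::string database;
    std::string container;
};

enum class EndMode { COMMIT, ROLLBACK };

// Hypothetical cache that queues objects marked for write and only
// hands them to the store when the transaction is committed.
class Cache {
public:
    void startTransaction() {
        active_ = true;
        pending_.clear();
    }

    // Remember the object and its placement; nothing is written yet.
    bool markWrite(const std::string& objectKey, const Placement& where) {
        if (!active_) return false;
        pending_.emplace_back(objectKey, where);
        return true;
    }

    // On COMMIT, write all pending objects; otherwise discard them.
    // Returns the number of objects written.
    std::size_t endTransaction(EndMode mode) {
        std::size_t written = 0;
        if (active_ && mode == EndMode::COMMIT) {
            for (const auto& entry : pending_) {
                stored_.push_back(entry.second.database + "/" +
                                  entry.second.container + "/" + entry.first);
                ++written;
            }
        }
        pending_.clear();
        active_ = false;
        return written;
    }

    const std::vector<std::string>& stored() const { return stored_; }

private:
    bool active_ = false;
    std::vector<std::pair<std::string, Placement>> pending_;
    std::vector<std::string> stored_;  // stands in for the storage service
};

std::size_t store_example() {
    Cache cache;
    Placement p{"events.db", "Tracks"};

    cache.startTransaction();
    cache.markWrite("track0", p);
    cache.markWrite("track1", p);
    std::size_t committed = cache.endTransaction(EndMode::COMMIT);
    assert(committed == 2);

    cache.startTransaction();
    cache.markWrite("track2", p);
    cache.endTransaction(EndMode::ROLLBACK);  // track2 is discarded

    return cache.stored().size();
}
```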

The Storage Mechanism
- The underlying model assumptions
- How they map to "known" technologies
- Migrating objects to/from the persistent medium
  - Object mapping
  - Reference handling
    - References are objects, not primitives
    - Need setup: reference to data cache
    - ROOT: callback for base class (Streamer)

The Generic Model
[Diagram: the StorageSvc moves objects between the Data Cache and disk storage, which is organized as database > container > objects]
An object is located by:
- Object type (class name)
- Optional data transform
- Storage type
- DB name
- Container name
- Item ID
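The addressing scheme of the generic model can be sketched with a small token structure and a nested map standing in for database, container and item. The Token field names, the integer storage-type code and the writeObject/readObject helpers are hypothetical; the optional data transform is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical token carrying the fields listed on the slide.
struct Token {
    std::string objectType;  // class name
    int storageType;         // technology code (assumed to be an integer)
    std::string dbName;
    std::string contName;
    long itemId;
};

// database name -> container name -> item ID -> serialized object
using Container = std::map<long, std::string>;
using Database = std::map<std::string, Container>;
using Storage = std::map<std::string, Database>;

// Append a payload to a container and return the token that locates it.
Token writeObject(Storage& s, const std::string& type, int storageType,
                  const std::string& db, const std::string& cont,
                  const std::string& payload) {
    Container& c = s[db][cont];
    long id = static_cast<long>(c.size());
    c[id] = payload;
    return Token{type, storageType, db, cont, id};
}

// Resolve a token level by level; returns nullptr if any level is missing.
const std::string* readObject(const Storage& s, const Token& t) {
    auto d = s.find(t.dbName);
    if (d == s.end()) return nullptr;
    auto c = d->second.find(t.contName);
    if (c == d->second.end()) return nullptr;
    auto i = c->second.find(t.itemId);
    return i == c->second.end() ? nullptr : &i->second;
}

std::string token_example() {
    Storage s;
    writeObject(s, "Track", 1, "events.db", "Tracks", "first");
    Token t = writeObject(s, "Track", 1, "events.db", "Tracks", "second");
    const std::string* p = readObject(s, t);
    if (!p) return std::string("missing");
    return *p + "@" + std::to_string(t.itemId);
}
```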

Database Technologies
- Identify commonalities and differences between technologies
  - Necessary knowledge when reading/writing
- Model adapts to any technology with direct record access
  - Need to know record identifier in advance
- RDBMS: more or less traditional
  - Primary key must be uniquely determined before writing
  - Probably two round-trips

Object Mapping
- Objects must maintain personality when persistent
  - Allow for queries, selections and independent element access
- If technology supports objects…
  - Want to make use of such features
  - These technologies must be instructed how to do it
  - Need object dictionary
- If technologies support only primitives
  - Split objects into primitives [until reasonable level]
  - Need full access to object member data [member offset, type]
  - Constructor and destructor with defined signature
  - Need object dictionary
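To illustrate why member offsets and types are needed when a technology supports only primitives, here is a minimal hand-written dictionary sketch. The FieldInfo layout, the Hit class and the text output format are assumptions for the example; in the talk's setting the dictionary is generated rather than written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

enum class FieldType { Int, Double };

// Hypothetical dictionary entry: name, primitive type and byte offset
// of one data member.
struct FieldInfo {
    std::string name;
    FieldType type;
    std::size_t offset;
};

// Example transient class (standard layout, so offsetof is well defined).
struct Hit {
    int channel;
    double energy;
};

// Hand-written dictionary for Hit.
std::vector<FieldInfo> hitDictionary() {
    return {
        {"channel", FieldType::Int, offsetof(Hit, channel)},
        {"energy", FieldType::Double, offsetof(Hit, energy)},
    };
}

// Split an object into named primitive values using only the dictionary;
// this function has no compile-time knowledge of Hit.
std::vector<std::string> splitIntoPrimitives(
        const void* object, const std::vector<FieldInfo>& dict) {
    std::vector<std::string> columns;
    const char* base = static_cast<const char*>(object);
    for (const FieldInfo& f : dict) {
        if (f.type == FieldType::Int) {
            int v;
            std::memcpy(&v, base + f.offset, sizeof v);
            columns.push_back(f.name + "=" + std::to_string(v));
        } else {
            double v;
            std::memcpy(&v, base + f.offset, sizeof v);
            columns.push_back(f.name + "=" + std::to_string(v));
        }
    }
    return columns;
}

std::vector<std::string> dictionary_example() {
    Hit h{7, 2.5};
    return splitIntoPrimitives(&h, hitDictionary());
}
```

Each resulting string corresponds to one primitive value that a column-oriented or relational back end could store on its own.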

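Dictionary: Population/Conversion
[Diagram: dictionary generation starts from .h header files and follows two paths. ROOTCINT produces CINT dictionary code, which populates the CINT dictionary used for data I/O (technology dependent). GCC-XML produces .xml, from which a code generator produces LCG dictionary code, which populates the LCG dictionary (reflection) used by other clients. A gateway connects the LCG dictionary and the CINT dictionary.]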

Follow Object Associations
[Diagram: steps (1) to (4) lead from a Token to an Object. A persistent reference holds a Link ID and an Entry ID; the Link ID selects the Link Info (DB/Cont.name, …) in a local lookup table kept in each file.]

The Link Table
- Contains all information to resurrect an object
  - Storage type
  - Database name
  - Container name
  - Object type (class name)
- Cache hints
  - E.g. other possible transient conversions
- Size: O(associations in class model)
  - Local to every database
  - Size is limited
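A small sketch of why the link table stays limited in size: references that share the same target information reuse one table entry and store only a link ID plus an entry ID. The LinkInfo, LinkTable and PersistentRef types are hypothetical, and cache hints are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical link information, following the fields on the slide.
struct LinkInfo {
    int storageType;
    std::string dbName;
    std::string contName;
    std::string objectType;  // class name
};

bool operator==(const LinkInfo& a, const LinkInfo& b) {
    return a.storageType == b.storageType && a.dbName == b.dbName &&
           a.contName == b.contName && a.objectType == b.objectType;
}

// Per-database lookup table: identical link information is stored once.
class LinkTable {
public:
    // Return the ID of an identical existing entry, or append a new one.
    std::size_t linkId(const LinkInfo& info) {
        for (std::size_t i = 0; i < links_.size(); ++i)
            if (links_[i] == info) return i;
        links_.push_back(info);
        return links_.size() - 1;
    }
    const LinkInfo& info(std::size_t id) const { return links_.at(id); }
    std::size_t size() const { return links_.size(); }

private:
    std::vector<LinkInfo> links_;
};

// What a persistent reference stores: an index into the link table
// plus the entry within the target container.
struct PersistentRef {
    std::size_t linkId;
    long entryId;
};

std::size_t link_table_example() {
    LinkTable table;
    std::vector<PersistentRef> refs;
    LinkInfo tracks{1, "events.db", "Tracks", "Track"};
    LinkInfo hits{1, "events.db", "Hits", "Hit"};

    // 1000 references into the same container share one table entry.
    for (long i = 0; i < 1000; ++i)
        refs.push_back({table.linkId(tracks), i});
    refs.push_back({table.linkId(hits), 0L});

    assert(table.info(refs.back().linkId).contName == "Hits");
    return table.size();
}
```

In this sketch the table grows with the number of distinct targets, not with the number of stored references.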

Experience & Conclusions
- We adopted a mechanism to write physics data without knowledge of the underlying store technology
- Our approach can adopt any technology based on database files, collections and objects within collections
  - ROOT (implemented) and RDBMS (work ongoing)
- We are able to choose technologies according to needs
  - We can save any object described by the dictionary
- We can offer a uniform interface to persistency clients