Evolution of the Configuration Database Design
Andrei Salnikov, SLAC, for the BaBar Computing Group
ACAT05, DESY, Zeuthen


May 23, 2005, ACAT

Talk overview
- BaBar configuration database
- Migration of BaBar databases
- New configuration database API
- Choice of technologies
- Implementation, dynamic loading
- Lessons learned from migration

BaBar on-line databases
- Conditions database
  - Calibrations, geometry, alignment, etc.
  - Data accessed with the event time
- Ambient database (simplified conditions)
  - Time history of the data-taking conditions (voltages, currents, temperatures, etc.)
  - Part of the on-line detector control
- Configuration database
  - Details follow
- Prompt Reconstruction databases
  - Support for multi-node calibrations
- Electronic Logbook
- Etc.

Configuration database
- Important part of the on-line system
- Keeps configuration data: detector and software settings for data taking
- Supports configuration of the on-line hardware and software in preparation for data taking
- For more details see the CHEP03 talk
- Has worked reasonably well since the beginning of Run 1 in 1999

BaBar database migration
- BaBar was using the Objectivity/DB ODBMS for many of its databases
- About two years ago BaBar started migrating the event store from Objectivity to ROOT, which was a success and an improvement
- No reason to keep pricey Objectivity only because of secondary databases
- Migration effort started in 2004 for the conditions, configuration, prompt reconstruction, and ambient databases

Scope of the migration
- The configuration database currently holds about 60MB (uncompressed) of configuration data
- About 30 persistent classes represent the configuration data
- Rather compact and simple compared to other BaBar databases (e.g. the conditions database is 40GB and 400 classes)
- Due to its size and simplicity, the configuration database is a perfect candidate for a pilot project within the migration effort

Configuration database API
- Main problem of the old database: the API exposed too much of the implementation technology
  - Persistent objects, handles, class names, etc.
- The API has to change, but we don't want to make the same mistakes again (new mistakes are more interesting)
- Pure transient-level abstract API, independent of any specific implementation technology
  - Much easier to re-implement in a different technology if we don't like one particular technology
  - Can support more than one technology if needed (vital)
  - Client code independent of any persistency technology
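
A technology-neutral, transient-level API of the kind described on this slide could look roughly like the C++ sketch below. The names `ConfigStore`, `InMemoryStore`, and `describe` are hypothetical and are not taken from the BaBar code; the only point is that client code compiles against the abstract interface and never sees a persistent class or handle.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical abstract interface: nothing here mentions Objectivity,
// ROOT, or MySQL, so clients depend only on transient types.
class ConfigStore {
public:
    virtual ~ConfigStore() = default;
    // Copies the value for `key` into `value`; returns false if the key is unknown.
    virtual bool get(const std::string& key, std::string& value) const = 0;
    // Stores or overwrites the value for `key`.
    virtual void put(const std::string& key, const std::string& value) = 0;
};

// One possible implementation, kept in memory purely for illustration.
class InMemoryStore : public ConfigStore {
public:
    bool get(const std::string& key, std::string& value) const override {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }
    void put(const std::string& key, const std::string& value) override {
        data_[key] = value;
    }

private:
    std::map<std::string, std::string> data_;
};

// Client code: written once against the interface, unaware of the backend.
std::string describe(const ConfigStore& store, const std::string& key) {
    std::string value;
    if (!store.get(key, value)) return key + " is not set";
    return key + "=" + value;
}
```

Replacing `InMemoryStore` with another subclass would leave `describe` untouched, which is the property the slide asks for.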

Prototyping New API
- To test the details of the new API and one specific implementation, a prototype implementation was built
- A great prototyping language was used: Python
- MySQL was used for data storage
- The prototype is not exact because the languages are very different: Python is dynamic, C++ is static
- But it helped a lot with quick testing of different ideas and with optimizing SQL tables

New Database API
- Defining the new API is easy: after five years of experience it can't be done wrong
- Expressing this API in C++ is not so easy
  - one particular problem: how to pass non-polymorphic types through an abstract interface
  - in C++ interfaces and templates do not mix
  - the solution is to use RTTI at the transient level
  - smart AnyPtr (void* with enough RTTI)
  - the mechanism is hidden from client code; clients only see type-safe methods
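
The "void* with enough RTTI" idea can be sketched as follows. This is only an illustration under assumptions: the slide names `AnyPtr` but does not show its interface, so the members below, together with `ObjectStore` and `MapStore`, are hypothetical. Virtual functions cannot be templates, so the type-safe template methods are non-virtual wrappers that forward an untyped but type-tagged pointer through the virtual interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeinfo>

// Hypothetical AnyPtr: an untyped, non-owning pointer that remembers
// the static type it was created from.
class AnyPtr {
public:
    AnyPtr() : ptr_(nullptr), type_(&typeid(void)) {}

    template <typename T>
    explicit AnyPtr(const T* p) : ptr_(p), type_(&typeid(T)) {}

    // Returns the pointer only if T matches the recorded type, else nullptr.
    template <typename T>
    const T* cast() const {
        return (*type_ == typeid(T)) ? static_cast<const T*>(ptr_) : nullptr;
    }

private:
    const void* ptr_;
    const std::type_info* type_;
};

// Abstract interface: clients call the type-safe templates, while
// implementations override only the untyped virtual methods.
class ObjectStore {
public:
    virtual ~ObjectStore() = default;

    template <typename T>
    void store(const std::string& key, const T* object) {
        doStore(key, AnyPtr(object));
    }

    template <typename T>
    const T* fetch(const std::string& key) const {
        return doFetch(key).cast<T>();
    }

protected:
    virtual void doStore(const std::string& key, AnyPtr object) = 0;
    virtual AnyPtr doFetch(const std::string& key) const = 0;
};

// Toy implementation that keeps the (non-owned) pointers in a map.
class MapStore : public ObjectStore {
protected:
    void doStore(const std::string& key, AnyPtr object) override {
        objects_[key] = object;
    }
    AnyPtr doFetch(const std::string& key) const override {
        auto it = objects_.find(key);
        return it == objects_.end() ? AnyPtr() : it->second;
    }

private:
    std::map<std::string, AnyPtr> objects_;
};
```

A client that stores an `int` and later asks for a `double` under the same key gets a null pointer back instead of a misinterpreted object, so the type check happens at run time while the client-facing methods stay type-safe.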

API migration: clients
- All clients should be changed to use the new API
- All client code should be made free of the old implementation details
- Most time-consuming task in the whole migration
- Re-examining dependencies, re-packaging code, re-thinking design at the global level
- The process is hard, sometimes conflicting with other development, but the result is beneficial for everybody: cleaner code and fewer dependencies

Bridge Implementation
- Having an abstract API, we need to build specific implementation(s)
- Start with the obvious: we still have data in Objectivity, so build a bridge implementation on top of the old database
  - proof of a working API
  - being used until the migration is complete
  - but the goal is to have different technology(-ies)

Choosing Technology
- Time to choose implementation technologies, real alternatives to Objectivity
- Many considerations: data access and distribution, reliability, data modeling capabilities, etc.

Choosing technology
- Two classes of clients with different requirements:
  - the production site needs reliable, fault-tolerant, concurrent read-write access to the data
  - remote sites need zero-management, easy-to-use, fast, scalable read-only access to the data
- It should be possible to use different implementation technologies at production and remote sites

Read-only implementation
- ROOT is an obvious choice for storing read-only data:
  - Data definition using C++ constructs allows easy migration of the Objectivity schema
  - No servers and no management needed for small installations
  - For large installations an xrootd server could be used for load-balancing
  - BaBar data distribution already knows about ROOT

Read-only implementation
- Migration was straightforward
- ROOT-persistent classes are an (almost) exact copy of the Objectivity DDLs
- Data size is small, everything fits in one file
- Metadata and data objects are stored in TTrees, à la relational tables in an SQL database
- Some complications due to the lack of proper indexing support in ROOT TTrees; some additional structures were needed

Read-write implementation
- Need something more robust for this: a real database
- Practically the only option is a relational database, either commercial like Oracle or free like MySQL
- Need to augment it with an object-relational mapping, which is not quite trivial even though many databases call themselves object-relational
- But we could reuse data definitions from the ROOT read-only implementation
- Not so critical for the configuration database, but will be essential for conditions

MySQL implementation
- Initial implementation is in MySQL, but we can switch to Oracle later if MySQL fails
- Configuration objects are self-contained, so store them as BLOBs in the relational tables
- Need to serialize object data for storage:
  - convert to ROOT object
  - call Streamer() to stream data into buffer
  - compress buffer, store as BLOB
- De-serialization works in the opposite direction
- Reuse of the persistent classes from the ROOT read-only implementation!
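
The storage steps on this slide (stream into a buffer, compress, store as BLOB, and the reverse for reading) can be sketched without ROOT or MySQL. Everything below is a stand-in chosen for illustration: `TriggerConfig` is a hypothetical configuration object, a raw `memcpy` replaces ROOT's `Streamer()`, a simple run-length encoding replaces the real compression, and a `std::map` plays the role of the relational table with a BLOB column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

using Blob = std::vector<std::uint8_t>;

// Hypothetical self-contained configuration object (not a BaBar class).
struct TriggerConfig {
    std::uint32_t prescale;
    std::uint32_t thresholds[4];
};

// Step 1: stream the object into a byte buffer (stand-in for Streamer()).
// A raw memcpy is only valid for trivially copyable types on one platform.
Blob serialize(const TriggerConfig& cfg) {
    Blob buf(sizeof(cfg));
    std::memcpy(buf.data(), &cfg, sizeof(cfg));
    return buf;
}

TriggerConfig deserialize(const Blob& buf) {
    TriggerConfig cfg{};
    assert(buf.size() == sizeof(cfg));
    std::memcpy(&cfg, buf.data(), sizeof(cfg));
    return cfg;
}

// Step 2: compress the buffer. Run-length encoding as (count, byte) pairs
// is used here only as a stand-in for a real compressor.
Blob compress(const Blob& in) {
    Blob out;
    for (std::size_t i = 0; i < in.size();) {
        std::uint8_t run = 1;
        while (run < 255 && i + run < in.size() && in[i + run] == in[i]) ++run;
        out.push_back(run);
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

Blob decompress(const Blob& in) {
    Blob out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.insert(out.end(), static_cast<std::size_t>(in[i]), in[i + 1]);
    return out;
}

// Step 3: store the compressed buffer as a BLOB keyed by name
// (stand-in for a relational table with a BLOB column).
class BlobTable {
public:
    void store(const std::string& key, const TriggerConfig& cfg) {
        rows_[key] = compress(serialize(cfg));
    }
    // Reading works in the opposite direction: fetch, decompress, deserialize.
    TriggerConfig load(const std::string& key) const {
        return deserialize(decompress(rows_.at(key)));
    }

private:
    std::map<std::string, Blob> rows_;
};
```

Because the database only ever sees an opaque byte string, the same persistent class definitions can serve both the ROOT file and the relational backend, which is the reuse the slide points out.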

Building applications
- Now we have three implementations: Bridge (temporary), ROOT, and MySQL
- The decision on which one to use should be delayed until run time, so that the same application can run everywhere
- Should we link all implementations into all applications?
  - Not good: we don't want to install MySQL if only ROOT read-only access is needed
- Load implementations dynamically at run time
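
Delaying the choice of implementation until run time is commonly done with a factory keyed by a technology name. The sketch below assumes that approach; the talk does not say how BaBar selects the implementation, and `ConfigBackend`, `RootBackend`, `MySqlBackend`, and `openBackend` are hypothetical names. In a real application the name would come from a command-line option, an environment variable, or a configuration file.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical common interface shared by all implementations.
class ConfigBackend {
public:
    virtual ~ConfigBackend() = default;
    virtual std::string name() const = 0;
};

class RootBackend : public ConfigBackend {
public:
    std::string name() const override { return "root"; }
};

class MySqlBackend : public ConfigBackend {
public:
    std::string name() const override { return "mysql"; }
};

using BackendFactory = std::function<std::unique_ptr<ConfigBackend>()>;

// Registry mapping a technology name to a function that creates the backend.
const std::map<std::string, BackendFactory>& registry() {
    static const std::map<std::string, BackendFactory> factories = {
        {"root", [] { return std::unique_ptr<ConfigBackend>(new RootBackend); }},
        {"mysql", [] { return std::unique_ptr<ConfigBackend>(new MySqlBackend); }},
    };
    return factories;
}

// Called at run time with the technology name chosen for this site.
std::unique_ptr<ConfigBackend> openBackend(const std::string& technology) {
    auto it = registry().find(technology);
    if (it == registry().end())
        throw std::invalid_argument("unknown configuration backend: " + technology);
    return it->second();
}
```

The same binary can then call `openBackend("root")` at a remote site and `openBackend("mysql")` at the production site. This sketch still links both backends into the application, which is exactly the drawback the slide raises and the reason for loading implementations dynamically.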

Dynamic loading
- In principle it should be possible to split things into static/dynamic
- In practice it is very complex and needs extreme care
- In BaBar we ended up with the dynamic loading of libmysqlclient.so only
  - the ROOT implementation is linked into the application
  - part of the MySQL implementation too
  - but there is no link dependency on MySQL
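
Loading only the client library at run time, with no link-time dependency on it, is typically done on POSIX systems with `dlopen` and `dlsym`. The sketch below assumes that mechanism; the talk does not show BaBar's actual code. `SharedLibrary` and `mysqlClientVersion` are hypothetical names, while `mysql_get_client_info` is a function of the MySQL C API that returns the client version string.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Thin RAII wrapper around dlopen/dlclose (POSIX only).
class SharedLibrary {
public:
    explicit SharedLibrary(const std::string& path)
        : handle_(dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL)) {}
    ~SharedLibrary() {
        if (handle_) dlclose(handle_);
    }
    SharedLibrary(const SharedLibrary&) = delete;
    SharedLibrary& operator=(const SharedLibrary&) = delete;

    bool loaded() const { return handle_ != nullptr; }

    // Returns the address of `name`, or nullptr if the library or symbol is missing.
    void* symbol(const std::string& name) const {
        return handle_ ? dlsym(handle_, name.c_str()) : nullptr;
    }

private:
    void* handle_;
};

// Matches the C declaration: const char* mysql_get_client_info(void);
using ClientInfoFn = const char* (*)();

// Resolves the function at run time, so the application itself has no
// link dependency on the MySQL client library.
std::string mysqlClientVersion(const std::string& libraryPath) {
    SharedLibrary lib(libraryPath);
    auto fn = reinterpret_cast<ClientInfoFn>(lib.symbol("mysql_get_client_info"));
    // The std::string is built before `lib` is destroyed and the library unloaded.
    return fn ? fn() : "MySQL client library not available";
}
```

Calling `mysqlClientVersion("libmysqlclient.so")` on a machine without MySQL installed simply reports that the library is unavailable instead of failing at program start. With glibc versions older than 2.34 the program must be linked with `-ldl`.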

Current Status
- New API and all three implementations are now in production
- The Bridge implementation is what we are currently using in production
- Data in the two other implementations are constantly updated from production data
- Preparing data distribution for ROOT configuration data
- Planning to switch production to the MySQL implementation

Lessons learned
- Always make abstract APIs to avoid problems in the future (this may be hard and need a few iterations)
- Client code should be free from any specific database implementation details
- Early prototyping could answer a lot of questions, but five years of experience count too
- Use different implementations for clients with different requirements
- The implementation would benefit from features currently missing in C++: reflection, introspection (or from a completely new language)

Conclusions
- We have redesigned and reimplemented the BaBar configuration database
- Different alternative implementations exist for clients with different requirements
- No specific performance figures, but it should be better than the original Objectivity performance
- Good experience which will be used in the migration of other BaBar databases