CMS Database Projects
Lee Lueking, CMS Activity Coordination Meeting, July 20, 2004

Topics
– Status report on the HCAL Testbeam Detector DB project, and…
– How does the testbeam work fit into the broader CMS DB context?
– Building a POOL plug-in for FroNtier
– LCG Distributed Deployment of Databases

HCAL Testbeam Detector DB

HCAL Det DB Focus
Equipment Configuration DB
– Relationships for all HCAL detector components: wedges, layers, read-out boxes (RBX), cables, HCAL Trigger (HTR) cards.
– Test results for various components, e.g. RBX, QIE.
Conditions DB
– DCS slow-controls logging DB: temperatures, HV, LV, beam properties, etc.
– Calibration DB: pedestals, gains, timing information for each channel.
– Configuration DB: configuration info that is downloaded to the RBX.
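
As a rough illustration of the per-channel calibration content listed above (pedestals, gains, timing), a record might look like the following sketch; the struct and field names are assumptions for illustration, not the actual HCAL schema.

```cpp
// Illustrative only: a per-channel calibration record of the kind the
// slide describes. Field names and units are assumptions.
#include <cstdint>
#include <iostream>

struct HcalChannelCalib {
    std::uint32_t channelId;  // detector channel identifier
    double pedestal;          // ADC pedestal
    double gain;              // gain, e.g. GeV per ADC count
    double timeOffset;        // timing offset in ns
};

int main() {
    HcalChannelCalib c{42, 3.1, 0.177, 1.5};
    std::cout << "channel " << c.channelId
              << " pedestal " << c.pedestal << "\n";
}
```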

HCAL Detector Configuration Details

Manpower & Status
FNAL Manpower
– PPD/CMS: Shuichi Kunori (UMD, 0.2), Taka Yasuda (0.2), Stefan Piperov (0.8 → 0.5), Jordan Damgov (0.8 → 0.5), Gennadiy Lukhanin (→ 1.0, new hire)
– CD/CEPA/DBS: Lee Lueking (0.2), Yuyi Guo (0.8)
– CD/CSS: Anil Kumar (0.2), Maurine Mihalek (0.0 → 0.2)
Status
– Schema designs completed for the EqConf, SlowCont, and Calib DBs. Extensive reviews of the DDLs finished. Installation on the development machine is in progress.
– New production servers (Dell PE 2650) at FNAL and CERN with RH ES 3.0 loaded. The FNAL machine has OS patches installed and will get Oracle 10g soon. The machine will move to a rack in the machine room.
– Also, a new development server will get Oracle 10g this week.
– Loading scripts for the slow-controls logs are ready, tested, and in CVS. SC data is waiting to be loaded.

HCAL DetDB in the Broader Context of CMS

The LCG Conditions DB Project
Led by Andrea Valassi (CERN IT). Several participants (many from ATLAS). Strong BaBar influence.
The purpose of the ConditionsDB project is to develop software libraries and tools for the LHC experiments to store, retrieve, and manipulate conditions data.
The deliverables of the project include:
– A C++ API to store and retrieve conditions data
– Concrete implementations of the API using different persistent backends (such as Oracle and MySQL)
– Tools to manage, browse, replicate, and manipulate the data
– Test and example programs
Weekly phone conferences to discuss progress and ideas.
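
To make the deliverable list above a little more concrete, here is a minimal, self-contained sketch of storing and retrieving conditions data keyed by an interval of validity. The class and method names are hypothetical; this is not the ConditionsDB API itself, just the general shape of such an interface.

```cpp
// Hypothetical sketch of a conditions-style store/retrieve interface.
// The real LCG ConditionsDB C++ API differs; names here are illustrative.
#include <cstdint>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

// A payload of calibration values for one channel set.
struct CondPayload {
    std::vector<double> values;
};

// An interval of validity expressed as a [since, until) range.
struct IOV {
    std::uint64_t since;
    std::uint64_t until;
};

// Minimal in-memory stand-in for a conditions folder keyed by IOV start.
class ConditionsFolder {
public:
    void store(const IOV& iov, const CondPayload& payload) {
        data_[iov.since] = {iov, payload};
    }
    // Retrieve the payload whose IOV contains the given time, if any.
    bool retrieve(std::uint64_t time, CondPayload& out) const {
        auto it = data_.upper_bound(time);
        if (it == data_.begin()) return false;
        --it;
        if (time >= it->second.first.since && time < it->second.first.until) {
            out = it->second.second;
            return true;
        }
        return false;
    }
private:
    std::map<std::uint64_t, std::pair<IOV, CondPayload>> data_;
};

int main() {
    ConditionsFolder pedestals;
    pedestals.store({1000, 2000}, {{1.1, 1.2, 1.3}});  // valid for times 1000-1999
    CondPayload p;
    if (pedestals.retrieve(1500, p))
        std::cout << "pedestal[0] = " << p.values[0] << "\n";
}
```

In a real backend the folder would be persisted in Oracle or MySQL behind the same interface, which is exactly the multi-backend point of the deliverables above.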

Schema for HCDB calibration
[Schema diagram linking the HCAL Calib DB and the Cond DB: a calibration entry (ped, gain, or t0) records the algorithm used and a blob with the calibration info.]

Comparing Cond DB HVS w/ HCAL Calib DB (HCDB)
Hardware
– HCDB is for a specific sub-detector structure
– The Cond DB HVS approach is generic
Time
– HCDB uses run ranges for the test beam
– The Cond DB has IOVs (Intervals of Validity)
Data
– HCDB is concerned with the relation of the data as it goes in, as well as how it is used; it includes the algorithm used.
– The Cond DB seems to be focused on access to the data.
Tagging: similar concept for both.
– The Cond DB offers a more flexible approach.
– HCDB is simpler, with the constraints that run ranges impose.

The Equipment Management DB (EMDB)
Designed and implemented at CERN by Frank Glege to track the location of irradiated hardware, for French government legal reasons.
A production version is being used for existing components as the detector is built.
Comparison w/ the EqConf DB:

Feature                       EMDB             EqConfDB
Component relationships       no (planned)     yes
Component history             no (planned)     yes
Detailed set of components    no (only major)  yes
Currently in CERN             yes              no
Works w/ all sub-detectors    yes              no (HCAL)

Sizing DB Resources for CMS
Gennadiy is working w/ Frank Glege to estimate the DB needs for CMS. A process to get more detailed info from each detector group is planned.
Configuration database
– Start-of-run info needed daily: 10 GB
– Number of configurations (early): 10/month
– Expected addition each month: ~100 GB
Conditions database
– Average daily dataflow: 2 GB
– Expected size after 1 month: ~60 GB (≈ 2 GB/day × 30 days)

POOL Plug-in for FroNtier

RDBMS Abstraction in POOL (from the POOL Project Plan)
Motivation
– Vendor-neutral access to RDBMS backends for the relational components in POOL: FileCatalog, collections, relational storage manager.
– Driven by CMS requirements: access existing relational data as C++ objects using POOL, e.g. conditions data, configuration data, etc.
Requirements collection and component analysis started late last year and finished in March.
Implementation started in March and is expected to complete in Q3 this year.

POOL Software Design
[Component diagram: the experiment framework uses the abstract RelationalAccess and ObjectRelationalAccess interfaces (built on SEAL reflection); technology-dependent modules (Oracle, SQLite, ODBC, and the proposed FroNtier plug-in) implement them; the RelationalCatalog, RelationalCollection, and RelationalStorageSvc components implement the FileCatalog, Collection, and StorageSvc interfaces on top.]

POOL RDBMS Status – FroNtier Proposal
POOL RDBMS interface status (as of June 2004)
– Interface completed: technology-neutral and SQL-free
– Plug-in modules:
»Oracle (9i OCI) and SQLite completed and unit-tested
»ODBC in progress
– Relational file catalog completed and tested: validated the RelationalAccess interface and the existing plug-in modules
FroNtier plug-in proposal
– Build a FroNtier component for POOL
– Reuse the existing CDF XML descriptors or adapt them to LCG standards.
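
The FroNtier proposal amounts to adding one more technology-dependent module behind the technology-neutral interface. The sketch below shows the general plug-in pattern with hypothetical names; POOL's actual RelationalAccess interfaces and SEAL-based component loading differ in detail.

```cpp
// Illustrative plugin pattern only; not the real POOL/SEAL machinery.
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

// Hypothetical technology-neutral session interface.
class IRelationalSession {
public:
    virtual ~IRelationalSession() = default;
    virtual void connect(const std::string& contact) = 0;
};

// A FroNtier-backed implementation would translate queries into HTTP
// requests that web proxies can cache, instead of opening a direct
// database connection.
class FrontierSession : public IRelationalSession {
public:
    void connect(const std::string& contact) override {
        std::cout << "FroNtier: using server URL " << contact << "\n";
    }
};

class OracleSession : public IRelationalSession {
public:
    void connect(const std::string& contact) override {
        std::cout << "Oracle OCI: connecting to " << contact << "\n";
    }
};

// Simple registry keyed by technology name, standing in for the
// framework's component-loading mechanism.
using SessionFactory = std::function<std::unique_ptr<IRelationalSession>()>;

std::map<std::string, SessionFactory>& registry() {
    static std::map<std::string, SessionFactory> r;
    return r;
}

int main() {
    registry()["oracle"]   = [] { return std::make_unique<OracleSession>(); };
    registry()["frontier"] = [] { return std::make_unique<FrontierSession>(); };

    // Client code selects the backend by name; the rest stays unchanged.
    auto session = registry()["frontier"]();
    session->connect("http://frontier.example.org/Frontier");
}
```

The design point is that the calling code never changes when a new backend such as FroNtier is registered; only the technology name in the configuration does.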

Distributed Deployment of Databases
Information from Dirk Düllmann's presentation made 7/20 (today) to the LCG Project Execution Board.

Project Goals
– Help avoid the costly parallel development of data distribution and backup mechanisms in each experiment or grid site, in order to limit support costs.
– Enable distributed operation of the LCG database infrastructure with a minimal number of LCG database administration personnel.
– Define the application access to database services, allowing any LCG application or service to find the relevant database back-ends, authenticate, and use the provided data in a location-independent way.

Project Non-Goals
– Store all database data
»Experiments are free to deploy databases and replicate data under their own responsibility.
– Set up a single monolithic distributed database system
»Given the WAN connections and service levels, we cannot assume that a single synchronously updated database would work or give sufficient availability.
– Set up a single-vendor system
»Technology independence and a multi-vendor implementation will be required to minimise the risks and to adapt to the different requirements at T1 and T2 sites.
– Impose a CERN-centric infrastructure on participating sites
»CERN is an equal partner of the other LCG sites.

Starting Point for a Service Architecture?
[Architecture diagram: T0 with autonomous databases; a T1 database backbone (all data replicated, reliable service) kept in sync via Oracle Streams; T2 sites with local database caches (subset of the data, only a local service) fed by cross-vendor extraction (MySQL, files, proxy cache); T3/4 sites below. Nodes are labelled O (Oracle) and M (MySQL).]
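
As a conceptual sketch of the tiered layout in the diagram above (not any actual LCG component), a T2 local cache that holds only a subset of the data and falls back to the replicated T1 backbone on a miss could be modelled like this; all names are illustrative.

```cpp
// Conceptual read-through cache mirroring the T1 backbone / T2 cache split.
#include <iostream>
#include <map>
#include <optional>
#include <string>

class Tier1Replica {                 // stands in for the fully replicated T1 backbone
public:
    std::optional<std::string> query(const std::string& key) const {
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }
    void load(const std::string& key, const std::string& value) { data_[key] = value; }
private:
    std::map<std::string, std::string> data_;
};

class Tier2Cache {                   // holds only the locally needed subset
public:
    explicit Tier2Cache(const Tier1Replica& upstream) : upstream_(upstream) {}
    std::optional<std::string> query(const std::string& key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;   // local hit
        auto value = upstream_.query(key);           // miss: pull from T1
        if (value) cache_[key] = *value;
        return value;
    }
private:
    const Tier1Replica& upstream_;
    std::map<std::string, std::string> cache_;
};

int main() {
    Tier1Replica t1;
    t1.load("cms/hcal/pedestals/run1234", "blob-0xabc");
    Tier2Cache t2(t1);
    std::cout << t2.query("cms/hcal/pedestals/run1234").value_or("not found") << "\n";
}
```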

Staged Project Evolution
Proposal Phase 1 (in place for the 2005 data challenges)
– Focus on the T1 backbone; understand the bulk data transfer issues
»Given the current service situation, a T1 backbone based on Oracle with Streams-based replication seems the most promising implementation
»Start with T1 sites that have sufficient manpower to actively participate in the project
– Prototype vendor-independent T1 → T2 extraction based on the application level or the relational abstraction level
»This would allow vendor-dependent database applications to run on the T2 subset of the data
– Define a MySQL service with interested T2 sites
»Experiments should point out their MySQL service requirements to the sites
»Need candidate sites which are interested in providing a MySQL service and are able to actively contribute to its definition
Proposal Phase 2
– Try to extend the heterogeneous T2 setup to T1 sites
»By this time, real MySQL-based services should be established and reliable
»Cross-vendor replication based on either Oracle Streams bridges or relational abstraction may have proven to work and to handle the data volumes

Proposed Project Structure
Data Inventory and Distribution Requirements (WP1)
– Members are s/w providers from the experiments and from grid services based on RDBMS data
– Gather data properties (volume, ownership) and requirements, and integrate the provided service into their software
Database Service Definition and Implementation (WP2)
– Members are site technology and deployment experts
– Propose an agreeable deployment setup and common deployment procedures
Evaluation Tasks
– Short, well-defined technology evaluations against the requirements delivered by WP1
– Evaluations are proposed by WP2 (evaluation plan), are typically executed by the people proposing a technology for the service implementation, and result in a short evaluation report

Data Inventory
Collect and maintain a catalog of the main RDBMS data types
– Select from a catalog of well-defined replication options
– Determine which are to be supported as part of the service
Ask the experiments and s/w providers to fill in a simple table for each main data type that is a candidate for storage and replication via this service:
– Basic storage properties
»Data description, expected volume on T0/1/2 in 2005 (and evolution)
»Ownership model: read-only, single-user update, single-site update, concurrent update
– Replication/caching properties
»Replication model: site-local, all T1, sliced T1, all T2, sliced T2, …
»Consistency/latency: how quickly do changes need to reach other sites/tiers?
»Application constraints: DB vendor and DB version constraints
– Reliability and availability requirements
»Essential for whole-grid operation, for site operation, for experiment production
»Backup and recovery policy, acceptable time to recover, location of backup(s), etc.

DB Service Definition and Implementation
Service discovery
– How does a job find a replica of the database it needs?
– Do we need transparent relocation of services? How?
Connectivity, firewalls, and constraints on outgoing connections
Authentication and authorization
– Integration between DB vendor and LCG security models
Installation and configuration
– Database server and client installation kits
»Which client bindings are required? C, C++, Java (JDBC), Perl, …
– Server administration procedures and tools?
»Even basic agreements would simplify distributed operation
– Server and client version upgrades (e.g. security patches)
»How, if transparency is required for high availability?
Backup and recovery
– Backup policy templates; responsible site(s) for a particular data type?
– Acceptable latency for recovery?
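
For the service discovery question above ("how does a job find a replica of the database it needs?"), one simple answer is a lookup from a logical database name to an ordered list of physical replicas. The sketch below is a hypothetical illustration of that idea, not an existing LCG service; the logical names and connection strings are made up.

```cpp
// Hypothetical database lookup step: logical name -> candidate replicas.
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Logical name mapped to an ordered list of candidate replicas (preferred first).
using ReplicaCatalog = std::map<std::string, std::vector<std::string>>;

// Return the first replica for a logical database name, or an empty string.
std::string findReplica(const ReplicaCatalog& catalog, const std::string& logicalName) {
    auto it = catalog.find(logicalName);
    if (it == catalog.end() || it->second.empty()) return {};
    return it->second.front();   // a real service would also check availability and authorization
}

int main() {
    ReplicaCatalog catalog = {
        {"cms/conditions", {"oracle://t1-site-a/cms_cond", "oracle://cern/cms_cond"}},
    };
    std::cout << findReplica(catalog, "cms/conditions") << "\n";
}
```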

Initial list of possible evaluation tasks
Oracle replication study
– E.g. continue/extend work started during CMS DC04
– Focus: stability, data rates, conflict handling, administration, topology
DB file-based distribution
– E.g. shipping complete MySQL databases or Oracle tablespaces
– Focus: deployment impact on existing applications
Application-specific cross-vendor extraction
– E.g. extracting a subset of conditions data to a T2 site
– Focus: complete support of experiment computing model use cases
Web-proxy-based data distribution
– E.g. integrate this technology into the relational abstraction layer
– Focus: cache control, efficient data transfer
Other generic vendor-to-vendor bridges
– E.g. a Streams interface to MySQL
– Focus: feasibility, fault tolerance, application impact

Proposed Mandate, Timescale & Deliverables
Define, in collaboration with the experiments and Tier 0-2 service providers, a reliable LCG infrastructure that allows the database data to be stored and distributed (if necessary) for use by physics applications and grid services. The target delivery date for a first service should be in time for the 2005 data challenges.
The project could/should run as part of the LCG deployment area, in close collaboration with the application area as the provider of application requirements and DB abstraction solutions.
The main deliverables should be:
– An inventory of data types and their properties (incl. distribution)
– A service definition document to be agreed between the experiments and the LCG sites
– A service implementation document to be agreed between LCG sites
Status reports go to the established LCG committees
– Final decisions are obtained via the PEB and GDB

How CAN/SHOULD FNAL Participate?
Comments from CD/CSS/DSG
– They are very short on manpower for the short term.
– Their involvement will be limited to the existing testing program with Oracle 10g and unidirectional Streams replication.
– They are willing, and interested, to participate in regular 3D meetings and offer technical advice when possible.
CD/CEPA/DBS is interested in participating
– Help defining the plan
– Development and testing: MMSR, FroNtier, MySQL replication
Need other CD department/group involvement.
The ATLAS DB group at Argonne (David Malon, Alexandre Vaniachine, Jack Cranshaw) is very interested in working together with Fermilab. This could be a productive collaboration.

Summary
– The HCAL DetDB project is providing a set of tools for the testbeam, and is giving us valuable contact with other DB projects in CMS and the LCG.
– A FroNtier plug-in for POOL would enable us to leverage our CDF experience in CMS. We are pursuing this with the POOL developers.
– The proposed Distributed Deployment of Databases is an important project. We should be involved in its definition and development.