An ODBMS approach to persistency in CMS

1 An ODBMS approach to persistency in CMS
Lucia Silvestris, INFN Bari - CERN/EP. CHEP, February 2000, Padova, Italy.

2 CMS - Software Components
[Diagram: data flow among the CMS software components. Labelled elements: Environment Data, Slow Control, Online Monitoring; CMS Detector (Muon, Tracker, Calo); Filter Unit / Event Filter with the Objectivity Formatter; Quasi-online Reconstruction; Persistent Object Store Manager on top of the Object Database Management System; Data Quality; Calibrations; Group Physics Analysis; Simulation (G3 and/or G4); User Analysis on demand. Arrow labels: "Request asynchronous data", "Request part of event", "Store rec-Obj", "store", "calibration store".]
Chep'00 Persistency in CMS L. Silvestris INFN Bari - CERN/EP

3 CARF Components
CARF Architecture: on-demand reconstruction (see V. Innocente's talk on CARF Architecture, session A).
Framework Main Services:
  Define the events to be dispatched (events and geometry from simulations or test beams).
  Manage the "not yet removed" sequential components (coming from Geant3).
  Run-time dynamic loading is used to configure and build CARF applications.
Framework Persistency Services (the subject of this talk).
Framework Ancillary Services: user interface, error reporting, logging facilities, timing facility, utility library.

4 CMS Persistency history
Prototype: test-beam DAQ and analysis using Objectivity/DB in different CMS test-beam areas (H2, T9 and X5b). The system was successfully tested.
Production 1999:
  Test-beam DAQ (from April '99).
  Monte Carlo (GEANT3) reconstruction with ORCA (from October '99): persistent Digis for Calorimeter, Muon and Trigger; Physics Generator information (vertices, tracks) persistent (see D. Stickland's talk on ORCA, session A).

5 Persistent Service for High Energy Physics Data
[Diagram labels: Event Collection, Collection Meta-Data, Event, Electrons, Tracks, Tracker Alignment, Ecal calibration, User Tag (N-tuple).]
Kinds of data handled by the persistent service:
  Environmental data: detector and accelerator status; calibrations, alignments.
  Event-collection meta-data (luminosity, selection criteria, ...).
  Event data, user data.

6 Does a user need a DBMS?
Do I encode meta-data (run number, version id) in file names?
How many files and logbooks should I consult to determine the luminosity corresponding to a histogram?
How easily can I determine whether two events have been reconstructed with the same version of a program and using the same calibrations?
How many lines of code should I write, and which fraction of the data should I read, to select all events with two [particle symbol missing]'s with p > 11.5 GeV and |η| < 2.7? And the same at generator level?
If the answers scare you, you need a DBMS!
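The selection question on this slide can be made concrete with a minimal sketch. It assumes a hypothetical in-memory collection of per-event tags, each holding a list of candidate (p, eta) values; the names `Candidate`, `EventTag` and `select_events` are illustrative and are not part of CARF or Objectivity/DB. The cut values 11.5 GeV and 2.7 come from the slide; reading the second cut as a pseudorapidity cut is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    p: float    # momentum in GeV
    eta: float  # pseudorapidity (assumed meaning of the second cut)


@dataclass
class EventTag:
    event_id: int
    candidates: List[Candidate] = field(default_factory=list)


def select_events(tags, min_count=2, p_min=11.5, eta_max=2.7):
    """Return the ids of events with at least min_count candidates passing both cuts."""
    selected = []
    for tag in tags:
        passing = [c for c in tag.candidates
                   if c.p > p_min and abs(c.eta) < eta_max]
        if len(passing) >= min_count:
            selected.append(tag.event_id)
    return selected


tags = [
    EventTag(1, [Candidate(20.0, 0.5), Candidate(15.0, -1.2)]),
    EventTag(2, [Candidate(20.0, 0.5), Candidate(5.0, 0.1)]),
    EventTag(3, [Candidate(30.0, 2.9), Candidate(12.0, 2.0), Candidate(40.0, -2.6)]),
]
print(select_events(tags))  # prints [1, 3]
```

With tags kept as queryable objects, a selection of this kind touches only the tag attributes and not the full event data, which is the point the slide is making.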

7 Can CMS do without a DBMS?
An experiment lasting 20 years cannot rely only on ASCII files and file systems for its production bookkeeping, "condition" database, etc.
Even today at LEP, the management of all real and simulated data sets (from raw data to n-tuples) is a major enterprise.
A DBMS is the modern answer to such a problem and, given the choice of OO technology for the CMS software, an ODBMS (or a DBMS with an OO interface) is the natural solution.

8 A "BLOB" Model
[Diagram: an Event stored as DataBase Objects (Event, RecEvent, RawEvent), each referring to Blobs.]
Blob: a sequence of bytes. Decoding it is a "user" responsibility.
Why should Blobs not be stored in the DBMS?
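A small sketch of what "decoding is a user responsibility" means in practice. The byte layout used here (a 32-bit count followed by that many 16-bit ADC values, little-endian) is a hypothetical choice for illustration only, and `encode_blob` / `decode_blob` are made-up names. The store sees only opaque bytes, so every reader must know the layout.

```python
import struct


def encode_blob(adc_values):
    """Pack a list of 16-bit ADC values into bytes (hypothetical layout)."""
    return struct.pack("<I%dH" % len(adc_values), len(adc_values), *adc_values)


def decode_blob(blob):
    """Unpack the bytes again; the caller must know the layout in advance."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from("<%dH" % count, blob, 4))


blob = encode_blob([100, 200, 65535])
print(len(blob))          # 4-byte count plus 3 two-byte values: prints 10
print(decode_blob(blob))  # prints [100, 200, 65535]
```

Nothing in the byte string itself says which values are ADC counts, so a database holding such a blob cannot index or query them.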

9 CMS Raw Event
[Diagram: a Raw Event refers to several ReadOutUnits; each ReadOutUnit identifies Raw Data, which holds a Vector of Digi.]
RawData are identified by the corresponding ReadOutUnit.
The ReadOutUnit object can identify a complete detector or a detector component.
RawData belonging to different "detectors" are clustered into different containers. The granularity will be adjusted to optimize I/O performance.
An index at RawEvent level is used to avoid accessing all containers in search of a given RawData.
A range index at RawData level could be used for fast random access in complex detectors.
The index is implemented as an ordered vector of pairs.
In the test beam, the vector of Digi contains the ADC or TDC values.
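An index implemented as an ordered vector of pairs can be sketched as follows. This is a minimal illustration, assuming integer ReadOutUnit identifiers and string stand-ins for references to RawData; the class name `RawEventIndex` and its methods are hypothetical, not CARF code. Keeping the pairs sorted by key allows a binary search instead of a scan over all containers.

```python
from bisect import bisect_left


class RawEventIndex:
    """Ordered vector of (read_out_unit_id, raw_data_ref) pairs."""

    def __init__(self, pairs):
        # Sort once by key so that lookups can use binary search.
        self._pairs = sorted(pairs, key=lambda pair: pair[0])
        self._keys = [key for key, _ in self._pairs]

    def find(self, read_out_unit_id):
        """Return the reference stored for the given id, or None if absent."""
        pos = bisect_left(self._keys, read_out_unit_id)
        if pos < len(self._keys) and self._keys[pos] == read_out_unit_id:
            return self._pairs[pos][1]
        return None


index = RawEventIndex([(30, "tracker"), (10, "calo"), (20, "muon")])
print(index.find(20))  # prints muon
print(index.find(25))  # prints None
```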

10 Persistent Object Management
Persistent object management is a major responsibility of the CMS Analysis and Reconstruction Framework (CARF).
CARF manages:
  multi-threaded transactions
  creation of databases and containers
  meta-data and event collections
  physical clustering of event objects
  the persistent event structure and its relations with the transient one
Use of the database is transparent to detector developers:
  users access persistent objects through C++ pointers
  CARF takes care of memory pinning

11 CMS Event Structure
[Diagram: on the persistent side, Event Collections refer to Events; each Event refers to an Event Header, a RawEvent and one or more RecEvents. On the transient side there is a Run object.]
The Run object contains the event-collection conditions, such as beam energy, particle type and magnetic field.
In case of re-reconstruction the original structure is kept: Event objects are cloned and new collections are created.
The event header object contains the event number, the spill number and the event number within the spill.
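The structure described on this slide can be sketched with plain data classes. This is one possible reading, not the CARF implementation: all class and field names are hypothetical, strings stand in for references to persistent RawEvent and RecEvent objects, and the assumption is that a cloned Event keeps pointing at the same RawEvent while receiving a new RecEvent.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class EventHeader:
    event_num: int
    spill_num: int
    event_num_in_spill: int


@dataclass(frozen=True)
class Event:
    header: EventHeader
    raw_event: str             # stand-in for a reference to the RawEvent
    rec_event: Optional[str]   # stand-in for a reference to a RecEvent


@dataclass
class Run:
    beam_energy_gev: float
    particle_type: str
    magnetic_field_t: float


@dataclass
class EventCollection:
    name: str
    run: Run
    events: List[Event] = field(default_factory=list)


def re_reconstruct(collection, new_name, reconstruct):
    """Clone every event into a new collection; the original is left untouched."""
    cloned = [replace(ev, rec_event=reconstruct(ev)) for ev in collection.events]
    return EventCollection(new_name, collection.run, cloned)


original = EventCollection(
    "v1", Run(50.0, "muon", 0.0),
    [Event(EventHeader(1, 1, 1), "raw-1", "rec-1-v1")],
)
redone = re_reconstruct(original, "v2",
                        lambda ev: "rec-%d-v2" % ev.header.event_num)
print(original.events[0].rec_event, redone.events[0].rec_event)
```

Because the events are cloned and not modified, an analysis that used the first collection can still be repeated against exactly the same objects.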

12 CMS Reconstructed Objects
Reconstructed Objects produced by a given "algorithm" are managed by a Reconstructor.
A Reconstructed Object (Track) is split into several independent persistent objects so that they can be clustered according to their access patterns (physics analysis, reconstruction, detailed detector studies, etc.). The top-level object acts as a proxy.
Intermediate reconstructed objects (RHits) are cached by value in the final objects.
[Diagram labels: RecEvent, S-Track Reconstructor, S-Track, Track, SecInfo, Track Constituents, Vector of RHits; clustering labels "aod", "esd", "rec".]
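The idea of a top-level object acting as a proxy for separately clustered parts can be sketched as below. This is only an illustration under stated assumptions: `TrackProxy`, the part names and the `loader` callable are hypothetical, and the sketch loads each part lazily on first access, which is one common way to realize such a proxy and is not confirmed by the slide.

```python
class TrackProxy:
    """Top-level track object; each part is fetched only when first accessed."""

    def __init__(self, track_id, loader):
        self._track_id = track_id
        self._loader = loader   # callable(track_id, part_name) -> object
        self._cache = {}

    def _part(self, name):
        # Load the part from its own cluster once, then reuse it.
        if name not in self._cache:
            self._cache[name] = self._loader(self._track_id, name)
        return self._cache[name]

    @property
    def summary(self):
        return self._part("summary")

    @property
    def sec_info(self):
        return self._part("sec_info")

    @property
    def constituents(self):
        return self._part("constituents")


loads = []


def fake_loader(track_id, name):
    """Stand-in for a database read; records which parts were requested."""
    loads.append(name)
    return {"track": track_id, "part": name}


track = TrackProxy(7, fake_loader)
track.summary
track.summary        # second access is served from the cache
print(loads)         # prints ['summary']
```

A physics analysis that reads only the summary never triggers a read of the constituents, so the clusters holding them are not touched.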

13 Test Beam Production in 1999
Detector performance studies were the real "users" of the test-beam project.
From April '99 to October '99 the test-beam software was in production for the Tracker and the Muon detectors, reading data from VME and FastBus modules and filling one federated database for each beam line (H2b, X5b, T9) and for each data-taking period.
System databases:
  Beam configuration: Read-Out Unit list
  LogBook: logbook information for each run
  ListRuns: run list
Run databases: event collections with the same data-taking conditions.
The DAQ system and the Objectivity formatter ran on Solaris.
More than 800 GB of data were stored in Objectivity/DB.
Production ran without major problems.

14 Test Beam Production in 1999
[Diagram: online and offline (cmsc01) production federations. Each side shows a Prod Boot file and a Prod FD holding BConfDB, RunDB, LogDB and the run databases Run1, Run2, Run3, ..., RunN; a "Clone FD" label links the online and offline sides.]

15 Test Beam Data Analysis
Online (prompt data) monitoring runs on the online machine and provides fast feedback on detector performance. It is implemented by polling the event collection: when new events are found, the analysis is triggered and the histograms are updated.
Offline analysis is performed locally on the data server or remotely, accessing the data through the AMS server. It is implemented as a loop over the event collection: when new events are found, the analysis Observer is triggered and the histograms are updated.
During the August Tracker (X5b) test beam, up to 25 concurrent users accessed data on the offline system without any observable degradation.
During 1999: Hbook histograms and n-tuples.
During 2000: move from Hbook histograms and n-tuples to HTL and Tags (see I. Gaponenko's talk on IGUANA, session F).
[Diagram labels: Persistent Data, TB Analysis Package, HTL, n-tuples, HBook.]
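The polling scheme described for online monitoring can be sketched as follows. This is a minimal illustration, assuming the event collection behaves like a sequence that grows as the DAQ writes events; `EventCollectionPoller`, the dictionary-shaped events and the 10-count histogram binning are all hypothetical.

```python
from collections import Counter


class EventCollectionPoller:
    """Dispatches events that appeared since the previous poll to observers."""

    def __init__(self, collection):
        self._collection = collection   # any sequence that grows over time
        self._next = 0                  # position of the first unseen event
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def poll(self):
        """Notify every observer of each new event; return the number of new events."""
        new_events = self._collection[self._next:]
        for event in new_events:
            for observer in self._observers:
                observer(event)
        self._next += len(new_events)
        return len(new_events)


collection = [{"adc": 12}, {"adc": 17}]
histogram = Counter()

poller = EventCollectionPoller(collection)
poller.attach(lambda event: histogram.update([event["adc"] // 10]))

first = poller.poll()            # both existing events are new
collection.append({"adc": 31})   # the DAQ writes one more event
second = poller.poll()
print(first, second, dict(histogram))
```

In a monitoring job, `poll` would be called periodically; the offline case differs only in that the loop runs once over an already complete collection.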

16 Tracker Silicon Detector Performance Studies
Muon beam, 50 GeV.
Non-irradiated silicon detector; APV6 chip in deconvolution mode; FED VME modules.
Active area: 62.5 mm x 61.5 mm
Thickness: 300 μm, high resistivity
Strip pitch: 61 μm
Strip width: 14 μm
Implanted strips: 1024
Scl = 31.8, Ncl = 2.9, Scl/Ncl = 10.9

17 Muon Drift Tube Detector Performance Studies
DTBX format:
  bits (0:15): Drift Time (1.04 ns) [0...65535]
  bit (16): Signal Edge [1 = falling]
  bits (17:22): Cell Number [1..63]
  bits (23:25): Layer Number [1..4]
  bits (26:27): SuperLayer Number [1..3]
[Plots: beam profile (cell number per layer); drift time (ns).]
The tracking and timing performance of a chamber was optimized with a design using 12 layers of drift tubes divided into three groups of four consecutive layers, called superlayers. Inside each superlayer the tubes are staggered by half a tube. Two superlayers measure the r-phi coordinate, i.e. have their wires parallel to the beam line, and the third measures z, the coordinate parallel to the beam line. A muon coming from the interaction point first crosses an r-phi superlayer, then the z superlayer, and finally the second r-phi superlayer.
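The DTBX bit layout listed on this slide can be decoded with shifts and masks. The sketch below assumes that bit 0 is the least significant bit of the word and that the drift time is stored in counts of 1.04 ns; neither is stated explicitly in the slide. The function names `decode_dtbx` and `encode_dtbx` are illustrative.

```python
DRIFT_TIME_UNIT_NS = 1.04  # assumed meaning of "(1.04 ns)": one count = 1.04 ns


def decode_dtbx(word):
    """Split a DTBX word into its fields (bit 0 taken as least significant)."""
    counts = word & 0xFFFF                        # bits 0..15
    return {
        "drift_time_counts": counts,
        "drift_time_ns": counts * DRIFT_TIME_UNIT_NS,
        "falling_edge": bool((word >> 16) & 0x1),  # bit 16
        "cell": (word >> 17) & 0x3F,               # bits 17..22
        "layer": (word >> 23) & 0x7,               # bits 23..25
        "superlayer": (word >> 26) & 0x3,          # bits 26..27
    }


def encode_dtbx(counts, falling_edge, cell, layer, superlayer):
    """Inverse of decode_dtbx, used here only to build a test word."""
    return ((counts & 0xFFFF)
            | (int(falling_edge) << 16)
            | ((cell & 0x3F) << 17)
            | ((layer & 0x7) << 23)
            | ((superlayer & 0x3) << 26))


word = encode_dtbx(500, True, 42, 3, 2)
fields = decode_dtbx(word)
print(fields["cell"], fields["layer"], fields["superlayer"])  # prints 42 3 2
```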

18 Muon Trigger (BTI) Test Beam Analysis
The Muon test-beam analysis is fully integrated with the Muon and first-level trigger reconstruction.
For the Bunch and Track Identifier (BTI), a comparison between real data and simulation is performed (see C. Grandi's talk on the CMS Muon Trigger, session B).
The Bunch and Track Identifier (BTI) has been studied for groups of four layers of staggered drift tubes (a superlayer), with the aim of identifying tracks which give a signal in at least three of the superlayer planes. Each BTI is connected to nine wires of four layers. The BTI collects the signals from the wires and injects them into shift registers, where they are propagated at the drift velocity. After a number of clocks equal to the maximum drift time divided by the clock period, the positions of the hits reproduce the positions where the track crossed the layers. The time, i.e. the number of clocks at which at least three hits "align" inside three shift registers

19 High Level Trigger Production with ORCA in 1999
[Diagram: 1999 production chain. Labels: CMSIM MC Prod. (Signal, MB); Zebra files with HITS; HEPEVT ntuples; ORCA Digitization; DB pop.; Objectivity Database; ORCA user Analysis; ORCA ntuple production; ntuples; PAW; User Analysis.]

20 ORCA High Level Trigger 2000 production
The first ORCA production in October '99 was very successful (>700 GB in Objy/DB), but the ORCA 2000 production must have much more functionality:
  All data will be in the database.
  Every CMSIM run will have its objects in many database files.
  A single Db file contains the concatenation of many CMSIM runs (Objectivity limit of 64 k files).
  Many layers of apparently autonomous federations are actually synchronized by enforcing a common schema and unique DbIDs.

21 High Level Trigger Processing 2000
[Diagram: production chains for several data sets (Minimum Bias, JetMet, Muon, ...). Stage labels: (FZ) User G3 Hits and Tracks; ORCA Xings & Digis; ORCA RecObjs. Each box is an independent production running in "parallel".]
Also required:
  Online late clustering and selection (clustering to oblivion)
  Offline selection and cloning

22 Selective Tracker Digitization
[Diagram: selective tracker digitization. Labels: Trigger, Calorimetry, Muon, Tracker; Select.]

23 ORCA 2000 Db Structure
One CMSIM job, oo-formatted into multiple Db's.
[Diagram labels for this example: FZ File (1 CMSIM job, ~2 GB/file); ooHit Db's: MC Info (Container #1), Calo/Muon Hits, Tracker Hits; per-event sizes shown: few kB/ev, ~100 kB/ev, ~200 kB/ev, ~300 kB/ev.]
Multiple sets of ooHits are concatenated into a single Db file.
[Diagram labels for this example: MC Info Run1, MC Info Run2, MC Info Run3, ...; ~2 GB/file; concatenated MC Info from N runs.]
Physical and logical Db structures diverge...

24 Conclusions
Persistent object management is a major responsibility of the CMS Analysis and Reconstruction Framework.
A DBMS is required to manage the large data set of CMS (including user data).
An ODBMS is the natural choice if OO is used in all software.
Once an ODBMS is used to manage the experiment data, it is very natural to use it to manage any kind of data related to detector studies and physics analysis.
Objectivity/DB has been evaluated in different prototypes, which successfully stored and retrieved data (test-beam, simulated, reconstructed, statistical, i.e. histograms).
Since 1999 we have been in production using Objectivity/DB, both for test-beam and High Level Trigger studies.

