VI/ CERN Dec 4
CMS Software Architecture vs Hybrid Store
Vincenzo Innocente
CMS Week CERN, Dec

Slide 2
Baselining Architecture/Framework/Toolkits

Current schedule is to baseline the CMS “Offline” software in time for the Physics TDR.

The major activity in the next 12 to 18 months will be to define and prototype the initial production software for LHC operation:
  → Review the architecture
  → Choose products
  → Prototype and implement middleware
  → Implement framework and toolkits

A primary goal is to ensure that the architecture will support, and benefit from, the evolution of IT technology at negligible cost for CMS physics software.
  → Some components are harder to change:
      ▪ Programming language
      ▪ Data Management layer

The Object Store plays a central role in the CMS computing model:
  → Persistency cannot be regarded as just another basic computing service
  → We must ensure that data can be accessed for the whole lifetime of the experiment (and even longer)

Slide 3
[Diagram: Coherent Analysis Environment. Labels: File; Distributed Data Store; Data Browser; Analysis job wizards; Simulation; Reconstruction; PersistencyServices; NetworkServices; Visualization; BatchServices; VisualizationTools; AnalysisTools; Software Development]

Slide 4
CMS Data Analysis Model
[Diagram. Components: Detector Control; Online Monitoring; Event Filter Object Formatter; Quasi-online Reconstruction; Simulation; Data Quality; Calibrations; Group Analysis; User Analysis on demand; Environmental data store; Simulation store; Persistent Object Store Manager; Database Management System; PhysicsPaper. Flows: Request part of event (four times); Store rec-Obj; Store rec-Obj and calibrations]

Slide 5
HEP Data

• Event-Collection Meta-Data (luminosity, selection criteria, …)
• Environmental data
  → Detector and Accelerator status
  → Calibrations, Alignments
• …
• Event Data, User Data

Navigation is essential for an effective physics analysis.
Complexity requires coherent access mechanisms.

[Diagram labels: Event; DataSet; DataSet Meta-Data; Event Collection; Collection Meta-Data; Electrons; Tracks; Tracker Alignment; Ecal calibration; User Tag (N-tuple)]
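
The navigation that this slide calls essential can be pictured with a minimal C++ sketch. All type and function names here (Event, EventCollection, CollectionMetaData, EcalCalibrations, collectionLuminosity) are hypothetical and are not taken from any CMS package. Keying calibrations by the first run for which they are valid is an assumption made only for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types for illustration only; not CMS classes.
struct CollectionMetaData {
    double luminosity;       // luminosity attached to the whole collection
    std::string selection;   // selection criteria used to build the collection
};

struct EventCollection {
    CollectionMetaData meta;
};

struct Event {
    int run;                             // run number, used to pick environmental data
    const EventCollection* collection;   // link from event data to its collection
};

// Environmental data: ECAL calibration scales.
// Assumption: each entry is valid from its first run until the next entry starts.
class EcalCalibrations {
public:
    void add(int firstValidRun, double scale) { byFirstRun_[firstValidRun] = scale; }

    // Writes the scale valid for `run` and returns true,
    // or returns false when no calibration covers that run.
    bool find(int run, double& scale) const {
        std::map<int, double>::const_iterator it = byFirstRun_.upper_bound(run);
        if (it == byFirstRun_.begin()) return false;  // run precedes every entry
        --it;                                         // last entry starting at or before run
        scale = it->second;
        return true;
    }

private:
    std::map<int, double> byFirstRun_;
};

// Navigate from an event to the meta-data of the collection it belongs to.
double collectionLuminosity(const Event& ev) {
    return ev.collection->meta.luminosity;
}
```

With such links in place, analysis code starting from a single Event can reach both the collection meta-data and the calibration valid for its run through one access mechanism, instead of consulting separate, unrelated stores.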

Slide 6
A Complex Use Case

• One simulated event
• Digitized
  → At different luminosities
      ▪ Add random pile-up from a large minbias sample
  → With different readout schemes
  → With different algorithms
• Reconstructed
  → With different algorithms
  → With “imprecise” alignments and calibrations
• Analyzed
  → For any reconstructed object, trace back contributions from all simulated sources
      ▪ Trigger event (hits, tracks)
      ▪ Pile-up
      ▪ Noise
  → Correlate differences in the final samples with differences in digitization and/or reconstruction at the highest level of granularity
      ▪ This simulated track is reconstructed with 7 hits here and 9 there: why?
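
The trace-back requirement in this use case can be sketched as follows. This is a toy model under stated assumptions, not the CMS event model: SimSource, Origin, RecHit, RecTrack, countBySource and hitsFromSimTrack are hypothetical names, and each reconstructed hit is assumed to carry explicit links to the simulated sources that contributed to it.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical model for illustration; not the COBRA event model.
enum class SimSource { TriggerEvent, PileUp, Noise };

// One simulated contribution to a reconstructed hit.
struct Origin {
    SimSource source;
    int simTrackId;   // assumption: -1 when there is no simulated track (noise)
};

struct RecHit {
    std::vector<Origin> origins;   // links kept from digitization onwards
};

struct RecTrack {
    std::vector<RecHit> hits;
};

// Count the contributions to a reconstructed track, grouped by simulated source.
std::map<SimSource, int> countBySource(const RecTrack& track) {
    std::map<SimSource, int> counts;
    for (const RecHit& hit : track.hits)
        for (const Origin& origin : hit.origins)
            ++counts[origin.source];
    return counts;
}

// Count the hits on a reconstructed track that received a contribution
// from the given simulated track (each hit is counted at most once).
int hitsFromSimTrack(const RecTrack& track, int simTrackId) {
    int n = 0;
    for (const RecHit& hit : track.hits) {
        for (const Origin& origin : hit.origins) {
            if (origin.simTrackId == simTrackId) {
                ++n;
                break;
            }
        }
    }
    return n;
}
```

Running hitsFromSimTrack on the same simulated track in two reconstruction passes gives the "7 hits here and 9 there" comparison directly. It only works if the links from reconstructed objects back to simulated sources survive every storage step, which is the point of the slide.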

Slide 7
Analysis & Reconstruction Framework
[Diagram: layered architecture. Labels: Physics modules (Reconstruction Algorithms; Data Monitoring; Event Filter; Physics Analysis); Specific Framework (Calibration Objects; Event Objects; Configuration Objects); Generic Application Framework; adapters and extensions; Utility Toolkit (ODBMS; Geant3/4; CLHEP; Paw Replacement; C++ standard library; Extension toolkit)]

Slide 8
DataBase Management System
[Diagram. Labels: Application; (Distributed) DBMS Client; NETWORK; DBMS Server; Distributed, Hierarchical, File Storage System; Database Storage (Server+Files); Tertiary Storage (Tapes). Representations: Application Representation; Persistent Data Representation; Database internal Representation]

Slide 9
Hybrid Store

The CMS vision of a Hybrid Store is a Data Management System whose components do not all come from a single “vendor”.
  → It is the responsibility of User/CMS/LHC/CERN/HEP to glue it together
  → Hope to produce a Data Management System

Using different Data Management Systems in different parts of the problem domain is not considered a Hybrid Store:
  → In loosely coupled components
      ▪ Shift-list and event-data
  → Different views of the same data
      ▪ EDMS vs Construction-DB vs DDD vs CalibDB vs my logbook
  → In different environments
      ▪ Online vs Offline vs interactive
      ▪ Production vs single user
  → In different components of a batch-sequential architecture
      ▪ Read format different from write format
  → In different components of a logical hierarchy
      ▪ MetaData store different from Data store
      ▪ Parent store different from children store
  → Abandon all hope of navigating outside each specific application

Slide 10
COBRA vs Hybrid Store (Divide et Impera)

• We need a clear and complete interface between COBRA and the Hybrid Store
  → Components of the Hybrid Store are NOT individual components of the CMS Software Architecture
  → The CMS Framework will NOT provide the glue among different components of the Hybrid Store
  → COBRA will NOT interface directly to the various components of the Hybrid Store
• The Data Management System should provide a coherent and consistent interface to
  → Data Definition
  → Object Query
  → Object Navigation
  → Physical Location Management
• The Data Management System should NOT impose (imply) an object model and/or access pattern
  → COBRA defines the event model (static and dynamic)
  → DDD defines the geometry model
  → etc.
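
One way to picture the separation argued for on this slide is a single abstract interface covering the four listed responsibilities, behind which the Hybrid Store components stay hidden from the framework. The sketch below is illustrative only: DataManagementSystem, StoredObject, ObjectId, InMemoryStore and every method name are hypothetical and are not the COBRA or Hybrid Store API. The store handles opaque records, so it imposes no object model on its clients.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical names throughout; a sketch, not an actual CMS interface.
typedef std::uint64_t ObjectId;

// Opaque record: the store sees a type name, bytes and links,
// never the event or geometry model defined by its clients.
struct StoredObject {
    std::string typeName;
    std::string payload;
    std::vector<ObjectId> links;
};

// The only interface the framework would see.
class DataManagementSystem {
public:
    virtual ~DataManagementSystem() {}
    // Data Definition
    virtual void defineType(const std::string& typeName) = 0;
    virtual ObjectId store(const StoredObject& obj, const std::string& location) = 0;
    // Object Query
    virtual std::vector<ObjectId> query(const std::string& typeName) const = 0;
    // Object Navigation
    virtual std::vector<ObjectId> follow(ObjectId from) const = 0;
    // Physical Location Management (empty string when the id is unknown)
    virtual std::string locate(ObjectId id) const = 0;
};

// Toy single-process implementation, standing in for whatever mix of
// products a real Hybrid Store would glue together behind the interface.
class InMemoryStore : public DataManagementSystem {
public:
    InMemoryStore() : nextId_(1) {}

    void defineType(const std::string& typeName) override { byType_[typeName]; }

    ObjectId store(const StoredObject& obj, const std::string& location) override {
        std::map<std::string, std::vector<ObjectId> >::iterator it = byType_.find(obj.typeName);
        if (it == byType_.end())
            throw std::invalid_argument("undefined type: " + obj.typeName);
        const ObjectId id = nextId_++;
        objects_[id] = obj;
        locations_[id] = location;
        it->second.push_back(id);
        return id;
    }

    std::vector<ObjectId> query(const std::string& typeName) const override {
        std::map<std::string, std::vector<ObjectId> >::const_iterator it = byType_.find(typeName);
        if (it == byType_.end()) return std::vector<ObjectId>();
        return it->second;
    }

    std::vector<ObjectId> follow(ObjectId from) const override {
        std::map<ObjectId, StoredObject>::const_iterator it = objects_.find(from);
        if (it == objects_.end()) return std::vector<ObjectId>();
        return it->second.links;
    }

    std::string locate(ObjectId id) const override {
        std::map<ObjectId, std::string>::const_iterator it = locations_.find(id);
        if (it == locations_.end()) return std::string();
        return it->second;
    }

private:
    ObjectId nextId_;
    std::map<ObjectId, StoredObject> objects_;
    std::map<ObjectId, std::string> locations_;
    std::map<std::string, std::vector<ObjectId> > byType_;
};
```

Framework code written against DataManagementSystem alone would not need to change if the implementation behind it were replaced, which is the kind of insulation the slide asks for. A real system would add transactions, schema evolution and access control, none of which are modelled here.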