The COMPASS event store in 2002
CHEP03, March 24-28 2003
Presented by Vinicio Duic
Massimo Lamanna, IT Division, CERN - CH (on leave from INFN Trieste - Italy)
Vinicio Duic, INFN Trieste - Italy
Summary
COMPASS experiment
Off-line system overview
Data model adopted
Data storage strategies
2002 data taking
2002 data storage & processing
Conclusions
COMPASS Experiment (Common Muon-Proton Apparatus for Structure & Spectroscopy)
Fixed-target experiment at the CERN SPS
Broad physics programme (μ & π beams)
Different apparatus set-ups
High intensities and trigger rates
Large accumulated data sets: 10^10 events/y × 30 kB/event = 300 TB/y
High data-acquisition rate (35 MB/s)
(Design figures)
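The design volume and rate quoted above are consistent with each other; a quick cross-check, assuming the data are accumulated over roughly 100 days of running (the DAQ period quoted later for 2002):

  10^10 events/y × 30 kB/event = 3 × 10^14 B/y ≈ 300 TB/y
  300 TB / (100 days × 86400 s/day) ≈ 35 MB/s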
The off-line system: choices
Storage
High data flux from on-line
Huge amount of data (300 TB/y, many years of data taking)
Continuous access from collaboration institutes
⇒ Centralised data storage (transfer from the experimental hall to the tape drives - STK 9940A in 2002, 9940B in 2003 - in the computing centre over a dedicated optical fibre)
Processing
Long data-taking periods ⇒ quasi on-line reconstruction
Complex reconstruction ⇒ ~20k SI2000 (~40k required)
Wide physics programme ⇒ flexibility
(Design figures)
⇒ Farm of PCs (parallel reconstruction needed for a high performance/cost ratio)
The off-line system: choices
Reconstruction
Reconstruction complexity, flexibility, code handling ⇒ OOP (C++)
CORAL (COmpass Reconstruction and AnaLysis):
base components
abstract interfaces to event-reconstruction components
abstract interfaces to external packages, in particular to access the data (DB, on-line files, ...)
reconstruction components are interchangeable, to suit the reconstruction to different needs (see the sketch below)
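To illustrate the plug-in idea behind CORAL's abstract data-access interfaces, here is a minimal C++ sketch; the class and method names are hypothetical and do not reproduce the actual CORAL API:

// Illustrative sketch only: EventSource and FileEventSource are hypothetical
// names, not the real CORAL interfaces.
#include <cstdint>
#include <string>
#include <vector>

// Abstract interface to an event source: hides whether events come from an
// ODBMS, an RDBMS plus plain files, or on-line files.
class EventSource {
public:
    virtual ~EventSource() = default;
    // Returns the raw data of the next event; an empty buffer signals end of input.
    virtual std::vector<std::uint8_t> nextRawEvent() = 0;
};

// One possible concrete back-end: raw events read from plain files whose
// locations come from a metadata catalogue (as in the Oracle-based scheme).
class FileEventSource : public EventSource {
public:
    explicit FileEventSource(std::string catalogueQuery)
        : query_(std::move(catalogueQuery)) {}
    std::vector<std::uint8_t> nextRawEvent() override {
        // ... look up file name and offset in the catalogue, read the record ...
        return {};
    }
private:
    std::string query_;
};

// The reconstruction only sees the abstract interface, so switching from
// Objectivity/DB to Oracle + plain files means writing a new plug-in,
// not touching the reconstruction code.
void reconstructAll(EventSource& source) {
    for (auto raw = source.nextRawEvent(); !raw.empty(); raw = source.nextRawEvent()) {
        // ... decode, track, vertex, write the DST ...
    }
}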
Data Model
Collected data are shaped into objects (C++), stored hierarchically by Objectivity/DB:
Run DB and Event DB (metadata): disk resident
RAW DBs and DST DBs: held on tertiary mass storage
(See the schematic sketch below.)
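A schematic C++ illustration of this Run → Event → RAW/DST hierarchy; all class and member names here are hypothetical, not the actual COMPASS persistent schema:

// Hypothetical sketch of the hierarchy: disk-resident metadata objects
// holding associations to bulk data kept on tertiary mass storage.
#include <cstdint>
#include <string>
#include <vector>

struct BulkDataRef {           // points into a RAW or DST database on tape
    std::string database;      // name of the database file
    std::uint64_t offset = 0;  // location of the event record inside it
};

struct EventHeader {           // disk-resident metadata, one per event
    std::uint64_t eventNumber = 0;
    std::uint32_t triggerMask = 0;
    BulkDataRef raw;           // association to the raw data
    BulkDataRef dst;           // association to the reconstructed (DST) data
};

struct Run {                   // disk-resident metadata, one per run
    std::uint32_t runNumber = 0;
    std::vector<EventHeader> events;
};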
Data Storage: CDR (Central Data Recording)
Design speed: 35 MB/s
On-line PCs (3.5 TB) → data servers (up to 20 × 0.5 TB) → CASTOR
Execution hosts (400)
In 2002 COMPASS has been the No. 1 (heaviest) CASTOR user
(Thanks to B. Gobbo, INFN Trieste - It)
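At the 35 MB/s design speed, the buffer sizes quoted above correspond roughly to the following buffering depths (an illustrative estimate, not a figure from the talk):

  On-line PCs: 3.5 TB / 35 MB/s = 1.0 × 10^5 s ≈ 28 h
  Data servers: 20 × 0.5 TB = 10 TB; 10 TB / 35 MB/s ≈ 2.9 × 10^5 s ≈ 3.3 days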
Data taking 2002
[Plot: collected data (TB) vs time (days), with the 35 MB/s design rate shown for comparison; peak annotation: 4.5 TB!]
Collected sample: 270 TB in ~100 days of DAQ
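The collected sample corresponds to an average rate close to the design figure (a rough estimate, assuming continuous running over the ~100 DAQ days):

  270 TB / (100 days × 86400 s/day) ≈ 31 MB/s, vs. the 35 MB/s design rate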
At peak: Network Activity
[Plots (a), (b), (c): network activity and traffic to tape over the same time windows]
At peak: DB Input Activity
[Plots: DB input activity and traffic to tape over the same time windows]
Objectivity/DB pros & cons
Pros:
Logical/physical layer separation, independent from HSM details
Clean integration with the HSM, client driven
Good performance
High concurrency in production (> 400 clients)
Cons:
Weak on client abort; oocleanup sometimes tricky
Poor flexibility for read locks (100 clients)
LockServer, AMS (network servers): 1 box per process
Migration to Oracle
The CERN contract with Objectivity has been terminated; Oracle was chosen as the new DBMS.
PAST (Objectivity/DB 6.1.3)
In the ODBMS: metadata & associations, raw data, reconstructed events, conditions
In plain files: nothing
DB overhead: ~30% of the raw data size (turns into ~6% on tape)
FUTURE (Oracle 9i)
In the ORDBMS: relations (metadata)
In plain files: raw data, reconstructed events, conditions
DB overhead: ~0.3% of the raw data size
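Scaled to the 270 TB of raw data collected in 2002, the quoted overheads would amount roughly to (an illustration, not a figure from the talk):

  Objectivity/DB: 0.30 × 270 TB ≈ 80 TB of DB overhead (≈ 16 TB once on tape, at the ~6% figure)
  Oracle: 0.003 × 270 TB ≈ 0.8 TB of DB overhead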
Oracle vs Objectivity
COMPASS only has to sustain ≈ 35 MB/s
2002 Data Production
DST production on 2001 data (10^3 GB) with Objectivity/DB
DST production on 2002 data with Objectivity/DB: big I/O size: 80 TB of RAWs processed (multiple processing), 5.7 TB of DSTs produced
DST production on 2002 data with Oracle: just started (1.3 TB of DSTs produced so far)
Incremental changes to the DST formats + CORAL tuning (algorithms, calibrations, ...)
Oracle Production (Feb. - Mar. 2003)
Input: 17 TB of RAWs
Output: 1.3 TB of DSTs
(8% of the total)
Batch CPU Load at CERN: COMPASS
[Plot: COMPASS CPU time (NCU khrs) per week, 2002-2003, marking the DAQ period* and the Objectivity/DB and Oracle production periods]
* CDR does not use the batch CPU
COMPASS experience
Usage of very different technologies: Objectivity/DB, Oracle
Continuity in data access: only a new I/O plug-in needed (good CORAL design!)
Huge data migration (see M. Nowak's talk)
Very large data rates and data sizes, comparable to one LHC experiment
Heavy-duty usage of CASTOR in production