CMS Distributed Data Analysis Challenges
Claudio Grandi (INFN Bologna), on behalf of the CMS Collaboration
ACAT'03, KEK, 3 December 2003
Outline
– CMS Computing Environment
– CMS Computing Milestones
– OCTOPUS: the CMS Production System
– 2002 Data Productions
– 2003 Pre-Challenge Production (PCP04)
– PCP04 on grid
– 2004 Data Challenge (DC04)
– Summary
CMS Computing Environment
CMS computing context
– LHC will produce 40 million bunch crossings per second in the CMS detector (1000 TB/s)
– The on-line system will reduce the rate to 100 events per second (100 MB/s of raw data)
  – Level-1 trigger: hardware
  – High-Level Trigger: on-line farm
– Raw data (1 MB/evt) will be:
  – archived on persistent storage (~1 PB/year)
  – reconstructed to DST (~0.5 MB/evt) and AOD (~20 kB/evt)
– Reconstructed data (and part of the raw data) will be:
  – distributed to the computing centers of the collaborating institutes
  – analyzed by physicists at their own institutes
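As a sanity check on these numbers, a few lines of arithmetic (assuming the canonical ~10^7 seconds of effective LHC running per year, a figure not stated on the slide) reproduce the ~1 PB/year of raw data:

```python
# Back-of-the-envelope check of the CMS raw-data volume.
# Assumption (not on the slide): one LHC "year" ~ 1e7 seconds of running.
rate_hz = 100            # events/s after the High-Level Trigger
event_size_mb = 1.0      # raw event size in MB
seconds_per_year = 1e7   # effective LHC running time per year (assumed)

raw_mb_per_s = rate_hz * event_size_mb                  # 100 MB/s
raw_tb_per_year = raw_mb_per_s * seconds_per_year / 1e6
print(f"raw data rate : {raw_mb_per_s:.0f} MB/s")
print(f"raw data/year : {raw_tb_per_year:.0f} TB (~1 PB)")
```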
CMS data production at LHC
– Collision rate: 40 MHz (1000 TB/s)
– After the Level-1 Trigger: 75 kHz (50 GB/s)
– After the High-Level Trigger: 100 Hz (100 MB/s), to data recording and offline analysis
CMS distributed computing model
[Figure: the hierarchical (tiered) model. The experiment's online system at CERN (~PB/s) feeds the Tier 0+1 CERN center (PBs of disk, tape robot); Tier-1 centers (FNAL, IN2P3, INFN, RAL) connect at ~Gbps; Tier-2 centers at 0.1 to 10 Gbps; below them, Tier-3 institutes and Tier-4 workstations with a physics data cache (~MB/s).]
CMS software for data simulation
– Event generation: Pythia and other generators
  – generally Fortran programs; produce N-tuple files (HEPEVT format)
– Detector simulation:
  – CMSIM (uses GEANT-3): Fortran program; produces formatted Zebra (FZ) files from the N-tuples
  – OSCAR (uses GEANT-4 and the CMS COBRA framework): C++ program; produces POOL files (hits) from the N-tuples
– Digitization (DAQ simulation):
  – ORCA (uses the CMS COBRA framework): C++ program; produces POOL files (digis) from hits POOL files or FZ files
– Trigger simulation:
  – ORCA: reads the digis POOL files; normally run as part of the reconstruction phase
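Purely as an illustration of how these programs chain together, here is a toy Python driver for the simulation flow. The stage functions are hypothetical stand-ins; in reality each stage is a separate Fortran or C++ program, run by the OCTOPUS production tools described later in this talk.

```python
# Illustration only: a toy driver for the simulation chain described above.
# The stage functions and file names are invented for the example.
def generate(dataset, nevents):
    return f"{dataset}.ntpl"        # Pythia et al.: HEPEVT N-tuples

def simulate(ntuple, engine="cmsim"):
    # CMSIM (GEANT-3) writes Zebra FZ files; OSCAR (GEANT-4) writes POOL hits
    return ntuple.replace(".ntpl", ".fz" if engine == "cmsim" else "_hits.pool")

def digitize(hits):
    return hits.rsplit(".", 1)[0] + "_digis.pool"   # ORCA/COBRA: POOL digis

digis = digitize(simulate(generate("bt03_ttbar", 500)))
print(digis)    # bt03_ttbar_digis.pool
```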
CMS software for data analysis
– Reconstruction: ORCA
  – produces POOL files (DST and AOD) from hits or digis POOL files
– Analysis:
  – ORCA: reads POOL files in hits, digis, DST and AOD formats
  – IGUANA (uses ORCA and OSCAR as back-ends): visualization program (event display, statistical analysis)
CMS software: ORCA & co.
[Figure: data-flow diagram. Simulation: Pythia and other generators write HEPEVT N-tuples; CMSIM (GEANT3) turns them into Zebra files with hits, which the ORCA/COBRA hit formatter loads into a hits database (POOL); alternatively OSCAR/COBRA (GEANT4) writes hits directly to POOL. ORCA/COBRA digitization merges signal and pile-up into a digis database (POOL). Analysis: ORCA reconstruction or user analysis produces N-tuples or Root files, with IGUANA for interactive analysis.]
CMS Computing Milestones
CMS computing milestones
– DAQ TDR (Technical Design Report): Spring-2002 data production; software baselining
– Computing & Core Software TDR: 2003 data production (PCP04); 2004 Data Challenge (DC04)
– Physics TDR: 2004/05 data production (DC05); data analysis for the Physics TDR
– “Readiness Review”: 2005 data production (PCP06); 2006 Data Challenge (DC06)
– Commissioning
Size of CMS data challenges
– 1999: 1 TB – 1 month – 1 person
– 2000/01: 27 TB – 12 months – 30 persons
– 2002: 20 TB – 2 months – 30 persons
– 2003: 175 TB – 6 months – <30 persons
[Figure: challenge size vs. time, average slope ×2.5/year, with the milestones DAQ TDR, DC04, Physics TDR, DC05, LCG TDR, DC06, Readiness, and the LHC at 2×10^33 and 10^34 cm^-2 s^-1.]
World-wide distributed productions
[Figure: world map of the CMS production regional centers and the CMS distributed-production regional centers.]
CMS computing challenges
CMS computing challenges include:
– production of simulated data for studies on:
  – detector design
  – trigger and DAQ design and validation
  – physics system setup
– definition and set-up of the analysis infrastructure
– definition of the computing infrastructure
– validation of the computing model:
  – a distributed system
  – increasing size and complexity
  – tied to the other CMS activities
– providing computing support for all CMS activities
OCTOPUS: the CMS Production System
OCTOPUS data production system
[Figure: workflow diagram. A physics group asks for a new dataset in the RefDB (dataset metadata); the Production Manager defines assignments; a Site Manager starts an assignment with McRunjob plus a plug-in: CMSProd shell scripts go to a local batch manager on a computer farm; JDL goes to the grid (LCG) scheduler, backed by the LCG RLS/POOL catalogue; or a DAG job goes through a planner to DAGMan (MOP) and the Chimera VDL virtual data catalogue (DPE). The BOSS DB holds job metadata; the components are linked by job-level and data-level queries, pushing data or info and pulling info.]
Remote connections to databases
[Figure: on the User Interface, the job is prepared with its input, output and a journal catalog; on the Worker Node, a job wrapper (job instrumentation) runs the user job while a journal writer feeds a remote updater. The metadata DBs (RLS/POOL, RefDB, BOSS DB) are updated either by direct connection from the worker node or by an asynchronous updater that replays the journal.]
Job production: MCRunJob (with the CMSProd plug-in)
– Modular: plug-ins exist for:
  – reading from the RefDB
  – reading from a simple GUI
  – submitting to a local resource manager
  – submitting to DAGMan/Condor-G (MOP)
  – submitting to the EDG/LCG scheduler
  – producing derivations in the Chimera Virtual Data Catalogue
– Runs on the user's (e.g. the site manager's) host
– Also defines the sandboxes needed by the job
– If needed, the specific submission plug-in takes care of:
  – preparing the XML POOL catalogue with the input-file information
  – moving the sandbox files to the worker nodes
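The plug-in architecture can be pictured with a short sketch. This is not MCRunJob code, just a hypothetical illustration of the registry pattern (all class and function names are invented; only edg-job-submit is a real EDG command, shown here as a string):

```python
# Hypothetical illustration of a plug-in pattern like MCRunJob's;
# none of these class or method names come from the real tool.
SUBMITTERS = {}

def register(name):
    def deco(cls):
        SUBMITTERS[name] = cls
        return cls
    return deco

@register("local")
class LocalSubmitter:
    def submit(self, job):
        print(f"bsub {job['script']}")           # hand off to a local batch manager

@register("lcg")
class LCGSubmitter:
    def submit(self, job):
        # a grid plug-in also prepares the XML POOL catalogue and the sandbox
        print(f"edg-job-submit {job['jdl']}")

def run(backend, job):
    SUBMITTERS[backend]().submit(job)

run("lcg", {"jdl": "cmsim.jdl", "script": "cmsim.sh"})
```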
Job metadata management: BOSS
Job parameters that represent the job's running status are stored in a dedicated database:
– when did the job start?
– is it finished?
but also:
– how many events has it produced so far?
BOSS is a CMS-developed system that does this by extracting the information from the job's standard input/output/error streams (sketched below).
– The remote updater is based on MySQL
– Remote updaters are now being developed based on:
  – R-GMA (still has scalability problems)
  – Clarens (just started)
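A toy version of the stream-extraction idea, for illustration only: run the real job, watch its stdout, and push status updates to a database. The regular expression and update_db() are invented for the example; real BOSS filters are user-defined, and the real updater talks to MySQL.

```python
import re, subprocess

# Toy BOSS-style job instrumentation (all names invented for the example).
EVENT_RE = re.compile(r"processed event\s+(\d+)")

def update_db(job_id, key, value):
    print(f"UPDATE jobs SET {key}={value} WHERE id={job_id}")  # MySQL stand-in

def run_instrumented(job_id, cmd):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:            # journal/filter the stream as it flows
        m = EVENT_RE.search(line)
        if m:
            update_db(job_id, "n_events", int(m.group(1)))
    proc.wait()
    update_db(job_id, "finished", 1)

# stand-in for a real CMS job, so the sketch actually runs:
run_instrumented(42, ["echo", "processed event 500"])
```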
Dataset metadata management: RefDB
Dataset metadata are stored in the RefDB:
– which (logical) files is the dataset made of?
but also:
– which input parameters were given to the simulation program?
– how many events have been produced so far?
Information may be updated in the RefDB in several ways:
– manual Site Manager operation
– automatically from the job
– remote updaters based on R-GMA and Clarens (similar to those developed for BOSS) will be developed
Mapping of logical names to physical file names will be done on the grid by RLS/POOL (illustrated below).
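The logical-to-physical mapping can be pictured as a simple lookup. The sketch below is a schematic stand-in for what RLS/POOL provide; the catalogue contents, LFNs and site names are invented for the illustration.

```python
# Schematic stand-in for an RLS/POOL logical-to-physical file lookup;
# catalogue content and API are invented for illustration.
replica_catalog = {
    "lfn:bt03_ttbar_digis_0001": [
        "srb://castor.cern.ch/cms/pcp/bt03_ttbar_digis_0001.pool",
        "gsiftp://gridse.bo.infn.it/cms/pcp/bt03_ttbar_digis_0001.pool",
    ],
}

def best_replica(lfn, preferred_site=None):
    """Return one physical file name for a logical file name."""
    pfns = replica_catalog[lfn]
    if preferred_site:
        for pfn in pfns:
            if preferred_site in pfn:   # prefer a replica close to the job
                return pfn
    return pfns[0]

print(best_replica("lfn:bt03_ttbar_digis_0001", preferred_site="bo.infn.it"))
```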
2002 Data Productions
2002 production statistics
– Used Objectivity/DB for persistency
– 11 Regional Centers, more than 20 sites, about 30 site managers
– Spring 2002 data production:
  – generation and detector simulation: 6 million events in 150 physics channels
  – digitization: >13 million events with different configurations (luminosity)
  – about 200 KSI2000 months
  – more than 20 TB of digitized data
– Fall 2002 data production:
  – 10 million events, full chain (small output)
  – about 300 KSI2000 months
  – also productions on the grid!
Spring 2002 production history
[Figure: CMSIM production rate over time – about 1.5 million events per month.]
Fall 2002 CMS grid productions
– CMS/EDG Stress Test on the EDG testbed and CMS sites
  – top-down approach: more functionality but less robust; large manpower needed
  – 260,000 events in 3 weeks
– USCMS IGT Production in the US
  – bottom-up approach: less functionality but more stable; little manpower needed
  – 1.2 million events in 2 months
2003 Pre-Challenge Production (PCP04)
PCP04 production statistics
Started in July; supposed to end by Christmas.
– Generation and simulation:
  – 48 million events with CMSIM: 50-150 KSI2K s/event, 2000 KSI2K months; ~1 MB/event, 50 TB
    – hit-formatting in progress; the POOL format reduces the size by a factor of 2!
  – 6 million events with OSCAR: 100-200 KSI2K s/event, 350 KSI2K months (in progress)
– Digitization just starting:
  – need to digitize ~70 million events (not all in time for DC04!)
  – estimated ~30-40 KSI2K s/event, ~950 KSI2K months; ~1.5 MB/event, 100 TB
– Data movement to CERN: ~1 TB/day for 2 months
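A quick consistency check of the CMSIM CPU figure, assuming ~2.6×10^6 seconds per month (an assumption, not a number from the slide):

```python
# Consistency check of the CMSIM CPU estimate quoted above.
nevents = 48e6
ksi2k_s_per_event = 100          # middle of the quoted 50-150 KSI2K s/event range
seconds_per_month = 2.6e6        # ~30 days (assumed)

ksi2k_months = nevents * ksi2k_s_per_event / seconds_per_month
print(f"{ksi2k_months:.0f} KSI2K months")   # ~1850, consistent with the quoted 2000
```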
PCP 2003 production history
[Figure: CMSIM production rate over time – 13 million events per month.]
PCP04 on grid
US DPE production system
Running on Grid2003 (~2000 CPUs):
– based on VDT
– EDG VOMS for authentication
– GLUE schema for the MDS information providers
– MonALISA for monitoring
– MOP for production control:
  – DAGMan and Condor-G for specification and submission
  – a Condor-based match-making process selects resources
[Figure: MOP system: MCRunJob and mop_submitter on the master site feed DAGMan/Condor-G; jobs run in the batch queues of remote sites 1..N, with GridFTP moving data between the sites.]
Performance of US DPE (USMOP Regional Center)
– Pythia: ~30,000 jobs of ~1.5 min each; ~0.7 KSI2000 months
– CMSIM: ~9000 jobs of ~10 hours each; ~90 KSI2000 months; ~3.5 TB of data
– Now running OSCAR productions
[Figure: CMSIM production history.]
CMS/LCG-0 testbed
CMS/LCG-0 is a CMS-wide testbed based on the LCG pilot distribution (LCG-0), owned by CMS:
– a joint CMS, DataTAG-WP4 and LCG-EIS effort
– started in June 2003
– components from VDT and EDG 1.4.X (LCG pilot)
– components from DataTAG (GLUE schemas and information providers)
– Virtual Organization management: VOMS
– RLS in place of the replica catalogue (uses rlscms by CERN/IT)
– monitoring: GridICE by DataTAG
– tests with R-GMA (as the BOSS transport layer for specific tests)
– no direct MSS access (bridge to SRB at CERN)
About 170 CPUs and 4 TB of disk at: Bari, Bologna, Bristol, Brunel, CERN, CNAF, Ecole Polytechnique, Imperial College, ISLAMABAD-NCP, Legnaro, Milano, NCU-Taiwan, Padova, U. Iowa.
It allowed CMS software integration to proceed while LCG-1 was not yet out.
CMS/LCG-0 production system
[Figure: OCTOPUS (BOSS DB for job metadata, McRunjob + ImpalaLite, RefDB for dataset metadata) is installed on the User Interface; jobs are submitted via JDL to the grid (LCG) scheduler, which consults the Grid Information System (MDS) and the RLS; the CMS software is installed on the Computing Elements as RPMs; Worker Nodes, Computing Elements and Storage Elements push data or info while the scheduler pulls info.]
CMS/LCG-0 performance (CMS-LCG Regional Center, based on CMS/LCG-0)
– “Heavy” Pythia: ~2000 jobs of ~8 hours each; ~10 KSI2000 months
– 1.5 Mevts CMSIM: ~6000 jobs of ~10 hours each; ~55 KSI2000 months; ~2.5 TB of data
– Inefficiency estimate:
  – 5% to 10% due to site misconfiguration and local failures
  – 0% to 20% due to RLS unavailability
  – a few errors in the execution of the job wrapper
  – overall inefficiency: 5% to 30%
– Now used as a play-ground for CMS grid-tools development
[Figure: Pythia + CMSIM production history.]
Data Challenge 2004 (DC04)
2004 Data Challenge
Test the CMS computing system at a rate corresponding to 5% of the full LHC luminosity:
– corresponds to 25% of the LHC start-up luminosity
– for one month (February or March 2004)
– 25 Hz data-taking rate at a luminosity of 0.2 × 10^34 cm^-2 s^-1
– 50 million events (completely simulated up to digis during PCP04) used as input
Main tasks:
– reconstruction at the Tier-0 (CERN) at 25 Hz (~40 MB/s)
– distribution of the DST to Tier-1 centers (~5 sites)
– re-calibration at selected Tier-1 centers
– physics-group analysis at the Tier-1 centers
– user analysis from the Tier-2 centers
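The ~40 MB/s figure follows directly from the quoted rate and the digi event size given on the next slide; a quick check:

```python
# Check of the DC04 Tier-0 input rate and daily volume.
rate_hz = 25            # DC04 data-taking rate
event_size_mb = 1.5     # digitized (raw) event size quoted for DC04

mb_per_s = rate_hz * event_size_mb
tb_per_day = mb_per_s * 86400 / 1e6
print(f"{mb_per_s:.1f} MB/s, {tb_per_day:.1f} TB/day")   # ~37.5 MB/s, ~3.2 TB/day
```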
DC04 overview
[Figure: DC04 setup. During the PCP, 50M events (75 TB) flow into the CERN tape archive at ~1 TB/day for 2 months. T0 challenge: a fake DAQ at CERN feeds a ~40 TB disk pool (~20 days of data); first-pass reconstruction runs at 25 Hz (1.5 MB/evt in: 40 MB/s, 3.2 TB/day), archiving raw data (1 MB/evt) and reconstructed DST (0.5 MB/evt) to the CERN tape archive. DST event streams (Higgs, SUSY background, …), TAG/AOD replicas (20 kB/evt) and conditions-DB replicas flow to T1 and T2 sites (analysis challenge); calibration samples and calibration jobs feed the master conditions DB at the T0 (calibration challenge); a Higgs background study requests new events from an event server; an HLT filter is a possible addition.]
Tier-0 challenge
– Data-serving pool to serve digitized events at 25 Hz to the computing farm, with 20/24-hour operation:
  – 40 MB/s
  – adequate buffer space (at least 1/4 of the digi sample in the disk buffer)
  – pre-staging software: file locking while in use, buffer cleaning and restocking as files are processed
– Computing farm: approximately 400 CPUs
  – jobs running 20/24 hours; 500 events/job, 3 hours/job
  – files in the buffer locked until successful job completion
  – no dead-time can be introduced to the DAQ; latencies must be no more than of order 6-8 hours
– CERN MSS: ~50 MB/s archiving rate
  – archive ~1.5 MB × 25 Hz of raw data (digis)
  – archive ~0.5 MB × 25 Hz of reconstructed events (DST)
– File catalogue: POOL/RLS
  – secure and complete catalogue of all data inputs/products
  – accessible and/or replicable by the other computing centers
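The buffer logic described above (lock while in use, clean and restock only after successful processing) can be sketched as follows. This is a schematic illustration, not the actual DC04 pre-staging software; all paths and the stage-in mechanism are invented.

```python
import os, shutil

# Schematic sketch of the disk-buffer discipline described above.
BUFFER = "/data/dc04_buffer"          # hypothetical buffer location

def prestage(fname):
    shutil.copy(f"/mss/dc04/{fname}", f"{BUFFER}/{fname}")   # tape -> disk buffer
    open(f"{BUFFER}/{fname}.lock", "w").close()              # lock while in use

def release(fname, job_ok):
    os.remove(f"{BUFFER}/{fname}.lock")
    if job_ok:
        # restock buffer space; on failure the file is kept so the job
        # can be retried without a second stage-in from tape
        os.remove(f"{BUFFER}/{fname}")
```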
Data distribution challenge
– Replication of the DST and part of the raw data to one or more Tier-1 centers
  – possibly using the LCG replication tools
  – some event duplication is foreseen
  – at CERN, ~3 GB/s of traffic without inefficiencies (about 1/5 of that at each Tier-1)
– Tier-0 catalogue accessible by all sites
– Replication of the calibration samples (DST/raw) to selected Tier-1 centers
– Transparent access of jobs at the Tier-1 sites to the local data, whether in the MSS or on the disk buffer
– Replication of any physics-group (PG) data produced at the Tier-1 sites to the other Tier-1 sites and to interested Tier-2 sites
– Monitoring of data-transfer activities, e.g. with MonALISA
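The slide only says "possibly using the LCG replication tools", so as a purely schematic picture of the replicate-and-register step: copy each DST file to every Tier-1 storage element and record the new replica in the catalogue. copy_to() and register_replica() are hypothetical stand-ins, and the site names are illustrative.

```python
# Purely schematic sketch of the Tier-0 -> Tier-1 DST fan-out.
TIER1_SITES = ["cnaf.infn.it", "fnal.gov", "in2p3.fr", "gridka.de", "rl.ac.uk"]

def copy_to(lfn, site):
    print(f"copying {lfn} -> {site}")            # would be a GridFTP transfer

def register_replica(lfn, site):
    print(f"registering {lfn}@{site} in the RLS")  # keeps the T0 catalogue complete

def distribute_dst(lfn):
    """Replicate one DST logical file to every Tier-1 and record it."""
    for site in TIER1_SITES:
        copy_to(lfn, site)
        register_replica(lfn, site)

distribute_dst("lfn:dc04_higgs_dst_0001")
```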
Calibration challenge
– Selected sites will run the calibration procedures
– Rapid distribution of the calibration samples (within hours at most) to the Tier-1 site, and automatically scheduled jobs to process the data as they arrive
– Publication of the results in an appropriate form that can be returned to the Tier-0 for incorporation into the calibration “database”
– Ability to switch the calibration “database” at the Tier-0 on the fly, and to track from the metadata which calibration table has been used
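The last requirement (switch calibrations on the fly, but keep the provenance) amounts to stamping every reconstruction output with the concrete calibration version it used, while the running system only ever reads a mutable "current" pointer. A hypothetical sketch, with all names invented:

```python
# Hypothetical sketch of calibration provenance tracking at the Tier-0.
current_calibration = {"version": "ecal_2004_02_v3"}   # switched on the fly

def reconstruct(run_id):
    calib = current_calibration["version"]             # resolved at job start
    dst_metadata = {"run": run_id, "calibration": calib}
    # ... reconstruction would happen here ...
    return dst_metadata                                # provenance travels with the DST

print(reconstruct(1042))   # {'run': 1042, 'calibration': 'ecal_2004_02_v3'}
```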
Tier-1 analysis challenge
– All data distributed from the Tier-0 safely inserted into local storage
– Management and publication of a local catalogue indicating the status of the locally resident data:
  – define tools and procedures to synchronize a variety of catalogues with the CERN RLS catalogue (EDG-RLS, Globus-RLS, SRB-MCat, …)
  – Tier-1 catalogue accessible to at least the “associated” Tier-2 centers
– Operation of the physics-group (PG) productions on the imported data: a “production-like” activity
– Local computing facilities made available to Tier-2 users, possibly via the LCG job-submission system
– Export of the PG data to requesting sites (Tier-0, -1 or -2)
– Registration of the locally produced data in the Tier-0 catalogue to make them available to at least selected sites, possibly via the LCG replication tools
Tier-2 analysis challenge
– Point of access to computing resources for the physicists
– Pulling of data from peered Tier-1 sites as defined by the local Tier-2 activities
– Analysis of the local PG data, producing plots and/or summary tables
– Analysis of distributed PG data or DST available at least at the reference Tier-1 and the “associated” Tier-2 centers
  – results are made available to selected remote users, possibly via the LCG data-replication tools
– Private analysis of distributed PG data or DST is outside the DC04 scope but will be kept as a low-priority milestone:
  – use of a Resource Broker and Replica Location Service to gain access to appropriate resources without knowing where the input data are
  – distribution of user code to the executing machines
  – a user-friendly interface to prepare, submit and monitor jobs and to retrieve results
Summary of DC04 scale
– Tier-0: reconstruction and DST production at CERN
  – 75 TB of input data
  – 180 KSI2K months = ~400 CPUs for 1 month of 20/24-hour operation (~500 SI2K/CPU)
  – 25 TB of output data
  – 1-2 TB/day data distribution from CERN to the sum of the T1 centers
– Tier-1: assume all “CMS” Tier-1s (except CERN) participate: CNAF, FNAL, Lyon, Karlsruhe, RAL
  – share the T0 output DST between them (~5-10 TB each)
  – 200 GB/day transfer from CERN (per T1)
  – perform scheduled analysis-group “production”: ~100 KSI2K months total = ~50 CPUs per T1 (24 hrs/30 days)
– Tier-2: assume about 5-8 T2 (may be more…)
  – store some of the PG data at each T2 (500 GB-1 TB)
  – estimate 20 CPUs at each center for 1 month
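These numbers hang together; a quick cross-check of the Tier-0 farm size and the per-Tier-1 transfer rate (the ~500 SI2K per CPU is the slide's figure, the rest is arithmetic):

```python
# Cross-check of the DC04 scale numbers quoted above.
ksi2k_months = 180          # Tier-0 reconstruction CPU budget
ksi2k_per_cpu = 0.5         # ~500 SI2K per CPU
duty_cycle = 20 / 24        # 20/24-hour operation

cpus = ksi2k_months / ksi2k_per_cpu / duty_cycle
print(f"Tier-0 farm: ~{cpus:.0f} CPUs for one month")   # ~430, i.e. the quoted ~400

dst_tb, n_tier1, days = 25, 5, 30
gb_per_day_per_t1 = dst_tb * 1e3 / n_tier1 / days
print(f"~{gb_per_day_per_t1:.0f} GB/day per Tier-1")    # ~170 GB/day; the quoted
                                                        # 200 GB/day allows headroom
```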
Summary
– Computing is a CMS-wide activity: 18 regional centers, ~50 sites
– Committed to supporting the other CMS activities: analysis support for DAQ, Trigger and Physics studies
– Increasing in size and complexity:
  – 1 TB in 1 month at 1 site in 1999
  – 170 TB in 6 months at 50 sites today
  – ready for the full LHC size in 2007
– Exploiting new technologies:
  – the grid paradigm adopted by CMS
  – close collaboration with LCG and with the EU and US grid projects
  – grid tools assuming more and more importance in CMS