Slide 1: CMS/ARDA. Lucia Silvestris, INFN-Bari. June 22.

Slide 2: Outline
- CMS CPT schedules (where we are, what's next)
- Data Challenge DC04
- Next steps after DC04
- Short-term plan for CMS distributed analysis
- Discussion

Slide 3: CMS computing schedule
- 2004
  - Mar/Apr: DC04 to study Tier-0 reconstruction, data distribution and real-time analysis at 25% of startup scale
  - May/Jul: data available and usable by the PRS groups
  - Sep: PRS analysis feedback
  - Sep: draft CMS Computing Model in CHEP papers
  - Nov: ARDA prototypes
  - Nov: milestone on interoperability
  - Dec: Computing TDR in initial draft form
- 2005
  - July: LCG TDR and CMS Computing TDR
  - Post July?: DC05 at 50% of startup scale
  - Dec: Physics TDR
- 2006
  - DC06 final readiness tests
  - Fall: computing systems in place for LHC startup
  - Continuous testing and preparation for data

Slide 4: Data Challenge 04
Aim of DC04:
- reach a sustained 25 Hz reconstruction rate in the Tier-0 farm (25% of the target conditions for LHC startup)
- register data and metadata in a catalogue
- transfer the reconstructed data to all Tier-1 centers
- analyze the reconstructed data at the Tier-1s as they arrive
- publish the data produced at the Tier-1s to the community
- monitor and archive performance data for the ensemble of activities, for debugging and post-mortem analysis
Not a CPU challenge, but a full-chain demonstration!
Pre-challenge production in 2003/04:
- 70M Monte Carlo events produced (30M with Geant4)
- classic and grid (CMS/LCG-0, LCG-1, Grid3) productions

Slide 5: Data Challenge 04 layout (diagram). Tier-0: Castor, fake on-line process, RefDB, POOL RLS catalogue, TMDB, ORCA RECO jobs, GDB, data-distribution agents, Export Buffer, LCG-2 services. Tier-1 sites: Tier-1 agent, T1 storage, MSS, ORCA analysis and grid jobs. Tier-2 sites: T2 storage, ORCA local jobs run by physicists.

Slide 6: Data Challenge 04 processing rate
- Processed about 30M events
- But the DST content was not properly tested, making this pass not useful for real (PRS) analysis
- Generally kept up at the Tier-1s at CNAF, FNAL and PIC
- Got above 25 Hz on many short occasions, but only one full day above 25 Hz with the full system
- Working now to document the many different problems

Slide 7: LCG-2 in DC04

Slide 8: Real-Time (Fake) Analysis
- Goals
  - Demonstrate that data can be analyzed in real time at the Tier-1
  - Fast feedback to reconstruction (e.g. calibration, alignment, checks of the reconstruction code)
  - Establish automatic data replication to Tier-2s
  - Make data available for offline analysis
  - Measure the time elapsed between reconstruction at the Tier-0 and analysis at the Tier-1
- Architecture
  - Set of software agents communicating via a local MySQL DB
  - Replication, data-set completeness checks, job preparation and submission
  - Use LCG to run the jobs
  - Private Grid Information System for CMS DC04
  - Private Resource Broker
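As a rough illustration of the agent pattern described on this slide (not the actual DC04 code), the sketch below shows a generic agent that polls a local database table for new work items, runs a handler on each, and records the outcome. sqlite3 is used here as a stand-in for the local MySQL database, and the table and column names are invented for the example.

```python
# Minimal sketch of the DC04-style agent loop: claim NEW work items from a
# local DB table, process them, mark them DONE or FAILED, then sleep.
import sqlite3
import time

def poll_and_process(db_path: str, handler, poll_interval: float = 30.0):
    """Generic agent loop over an illustrative 'work_items' table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS work_items ("
        " id INTEGER PRIMARY KEY, payload TEXT, status TEXT DEFAULT 'NEW')"
    )
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM work_items WHERE status = 'NEW'"
        ).fetchall()
        for item_id, payload in rows:
            try:
                handler(payload)          # e.g. replicate a file, prepare/submit a job
                new_status = 'DONE'
            except Exception:
                new_status = 'FAILED'     # a real agent would retry or raise an alarm
            conn.execute("UPDATE work_items SET status = ? WHERE id = ?",
                         (new_status, item_id))
            conn.commit()
        time.sleep(poll_interval)

if __name__ == "__main__":
    poll_and_process("fake_analysis.db", handler=print, poll_interval=5.0)
```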

Slide 9: DC04 Real-Time (Fake) Analysis
- CMS software installation
  - The CMS Software Manager (M. Corvo) installs the software via a grid job provided by LCG
  - RPM distribution based on CMSI, or DAR distribution
  - Used with RPMs at CNAF, PIC, Legnaro, Ciemat and Taiwan
  - Site manager installs the RPMs via LCFGng (used at Imperial College)
  - Still inadequate for general CMS users
- Real-time analysis at the Tier-1
  - The main difficulty is identifying complete file sets (i.e. runs); today this information is in the TMDB or obtained via findColls
  - Each job processes a single run at the site close to the data files
  - File access via rfio
  - Output data registered in the RLS
(Diagram: UI submits via BOSS/JDL to the Resource Broker; bdII information system; RLS; CEs, SEs and worker nodes; job metadata, output-data registration and rfio file access; push/pull of data and information.)
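The "complete file set" check mentioned above can be pictured as a single grouped query against a TMDB-like transfer table: a run is ready for analysis only when every expected file has arrived at the Tier-1 SE. The sketch below is a toy model under that assumption; the transfer_files table and its columns are invented, not the real TMDB schema.

```python
# Toy completeness check: return the runs whose files have all ARRIVED.
import sqlite3

def complete_runs(conn: sqlite3.Connection):
    query = """
        SELECT run
        FROM   transfer_files
        GROUP  BY run
        HAVING COUNT(*) = SUM(CASE WHEN status = 'ARRIVED' THEN 1 ELSE 0 END)
    """
    return [row[0] for row in conn.execute(query)]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transfer_files (run INTEGER, lfn TEXT, status TEXT)")
    conn.executemany("INSERT INTO transfer_files VALUES (?, ?, ?)",
                     [(1, "a.root", "ARRIVED"), (1, "b.root", "ARRIVED"),
                      (2, "c.root", "ARRIVED"), (2, "d.root", "PENDING")])
    print(complete_runs(conn))   # -> [1]: only run 1 is complete
```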

Slide 10: DC04 Fake Analysis architecture
(Diagram: TMDB and POOL RLS catalogue; transfer, replication and mass-storage agents; Export Buffer SE; PIC CASTOR storage element and MSS; CIEMAT and PIC disk SEs; drop agent and Fake Analysis agent; LCG Resource Broker and worker nodes; data-transfer and fake-analysis flows.)
- The drop agent triggers job preparation and submission when all files of a drop are available
- The Fake Analysis agent prepares the XML catalog, the orcarc and the JDL script and submits the job
- Jobs record start/end timestamps in the MySQL DB
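A rough sketch of the Fake Analysis agent's preparation step, assuming a simplified JDL template and a bare submission through the LCG-2 user interface. In DC04 the actual submission went through BOSS, and the file names and JDL fields shown here are illustrative, not the real DC04 scripts.

```python
# Per-run job preparation: write a JDL and hand it to the LCG-2 broker.
import subprocess
from pathlib import Path

JDL_TEMPLATE = """\
Executable      = "ttHWmu.sh";
Arguments       = "{run}";
InputSandbox    = {{"ttHWmu.sh", "orcarc.{run}", "catalog.{run}.xml"}};
OutputSandbox   = {{"stdout.log", "stderr.log"}};
StdOutput       = "stdout.log";
StdError        = "stderr.log";
"""

def prepare_and_submit(run: int, workdir: Path) -> None:
    """Write a per-run JDL file and submit it (simplified illustration)."""
    jdl_path = workdir / f"run{run}.jdl"
    jdl_path.write_text(JDL_TEMPLATE.format(run=run))
    # In DC04 submission was driven by BOSS on top of the EDG/LCG-2 UI;
    # a bare submission would look roughly like this:
    subprocess.run(["edg-job-submit", str(jdl_path)], check=True)
```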

Slide 11: (no text transcribed)

Slide 12: DC04 real-time analysis at INFN
- Maximum rate of analysis jobs: 194 jobs/hour
- Maximum rate of analysed events: 26 Hz
- Total of ~15000 analysis jobs via grid tools in ~2 weeks (95-99% efficiency)
- Dataset examples:
  - B0s -> J/psi phi; backgrounds: mu03_tt2mu, mu03_DY2mu
  - ttH with H -> bbbar, t -> Wb (W -> l nu), tbar -> Wb (W -> hadrons); backgrounds: bt03_ttbb_tth, bt03_qcd170_tth, mu03_W1mu
  - H -> WW -> 2 mu 2 nu; backgrounds: mu03_tt2mu, mu03_DY2mu
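A quick back-of-the-envelope check of the quoted numbers, treating "two weeks" as 14 days of continuous running, shows how the peak job rate compares with the average:

```python
# Values taken from the slide; the 14-day assumption is ours.
total_jobs   = 15_000
days         = 14
peak_jobs_hr = 194

avg_jobs_per_hour = total_jobs / (days * 24)
print(f"average rate  : {avg_jobs_per_hour:5.1f} jobs/hour")          # ~45 jobs/hour
print(f"peak / average: {peak_jobs_hr / avg_jobs_per_hour:4.1f}x")    # ~4x higher at peak
```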

Slide 13: Results: job time statistics
- Dataset bt03_ttbb_ttH analysed with the executable ttHWmu
- Total execution time: ~28 minutes
- ORCA execution time: ~25 minutes
- Job waiting time before starting: ~120 s
- Time for staging input and output files: ~170 s
- The remainder is grid overhead plus waiting time in the queue
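Putting the timing figures together (values from the slide; the split is approximate) gives a rough estimate of the per-job grid overhead:

```python
# Back-of-the-envelope breakdown of the per-job timings quoted above.
total_s   = 28 * 60      # total execution time on the worker node
orca_s    = 25 * 60      # ORCA (application) time
staging_s = 170          # staging of input and output files
queue_s   = 120          # waiting time before the job starts

overhead_s  = total_s - orca_s        # non-ORCA time on the worker node (~180 s)
wallclock_s = total_s + queue_s       # rough wall clock per job as seen by the user
print(f"on-node overhead: {overhead_s} s (staging accounts for ~{staging_s} s of it)")
print(f"total overhead  : {(overhead_s + queue_s) / wallclock_s:.0%} "
      f"of the ~{wallclock_s / 60:.0f} min wall clock")
```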

Slide 14: Real-time analysis summary
- Real-time analysis: two weeks of quasi-continuous running!
- Total number of analysis jobs submitted: ~15000
- Overall grid efficiency: ~95-99%
- Problems:
  - The RLS query used to prepare a POOL XML catalog had to be done using the file GUIDs; otherwise it was much slower
  - The Resource Broker disk filled up, making the RB unavailable for several hours. The problem was related to large input/output sandboxes. Possible solutions: set quotas on the RB sandbox space; configure RBs to be used in cascade
  - A network problem at CERN prevented connections to the RLS and the CERN RB
  - The Legnaro CE/SE disappeared from the Information System during one night
  - Failures (~30%) in updating the BOSS database due to overload of the MySQL server; the BOSS recovery procedure was used
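A toy model of the GUID-versus-LFN point above: the RLS is keyed by file GUID, so building the POOL XML catalog from GUIDs is a direct lookup, while starting from logical file names needs an extra alias-resolution step per file. The two functions below only illustrate that indirection; they are not the POOL/RLS API, and the dictionaries stand in for remote catalog queries.

```python
# Toy illustration of direct GUID lookup vs. LFN -> GUID -> PFN indirection.
from typing import Dict, Iterable

def catalog_by_guid(guids: Iterable[str], guid_to_pfn: Dict[str, str]):
    # one keyed lookup per file
    return {g: guid_to_pfn[g] for g in guids}

def catalog_by_lfn(lfns: Iterable[str],
                   lfn_to_guid: Dict[str, str],
                   guid_to_pfn: Dict[str, str]):
    # extra indirection per file (in the real service, an extra remote query,
    # which is what made the LFN-based preparation "much slower")
    return {lfn: guid_to_pfn[lfn_to_guid[lfn]] for lfn in lfns}
```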

Slide 15: ttH analysis results (plots of the reconstructed masses: leptonic top, hadronic top, hadronic W)

Slide 16: (no text transcribed)

Slide 17: Possible evolution of CCS tasks (Core Computing and Software)
CCS will reorganize to match the new requirements and the move from R&D to implementation for physics:
- Meet the PRS production requirements (Physics TDR analysis)
- Build the Data Management and Distributed Analysis infrastructures
- Production Operations group [NEW]
  - Outside of CERN; must find ways to reduce manpower requirements
  - Using predominantly (only?) grid resources
- Data Management task [NEW]
  - Project to respond to the DM RTAG
  - Physicists and computing experts to define the CMS blueprint and the relationships with suppliers (LCG/EGEE, ...); CMS DM task in the Computing group
  - Expected to make major use of manpower and experience from CDF/D0 Run II
- Workload Management task [NEW]
  - Make the grid usable for CMS users
  - Make major use of manpower with EDG/LCG/EGEE experience
- Distributed Analysis cross project (DAPROM) [NEW]
  - Coordinate and harmonize analysis activities between CCS and PRS
  - Work closely with the Data and Workload Management tasks and LCG/EGEE

Slide 18: After DC04: CMS-ARDA / DAProm
- The CMS-ARDA (DAProm) goal is to deliver an end-to-end (E2E) distributed analysis system.
- The analysis should be possible at all the different event-data levels (SimHits, Digi, DST, AOD, ...), using COBRA/ORCA code.
- Distributed analysis extends the analysis to an environment where the users, the data and the processing are distributed over the grid.
- CMS-ARDA mailing list:

Slide 19: A Vision on CMS Analysis

Slide 20: Hierarchy of processes (experiment, analysis groups, end users)
- Reconstruction: experiment-wide activity (10^9 events); re-processing 3 times per year; driven by new detector calibrations or understanding; Monte Carlo.
- Selection: activity of ~20 groups (10^9 -> 10^7 events); iterative selection once per month; trigger-based and physics-based refinements; batch analysis.
- Analysis: ~25 individuals per group (10^6-10^8 events); different physics cuts and MC comparison ~once per day; algorithms applied to data to get results; interactive and batch analysis.

Slide 21: (no text transcribed)

Slide 22: CMS-ARDA task decomposition
- End-user inputs:
  - DataSet and owner (catalog): the high-level identification of an event sample
  - Analysis program: executable plus the related shared libraries (plus the standard CMS and external shared libraries)
- Possible CMS-ARDA (DAProm) decomposition:
  - User interface to the production service to request a DataSet (data products)
  - User interface to the production service to monitor the status of a request and where the data are located (Tier-N)

Slide 23: CMS-ARDA task decomposition (continued)
- Workflow management tools for the analysis service
  - DataSet fully attached to a single catalog for each Tier-N
  - Grid tools (services) to submit and monitor analysis jobs
- Data location service
  - Discovery of remote catalogs and their content
- Data-transfer service (TMDB-like)
  - Requirement: a local catalog at the production site (one per dataset); the tool should populate the complete local catalog at the remote site (Tier-N)
- Interface to the storage service
  - To open/read/write files from an application (srm, nfs, afs, others)
- Analysis-task wizard
  - Requirement: provide a simple interface and guidance to the physicists
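The decomposition on the last two slides can be summarised as a small set of service interfaces. The sketch below is only a paraphrase of the responsibilities listed above; all class and method names are invented for illustration and do not correspond to any existing CMS or ARDA API.

```python
# Hypothetical interface sketch of the CMS-ARDA service decomposition.
from abc import ABC, abstractmethod
from typing import List

class DataLocationService(ABC):
    @abstractmethod
    def find_catalogs(self, dataset: str, owner: str) -> List[str]:
        """Discover the remote (Tier-N) catalogs holding a given dataset/owner."""

class DataTransferService(ABC):
    @abstractmethod
    def replicate(self, dataset: str, source_catalog: str, target_site: str) -> None:
        """TMDB-like transfer: copy the files and populate the full local catalog at the target."""

class StorageInterface(ABC):
    @abstractmethod
    def open(self, pfn: str, mode: str = "r"):
        """Open a file for the application regardless of backend (srm, rfio, nfs, afs, ...)."""

class AnalysisWorkflowService(ABC):
    @abstractmethod
    def submit(self, executable: str, libraries: List[str], dataset: str) -> str:
        """Submit an analysis job against a dataset and return a job id."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Monitor a previously submitted analysis job."""
```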

Slide 24: First steps for the CMS-ARDA project
- Short term: give the PRS groups easy access for analysis of the available PCP and DC04 data as soon as possible (July-August).
  - This will be done using software already available (LCG RB, GridICE, MonALISA and others), plus the TMDB and other CMS-specific data-management tools, plus the latest COBRA/ORCA versions, which already include end-user requirements such as the possibility to use different levels for the POOL catalog, plus components that could come from GAE and a few other new components.
  - Clearly we will use the LCG-2 / Grid3 software that was already used during DC04 (workload management, job monitoring, ...).
  - People from different institutions (US, INFN, RAL, GridKA, CERN, ...).
- This is essential in order to provide data to the PRS groups.

Slide 25: First steps for the CMS-ARDA project (continued)
- Short term: start/continue the evaluation of the EGEE software (including LCG-2 components) as soon as possible, and provide feedback in order to build, hopefully together, a system that can work for CMS analysis, offline and online.
  - People from different institutions (CERN, US, INFN, RAL, GridKA).
- Longer term: this will be addressed as soon as the new organization is in place!