CMS HLT production using Grid tools

Flavia Donno (INFN Pisa), Claudio Grandi (INFN Bologna), Ivano Lippi (INFN Padova), Francesco Prelz (INFN Milano), Andrea Sciabà (INFN Pisa), Massimo Sgaravatto (INFN Padova), Zhen Xie (INFN Pisa)

M. Sgaravatto - INFN Padova

Introduction

Goals:
- Evaluate the existing GRID technologies with real applications and in real production environments
- Can these GRID tools be useful to "manage" these HEP applications?

Collaboration between:
- CMS
- INFN-GRID WP 1 (Installation and Evaluation of the Globus toolkit)
- DataGrid WP 1 (Grid Workload Management)

Applications

[Diagram: CMS production chain. MC Prod.: HEPEVT ntuples, CMSIM, signal Zebra files with HITS. ORCA Prod.: ORCA ooHit Formatter, ORCA Digitization (merge signal and MB), HLT Algorithms, New Reconstructed Objects. Data stores: Objectivity Databases (including the MB Objectivity Database), Catalog import, Mirrored Db's, HLT Grp Databases.]

Tested configuration for CMS production

[Diagram: the production manager uses condor_submit (Globus Universe) and Condor-G to submit jobs to the CMS farms through Globus GRAM, which sits in front of the local resource management systems (CONDOR, LSF). Sites labelled: Bologna, Pisa, Padova.]

- Condor-G as reliable, crash-proof submitting service
- GRAM as uniform interface to different resource management systems

Overview

- PC farms at each site installed and configured using the CMS farm kickstart toolkit
- PC farms managed by possibly different local resource management systems
- Globus GRAM as uniform interface to the different local resource management systems
- Globus deployed using the INFN-GRID distribution toolkit (see Zhen's presentation), taking the INFN setup into account

Overview

- Condor-G as reliable, crash-proof submitting service
- Job submission and monitoring by the production manager from a single machine
- The production manager decides on which Globus resource (farm) the job must be executed
- Executable and input files stored on the executing farm
- Output files created on the executing machine
- Log files created on the submitting machine
- Authentication using Globus GSI (use of certificates signed by INFN CA)
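The submission model above (one submitting machine, the farm chosen explicitly per job, the executable already present on the farm) can be illustrated with a minimal sketch that builds a Condor-G submit description of the kind passed to condor_submit. This is not the production scripts used in the tests: the helper name, the gatekeeper host names, the site-to-jobmanager mapping and the paths are all hypothetical. The `universe = globus` and `globusscheduler` commands are the submit-file syntax of the Condor-G releases of that period; later Condor versions use a different grid universe syntax.

```python
def make_submit_description(gram_contact, executable, arguments, job_name):
    """Return the text of a Condor-G submit description for a single job.

    gram_contact is the Globus GRAM contact string of the farm chosen by the
    production manager, e.g. "<gatekeeper host>/jobmanager-<lrms>".
    """
    lines = [
        # Route the job through Globus GRAM instead of a local Condor pool.
        "universe = globus",
        f"globusscheduler = {gram_contact}",
        # The executable is assumed to be already installed on the farm,
        # so it is not shipped from the submitting machine.
        f"executable = {executable}",
        "transfer_executable = false",
        f"arguments = {arguments}",
        # The Condor-G user log is written on the submitting machine.
        f"log = {job_name}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical GRAM contact strings; real host names and the pairing of
# sites with local resource management systems would differ.
FARMS = {
    "site_a": "gatekeeper.site-a.example.org/jobmanager-condor",
    "site_b": "gatekeeper.site-b.example.org/jobmanager-lsf",
}

if __name__ == "__main__":
    # The production manager picks the farm explicitly for each job.
    print(make_submit_description(
        FARMS["site_b"], "/opt/cms/bin/run_job.sh", "run42", "cms_run42"))
```

The printed text would be saved to a file and handed to condor_submit on the submitting machine; Condor-G then keeps the job in its persistent queue, which is what gives the crash-proof behaviour described on the slide.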

Results

- The CMS production using Globus and Condor-G failed
- Many, many memory leaks found in the Globus jobmanager!
- ... but we (Francesco Prelz, INFN Milano) have been able to provide fixes for these bugs
- Fixes reported to the Globus team
  - Feedback received only for the bugs in the GAA and GSS modules (new fixes "merged" with the original ones)
- Work in progress
  - Tests with these fixes
  - Fixes included in the INFN-GRID distribution

Other problems

Globus GRAM:
- Some minor bugs found and fixed (fixes included in the INFN-GRID distribution)
- Necessary to address some major problems
  - Scalability (one jobmanager for each job)
  - Reliability (the jobmanager is not persistent)
  - ...

Condor-G:
- Some problems in the current implementation (it's a prototype)
  - Scalability in the submitting machine
  - Logging
  - ...

Next steps

- New tests during the next CMS productions with the "patched" Globus jobmanager
- New tests with the new implementations of Condor-G and the Globus jobmanager (by the Condor team)
- Tests with bypass
  - Tool written by D. Thain (Condor team) that allows redirection of standard input/output/error to a remote machine (the submitting machine) while the program is running (split execution system)
  - Use of GSI authentication mechanisms
  - New implementation resilient to several kinds of failures
- Tests with the first WP 1 prototype
- "Integration" with software provided by the other WPs (e.g. replica management tools, ...)

Prototype workload management system architecture

[Diagram: jobs are submitted (using Class-Ads) to a Master. The Master performs resource discovery through the Grid Information Service (GIS), which holds information on the characteristics and status of local resources, and uses condor_submit (Globus Universe) and Condor-G to send jobs to Globus GRAM in front of the local resource management systems (CONDOR, LSF, PBS) of the farms at Site1, Site2 and Site3. Additional label: Other info.]

- Globus GRAM as uniform interface to different local resource management systems
- Condor-G able to provide a reliable, crash-proof job submission service
- Master chooses to which Globus resources the jobs must be submitted
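The Master's role (discover resources through the information service, then choose where each job goes) can be illustrated with a toy selection routine. This is only a sketch under stated assumptions: it does not use the ClassAd library or any real GIS query, and the `Resource` fields, the `choose_resource` helper and the "most free CPUs" policy are hypothetical stand-ins for whatever matchmaking the prototype would actually apply.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Resource:
    """Hypothetical snapshot of one farm as published in an information service."""
    contact: str    # GRAM contact string of the farm
    lrms: str       # local resource management system, e.g. "condor", "lsf", "pbs"
    free_cpus: int  # CPUs currently idle


def choose_resource(resources: Iterable[Resource],
                    min_free_cpus: int = 1) -> Optional[Resource]:
    """Pick the farm with the most idle CPUs among those meeting the requirement.

    Returns None when no farm satisfies the requirement, in which case the
    job would stay queued on the Master.
    """
    candidates = [r for r in resources if r.free_cpus >= min_free_cpus]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free_cpus)


if __name__ == "__main__":
    # Hypothetical status information for three sites.
    snapshot = [
        Resource("gatekeeper.site1.example.org/jobmanager-condor", "condor", 4),
        Resource("gatekeeper.site2.example.org/jobmanager-lsf", "lsf", 12),
        Resource("gatekeeper.site3.example.org/jobmanager-pbs", "pbs", 0),
    ]
    chosen = choose_resource(snapshot, min_free_cpus=2)
    print(chosen.contact if chosen else "no suitable resource")
```

The contact string of the chosen resource is what would then be handed to Condor-G as the target of the submission, so the selection logic stays separate from the reliable submission service.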