Slide 1 (CHEP2003, 03/27/2003)
Remote Operation of a Monte Carlo Production Farm Using Globus
Dirk Hufnagel, Teela Pulliam, Thomas Allmendinger, Klaus Honscheid (Ohio State University)

Slide 2: The Problem
- High luminosity experiments need large MC samples (Belle and BaBar require hundreds of millions of MC events)
- Massive computing power is needed (farms of Linux machines)
- Farms are typically geographically distributed:
  CLEO: two sites
  DELPHI: five sites
  BaBar: two dozen sites (US and Europe)
  Belle: eight sites

Slide 3: Hardware alone is not sufficient
- Hardware and system-level software maintenance
- Experiment-specific MC software setup
- MC production:
  - Job submission
  - Job monitoring (rerun failed jobs)
  - Data transfer
- Coordination

Slide 4: Is there another way?
- Goals: reduced manpower requirements, more efficient coordination
- Our approach:
  - Select one of the steps in the MC production chain: MC production
  - Centralize operations: remote submission and monitoring
  - Evaluate GRID tools (the Globus Toolkit): can they help with MC production?

Slide 5: OSU MC Production Farm
- 27 dual Athlon nodes (1U)
- 1 dual Athlon server (4U), 840 GB disk in RAID
- OpenPBS batch system (see the submission sketch below)
- File/batch queue server
- k MC events/day
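The farm itself is driven by an ordinary batch system, so a useful mental model is a small wrapper that writes an OpenPBS job script and hands it to qsub. The sketch below is only an illustration under assumed names: the queue name "mcprod" and the executable "run_mc_job.sh" are hypothetical, not the OSU farm's actual tooling.

    import subprocess
    import tempfile

    def submit_local_mc_job(run_number, n_events, queue="mcprod"):
        """Write a minimal PBS job script for one MC run and submit it with qsub."""
        job_script = (
            "#!/bin/sh\n"
            "#PBS -N mc_run_%d\n"      # job name
            "#PBS -q %s\n"             # hypothetical production queue
            "#PBS -l nodes=1\n"        # one node per MC job
            "# run_mc_job.sh stands in for the experiment-specific MC executable\n"
            "/usr/local/bin/run_mc_job.sh --run %d --events %d\n"
            % (run_number, queue, run_number, n_events)
        )
        with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
            f.write(job_script)
            script_path = f.name
        # qsub prints the new job identifier on stdout
        return subprocess.check_output(["qsub", script_path]).decode().strip()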

Slide 6: Globus Toolkit
- What Globus provides:
  - Secure access: certificates for user and server
  - Remote command execution (we observed significant overhead, a few seconds for a single command); see the sketch below
  - Integrated tools, e.g. GridFTP
- Installation at Ohio State:
  - Globus on a dedicated server
  - Separate batch queue system for testing
  - No Resource Broker: farm configuration details are hidden; loss of dynamic configurability, but much simpler
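As a concrete but hypothetical illustration of the remote command execution described here, the sketch below wraps the standard Globus Toolkit client globus-job-run from Python. The gatekeeper contact string is invented, and a valid grid proxy (created with grid-proxy-init) is assumed to exist already.

    import subprocess

    # Hypothetical gatekeeper contact string; the '/jobmanager-pbs' suffix
    # selects the PBS job manager on the remote farm.
    GATEKEEPER = "mcfarm.example.edu/jobmanager-pbs"

    def remote_command(args):
        """Run one command on the farm via globus-job-run and return its output.

        Each call carries the few-seconds overhead noted on this slide, so it
        is suited to coarse-grained operations, not one call per MC job.
        """
        return subprocess.check_output(
            ["globus-job-run", GATEKEEPER] + list(args)
        ).decode()

    # Example: look at the remote batch queue.
    # print(remote_command(["/usr/bin/qstat"]))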

Slide 7: MC production I: Job submission
- Typical input information: (MC software release), run range, #events, …
- To do: build MC jobs and submit them
- Choose one option:
  - One Globus command starts production for a whole run range
    - many (thousands of) local jobs; still needs a local script
  - One Globus command starts a single MC production job
    - too slow to submit all production runs at once
    - only submit enough runs to fill the queue (see the sketch below); re-submitted jobs proceed faster
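A rough sketch of the second option, keeping just enough jobs in the queue, is shown below. The helper names, the target queue depth and the polling interval are arbitrary illustrations, not values from the talk.

    import subprocess
    import time

    TARGET_QUEUE_DEPTH = 50      # arbitrary illustration

    def count_queued_jobs():
        """Count jobs currently known to the batch system via a plain qstat."""
        out = subprocess.check_output(["qstat"]).decode()
        # Assume qstat prints two header lines followed by one line per job.
        return max(0, len(out.splitlines()) - 2)

    def submit_run(run_number):
        """Placeholder for whatever builds and submits one MC production run."""
        print("would submit run", run_number)

    def keep_queue_full(pending_runs, poll_seconds=300):
        """Feed runs to the queue until the whole run range has been submitted."""
        while pending_runs:
            free_slots = TARGET_QUEUE_DEPTH - count_queued_jobs()
            for _ in range(max(0, free_slots)):
                if not pending_runs:
                    break
                submit_run(pending_runs.pop(0))
            time.sleep(poll_seconds)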

Slide 8: MC production II: Job monitoring
- Job status ("qstat"):
  - Use a local script to monitor log files (see the sketch below)
  - Resubmit crashed jobs locally
  - Monitor through Globus (remotely): speed?
- Data quality monitoring:
  - Check physics histograms
  - Not always done during production
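A minimal sketch of the local log-file monitoring idea follows. The log directory, the file naming convention and the "ERROR" marker are assumptions made for illustration, not details of the actual OSU scripts.

    import glob
    import os
    import re

    LOG_DIR = "/data/mcprod/logs"                   # hypothetical location
    RUN_PATTERN = re.compile(r"mc_run_(\d+)\.log$")

    def find_crashed_runs():
        """Return the run numbers whose log file contains an error marker."""
        crashed = []
        for path in glob.glob(os.path.join(LOG_DIR, "mc_run_*.log")):
            with open(path) as logfile:
                failed = "ERROR" in logfile.read()
            if failed:
                match = RUN_PATTERN.search(path)
                if match:
                    crashed.append(int(match.group(1)))
        return crashed

    # The same check could be run remotely through globus-job-run, at the
    # cost of the per-command overhead noted earlier.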

Slide 9: MC production III: Data transfer
- Easy if the MC output is in file format: GridFTP, … (see the sketch below)
- Can be complicated otherwise, for example when the MC is written into a database
- Limited disk space -> delete generated MC
- Log files
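For the file-based case, the sketch below shows the kind of GridFTP transfer and remote cleanup the slide has in mind, using the standard globus-url-copy and globus-job-run clients; the host names, paths and file names are invented for the example.

    import subprocess

    GATEKEEPER = "mcfarm.example.edu/jobmanager-pbs"   # hypothetical, as before

    def fetch_and_clean(run_number):
        """Copy one run's output file from the farm, then delete the remote
        copy to free the limited disk space mentioned on the slide."""
        remote_path = "/data/mcprod/run%06d.root" % run_number
        src = "gsiftp://mcfarm.example.edu" + remote_path
        dst = "file:///archive/mcprod/run%06d.root" % run_number
        subprocess.check_call(["globus-url-copy", src, dst])
        subprocess.check_call(["globus-job-run", GATEKEEPER, "/bin/rm", remote_path])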

Slide 10: Conclusion
- MC production for a high luminosity experiment requires significant hardware and manpower resources. GRID tools can help to centralize this effort.
- Simple tests show that remote operation of MC farms is possible:
  - Relatively easy to set up the Globus framework (secure access, remote command execution)
  - Local scripts for job submission and monitoring
- Still, significant software infrastructure ("local scripts") is required.
- Other parts of the MC production chain need to be addressed before this becomes a realistic option, e.g. remote MC software installation and version management.