Movement of MC Data using SAM & SRM Interface
Manoj K. Jha (INFN-Bologna), Gabriele Compostella, Antonio Cuomo, Donatella Lucchesi, Simone Pagan (INFN-Padova), Doug Benjamin (Duke University), Robert Illingworth (Fermilab)
3rd February, 2009, CDF Offline Meeting, Fermilab

Overview
- Present MC Prod. Model
- Proposed Model
- Test Framework
- Integration within CAF
- Description: on the WN
- Getting Files on the User's Desktop
- Future Plan

Present CAF MC Prod. Model
(diagram: user desktop -> CAF head node -> WN, with output files copied back to file servers at FNAL)
CDF currently relies on the rcp tool to transfer output from the worker nodes (WN) to the destination file servers. This leads to:
- Worker nodes sitting idle while other WNs are transferring files to the file servers.
- Frequent loss of output from the WN.
- Waste of CPU and network resources, in particular when CDF is using opportunistic resources.
- No mechanism to handle sudden bursts of output from WNs arriving at the file servers, which typically happen during conference periods.
- Overloading of the available file servers.
- Users having to resubmit their jobs.
- No per-user catalogue of the MC files produced.

Proposed Model
(diagram: user desktop -> CAF head node -> WN, with output staged to a storage element that acts as a SAM station cache)
- A storage element (SE) will be attached to each CAF head node.
- The SE acts as the cache of a SAM station; since the SE is managed by SRM, the physical location of the station is unimportant.
- Output from the WN will be temporarily collected in the SE closest to the WN.
- Transfer and wait times for MC output on the WN will be considerably reduced thanks to the large bandwidth between the WN and its SE.

Proposed Model
(diagrams: the same architecture shown with two and then four SAM stations attached to the CAF head nodes and worker nodes)

Test Framework
Hardware:
- CAF: CNAF Tier-1
- SE: dCache-managed SRMs
  - 500 GB of space at Gridka, 10 channels per request
  - 1 TB of space at UCSD, 50 channels per request
- SAM stations:
  - station "cdf-cnafTest" at "cdfsam1.cr.cnaf.infn.it"
  - station "canto-test" at "cdfsam15.fnal.gov"
Setup:
- A cron job submits a job with a variable number of segments every 10 minutes on the CNAF CAF; the maximum number of segments per job is 5.
- Each segment creates a dummy file of random size between 10 MB and 5 GB.
- A cron process running at station "cdf-cnafTest" creates datasets from the files in the SRM closest to the WN (Gridka).
- Another cron job running at station "canto-test" transfers datasets between the two stations. (A sketch of this setup follows below.)
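A minimal sketch of the test payload and the submission cron entry, assuming a plain bash environment on the WN; the file names, the size logic and the submit_test_job.sh wrapper are illustrative assumptions, not the actual test scripts.

  #!/bin/bash
  # Hypothetical payload run by each test-job segment: create one dummy output
  # file of random size between 10 MB and 5 GB.
  SIZE_MB=$(( (RANDOM % 4991) + 10 ))      # 10 .. 5000 MB
  OUTFILE="dummy_${SIZE_MB}MB_$$.dat"
  dd if=/dev/urandom of="$OUTFILE" bs=1M count="$SIZE_MB"

  # Submission side: a crontab entry such as the following, where
  # submit_test_job.sh is a hypothetical wrapper that picks 1-5 sections at
  # random and submits them to the CNAF CAF:
  #   */10 * * * *  $HOME/bin/submit_test_job.sh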

Test Framework
(diagram: CNAF CAF worker nodes write to the SE closer to the WN, cached by SAM station "cdf-cnafTest"; datasets are then transferred to the destination SE, cached by SAM station "canto-test")

Description: On the WN
- Users will submit jobs using the present CDF Analysis Framework (CAF).
- All output and log files produced by users will have unique file names. One option is to add the CAF name, JID and section number to the file name provided by the user (see the sketch below).
- If a SAM station is available on the WN, output and log files will be declared to SAM there; otherwise they will be declared at dataset creation time.
- The minimum information needed to create the metadata will be generated on demand. A user can also provide specific dimensions in the metadata so that he/she can later query the list of files matching certain criteria.
- The metadata entries depend on the list of tags through which a user can build a list of files from already produced MC files: for example, all files corresponding to a certain userid, group (higgs, top, qcd, ...) or creation date. This is a subject of discussion and input is required from the MC production people.
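A minimal sketch of the file-naming convention described above, assuming the CAF name, JID and section number are available as environment variables on the WN (the variable names and the example file are assumptions for illustration):

  #!/bin/bash
  # Hypothetical environment provided by the CAF on the worker node.
  CAF_NAME=${CAF_NAME:-CNAFCAF}
  JID=${JID:-123456}
  SECTION=${SECTION:-1}

  # Make the user's output file unique by prefixing CAF name, JID and section no.
  USER_FILE="myoutput.root"
  UNIQUE_FILE="${CAF_NAME}_${JID}_${SECTION}_${USER_FILE}"
  mv "$USER_FILE" "$UNIQUE_FILE"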

Distribution of No. of Job Sections
- The number outside the circle represents the number of sections submitted in a single job.
- The number inside the circle represents how many jobs with that number of sections were submitted for the test framework.
- For example, a job with 1 section was submitted 56 times, with 2 sections 64 times, and so on.

File Type | File Size
A | 10 MB < size < 100 MB
B | 100 MB < size < 1 GB
C | 1 GB < size < 3 GB
D | size > 3 GB

Description: On the WN
- A predefined storage element (SE) SURL managed by SRM will be known to each CAF. This predefined SE can also be an opportunistic one. The proposed model is able to manage the space on the SE, which is allotted through fixed-lifetime space tokens.
- At the completion of the job on the WN:
  - The user's proxy, available from the CAF head node, will be used for grid authentication.
  - A "JID" folder will be created on the SE; its path will be of the form "SURL/JID".
  - All the output and log files from the user's job will be transferred to this JID folder using the SRM tool available on the WN. Either the Fermi SRM or the LCG tools can be used to transfer files from the WN to the SRM-managed SE (see the sketch below).
  - The user is informed that the job has completed.
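A minimal sketch of the transfer step, assuming the SE SURL is exported to the job as an environment variable and that the Fermi srmcp client is on the PATH; the endpoint, paths and file patterns are illustrative assumptions, not taken from the slides.

  #!/bin/bash
  # Hypothetical SURL of the predefined SE, e.g. exported by the CAF wrapper.
  SE_SURL=${SE_SURL:-"srm://se.example.org:8443/srm/managerv2?SFN=/cdf/mc"}
  JID=${JID:-123456}

  # Destination folder for this job: SURL/JID
  DEST="${SE_SURL}/${JID}"

  # Copy every output and log file of this segment into the JID folder.
  for f in *.root *.log; do
    [ -e "$f" ] || continue
    srmcp "file:///$(pwd)/$f" "${DEST}/$f" || echo "transfer of $f failed" >&2
  done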

Transfer Time from WN to its Closer SRM (plot)

Transfer Rate from WN to its Closer SE (plot)

File Declaration Time from WN (plot)

Description: Near the SE closer to WN
- The predefined SE will act as the cache of a SAM station.
- The station's service certificate will be used for authentication.
- A cron job will run on the station's PC which checks:
  - the existence of new JID folders (say 123456) in the SE;
  - the status of each JID, e.g. "123456", from the CAF head node:
    1. If the job has finished, it goes to the next step; otherwise it moves on to the next JID folder.
    2. It creates separate lists of the output and log files. The following steps are then executed individually for each list:
       a) Create a file that stores the list of file names; this is needed in case of an SRM operation failure.
       b) Check whether each file has already been declared in SAM; if not, declare it using the minimum metadata information.
       c) Move the files to the SAM upload folder.

Description: Near the SE closer to WN (cont.)
       d) Create a dataset from the list of files stored on the local disk of the PC (created in step (a) above). The dataset name should be unique; one option is to use the JID and CAF name in the dataset name.
       e) In the present case, the datasets for output and log files are "FILE_CNAFCAF_123456" and "LOG_CNAFCAF_123456".
       f) Declare the created dataset to SAM.
- SAM will allocate space for the created datasets. If sufficient space does not exist, older files will be deleted from the station cache.
- The station must have enough cache space so that a new dataset is transferred to the destination station cache before it is deleted from the CAF station cache.
- Update the log files.
- Remove the JID folder "123456".
- Inform the user (not strictly required).
- Repeat the above steps for the next JID folder. (A sketch of this station-side cron job is given below.)
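A minimal sketch of the station-side cron job described in the two slides above. The SE mount point, the upload area, the caf_job_status helper and the commented-out sam declaration/definition commands are all assumptions and placeholders; the exact SAM client commands would need to be checked against the CDF SAM tools.

  #!/bin/bash
  # Hypothetical paths as seen from the station PC.
  SE_ROOT=/pnfs/cdf/mc            # SE namespace (assumed)
  UPLOAD_DIR=/sam/upload          # SAM upload folder (assumed)
  CAF_NAME=CNAFCAF

  for jid_dir in "$SE_ROOT"/*/; do
    jid=$(basename "$jid_dir")

    # 1. Check the job status on the CAF head node (assumed helper).
    status=$(caf_job_status "$jid")
    [ "$status" = "finished" ] || continue

    # 2. Build separate lists of output and log files; the lists are kept on
    #    local disk so the step can be retried after an SRM failure.
    ls "$jid_dir" | grep -v '\.log$' > "files_${jid}.txt"
    ls "$jid_dir" | grep    '\.log$' > "logs_${jid}.txt"

    for list in "files_${jid}.txt" "logs_${jid}.txt"; do
      while read -r f; do
        # b) Declare the file in SAM if not already declared (placeholder).
        # sam declare "$f"  <minimal metadata>
        # c) Move the file to the SAM upload folder.
        mv "$jid_dir/$f" "$UPLOAD_DIR/"
      done < "$list"
    done

    # d)-f) Define unique datasets named after the CAF and the JID (placeholder).
    # sam define dataset "FILE_${CAF_NAME}_${jid}"  (from files_${jid}.txt)
    # sam define dataset "LOG_${CAF_NAME}_${jid}"   (from logs_${jid}.txt)

    rmdir "$jid_dir" 2>/dev/null   # remove the JID folder once empty
  done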

Dataset Type

Dataset Type | Dataset Size
A | 10 MB < size < 1 GB
B | 1 GB < size < 3 GB
C | 3 GB < size < 10 GB
D | size > 10 GB

Output File Dataset Creation Time (plot)

Log File Dataset Creation Time (plot)

Description: Near the SE closer to destination site
- Since no SRM is currently available at Fermilab for CDF, the destination site is UCSD in the present case.
- The SE will act as the cache of another SAM station.
- A process will run on the station's PC which:
  - Prepares a list of dataset entries in the SAM database created during a specific time period; presently we take one day before and one day after the present time.
  - Picks a dataset from this list and builds the list of files that constitute the dataset.
  - Checks the location of each file. If the location contains the "destination node", the file is already available at the destination SE; otherwise the dataset is downloaded to the cache of the destination station.
  - Updates the log file.
  - Moves on to the next dataset and repeats the above steps. (A sketch of this loop is given below.)
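A minimal sketch of the destination-side loop, using only the sam commands quoted later in the slides ("sam locate" and "sam get dataset"). The way the list of recent datasets is obtained, the destination node name and the exact --dim expression are assumptions.

  #!/bin/bash
  # Hypothetical input: a text file with the names of datasets defined in the
  # last +/- 1 day (how this list is produced is an assumption).
  DEST_NODE="cdf-srm.ucsd.edu"          # assumed name of the destination node

  while read -r dsname; do
    missing=0
    # List the files of the dataset; the dimension expression is a placeholder.
    for f in $(sam list files --dim="dataset_def_name ${dsname}"); do
      # If a file's locations do not include the destination node, the whole
      # dataset is marked for transfer.
      sam locate "$f" | grep -q "$DEST_NODE" || { missing=1; break; }
    done

    if [ "$missing" -eq 1 ]; then
      # Pull the dataset into the destination station cache via SRM.
      sam get dataset --defName="$dsname" --group=test --downloadPlugin=srmcp
    fi
  done < recent_datasets.txt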

Distribution of Transfer Time for Output Dataset (plot)

Distribution of Transfer Rate for Output Dataset (plot)

Getting Files on the User's Desktop
- Requirements on the user's desktop:
  - A SAM station environment with its service certificate installed.
  - The Fermi SRM client "srmcp".
- File query:
  - Users will know the dataset name corresponding to a JID in advance, since dataset names follow a fixed syntax.
  - Users can query the list of files matching a dataset with the sam command "sam list files --dim=".
- File location:
  - The command "sam locate " tells the location of a file.
- Getting a file onto the desktop:
  - Command "sam get dataset --defName= --group=test --downloadPlugin=srmcp".
- We are presently setting up the station for doing this test; a request has been made for the service certificate of the new station "cdf-srmtest".
- A wrapper script can be written so that the above changes are transparent to users (see the sketch below).
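A minimal sketch of such a wrapper, assuming the fixed dataset-name syntax FILE_<CAFNAME>_<JID> from the earlier slides; the script name and the --dim expression are placeholders.

  #!/bin/bash
  # Usage (hypothetical): get_mc_output.sh <CAF name> <JID>
  CAF_NAME=$1
  JID=$2
  DATASET="FILE_${CAF_NAME}_${JID}"    # dataset names follow a fixed syntax

  # Optional: list the files in the dataset (dimension syntax is a placeholder).
  sam list files --dim="dataset_def_name ${DATASET}"

  # Fetch the dataset to the current directory via SRM.
  sam get dataset --defName="${DATASET}" --group=test --downloadPlugin=srmcp

For example, "get_mc_output.sh CNAFCAF 123456" would retrieve the output dataset FILE_CNAFCAF_123456 without the user having to know any SAM or SRM details.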

Future Plan
- The present setup used the same flavor of SRM at the source and destination sites. Our next goal is to use different flavors of SRM at the source and destination sites:
  - StoRM to dCache: waiting for service-certificate support in StoRM.
  - BestMan to dCache: we have obtained some space in the SE managed by BestMan at the University of Nebraska.
- For a more robust test we need more SAM accounts for setting up the stations for the new SEs.
- We are in touch with LCG and OSG people about getting SEs closer to the WNs:
  - At CNAF, for example, we have a couple of terabytes of space on the SE.
  - OSG people have assured us that SEs managed by SRM can be arranged close to their computing sites.
- However, everything depends on the availability of an SE managed by SRM at FNAL. Will CDF have its own dedicated SRM at FNAL, or will it use an opportunistic one? We would like to know the opinion of the CDF offline group leaders.

Acknowledgements
We would like to thank Frank Wuerthwein and Abhishek Singh Rana (UCSD), Thomas Kuhr (Karlsruhe), Brian Bockelman (Nebraska), and Jon Bakken (FNAL) for providing resources on SRMs.