11 March 2004: Getting Ready for the Grid. SAM: Tevatron Experiments Using the Grid

11 March 2004, Getting Ready for the Grid
SAM: Tevatron Experiments Using the Grid
CDF and D0 Need the Grid
– Requirements, the CAF and SAM
– Grid from the User Perspective
Grid to Meet the Need
– How SAM works
– SAM usage by D0 and CDF
Near Future: SAMGrid
Rick St. Denis, University of Glasgow

Reviews: Director’s (technically), International Finance Committee (fiscally), FNAL PAC (for its physics merit)
Maximize physics low Lumi
– L3 output rate: 80 -> 360 Hz by 06
Spokespersons’ Requirements for CDF: 50% computing outside FNAL
CDF needs the Grid

Scale of CDF Requirements
Columns: THz, % offsite, CPU Speed, # duals
FY: %, 3 GHz, 150
FY: %, 5 GHz, +360
FY: %, 8 GHz
sites, 100 duals each, by

CDF Computing Model
Develop Analysis on desktop
– Access to all CDF data from anywhere
Large scale processing on batch clusters
– Submission from anywhere
– Interactive tools: ls, top, head/tail/cat
– Output to scratch space or desktop
Implemented Now with CAF (not Grid standard)
Exists Now

Central Analysis Facility
CAF is a pile of PCs with a pile of disks (1200 processors and 100 TB).
This can be implemented anywhere as dCAF: Decentralized CAF.
Output of jobs can go to desktop or a scratch area.
Need a password for this: authentication (Kerberos).

Sequential Access through Metadata
Metadata: SAM allows groups of files to be collected into datasets, using attributes (metadata) such as production pass version or top quark mass to associate them.
File Retrieval: SAM moves files to users as they request them.
File Storage: SAM allows output files to be stored with new metadata.
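The idea of defining a dataset by metadata attributes rather than by file names can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FileRecord`, `select_dataset`, the attribute names and the file names are all hypothetical and are not part of SAM.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical catalogue entry: a file name plus free-form metadata."""
    name: str
    metadata: dict = field(default_factory=dict)


def select_dataset(catalog, **criteria):
    """Return names of files whose metadata match every given criterion."""
    return [
        rec.name
        for rec in catalog
        if all(rec.metadata.get(key) == value for key, value in criteria.items())
    ]


# Toy catalogue; the attribute names are illustrative only.
catalog = [
    FileRecord("a.root", {"file_type": "mc", "app_version": "1.00"}),
    FileRecord("b.root", {"file_type": "mc", "app_version": "2.00"}),
    FileRecord("c.root", {"file_type": "data", "app_version": "1.00"}),
]

# A "dataset" is simply the set of files that satisfy the query.
dataset = select_dataset(catalog, file_type="mc", app_version="1.00")
print(dataset)
```

The point of the sketch is that the user names the physics selection, and the catalogue resolves it to files.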

Metadata
~]$ sam get metadata --file=Bs_conc_4o5_3.root
File Type: SAMMC Data File
File Name: Bs_conc_4o5_3.root
File ID:
File Size: [B]
File Start Time: 01/29/ :00:00
File End Time: 01/29/ :00:00
Application Family: generator
Application Version: 1.00
Description: BsDspi_phipi MONTE CARLO Dataset 4o5 part 3
Run Number:
totalevents = 7290
Work Group: cdf
Node Name: cdfsam.cnaf.infn.it
dataset = BsMC-lucchesi_test
html =
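A listing like the one above is easy to turn into a dictionary for scripting. The sketch below assumes one field per line, written either as `Key: value` or `key = value`; that layout is an assumption based on the slide, not a documented SAM output format, and `parse_metadata` is a hypothetical helper.

```python
def parse_metadata(text):
    """Parse 'Key: value' and 'key = value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Use whichever separator (':' or '=') occurs first on the line.
        positions = [line.find(sep) for sep in (":", "=") if sep in line]
        if not positions:
            continue  # not a key/value line
        cut = min(positions)
        key, value = line[:cut].strip(), line[cut + 1:].strip()
        if key:
            fields[key] = value
    return fields


# Sample text copied from the fields that survive on the slide.
sample = """File Name: Bs_conc_4o5_3.root
Application Family: generator
Application Version: 1.00
totalevents = 7290
Work Group: cdf
dataset = BsMC-lucchesi_test
"""

meta = parse_metadata(sample)
print(meta["File Name"], meta["totalevents"])
```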

Use Cases
User Level MC Production
– All users have access
– No data on site -> write to tape at FNAL
User Level Data Access
– All users have access
– Selected samples automatically copied on site
SAM provides this

Functionality
User selects a place to run, saying what dataset they will use.
System checks they can do this (privileges).
User has access to data at any place.
User output is stored on any disk or back to tape at FNAL, and results are made available for transfer to any site for others to analyse.

User Perspective
[Diagram: an analysis program is submitted through the CAF GUI/CLI to the Grid, with sites Toronto, Korea, Italy, Taiwan, FermiCAF and UK. Labels: "Only Fermilab Uses SAM" and "Outside Lab Grid Uses SAM".]

Meeting the Needs
SAM: How it works
Progress in SAM
CDFGridWorkshop: “Nerd’s Paradise”
D0 and CDF Usage

[Diagram: node fcdfdata016 with Disk/Cache areas; Station central-analysis daemon (smaster); FSS daemon (fss); several Stager daemons (stagerng).]

A Farm: Station with Stagers and Caches
[Diagram: Station (smaster) with Node1 to Node5, each node having a Cache and a Stager (stagerng).]

What can 20 duals and 6 TB do?
Columns: Stream, Events, Days, Input Size
Top, W/Z: 20.5 M events, TB
Hadronic B and charm: 156 M events, TB
Need to transfer 0.6 GB/min or 1 TB/Day
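The two transfer figures quoted on this slide are the same rate in different units, to within rounding. A quick check, assuming decimal units (1 TB = 1000 GB), gives about 864 GB per day, which the slide rounds up to 1 TB/day:

```python
GB_PER_MIN = 0.6           # rate quoted on the slide
MIN_PER_DAY = 60 * 24      # 1440 minutes in a day

gb_per_day = GB_PER_MIN * MIN_PER_DAY
tb_per_day = gb_per_day / 1000   # decimal terabytes (assumption)

print(f"{gb_per_day:.0f} GB/day = {tb_per_day:.2f} TB/day")
```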

[Diagram, built up over three slides: node fcdfdata016 with Disks/Cache; then Station central-analysis (smaster); then Stager (stagerng).]

sam dump station --disks
setenv SAM_STATION chris
sam dump station --disks
*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 53 minutes 25 seconds, admins: jozwiak terekhov
Known batch systems: lsf
Default batch system: lsf
No replica selection criteria
There are 0 authorized transfer groups
Minimum delivery is 1KB; external deliveries are unconstrained
STATION DISKS:
disk 7844 nglas08.fnal.gov:/sam/test9/jozwiak/dev/chris, 29947KB/20GB free
disk 8064 nglas08.fnal.gov:/sam/test10/jozwiak/dev/chris, 93110KB/20GB free
*** END OF STATION DUMP ***

sam dump station --groups
*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 57 minutes 3 seconds, admins: jozwiak terekhov
Known batch systems: lsf
Default batch system: lsf
No replica selection criteria
There are 0 authorized transfer groups
Minimum delivery is 1KB; external deliveries are unconstrained
AUTHORIZED GROUPS:
group test: admins: jozwiak, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/150, disk: KB/40GB, locks: 0B/0KB
group test1: admins: jozwiak, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/40, disk: KB/30GB, locks: 0B/0KB
group test2: admins: jozwiak, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/50, disk: KB/40GB, locks: 0B/0KB
*** END OF STATION DUMP ***
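Each group in the dump above has a disk quota and an LRU (least recently used) swap policy. How such a policy behaves can be sketched generically as below. This is not SAM's implementation: `LRUDiskCache` and its methods are hypothetical, and sizes are in arbitrary units.

```python
from collections import OrderedDict


class LRUDiskCache:
    """Toy cache with a size quota that evicts least recently used files."""

    def __init__(self, quota):
        self.quota = quota
        self.used = 0
        self.files = OrderedDict()  # name -> size, least recently used first

    def access(self, name, size):
        """Record an access and return the files evicted to make room."""
        if name in self.files:
            # Cache hit: mark the file as most recently used.
            self.files.move_to_end(name)
            return []
        if size > self.quota:
            raise ValueError("file is larger than the whole quota")
        evicted = []
        # Cache miss: drop the oldest files until the new one fits.
        while self.used + size > self.quota:
            old_name, old_size = self.files.popitem(last=False)
            self.used -= old_size
            evicted.append(old_name)
        self.files[name] = size
        self.used += size
        return evicted


cache = LRUDiskCache(quota=100)
cache.access("a", 40)
cache.access("b", 40)
cache.access("a", 40)              # touch "a", so "b" is now the oldest
evicted = cache.access("c", 40)    # does not fit until "b" is dropped
print(evicted, list(cache.files))
```

A real station would also have to respect locks (pinned files), which the dump lists as a separate quota; the sketch leaves that out.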

sam dump station --projects
*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 1 hours 4 minutes 49 seconds, admins: jozwiak terekhov
Known batch systems: lsf
Default batch system: lsf
No replica selection criteria
There are 0 authorized transfer groups
Minimum delivery is 1KB; external deliveries are unconstrained
PROJECT MANAGER: fileReleaseTO = 1 days, max files given to project: Unlimited
NO PROJECTS
*** END OF STATION DUMP ***

[Diagram, repeated on each of the following slides: node fcdfdata016 with Station central-analysis (smaster), Disks/Cache and Stager (stagerng). Text shown slide by slide:]
sam submit --script=userscript --group=groupname --cpu-per-event= --defname=
>>>>>> Starting project with the Station Master
Station Master contacted, result: Started project (49008_sam_) for group test
Waiting for the project to initialize...
Callback from server: 'OK|Project is ready' [Project (pmaster) appears in the diagram]
>>>>>> Submitting the job to the batch system.
Job is submitted to queue. [Batch (LSF) job appears in state PSUSP]

sam dump station --projects
*** BEGIN DUMP STATION chris version v3_2_2 running at nglas08 1 hours 12 minutes 44 seconds, admins: jozwiak terekhov
Known batch systems: lsf
Default batch system: lsf
No replica selection criteria
There are 0 authorized transfer groups
Minimum delivery is 1KB; external deliveries are unconstrained
PROJECT MANAGER: fileReleaseTO = 1 days, max files given to project: Unlimited
STATION PROJECTS:
project 49205_sam_(49205) user jozwiak.test started 01 Nov 14:08:45 UNIX pid still wants/currently uses 5/0 files
*** END OF STATION DUMP ***

sam dump project --project=49205_sam_
*** BEGIN GPM DUMP ***
Input files:
: sim.ztautau.1000evts c5.01, size=0K, unbuffered yet
: sim.ztautau.1000evts c5.02, size=0K, unbuffered yet
: sim.pmc02_01.pythia.ztautau_mb1.1av_200evts.267_1553, size=0K, unbuffered
: sim.pmc02_01.pythia.ztautau_mb1.1av_200evts.276_1152, size=0K, unbuffered
Cached (not buffered) files: (none)
Buffered files: (none)
External files with delivery problems: (none)
Umer contexts (name, state, join time, nSeen): (no umers)
Proc contexts (ID: name, state, join time [, current|last]): (no procs)
Processes waiting for call back: (none)
*** END GPM DUMP ***

[Diagram animation on fcdfdata016, caption "Job is submitted to queue.": Station central-analysis (smaster), Disks/Cache, Stager (stagerng), Project (pmaster) and the Batch (LSF) job in state PSUSP. Successive frames show the Optimizer, then an eworker, then encp, then Enstore. The batch job then changes to RUN, and samscript.sh, userscript and the consumer appear.]

SAMManager:sam Getting next input file...
SAMManager:sam Project master will call back.

sam dump project --project=49225_sam_
*** BEGIN GPM DUMP ***
Input files:
: d0g.test_file_1G_a_dev.0001_001, size=0K, unbuffered yet
: d0g.test_file_1G_a_dev.0002_001, size=0K, unbuffered yet
Cached (not buffered) files: (none)
Buffered files: (none)
External files with delivery problems: (none)
Umer contexts (name, state, join time, nSeen):
36422: jozwiak(test-harness:1), active, 05 Nov 13:59:09, 31
Proc contexts (ID: name, state, join time [, current|last]):
: wait, 05 Nov 13:59:10,
Processes waiting for call back:
CID=36422: (05 Nov 20:53:43)
*** END GPM DUMP ***

[Diagram animation continued on fcdfdata016: with the Batch (LSF) job in RUN and samscript.sh, userscript and the consumer active, the eworker, encp and Enstore disappear. Later frames show rm, then the Optimizer, then an eworker with rcp and Other Cache. In the closing frames the job and then the Project (pmaster) disappear.]
sam submit…. sam run project…
[Final frames: three Projects (pmaster) side by side with batch states RUN, RUN and PSUSP, together with samscript.sh, userscript, consumer, eworker, rcp, encp, Other Cache and Enstore.]

SAM Animation
worldScenerio.html

Storing Files
Getting things to tape from Glasgow

[Diagram: node fcdfdata016 with Disks, FSS Central-analysis (fss) and Stager (stagerng).]

sam dump fss
FSS version v3_2_2 at station central-analysis running on fcdfdata016.fnal.gov 6 hours 57 minutes 34 seconds
No routing (all transfers are direct)
Configuration for operation retrial (count, interval/timeout)
DBS contact: 3, 1 hours
Opter contact: 1, 1 hours
Authorization receipt: 1, 1 hours
Stager contact: 1, 1 hours
Transfer (retrials upon timeout and upon failure): 3, 6 hours
Relay (multi-stage routing only): 3, 1 hours
File Storage Server Dump:
Stagers are known at nodes: fcdfdata016.fnal.gov
No requests ever submitted
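The FSS dump lists each operation with a (count, interval) pair for retrials, for example 3 retrials at 6 hours for transfers. A generic retry helper of that shape might look like the sketch below. It is not the FSS code: `with_retries` is hypothetical, and it assumes the count means retrials after the first attempt, which the dump does not spell out.

```python
import time


def with_retries(operation, retries, interval_s, sleep=time.sleep):
    """Run operation(); on failure retry up to `retries` more times,
    waiting `interval_s` seconds between attempts."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as err:  # broad on purpose in this sketch
            last_error = err
            if attempt < retries:
                sleep(interval_s)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


# Example: a transfer that fails twice and then succeeds.
attempts = []


def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise IOError("transfer timed out")
    return "stored"


waits = []  # record the waits instead of actually sleeping for hours
result = with_retries(flaky_transfer, retries=3, interval_s=6 * 3600,
                      sleep=waits.append)
print(result, len(attempts), waits)
```

Passing `sleep` as a parameter keeps the example testable without real delays.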

sam store descrip.py --source= [--dest=/pnfs…..]
descrip.py: metadata, info about the file. SAM checks the info, checks the location,
[Diagram, over three slides: node fcdfdata016 with Disks, FSS Central-analysis (fss) and Stager (stagerng); the last frame adds an eworker using encp, rcp, bbftp.]

[Diagram animation: a Node from Really Far Away with its Disk, Fss and Stager, and fcdfdata016 with Fss central-analysis, Stager, Tmp Disk and Enstore. Frames show in turn: Fss routing via fcdfdata016; "sam store enstore"; an eworker running bbftp to fcdfdata016; an eworker running encp; rm.]

D0 SAM
D0 relies entirely on SAM for analysis

[Plot: D0 Files, Files/Day]

[Plot: D0 Data Volume, 1TB-3TB/day]

[Plot: D0 Files Per Month By Year; annotations: ",000 files", "Run II Start"]

[Plot: D0 Total Files; 2.5 Million Files Served]

[Plot: D0 Data Per Month By Year; annotations: "50 TB per month", "Run II Start"]

[Plot: D0 Total Data Moved; 700TB moved]

Progress in SAM: CDF
All 800,000 CDF data files are in SAM.
SAM is in beta testing on the CDF CAF (1200 CPUs): passed 20TB/Day delivery.
Karlsruhe uses SAM routinely.
Minos uses SAM for its Data Handling.
Steve Mrenna (Phenomenology) is depositing ALPGEN files in SAM for common CDF/D0 use.

Florida workshop: 11 installations in about 2 hours.
Integrated with dCAF in 2 cases in 2 days.
3 in Asia, 4 in Europe.
6 sites committed to summer 2004 usage of their facilities for all of CDF (mostly MC).
SAM installation now: initsam cdf
Follow-up on April 1.
Each site has a local user support person to reduce load on the core development team.
Generally: security ate 80% of the effort! Now 20!

[Plot: CDF. Florida Workshop: After 2 Days]

[Plot: 2TB/Day: Karlsruhe]

[Plot: CDF dCache on CAF; all CDF on CAF reads 25TB/Day; non-Grid running]

[Plot: CDF Files in a Month; Karlsruhe: 1500 files/Day]

[Plot: CDF Events Transfer in a Month; Karlsruhe: 5-10M Evt/Day]

[Plot: All CDF Files Moved by SAM; 300K Files (D0: 2.5M files)]

[Plot: Total CDF Data Moved; 200 TB (D0: 700TB)]

Advantage of Local Processing
Karlsruhe processes 2TB/day. The rest of CDF on the Central Cluster processes 25TB/day.
(450 processors, 8 experiments, 10/13TB disk filled.)
5 users are active at Karlsruhe. They make ntuples for bottom and top physics for 15 people.
100 users are active for the rest of CDF.
They pin the datasets of interest; copy new ones automatically.

dCache and SAM
dCache shapes traffic into disk: if a SAM cache is large, need to use dCache instead of NFS mounts.
dCache gives the user what is requested. 1TB gets the same priority as 1GB: CDF users must send requesting data to be staged.
SAM examines consumption rate before staging next files – No needed.
SAM uses dCache for its caching at FNAL. This needs further work with SRM.

In the near-term future: JIM
Adding Grid Standard Tools

CDF Grid Strategy
25% of CDF Computing from external resources.
All CDF computing on CDF Grid by April 15: utilize resources fully controlled by CDF: Kerberos/fbsng: dCAF + SAM
October 15, 2004: JIM to capture shared resources
June 2005: 50% of Computing resources external

Simple JIM
[Diagram: Desktop Anywhere; Condor centers; SAM DB; Condor; Globus GK; CAF Submitter; SAM each site; WN; Private LAN; dCache. Timeline: June 2004 testing, June 2005 required.]

Detailed JIM
[Diagram labels: Site; Resource Selector; Info Collector; Info Gatherer; Match Making; User Interface; Submission; Global Job Queue; Grid Client; Global DH Services; SAM Naming Server; SAM Log Server; Resource Optimizer; SAM DB Server; RCMetaData Catalog; Bookkeeping Service; SAM Stager(s); SAM Station (+other servs); Data Handling; Worker Nodes; Grid Gateway; Local Job Handler (CAF, D0MC, BS,...); JIM Advertise; Local Job Handling; Cluster; AAA; Dist.FS; Info Manager; XML DB server; Site Conf.; Glob/Loc JID map; ...; Info Providers; MDS; MSS; Cache; Site Web Serv; Grid Monitoring; User Tools. Legend: flow of job, data, meta-data.]

Conclusions
CDF has embraced the need for the Grid to achieve its physics mission.
SAM is working for D0 and growing in CDF.