Sept 13 2004, Wyatt Merritt, Run II Computing Review. Status of SAMGrid / Future Plans for SAMGrid: brief introduction to SAMGrid; status and deployments.

Presentation transcript:

Slide 1: Status of SAMGrid / Future Plans for SAMGrid
• Brief introduction to SAMGrid
• Status and deployments of SAM
• Status and deployments of JIM
• Problems encountered/solved/unresolved
• The SAMGrid team
• SAMGrid services & development plans

Slide 2: Introduction to SAMGrid
• Data handling system with the following capabilities:
  - File storage from online and processing systems
  - Routed file delivery to (possibly parallel) processes
  - Dataset creation based on file metadata
  - Dataset creation based on processing metadata
  - Local and remote job submission
  - User authentication
  - Local and remote monitoring capabilities
• Looking at SAM in operation (SAM DØ, SAM CDF):
  - Currently created from log files
  - Version in development is created from MIS database, filled by new MIS server

Slide 3: Status and deployments of SAM
• Active stations (Sep 03 figures in brackets): DØ 40 (… FNAL) [36 (9)]; CDF 26 (… FNAL) [9 (0)]
• Accomplishments since September 03:
  - DØ reprocessing, Nov 03 to Jan 04
  - Deployment of V5.1 SAM schema (1st common version) in production for CDF & DØ
  - Deployment of new access methods to V5.1 schema, new station to match, new Python and C++ APIs
  - Station changes in response to operational issues
  - Deployment of new test harness
  - Deployment of new installation procedures
  - Deployment beginning for MINOS
  - New Monitoring & Information Server in testing
  - SAM schema being tested by USCMS

Slide 4: Station Fixes/Enhancements, Sep 03 to Sep 04
• Numerous changes for field issues; a recent example is the reconfiguration of the cache replacement algorithm
• Dramatic improvement in the station's ability to handle large request queues: it supports much larger projects than 1 year ago and processes requests an order of magnitude faster overall, to catch up with CDF CAF data throughput
• Other developments for SAM on CAF
• Interface changes that allow greater flexibility in station clients, such as TCP callbacks and TCP polls (yes, a client can be behind a one-way firewall and still get files)
• Changes required to work with new dbserver
• Station integration with SRM: code ready, simple demo performed

Slide 5: Usage Statistics for D0 SAM
[Plots, GB/month: D0 CAB, D0 CAB1, D0 Farm, D0 ALL]
Sum = 2.1 PB; 50B evts

Slide 6: SAM Statistics - DØ
[Plot: files delivered by month; annotation: Run II begins]

Slide 7: Usage Statistics for CDF SAM
[Plots: CDF-SAM GB/month, CDF-SAM GB/day, CDF-ALL GB/month, CDF-ALL GB/day]
Sum = 1.5 PB; 12B evts

Slide 8: Total CDF Files To User
[Plot; annotation: 1000 TB]

Slide 9: SAM Statistics - Operations Data
• Time between Request Next File and Open File
• For CAB and CABSRV1:
  - 50% of enstore transfers occur within 10 minutes
  - 75% within 20 minutes
  - 95% within 1 hour
• For CENTRAL-ANALYSIS and CLUED0: 95% of enstore transfers within 10 minutes

Station      % no wait
CAB          30%
CABSRV1      40%
CLUED0       38%
CA           18%
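Figures like the ones on this slide can be derived from per-transfer wait times with a few lines of Python. The following is a minimal sketch, assuming a plain list of waits (in minutes) between "request next file" and "open file". The function name and the sample values are hypothetical illustrations, not SAM code or measured data, and treating a wait of zero as the "% no wait" case is an assumption.

```python
def fraction_within(waits_min, limit_min):
    """Return the fraction of transfers whose wait is at most limit_min minutes."""
    if not waits_min:
        raise ValueError("no transfers recorded")
    return sum(1 for w in waits_min if w <= limit_min) / len(waits_min)


# Hypothetical wait times in minutes (illustrative only, not measured data).
sample_waits = [0, 0, 0, 4, 8, 9, 12, 15, 18, 19,
                25, 30, 35, 40, 45, 50, 55, 58, 59, 90]

# Report the share of transfers completed within each threshold.
for limit in (10, 20, 60):
    print(f"within {limit:>2} min: {fraction_within(sample_waits, limit):.0%}")

# Assumption: a wait of zero corresponds to the "% no wait" row of the table.
no_wait = sum(1 for w in sample_waits if w == 0) / len(sample_waits)
print(f"no wait: {no_wait:.0%}")
```

Counting against fixed thresholds, rather than computing interpolated percentiles, matches how the slide states its numbers ("within 10 minutes", "within 1 hour").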

Slide 10: Status and deployments of JIM
• Active execution sites (Sep 03 figures in brackets): DØ 10 (… FNAL) [0]; CDF 1 in testing [0]
• Accomplishments since September 03:
  - Solutions provided for sandboxing and merging
  - Deployment for Monte Carlo production for DØ at multiple sites
  - Testbed established at Fermilab on General Purpose farm
  - In testing for CDF MC production
  - In testing for DØ reprocessing from raw data
  - Demonstration of use of sam_client on LCG site

Slide 11: The SAM-Grid Team
Project Co-Leaders: Wyatt Merritt (CD/Run II); Rick St. Denis (CDF/U Glasgow)
Technical Co-Leaders: Rob Kennedy (CD/CCF); Sinisa Veseli (CD/Run II)
Core Developers (SAM components): Lauri Loebel Carpenter, Andrew Baranovski, Steve White, Carmenita Moore*, Adam Lyon, Petr Vokac***, Mariano Zimmler***, Matt Leslie
Core Developers (JIM components): Igor Terekhov**, Gabriele Garzoglio, Sankalp Jain**, Aditya Nishandar**
Support for CDF Migration: Fedor Ratnikov, Randy Herber, Art Kreymer, Morag Burgon-Lyon**, Valeria Bartsch, Stefan Stonjek, Krzysztof Genser
Database support: Anil Kumar
Associated external projects: PPDG, GridPP, SBIR II
Core development: 7 FTEs
* Deceased | ** Left project | *** Summer Students

Slide 12: Problems Encountered/Solved/Unresolved
• Grid operational stability in DØ MC deployment: initial efficiency 80% took ~2 months
• Installation difficulty for CDF: doc week, initial config man subproject; follow-ups in both areas
• Major operational issues, Sep 03 to Sep 04:
  - DØ: hardware problems with production database machine (central point of failure), Dec 03 to Jan 04; 15% drop in file delivery
• Contentious design issues, Sep 03 to Sep 04 (issue: outcome):
  - CDF, file name as GUID: no change to model
  - CDF, interface into experiment framework: work in SAM
  - CDF, communication with dcache: work in SAM, future work
  - CDF, use of dimensions and parameters: proposed work in SAM
  - CDF, process bookkeeping: future work in SAM
  - MINOS, file delivery ordering & grouping: no change to model

Slide 13: SAMGrid & Grid Services
• Distributable sam_client provides access to:
  - VO storage service (sam store command, interfaced to sam_cp)
  - VO metadata service (sam translate constraints)
  - VO replica location service (sam get next file)
  - Process bookkeeping services
• JIM components provide:
  - Job submission service via Globus Job Manager, augmented by some VO requirements
  - Job monitoring service from remote infrastructure
  - Authentication services

Slide 14: Grid Discussions & Activities
• SAMGrid design discussions: ongoing
• CDF Grid Workshops: January 04, Florida; April 04, FNAL; September 04, FNAL
• DØ Grid Workshops: April 04, London; August 04, Wuppertal
• LHC Metadata Working Group: April 04, Glasgow
• CMS RTAG input & prototyping activity: August 04
• 12 CHEP papers on SAMGrid work (6 talks, 6 posters)
• FermiGrid discussions: beginning

Slide 15: Extra Slides

Slide 16: What is SAM?
• Data handling system for Run II DØ and CDF
• SAM manages file storage (replica catalogs)
  - Data files are stored in tape systems at FNAL and elsewhere (most use ENSTORE at FNAL)
  - Files are cached around the world for fast access
• SAM manages file delivery
  - Users at FNAL and remote sites retrieve files out of file storage. SAM handles caching for efficiency
  - You don't care about file locations
• SAM manages file meta-data cataloging
  - SAM DB holds meta-data for each file. You don't need to know the file names to get data
• SAM manages analysis bookkeeping
  - SAM remembers what files you ran over, what files you processed successfully, what applications you ran, when you ran them and where
• Designed for PETABYTE (10^15) sized experiment datasets (that's us)!

Slide 17: SAM Terms and Concepts
• A project runs on a station and requests delivery of a dataset to one or more consumers on that station.
• Station: processing power + disk cache + (connection to tape storage) + network access to SAM catalog and other station caches
  - Example: a Linux analysis cluster at D0
• Dataset: metadata description which is resolved through a catalog query to a list of files. Datasets are named.
  - Examples (syntax not exact):
    data_type physics and run_number and data_tier raw
    request_id 5879 and data_tier thumbnail
• Consumer: user application (one or many exe instances)
  - Examples: script to copy files; reconstruction job
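The relationship between these terms can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the SAM API: the classes Catalog and Project, the method names, and the file names are all hypothetical, and the station (cache, tape access) is left out entirely. It shows only the two ideas from the slide: a dataset is a metadata description resolved by a catalog query to a file list, and a consumer receives files from a project one at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy metadata catalog (hypothetical): file name -> metadata dictionary."""
    files: dict

    def resolve(self, **constraints):
        """Resolve a dataset definition (metadata constraints) to a sorted file list."""
        return sorted(
            name for name, meta in self.files.items()
            if all(meta.get(key) == value for key, value in constraints.items())
        )


@dataclass
class Project:
    """Toy project (hypothetical): hands out dataset files and records deliveries."""
    pending: list
    delivered: dict = field(default_factory=dict)  # consumer name -> files handed out

    def get_next_file(self, consumer):
        """Give the next undelivered file to a consumer, or None when none remain."""
        if not self.pending:
            return None
        name = self.pending.pop(0)
        self.delivered.setdefault(consumer, []).append(name)
        return name


# Illustrative catalog contents; the file names are made up.
catalog = Catalog({
    "f1.raw": {"data_tier": "raw", "data_type": "physics"},
    "f2.raw": {"data_tier": "raw", "data_type": "physics"},
    "f3.tmb": {"data_tier": "thumbnail", "data_type": "physics"},
})

# The dataset is defined by metadata, never by listing file names directly.
project = Project(pending=catalog.resolve(data_type="physics", data_tier="raw"))

# The consumer keeps asking for the next file until the project has none left.
while (name := project.get_next_file("reco_job_1")) is not None:
    print("processing", name)
```

Recording each delivery per consumer mirrors, in a very reduced form, the bookkeeping role described for SAM: the project knows which files each consumer was given.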

Slide 19: SAM Statistics - Operations Data

Slide 20: The following two slides are the two-year work plan presented at last year's review, updated as follows. Bullets in green are finished. Bullets in blue are in progress. Bullets left black aren't started yet.

Slide 21: SAM-Grid: The work plan for the next 2 years
• Implement Schema Update I (file_type, runs changes).
• Automate MC production; understand issues in automating job distribution for re-processing and analysis.
• Revise caching strategies (local vs fileserving; merging operations; connections w/ other layers).
• Implement Schema Update II (processing requirements for jobs, group info).
• Equip optimizers and job brokers to deal w/ info in Schema Update II.
• Sort out parallelization issues.
• Implement Virtual Organization tools.
• Implement Monitoring and Information server on the SAM side.
• Provide for distributed database: two parts, file location info and processing info; equip servers for more autonomous operation.

Slide 22: SAM-Grid: The work plan for the next 2 years
• Evaluate technology changes/upgrades
  - Improvements for installation/config management?
  - Move to VDT suite (production version of Condor, Globus, etc.)
  - Possible CORBA replacements: Web Services?
  - XML-based logging: will this be the way to go?
  - Which solution for distributed DBs?
• Plan for interoperability
  - Merge SAM catalog w/ other replica schemas? Follow example of DØ/CDF merge?
  - Interoperation with other replica catalogs?
  - GLUE schema for resource description; job description language
  - Sam_batch_adapter technology
  - Working with SRMs: await outcome of caching strategy discussions
  - Interactions of tools w/ data handling system: cf. mc_runjob & d0tools w/ JIM and CAF (CDF)
  - VO organization issues
  - Security issues (VO, file transfer, job submission)