Project Status Report : SAMGrid

1 Project Status Report : SAMGrid
SAMGrid Management, Status, Operations – Merritt
SAMGrid Development I – Veseli
SAMGrid Development II – Kennedy
SAMGrid Future Plans – St. Denis
18 Feb 2004 Computing Division Project Status Report

2 SAMGrid Project Description
Purpose: Provide data handling services to Run II experiments and to other interested experiments with similar problems. These services should scale in performance and convenience for cataloging and delivering petabyte-sized datasets, and should evolve to become available in relevant Grid environments.
Current stakeholders: CDF, DØ, MINOS, CD
Duration: Development effort is expected to extend through ’07 as the components move to become Grid services. High-level maintenance (i.e., effort that includes the capability to respond to feature requests) is expected to continue through at least the data collection lifetime of the stakeholders.

3 The SAM-Grid Team
Revised Management Plan went into effect Dec 03
Project Co-Leaders:
  Wyatt Merritt, CD/DØCA
  Rick St. Denis, CDF/U Glasgow
Project Technical Co-Leaders:
  Rob Kennedy, CD/CDF
  Sinisa Veseli, CD/DØCA
CCF: Andrew Baranovski, Gabriele Garzoglio, Igor Terekhov
CEPA: Carmenita Moore, Steve White (0.5 FTE)
CDF: Randy Herber, Art Kreymer, Stefan Stonjek (GS)
DØCA: Lauri Loebel Carpenter, Robert Illingworth, Adam Lyon

4 The SAM-Grid Team - Extended
Database support (CSS-DSG): Diana Bonham, Anil Kumar
Associated internal projects:
  RUNJOB (with CMS)
  Authorization Project (with CMS, still being defined)
Associated external projects:
  PPDG: Sankalp Jain, Aditya Nishandar
  GridPP: Morag Burgon-Lyon, Valeria Bartsch, Iain Bertram, Dave Evans, Peter Love
  SBIR II: Matt Vranicar, Jeremy Simmons, Josh Gramlich, Ngan MacDonald, John Grace

5 SAMGrid Project Management & Organization
Project co-leaders
  Represent largest stakeholders: requirements & priorities
  Run weekly design meetings
Project technical leaders
  Run weekly operations meeting
  Conduct subproject assessments
Active Subprojects: C++ API, DBServer, JIM, H Stream Reco for CDF, Caching, Chains&Links, CDF DFC, Test Harness, Linux deploy of DBServers, Config Man
Planned Subprojects: Request system, Autodest, Further monitoring (MIS)
Related Subprojects: d0tools, SBIR II, Condor mods, workflow packages for CDF & D0, Authorization & Accounting
Recently completed Subprojects: Python API, V5.1 Schema Design, Batch Adapter, D0 Online dcache TDP, 1st Gen Monitoring Tools, Data Dimensions Grammar

6 SAMGrid Components
Event/File Catalog for metadata (contents & processing) and locations
Dbservers for accessing catalog
Station servers for file delivery to projects
Optimizer
File storage server
Interface to station cache and MSS (samcp)
JIM components for Grid job submission & monitoring
User API
C++ client API

7 Status and deployments of SAMGrid For DØ
FNAL: online, reco farm, d0mino, cab, new cab, clued0
Monte Carlo production sites
Remote analysis sites: ~20 active, ~40 deployed
Operational 11/03 – 2/04 for remote reconstruction: IN2P3, UKGrid (Manchester/ICL/RAL), WestGRID, GridKA, NIKHEF
  M events reprocessed remotely
Stats: ~78K proj FNAL, >14K proj remote (since 1/1/03)
  billion evts, 3 PB, 8 M files consumed (all D0 stations)
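As a rough cross-check of the DØ totals above, the quoted 3 PB and 8 M files imply a mean file size of a few hundred MB. The following is a minimal sketch of that arithmetic, assuming decimal units (1 PB = 10^15 bytes) and assuming both figures cover the same set of consumed files; the function name is illustrative and is not part of SAM.

```python
def mean_file_size_mb(total_bytes: float, n_files: float) -> float:
    """Return the mean file size in decimal megabytes (1 MB = 1e6 bytes)."""
    if n_files <= 0:
        raise ValueError("n_files must be positive")
    return total_bytes / n_files / 1e6


PB = 1e15  # assumption: decimal petabyte

# Figures quoted on the slide: 3 PB and 8 M files consumed (all D0 stations).
d0_mean_mb = mean_file_size_mb(3 * PB, 8e6)
print(f"Mean DØ file size: {d0_mean_mb:.0f} MB")
```

If the slide's PB is binary rather than decimal, the result shifts by roughly 13%, so the figure is best read as an order-of-magnitude estimate.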

9 D0 Files
[Plot: D0 files per day]

10 D0 Files Per Month By Year
[Plot: D0 files per month, 1999 to 2003; annotations: "100,000 files", "Run II Start"]

11 D0 Total Files
[Plot: D0 total files; annotation: "2.5 Million Files Served"]

12 D0 Total Data Moved
[Plot: D0 total data moved; annotation: "700 TB moved"]

13 Status and deployments of SAMGrid For CDF
Operational 24/7 to store online metadata
Operational at remote stations: ~15 active, ~30 deployed
  Large recent increase: Florida workshop!
In testing for Monte Carlo production
File delivery tests up to 20 TB on testcaf
Statistics: ~3000 proj total (since 1/1/03)
Note: the CDF usage pattern differs from DØ. CDF moves more GB (but not more events) because it does not use a small summary format like the DØ thumbnail.

14 CDF Florida DH Workshop
Now 20! 11 installations in about 2 hours.
Integrated with dCAF in 2 cases in 2 days.
3 in Asia, 4 in Europe
6 sites committed to summer 2004 usage of their facilities for all of CDF (mostly MC)
SAM installation now: initsam cdf <stationname>
Follow-up on April 1.
Each site has a local user support person to reduce load on core development team.
Generally: Security ate 80% of the effort!

16 2 TB/Day: Karlsruhe

17 CDF Dcache on CAF
ALL CDF on CAF reads 20 TB/Day
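The daily volumes quoted for Karlsruhe (2 TB/day) and for CDF dCache reads on the CAF (20 TB/day) are easier to compare with network and disk limits when expressed as a sustained rate. This is a minimal sketch of the conversion, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a load spread evenly over 24 hours, which real traffic will not be; the names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_s(tb_per_day: float) -> float:
    """Convert a daily volume in decimal TB to an average rate in MB/s."""
    return tb_per_day * 1e12 / SECONDS_PER_DAY / 1e6


# Daily volumes as quoted on the slides.
quoted_rates = {"Karlsruhe": 2.0, "CDF dCache on CAF": 20.0}

for name, tb_per_day in quoted_rates.items():
    rate = tb_per_day_to_mb_per_s(tb_per_day)
    print(f"{name}: {tb_per_day:g} TB/day is about {rate:.0f} MB/s sustained")
```

Because the average hides peaks, the instantaneous rate the systems must sustain is likely higher than these values.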

20 Status and deployments of JIM
Job broker, execution and submission site software, job monitor, client software for grid job submission
Deployment plan for DØ Monte Carlo
  Test at 3 sites (Manchester, CCIN2P3, Wisconsin) with basic functionality and measure efficiency of job completion
  Verify use by experimenter for job submission (this week)
  Add merging
  Move to production at these 3 sites (DØ milestone: Mar 1)
  Add remainder of DØ MC sites (Lancaster, SAR, NIKHEF, Prague)
  Improve brokering algorithm
Efficiency by site (Manchester, CCIN2P3, Wisconsin)
  JIM eff: >99%
  Site eff x Code eff: ~85%, ~60%
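The efficiency figures above separate the JIM layer from the site and experiment code. One way to read them is as factors of an end-to-end job completion efficiency. The sketch below multiplies the quoted factors, assuming the stages fail independently, which the slide does not state; 0.99 is used as a lower bound for ">99%", and since the mapping of ~85% and ~60% to individual sites is not recoverable from the transcript, the values are left unlabeled.

```python
def overall_efficiency(*stage_efficiencies: float) -> float:
    """Multiply per-stage efficiencies, assuming independent stages."""
    result = 1.0
    for eff in stage_efficiencies:
        if not 0.0 <= eff <= 1.0:
            raise ValueError(f"efficiency out of range: {eff}")
        result *= eff
    return result


JIM_EFF = 0.99  # slide quotes ">99%", so this is a lower bound

# "Site eff x Code eff" values as quoted on the slide (site mapping unknown).
for site_code_eff in (0.85, 0.60):
    total = overall_efficiency(JIM_EFF, site_code_eff)
    print(f"site x code = {site_code_eff:.0%} -> end-to-end about {total:.1%}")
```

Under this reading the grid middleware costs at most about one percentage point, and nearly all job loss comes from site and application issues.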

21 JIM Issues
Site operational requirements (e.g. clock synch, disk & node reliability, OS issues)
Experiment operational requirements (e.g. code footprint may exceed site capability and is variable w/ release)
File transfer capabilities & policies: cf. mtg this week w/ GridKA rep
Allocation of services to head node vs worker nodes
Sandboxing mechanisms (last week design mtg)
Merging mechanism, brokering (this week design mtg)

22 Operational Model
Experiments provide shifters for 1st line problem fielding and solving
Project provides on-call list from developers
At DØ, on average ~60 – 80% of problems are answered by shifters
Classes of problems
  Routine jobs like adding info to database
  Less routine: cleanup after failed stores
  Answering user questions regarding usage
  Updating documentation
  Investigating user reports of problems, and problems visible in project monitoring tools
  Providing solutions for problems

23 Operations Outlook
Improve documentation with aim of improving shifter & user ability to diagnose/solve problems
Expect doubling of central station capacity at DØ
Expect transition to more SAM usage at CDF
Expect Grid operations in production for simulation, first at DØ then at CDF

