Key Project Drivers - an Update. Ruth Pordes, June 14th 2008; V2: June 23rd. These slides are in addition to the information available in https://twiki.grid.iu.edu/pub/Management/20080605ETAgendaMinutes/OSG_Year3_Planning.pdf.

Presentation transcript:

Key Project Drivers - an Update. Ruth Pordes, June 14th 2008; V2: June 23rd. These slides are in addition to the information available in "The Year 3 Planning Process and Schedule".
- Goals & plans of the experiments
- OSG Strategic Roadmap - two slides of comment
- Feedback from DOE/NSF review - please use the slides above

WLCG: US ATLAS and US CMS tested their initial data-taking usage rates and throughput during CCRC'08. They were successful in spurts but, in some cases, not able to sustain the rates and throughput. In FY09 sustainability, robustness, and responsiveness are key. Contribute to the goals and metrics of ongoing WLCG operations and to specific activities such as CCRC'09.

US ATLAS and US CMS resource increases (current / planned / % increase):

Summary of US ATLAS Tier-2s:
- CPU (kSI2K): 4,948 / 6,369 / 22%
- Disk (TBytes): 1,566 / 2,467 / 37%

Summary of US CMS Tier-2s:
- CPU (kSI2K): 7,000 / 7,700 / 9%
- Disk (TBytes): 1,400 / 2,520 / 44%

ATLAS BNL Tier-1:
- CPU (kSI2K): 4,844 / 7,337 / 34%
- Disk (TBytes): 3,136 / 5,822 / 46%
- Tape (TBytes): 1,715 / 3,277 / 48%

CMS FNAL Tier-1:
- CPU (kSI2K): 4,300 / 5,100 / 16%
- Disk (TBytes): 2,000 / 2,600 / 23%
- Tape (TBytes): 4,700 / 7,100 / 34%

OSG must maintain performance and scalability of the infrastructure at these resource levels.
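A note on the quoted percentages: they do not match the usual (planned - current)/current convention (for example 4,948 → 6,369 kSI2K is a ~29% rise over current), but they are consistent with the growth expressed as a fraction of the planned capacity, i.e. (planned - current)/planned. A minimal sketch checking that reading against the table above (the "current / planned" column interpretation is an assumption, not stated on the slide):

```python
# Check the "% increase" convention used in the resource table above.
# Assumption: the two numbers per row are (current, planned) capacities.
rows = {
    "US ATLAS Tier-2 CPU (kSI2K)": (4948, 6369, 22),
    "US ATLAS Tier-2 Disk (TB)":   (1566, 2467, 37),
    "US CMS Tier-2 CPU (kSI2K)":   (7000, 7700, 9),
    "US CMS Tier-2 Disk (TB)":     (1400, 2520, 44),
    "BNL Tier-1 CPU (kSI2K)":      (4844, 7337, 34),
    "FNAL Tier-1 Tape (TB)":       (4700, 7100, 34),
}

for name, (current, planned, quoted_pct) in rows.items():
    growth_of_planned = 100 * (planned - current) / planned   # matches the quoted column
    growth_of_current = 100 * (planned - current) / current   # conventional "% increase"
    print(f"{name:30s} quoted {quoted_pct:2d}%  "
          f"of planned {growth_of_planned:4.1f}%  of current {growth_of_current:4.1f}%")
```

Under that reading the message for OSG is unchanged: the growth over current capacity ranges from about 10% (US CMS Tier-2 CPU) to nearly doubling (BNL Tier-1 disk and tape).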

WLCG MOU Goals - OSG support for LHC Tier-2s:
- provision of managed disk storage providing permanent and/or temporary data storage for files and databases;
- provision of access to the stored data by other centres of the WLCG;
- operation of an end-user analysis facility;
- provision of other services, e.g. simulation, according to agreed Experiment requirements;
- ensure network bandwidth and services for data exchange with Tier-1 Centres, as part of an overall plan agreed between the Experiments and the Tier-1 Centres concerned.
All storage and computational services shall be "grid enabled" according to standards agreed between the LHC Experiments and the regional centres.

Service level targets:
- End-user analysis facility: maximum delay in responding to operational problems 2 hours (prime time) / 72 hours (other periods); average availability measured on an annual basis 95%.
- Other services: maximum delay 12 hours (prime time) / 72 hours (other periods); average availability measured on an annual basis 95%.
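For orientation, the 95% annual availability target in the MOU table translates into roughly 18 days of allowed downtime per year; a small, purely illustrative sketch of that arithmetic:

```python
# Convert the WLCG MOU availability target into an allowed-downtime budget.
HOURS_PER_YEAR = 24 * 365          # ignoring leap years for this rough estimate

def downtime_budget(availability_pct: float) -> float:
    """Hours per year a service may be down while still meeting the target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)

for service, availability in [("End-user analysis facility", 95.0),
                              ("Other services", 95.0)]:
    hours = downtime_budget(availability)
    print(f"{service}: {hours:.0f} h/year allowed downtime (~{hours / 24:.1f} days)")

# The response-time columns are separate constraints: operational problems must be
# responded to within 2 h (analysis facility) or 12 h (other services) in prime time,
# and within 72 h outside prime time.
```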

WLCG MOU - OSG provides a Grid Operations Centre. Annex 3.4, Grid Operations Services: This section lists services required for the operation and management of the grid for LHC computing. It reflects the current (September 2005) state of experience with operating grids for high energy physics and will be refined as experience is gained.
- Grid Operations Centres - responsible for maintaining configuration databases, operating the monitoring infrastructure, pro-active fault and performance monitoring, provision of accounting information, and other services that may be agreed. Each Grid Operations Centre shall be responsible for providing a defined sub-set of services, agreed by the WLCG Collaboration. Some of these services may be limited to a specific region or period (e.g. prime shift support in the country where the centre is located). Centres may share responsibility for operations as agreed from time to time by the WLCG Collaboration.
- User Support for grid and computing service operations:
  - First level (end-user) helpdesks are assumed to be provided by the LHC Experiments and/or national or regional centres, and are not covered by this MoU.
  - Grid Call Centres - provide second level support for grid-related problems, including pro-active problem management. These centres would normally support only service staff from other centres and expert users. Each call centre shall be responsible for the support of a defined set of users and regional centres and shall provide coverage during specific hours.

WLCG Operations Centers in the US:
- Indiana University iGOC: scope of the service - Open Science Grid Operations Centre; period during which the centre operates as the primary monitoring centre - 24×7×52.
- BNL, Fermilab: scope of the service - US-ATLAS and US-CMS Virtual Organisation Support Centres respectively; period during which the centre operates as the primary monitoring centre - 24×7×52.

US ATLAS
Throughput goals:
- 200 MBytes/sec data distribution across Tier-0/Tier-1/Tier-2, robust and sustained.
- 14,000 simultaneous jobs (10,000 simultaneous jobs as of 5/2008); must scale with number of CPUs (Tier-1 + Tier-2 increase is 29%). (See the back-of-the-envelope sketch after this slide.)
Middleware needs:
- Dependable deployment, installation and configuration of a CE compatible with PANDA.
- Software for dCache, BeStMan and xrootd to meet the SRM V2.2 spec (WLCG SRM MOU & addendum).
- Support from a centralized expert group for use of storage on OSG sites.
- glexec integrated with PANDA and accepted by EGEE sites.
- Support for software in the VDT and coordination of requirements and deliverables of external software providers as needed by PANDA and US ATLAS.
- Support for PANDA deployment, monitoring, installation and extensions to meet ATLAS needs.
- Support for the OS needed by the experiment; expect to evaluate whether to move to Scientific Linux 5 and/or Scientific Linux 6.
- User and Worker Node Client tools consistent with EGEE and providing interoperability at all levels.
Contributions to the WLCG:
- Contributions to the WLCG deployment of the PANDA/pilot job infrastructure.
- Support for physics analysis based on PROOF: multi-user support in a general-purpose processing farm environment.
- Integration of EGEE-compatible security across storage, file transfer, Globus, and pilot jobs.
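To put the throughput goals in concrete terms: 200 MBytes/sec sustained corresponds to roughly 17 TB/day moved across the Tier-0/1/2 system, and the 14,000-job target sits somewhat above a straight 29% scaling of the 10,000 simultaneous jobs demonstrated in May 2008. The daily-volume figure is derived here, not stated on the slide:

```python
# Back-of-the-envelope numbers behind the US ATLAS throughput goals.
SECONDS_PER_DAY = 86_400

sustained_rate_mb_s = 200                      # MBytes/sec, robust and sustained
daily_volume_tb = sustained_rate_mb_s * SECONDS_PER_DAY / 1_000_000
print(f"200 MB/s sustained ~ {daily_volume_tb:.1f} TB/day across Tier-0/1/2")

jobs_demonstrated = 10_000                     # simultaneous jobs, May 2008
cpu_growth = 0.29                              # quoted Tier-1 + Tier-2 increase
scaled_jobs = jobs_demonstrated * (1 + cpu_growth)
print(f"Scaling 10,000 jobs by 29% gives {scaled_jobs:,.0f}; "
      f"the stated goal of 14,000 adds headroom beyond that")
```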

US ATLAS Service Needs (Critical? / Service / Interface to WLCG?):
- Y: Security monitoring, incident response, notification and mitigation - communicate and collaborate with EGEE and WLCG to the extent possible.
- Y: Accounting (CPU, storage & efficiencies) - yes.
- Y: US ATLAS specific accounting reports - no.
- Y: Reliability and availability monitoring - yes.
- Y: Integration and system validation of new and updated middleware - test interoperation of new releases with the EGEE infrastructure.
- N: User/VO monitoring and validation using the RSV infrastructure - perhaps.
- Y: Ticket handling - bi-directional: US ATLAS <-> OSG <-> GGUS, including alarms.
- Y: SRM V2.2 storage at Tier-2s - track WLCG deployments.
- Y: CE interface to meet ATLAS throughput needs.
- Y: Reporting of trends in usage, reliability, job state, job monitoring.
- Y: Grid-wide information system accessible to ATLAS applications - no.

US CMS
Throughput goals:
- Demonstrate a peak burst rate of >50 MBytes/second for one day (or one week), once a month, for each permutation of Tier-2 to all Tier-1s. (See the data-volume sketch after this slide.)
- Note: the goals of the WLCG Megatable are very out of date for CMS and should not be used.
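As a rough yardstick (derived here, not stated on the slide), holding a 50 MBytes/sec burst for one day moves a bit over 4 TB between a single Tier-2 and a Tier-1, and a one-week burst moves roughly 30 TB:

```python
# Data volume implied by the US CMS burst-rate goal for one Tier-2/Tier-1 pair.
rate_mb_s = 50                        # > 50 MBytes/sec peak burst
for label, seconds in [("one day", 86_400), ("one week", 7 * 86_400)]:
    volume_tb = rate_mb_s * seconds / 1_000_000
    print(f"50 MB/s held for {label}: ~{volume_tb:.1f} TB transferred")
```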

US CMS (continued)
Throughput goals:
- From the C-TDR: number of events produced and processed. Measurement is in ProdMon for production; for CRAB, TBD simultaneous jobs. In terms of 33% of the total number of CMS events/jobs produced on OSG.
Support for US CMS Tier-3s:
- OSG software releases & documentation, operations, software and troubleshooting support to allow US CMS Tier-3s to easily and dependably join and operate resources to receive, analyse and write derived results from US CMS data.
Middleware needs:
- Dependable deployment, installation and configuration of glideinWMS-based workload management.
- Software for dCache to meet the SRM V2.2 spec (WLCG SRM MOU & addendum).
- Evaluation of BeStMan in the US CMS environment.
- glexec support and acceptance by EGEE sites.
- Support for software in the VDT and coordination of requirements and deliverables of external software providers as needed by US CMS.
- Support for general pilot glideinWMS-based job management and throughput to meet US CMS needs, including effective policies and prioritization across sites.
- Support for the OS needed by the experiment (Scientific Linux 5, Scientific Linux 6, increased usage of MacOSX).
- User and Worker Node Client tools consistent with EGEE and providing interoperability at all levels.
Contributions to the WLCG:
- Data storage prioritization and access control to meet CMS needs.

US CMS Service Needs (Critical? / Service / Interface to WLCG?):
- Y: Security monitoring, incident response, notification and mitigation - communicate and collaborate with EGEE and WLCG to the extent possible.
- Y: GOC BDII Information System with accurate information published by all OSG sites that support the US CMS VO - reliably publish accurate information to the WLCG BDII. (See the query sketch after this table.)
- Y: Integration and system validation of new and updated middleware - test interoperation of new releases with the EGEE infrastructure.
- Y: Accounting (CPU, storage & efficiencies) - reliably publish information to the WLCG APEL database.
- Y: Reliability and availability monitoring - reliably publish information to the WLCG SAM and GridView databases.
- N: User/VO monitoring and validation using the RSV infrastructure - perhaps.
- Y: Ticket handling - bi-directional: US CMS <-> OSG <-> GGUS, including alarms.
- N: Troubleshooting and user support, especially support from a centralized expert group for use of storage on OSG sites.
- Y: SRM V2.2 storage at Tier-2s - track WLCG deployments.
- Y: CE interface to meet CMS throughput needs based on glideinWMS workload management; job submission interoperability - track use of CREAM and ensure WS-GRAM sites can be used by the CMS gLite WMS (when glideinWMS is used).
- Y: Reporting of trends in usage, reliability, job state, job monitoring - site-level dashboard of usage, job state, efficiencies and errors across US CMS OSG sites.
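The "GOC BDII Information System" row refers to the LDAP-based information system that WLCG tools query to discover site services. As an illustration of what "publishing accurate information" means in practice, here is a hedged sketch of a read-only query against a BDII endpoint; the hostname, the ldap3 Python library, and the choice of attributes are assumptions for the example (GlueCE and GlueCEUniqueID are GLUE 1.3 schema names), not something prescribed by the slide:

```python
# Sketch: query a BDII for the computing elements it publishes.
# Assumptions: is.grid.iu.edu:2170 stands in for a GOC BDII endpoint of that era,
# and the third-party ldap3 library is available (pip install ldap3).
from ldap3 import Server, Connection, ALL

server = Server("is.grid.iu.edu", port=2170, get_info=ALL)
conn = Connection(server, auto_bind=True)          # anonymous read-only bind

# GLUE 1.3 schema: CE entries carry GlueCEUniqueID and a state/status attribute.
conn.search(search_base="o=grid",
            search_filter="(objectClass=GlueCE)",
            attributes=["GlueCEUniqueID", "GlueCEStateStatus"])

for entry in conn.entries:
    print(entry.GlueCEUniqueID, entry.GlueCEStateStatus)
```

Checking that every OSG site supporting the US CMS VO appears in such a query, with correct values, is the kind of verification the table row is asking for.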

LIGO
Application needs:
- Full deployment and support of WS-GRAM across the majority of OSG sites.
- Support for data movement and placement on OSG sites for LIGO applications.
- Science results from running the Inspiral analysis on OSG.
- Evaluation of another LIGO science application on the OSG.
Middleware needs:
- Support for the LIGO OS -- Alain please fill in.
- Easier to deploy and upgrade software releases.
- Integration of and support for the LIGO security infrastructure.
Service needs (Critical?):
- Y: Security monitoring, incident response, notification and mitigation.
- Y: Accounting - integration of accounting with OSG accounting reports.
- Y: Integration and system validation of new and updated middleware.
- Y: Ticket handling.
- Y: CE WS-GRAM interfaces.
- Y: Reporting of trends in usage, reliability, job state, job monitoring.

Run II Needs
- Continued support for opportunistic use of OSG resources for simulation needs.
- Maintain deployed middleware and services compatible with existing experiment software.
- Throughput: DZero - 5 M events/week.

STAR
- Support for xrootd on OSG sites.
- Evaluation and possible use of CEDPS workspaces on OSG sites.

Comments on Milestones from the OSG Proposal (slides of "The Year 3 Planning Process and Schedule"): Reduce the "in-effectiveness" metrics of the Facility by 50%. To my mind this cuts across all areas in the Project, and it would be good if each area could think about it not only within their area but also about what could be gained by changing the boundaries and scope between areas, not just within the Facility itself. Also, as part of the planning it would be good if all area coordinators could define ~2-4 measurements and metrics to be able to measure "in-effectiveness". This is something I think Brian asked for but we have not had time to follow up on yet. When defining these please include a) who the measurement would be used by, b) the value (e.g. less user effort, fewer wasted resources, better communication, more throughput etc.) in reducing the "in-effectiveness", and c) how the measurement could be done.
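As a purely hypothetical illustration of the kind of 2-4 measurements being asked for here, an "in-effectiveness" metric could be defined from per-job accounting records as the fraction of delivered wall time consumed by failed or preempted jobs. The record fields below are invented for the sketch and are not an actual OSG or Gratia schema:

```python
# Hypothetical "in-effectiveness" measurement from per-job accounting records.
# The fields (wall_hours, status) are illustrative, not an existing OSG interface.
from dataclasses import dataclass

@dataclass
class JobRecord:
    wall_hours: float
    status: str          # e.g. "completed", "failed", "preempted"

def ineffectiveness(records: list[JobRecord]) -> float:
    """Fraction of delivered wall time that produced no useful output."""
    total = sum(r.wall_hours for r in records)
    wasted = sum(r.wall_hours for r in records if r.status != "completed")
    return wasted / total if total else 0.0

sample = [JobRecord(8.0, "completed"), JobRecord(6.0, "failed"),
          JobRecord(4.0, "completed"), JobRecord(2.0, "preempted")]
print(f"in-effectiveness: {ineffectiveness(sample):.0%}")   # 40% of wall time wasted
```

A metric defined this way would answer the three questions above: it names a consumer (the area coordinator and Facility management), a value (fewer wasted resources), and a concrete measurement procedure, and cutting it by 50% restates the milestone in measurable terms.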

Other Comments
Other things: Moving to software releases that can be updated incrementally from installed versions, rather than fully reinstalled releases, is quite a shift in our model and may change our processes quite a bit. I would like everyone to think about how this could impact, or could create deliverables for, their area. I would prefer you put things in your program of work that you think need doing, without regard to the effort available. I would also prefer that any holes you are worried about get included in the area plans rather than left to chance later. I would ask that we include documentation, technical reports, administrative overheads, and other cross-cutting tasks in each area somewhere as appropriate. If it is useful to merge, we can do that later.