Grid Deployment Area Status Report

Ian Bird, SC2, 14 March 2003

Deployment Status Summary
- GDB working groups reported and defined LCG-1
- First “packaged” version of LCG (LCG-0) is released and available
  - LCG-0 has been deployed at CERN, RAL, CNAF, (Taiwan)
- Expect middleware for LCG-1 (July) to be delivered from VDT/EDG end April
- Re-aligned EDG/LCG at CERN to share resources and people
  - Re-organisation of groups and people within IT also
- Collaborative activities developing in several areas
Ian.Bird@cern.ch

Grid Deployment Organisation
(Organisation chart)
- Experiments: ALICE, ATLAS, CMS, LHCb
- Policies, strategy, scheduling, standards, recommendations
- Grid deployment manager; grid deployment board; Grid Resource Coordinator
- CERN-based teams: LCG security group, LCG operations team, LCG toolkit integration & certification, grid infrastructure team, experiment support team, joint Trillium/EDG/LCG testing team
- Anticipated teams at other institutes: regional centre operations (several), operations call centre, core infrastructure, security tools, grid monitoring

Deployment Personnel
Columns: CERN, LCG, Other, Requested*
- Management: 1
- Certification & Test: 2 3
- Test group: 1.3
- System analysis & support
- Grid infrastructure (people moved from ADC and retain commitments to EDG WP4,6): 0.5 2*0.5
- Experiment integration & support
- Total: 13.8 + 6 1.5 8 4.3 6
* Expected to be fulfilled by INFN-funded Fellows by July

Personnel Issues
- Staff to support LCG-1 grid services have commitments to EDG test-bed support; this is part of the reason for rationalising EDG/LCG resources
- Have used 2 people from certification to help support test-beds/LCG Pilot
- Testing group is understaffed: need to find at least 3 full-time people to contribute here
- Don’t expect to get final LCG-funded effort before July

Milestones
- Define LCG-1 (Feb 1)
  - The GDB WG reports were published and discussed at the Feb 6 GDB.
  - Sufficient to define direction and issues to be addressed; used in planning and deploying
- Initial Pilot Service available (Feb 1), Pilot 2 (March 15)
  - Strategy at CERN changed: no separation between LCG and physics production systems.
  - Pilot cluster is available, configured as a minimum, but batch nodes can be moved between Pilot and LXBatch as needed. This makes more efficient use of the machines.
  - Pilot service worker nodes managed by FIO group.
  - Integrating LSF; addressing NFS vs AFS issues
- Deployment schedule
  - LCG-0 deployed to CERN, RAL, CNAF + Legnaro (T2), Taiwan; preparing for FNAL by end of March
  - This is actually ahead of the proposed schedule

LCG-1 Ramp-up Schedule
Pilot 1 start: Feb 1

No.  Date     Regional Center          Experiment
     15/2/03  CERN                     All
1    28/2/03  CNAF, RAL
2    30/3/03  FNAL                     CMS
3    15/4/03  Taiwan                   Atlas, CMS
4    30/4/03  Karlsruhe
5    7/5/03   IN2P3
6    15/5/03  BNL (want to wait)       Atlas
7    21/5/03  Russia (Moscow), Tokyo

LCG-1 Initial Public Service start: July 1
Tier 2 centres will be brought on-line in parallel once the local Tier 1 is up to provide support
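As an illustration only, the ramp-up schedule above can be held as data and queried for the centres scheduled to be on-line by a given date. This is a minimal Python sketch: the names `RAMP_UP` and `centres_online_by` are hypothetical and not part of any LCG tool, and the dates are simply those listed on the slide.

```python
from datetime import date

# (scheduled date, regional centres) pairs copied from the ramp-up schedule.
# RAMP_UP and centres_online_by are illustrative names, not LCG software.
RAMP_UP = [
    (date(2003, 2, 15), ["CERN"]),
    (date(2003, 2, 28), ["CNAF", "RAL"]),
    (date(2003, 3, 30), ["FNAL"]),
    (date(2003, 4, 15), ["Taiwan"]),
    (date(2003, 4, 30), ["Karlsruhe"]),
    (date(2003, 5, 7), ["IN2P3"]),
    (date(2003, 5, 15), ["BNL"]),  # the slide notes BNL "want to wait"
    (date(2003, 5, 21), ["Russia (Moscow)", "Tokyo"]),
]


def centres_online_by(day):
    """Return the centres whose scheduled date is on or before `day`."""
    online = []
    for scheduled, centres in RAMP_UP:
        if scheduled <= day:
            online.extend(centres)
    return online
```

Calling `centres_online_by(date(2003, 3, 14))`, the date of this report, would list only the centres scheduled in February.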

Milestones - 2
- Certification process defined (January)
  - This has been done: agreed common process with EDG
  - Have agreed joint project with VDT (US):
    - VDT provide basic-level (Globus, Condor) testing suites
    - We provide higher-level testing
  - Expect to get HEPCAL test-cases from GAG
  - Need to pull in other expertise, e.g. EDG WP8/loose cannons
  - Look at using common tools and frameworks (where it makes sense): NMI/VDT-LCG
    - We need to do this soon to avoid divergence
- Need much more effort on devising & writing tests
  - Real effort currently is only 2 people

Test and Validation process
(Process diagram)
- Stages, with where each runs and what happens there:
  - Unit Test (developers' machines): WPs add unit-tested code to CVS repository
  - Build (build system): run nightly build & automatic tests
  - Integration (development testbed, ~15 CPU): individual WP tests
  - Certification (certification testbed, ~40 CPU): grid certification
  - Production: certified public release for use by applications
- Actors: WPs, build system, Integration Team, Test Group, application representatives, users
- Transitions: tagged package; tagged release selected for certification; overall release tests; application certification; fix problems; certified release selected for deployment
- Artefacts: release candidates, tagged releases, certified releases
- Other labels: office hours; 24x7; Bugzilla anomaly reports
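The staged flow in the diagram can be sketched as a simple ordered pipeline. The stage names below come from the slide; the function `next_stage` and the rule that a failing release stays at its current stage until problems are fixed are assumptions made for illustration, not a description of the actual EDG/LCG tooling.

```python
# Stage names follow the slide; everything else is a hypothetical sketch.
STAGES = ["Unit Test", "Build", "Integration", "Certification", "Production"]


def next_stage(current, passed):
    """Return the stage a release is at after its tests at `current`.

    Assumption: a release that fails stays where it is (problems are fixed
    and it is retried); a release that passes advances by one stage.
    Production is the final stage.
    """
    index = STAGES.index(current)
    if not passed or index == len(STAGES) - 1:
        return current
    return STAGES[index + 1]
```

For example, a tagged release that passes its integration tests would move on to certification, while one that fails certification would remain there.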

Testing and Certification
- Building the certification test-bed
  - At CERN it is 4x10 nodes (25 available now) (local grid)
  - Will include Wisconsin as soon as possible
- Currently installing the EDG 2.0/LCG-1 precursor on the certification test-bed and beginning the certification work

Test-beds and services
- Agreement 2 weeks ago to merge EDG and LCG production services and to separate test-beds
  - The only way to share scarce resources (support)
- March to July:
  - No EDG production test-bed at CERN now (but access to Castor)
  - EDG core sites will run dev. test-beds and either EDG production or LCG pilot, unless they have resources for both
  - LCG pilot on other EDG sites and at non-EDG centres
- From July:
  - There will be a single production system, LCG-1, at least at CERN, hopefully at other EDG/LCG sites

Milestones - 3
- Packaging/configuration mechanism defined (March)
  - Group (EDG, LCG, VDT) has documented an agreed common approach
  - Now will proceed with a staged implementation
  - Basic for LCG-1 in July, and more developed later
- Delivery of middleware (March 1)
  - We have a working set (“LCG-0”) that is in use now
  - Expect delivery of middleware for July by end April (from EDG)
- Identify operations and call centres (February 1)
  - 2 candidates for operations centres; hopefully this will be clarified in the next 2 weeks
  - No clear candidate for a support centre, but we (LCG CERN group) will have a basic support service, already in place.

Other Progress

Middleware system support
- European grid support centre
  - Maarten Litmaath as 1/3 of technical Globus support people (SE, UK, LCG)
  - Will participate in Globus 2.4 release process
- LCG team
  - Currently 2 (3) people
  - Building relationships with EDG, Globus, VDT
  - Worked on RLS stress testing and debugging

Security
- Dave Kelsey will lead ongoing security activity
  - Policies
  - Security strategy and plan
  - This is needed urgently, as the basis for operational agreements at centres
- Security operational issues:
  - Led by Dane Skow (FNAL), group of site security contacts
  - Gathering issues, constraints, etc.
  - This group will handle daily security issues
- Proposing collaboration on VO management
  - FNAL, INFN, …

Collaborative Activities
- HICB - JTB
  - GLUE Schema and evolution
  - Validation and Test Suites
  - Distribution, Meta-Packaging, Configuration
  - Monitoring tools (proposed), aspects of ops centres
  - Proposed collaboration on VO tools (led by FNAL)
- GGF
  - Production Grid Management (operations)
  - User Services (call centres)
    - Tools, trouble ticket exchange standards, etc.
  - Site AAA (security)
  - Particle and Nuclear Physics Applications area
    - As a forum in GGF to present issues and get collaboration
- Other
  - HEPiX: fabric, operations, tools, procedures
  - Security: site security contacts
  - Storage Interfaces: SRM

Compiler Issues
- EDG 2.0 release plan has assumed gcc 2.95.2
  - This was agreed with the experiments
- CMS and Atlas now request EDG 2.0/LCG-1 with gcc 3.2.2
  - For LHCb and ALICE 2.95.2 is acceptable
  - Essential for integration with POOL
- Problem
  - This will delay delivery of EDG 2.0 to LCG by ~6 weeks (this was an estimate after 5 minutes' thought)
- Possibilities:
  - Continue as scheduled and deliver EDG 2.0 with 2.95.2, and foresee an update in September
    - This implies LCG-1 cannot use POOL before the upgrade
  - Switch to gcc 3.2.2 now: introduces a 6-week delay

Compiler Issues
- Observations:
  - POOL will not have been integrated in experiment software and tested before LCG-1 is deployed in July
    - It has a command-line interface
  - A delay of 6 weeks will mean nothing is deployed during the summer (vacations)
  - An upgrade of middleware is already foreseen for Sep/Oct
- (I propose) Continue with agreed plan: deploy EDG 2.0/LCG-1 as planned, and upgrade to gcc 3.2 and include POOL in September
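The schedule impact of the ~6 week delay can be checked with simple date arithmetic. The sketch below is hedged: the report only says middleware delivery is expected "end April", so 30 April 2003 is an assumed date, and the variable names are hypothetical. The 1 July date is the LCG-1 initial public service start from the ramp-up schedule.

```python
from datetime import date, timedelta

# Assumption: "end April" is taken as 30 April 2003 for this estimate.
planned_delivery = date(2003, 4, 30)
delay = timedelta(weeks=6)               # the ~6 week estimate from the slide
delayed_delivery = planned_delivery + delay

public_service_start = date(2003, 7, 1)  # LCG-1 initial public service start

# Days left between middleware delivery and the public service start.
margin_planned = (public_service_start - planned_delivery).days
margin_delayed = (public_service_start - delayed_delivery).days

print(delayed_delivery, margin_planned, margin_delayed)
```

Under these assumptions the delayed delivery falls on 11 June 2003, cutting the time available before the July start from 62 days to 20.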

Summary
- Progress on schedule with deployment
- Need to find effort on testing activities
- Need to get operations and call centre activities started