GridPP Structures and Status Report
Jeremy Coles (RAL)
Service Challenge Meeting, 17th May 2005

Contents
- GridPP components involved in Service Challenge preparations
- Sites involved in SC3
- SC3 deployment status
- Questions and issues
- Summary
- GridPP background

GridPP background
GridPP – A UK Computing Grid for Particle Physics
- 19 UK Universities, CCLRC (RAL & Daresbury) and CERN
- Funded by the Particle Physics and Astronomy Research Council (PPARC)
- GridPP1 (Sept 2001 – 2004): £17m, "From Web to Grid"
- GridPP2 (Sept 2004 – 2007): £16(+1)m, "From Prototype to Production"

GridPP2 Components
A. Management, Travel, Operations
B. Middleware, Security and Network Development
C. Grid Application Development: LHC and US experiments + Lattice QCD + Phenomenology
D. Tier-2 Deployment: 4 Regional Centres - M/S/N support and System Management
E. Tier-1/A Deployment: Hardware, System Management, Experiment Support
F. LHC Computing Grid Project (LCG Phase 2) [review]

GridPP2 ProjectMap (project map diagram)

GridPP sites (map)
Tier-2s: ScotGrid, NorthGrid, SouthGrid, London Tier-2

GridPP in Context (diagram, "not to scale"): GridPP alongside the UK Core e-Science Programme, Institutes, Tier-2 Centres, Tier-1/A, Middleware/Security/Networking, Experiments, Applications Development and Integration, the Grid Support Centre, and CERN (LCG, EGEE).

GridPP Management (organisation chart): Collaboration Board; Project Management Board; Project Leader and Project Manager (supported by the Production Manager and Dissemination Officer); Deployment Board; User Board; Tier-1 Board; Tier-2 Board; liaison with GGF, LCG, (EGEE) and UK e-Science; Project Map and Risk Register.

Reporting Lines (organisation chart): Project Manager; Application Coordinator (ATLAS, GANGA, LHCb, CMS, PhenoGrid, BaBar, CDF, D0, UKQCD, SAM, Portal); Tier-1 Manager (Tier-1 staff); Production Manager (4 EGEE-funded Tier-2 Coordinators); Tier-2 Board Chair (Tier-2 hardware support posts); Middleware Coordinator (Metadata, Data, Info.Mon., WLMS, Security, Network); CB, PMB, OC; LHC Dep.

Service Challenge coordination (diagram):
- Tier-1 Manager and Tier-1 staff: 1. LCG "expert", 2. dCache deployment, 3. hardware support
- Production Manager, the 4 EGEE-funded Tier-2 Coordinators and the Tier-2 hardware support posts: 1. hardware support post, 2. Tier-2 coordinator (LCG)
- Storage group and Network group: 1. network advice/support, 2. SRM deployment advice/support

Activities to help deployment
- RAL storage workshop: review of the dCache model, deployment and issues
- Biweekly network teleconferences
- Biweekly storage group teleconferences; GridPP storage group members are available to visit sites
- UK SC3 teleconferences, to cover site problems and updates
- SC3 "tutorial" last week for grounding in FTS and LFC and a review of DPM and dCache (a sketch of the SURL pairs such transfers use follows below)
- Agendas for most of these can be found here:
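For orientation on the FTS and SRM pieces covered in the tutorial: an FTS transfer job is essentially a list of source/destination SURL pairs, each SURL naming a file behind an SRM endpoint (dCache or DPM). The Python sketch below is purely illustrative; the hostnames, port 8443 and storage paths are assumptions, not any site's real configuration.

```python
# Minimal sketch of assembling source/destination SURL pairs for an FTS-style
# transfer list. All hostnames, the port and the paths are illustrative
# assumptions, not the configuration of any GridPP site.

def surl(host: str, port: int, path: str) -> str:
    """Build an SRM URL of the form srm://host:port/path."""
    return f"srm://{host}:{port}{path}"

def transfer_list(filenames, src_host, src_base, dst_host, dst_base, port=8443):
    """Pair each file name into (source SURL, destination SURL)."""
    return [
        (surl(src_host, port, f"{src_base}/{name}"),
         surl(dst_host, port, f"{dst_base}/{name}"))
        for name in filenames
    ]

if __name__ == "__main__":
    files = [f"sc3_test_{i:03d}.dat" for i in range(3)]
    pairs = transfer_list(
        files,
        src_host="srm.example-tier1.ac.uk",                 # hypothetical Tier-1 SRM
        src_base="/pnfs/example-tier1.ac.uk/data/sc3",
        dst_host="srm.example-tier2.ac.uk",                 # hypothetical Tier-2 SRM
        dst_base="/dpm/example-tier2.ac.uk/home/sc3",
    )
    for src, dst in pairs:
        print(src, "->", dst)
```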

Sites involved in SC3 (map)
- Tier-2s: ScotGrid, NorthGrid, SouthGrid, London Tier-2
- SC3 sites: Tier-1 (RAL), plus Tier-2 sites for LHCb (Edinburgh), ATLAS (Lancaster) and CMS (Imperial)

Networks status
Connectivity:
- RAL to CERN: 2 x 1 Gbit/s via UKLight and NetherLight
- Lancaster to RAL: 1 x 1 Gbit/s via UKLight (fallback: SuperJanet 4 production network)
- Imperial to RAL: SuperJanet 4 production network
- Edinburgh to RAL: SuperJanet 4 production network
(The sketch after this slide works through the nominal capacity of these links.)
Status:
1) Lancaster to RAL UKLight connection requested 6th May
2) UKLight access to Lancaster available now
3) Additional 2 x 1 Gbit/s interface card required at RAL
Issues:
- Awaiting a timescale from UKERNA on 1) and 3)
- IP-level configuration discussions (Lancaster-RAL) have just started
- Merging SC3 production traffic and UKLight traffic raises several new problems
- The underlying network provision WILL change between now and LHC start-up
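As a rough sanity check on the links listed above (a back-of-envelope sketch only; the 500 GB volume and 70% utilisation factor are arbitrary assumptions, not SC3 targets), nominal capacity converts to achievable transfer rates and times as follows:

```python
# Back-of-envelope arithmetic for the nominal link capacities listed above.
# The 500 GB example volume and the 70% utilisation factor are illustrative
# assumptions, not SC3 requirements.

def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a nominal rate in Gbit/s to MB/s (1 Gbit/s = 125 MB/s)."""
    return gbit_per_s * 1000 / 8

def hours_to_transfer(volume_gb: float, rate_gbit_s: float, utilisation: float = 0.7) -> float:
    """Hours needed to move volume_gb (GB) at the given nominal rate and utilisation."""
    rate_mb_s = gbit_to_mbyte_per_s(rate_gbit_s) * utilisation
    return (volume_gb * 1000) / rate_mb_s / 3600

links = {
    "RAL to CERN (2 x 1 Gbit/s UKLight)": 2.0,
    "Lancaster to RAL (1 x 1 Gbit/s UKLight)": 1.0,
}

for name, gbit in links.items():
    print(f"{name}: ~{gbit_to_mbyte_per_s(gbit):.0f} MB/s peak, "
          f"~{hours_to_transfer(500, gbit):.1f} h for 500 GB at 70% utilisation")
```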

Tier-1
- Currently merging the SC3 dCache with the production dCache
  - Not yet clear how to extend the production dCache to allow good bandwidth to UKLight, the farms and SJ4
  - dCache expected to be available for Tier-2 site tests around 3rd June
- Had problems with dual attachment of the SC2 dCache; a fix has been released but we have not yet tried it
  - Implications for site network setup
- CERN link undergoing a "health check" this week; during SC2 the performance of the link was not great
- Need to work on scheduling (assistance/attention) transfer tests with T2 sites; tests should complete by 17th June (a slot-allocation sketch follows below)
- Still unclear exactly which services need to be deployed: there is increasing concern that we will not be able to deploy in time!
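To make the scheduling point concrete, here is a small sketch that splits the window between the Tier-1 dCache becoming testable (around 3rd June) and the 17th June completion date into per-site slots. The even three-way split is an illustrative assumption, not an agreed schedule.

```python
# Sketch: divide the T1-T2 transfer-test window evenly across the three SC3
# Tier-2 sites. The even split is an illustrative assumption, not the plan.
from datetime import date, timedelta

WINDOW_START = date(2005, 6, 3)    # Tier-1 dCache expected to be testable
WINDOW_END = date(2005, 6, 17)     # tests should complete by this date
SITES = ["Edinburgh (LHCb)", "Lancaster (ATLAS)", "Imperial (CMS)"]

def schedule(sites, start, end):
    """Split [start, end) evenly, returning (site, slot_start, slot_end) tuples."""
    per_site = (end - start).days // len(sites)
    slots = []
    for i, site in enumerate(sites):
        slot_start = start + timedelta(days=i * per_site)
        slot_end = end if i == len(sites) - 1 else slot_start + timedelta(days=per_site)
        slots.append((site, slot_start, slot_end))
    return slots

for site, s, e in schedule(SITES, WINDOW_START, WINDOW_END):
    print(f"{site:<20} {s.isoformat()} -> {e.isoformat()}")
```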

Tier-1 high-level plan
- Friday 20th May: Network provisioned in final configuration
- Friday 3rd June: Network confirmed to be stable and able to deliver at full capacity; SRM ready for use
- Friday 17th June: Tier-1 SRM tested end-to-end with CERN; full data-rate capability demonstrated; load balancing and tuning completed to Tier-1 satisfaction
- Friday 1st July: Integration tests with CERN using FTS completed; certification complete; Tier-1 ready for SC3

Edinburgh (LHCb)
Hardware:
- GridPP frontend machines upgraded; limited CPU available (3 machines, 5 CPUs). Still waiting for more details about requirements from LHCb.
- 22 TB datastore deployed, but a RAID array problem currently leaves half the storage unavailable; IBM is investigating.
- Disk server: IBM xSeries 440 with eight 1.9 GHz Xeon processors and 32 GB RAM; Gigabit copper Ethernet to a Gigabit 3Com switch.
Software:
- dCache head and pool nodes rebuilt with Scientific Linux.
- dCache now installed on the admin node using apt-get; partial install on the pool node. On advice, will restart using the YAIM method.
- What software does LHCb need installed?

Imperial (CMS)
Hardware:
- 1.5 TB of CMS-dedicated storage available for SC3
- 1.5 TB of shared storage can be made partially available
- May purchase more if needed: what is required!?
- Farm on a 100 Mb/s connection; 1 Gb/s connection to all servers
  - The 400 Mb/s firewall may be a bottleneck
  - Two separate firewall boxes are available
  - Could upgrade to a 1 Gb/s firewall: what is needed?

Imperial (CMS)
Software:
- dCache SRM now installed on test nodes; 2 main problems, both related to installation scripts
- This week, starting installation on a dedicated production server
  - Issues: which ports need to be open through the firewall (see the port-check sketch below), use of pools as VO filespace quotas, and optimisation of the local network topology
- CMS software:
  - PhEDEx deployed, but not tested
  - PubDB in the process of deployment
  - Some reconfiguration of the farm was needed to install the CMS analysis software
  - What else is necessary? More details from CMS are needed!
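Since the open-port question keeps coming up, the sketch below checks TCP reachability of ports commonly associated with a dCache SRM setup (8443 for the SRM door, 2811 for the GridFTP control channel, 2288 for the dCache web interface). These port numbers and the hostname are assumptions to be verified against the actual site configuration, and a passive GridFTP data-port range would also need to be opened.

```python
# Minimal TCP reachability check for ports commonly used by a dCache SRM
# service. The ports and hostname are assumptions; check them against the
# real site configuration (a passive GridFTP data range is also needed).
import socket

CANDIDATE_PORTS = {
    8443: "SRM door (common dCache default)",
    2811: "GridFTP control channel",
    2288: "dCache web/admin interface",
}

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_host(host: str) -> None:
    """Print one line per candidate port with its open/closed state."""
    for port, role in CANDIDATE_PORTS.items():
        state = "open" if port_open(host, port) else "blocked/closed"
        print(f"{host}:{port:<5} {state:<15} {role}")

if __name__ == "__main__":
    # Hypothetical node name; replace with the dedicated production server.
    check_host("dcache.example-tier2.ac.uk")
```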

Imperial Network (diagram): Cisco and Extreme 5i core devices, HP 2828/2424/2626 switches, gateway (farm), storage and GFE nodes, with 100 Mb/s, 1 Gb/s and 400 Mb/s (firewall) links.

Lancaster
Hardware:
- Farm now at the current LCG release
- I/O servers with 2 x 6 TB RAID5 arrays; each currently has 100 Mb/s connectivity
- Request submitted to UKERNA for a dedicated lightpath to RAL; currently expect to use the production network
- Designing the system so that no changes are needed for the transition from the throughput phase to the service phase
  - Except possibly iperf and/or transfers of "fake data" as a backup to the connection and bandwidth tests in the throughput phase (see the sketch below)
Software:
- dCache deployed onto test machines; current plans have the production dCache available in early July!
- Currently evaluating what ATLAS software and services are needed overall, and their deployment
- Waiting for the FTS release
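As an illustration of the "fake data" fallback mentioned above (a sketch only; the file sizes, counts and paths are arbitrary choices, not the Lancaster plan), test files can be generated locally and a copy of them timed to estimate the achieved rate:

```python
# Generate throwaway "fake data" files for bandwidth tests and time a copy of
# them. Sizes, counts and paths are arbitrary illustrations, not the actual
# Lancaster/ATLAS plan; a real test would target a remote endpoint.
import os
import shutil
import time
from pathlib import Path

def make_fake_files(directory: Path, count: int = 2, size_mb: int = 256) -> list:
    """Create `count` files of `size_mb` MB filled with pseudo-random bytes."""
    directory.mkdir(parents=True, exist_ok=True)
    chunk = os.urandom(1024 * 1024)            # 1 MB block, reused to keep it fast
    paths = []
    for i in range(count):
        path = directory / f"fake_{i:03d}.dat"
        with open(path, "wb") as fh:
            for _ in range(size_mb):
                fh.write(chunk)
        paths.append(path)
    return paths

def timed_copy(paths, destination: Path) -> float:
    """Copy files to `destination` and return the achieved rate in MB/s."""
    destination.mkdir(parents=True, exist_ok=True)
    total_mb = sum(p.stat().st_size for p in paths) / (1024 * 1024)
    start = time.monotonic()
    for p in paths:
        shutil.copy(p, destination / p.name)
    return total_mb / (time.monotonic() - start)

if __name__ == "__main__":
    files = make_fake_files(Path("/tmp/sc3_fake_data"))
    print(f"local copy rate: {timed_copy(files, Path('/tmp/sc3_fake_copy')):.1f} MB/s")
```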

Lancs/ATLAS SC3 Plan
Task (start date – end date; resource):
- Optimisation of network (Mon 18/09/06 – Fri 13/10/06; BD, MD, BG, NP, RAL)
- Test of data transfer rates (CERN-UK) (Mon 16/10/06 – Fri 01/12/06; BD, MD, ATLAS, RAL)
- Provision of end-to-end conn. (T1-T2)
- Integrate Don Quixote tools into SC infrastructure at LAN (Mon 19/09/05 – Fri 30/09/05; BD)
- Provision of memory-to-memory conn. (RAL-LAN) (Tue 29/03/05 – Fri 13/05/05; UKERNA, BD, BG, NP, RAL)
- Provision and commission of LAN h/w (Tue 29/03/05 – Fri 10/06/05; BD, BG, NP)
- Installation of LAN dCache SRM (Mon 13/06/05 – Fri 01/07/05; MD, BD)
- Test basic data movement (RAL-LAN) (Mon 04/07/05 – Fri 29/07/05; BD, MD, ATLAS, RAL)
- Review of bottlenecks and required actions (Mon 01/08/05 – Fri 16/09/05; BD)
- [SC3 – Service Phase]
- Review of bottlenecks and required actions (Mon 21/11/05 – Fri 27/01/06; BD)
- Optimisation of network (Mon 30/01/06 – Fri 31/03/06; BD, MD, BG, NP)
- Test of data transfer rates (RAL-LAN) (Mon 03/04/06 – Fri 28/04/06; BD, MD)
- Review of bottlenecks and required actions (Mon 01/05/06 – Fri 26/05/06; BD)

What we know!
Task (start date; end date):
- Tier-1 network in final configuration (started; 20/5/05)
- Tier-1 network confirmed ready, SRM ready (started; 1/6/05)
- Tier-1: test dCache dual attachment (?)
- Production dCache installation at Edinburgh (started; 30/05/05)
- Production dCache installation at Lancaster (TBC)
- Production dCache installation at Imperial (started; 30/05/05)
- T1-T2 dCache testing (by 17/06/05)
- Tier-1 SRM tested end-to-end with CERN, load balanced and tuned (by 17/06/05)
- Install local LFC at T1 (by 15/06/05)
- Install local LFC at T2s (by 15/06/05)
- FTS server and client installed at T1 (ASAP?; ??)
- FTS client installed at Lancaster, Edinburgh and Imperial (ASAP?; ??)

What we know! (continued)
Task (start date; end date):
- CERN-T1-T2 tests (by 24/07/05)
- Experiment-required catalogues installed (??; WHAT???)
- Integration tests with CERN FTS (by 1/07/05)
- Integration tests T1-T2 completed with FTS
- Light-path to Lancaster provisioned (requested 6/05/05; ??)
- Additional 2 x 1 Gb/s interface card in place at RAL (6/05/05; TBC)
- Experiment software (agents and daemons), 3D (01/10/05; ?)

What we do not know!
- How much storage is required at each site?
- How many CPUs should sites provide?
- Which additional experiment-specific services need deploying for September? E.g. databases (COOL for ATLAS)
- What additional grid middleware is needed, and when will it need to be available (FTS, LFC)?
- What is the full set of prerequisites before SC3 can start (perhaps SC3 is now SC3-I and SC3-II)?
  - Monte Carlo generation and pre-distribution
  - Metadata catalogue preparation
- What defines success for the Tier-1, the Tier-2s, the experiments and LCG!

Summary
- UK participating sites have continued to make progress towards SC3
- A lack of clear requirements is still causing some problems in planning, and thus in deployment
- Engagement with the experiments varies between sites
- Friday's "tutorial" day was considered very useful, but we need more!
- We do not have enough information to plan an effective deployment (dates, components, testing schedule, etc.), and there is growing concern as to whether we will be able to meet SC3 needs if they are defined late
- The clash of the LCG release with the start of the Service Challenge has been noted as a potential problem