Northgrid Status Alessandra Forti Gridpp24 RHUL 15 April 2010

Outline
– APEL pies
– Lancaster status
– Liverpool status
– Manchester status
– Sheffield
– Conclusions

APEL pie (1)

APEL pie (2)

APEL pie (3)

Lancaster
– All WNs moved to the tarball installation
– Moving all nodes to SL5 solved the sub-cluster problems
– Deployed and decommissioned a test SCAS
  – Will install glexec when users demand it
– In the middle of deploying a CREAM CE (see the configuration sketch below)
– Finished tendering for the HEC facility
  – Will give us access to 2500 cores
  – Extra 280 TB of storage
  – The shared facility has Roger Jones as director, so we have a strong voice for GridPP interests
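For reference, on a gLite 3.2 site a CREAM CE is usually configured with YAIM roughly as below. This is a minimal sketch under assumed standard paths, not Lancaster's actual procedure.

    # Sketch: configure a CREAM CE with YAIM on a gLite 3.2 installation.
    # Paths and node types assume a standard layout; adjust to the site.
    /opt/glite/yaim/bin/yaim -c \
        -s /opt/glite/yaim/etc/site-info.def \
        -n creamCE -n TORQUE_utils   # TORQUE_utils only if the batch client tools live on the CE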

Lancaster
– Older storage nodes are being re-tasked
– Tarball WNs are working well, but YAIM is a suboptimal way to configure them
– Maui continues to behave oddly for us (see the diagnostic sketch below)
  – Jobs block other jobs
  – It gets confused by multiple queues
  – Jobs don't use their reservations when they are blocked
– Problems when the same NFS server was used for both experiment software and the tarballs
  – They have now been split
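A hedged sketch of the kind of Maui diagnostics and settings used when chasing blocked jobs and unused reservations; the job id and the parameter values are illustrative, not Lancaster's configuration.

    # Sketch: inspect why jobs are blocked and how reservations are being used.
    checkjob -v 12345      # detailed scheduler view of one job (example id)
    diagnose -p            # show how job priorities are being computed
    showres                # list the reservations Maui currently holds

    # maui.cfg: reserve slots for more than one top-priority job, so a single
    # blocked job cannot hold everything else up (values are examples only).
    # RESERVATIONPOLICY  CURRENTHIGHEST
    # RESERVATIONDEPTH   4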

Liverpool
What we did (that we were supposed to do)
– Major hardware procurement
  – 48 TB storage unit with a 4 Gbit bonded link (bonding sketch below)
  – 7 x 4 x 8 units = 224 cores, 3 GB memory, 2 x 1 TB disks
– Scrapped some 32-bit nodes
– Test CREAM CE running
Other things we did
– General guide to capacity publishing
– Horizontal job allocation
– Improved use of VMs
– Grid use of slack local HEP nodes
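A 4 Gbit bonded link of this kind is normally built by aggregating four GigE interfaces with the Linux bonding driver. A minimal sketch for an SL4/SL5-style machine; the interface names, IP address and 802.3ad mode are assumptions, not Liverpool's actual setup.

    # Sketch: bond eth0-eth3 into one ~4 Gbit logical link (SL-era config files).

    # /etc/modprobe.conf
    alias bond0 bonding
    options bond0 mode=802.3ad miimon=100

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.1.10        # illustrative address
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none

    # /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for eth1-eth3)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none

The switch ports at the other end also have to be configured as a matching link aggregation group for 802.3ad to work.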

Liverpool
Things in progress
– Put the CREAM CE in the GOCDB (ready)
– Scrap all 32-bit nodes (gradually)
– Production runs of the central computing cluster (other departments involved)
Problems
– Obsolete equipment
– WMS/ICE fault at RAL
What's next
– Install and deploy the newly procured storage and CPU hardware
– Achieve production runs of the central computing cluster

Manchester
Since last time:
– Upgraded the WNs to SL5
– Eliminated all dCache setup from the nodes
– RAID0 on the internal disks (sketch below)
– Increased scratch area
– Unified the two DPM instances: 106 TB in total, 84 TB dedicated to ATLAS
– Upgraded to a newer version
– Changed the network configuration of the data servers
– Installed a squid cache
– Installed a CREAM CE (still in test phase)
– Last HammerCloud test in March: 99% efficiency
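A minimal sketch of striping two internal disk partitions into a RAID0 scratch area with mdadm; device names, filesystem and mount point are assumptions, not Manchester's actual layout.

    # Sketch: stripe two internal partitions into a RAID0 scratch area.
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3
    mkfs.ext3 /dev/md0                    # ext3 was typical on SL5-era nodes
    mkdir -p /scratch
    mount /dev/md0 /scratch
    echo '/dev/md0  /scratch  ext3  defaults  0 0' >> /etc/fstab

RAID0 trades redundancy for capacity and throughput, which is acceptable for job scratch space that holds no unique data.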

Manchester
– Major UK site in ATLAS production: 2nd or 3rd after RAL and Glasgow
– Last HammerCloud test in March had 99% efficiency
– 80 TB still almost empty
  – Not many jobs
  – But from the stats of the past few days real users also seem fine (96%)

Manchester Tender
– European tender submitted 15/09/2009
– Vendors' replies should be in by 16/04/2010 (in two days)
– Additional GridPP3 money can be added
  – Included a clause allowing an increased budget
– Minimum requirements: 4400 HEPSPEC / 240 TB
  – Can be exceeded
  – Buying only nodes
– Talking to the University about green funding to replace what we can't replace
  – Not easy

Sheffield Storage Upgrade
– Storage moved to Physics: 24/7 access
– All nodes running SL5 and DPM (pool setup sketched below)
– 4 x 25 TB disk pools: 2 TB disks, RAID5, 4 cores
– Memory will be upgraded to 8 GB on all nodes
– 95% of the space reserved for ATLAS
– XFS crashed; problem solved with an additional kernel module
– Software server: 1 TB (RAID1)
– Squid server
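As an illustration of this sort of DPM layout, pools, filesystems and the ATLAS space reservation are set up with the DPM admin tools roughly as follows. Pool and host names, sizes and the space-token description are assumptions, not Sheffield's actual values.

    # Sketch: create a pool, attach a disk server filesystem, reserve ATLAS space.
    dpm-addpool --poolname atlaspool --def_filesize 200M
    dpm-addfs   --poolname atlaspool --server disk01.example.ac.uk --fs /storage1
    dpm-reservespace --gspace 20T --lifetime Inf \
                     --group atlas --token_desc ATLASDATADISK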

Sheffield Worker Nodes
– 200 old 2.4 GHz cores, 2 GB memory, SL5
– 72 TB of local disk per 2 cores
– lcg-CE and MON box on SL4
– An additional 32 amp ring has been added
– Fibre link between CICS and Physics
Availability
– 97-98% since January 2008
– 94.5% efficiency in ATLAS

Sheffield Plans
Additional storage
– 20 TB more, bringing the total for ATLAS to 120 TB
Cluster integration
– Local HEP and UKI-NORTHGRID-SHEF-HEP will share joint WNs
– 128 CPUs + 72 new nodes ???
– Torque server from the local cluster and lcg-CE from the grid cluster (see the sketch after this list)
– Needs 2 days of downtime; waiting for ATLAS approval
– CREAM CE installed; waiting for the cluster integration to complete
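For a split like this (Torque server on the local cluster, lcg-CE on the grid cluster), the CE host has to be authorised to query and submit to the remote Torque server. A hedged sketch of the server-side part, with a made-up CE host name; the real integration also involves shared accounts and the scheduler configuration.

    # Sketch: authorise the grid CE to query and submit to the local Torque server.
    qmgr -c "set server acl_host_enable = True"
    qmgr -c "set server acl_hosts += lcgce.example.ac.uk"
    qmgr -c "set server submit_hosts += lcgce.example.ac.uk"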