Northgrid Status Alessandra Forti Gridpp20 Dublin 12 March 2008.

Layout
– General status
– Manpower
– Other VOs
– Atlas shifts
– Sites news
– Conclusions

General Status (1)

Site | Middleware | OS | SRM 2.2 | Space Tokens | SRM brand | CPU (kSI2K) | Storage (TB) | Used Storage (TB) | Average availability
Lancaster | gLite 3.1 | SL4 | — | yes | dCache -> DPM | — | — | — | 82%
Liverpool | gLite 3.1 | SL4 | installing | working on it | dCache | — | — | — | 80%
Manchester | gLite 3.1 | upgrading | installing | working on it | dCache | — | — | — | 83%
Sheffield | gLite 3.1 | upgrading | — | yes | DPM | — | — | — | 90%

General Status (2)

General Status (3)

Manpower
Lancaster:
– Brian Davies
– Matt Doidge, Peter Love
Liverpool:
– Pawel Trepka
– Rob Fay, John Bland
Manchester:
– Colin Morey
– Owen McShane, Stuart Wild, Sergey Dolgodobrov
Sheffield:
– Dominic Wilson
– Elena Korolkova, Matt Robinson

Other VOs
A Northgrid VO has been created for VO-less users and is being installed.
– Some users have already subscribed to it
– Users currently in the gridpp VO will be moved to northgrid
Other VOs are running on our systems:
– ~24 enabled across all sites
– hone, dzero and biomed lead the CPU usage among the other VOs
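
Enabling a VO on a gLite site of this era was mostly a matter of adding a per-VO block of YAIM variables to site-info.def and rerunning YAIM. A sketch of what the northgrid block might have looked like — the VOMS host, port and DN below are placeholders, not the real northgrid VOMS settings:

```
# site-info.def fragment (illustrative values only)
VOS="atlas dzero hone biomed northgrid"

# Per-VO YAIM variables; host, port and DN are hypothetical
VO_NORTHGRID_SW_DIR=$VO_SW_DIR/northgrid
VO_NORTHGRID_DEFAULT_SE=$SE_HOST
VO_NORTHGRID_VOMS_SERVERS="vomss://voms.example.ac.uk:8443/voms/northgrid?/northgrid"
VO_NORTHGRID_VOMSES="'northgrid voms.example.ac.uk 15000 /C=XX/O=Example/CN=voms.example.ac.uk northgrid'"
```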

Atlas Shifts
Northgrid is among the biggest suppliers of shifters: 5 people from all sites are already involved.
– Carl Gwilliams: expert shifter
– Peter Love and Alessandra Forti: senior shifters
– Mark Hodgkinson, Paul Hogson: trainees
The benefits are evident: site managers get an inside perspective on ATLAS problems, and ATLAS benefits from the feedback of sys admin shifters.

Lancaster News
The UKLight link to RAL had problems, which affected ATLAS uploads into RAL (cause: a bad 10G card in core Ciena kit in Reading).
Power cut toasted the dCache system disk
– Forced a fresh install and an upgrade to SL4
– dCache install not smooth
Migrating from dCache to DPM
– DPM installation trivial: up and running with no problems
– ATLAS production now runs on DPM
– Space tokens in place
– Data migration underway
This weekend, FTS problems from RAL (diagnosis ongoing)
– Active transfers still not back to normal
New data centre: hoarding going up on site
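
Putting ATLAS space tokens in place on DPM came down to reserving a static space per token with the DPM admin tools. A sketch of the kind of command involved — sizes, lifetime, group and token names are illustrative, not Lancaster's actual values:

```
# Reserve a static space and tag it with a token description
# (all values below are illustrative)
dpm-reservespace --gspace 20T --lifetime Inf \
                 --group atlas/Role=production --token_desc ATLASPRODDISK

# Check the reservation is visible
dpm-listspaces
```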

Liverpool News
Cluster upgraded to SL4.
Working on the dCache upgrade and enabling space tokens
– dCache installation not smooth
Installed a new, more powerful CE and SE.
Upgraded the rack software servers to 250GB RAID1 to cope with the >100GB size of the ATLAS code.
Still testing Puppet as the preferred fabric management solution.

Manchester News
The Manchester setup has been completely reorganised
– cfengine configuration rewritten according to tasks rather than host type
– All the quick-and-dirty extra steps have been cleaned up and are now handled by cfengine
– Test and trash machines have been reorganised; installation doesn't require any special handling, in cfengine or outside it
– All the certificates were renewed in one go thanks to the new bulk request/renewal script
– Still in the process of upgrading the first cluster, as dCache proved more complicated than it should be. The WN+pools and CEs are ready to go, though. The plan is to go ahead and deal with the dCache head node more slowly
Tickets from GGUS are a sorer point than ever
– GGUS opens a ticket in RT at each reply…
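
Organising cfengine input by task rather than host type typically means one input file per service, with each action guarded by a class for the machines that run that service. A minimal cfengine-2 style sketch — file paths, class names and commands are hypothetical, not Manchester's actual configuration:

```
# cf.dcache_pool -- task-oriented input file (hypothetical names throughout)
control:
   any::
      actionsequence = ( copy shellcommands )

copy:
   dcache_pool::
      /config/master/dcache/pool_setup
             dest=/opt/d-cache/config/pool_setup
             mode=644 owner=root

shellcommands:
   dcache_pool::
      "/opt/d-cache/bin/dcache-pool restart" ifelapsed=240
```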

Sheffield News
Benefited from the staff change
– Elena is located in the physics department
Upgraded to SRM 2.2.
Enabled space tokens for ATLAS.
Added 2.5 TB of storage.
Problems with APEL accounting due to APEL using the wrong batch system.
Problems with biomed jobs hanging for >70 because there is no timeout when a remote server doesn't reply.
– Still handled with manual monitoring
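
The hanging-job problem is the classic failure mode of a blocking network read with no timeout: if the remote server never replies, the read waits forever. A minimal sketch of the client-side fix, with a local never-replying server standing in for the remote end — this is illustrative only, not the actual biomed client code:

```python
import socket

def fetch_with_timeout(sock, nbytes=1024, timeout=2.0):
    """Read from a socket, but give up after `timeout` seconds."""
    sock.settimeout(timeout)          # without this, recv() can block forever
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None                   # caller can retry or fail the job cleanly

# Demonstration: a local server that accepts the connection but never replies
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.socket()
client.connect(server.getsockname())
conn, _ = server.accept()             # server stays silent after accepting

print(fetch_with_timeout(client, timeout=0.5))  # -> None after 0.5 s, no hang

client.close(); conn.close(); server.close()
```

With a timeout in place the hang becomes a catchable error, so the job can fail fast instead of sitting in manual monitoring for days.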

Conclusions
Northgrid is in a healthy state.
Upgrades to SL4 and SRM 2.2, and the enabling of space tokens, are ongoing.
– We should make the deadlines
The main problems at the moment:
– sys admin turnover
– dCache installation/upgrade and setup is not smooth
Well integrated with ATLAS, and good exploitation by other user communities.