London Tier 2 Status Report
GridPP 11, Liverpool, 15 September 2004
Ben Waugh on behalf of Owen Maroney

LT2 Sites
Brunel University
Imperial College London (including the London e-Science Centre)
Queen Mary University of London
Royal Holloway University of London
University College London

LT2 Management
Internal LT2 MoU signed by all institutes
MoU with GridPP signed by David Colling as acting chair of the Management Board
Management Board being formed but has not yet met
Technical Board meets every three to four weeks

Contribution to LCG2
'Snapshot' taken on 26th August: number of WN CPUs in use by LCG

Site       WNs   Jobs Running   Jobs Waiting
IC         66    21
QMUL*
RHUL
UCL-HEP
UCL-CCC
Total

*QMUL have since turned on hyperthreading and now allow up to 576 jobs
Brunel joined LCG2 on 3rd September

Brunel
Test system (1 WN), PBS, LCG-2_2_0
Joined Testzone on 3rd September
–Completely LCFG installed
In the process of adding 60 WNs
–LCFG installation
–Private network
–Some problems with SCSI drives and network booting with LCFG
Have had problems with local firewall restrictions
–These now seem to be resolved: GLOBUS_TCP_PORT_RANGE is not the default range (see the sketch below)
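The port-range fix above comes down to making the firewall and the Globus environment agree on the same window of ports. A minimal sketch, assuming iptables and the usual environment-variable mechanism; the range and rule below are placeholders, not Brunel's actual configuration:

    # Hypothetical example: open a non-default port range for GridFTP/GRAM data
    # channels and tell the Globus tools to use the same range.
    export GLOBUS_TCP_PORT_RANGE="20000,25000"              # placeholder range
    iptables -A INPUT -p tcp --dport 20000:25000 -j ACCEPT  # must match the range above

Any mismatch between the exported range and the firewall rules produces exactly the kind of intermittent connection failures described above.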

Imperial College London
HEP: 66 CPUs, PBS, LCG-2_1_1
Joined LCG2 prior to 1st April
–Completely LCFG installed
–In core zone
–Early adopter of R-GMA
London e-Science Centre has a 900 CPU cluster
–Cluster runs a locally patched RH7.2
  Shared facility: no possibility of changing the operating system
  Could not install LCG2 on RH7.2
  Will run RH7.3 under User Mode Linux to install LCG2
–Batch system (Sun Grid Engine) is not currently supported by LCG
  LeSC have already provided a globus-jobmanager for SGE (see the sketch below)
  Work is in progress on updating this for the LCG jobmanager
  Information provider is being developed by Durham
  Interest in SGE from other sites
LeSC cluster will soon be SAMGrid enabled
–LCG2 to follow
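At its core, an SGE jobmanager has to translate the incoming Globus job request into a qsub submission. A rough illustration of that translation step, with a placeholder queue name and executable path; this is not the actual LeSC jobmanager code:

    # Hypothetical wrapper showing what the SGE jobmanager must produce:
    # an SGE batch script handed to qsub. Queue and paths are examples only.
    cat > grid_job.sh <<'EOF'
    #!/bin/bash
    #$ -q grid            # example queue reserved for grid jobs
    #$ -cwd               # run in the submission directory
    /path/to/staged/executable
    EOF
    qsub grid_job.sh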

Queen Mary
348 CPUs, Torque+Maui, LCG-2_1_0
Joined Testzone on 6th July
Private networked WNs
Existing Torque server
–"Manual" installation on WNs (local automated procedure)
–LCFG installed CE and SE
–Configure CE to be a client of the Torque server (see the sketch below)
OS is Fedora 2
–Only site in LCG2 not running RH7.3!
Recently turned on hyperthreading
–Offers 576 job slots
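Configuring the CE as a client of an existing Torque server is mostly a matter of pointing the PBS client tools at the right host. A minimal sketch, assuming a default Torque layout; the hostname is a placeholder and the spool path varies between Torque builds:

    # Hypothetical illustration: point the CE's PBS/Torque client tools at the
    # existing batch server (hostname and spool directory are placeholders).
    echo "torque.example.ac.uk" > /var/spool/pbs/server_name
    qstat -Q     # sanity check: the existing queues should now be visible from the CE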

Queen Mary: Fedora 2 Port
CE and SE are LCFG installed RH7.3
LCG-2_0_0 was installed on the Fedora 2 WNs (see the sketch below):
–tar up the /opt directory from an LCFG installed RH7.3 node
–untar it on the Fedora 2 WN
–Only needed to recompile the Globus toolkit
–Also needed the jar files in /usr/share/java
–Everything worked!
For LCG-2_1_0 this method failed!
–The upgraded edg-rm functions no longer worked
–Recompiling the Globus toolkit did not help
–LCG could not provide SRPMs for edg-rm
Current status:
–LCG-2_1_0 on SE and CE, with LCG-2_0_0 on the WNs
–Seems to work, but is clearly not ideal…
With the LCG-2_2_0 upgrade will test the lcg-* utilities
–Will replace the edg-rm functions
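The port described above is essentially a copy of the middleware tree plus a local rebuild of Globus. A rough sketch of the procedure, with illustrative paths; it assumes the middleware really does live entirely under /opt and /usr/share/java, as the slide states:

    # On an LCFG-installed RH7.3 node: bundle the middleware tree and the jar files.
    tar czf /tmp/lcg-wn.tar.gz -C / opt usr/share/java
    # Copy the tarball to the Fedora 2 worker node, then unpack it in place:
    tar xzf /tmp/lcg-wn.tar.gz -C /
    # Finally, rebuild the Globus toolkit locally so its binaries link against the
    # Fedora system libraries (this is the step that no longer sufficed for LCG-2_1_0).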

Royal Holloway
148 CPUs, PBS+Maui, LCG-2_1_1
Joined Testzone on 19th July
Private networked WNs
Existing PBS server
–Manual installation on WNs
–LCFG installed CE and SE
–Configure CE to be a client of the PBS server
Shared NFS /home directories
–Uses the pbs jobmanager, not the lcgpbs jobmanager
–Needed to configure WNs so that jobs run in a scratch area (see the sketch below): not enough space in /home for the whole farm
Some problems still under investigation
–Stability problems with Maui
–Large compressed files sometimes become corrupted when copied to the SE; looks like a hardware problem
Also: 80 CPU BaBar farm running BaBarGrid
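Keeping job I/O off the shared /home usually means giving each job a node-local working directory. A minimal sketch of a job wrapper that does this, assuming a local /scratch partition on each WN; the directory layout is a placeholder:

    # Hypothetical job wrapper: run the payload in node-local scratch rather than
    # the NFS-mounted /home (directory names are examples only).
    SCRATCH=/scratch/$PBS_JOBID
    mkdir -p "$SCRATCH" && cd "$SCRATCH"
    # ... job payload runs here, writing its output locally ...
    cd / && rm -rf "$SCRATCH"        # clean up when the job finishes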

University College London: UCL-HEP
20 CPUs, PBS, LCG-2_1_1
Joined Testzone on 18th June
Existing PBS server
–Manual installation of WNs
–LCFG installed CE and SE
–Configure CE to be a client of the PBS server
Shared /home directories
–Uses the pbs jobmanager, not the lcgpbs jobmanager; so far no problems with space on the shared /home
Hyperthreading allows up to 76 jobs, but grid queues are restricted to fewer than this (see the sketch below)
Stability problems with OpenPBS
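Capping a grid queue below the hyperthreaded slot count is a one-line change per queue in OpenPBS. An illustrative qmgr command; the queue name and limit are placeholders, not UCL's actual settings:

    # Hypothetical example: cap a grid queue below the 76 hyperthreaded job slots.
    qmgr -c "set queue grid max_running = 60"
    qstat -Q grid      # confirm the new limit is in effect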

University College London: UCL-CCC
88 CPUs, PBS, LCG-2_2_0
Joined Testzone on 24th June
–Had a power failure that took the farm offline from 4th to 25th August
Originally had a cluster of 192 CPUs running Sun Grid Engine under RH9
–UCL central computing services agreed to reinstall half of the farm with RH7.3 for LCG, using LCFG
Hyperthreading allows 176 jobs (44 dual-CPU WNs × 2 CPUs × 2 hyperthreaded slots = 176)

Contribution to GridPP: Promised vs. Delivered

Site        Promised (CPU / kSI2K / TB)    Delivered (CPU / kSI2K / TB)
Brunel
IC (HEP)
IC (LeSC)   916*
QMUL        444*                           *
RHUL
UCL-HEP
UCL-CCC     192*
Total

*CPU count includes shared resources where CPUs are not 100% dedicated to Grid/HEP; the kSI2K value takes this sharing into account

Site Experience
Storage Elements are all 'classic' GridFTP servers (see the example below)
–Cannot pool large multi-TB RAID arrays to deploy large disk spaces
Many farms are shared facilities
–Existing batch queues
  Manual installation of WNs needs to be automated for large farms!
  CE becomes a client of the batch server
–Private networked WNs
  Needed additional Replica Manager configuration
–Some OS constraints
  Lack of all SRPMs is still a problem
Most sites were taken by surprise by the lack of warning of new releases
–Problems scheduling workload
–Documentation has improved
–But communication could be improved further!
The default LCFG installed farms (IC-HEP, UCL-CCC) have been amongst the most stable and easily upgraded
–But this is not an option for the most significant Tier 2 resources
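A 'classic' SE is essentially a GridFTP server in front of a single filesystem, which is why separate RAID arrays cannot be pooled into one large space. For reference, a transfer to such an SE is a plain GridFTP copy; the hostname and paths below are placeholders:

    # Illustrative copy of a local file to a 'classic' GridFTP storage element.
    globus-url-copy file:///data/output.root \
        gsiftp://se.example.ac.uk/storage/dteam/output.root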

Summary
LT2 sites have managed to contribute a significant amount of resources to LCG
–Still more to come!
This has required a significant amount of (unfunded) effort from staff, both HEP and IT, at the institutes
–2.5 GridPP2 funded support posts to be appointed soon, which will help!
Any deviation from a "standard" installation comes at a price
–Installation, upgrades, maintenance
–But the large resources at Tier 2s tend to be shared facilities, which do not have the freedom to install a "standard" OS, whatever it might be
–LCG moving from RH7.3 to Scientific Linux will not necessarily help!