Computing Infrastructure – Arthur Kreymer – 2010-05-15


Computing Infrastructure – Arthur Kreymer – Slide 1: INFRASTRUCTURE
● Power status in FCC (UPS1)
● Bluearc disk purchase – still coming soon
● Planned downtimes – none!
● Minos Cluster moved to minos50-53
● Bluearc performance – cpn working well, over 1.9 million locks served
● Parrot – is it dead yet?
● Nucomp – integrated IF support
● GPCF – the new FNALU?

Computing Infrastructure – Arthur Kreymer – Slide 2: SUMMARY
● Young Minos issues:
  – Lent minos54 to NOvA – get it back June-ish
  – When can we remount /minos/data no-execute?
● Shut down minos01-24; now using minos50-53
  – Xrootd runs on minos-xrootd (minos27)
  – /local/scratch disks are having problems; read-only on minos51
● Bluearc disk expansion is formatted; planning file copies
● Clearing daikon07 STAGE files, up to 10 TB
● Shall the Farm run directly from our Bluearc base releases?
● Linked cp1 to cpn on March 23 (see the sketch below). cp1 delivered 400K locks. R.I.P.
● Removed SSH keys from CVS March 25; only George was affected.
● Parrot status – pushing up daisies?
● GPCF hardware arrives week of May
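The cp1-to-cpn change noted above amounts to a rename plus a symlink. A minimal shell sketch, assuming a shared scripts directory (the directory path is hypothetical; only the cp1 and cpn names come from the slides):

```bash
#!/bin/sh
# Hypothetical sketch of the cp1 -> cpn switchover.
# The scripts directory is an assumption; cp1 and cpn are the
# throttled-copy tools named in the slides.
cd /grid/fermiapp/minos/scripts || exit 1

mv cp1 cp1.retired   # keep the retired wrapper around, just in case
ln -s cpn cp1        # existing cp1 callers now go through cpn's lock pool
```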

Computing Infrastructure – Arthur Kreymer – Slide 3: FCC POWER
● Power outages in FCC due to breaker trips on Feb 9 and 17.
  – FCC has less power capacity than formerly believed, and this is not easily fixed
  – New Minos Cluster nodes are installed in GCC
  – We have shut down minos01-24
  – We will move minos24/26 to GCC, for fallback and tests
  – Minos servers can remain in FCC on UPS

Computing Infrastructure – Arthur Kreymer – Slide 4: GRID
● We continue to reach peaks of 2000 running jobs
  – Still want 5000, after fixing the Bluearc issues
● Still using a wrapper for condor_rm and condor_hold (sketched below)
  – To prevent SAZ authentication overload
  – The Condor 7.4 upgrade will fix this internally, soonish
● Changed cp1 to be a symlink to cpn around 23 March
  – cpn has taken out over 1.9 million locks so far
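The condor_rm/condor_hold wrapper itself is not shown in the slides. A minimal sketch of the throttling idea, with invented batch size and pacing (condor_rm is the real HTCondor command; everything else here is an assumption):

```bash
#!/bin/sh
# Hypothetical throttled wrapper around condor_rm: act on job ids in
# small batches with a pause in between, so each burst of SAZ
# authentications stays small. BATCH and PAUSE are invented values.
BATCH=20
PAUSE=5

# Usage: rm_throttled.sh <cluster.proc> [<cluster.proc> ...]
while [ $# -gt 0 ]; do
    n=0
    batch=""
    while [ $# -gt 0 ] && [ "$n" -lt "$BATCH" ]; do
        batch="$batch $1"
        shift
        n=$((n + 1))
    done
    condor_rm $batch        # one invocation (one authentication) per batch
    if [ $# -gt 0 ]; then
        sleep "$PAUSE"      # pace the next batch
    fi
done
```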

Computing Infrastructure – Arthur Kreymer – Slide 5: Bluearc
● Continuing performance tests
  – Demonstrated 550 Mbytes/sec sustained delivery
  – Modest slowdowns with up to 50 clients; a 20x slowdown with 200
  – Need to test writes (see the probe sketch below)
● See monitoring plots, such as –
● /minos/scratch is now /minos/app, for consistency with Fermigrid
  – A /minos/scratch symlink remains available for compatibility, for the present
● Installed 99 TB for IF (MNS/MNV/NOVA)
  – Disk is installed and formatted
  – Some files must be copied to the new disk before we get quota
  – Asking to merge data and data2, rather than add data3
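Before write tests are formalized, a quick probe is easy to run by hand. A minimal sketch using standard dd against the mount, assuming a writable scratch path (the target path here is an assumption):

```bash
#!/bin/sh
# Hypothetical write-throughput probe against the Bluearc-backed area.
# The target path is an assumption; point it at any writable directory.
TARGET=/minos/data/users/$USER/writetest.$$

# Stream 1 GiB of zeroes; conv=fsync forces data to disk before dd
# reports its rate, so cache effects do not inflate the number.
dd if=/dev/zero of="$TARGET" bs=1M count=1024 conv=fsync
rm -f "$TARGET"
```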

Computing Infrastructure – Arthur Kreymer – Slide 6: PARROT
● PARROT should no longer be needed at Fermilab
● All our code has been rsync'd to /grid/farmiapp/minos/... (see the sketch below)
● Production analysis is being done sans Parrot
● We are building all new releases in Bluearc, not in AFS
● We may start running Farm processing from Bluearc
  – This would eliminate separate Farm builds
● minos_jobsub defaults to sans-parrot (-pOFF)
  – Why are some users still specifying -p?
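The rsync step described above is a straightforward mirror. A minimal sketch, assuming an AFS source tree (the source path is hypothetical; the /grid/farmiapp/minos destination is named on the slide, with its trailing "..." left out of the command):

```bash
#!/bin/sh
# Hypothetical mirror of the AFS-hosted code tree into grid-visible
# space, so grid jobs no longer need Parrot to reach AFS.
# SRC is an assumption; DEST is the area named on the slide.
SRC=/afs/fnal.gov/files/code/e875/general/minos/
DEST=/grid/farmiapp/minos/

# -a preserves permissions and timestamps; --delete keeps the mirror
# exact by removing files that have disappeared from the source.
rsync -a --delete "$SRC" "$DEST"
```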

Computing Infrastructure – Arthur Kreymer – Slide 7: NUCOMP
● GPCF hardware is coming next week, by May 22-ish
● Andrew Norman has been hired by REX as an Associate Scientist
  – A major player in Nova DAQ support
● Getting signatures now on a requisition for the Soudan networking upgrade
  – Existing hardware is far beyond end-of-life
  – Targeting the July/Aug shutdown for installation
● Budget process
  – Much earlier this year, to allow feedback
  – Present FY11/12 guidance from CD is grim, under half of our needs
  – Drafting the Minos tactical plan, with data and software model
  – /afs/fnal.gov/files/home/room1/kreymer/minos/budget/2011/FY11TacticalPlanForMINOS.odt

Computing Infrastructure – Arthur Kreymer – Slide 8: GPCF
● Initial configuration for GPCF
  – 14 interactive and 14 batch hosts
  – 24 TB of SAN disk (probably not enough)
  – Interactive systems will be virtual machines, for flexibility
● Installing Oracle VM over the next 2 months
● Nova is the primary new customer
● Will set up 5 Nova interactive nodes immediately, pending VM tests
● We will ask for at least some virtual machines for testing
  – 32 and 64 bit, SLF 5, worker nodes, etc.

Computing Infrastructure – Arthur Kreymer – Slide 9: Minos Cluster migration
● New nodes are minos50 through minos54
  – Each roughly 8x the capacity of an old node
  – 5 nodes replaced 24, roughly a 2x capacity increase
● Lent minos54 to NOvA short term?
  – Will get it back by early June, when the Nova GPCF nodes are up
● Deployment issues:
  – /local/scratch disks are unstable
  – We need new WD disk firmware, but cannot yet install it
  – Will move minos24/26 to GCC, for fallback and tests
  – Have copied all old /local/scratch areas to /local/scratch5*/... (see the sketch below)
  – scratch01 was copied separately to /minos/data/maint/scratch01
  – All releases are built on 64-bit kernel hosts
● Thanks to Luke Corwin, Robert Hatcher, et al.
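The scratch migration above is also an rsync job, run once per retired node. A minimal sketch, assuming passwordless ssh between nodes; the hostnames and the /local/scratch5* layout are assumptions based on the names on the slide:

```bash
#!/bin/sh
# Hypothetical migration of one old node's /local/scratch area onto a
# new cluster node. OLD_NODE and DEST are assumptions; the slide only
# says the areas landed under /local/scratch5*/...
OLD_NODE=minos01
DEST=/local/scratch50/${OLD_NODE}

mkdir -p "$DEST"
rsync -a "${OLD_NODE}:/local/scratch/" "$DEST/"
```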