FIO Services and Projects Post-C5 February 22nd 2002 CERN.ch

CERN.ch 2 Headline Services
• Physics Services – P=4FTE; M=800K; I=1,300K
• Computing Hardware Supply – P=6.35FTE; M=15K; I=200K (funded by sales uplift)
• Computer Centre Operations – P=3.7FTE; M=100K; I=635K
• Printing – P=3.5FTE; M=20K; I=175K
• Remedy Support – P=1.65FTE; M=215K; I=0
• Mac Support – P=1.25FTE; M=5K; I=50K
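
For orientation, the figures above can be summed into a rough overall resource envelope; a minimal sketch of that arithmetic (figures transcribed from the slide, P in FTE, M and I in KCHF):

```python
# Rough totals for the headline services above (figures as quoted on the slide;
# P in FTE, M and I in KCHF). Purely illustrative arithmetic.
services = {
    "Physics Services":           {"P": 4.0,  "M": 800, "I": 1300},
    "Computing Hardware Supply":  {"P": 6.35, "M": 15,  "I": 200},
    "Computer Centre Operations": {"P": 3.7,  "M": 100, "I": 635},
    "Printing":                   {"P": 3.5,  "M": 20,  "I": 175},
    "Remedy Support":             {"P": 1.65, "M": 215, "I": 0},
    "Mac Support":                {"P": 1.25, "M": 5,   "I": 50},
}

totals = {k: sum(s[k] for s in services.values()) for k in ("P", "M", "I")}
print(f"Total: {totals['P']:.2f} FTE, M={totals['M']} KCHF, I={totals['I']} KCHF")
# -> Total: 20.45 FTE, M=1155 KCHF, I=2360 KCHF
```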

CERN.ch 3 Projects
• WP4
  – Contribution to WP4 of the EU DataGrid
  – P=0.5FTE
• LCG – Implementation of WP4 tools
  – Active progress towards day-to-day management of large farms.
  – P=LCG allocation (2FTE?)
• Computer Centre Supervision (PVSS)
  – Test PVSS for CC monitoring. Prototypes in Q1, Q2 and Q4.
  – P=2.7FTE, M=25K
• B513 Refurbishment
  – Adapt B513 for 2006 needs. Remodel vault in
  – P=0.3FTE, M=1,700K

CERN.ch 4 Macintosh Support
• Support for MacOS and applications, plus backup services.
• We only support MacOS 9, but this is out of date. There is no formal MacOS X support, but software is downloaded centrally for general efficiency.
• The staffing level for Macintosh support is declining; it now stands at 1.25FTE plus 50% of a service contract.
  – Plus 0.25FTE for CARA, which is used by some PC people, not just Mac users.
• The key work area in 2002 is the removal of AppleTalk access to printers.
  – Migrate users to the lpr client (already used by Frank and Ludwig).
  – This streamlines the general printing service and is another move towards an IP-only network. (LocalTalk was removed last year.)

CERN.ch 5 Printing Support
• Overall service responsibility lies with FIO, but there is clearly much valuable assistance from
  – PS for OS and software support for the central servers
  – IS for the Print Wizard
• The general aim is to have happy and appreciative users.
  – Install printers, maintain them, replace toner as necessary, …
  – This seems to be working: there was a spontaneous outburst of public appreciation during January's Desktop Forum.
  – Promote and support projector installation in order to reduce (expensive) colour printing.
• We are working (if slowly) to improve remote monitoring of printers to enable pre-emptive action (see the sketch below).
  – Or (say it softly) a "Managed Print Service"
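
As an illustration only of what such remote printer monitoring could look like (this is not the tool actually used), the sketch below polls the standard Printer-MIB supply objects with the net-snmp snmpget command; the printer names, community string and per-model supply index are assumptions.

```python
"""Poll toner levels on a list of printers via SNMP (Printer-MIB).

Illustrative sketch only: printer names and the 'public' community string are
placeholders, and the per-printer supply index (.1.1) may differ by model.
Requires the net-snmp 'snmpget' command-line tool.
"""
import subprocess

# Printer-MIB columns: prtMarkerSuppliesLevel (.9) and prtMarkerSuppliesMaxCapacity (.8)
LEVEL_OID = "1.3.6.1.2.1.43.11.1.1.9.1.1"
MAX_OID = "1.3.6.1.2.1.43.11.1.1.8.1.1"


def snmp_get(host, oid, community="public"):
    """Return the integer value of one OID, or None if the printer does not answer."""
    try:
        out = subprocess.run(
            ["snmpget", "-v2c", "-c", community, "-Oqv", host, oid],
            capture_output=True, text=True, timeout=5, check=True)
        return int(out.stdout.strip())
    except (subprocess.SubprocessError, ValueError):
        return None


def check_toner(printers, warn_fraction=0.15):
    """Flag printers whose toner level has dropped below warn_fraction of capacity."""
    for host in printers:
        level, maximum = snmp_get(host, LEVEL_OID), snmp_get(host, MAX_OID)
        if level is None or maximum is None or maximum <= 0 or level < 0:
            print(f"{host}: supply level unknown, check manually")
        elif level / maximum < warn_fraction:
            print(f"{host}: toner at {100 * level / maximum:.0f}%, schedule replacement")


if __name__ == "__main__":
    check_toner(["printer-513-r-001", "printer-31-2-005"])  # hypothetical printer names
```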

CERN.ch 6 Computing Hardware Supply
• Aim: supply standardised hardware (desktop PCs, portables, printers, Macs) to users rapidly and efficiently.
• Migration to CERN's BAAN package is almost complete. It increases efficiency through
  – use of a standard stock management application,
  – end-user purchases by Material Request rather than TID,
  – streamlined ordering procedures.
• Could we ever move to a "Managed Desktop" service rather than shifting boxes?
  – The idea is appreciated outside IT but needs capital.
• The service relies on the Desktop Support Contract…
• The service also handles CPU and disk servers.

CERN.ch 7 Remedy Support
• "Remedy" was introduced to meet the needs of the Desktop Support and Serco contracts for workflow management: problem and activity tracking.
• FIO supports two Remedy applications:
  – PRMS for general problem tracking (the "3 level" support model)
    » Used for the Desktop Contract (including the helpdesk) and within IT
  – ITCM, which tracks direct CERN → Contractor activity requests for the Serco and Network Management contracts.
• Do we need two different applications? Yes and no.
  – There are two distinct needs, but they could be merged.
    » However, this isn't a priority and effort is scarce.
    » And don't even ask about consolidated Remedy support across CERN!

CERN.ch 8 PRMS and ITCM Developments
• PRMS
  – The continuing focus over the past couple of years has been to consolidate the basic service and integrate the many small changes that have been made to meet ad hoc needs.
  – Outstanding requests for additional functionality include
    » An improved "SLA trigger mechanism", defining how, when and to whom to raise alarms if tickets are left untreated for too long (a sketch of the idea follows below).
    » A "service logbook" to track interventions on a single system
    » Various small items, including a Palm interface
• ITCM
  – No firm developments are planned, but many suggestions are floating around.
• Overall, we need to migrate to Remedy 5…
• … and available effort is limited.
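
To make the "SLA trigger mechanism" idea concrete, here is a minimal sketch of the kind of escalation rule under discussion. It does not use the real Remedy/PRMS data model or API; the ticket fields, priorities and thresholds are invented for illustration.

```python
"""Sketch of an SLA escalation check: raise alarms for tickets left untreated
too long. Hypothetical data model; a real implementation would query Remedy."""
from datetime import datetime, timedelta

# Per-priority maximum untreated age and escalation target (invented values).
SLA = {
    "critical": (timedelta(hours=2), "service-manager"),
    "normal":   (timedelta(days=1),  "group-leader"),
    "low":      (timedelta(days=5),  "weekly-report"),
}


def overdue_tickets(tickets, now=None):
    """Yield (ticket, notify_target) for every untreated ticket past its SLA deadline."""
    now = now or datetime.now()
    for ticket in tickets:
        if ticket["status"] not in ("new", "assigned"):
            continue  # only untreated tickets can escalate
        max_age, target = SLA[ticket["priority"]]
        if now - ticket["last_update"] > max_age:
            yield ticket, target


if __name__ == "__main__":
    example = [{"id": "PRMS-0001", "priority": "critical", "status": "new",
                "last_update": datetime.now() - timedelta(hours=3)}]
    for ticket, target in overdue_tickets(example):
        print(f"Escalate {ticket['id']} to {target}")
```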

CERN.ch 9 Computing Hardware Supply
• Aim: supply standardised hardware (desktop PCs, portables, printers, Macs) to users rapidly and efficiently.
• Migration to CERN's BAAN package is almost complete. It increases efficiency through
  – use of a standard stock management application,
  – end-user purchases by Material Request rather than TID,
  – streamlined ordering procedures.
• Could we ever move to a "Managed Desktop" service rather than moving boxes?
  – The idea is appreciated outside IT but needs capital.
• The service relies on the Desktop Support Contract.
• The service also handles CPU and disk servers…

CERN.ch 10 Physics Services
• Last year's reorganisation split the "PDP" services across
  – ADC: services to "push the envelope"
  – PS: Solaris and engineering services
  – FIO: everything else.

CERN.ch 11 Physics Services
• Last year's reorganisation split the "PDP" services across
  – ADC: services to "push the envelope"
  – PS: Solaris and engineering services
  – FIO: everything else.
• So, what is "everything else"?
  – lxplus: the main interactive service
  – lxbatch: ditto for batch
  – lxshare: a time-shared lxbatch extension
  – RISC remnants, mainly for the LEP experiments
  – Much general support infrastructure
  – The first-line interface for physics support

CERN.ch 12 Physics Service Concerns
• RISC Reduction

CERN.ch 13 Physics Service Concerns
• RISC Reduction

CERN.ch 14 Physics Service Concerns
• RISC Reduction

CERN.ch 15 Physics Service Concerns
• RISC Reduction
• Managing Large Linux Clusters
  – Fabric Management

CERN.ch 16 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
• Configuration Management
• Logistics
• State Management

CERN.ch 17 Fabric Management Concerns
• Software Installation: OS and Applications
  – We need rapid and rock-solid system and application installation tools (illustrated by the sketch below).
  – Development discussions are part of EDG/WP4, to which we contribute.
  – Full-scale testing and deployment will come as part of the LCG project.
• (Performance and Exception…) Monitoring
• Configuration Management
• Logistics
• State Management
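
Purely as an illustration of profile-driven installation (this is not the EDG/WP4 tooling), the sketch below generates a RedHat kickstart %packages fragment from a simple per-cluster profile; the cluster profiles and package names are hypothetical.

```python
"""Generate a kickstart %packages section from a simple cluster profile.

Illustrative only: the profiles and package lists are invented, and the real
EDG/WP4 installation tools work differently.
"""

PROFILES = {
    # Hypothetical cluster profiles: package groups plus extra RPMs.
    "lxbatch": {"groups": ["@ Base"], "packages": ["openafs", "lsf-client"]},
    "lxplus":  {"groups": ["@ Base", "@ Development Tools"],
                "packages": ["openafs", "emacs", "cernlib"]},
}


def kickstart_packages(cluster):
    """Return the %packages block for one cluster profile."""
    profile = PROFILES[cluster]
    lines = ["%packages"] + profile["groups"] + profile["packages"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(kickstart_packages("lxbatch"))
```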

CERN.ch 18 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
  – The Division is now committed to testing PVSS as a monitoring and control framework for the computer centre (the general exception-monitoring pattern is sketched below).
    » The overall architecture remains as decided within PEM and WP4.
  – The new "Computer Centre Supervision" project has three key milestones for 2002:
    » A "brainless" rework of PEM monitoring with PVSS
      · 900 systems are now being monitored. Post-C5 presentation in March/April.
    » An intelligent rework for Q2, then a wider system for Q4
• Configuration Management
• Logistics
• State Management
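
PVSS is a commercial SCADA package, so no PVSS code is shown here; the sketch below only illustrates the general exception-monitoring pattern described above (sample a metric per node, compare with a threshold, raise an alarm). The node names, metrics and thresholds are invented.

```python
"""Generic exception-monitoring pattern: sample metrics on each node and raise
an alarm when a threshold is crossed. Illustrative only; the real system uses
PVSS as the supervision framework."""
import random  # stands in for a real sensor/agent reading

# Invented metrics and alarm thresholds.
THRESHOLDS = {"load": 20.0, "disk_used_pct": 90.0, "temperature_c": 45.0}


def read_metric(node, metric):
    """Placeholder for a real agent call; here it just returns a random value."""
    return random.uniform(0, 110)


def scan(nodes):
    """Return a list of (node, metric, value) exceptions above threshold."""
    alarms = []
    for node in nodes:
        for metric, limit in THRESHOLDS.items():
            value = read_metric(node, metric)
            if value > limit:
                alarms.append((node, metric, value))
    return alarms


if __name__ == "__main__":
    nodes = [f"lxb{n:04d}" for n in range(1, 6)]  # hypothetical node names
    for node, metric, value in scan(nodes):
        print(f"ALARM {node}: {metric}={value:.1f} above {THRESHOLDS[metric]}")
```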

CERN.ch 19 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
• Configuration Management
  – How do systems know what they should install?
  – How does the monitoring system know what a system should be running?
  – An overall configuration database is required (see the sketch below).
• Logistics
• State Management
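
A minimal sketch of the configuration-database idea: a single declared profile per node that both the installation and the monitoring systems consult. The schema and contents are hypothetical; the real design is the EDG/WP4 configuration database work referred to above.

```python
"""One node profile consulted by both installation and monitoring.
Hypothetical schema; the real design is the EDG/WP4 configuration database."""

CONFIG_DB = {
    "lxb0001": {"cluster": "lxbatch", "os": "RedHat 7.2",
                "services": ["lsf", "afs"], "monitor": ["load", "disk_used_pct"]},
    "lxp0001": {"cluster": "lxplus", "os": "RedHat 7.2",
                "services": ["ssh", "afs"], "monitor": ["load", "logins"]},
}

# Invented mapping from declared services to the RPMs they require.
SERVICE_RPMS = {"lsf": ["lsf-client"], "afs": ["openafs"], "ssh": ["openssh-server"]}


def packages_for(node):
    """What should this node install? Derived from its declared services."""
    profile = CONFIG_DB[node]
    return sorted({rpm for svc in profile["services"] for rpm in SERVICE_RPMS.get(svc, [])})


def expected_metrics(node):
    """What should the monitoring system expect this node to report?"""
    return CONFIG_DB[node]["monitor"]


if __name__ == "__main__":
    print("lxb0001 installs:", packages_for("lxb0001"))
    print("lxb0001 monitors:", expected_metrics("lxb0001"))
```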

CERN.ch 20 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
• Configuration Management
• Logistics (see the sketch below)
  – How do we keep track of 20,000+ objects?
  – We can't manage 5,000 objects today.
    » Where are they all? (Feb 9th: some systems couldn't be found.)
    » Which are in production? New? Obsolete?
      · And which are temporarily out of service?
  – How do physical and logical arrangements relate?
    » Where is this service located?
    » What happens if this normabarre/PDU fails?
• State Management
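
To illustrate the bookkeeping meant by "logistics", the sketch below records both the physical placement (room, rack, normabarre/PDU) and the logical role and state of each box, so that questions like "what is affected if this normabarre fails?" become simple queries. All records and field names are invented.

```python
"""Tiny inventory model linking physical placement to logical role.
All records are invented; a real system would track 20,000+ objects."""
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    state: str       # e.g. "new", "production", "out-of-service", "obsolete"
    service: str     # logical role, e.g. "lxbatch"
    room: str
    rack: str
    normabarre: str  # power distribution bar feeding the rack


INVENTORY = [
    Box("lxb0001", "production", "lxbatch", "vault", "V-01", "NB-05"),
    Box("lxb0002", "out-of-service", "lxbatch", "vault", "V-01", "NB-05"),
    Box("lxp0001", "production", "lxplus", "main-room", "M-12", "NB-21"),
]


def affected_by_normabarre(normabarre):
    """Which production boxes lose power if this normabarre/PDU fails?"""
    return [b.name for b in INVENTORY
            if b.normabarre == normabarre and b.state == "production"]


def count_by_state():
    """How many boxes are new, in production, out of service, obsolete?"""
    counts = {}
    for b in INVENTORY:
        counts[b.state] = counts.get(b.state, 0) + 1
    return counts


if __name__ == "__main__":
    print("NB-05 failure affects:", affected_by_normabarre("NB-05"))
    print("Fleet by state:", count_by_state())
```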

CERN.ch 21 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
• Configuration Management
• Logistics
• State Management (see the sketch below)
  – What needs to be done to move this box
    » from reception to a final location to become part of a given service?
  – What procedures should be followed if a box fails (after automatic recovery actions, naturally)?
  – This is workflow management
    » that should integrate with overall workflow management.
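
The "state management" item is essentially a workflow; a minimal sketch of such a box-lifecycle state machine follows. The states and allowed transitions are illustrative only, not the actual procedures.

```python
"""Minimal lifecycle state machine for a box, from reception to production.
States and transitions are illustrative only."""

# Allowed transitions: current state -> states it may move to.
TRANSITIONS = {
    "reception":     {"burn-in"},
    "burn-in":       {"installation", "vendor-return"},
    "installation":  {"production"},
    "production":    {"failed", "retired"},
    "failed":        {"repair", "retired"},  # reached after automatic recovery fails
    "repair":        {"installation"},
    "vendor-return": set(),
    "retired":       set(),
}


def move(box_states, box, new_state):
    """Apply one workflow step, refusing transitions the procedures don't allow."""
    current = box_states.get(box, "reception")
    if new_state not in TRANSITIONS[current]:
        raise ValueError(f"{box}: illegal transition {current} -> {new_state}")
    box_states[box] = new_state
    # In a real system each step would also open/close the matching tracking ticket.


if __name__ == "__main__":
    states = {}
    for step in ("burn-in", "installation", "production", "failed", "repair"):
        move(states, "lxb0001", step)
    print("lxb0001 is now:", states["lxb0001"])
```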

CERN.ch 22 Fabric Management Concerns
• Software Installation: OS and Applications
• (Performance and Exception…) Monitoring
• Configuration Management
• Logistics
• State Management
• Work on these items is the FIO contribution to the Fabric Management part of the LHC Computing Grid project.
  – Detailed activities and priorities will be set by LCG.
    » They are providing the additional manpower!
    » A planning document is being prepared now, based on input from FIO and ADC.

CERN.ch 23 … And where do the clusters go?
• Estimated space and power requirements for LHC computing:
  – 2,500m², an increase of ~1,000m²
  – 2MW, a nominal increase of 800kW (1.2MW above the current load)
• Conversion of the tape vault to a machine-room area was agreed at the post-C5 in June.
  – It is the best option for space provision.
  – Initial cost estimate: 1,300-1,400 KCHF

CERN.ch 24 Vault Conversion
• We are converting the tape vault to a machine-room area of ~1,200m² with
  – a false floor with a finished height of 70cm,
  – 6 × "in-room" air-conditioning units
    » total cooling capacity: 500kW,
  – 5 × 130kW electrical cabinets
    » double power input
    » 5 or 6 × 20kW normabarres/PDUs per cabinet
      · 3-4 racks of 44 PCs per normabarre,
  – 2 × 130kW cabinets supplying a "critical equipment area"
    » critical equipment can be connected to each PDU
    » two zones, one for network equipment, one for other critical services,
  – smoke detection, but no fire extinction.
• A back-of-the-envelope check of these power figures follows below.
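
For orientation, the slide's own numbers imply a rough per-PC power budget. The check below takes mid-range values (5.5 normabarres per cabinet, 3.5 racks per normabarre) where the slide quotes a range; it is arithmetic only, not a design figure.

```python
"""Back-of-the-envelope check of the vault power figures quoted above.
Mid-range assumptions: 5.5 normabarres per 130 kW cabinet, 3.5 racks per
normabarre, 44 PCs per rack (ranges as given on the slide)."""

NORMABARRE_KW = 20
RACKS_PER_NORMABARRE = 3.5   # slide quotes 3-4
PCS_PER_RACK = 44

pcs_per_normabarre = RACKS_PER_NORMABARRE * PCS_PER_RACK       # 154 PCs
watts_per_pc = NORMABARRE_KW * 1000 / pcs_per_normabarre       # ~130 W
print(f"{pcs_per_normabarre:.0f} PCs per normabarre, ~{watts_per_pc:.0f} W each")

cabinets, normabarres_per_cabinet = 5, 5.5                     # slide quotes 5 or 6
total_pcs = cabinets * normabarres_per_cabinet * pcs_per_normabarre
print(f"Roughly {total_pcs:.0f} PCs in the general area")      # ~4,200 PCs
```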

CERN.ch 25 The Vault Not So Long Ago

CERN.ch 26 An Almost Empty Vault

CERN.ch 27 One air conditioning room…

CERN.ch 28 …is gone.

CERN.ch 29 The next…

CERN.ch 30 …is on the way out

CERN.ch 31 The Next Steps
• Create a new substation for B513
  – to power 2MW of computing equipment plus air-conditioning and ancillary loads,
  – included in the site-wide 18kV loop, giving more redundancy,
  – favoured location: underground, but with 5 transformers on top.
• Refurbish the main Computer Room once enough equipment has moved to the vault.

CERN.ch 32 View from B31 Today

CERN.ch 33 View from B31 with Substation

CERN.ch 34 Looking Towards B31

CERN.ch 35 Summary
• Six Services
  – Physics Services
  – Computing Hardware Supply
  – Computer Centre Operations
  – Remedy Support
  – Printing
  – Macintosh Support
• Service developments to
  – follow natural developments
    » Remedy 5, LSF 4.2, RedHat 7.2
  – streamline provision of existing services
    » to reduce "P" (c.f. BAAN for Hardware Supply)
    » to manage more (c.f. developments for Physics Services)
• Four Projects
  – Computer Centre Supervision
  – B513 Refurbishment
  – Fabric Management Development (EDG)
  – Fabric Management Implementation (LCG)