Slide 1: Oxford University Particle Physics Unix Overview
Pete Gronbech, Senior Systems Manager and GridPP Project Manager
17th October 2013, Graduate Lectures

Slide 2:
- Strategy
- Local Cluster Overview
- Connecting to it
- Grid Cluster
- Computer Rooms
- How to get help

Slide 3: Particle Physics Strategy: The Server / Desktop Divide
[Diagram: desktops (Windows 7 PCs, Linux and Ubuntu desktops) on one side; servers on the other (general purpose Unix server, group DAQ systems, Linux worker nodes, web server, Linux file servers, virtual machine host, NIS server, torque server).]
Approximately 200 desktop PCs, with Exceed, PuTTY or ssh/X windows used to access the PP Linux systems.

Slide 4: Particle Physics Linux
- Unix Team (Room 661):
  - Pete Gronbech: Senior Systems Manager and GridPP Project Manager
  - Ewan MacMahon: Grid Systems Administrator
  - Kashif Mohammad: Grid and Local Support
  - Sean Brisbane: Local Server and User Support
- General purpose interactive Linux based systems for code development, short tests and access to Linux based office applications. These are accessed remotely.
- Batch queues are provided for longer and more intensive jobs. They are provisioned to meet peak demand and give a fast turnaround for final analysis.
- Systems run Scientific Linux, a free Red Hat Enterprise based distribution.
- The Grid and CERN are just migrating to SL6. The local cluster is following and currently has one interactive node, with a growing set of worker nodes, available from "pplxint8".
- Most cluster systems are still running SL5. These can be accessed from pplxint5 and pplxint6 (see the login sketch below).
- We will be able to offer you the most help running your code on the newer SL6, but some experimental software frameworks still require SL5.
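A minimal login sketch, assuming the interactive nodes are reachable as pplxint8.physics.ox.ac.uk and similar (the fully qualified hostnames are an assumption; use whatever names the Unix team give you):

    # Log in to the SL6 interactive node (hostname assumed; adjust to the real one).
    ssh yourusername@pplxint8.physics.ox.ac.uk

    # Or log in to an SL5 interactive node if your framework still needs SL5.
    ssh yourusername@pplxint5.physics.ox.ac.uk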

Slide 5: Current Clusters
- Particle Physics local batch cluster
- Oxford's Tier 2 Grid cluster

Slide 6: PP Linux Batch Farm (Scientific Linux 5)
[Diagram: interactive login nodes pplxint5 and pplxint6; worker nodes pplxwn9, pplxwn10 and others with 8 * Intel 5420 cores each; pplxwn25-28, pplxwn31, pplxwn32, pplxwn41 and pplxwn42 with 16 cores each (Intel 5650 and AMD Opteron 6128).]
Users log in to the interactive nodes pplxint5 and pplxint6. The home directories and all the data disks (the /home area and /data/group areas) are shared across the cluster and visible on the interactive machines and all the batch system worker nodes. Approximately 300 cores in total, each with 4GB of RAM.
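Batch work on this farm goes through the torque server shown on the strategy slide. Below is a minimal submission sketch, assuming default queue settings; the resource requests and the analysis executable are illustrative, not site policy:

    #!/bin/bash
    # myjob.sh: a minimal Torque batch script, executed on one of the worker nodes.
    #PBS -l nodes=1:ppn=1          # request one core on one node
    #PBS -l walltime=01:00:00      # one hour wall-clock limit
    cd "$PBS_O_WORKDIR"            # start in the directory the job was submitted from
    ./my_analysis                  # hypothetical analysis executable

    # Submit from pplxint5/6 and check progress:
    qsub myjob.sh
    qstat -u "$USER"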

Slide 7: PP Linux Batch Farm (Scientific Linux 6)
[Diagram: new SL6 interactive login node pplxint8; worker nodes pplxwn49, pplxwn50 and others, each with 16 * Intel 2650 cores.]
Migration to SL6 is ongoing. A new SL6 interactive node, pplxint8, is available; use this by preference. Worker nodes will be migrated from the SL5 cluster to SL6 over the next month. Currently there are four servers, each with 16 cores and 4GB of RAM per core (i.e. 64 job slots), but more will arrive as required.

Slide 8: PP Linux Batch Farm Data Storage
[Diagram: NFS servers (pplxfsn) of 9TB, 19TB, 30TB and 40TB providing the home and data areas.]
- NFS is used to export data to the smaller experimental groups, where the partition size is less than the total size of a server.
- The data areas are too big to be backed up. The servers have dual redundant PSUs, RAID 6 and run on uninterruptible power supplies. This safeguards against hardware failures, but does not help if you delete files.
- The home areas are backed up nightly by two different systems: the OUCS HFS service and a local backup system. If you delete a file, tell us as soon as you can when you deleted it and its full name. The latest nightly backup of any lost or deleted files from your home directory is available at the read-only location "/data/homebackup/{username}" (see the recovery sketch below).
- The home areas are quota'd, but if you require more space, ask us.
- Store your thesis on /home, NOT /data.
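A minimal recovery sketch, assuming the read-only backup area mirrors your home directory layout under /data/homebackup/{username}; the sub-directory and file names below are illustrative:

    # Find last night's copy of the lost file in the read-only backup area.
    ls /data/homebackup/$USER/thesis/

    # Copy it back into your home directory.
    cp /data/homebackup/$USER/thesis/chapter1.tex ~/thesis/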

Slide 9: Particle Physics Computing: Lustre
[Diagram: Lustre MDS plus Lustre OSS01-OSS04 object storage servers (18TB and 44TB), serving SL5 and SL6 nodes.]
The Lustre file system is used to group multiple file servers together to provide extremely large continuous file spaces. This is used for the ATLAS and LHCb groups.

    df -h /data/atlas
    Filesystem     Size  Used  Avail  Use%  Mounted on
    /lustre/atlas  244T  215T  18T    93%   /data/atlas

    df -h /data/lhcb
    Filesystem     Size  Used  Avail  Use%  Mounted on
    /lustre/lhcb   95T   82T   8.5T   91%   /data/lhcb

Slide 10: [Image-only slide; no text content.]

Slide 11: Strong Passwords etc.
- Use a strong password, not one open to dictionary attack!
  - fred123: no good
  - Uaspnotda!09: much better
- Better still, use ssh with a passphrased key stored on your desktop (see the sketch below).
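For Linux or Mac desktops, a minimal sketch using the standard OpenSSH tools (the Windows equivalent with PuTTYgen is on the following slides; the hostname is an assumption):

    # Create a key pair and protect the private key with a passphrase when prompted.
    ssh-keygen -t rsa -b 4096

    # Install the public key on the interactive node so it accepts key logins.
    ssh-copy-id yourusername@pplxint8.physics.ox.ac.uk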

Slide 12: Connecting with PuTTY
Question: how many of you are using Windows, and how many Linux, on the desktop?
Demo:
1. Plain ssh terminal connection
2. With key and Pageant
3. ssh with X windows tunnelled to passive Exceed
4. ssh, X windows tunnel, passive Exceed, KDE session
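For those on Linux desktops, the equivalent of the tunnelled X windows demo is a single ssh flag (hostname again an assumption):

    # Forward X11 so graphical applications started remotely display on your desktop.
    ssh -X yourusername@pplxint8.physics.ox.ac.uk

    # Any X application run on the interactive node now appears locally, e.g.:
    xterm &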

Slide 13: [Image-only slide; no text content.]

Slide 14: PuTTYgen: creating an ssh key on Windows
- Enter a secure passphrase, then save the public and private parts of the key to a subdirectory of your H: drive.
- Paste the public key into ~/.ssh/authorized_keys on pplxint, as sketched below.
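A minimal sketch of the server-side step, assuming the key text has been copied from PuTTYgen's "public key for pasting" box; the key string below is a placeholder:

    # On pplxint, append the public key and keep permissions strict enough for sshd.
    mkdir -p ~/.ssh && chmod 700 ~/.ssh
    echo "ssh-rsa AAAA...yourkey... you@desktop" >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys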

Slide 15: Pageant
- Run Pageant once after login to load your Windows ssh key.

Slide 16: SouthGrid Member Institutions
- Oxford
- RAL PPD
- Cambridge
- Birmingham
- Bristol
- Sussex
- JET at Culham

Slide 17: Current capacity
- Compute servers
  - Twin and twin-squared nodes
  - 1300 CPU cores
- Storage
  - Total of ~700TB
  - The servers have between 12 and 36 disks; the more recent ones are 3TB each. These use hardware RAID and UPS to provide resilience.

Slide 18: Get a Grid Certificate
You must remember to use the same PC to request and retrieve the Grid certificate. The new UKCA page uses a Java based certificate wizard.

Slide 19: Two computer rooms provide excellent infrastructure for the future
The new computer room at Begbroke Science Park, built jointly for the Oxford Supercomputer and the Physics department, provides space for 55 computer racks (11kW each), 22 of which will be for Physics; up to a third of these can be used for the Tier 2 centre. This £1.5M project was funded by SRIF with a contribution of ~£200K from Oxford Physics. The room was ready in December, and the Oxford Tier 2 Grid cluster was moved there during spring. All new Physics high performance clusters will be installed here.

Slide 20: Local Oxford DWB Physics Infrastructure Computer Room
Completely separate from the Begbroke Science Park, a local Physics department infrastructure computer room with 100kW of cooling and >200kW of power has been built, using ~£150K of Oxford Physics money. It was completed in September. This allowed local computer rooms to be refurbished as offices again, and racks that were in unsuitable locations to be re-housed.

Slide 21: Cold aisle containment

Slide 22: The end for now…
- Sean will give more details of the use of the clusters next week.
- Help pages: physics/particle-physics-computer-support
- Questions….
- Network topology (following slides)

Slide 23: Network
- Gigabit JANET connection to campus, July.
- Second JANET gigabit connection, September.
- JANET campus connection upgraded to dual 10 gigabit links, August 2009.
- A gigabit Juniper firewall manages the internal and external Physics networks.
- 10Gb/s network links installed between the Tier-2 and Tier-3 clusters.
- Physics-wide wireless network, installed in DWB public rooms, Martin Wood, AOPP and Theory. A new firewall provides routing and security for this network.

Slide 24: Network Access
[Diagram: Super Janet 4 (2 * 10Gb/s, with Janet 6) into the campus backbone router; the OUCS firewall and backbone edge routers feed other departments at 100Mb/s-1Gb/s, while the Physics firewall and Physics backbone router are connected at 1-10Gb/s.]

Slide 25: Physics Backbone
[Diagram: the Physics backbone router links the Physics firewall (1Gb/s) and the Particle Physics, Clarendon Lab, Astro, Theory and Atmos networks at 1Gb/s each; desktops connect at 100Mb/s-1Gb/s, and Linux and Windows 2000 servers hang off server switches at 1-10Gb/s.]

Slide 26: Future Physics Backbone
[Diagram: planned upgrade replacing the backbone router with a Dell 8024F Physics backbone switch, with 10Gb/s uplinks to Dell 8024F switches in Particle Physics, Clarendon Lab, Astro, Theory and Atmos, a further server switch, 10Gb/s Linux servers and the Frodo / Super FRODO clusters; desktops remain at 1Gb/s.]