Site Report: ATLAS Great Lakes Tier-2, HEPiX 2011, Vancouver, Canada, October 24th, 2011

Topics
- Site info – overview of site details
- Virtualization/iSCSI – use of iSCSI for service virtualization
- dCache – dCache "locality-aware" configuration
- LSM-DB – gathering I/O logging from "lsm-get"

AGLT2 Overview
- ATLAS Great Lakes Tier-2: one of five USATLAS Tier-2s.
- Has benefited from strong interactions with and support from the other Tier-2s.
- Unique in the US: AGLT2 is also one of three ATLAS Muon Calibration Centers, which brings its own needs and requirements.
- Our Tier-2 is physically hosted at two sites: Michigan State University and the University of Michigan.
- Currently ~36.2 kHS06 of compute, 4252 job slots, 250 opportunistic job slots, and 2210 TB of storage.

AGLT2 Notes
- We are working on minimizing hardware bottlenecks:
  - Network: 4x10GE WAN paths; many 10GE ports (UM: 156, MSU: 80). We run Multiple Spanning Tree at UM to better utilize the 10GE links.
  - Storage: 25 10GE dCache servers; disk counts UM: 723, MSU: 798. Using service virtualization and SSDs for DB/NFS "hot" areas.
- AGLT2 is planning to be one of the first US Tier-2 sites to put LHCONE into production (VLANs already routed).
- We have 6 perfSONAR-PS instances at each site (UM and MSU: 2 production, 4 for testing, prototyping, and local use).
- Strong research flavor: a PI/Co-PI site for DYNES, UltraLight, and GridNFS, and involved in Terapaths/StorNet.

AGLT2 Operational Details
- We use ROCKS v5.3 to provision our systems (SL5.4/x64).
- Extensive monitoring is in place (Ganglia, php-syslog-ng, Cacti, dCache monitoring, monit, Dell management software).
- A Twiki is used for site documentation and informal notes.
- Automated emails via Cacti, Dell OMSA, and custom scripts for problem notification (a minimal example follows below).
- OSG provides the primary middleware for grid/ATLAS software.
- Configuration control via Subversion and CFEngine.
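As an illustration of what a custom notification script can look like, here is a minimal sketch in modern Python; the checked paths, threshold, mail host, and addresses are all placeholders, not the actual AGLT2 tooling.

    # Hypothetical problem-notification script: email the admins when a
    # filesystem crosses a usage threshold. All names/values are placeholders.
    import shutil
    import smtplib
    from email.message import EmailMessage

    THRESHOLD = 0.90                      # alert above 90% full (assumed value)
    PATHS = ["/", "/var", "/tmp"]         # example mount points to watch

    def check_filesystems(paths):
        """Return warning strings for filesystems above THRESHOLD."""
        problems = []
        for p in paths:
            usage = shutil.disk_usage(p)
            frac = usage.used / usage.total
            if frac > THRESHOLD:
                problems.append("%s is %.0f%% full" % (p, 100 * frac))
        return problems

    def send_alert(problems, mailhost="localhost", to="admins@example.org"):
        msg = EmailMessage()
        msg["Subject"] = "Site alert: %d filesystem(s) above threshold" % len(problems)
        msg["From"] = "monitor@example.org"
        msg["To"] = to
        msg.set_content("\n".join(problems))
        with smtplib.SMTP(mailhost) as s:  # assumes a local MTA relay
            s.send_message(msg)

    if __name__ == "__main__":
        issues = check_filesystems(PATHS)
        if issues:
            send_alert(issues)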

WLCG Delivered HS06-hours Last Year
AGLT2 has delivered beyond its pledge and has done well in comparison to all WLCG Tier-2 sites. The plot on the slide shows HS06-hours for all WLCG VOs by Tier-2 (where a Tier-2 is one or more sites), based on the WLCG published spreadsheets. USATLAS Tier-2s are green, USCMS red. NOTE: the US-NET2 data from WLCG is wrong; it is missing Harvard, for example.

10GE Protected Network for ATLAS
- We have two "/23" networks for the AGL-Tier2 but a single domain: aglt2.org.
- Currently 3 10GE paths to Chicago for AGLT2; another 10GE DCN path also exists (bandwidth limited).
- Our AGLT2 network has three 10GE wavelengths on MiLR in a "triangle".
- Loss of any of the 3 waves does not impact connectivity for either site. A VRF is used to utilize the 4th wave at UM.

Virtualization at AGLT2
AGLT2 is heavily invested in virtualization for our services. VMware Enterprise Plus provides the virtualization infrastructure.
- VM hardware: 3x R710, 96 GB RAM, 2x X5670 (2.93 GHz), 2x 10GE, 6x 146 GB disks, 3x quad 1GE (12 ports)
- MD3600i with 15x 600 GB 15k SAS; MD1200 with 15x 600 GB 15k SAS
- Management: vCenter, now itself a VM
- Network uses NIC teaming, VLAN trunking, and 4 switches

iSCSI Systems at AGLT2
Having this set of iSCSI systems gives us lots of flexibility:
- We can migrate VMs live to different storage
- Redundant Lustre MDTs can use the same storage
- It can serve as a DB backend
- VMs can be backed up to different backends

Virtualization Summary
- We have virtualized many of our services:
  - Gatekeepers (ATLAS and OSG), LFC
  - AFS cell (both the DB servers and fileservers)
  - Condor and ROCKS head nodes
  - LSM-DB node, 4 Squids
  - Terapaths control nodes
  - Lustre MGS node
- The system has worked well. It has saved us from having to buy dedicated hardware and has eased management, backup, and testing.
- Future: may enable better overall resiliency by having VM infrastructure at both sites.

dCache and Locality-Awareness
- For AGLT2 we have seen significant growth in the amount of storage and compute power at each site.
- We currently have a single 10GE connection used for inter-site transfers, and it is becoming strained.
- Given 50% of the resources at each site, 50% of file accesses will be on the inter-site link (see the sketch below). We are seeing periods of 100% utilization!
- The cost for an additional link is $30K/year plus additional equipment.
- We could try traffic engineering to utilize the other direction on the MiLR triangle, BUT this would compete with WAN use.
- This got us thinking: we have seen that pCache works OK for a single node, but the hit rate is relatively small. What if we could "cache" our dCache at each site and have dCache use "local" files? We don't want to halve our storage, though!
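A quick back-of-the-envelope check of the 50% figure, as a small Python sketch; the only assumption is that file placement is independent of where a job runs.

    # Expected fraction of dCache reads that must cross the inter-site link,
    # assuming jobs and file placement are independent between UM and MSU.
    def cross_site_fraction(compute_at_um, storage_at_um):
        """Arguments are the fractions of compute and storage located at UM."""
        # A job at UM reads a remote file with probability (1 - storage_at_um);
        # a job at MSU reads a remote file with probability storage_at_um.
        return (compute_at_um * (1 - storage_at_um)
                + (1 - compute_at_um) * storage_at_um)

    print(cross_site_fraction(0.5, 0.5))  # -> 0.5: half of all reads use the link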

Original AGLT2 dCache Config (diagram slide)

dCache and Locality-Awareness
- At the WLCG meeting at DESY we worked with Gerd, Tigran, and Paul on some dCache issues.
- We came up with a 'caching' idea that has some locality awareness.
- It transparently uses pool space for cached replicas.
- Working well!
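To make the idea concrete, here is a toy Python sketch of what locality-aware reads with transparent caching mean conceptually; this is an illustration only, not dCache's actual pool-selection logic or configuration, and the site/file names are made up.

    # Toy model of locality-aware reads with transparent caching.
    # Not dCache code; names and behavior are illustrative only.
    class Site(object):
        def __init__(self, name):
            self.name = name
            self.permanent = set()   # files whose primary replica lives here
            self.cached = set()      # transient cached replicas (evictable)

        def has(self, f):
            return f in self.permanent or f in self.cached

    def read(f, local, remote):
        """Serve a read at 'local'; cache the file when it came from 'remote'."""
        if local.has(f):
            return "local read at %s" % local.name
        # Remote read: keep an evictable copy in local pool space so the next
        # access at this site stays off the inter-site link.
        local.cached.add(f)
        return "read from %s, now cached at %s" % (remote.name, local.name)

    um, msu = Site("UM"), Site("MSU")
    msu.permanent.add("AOD.example.root")
    print(read("AOD.example.root", um, msu))  # crosses the link, caches at UM
    print(read("AOD.example.root", um, msu))  # served locally from the cache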

Planning for I/O
- A recent hot topic has been planning for I/O capacity to best support I/O-intensive jobs (typically user analysis).
- There is both a hardware and a software aspect to this, and a possible network impact as well:
  - How many spindles, and of what type, on a worker node?
  - Does SAS vs SATA make a difference? 7.2K vs 10K vs 15K RPM?
  - How does any of the above scale with job-slots per node?
- At AGLT2 we have seen some pathological jobs which had ~10% CPU use because of I/O wait (a simple way to spot this is sketched below).
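One lightweight way to spot such nodes is to sample the iowait counters in /proc/stat; a minimal sketch follows, where the sample interval and the alert threshold are arbitrary choices rather than AGLT2 settings.

    # Sample /proc/stat twice and report the fraction of CPU time spent waiting
    # on I/O over the interval (Linux: field 5 of the "cpu" line is iowait).
    import time

    def cpu_times():
        with open("/proc/stat") as f:
            fields = f.readline().split()   # "cpu user nice system idle iowait ..."
        values = list(map(int, fields[1:]))
        return values[4], sum(values)       # (iowait jiffies, total jiffies)

    def iowait_fraction(interval=30):
        io1, tot1 = cpu_times()
        time.sleep(interval)
        io2, tot2 = cpu_times()
        return (io2 - io1) / float(tot2 - tot1)

    if __name__ == "__main__":
        frac = iowait_fraction()
        print("iowait fraction over the last interval: %.1f%%" % (100 * frac))
        if frac > 0.5:                      # arbitrary alert threshold
            print("WARNING: this node looks I/O bound")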

LSM, pCache and SYSLOG-NG
- To try to remedy some of the worker-node I/O issues, we decided to utilize some of the tools from MWT2.
- pCache was installed on all worker nodes in spring 2011:
  - The pCache "hit rate" is around 15-20%.
  - Saves recopying AND duplicated disk-space use.
  - Easy to use and configure.
- To take advantage of the callbacks to PanDA, we also installed LSM (Local Site Mover), a set of wrapper scripts for 'put', 'get', 'df' and 'rm':
  - Allows us to easily customize our site behavior and "mover" tools.
  - Important bonus: serves as a window into file-transfer behavior.
  - Logs to a local file by default.
- AGLT2 has long used a central logging host running syslog-ng.
- Configure LSM to also log to syslog (see the sketch below)... now we centrally have ALL LSM logs in the log system... how do we use that?
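A sketch of how a Python mover wrapper could log both to a local file and to the central syslog-ng host; the hostname, facility, paths, and record format are assumptions, not the actual AGLT2 configuration.

    # Sketch: send LSM transfer records to a local file and to the central
    # syslog-ng host. Hostname, port, facility, and format are assumptions.
    import logging
    import logging.handlers

    log = logging.getLogger("lsm")
    log.setLevel(logging.INFO)

    # Local file, as LSM does by default (path assumed)
    log.addHandler(logging.FileHandler("/var/log/lsm/lsm.log"))

    # Central syslog-ng host over UDP/514; facility local3 chosen arbitrarily
    syslog = logging.handlers.SysLogHandler(
        address=("loghost.example.org", 514),
        facility=logging.handlers.SysLogHandler.LOG_LOCAL3)
    syslog.setFormatter(logging.Formatter("lsm-get: %(message)s"))
    log.addHandler(syslog)

    # Example record for a completed 'get'; the fields are illustrative only
    log.info("status=OK pool=pool01 file=/pnfs/example/AOD.root bytes=123456789 seconds=12.3")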

LSM DB
- The syslog-ng central loghost stores all the logs in MySQL.
- To make the LSM info useful, I created another MySQL DB for the LSM data.
- Shown at the right of the slide is the design diagram, with each table representing an important component we want to track.
- We have a cron job which updates the LSM DB from the syslog DB every 5 minutes. It also updates the Pools/Files information for all new transfers found. (A sketch of such an update job follows below.)
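A minimal sketch of what such a cron-driven update could look like; every table and column name here (logs, transfers, checkpoint, and their fields) is hypothetical, since the actual schema is only shown in the slide's diagram.

    # Hypothetical cron-job body: copy new LSM records from the syslog-ng MySQL
    # store into the LSM DB. Schema, names, and credentials are placeholders.
    import MySQLdb

    syslog_db = MySQLdb.connect(host="loghost", db="syslog", user="reader", passwd="secret")
    lsm_db = MySQLdb.connect(host="loghost", db="lsmdb", user="writer", passwd="secret")
    cur_in, cur_out = syslog_db.cursor(), lsm_db.cursor()

    def parse_lsm_message(msg):
        """Turn 'status=OK pool=pool01 file=... bytes=123 seconds=4.5' into a dict (assumed format)."""
        return dict(kv.split("=", 1) for kv in msg.split())

    # Where did we stop last time? (single-row bookkeeping table, assumed)
    cur_out.execute("SELECT last_seq FROM checkpoint")
    last_seq = cur_out.fetchone()[0]

    # Pull only LSM messages newer than the checkpoint
    cur_in.execute(
        "SELECT seq, host, logtime, msg FROM logs "
        "WHERE program = 'lsm-get' AND seq > %s", (last_seq,))

    for seq, host, logtime, msg in cur_in.fetchall():
        rec = parse_lsm_message(msg)
        cur_out.execute(
            "INSERT INTO transfers (node, tstamp, status, pool, filename, bytes, seconds) "
            "VALUES (%s, %s, %s, %s, %s, %s, %s)",
            (host, logtime, rec["status"], rec["pool"], rec["file"],
             rec["bytes"], rec["seconds"]))
        last_seq = seq

    cur_out.execute("UPDATE checkpoint SET last_seq = %s", (last_seq,))
    lsm_db.commit()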

Transfer Information from LSM DB
A stack-plot from Tom Rockwell (on the slide) shows 4 types of transfers:
- Within a site (UM-UM or MSU-MSU): the left side of each day
- Between sites (UM-MSU or MSU-UM): the right side of each day
You can see that traffic between sites is roughly equal to traffic within sites.

Transfer Reuse from the LSM DB
The plot from Tom (on the slide) shows the time between the first and second copy of a specific file for the MSU worker nodes. The implication is that caching about one week's worth of files would cover most reuse cases.
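The first-to-second-copy gap can be computed directly from the hypothetical transfers table used in the earlier sketch; one possible way, with the same assumed schema:

    # Per file, compute the gap between its first and second copy to an MSU
    # worker node, using the hypothetical 'transfers' table from the earlier sketch.
    from collections import defaultdict
    import MySQLdb

    db = MySQLdb.connect(host="loghost", db="lsmdb", user="reader", passwd="secret")
    cur = db.cursor()
    cur.execute(
        "SELECT filename, tstamp FROM transfers "
        "WHERE node LIKE %s ORDER BY filename, tstamp", ("msu%",))

    copies = defaultdict(list)
    for filename, tstamp in cur.fetchall():
        copies[filename].append(tstamp)

    gaps_hours = [(ts[1] - ts[0]).total_seconds() / 3600.0
                  for ts in copies.values() if len(ts) >= 2]
    within_week = sum(1 for g in gaps_hours if g <= 7 * 24)
    print("files copied at least twice: %d" % len(gaps_hours))
    if gaps_hours:
        print("fraction reused within a week: %.0f%%"
              % (100.0 * within_week / len(gaps_hours)))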

LSM DB Uses
- With the LSM DB there are many possibilities for better understanding the impact of our hardware and software configurations:
  - We can ask how many "new" files there have been since X (by site).
  - We can get "hourly" plots of transfer rates by transfer type and source-destination site, and could alert on problems.
  - We can compare transfer rates for different worker-node disks and disk configurations (or versus any other worker-node characteristics).
  - We can compare pool-node performance versus memory on the host (or more generally versus any of the pool-node characteristics).
  - How many errors (by node) in the last X minutes? Alert via email? (Example queries are sketched below.)
- We have just started using this new tool and hope to have some useful information to guide our coming purchases as well as improve our site monitoring.
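As an illustration, the first and last of these questions map onto simple queries against the hypothetical transfers table from the earlier sketches; the schema, the site-naming convention, and the thresholds are all assumptions.

    # Example queries against the hypothetical LSM DB schema used above;
    # the real table/column names at the site may differ.
    import MySQLdb

    db = MySQLdb.connect(host="loghost", db="lsmdb", user="reader", passwd="secret")
    cur = db.cursor()

    # 1. How many files were seen for the first time since a cutoff, by site?
    #    (site taken from the first 3 characters of the node name, assumed convention)
    cur.execute(
        "SELECT site, COUNT(*) FROM ("
        "  SELECT SUBSTRING(node, 1, 3) AS site, filename, MIN(tstamp) AS first_seen"
        "  FROM transfers GROUP BY site, filename) AS firsts "
        "WHERE first_seen > %s GROUP BY site",
        ("2011-10-17 00:00:00",))
    for site, nfiles in cur.fetchall():
        print("%s: %d new files" % (site, nfiles))

    # 2. Which nodes had failed transfers in the last 30 minutes? (email-alert candidate)
    cur.execute(
        "SELECT node, COUNT(*) FROM transfers "
        "WHERE status <> 'OK' AND tstamp > NOW() - INTERVAL 30 MINUTE "
        "GROUP BY node HAVING COUNT(*) > 5")      # arbitrary error threshold
    for node, nerrors in cur.fetchall():
        print("ALERT: %s had %d failed transfers in the last 30 minutes" % (node, nerrors))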

Summary
- Our site has been performing very well for production tasks, for users, and in our calibration role.
- Virtualization of services is working well and eases management.
- We have a strong interest in creating high-performance "end-to-end" data movement capability to increase our effectiveness (both for production and analysis use). This includes optimizing for I/O-intensive jobs on the worker nodes.
- Storage (and its management) is a primary issue. We continue exploring dCache, Lustre, Xrootd, and/or NFS v4.1 as options.
Questions?

EXTRA SLIDES

Current Storage Node (AGLT2)
- Relatively inexpensive: ~$200/TB (usable)
- Uses resilient cabling (active-active)

WLCG Delivered HS06-hours (Since Jan 2009)
This plot is the same as the last, except that it covers the complete period of WLCG data, from January 2009 to July 2011. Details and more plots are at the link on the slide. NOTE: the US-NET2 data from WLCG is wrong; it is missing Harvard, for example.