AGLT2 Site Report
Shawn McKee, University of Michigan
HEPiX Fall 2014 / UNL
Outline
- Site Summary and Status
- Monitoring
- Provisioning with Cobbler
- HTCondor MCORE details
- Virtualization Status
- Networking Upgrade
- Updates on projects
- Plans for the future
Site Summary
- The ATLAS Great Lakes Tier-2 (AGLT2) is a distributed LHC Tier-2 for ATLAS spanning UM/Ann Arbor and MSU/East Lansing
- Roughly 50% of storage and compute at each site
- 5722 single-core job slots (added 480 cores)
- MCORE slots increased from 240 to 420 (dynamic)
- 269 Tier-3 job slots usable by the Tier-2
- Average 9.26 HS06/slot
- 3.5 petabytes of storage (adding 192 TB, retiring 36 TB)
- Total of 54.4 kHS06, up from 49.0 kHS06 in the spring
- Most Tier-2 services virtualized in VMware
- 2x40 Gb inter-site connectivity; UM has 100G to the WAN, MSU has 10G to the WAN; many 10Gb internal ports and 16 x 40Gb ports
- High-capacity storage systems have 2 x 10Gb bonded links
- 40Gb link between the Tier-2 and Tier-3 physical locations
AGLT2 Monitoring
AGLT2 has a number of monitoring components in use. As shown at Annecy we have:
- Customized "summary" page
- OMD (Open Monitoring Distribution) at both UM and MSU
- Ganglia
- Central syslogging via ELK: Elasticsearch, Logstash, Kibana
- SRMwatch to track dCache SRM status
- GLPI to track tickets (with FusionInventory)
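As a hedged illustration of how the central ELK logs can be consumed, the sketch below queries Elasticsearch for recent error-level syslog entries with the elasticsearch Python client; the endpoint, index pattern, and field names are assumptions, not our actual configuration.

```python
from elasticsearch import Elasticsearch

# Hypothetical Elasticsearch endpoint and Logstash index pattern.
es = Elasticsearch(["http://elk.aglt2.example:9200"])

# Pull the 20 most recent error-level syslog entries. The "severity" and
# "@timestamp" fields assume a typical Logstash syslog pipeline and may
# differ from the actual mappings.
resp = es.search(
    index="logstash-*",
    body={
        "query": {"query_string": {"query": "severity:error"}},
        "sort": [{"@timestamp": {"order": "desc"}}],
        "size": 20,
    },
)

for hit in resp["hits"]["hits"]:
    src = hit["_source"]
    print(src.get("@timestamp"), src.get("host"), src.get("message"))
```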
Provisioning with Cobbler
AGLT2 Provisioning/Config Mgmt
- AGLT2 uses a Cobbler server configuration managed by CFEngine and duplicated at both sites for building service nodes (excepting site-specific network/host info)
- Created a flexible default kickstart template with Cobbler's template language (Cheetah) to install a variety of "profiles" selected when adding a system to Cobbler (server, cluster-compute, desktop, etc.)
- Simple PXE-based installation from the network
- Cobbler handles (with included post-install scripts) creating bonded NIC configurations – we used to deal with those manually
- Cobbler manages mirroring of the OS and extra repositories
- Kickstart setup is kept minimal; most configuration is done by CFEngine on first boot
- Dell machines get BIOS and firmware updates in post-install using utils/packages from the Dell yum repositories
- See Ben Meekhof's talk Thursday for details
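As a hedged sketch of how a system could be added to Cobbler against one of these profiles, the snippet below drives Cobbler's XML-RPC API from Python; the server name, credentials, profile name, and interface details are illustrative assumptions, not our production values.

```python
import xmlrpc.client

# Hypothetical Cobbler server; credentials are placeholders.
server = xmlrpc.client.ServerProxy("http://cobbler.aglt2.example/cobbler_api")
token = server.login("cobbler", "secret")

# Register a new compute node against the "cluster-compute" profile
# (profile name and network details are illustrative).
sid = server.new_system(token)
server.modify_system(sid, "name", "cc-001.aglt2.example", token)
server.modify_system(sid, "profile", "cluster-compute", token)
server.modify_system(sid, "modify_interface", {
    "macaddress-eth0": "aa:bb:cc:dd:ee:ff",
    "ipaddress-eth0": "10.10.1.101",
    "dnsname-eth0": "cc-001.aglt2.example",
}, token)
server.save_system(sid, token)

# Regenerate PXE/DHCP/DNS configuration so the node can network-boot and kickstart.
server.sync(token)
```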
HTCondor CE at AGLT2
- Bob Ball worked for ~1 month on the AGLT2 setup
  - Steep learning curve for newbies
  - Lots of non-apparent niceties in preparing the job-router configuration
  - RSL is no longer available for routing decisions
- Cannot change the content of a job route except during a condor-ce restart
- However, we CAN modify variables and place them in ClassAd variables set in the router
  - Used at AGLT2 to control MCORE slot access
- Currently in place on the test gatekeeper only
- Will extend to the primary GK ~10/22/14
- See full details of our experience and setup at
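As a rough illustration only: one way to check which router-set ClassAd variables landed on routed jobs is to query the batch schedd on the gatekeeper with the HTCondor Python bindings. The attribute names below follow common JobRouter conventions, and "WantMCore" is a hypothetical stand-in for whatever variable the router actually sets at AGLT2.

```python
import htcondor

# Run on the gatekeeper: query the local batch schedd for jobs the CE
# JobRouter has already routed (they carry bookkeeping attributes such
# as RoutedBy / RoutedFromJobId).
schedd = htcondor.Schedd()

# "WantMCore" stands in for whatever ClassAd variable the router sets to
# steer MCORE slot access; the real attribute name may differ.
for ad in schedd.query("RoutedBy =!= undefined",
                       ["ClusterId", "ProcId", "RoutedBy", "WantMCore"]):
    print(ad.get("ClusterId"), ad.get("ProcId"),
          ad.get("RoutedBy"), ad.get("WantMCore", "undefined"))
```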
MCORE at AGLT2
- AGLT2 has supported MCORE jobs for many months now
- Condor is configured for two MCORE job types
  - Static slots (10 total, 8 cores each)
  - Dynamic slots (420 of 8 cores each)
- Requirements statements are added by the "condor_submit" script
  - Depends on the count of queued MP8 jobs
- The result is instant access for a small number of jobs, with gradual release of cores for more over time
- Full details at
[Plot: queued vs. running MCORE jobs]
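A minimal sketch of the idea behind the submit-side logic: count idle 8-core jobs with condor_q and pick a requirements expression accordingly. The constraint, threshold, and slot-naming convention are illustrative assumptions, not our actual condor_submit wrapper.

```python
import subprocess

# Count idle (JobStatus == 1) 8-core jobs currently in the queue.
out = subprocess.check_output([
    "condor_q", "-constraint", "RequestCpus == 8 && JobStatus == 1",
    "-format", "%d\n", "ClusterId",
])
queued_mp8 = len(out.splitlines())

# Steer a small number of MCORE jobs at always-available static slots for
# instant starts; beyond that, let them wait for dynamic slots to drain.
# The threshold and slot names are illustrative.
if queued_mp8 < 10:
    requirements = 'Requirements = (SlotTypeName == "static_mcore")'
else:
    requirements = 'Requirements = (SlotTypeName == "dynamic_mcore")'

print("# appended to the generated submit file")
print(requirements)
```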
Virtualization Status
Virtualization at AGLT2
- Most Tier-2 services run on VMware (vSphere 5.5)
- UM uses iSCSI storage backends: Dell MD3600i, MD3000i and SUN NAS 7410
  - vSphere manages virtual disk allocation between units and RAID volumes based on the volumes' performance capabilities and VM demand
- MSU runs on DAS – Dell MD3200
- Working on site resiliency details
  - Multisite SSO is operational between sites (SSO at either site manages both sites)
  - MSU is operating site-specific Tier-2 VMs (dCache doors, xrootd, cobbler) on vSphere
  - The VMware Replication Appliance performs daily replications of critical UM VMs to the MSU site. This is working well
- Our goal is to have MSU capable of bringing up Tier-2 service VMs within 1 day of loss of the UM site. Queued: a real test of this process
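As a hedged sketch of one building block a recovery runbook could use, the snippet below lists Tier-2 service VMs and their power state via the pyVmomi bindings; the vCenter host, credentials, and the "t2-" naming convention are assumptions, and real vSphere Replication recovery involves more than a power-state check.

```python
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical vCenter endpoint and credentials; depending on the
# certificate setup an SSL context argument may also be needed.
si = SmartConnect(host="vcenter.aglt2.example", user="admin", pwd="secret")
content = si.RetrieveContent()

# Walk every VM registered in this vCenter.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)

for vm in view.view:
    summary = vm.summary
    # The "t2-" prefix for Tier-2 service VMs is an illustrative convention.
    if summary.config.name.startswith("t2-"):
        print(summary.config.name, summary.runtime.powerState)

Disconnect(si)
```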
AGLT2 100G Network Details
[Network diagram; one link down due to problematic optics]
Software-Defined Storage Research
- NSF proposal submitted involving campus and our Tier-2
- Exploring Ceph for future software-defined storage
- Goal is centralized storage that supports in-place access from CPUs across campus
- Intends to leverage Dell "dense" storage MD3xxx (12 Gbps SAS) in JBOD mode
- Still waiting for news…
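To make "in-place access" concrete, the sketch below shows a client reading and writing objects directly in a Ceph pool with the python-rados bindings; the pool name, object name, and config path are assumptions, and this only illustrates the access model, not a deployed service.

```python
import rados

# Connect using the standard Ceph client config (path and pool are assumptions).
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("tier2-scratch")

# Any node with a client keyring can read/write objects in place,
# without staging data through a separate transfer service.
ioctx.write_full("analysis/run1234/summary.json", b'{"events": 42}')
print(ioctx.read("analysis/run1234/summary.json"))

ioctx.close()
cluster.shutdown()
```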
Update on DIIRT
- At Ann Arbor, Gabriele Carcassi presented on "Using Control Systems for Operation and Debugging"
- This effort has continued and is now called DIIRT (Data Integration In Real Time)
- Architecture (data flow): scripts -> NFS (CSV or JSON) -> diirt server -> Websockets + JSON -> web pages (HTML + JavaScript); Control System Studio provides the UI for operators
- Currently implemented:
  - Scripts populate an NFS directory from condor/ganglia
  - Files are served by the diirt server through web sockets
  - Control System Studio can create "drag'n'drop" UIs
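A minimal sketch of the "scripts populate NFS" step, assuming a cron-style job that dumps condor queue counts to a JSON file for the diirt server to pick up; the output path and the use of condor_q here are illustrative.

```python
import json
import subprocess
import time

def count_jobs(constraint):
    """Count jobs matching a ClassAd constraint via condor_q."""
    out = subprocess.check_output(
        ["condor_q", "-constraint", constraint, "-format", "%d\n", "ClusterId"])
    return len(out.splitlines())

# JobStatus 1 = Idle, 2 = Running.
snapshot = {
    "timestamp": int(time.time()),
    "queued": count_jobs("JobStatus == 1"),
    "running": count_jobs("JobStatus == 2"),
}

# Hypothetical NFS path watched by the diirt server.
with open("/nfs/diirt/data/condor/jobs.json", "w") as f:
    json.dump(snapshot, f)
```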
DIIRT UI
- Canvas allows drag-n-drop of elements to assemble views; no programming required
- Server can feed remote clients in real time
- Project info at
Future Plans
- Participating in SC14 (simple WAN data-pump system)
- Our Tier-3 uses Lustre 2.1 and has ~500TB
  - Approximately 35M files averaging 12MB/file
  - We will purchase new hardware providing another 500TB
  - Intend to go to Lustre 2.5+ and are VERY interested in using Lustre on ZFS for this
  - Plan: install a new Lustre instance, migrate the existing Lustre data over, then rebuild the older hardware into the new instance, retiring some components for spare parts
- Still exploring OpenStack as an option for our site. Would like to use Ceph for a back-end
- New network components support Software Defined Networking (OpenFlow). Once OpenFlow v1.3 is supported we intend to experiment with SDN in our Tier-2 and as part of the LHCONE point-to-point testbed
- Working on IPv6 dual-stack for all nodes in our Tier-2
Summary
- Monitoring is helping us easily find and fix issues
- Virtualization tools are working well and we are close to meeting our site-resiliency goals
- Network upgrade in place: 2x40G inter-site, 100G WAN
- DIIRT is a new project allowing us to customize how we manage and correlate diverse data
- FUTURE: OpenStack, IPv6, Lustre on ZFS for Tier-3, SDN
Questions?