New Directions: BNL USATLAS Tier-3 & New Computing Directives
William Strecker-Kellogg
RHIC/ATLAS Computing Facility (RACF), Brookhaven National Lab
May 2016
RACF Overview
- RHIC: collider at BNL; STAR and PHENIX detectors, ~15,000 slots each
  - ~6 PB central disk (GPFS/NFS), ~16 PB distributed disk (dCache/xrootd), ~60 PB HPSS tape
- USATLAS Tier-1: ~15k cores
  - ~11 PB replicated dCache disk, ~20 PB HPSS tape
- Three large HTCondor clusters that share resources via flocking (see the sketch below); several smaller clusters
- Recent 8.2 → 8.4 update
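As a hedged illustration of the flocking mentioned above (the talk does not show the actual configuration), a minimal sketch with placeholder host names:

    # Sketch only; pool and schedd names are hypothetical, not the real RACF hosts.
    # On a submit host in cluster A, list the central managers it may flock to:
    FLOCK_TO = pool-b.example.bnl.gov, pool-c.example.bnl.gov
    # On the central manager of pool B, list the schedds allowed to flock in:
    FLOCK_FROM = submit-a.example.bnl.gov

Jobs that cannot be matched locally are then advertised to the listed remote pools with no change to the user's submit file.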
ATLAS Load Balancing
- Working steadily since last year's talk and the HEPiX Fall 2015 talk
- Keeps occupancy at 95% or above despite a dynamically changing workload, with no human intervention
- Prevents starvation caused by competition with larger jobs
- The only remaining inefficiency is due to (de)fragmentation (see the note below)
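The talk does not show how fragmentation is handled; purely as an illustration of the stock HTCondor knob for this problem (not necessarily what RACF uses), a condor_defrag sketch with placeholder rates:

    # Illustrative only: the defrag daemon drains partitionable slots so that
    # whole machines periodically become available for large jobs.
    DAEMON_LIST = $(DAEMON_LIST) DEFRAG
    DEFRAG_INTERVAL = 600
    DEFRAG_DRAINING_MACHINES_PER_HOUR = 1.0
    DEFRAG_MAX_CONCURRENT_DRAINING = 4
    DEFRAG_MAX_WHOLE_MACHINES = 8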
V8.4 Bug Fixing
- Minor issue: the RemoteGroupResourcesInUse classad did not appear in the negotiation context
  - Matters for group-based preemption (sketch below); fixed quickly by Greg (ticket #)
- Major issue: schedd halting for up to an hour
  - No disk/network I/O of any kind, no syscalls; GDB shows a mess (STL) (ticket #)
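For context, a hedged sketch of where RemoteGroupResourcesInUse is typically referenced; the exact policy expression RACF uses is not shown in the talk:

    # Illustrative negotiator policy: never preempt within the same group, and
    # only preempt jobs whose group is running above its quota.
    PREEMPTION_REQUIREMENTS = (SubmitterGroup =!= RemoteGroup) && \
                              (RemoteGroupResourcesInUse > RemoteGroupQuota)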
V8.4 Schedd Bug
- Schedd spent up to an hour recomputing an internal array mapping autocluster → jobid and the reverse
- Jobs would die after their shadows could not talk to the schedd that had gone dark
- After a day of debugging, a code fix was implemented
  - Built & tested at BNL; the maximum time dropped from 1 hour to 2 seconds
  - Thank you to Todd & TJ
- Still suspect something is off in our environment (500k autoclusters/day), but it is not a problem anymore!
USATLAS Tier-3 at BNL
- Consolidates previously scattered Tier-3 facilities into a shared resource of ~1000 cores
- Many user groups represented; local submission
- Hierarchical group quotas (sketch below)
  - Group-membership authorization problem
  - Surplus sharing
  - Group-based fair-share preemption
- After a slow start, usage has increased in the past few months
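A minimal sketch of hierarchical group quotas with surplus sharing; the group names and quota numbers are made up, as the real T3 group tree is not reproduced here:

    # Negotiator config sketch; names and numbers are placeholders.
    GROUP_NAMES = group_t3, group_t3.physA, group_t3.physB
    GROUP_QUOTA_group_t3.physA = 600
    GROUP_QUOTA_group_t3.physB = 400
    # Let groups use idle cores beyond their quota (surplus sharing)
    GROUP_ACCEPT_SURPLUS = True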
Group Membership
- Extended the group-quota editing web UI with user → institute → group mappings
- A cron job generates a config fragment that asserts Owner → Group in the START expression (sketch below)
- The group is required at submission
- Currently 74 users and 26 groups use the T3
- Surplus sharing & group-respecting preemption (RemoteGroupResourcesInUse)
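A hedged sketch of the kind of generated fragment described above; the user names, group names, and exact attribute choices are placeholders, not the real generated file:

    # Startd side: only start a job whose accounting group matches its owner.
    START = $(START) && ( \
        (TARGET.Owner == "alice" && TARGET.AcctGroup == "group_t3.physA") || \
        (TARGET.Owner == "bob"   && TARGET.AcctGroup == "group_t3.physB") )
    # Schedd side: reject submissions that do not declare an accounting group.
    SUBMIT_REQUIREMENT_NAMES = $(SUBMIT_REQUIREMENT_NAMES) HasGroup
    SUBMIT_REQUIREMENT_HasGroup = AcctGroup =!= undefined
    SUBMIT_REQUIREMENT_HasGroup_REASON = "Submit with accounting_group set to your T3 group"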
Preemptable Partitionable Slots
- Users frequently ask to run high-memory jobs…
- …which motivates user-priority preemption with partitionable slots
- We absolutely need to support group-constrained fair-share with preemption
- Currently, pslot preemption operates on the entire slot (see the sketch below)
  - An entire group's quota can fit on one 40-core machine; a function of the small scale of 30+ groups sharing 1000 cores
  - No way to respect group quotas as the schedd splits slots
- Not currently possible!
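For reference, a hedged sketch of the knobs involved; this only illustrates the existing mechanism, not the group-aware behavior the slide says is missing:

    # Startd: one partitionable slot owning the whole machine.
    NUM_SLOTS_TYPE_1 = 1
    SLOT_TYPE_1 = 100%
    SLOT_TYPE_1_PARTITIONABLE = True
    # Negotiator: enable preemption of dynamic slots under a pslot.
    # As noted above, this operates on the entire slot, so group quotas are
    # not respected when the slot is subsequently split.
    ALLOW_PSLOT_PREEMPTION = True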
New Directions
What we talk about when we talk about H[TP]C
New Science & Evolving Needs
Traditional model:
- RACF as a microcosm of HEP/NP computing; many other facilities of similar (within an order of magnitude) scale and capability
- Embarrassingly parallel workloads
- Data storage is as large a problem as computing, or larger
- Batch is simple: provisioning/matchmaking rather than scheduling
- Large, persistent, well-staffed experiments
New model:
- New users with traditionally HPC-based workloads: National Synchrotron Light Source II, Center for Functional Nanomaterials
- Revolving user base; no institutional "repository" of computing knowledge; not used to large-scale computing
- Software support not well defined: a large pool of poorly supported open-source or free/abandon-ware, plus commercial or GUI-interactive software
Institutional Cluster
- New cluster of 108 nodes, each with 2 GPUs; plans to double next year, then perhaps grow more
- Infiniband interconnect in a fat-tree topology
- SLURM is being evaluated; it seems to be the growing choice for new HPC clusters
  - Active development, DOE experience, open source, large user base (6 of the top 10 on the TOP500)
- Testing sample workloads in proposed queue configurations
- Setting up Shifter for Docker integration
- Will be run as a "traditional" HPC cluster; MPI support is an important consideration
New Science & Evolving Needs
- Lab-management support for consolidation of computing
- The Computational Science Initiative (CSI) at BNL "leverages" RACF experience in support of other, non-HEP/NP science domains
- CSI is also organizing software support to fill a gap in current BNL computational services
- The $10,000 question: can we leverage existing infrastructure?
Running at an HTC Facility: Zero-Order Requirements
- Embarrassingly parallel: (small) input → (one) process → (small) output, with no communication of intermediate results
- x86_64: other hardware is not standard in the community
- Data accessible: it may seem obvious, but there must be adequate bandwidth to get data to the compute and back; something to think about when moving off a single desktop
Running at an HTC Facility: First-Order Requirements
- Linux (RedHat): virtualization is an extra complexity and Windows is expensive; containers/Docker allow simple cross-Linux compatibility
- Free software: instance-limited licenses are hard to control across many pools, and license costs become prohibitive with exponential computing growth
- "Friendly" resource profile: code must run not just within the machine, but within the general limits its neighboring jobs use (see the submit sketch below)
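A minimal submit-description sketch of declaring a job's resource footprint up front, one way to stay within the "friendly" profile above; the executable name and numbers are placeholders:

    # Sketch only; values are illustrative.
    universe       = vanilla
    executable     = analyze.sh
    request_cpus   = 1
    request_memory = 2 GB
    request_disk   = 4 GB
    queue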
HTC/HPC Divide
- Not a false dichotomy, but surely an increasingly blurry line
- Several users in our experience fit in the middle ground (albeit with considerable help from the RACF to fit the workload into an exclusively HTC environment)
  1. Biology group: 800 cores for 5 months with a simple dedicated scheduler
  2. Wisconsin group at CFN: successfully ran opportunistically on RHIC resources
- Key factors
  - How much state transfer, and with what I/O patterns?
  - Size: from 10 years ago to now, 2 racks collapse into 1 machine; how many problems fit inside one machine today?
Scheduling
- HTCondor can now submit to SLURM via the grid universe (sketch below)
- Different sharing models:
  1. Condor-as-SLURM-job (glidein-style)
  2. Coexist, mutually excluding via policy
  3. Flocking/routing (needs work for our users)
- Ideal: transparent for users who know the requirements of their workload
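A hedged sketch of a grid-universe submit file targeting SLURM through the batch GAHP; the executable and file names are placeholders, and any site-specific routing is omitted:

    # Sketch only; assumes the batch GAHP can reach the local SLURM cluster.
    universe      = grid
    grid_resource = batch slurm
    executable    = simulate.sh
    output        = sim.out
    error         = sim.err
    log           = sim.log
    queue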
The End
Thank you for your time! Questions? Comments?