BigPanDA for ATLAS
Kaushik De, Univ. of Texas at Arlington
BigPanDA Workshop, ORNL, March 31, 2017

Overview
- ATLAS has provided a testbed for BigPanDA development and commissioning for the past 4+ years
  - First to run production in a DOE HPC – Titan
  - Scaling up operations on Titan steadily over the past 2 years
  - First to integrate the network stack with the workflow management system
  - Testing of BigPanDA packaging and distribution
  - Many other accomplishments – see Alexei's talk yesterday
- This is a natural role for ATLAS, since PanDA grew up here
- ATLAS is unique in its ability to federate resources
  - Started with grid resources – WLCG
  - Integrated cloud resources 5 years ago – research clouds, Google, AWS…
  - Integrated HPC resources about 3 years ago – DOE, NSF, EU...
  - All of the federation is done through PanDA/BigPanDA (see the sketch below)
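Conceptually, federation works because every resource – a grid site, a cloud, or an HPC backfill partition – is presented to the workload manager as a queue with declared capabilities, and jobs are routed to whichever queue can satisfy them. The sketch below illustrates that idea only; the queue names, fields, and brokerage logic are invented for illustration and are not the real PanDA/AGIS schema.

```python
# Illustrative sketch: heterogeneous resources described as uniform "queues",
# the way a workload manager such as PanDA federates grid, cloud, and HPC sites.
# All names, fields, and numbers here are hypothetical.

QUEUES = [
    {"name": "BNL_PROD",       "kind": "grid",  "max_cores_per_job": 8,         "walltime_h": 48},
    {"name": "GCE_CLOUD",      "kind": "cloud", "max_cores_per_job": 32,        "walltime_h": 24},
    {"name": "TITAN_BACKFILL", "kind": "hpc",   "max_cores_per_job": 16 * 3000, "walltime_h": 2},
]

def pick_queue(cores_needed, hours_needed):
    """Return the first queue that can satisfy a job's resource request."""
    for q in QUEUES:
        if cores_needed <= q["max_cores_per_job"] and hours_needed <= q["walltime_h"]:
            return q["name"]
    return None

if __name__ == "__main__":
    # A short, very wide Geant4 payload fits only the HPC backfill queue.
    print(pick_queue(cores_needed=4800, hours_needed=1))  # -> TITAN_BACKFILL
```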

Many Speakers Showed a Similar Slide Yesterday
- 74 million Titan core hours used in calendar year 2016
  - In backfill mode – no allocation
- 374 million core hours remained unused (see the estimate below)
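Putting those two numbers together gives a sense of scale. A minimal back-of-the-envelope calculation, assuming the 374 M hours are what was still left idle after ATLAS consumed its 74 M:

```python
# Back-of-the-envelope: what fraction of Titan's otherwise-idle 2016 cycles
# did ATLAS backfill capture? Assumes the 374 M unused hours are what remained
# after ATLAS used 74 M, i.e. roughly 448 M backfill hours were available.
used_mh = 74.0      # million core hours run by ATLAS in backfill
unused_mh = 374.0   # million core hours still left idle

available_mh = used_mh + unused_mh
captured = used_mh / available_mh
print(f"ATLAS captured {captured:.1%} of ~{available_mh:.0f} M idle core hours")
# -> ATLAS captured 16.5% of ~448 M idle core hours
```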

And Details of Backfill Like This
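The mechanics behind those backfill plots are simple in outline: before submitting, the pilot asks the batch system how many nodes are idle and for how long, then sizes its job to fit inside that hole. The sketch below illustrates the idea; the `showbf` output parsing and the thresholds are simplified assumptions, not the production BigPanDA pilot code.

```python
# Rough sketch of backfill-aware submission on a Moab/Torque system such as Titan:
# query idle nodes and the time window they stay idle, then size a job to fit.
import subprocess

def query_backfill():
    """Return (free_nodes, window_minutes) reported by `showbf`, or (0, 0)."""
    try:
        out = subprocess.run(["showbf"], capture_output=True, text=True).stdout
    except FileNotFoundError:
        return 0, 0
    for line in out.splitlines():
        fields = line.split()
        # Assumed simplified layout: <partition> <nodes> <duration_minutes> ...
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            return int(fields[1]), int(fields[2])
    return 0, 0

def size_submission(free_nodes, window_min, min_nodes=15, min_window=120):
    """Only submit if the idle hole is big enough for one simulation payload."""
    if free_nodes < min_nodes or window_min < min_window:
        return None
    # Leave a margin so the job finishes before the hole closes.
    return {"nodes": free_nodes, "walltime_min": window_min - 15}

if __name__ == "__main__":
    nodes, window = query_backfill()
    print(size_submission(nodes, window))
```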

The View from ATLAS (US Centric)
- Full Geant4 simulation: total CPU usage by ATLAS in the US for 2016 (Titan in dark green)
- Titan only runs G4 – other workflows not shown

And in 2017
- Full Geant4 simulation: total CPU usage by ATLAS in the US for 2017 (Titan in dark green)
- Titan only runs G4 – other workflows not shown

Working on Error Rates (2016 vs. 2017)

View from the Top of ATLAS
- Full Geant4 simulation: total CPU usage by ATLAS for 2017
- Includes all sites – not only US
- Titan only runs G4 – other workflows not shown

Very Good Last Week
- Titan: 4.6%
- Cori: 4.4%

What Can Be Done Better?
- Exploiting 1-2% of unused cycles on Titan provided about 5% of all ATLAS computing – the same as large US sites
  - Great for Titan – improved efficiency at almost no cost
  - Great for ATLAS – huge impact on physics results
- But Titan has a factor of 2-3 more unused cycles
  - ATLAS cannot use them yet – waiting for the Event Service (see the sketch after this list)
  - The new PanDA server at OLCF can be used by others
- Plan for the next 6-12 months
  - Continue ATLAS usage of Titan – LHC is taking data
  - Many of the improvements discussed yesterday (and later today)
  - Harvester on Titan
  - Yoda on Titan
  - AES (using OS) on Titan
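The Event Service matters for backfill because it decouples job length from payload length: work is handed out in small event ranges whose outputs are persisted as soon as they finish, so a backfill job can stop cleanly whenever its window closes. The toy sketch below illustrates that idea only; the function names and the in-memory "object store" are invented for illustration, not the ATLAS Event Service implementation.

```python
# Toy sketch of the event-service idea: process small event ranges and persist
# each one immediately, so a backfill job can be killed at any time without
# losing more than the range in flight.
import time

def simulate_range(first_event, n_events):
    """Stand-in for the simulation payload on one event range."""
    time.sleep(0.01)  # pretend to do work
    return f"output for events {first_event}-{first_event + n_events - 1}"

def run_until_window_closes(total_events, range_size, window_seconds):
    """Process event ranges until work runs out or the backfill window closes."""
    deadline = time.time() + window_seconds
    object_store = {}                      # stand-in for a real object store
    next_event = 0
    while next_event < total_events and time.time() < deadline:
        out = simulate_range(next_event, range_size)
        object_store[next_event] = out     # checkpoint this range's output
        next_event += range_size
    return next_event, object_store        # progress survives an early stop

if __name__ == "__main__":
    done, store = run_until_window_closes(total_events=1000, range_size=50,
                                          window_seconds=0.1)
    print(f"Completed {done} of 1000 events before the window closed; "
          f"{len(store)} ranges safely stored")
```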