PANDA: Networking Update
Kaushik De, Univ. of Texas at Arlington
SC15 Demo, November 18, 2015

Overview
- The PanDA workload management system was developed for the ATLAS experiment at the Large Hadron Collider
  - Hundreds of petabytes of data per year, thousands of users worldwide, many dozens of complex applications…
  - Leading to ~500 scientific publications
  - Discovery of the Higgs boson, search for dark matter…
- A new approach to distributed computing
  - A huge hierarchy of computing centers working together
  - Main challenge – how to provide efficient automated performance
  - Auxiliary challenge – make resources easily accessible to all users
- Network-aware data management is crucial for systems like PanDA to work efficiently

The ATLAS Experiment at the LHC
Largest data sensor in the world

ATLAS is Big Data
[Chart: "Big Data in 2013" (Wired, 4/2013), comparing annual data volumes: business emails sent ~3000 PB/year (not managed as a coherent data set), Facebook uploads ~180 PB/year, Google search ~100 PB, Kaiser Permanente ~30 PB, LHC data ~15 PB/year, YouTube ~15 PB/year, plus US Census, Library of Congress, Climate DB, Nasdaq]
- Current ATLAS data set, all data products: 160 PB
- 1M+ files transferred per day

Distributed Computing in ATLAS: Workload Management System

Paradigm Shift in HEP Computing
- New ideas from PanDA
  - Distributed resources are seamlessly integrated
  - All users have access to resources worldwide through a single submission system
  - Uniform fair share, priorities and policies allow efficient management of resources
  - Automation, error handling, and other features in PanDA improve the user experience
  - All users have access to the same resources
- Old HEP paradigm
  - Distributed resources are independent entities
  - Groups of users utilize specific resources (whether locally or remotely)
  - Fair shares, priorities and policies are managed locally, for each resource
  - Uneven user experience at different sites, based on local support and experience
  - Privileged users have access to special resources

PanDA Scale
- Current scale – 35M jobs completed every month at more than a hundred sites
- First exascale system in HEP – 1.2 exabytes processed in 2013

CPU Consumption
During LHC Run 1+2, per month, aggregated by federation

The Growing PanDA Ecosystem
- ATLAS PanDA core
  - US ATLAS, CERN, UK, DE, ND, CA, Russia, OSG…
- ASCR/HEP BigPanDA
  - DOE funded project at BNL, UTA – PanDA beyond HEP, at LCF
- CC-NIE ANSE PanDA
  - NSF funded network project – CalTech, Michigan, Vanderbilt, UTA
- HPC and Cloud PanDA – very active
- Taiwan PanDA – AMS and other communities
- Russian NRC KI PanDA, JINR PanDA – ATLAS, COMPASS, ALICE, NICA, biology…
- AliEn PanDA, LSST PanDA, other experiments

Resources Accessible via PanDA
About 200,000 job slots used continuously, 24x7x365

PanDA References
- PanDA – Production and Distributed Analysis System
- Deployed on WLCG infrastructure
- Standards based implementation
  - REST framework – HTTP/S
  - Oracle or MySQL backends
  - CondorG based pilot factories
- Python packages available from SVN and GitHub
- Command-line and GUI/Web interfaces (a minimal HTTP/S query sketch follows after this slide)
- Reference
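To illustrate the HTTP/S interface mentioned above, here is a minimal Python sketch of querying a PanDA-style server over HTTPS. The host name, endpoint path, and parameters are placeholders invented for illustration; they are not the actual PanDA server API.

# Minimal sketch of an HTTP/S query against a PanDA-style REST service.
# The host, path, and parameters are hypothetical placeholders; consult
# the PanDA documentation for the real endpoints.
import json
import urllib.parse
import urllib.request

PANDA_SERVER = "https://pandaserver.example.org:25443"   # placeholder host

def query_job_summary(status="running", limit=10):
    """Fetch a job summary for a given status (illustrative only)."""
    params = urllib.parse.urlencode({"status": status, "limit": limit})
    url = f"{PANDA_SERVER}/server/panda/jobSummary?{params}"   # hypothetical path
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode())

if __name__ == "__main__":
    # Requires a reachable server; with the placeholder host this only
    # demonstrates the call pattern.
    print(query_job_summary())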

PanDA Networking Projects
- DOE ASCR and HEP funded project "Next Generation Workload Management and Analysis System for Big Data"
  - Generalization of PanDA for HEP and other data-intensive sciences
  - Project participants from ANL, BNL, UT Arlington
  - WP3 (Leveraging intelligent networks): Integrating network services and real-time data access to the PanDA workflow
- ANSE project – funded by NSF CC*NIE
  - CalTech, Michigan, Vanderbilt, UT Arlington
  - Advanced Network Services for Experiments

WP3: PanDA and Networking
- PanDA as workload manager
  - PanDA automatically chooses the job execution site
  - Multi-level decision tree – task brokerage, job brokerage, dispatcher (a toy sketch follows after this slide)
  - Also predictive workflows – like PD2P (PanDA Dynamic Data Placement)
  - Site selection is based on processing and storage requirements
  - Why not use network information in this decision? Can we go even further – network provisioning?
  - Network knowledge is useful for all phases of the job cycle
- Network as a resource
  - Optimal site selection should take network capability into account
  - We do this already – but indirectly, using job completion metrics
  - Network as a resource should be managed (i.e. provisioning)
  - We also do this crudely – mostly through timeouts and self-throttling
- Goal for PanDA
  - Direct integration of networking with the PanDA workflow – never attempted before for large-scale automated WMS systems
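As a purely illustrative aid, the following Python sketch shows the shape of that multi-level decision tree and where a network cost term could enter at the job-brokerage step. All class, function, and field names, as well as the weighting numbers, are invented; this is not PanDA code.

# Toy sketch of the multi-level decision tree (task brokerage -> job
# brokerage -> dispatcher).  Names and weights are invented for
# illustration; the only point is where network metrics could enter.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    cloud: str
    free_slots: int
    has_input_data: bool
    net_cost: float          # e.g. derived from sonar/perfSONAR/FAX data

def task_brokerage(sites, clouds):
    """Assign the task to the cloud with the most free slots (illustrative)."""
    return max(clouds, key=lambda c: sum(s.free_slots for s in sites if s.cloud == c))

def job_brokerage(sites, cloud):
    """Within the chosen cloud, weight sites by CPU, data locality and network cost."""
    candidates = [s for s in sites if s.cloud == cloud]
    def weight(s):
        return s.free_slots + (100 if s.has_input_data else 0) - 10 * s.net_cost
    return max(candidates, key=weight)

def dispatcher(job_id, site):
    """Hand the activated job to a pilot at the chosen site (illustrative)."""
    print(f"job {job_id} dispatched to {site.name}")

if __name__ == "__main__":
    sites = [
        Site("T1_A", "cloud_A", 500, True, 0.0),
        Site("T2_B", "cloud_A", 2000, False, 0.3),
        Site("T2_C", "cloud_B", 800, False, 0.9),
    ]
    cloud = task_brokerage(sites, ["cloud_A", "cloud_B"])
    dispatcher(12345, job_brokerage(sites, cloud))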

Job Workflow
- PanDA jobs go through a succession of steps tracked in the central DB (transcribed as a short code sketch after this list):
  - Defined
  - Waiting
  - Assigned
  - Throttled
  - Activated
  - Sent
  - Starting
  - Running
  - Holding
  - Transferring
  - Finished/failed
  - Cancelled
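For convenience, the same states written as a Python enumeration. This is only a transcription of the list above (with "Finished/failed" split into two values), not the actual PanDA database schema.

# The job states listed above as a simple Python Enum.
from enum import Enum

class JobState(Enum):
    DEFINED = "defined"
    WAITING = "waiting"
    ASSIGNED = "assigned"
    THROTTLED = "throttled"
    ACTIVATED = "activated"
    SENT = "sent"
    STARTING = "starting"
    RUNNING = "running"
    HOLDING = "holding"
    TRANSFERRING = "transferring"
    FINISHED = "finished"
    FAILED = "failed"
    CANCELLED = "cancelled"

if __name__ == "__main__":
    print(" -> ".join(s.value for s in JobState))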

Using Network Information
- Pick a few use cases
  - Cases which are important to PanDA users
  - Enhance workload management through use of the network
  - Case 1: Improve the User Analysis workflow
  - Case 2: Improve the Tier 1 to Tier 2 workflow
- Step by step approach
  - Collect network information
  - Storage and access
  - Using network information
  - Using dynamic circuits

Sources of Network Information
- DDM Sonar measurements
  - Actual transfer rates for files between all sites (Tier 1 and Tier 2)
  - This information is normally used for site white/blacklisting
  - Measurements available for small, medium, and large files
- perfSONAR (PS) measurements
  - perfSONAR provides dedicated network monitoring data
  - All WLCG sites are being instrumented with PS boxes
  - US sites are already instrumented and monitored
- Federated XRootD (FAX) measurements
  - Read-time of remote files is measured for pairs of sites
- This is not an exclusive list – just a starting point (a sketch of combining these sources follows after this slide)
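These three sources can be thought of as per-site-pair records that a brokerage cost function combines. The sketch below is a minimal illustration of such a merge; the field names and weighting are assumptions, not the actual ATLAS data model.

# Illustrative merge of the three measurement sources into one cost per
# site pair.  Field names and weights are invented for illustration.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LinkMetrics:
    src: str
    dst: str
    sonar_mbps: Optional[float] = None          # DDM sonar transfer rate (large files)
    ps_throughput_mbps: Optional[float] = None  # perfSONAR measured throughput
    fax_read_ms: Optional[float] = None         # FAX remote read time

def transfer_cost(m: LinkMetrics) -> float:
    """Lower is better.  Prefer measured throughput; fall back to FAX read time."""
    rate = m.sonar_mbps or m.ps_throughput_mbps
    if rate:
        return 1000.0 / rate
    if m.fax_read_ms:
        return m.fax_read_ms / 100.0
    return float("inf")   # no data: treat the link as unusable for overflow

if __name__ == "__main__":
    link = LinkMetrics("BNL-ATLAS", "MWT2", sonar_mbps=800.0, fax_read_ms=120.0)
    print(transfer_cost(link))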

Network Data Repositories
- Native data repositories
  - Historical data stored from collectors
  - SSB – site status board for sonar and perfSONAR data
  - FAX data is kept independently and uploaded
- AGIS (ATLAS Grid Information System)
  - Most recent / processed data only – updated periodically
  - Mixture of push/pull – depending on source of data
- schedConfigDB
  - Internal Oracle DB used by PanDA for fast access
  - Uses standard ATLAS collector (a toy collector sketch follows after this slide)
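Purely as an illustration of that flow (periodic collection of processed metrics into a fast local cache), here is a small Python sketch. The source URL, JSON shape, and table layout are assumptions; PanDA itself uses an Oracle-backed schedConfigDB fed by the standard ATLAS collector.

# Sketch of a collector that periodically caches processed network metrics
# in a local SQLite table for fast lookups, mirroring the AGIS ->
# schedConfigDB flow described above.  URL and table layout are hypothetical.
import json
import sqlite3
import urllib.request

AGIS_URL = "https://agis.example.org/network_metrics?json"   # placeholder endpoint

def collect(db_path="netmetrics_cache.db"):
    with urllib.request.urlopen(AGIS_URL, timeout=60) as resp:
        # Assumed payload: [{"src": ..., "dst": ..., "mbps": ...}, ...]
        metrics = json.loads(resp.read().decode())
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS net_metrics "
        "(src TEXT, dst TEXT, mbps REAL, PRIMARY KEY (src, dst))"
    )
    con.executemany(
        "INSERT OR REPLACE INTO net_metrics (src, dst, mbps) VALUES (:src, :dst, :mbps)",
        metrics,
    )
    con.commit()
    con.close()

# collect() would run from a cron job or daemon loop; the brokerage then
# reads net_metrics locally instead of querying the remote service.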


Case 1: Faster User Analysis
- First use case for network integration with PanDA
- Goal – reduce waiting time for user jobs
  - User analysis jobs normally go to sites with local input data
  - This can occasionally lead to long wait times (jobs are re-brokered if possible, or PD2P data caching will eventually make more copies to reduce congestion)
  - While nearby sites with good network access may be idle
- Brokerage uses the concept of 'nearby' sites
  - Use a cost metric generated with Hammercloud tests
  - Calculate a weight based on the usual brokerage criteria (availability of CPU resources, data location, release…) plus the new network transfer cost (see the sketch after this slide)
  - Jobs will be sent to the site with the best overall weight
  - Throttling is used to manage load on the network
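A minimal sketch of that weighting follows, assuming an invented formula and throttle threshold; the real brokerage uses Hammercloud-derived cost metrics and many more criteria.

# Illustrative brokerage weight for Case 1: usual criteria (free CPU,
# data locality, release availability) plus a network transfer cost,
# with a simple throttle on the number of overflow (remote-read) jobs.
# The formula and the threshold are invented for illustration.
from dataclasses import dataclass

MAX_OVERFLOW_JOBS = 200        # throttle to manage load on the network

@dataclass
class SiteInfo:
    name: str
    free_slots: int
    has_release: bool
    has_input_data: bool
    network_cost: float        # from a Hammercloud-style cost metric, lower is better
    overflow_jobs: int         # jobs currently reading input remotely

def brokerage_weight(site: SiteInfo) -> float:
    if not site.has_release:
        return float("-inf")                  # cannot run the task at all
    w = float(site.free_slots)
    if site.has_input_data:
        w += 200.0                            # preference for local data
    elif site.overflow_jobs >= MAX_OVERFLOW_JOBS:
        return float("-inf")                  # throttled: no more remote reads
    else:
        w -= 50.0 * site.network_cost         # penalize slow network paths
    return w

if __name__ == "__main__":
    sites = [
        SiteInfo("BNL-ATLAS", free_slots=10,  has_release=True, has_input_data=True,  network_cost=0.0, overflow_jobs=0),
        SiteInfo("MWT2",      free_slots=900, has_release=True, has_input_data=False, network_cost=0.2, overflow_jobs=40),
        SiteInfo("SLAC",      free_slots=700, has_release=True, has_input_data=False, network_cost=1.5, overflow_jobs=40),
    ]
    best = max(sites, key=brokerage_weight)
    print("brokered to", best.name)   # overflow to the well-connected Tier 2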


Measure of Success
- Many sites used for overflow
- Failure rate is manageable

First Tests
- Tested in production for ~1 day in March 2014
  - Useful for debugging and tuning the direct access infrastructure
  - We got first results on network-aware brokerage
- Job distribution
  - 4748 jobs from 20 user tasks which required data from a congested U.S. Tier 1 site were automatically brokered to U.S. Tier 1/2 sites

Brokerage Results

Early Example from October 2014
How do we measure success? Let's look at an example task.

Jobs from the task on Oct 3

Job wait times for the example task

Conclusions for Case 1
- Network data collection working well
  - Additional algorithms to combine network data will be tried
  - HC tests working well – PS data not robust yet
- PanDA brokerage worked well
  - Achieved goal of reducing wait time – though anecdotally
  - Well balanced local vs remote access
  - Need fine tuning – we have a lot of data now
- We have overflow working for FAX
  - We need deeper study to optimize

Case 2: Cloud Selection
- Second use case for network integration with PanDA
- Optimize the choice of T1-T2 pairings (cloud selection)
  - In ATLAS, production tasks are assigned to Tier 1's
  - Tier 2's are attached to a Tier 1 cloud for data processing
  - Any T2 may be attached to multiple T1's
- Currently, the operations team makes this assignment manually
  - This could/should be automated using network information
  - For example, each T2 could be assigned to a native cloud by the operations team, and PanDA would assign it to other clouds based on network performance metrics (see the sketch after this slide)
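A minimal sketch of such an automated assignment, assuming sonar throughput measurements per T2-T1 pair; the cutoff, cloud names, and numbers are invented for illustration.

# Illustrative automation of Case 2: keep the T2's native cloud (assigned
# by the operations team) and attach additional T1 clouds ranked by
# measured sonar throughput.  Numbers and the cutoff are invented.

MIN_MBPS = 200.0   # only attach clouds with a reasonably fast path (assumed cutoff)

def select_clouds(tier2, native_cloud, sonar_mbps, max_extra=2):
    """sonar_mbps maps T1 cloud name -> measured throughput from `tier2`."""
    extra = sorted(
        (c for c, rate in sonar_mbps.items() if c != native_cloud and rate >= MIN_MBPS),
        key=lambda c: sonar_mbps[c],
        reverse=True,
    )[:max_extra]
    return [native_cloud] + extra

if __name__ == "__main__":
    rates = {"US": 950.0, "DE": 420.0, "FR": 310.0, "TW": 90.0}
    print(select_clouds("MWT2", native_cloud="US", sonar_mbps=rates))
    # -> ['US', 'DE', 'FR']  (TW excluded: path too slow)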

DDM Sonar Data

Tier 1 View

Tier 2 View

Improving Site Association

Conclusion for Case 2
- Worked well as a demonstrator
- Migrating to the concept of a "World Cloud"
  - Any site can potentially be a "Nucleus" for data aggregation
  - Any other site can provide processing
  - Removes the Tier 1/Tier 2 hierarchy
- Network integration is crucial for this new implementation

Operationalizing perfSONAR
- WLCG has formed a working group for the LHC experiments
  - Network and Transfer Metrics WG – chaired by Marian Babik & Shawn McKee
- Mandate
  - Ensure all relevant network and transfer metrics are identified, collected and published
  - Ensure sites and experiments can better understand and fix networking issues
  - Enable use of network-aware tools to improve transfer efficiency and optimize experiment workflows
- Report 05/11/2015
  - perfSONAR collector, datastore, publisher and dashboard now in production (stable operations)

From Marian Babik

From Marian Babik

From Marian Babik

Conclusion
- Rich area for future R&D
- Moving from a "timeouts" to a "metrics" based approach
- Direct integration of network information in decision making
- Next step – direct interaction with network elements

Live Pages
- job.cern.ch/dashboard/request.py/dailysummary
- T2SitesPerCloud/