PanDA & BigPanDA Kaushik De Univ. of Texas at Arlington BigPanDA Workshop, CERN October 21, 2013.

Slides:



Advertisements
Similar presentations
 Contributing >30% of throughput to ATLAS and CMS in Worldwide LHC Computing Grid  Reliant on production and advanced networking from ESNET, LHCNET and.
Advertisements

Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES Feasibility Study on a Common Analysis Framework for ATLAS & CMS.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Integrating Network Awareness in ATLAS Distributed Computing Using the ANSE Project J.Batista, K.De, A.Klimentov, S.McKee, A.Petroysan for the ATLAS Collaboration.
NGNS Program Managers Richard Carlson Thomas Ndousse ASCAC meeting 11/21/2014 Next Generation Networking for Science Program Update.
Advanced Data Mining and Integration Research for Europe ADMIRE – Framework 7 ICT ADMIRE Overview European Commission 7 th.
April 2009 OSG Grid School - RDU 1 Open Science Grid John McGee – Renaissance Computing Institute University of North Carolina, Chapel.
MultiJob PanDA Pilot Oleynik Danila 28/05/2015. Overview Initial PanDA pilot concept & HPC Motivation PanDA Pilot workflow at nutshell MultiJob Pilot.
The PanDA Distributed Production and Analysis System Torre Wenaus Brookhaven National Laboratory, USA ISGC 2008 Taipei, Taiwan April 9, 2008 Torre Wenaus.
LHC Experiment Dashboard Main areas covered by the Experiment Dashboard: Data processing monitoring (job monitoring) Data transfer monitoring Site/service.
The Panda System Mark Sosebee (for K. De) University of Texas at Arlington dosar workshop March 30, 2006.
Welcome to HTCondor Week #14 (year #29 for our project)
Assessment of Core Services provided to USLHC by OSG.
Russian “MegaProject” ATLAS SW&C week Plenary session : status, problems and plans Feb 24, 2014 Alexei Klimentov Brookhaven National Laboratory.
UTA Site Report Jae Yu UTA Site Report 4 th DOSAR Workshop Iowa State University Apr. 5 – 6, 2007 Jae Yu Univ. of Texas, Arlington.
UTA Site Report Jae Yu UTA Site Report 2 nd DOSAR Workshop UTA Mar. 30 – Mar. 31, 2006 Jae Yu Univ. of Texas, Arlington.
ATLAS Metrics for CCRC’08 Database Milestones WLCG CCRC'08 Post-Mortem Workshop CERN, Geneva, Switzerland June 12-13, 2008 Alexandre Vaniachine.
K. De UTA Grid Workshop April 2002 U.S. ATLAS Grid Testbed Workshop at UTA Introduction and Goals Kaushik De University of Texas at Arlington.
PanDA Multi-User Pilot Jobs Maxim Potekhin Brookhaven National Laboratory Open Science Grid WLCG GDB Meeting CERN March 11, 2009.
PanDA A New Paradigm for Computing in HEP Kaushik De Univ. of Texas at Arlington NRC KI, Moscow January 29, 2015.
PanDA: Exascale Federation of Resources for the ATLAS Experiment
PanDA Update Kaushik De Univ. of Texas at Arlington XRootD Workshop, UCSD January 27, 2015.
Ian Bird LHC Computing Grid Project Leader LHC Grid Fest 3 rd October 2008 A worldwide collaboration.
Event Service Intro, plans, issues and objectives for today Torre Wenaus BNL US ATLAS S&C/PS Workshop Aug 21, 2014.
Overview of ASCR “Big PanDA” Project Alexei Klimentov Brookhaven National Laboratory September 4, 2013, Arlington, TX PanDA UTA.
A PanDA Backend for the Ganga Analysis Interface J. Elmsheuser 1, D. Liko 2, T. Maeno 3, P. Nilsson 4, D.C. Vanderster 5, T. Wenaus 3, R. Walker 1 1: Ludwig-Maximilians-Universität.
MultiJob pilot on Titan. ATLAS workloads on Titan Danila Oleynik (UTA), Sergey Panitkin (BNL) US ATLAS HPC. Technical meeting 18 September 2015.
Network awareness and network as a resource (and its integration with WMS) Artem Petrosyan (University of Texas at Arlington) BigPanDA Workshop, CERN,
EGEE-III INFSO-RI Enabling Grids for E-sciencE Ricardo Rocha CERN (IT/GS) EGEE’08, September 2008, Istanbul, TURKEY Experiment.
PD2P The DA Perspective Kaushik De Univ. of Texas at Arlington S&C Week, CERN Nov 30, 2010.
Status Organization Overview of Program of Work Education, Training It’s the People who make it happen & make it Work.
PanDA Status Report Kaushik De Univ. of Texas at Arlington ANSE Meeting, Nashville May 13, 2014.
Data Management: US Focus Kaushik De, Armen Vartapetian Univ. of Texas at Arlington US ATLAS Facility, SLAC Apr 7, 2014.
Construction of Computational Segment at TSU HEPI Erekle Magradze Zurab Modebadze.
PANDA: Networking Update Kaushik De Univ. of Texas at Arlington SC15 Demo November 18, 2015.
HEP and NP SciDAC projects: Key ideas presented in the SciDAC II white papers Robert D. Ryne.
SDN Provisioning, next steps after ANSE Kaushik De Univ. of Texas at Arlington US ATLAS Planning, CERN June 29, 2015.
MND review. Main directions of work  Development and support of the Experiment Dashboard Applications - Data management monitoring - Job processing monitoring.
OSG Report for DOE/NSF Joint Oversight Group U.S. Large Hadron Collider Program OSG Report for DOE/NSF Joint Oversight Group U.S. Large Hadron Collider.
Shifters Jamboree Kaushik De ADC Jamboree, CERN December 4, 2014.
Distributed Analysis Tutorial Dietrich Liko. Overview  Three grid flavors in ATLAS EGEE OSG Nordugrid  Distributed Analysis Activities GANGA/LCG PANDA/OSG.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
Future of Distributed Production in US Facilities Kaushik De Univ. of Texas at Arlington US ATLAS Distributed Facility Workshop, Santa Cruz November 13,
UTA Site Report Jae Yu UTA Site Report 7 th DOSAR Workshop Louisiana State University Apr. 2 – 3, 2009 Jae Yu Univ. of Texas, Arlington.
PanDA & Networking Kaushik De Univ. of Texas at Arlington ANSE Workshop, CalTech May 6, 2013.
WLCG Status Report Ian Bird Austrian Tier 2 Workshop 22 nd June, 2010.
Danila Oleynik (On behalf of ATLAS collaboration)
Breaking the frontiers of the Grid R. Graciani EGI TF 2012.
1 Open Science Grid: Project Statement & Vision Transform compute and data intensive science through a cross- domain self-managed national distributed.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES The Common Solutions Strategy of the Experiment Support group.
CMS Experience with the Common Analysis Framework I. Fisk & M. Girone Experience in CMS with the Common Analysis Framework Ian Fisk & Maria Girone 1.
Building on virtualization capabilities for ExTENCI Carol Song and Preston Smith Rosen Center for Advanced Computing Purdue University ExTENCI Kickoff.
Production System 2 manpower and funding issues Alexei Klimentov Brookhaven National Laboratory Aug 19, 2013 Production System Technical Meeting CERN.
Computing infrastructures for the LHC: current status and challenges of the High Luminosity LHC future Worldwide LHC Computing Grid (WLCG): Distributed.
BigPanDA Status Kaushik De Univ. of Texas at Arlington Alexei Klimentov Brookhaven National Laboratory OSG AHM, Clemson University March 14, 2016.
PanDA & Networking Kaushik De Univ. of Texas at Arlington UM July 31, 2013.
BigPanDA US ATLAS SW&C Technical Meeting August 1, 2016 Alexei Klimentov Brookhaven National Laboratory.
Scientific Data Processing Portal and Heterogeneous Computing Resources at NRC “Kurchatov Institute” V. Aulov, D. Drizhuk, A. Klimentov, R. Mashinistov,
Ian Bird, CERN WLCG Project Leader Amsterdam, 24 th January 2012.
BigPanDA Workflow Management on Titan
Joslynn Lee – Data Science Educator
PanDA setup at ORNL Sergey Panitkin, Alexei Klimentov BNL
HEP Computing Tools for Brain Studies
Readiness of ATLAS Computing - A personal view
BigPanDA WMS for Brain Studies
Univ. of Texas at Arlington BigPanDA Workshop, ORNL
U.S. ATLAS Testbed Status Report
A Possible OLCF Operational Model for HEP (2019+)
Welcome to (HT)Condor Week #19 (year 34 of our project)
Workflow Management Software For Tomorrow
Presentation transcript:

PanDA & BigPanDA Kaushik De Univ. of Texas at Arlington BigPanDA Workshop, CERN October 21, 2013

Introduction  PanDA genesis  WMS (Workload Management System) for ATLAS applications running on distributed computing resources for ~8 years  New computing paradigm in HEP  Provides access to WLCG sites with high efficiency for Central Production, Group Production, and User Production and Analysis  Extraordinary PanDA scalability at the LHC  PanDA evolution  Success in ATLAS has sparked interest among other communities  Any organization with large data volumes, distributed resources and a large user base can benefit from PanDA WMS  During past year new PanDA collaborators, and new sources of funding have led to the question – what is the future of PanDA? October 21, 2013Kaushik De

October 21, 2013Kaushik De PanDA can manage complex workflows for: PanDA can manage complex workflows for: Large data volume – hundreds of petabytes Distributed resources – hundreds of computing centers worldwide Collaborative work – thousands of scientific users PanDA is built on two key concepts: PanDA is built on two key concepts: Central job queue for distributed resources Pilot based job execution environment What can PanDA do?

Jobs/week Completed by PanDA October 21, 2013Kaushik De Current scale – 25 million jobs completed monthly at ~hundred sites

Recent Meetings on PanDA Evolution  A series of recent meetings have been held to define the scope of future directions for PanDA -> BigPanDA  Workshop at BNL June 4-6,  Workshop at UTA September 3-4,  Focused meetings at ORNL, LBNL (EsNet) and others  This meeting at CERN October 21, 2013Kaushik De

Scope of BigPanDA  What is BigPanDA?  Is it improved PanDA + ProdSys2 after LS1 shutdown?  Is it PanDA for CMS/AMS?  Is it DoE funding for BigData usage of PanDA WMS?  Is it PanDA integration with networking?  Is it PanDA moving to HPC and other platforms?  Is it PanDA for non-HEP sciences (eg. BioInfomatics)?  Is it something else?  BigPanDA is all of the above – not just PanDA for BigData!  At this meeting we will hear from new stakeholders in PanDA – the people who are at the forefront of BigPanDA  We will also hear about complimentary visions October 21, 2013Kaushik De

The Growing PanDA Ecosystem  ATLAS PanDA  US ATLAS, CERN, UK, DE, NorduGrid, Dubna, Protvino, OSG …  ASCR BigPanDA  DoE funded project at BNL, UTA – 3 years  ANSE PanDA  NSF funded network project - CalTech, Michigan, Vanderbilt, UTA  HPC and Cloud PanDA  Taiwan PanDA  CMS PanDA  AliEn PanDA  (MegaPanDA) … October 21, 2013Kaushik De

October 21, 2013Kaushik De “Next Generation Workload Management and Analysis System for Big Data” “Next Generation Workload Management and Analysis System for Big Data” 3 year DoE ASCR funding 3 year DoE ASCR funding Lead Institution: Brookhaven National Laboratory Lead PI: Alexei Klimentov Principal Investigators: Brookhaven National Laboratory: Alexei Klimentov, Sergei Panitkin, Torre Wenaus, Dantong Yu Argonne National Laboratory: Alexandre Vaniachine The University of Texas at Arlington: Kaushik De, Gergely Zaruba New hires New hires Jaroslava BNL, Danila UTA, Artem UTA, Mikhaill UTA DoE ASCR Project

ASCR BigPanDA Work Plan October 21, 2013Kaushik De  WP1 (Factorizing the core):  Factorizing the core components of PanDA to enable adoption by a wide range of exascale scientific communities  WP2 (Extending the scope):  Evolving PanDA to support extreme scale computing clouds and Leadership Computing Facilities  WP3 (Leveraging intelligent networks):  Integrating network services and real-time data access to the PanDA workflow  WP4 (Usability and monitoring):  Real time monitoring and visualization package for PanDA

WP1 – Factorizing the Core  PanDA server factorizations  Modular structure to support multiple organizations  Ease of packaging new releases  Work ongoing  PanDA pilot factorizations  Started >one year ago, in collaboration with CERN IT  Useful for Common Analysis Framework development  DB backend factorizations  Oracle DB backend is used for ATLAS  MySQL backend now running on EC2 PanDA server October 21, 2013Kaushik De

October 21, 2013Kaushik De From Tadashi Maeno

October 21, 2013Kaushik De

October 21, 2013Kaushik De

WP2 – Extending PanDA to HPC  Extensive progress during past 6 months  First demonstration: running ATLAS jobs on Google (GCE)  Second demonstration: Joint project with ORNL – Oakridge Leadership Computing Facility  No show stoppers, basic functions working  Expect ATLAS payload running soon at Titan October 21, 2013Kaushik De

October 21, 2013Kaushik De From Sergey Panitkin

October 21, 2013Kaushik De From Ken Read

October 21, 2013Kaushik De From Ken Read

HPC Testing at ORNL LCF PanDA UTA18 From Danila Oleynik

WP3 – Intelligent Networking  Much recent progress – Armen’s talk later in the morning  Network data automatically imported to PanDA  Code for initial use cases ready for testing  Meeting with ESNET two weeks ago  Action items being prepared for joint activities  ASCR and ANSE projects are working together  Rich community expertise  Network Element (NE) will soon be an equal partner to CE and SE October 21, 2013Kaushik De

WP4 - Monitoring  New framework for rapid development of new monitoring  Common code for PanDA, ProdSys2  Easy packaging and deployment for other experiments  Newest area of development – pages coming soon October 21, 2013Kaushik De

Synergies  ATLAS PanDA  Core PanDA effort, primarily for ATLAS, but benefits the entire PanDA ecosystem  ASCR  While ASCR funding is for use beyond ATLAS, clear benefit to ATLAS in core development, sustainability, networking  ASCR team works closely with ATLAS PanDA team  ANSE  Networking – 50% of benefits/directed to ATLAS  CERN/European/Asian PanDA efforts  A lot of new developments, new partners October 21, 2013Kaushik De

Conclusion  Baby PanDA is growing up to become BigPanDA October 21, 2013Kaushik De