A PanDA Backend for the Ganga Analysis Interface
J. Elmsheuser (1), D. Liko (2), T. Maeno (3), P. Nilsson (4), D.C. Vanderster (5), T. Wenaus (3), R. Walker (1)
1: Ludwig-Maximilians-Universität München, 2: Vienna Academy of Science, 3: Brookhaven National Laboratory, 4: University of Texas Arlington, 5: CERN

Abstract

Ganga provides a uniform interface for running ATLAS user analyses on a number of local, batch, and grid backends. PanDA is a pilot-based production and distributed analysis system developed and used extensively by ATLAS. This work presents the implementation and usage experience of a PanDA backend for Ganga. Built upon reusable application libraries from GangaAtlas and PanDA, the Ganga PanDA backend allows users to run their analyses on the worldwide PanDA resources, while retaining the ability to develop simple or complex analysis workflows in Ganga. Further, the backend allows users to submit and manage "personal" PanDA pilots: these pilots run under the user's own grid certificate, providing a secure alternative to shared pilot certificates while enabling the use of local resource allocations.

1. Introduction and Overview

Distributed analysis tools are being developed to enable physicists to exploit the vast distributed computing resources known as grids. The distributed analysis workflows presently used in ATLAS are depicted in Figure 1 (dashed lines indicate connections which are in progress or under development). Two front-end tools are available to users, namely Ganga [1] and the PanDA Client [2-3] (i.e. pathena, prun, etc.); because each of these tools has a comparable user community, ATLAS has decided to continue supporting both front-ends. These tools can access the resources of three grids: the Open Science Grid [4], the EGEE Grid [5], and NorduGrid [6].

Figure 1: ATLAS distributed analysis workflows.

As Figure 1 shows, users are presently required to use both end-user tools if they need to access all of the ATLAS grids. With the ATLAS decision to move toward PanDA-based analysis on all grids, it is important to implement a full-featured PanDA backend for Ganga. This poster presents the design and implementation of the GangaPanda module in Ganga.

2. Ganga and PanDA Overviews

Ganga [1] is a front-end user tool, jointly developed by LHCb and ATLAS, which allows users to run analyses on arbitrary resources (e.g. locally, on a batch system, or on the grid). It is an open-source project and is used and extended by many communities outside high-energy physics. The Ganga philosophy is "configure once, run anywhere": this model allows users to incrementally scale up their analyses from test jobs on a laptop to full-scale analysis on the grid. Figure 2 depicts the key components of a Ganga job; ATLAS has developed modules that enable Athena analyses of ATLAS datasets on all three of the ATLAS grids.

Figure 2: A Ganga job is composed of six key modules.

The Production ANd Distributed Analysis (PanDA) [2] system is a workload management system developed to meet the needs of ATLAS distributed computing. The pilot-based system (depicted in Figure 4) has proven very successful in managing ATLAS distributed production, and is now being extended to manage distributed analysis on all three ATLAS grids.

3. A PanDA Backend for Ganga

The workflow of the Athena-Panda runtime (RT) handler is depicted in Figure 3. To reduce duplication of effort and to ease user support, code from the PanDA Client and GangaAtlas has been reused to a significant extent. Specifically, the RT handler reuses GangaAtlas components to process the Ganga job parameters and to package the user area for the grid, while the Athena environment and I/O handling are offloaded to the AthenaUtils library distributed with the PanDA Client. The eventual goal is for AthenaUtils to become a common library, shared by pathena and Ganga, that handles all phases of preparing Athena jobs to run on the grid.

Figure 3: Workflow of the Athena-Panda runtime handler, showing the reuse of GangaAtlas and PandaClient components.

Starting with Ganga version 5.0.0, users have been able to submit Athena analyses to the PanDA system through Ganga. In most cases, Ganga jobs configured to run with j.backend=LCG() or j.backend=NG() can be trivially modified to run on the PanDA system by setting j.backend=Panda(), as in the sketch below.
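The following minimal sketch, entered at the Ganga prompt, illustrates this retargeting. It is an illustrative assumption, not code from the poster: the job-options file, dataset name, and site are hypothetical placeholders, and the GangaAtlas attribute names (option_file, DQ2Dataset, prepare) should be checked against the installed Ganga version.

# Minimal sketch: retargeting a Ganga Athena analysis job from LCG to PanDA.
# All names below are hypothetical placeholders.
j = Job()
j.application = Athena()
j.application.option_file = 'AnalysisSkeleton_topOptions.py'  # example job options
j.application.prepare()                       # package the local Athena setup
j.inputdata = DQ2Dataset()
j.inputdata.dataset = 'user.example.AOD.v1'   # hypothetical input dataset
# j.backend = LCG()                           # the original EGEE backend
j.backend = Panda()                           # the one-line change to use PanDA
j.backend.site = 'ANALY_BNL_ATLAS_1'          # hypothetical PanDA analysis site
j.submit()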
4. Personal Pilots

One issue delaying the deployment of PanDA on the EGEE and NorduGrid sites is the potential for security problems when pilots run under any grid certificate other than that of the user whose job they execute. gLExec [7] is the proposed solution to this problem, but delays in its deployment have necessitated alternatives; the personal pilot application distributed with Ganga is one such alternative.

Figure 4 shows the workflow of personal pilots. When a user submits a job to the PanDA server, the job waits until a pilot retrieves it from the database; with personal pilots, the pilot is sent from Ganga via the LCG backend to the worker node. This provides a secure alternative to running pilots under the production role. The drawback of the approach is that it limits the fair-share capabilities of the system.

Figure 4: The PanDA pilot model. With GangaPanda, "personal pilots" can be sent to worker nodes on other grids or clouds via Ganga.

The example in Figure 5 presents the user interface to Ganga personal pilots. After submitting an Athena job to ANALY_FZK (a site which does not receive pilots from the pilot factories), the user submits a PandaPilot job to the same queue:

myjob = Job()
myjob.application = Athena()
myjob.backend = Panda()
myjob.backend.site = 'ANALY_FZK'
myjob.submit()

pilot = Job()
pilot.application = PandaPilot()
pilot.application.queue = 'ANALY_FZK'
pilot.backend = LCG()
pilot.submit()

Figure 5: Personal pilot code example. Users can write more elaborate scripts to send the required pilots to a queue automatically (see the sketch after the references).

5. Conclusion and Future Work

The functionality of GangaPanda has been developed and is in production for Athena analyses. GangaPanda is being improved to handle the other Ganga application types, namely AthenaMC (for private Monte Carlo production), ROOT (for ROOT analyses), and Executable (for all other jobs).

6. References

[1] Ganga:
[2] PanDA:
[3] PanDA Client:
[4] Open Science Grid:
[5] EGEE:
[6] NorduGrid:
[7] gLExec:
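Addendum: as noted in the Figure 5 caption, the sketch below shows one way such a "more elaborate script" could automate personal pilot submission. It is a hypothetical illustration written against the Ganga interface shown above, not code from the GangaPanda distribution; the helper name and the pilot count are arbitrary choices.

# Hypothetical helper: submit several personal pilots to a PanDA queue.
def send_pilots(queue, n_pilots):
    for i in range(n_pilots):
        pilot = Job()
        pilot.name = 'pilot_%s_%d' % (queue, i)   # label pilots for bookkeeping
        pilot.application = PandaPilot()
        pilot.application.queue = queue
        pilot.backend = LCG()                     # pilots travel via the LCG backend
        pilot.submit()

# Submit an Athena analysis job to ANALY_FZK, then send three pilots
# to serve it (and any other jobs waiting on that queue).
myjob = Job()
myjob.application = Athena()
myjob.backend = Panda()
myjob.backend.site = 'ANALY_FZK'
myjob.submit()
send_pilots('ANALY_FZK', 3)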