Introduction to Distributed Analysis Dietrich Liko
Overview: Introduction to Grid computing; the three grid flavors in ATLAS: EGEE, OSG, Nordugrid; Distributed Analysis activities: GANGA/LCG, PANDA/OSG, other tools; How to find your data? Where is the data stored? Which data is really available?
Evolution of CERN computing: 1958: Ferranti Mercury (2 years to build, 3 months to install, 320 kBytes of storage, less computing power than today's calculators); 1967: CDC 6400; 1976: IBM 370/168; 1988: IBM 3090, DEC VAX, Cray X-MP; 2001: PC farm. The scope and complexity of particle-physics experiments have increased in parallel with increases in computing power, with a massive upsurge in computing requirements in going from LEP to LHC.
Strategy for processing LHC data: The majority of data processing (reconstruction/simulation/analysis) for the LEP experiments was performed at CERN, with about 50% of physics analyses run at collaborating institutes. A similar approach might have been possible for LHC: increase data-processing capacity at CERN and take advantage of the Moore's-Law increase in CPU power and storage. The LHC Computing Review (CERN/LHCC/2001-004) discouraged a LEP-type approach: it rules out access to funding not available to CERN and makes poor use of expertise and resources at collaborating institutes. A solution for managing distributed data and CPUs is required: Grid computing. The project for the LHC Computing Grid (LCG) started in 2002.
Grid Computing: The ideas behind Grid computing have been around since the 1970s, but became very fashionable around the turn of the century. "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." (Ian Foster and Carl Kesselman, The Grid: Blueprint for a New Computing Infrastructure, 1998). The first release of the Globus Toolkit for Grid infrastructures was made in 1998. The World Wide Web became commercially attractive by the late 1990s, and e-Everything was suddenly in vogue: e-mail, e-Commerce, e-Science (dot-com bubble 1998-2002). The Grid was proposed as an evolution of the World Wide Web: access to resources as well as to information. Many projects: EGEE, OSG, Nordugrid, GridPP, INFN Grid, D-Grid.
Distributed Analysis activities: Data Analysis (AOD & ESD analysis, TAG-based analysis) with pathena/PANDA and GANGA/LCG; User Production with Prodsys, LJSF and GANGA (DQ2 integration).
EGEE: Job submission via the LCG Resource Broker, LFC file catalog; the new gLite RB is on its way. CondorG submission is also possible, but requires some expertise and has no support from the service provider. A new approach using Condor glideins is under investigation (Cronus).
Resource Broker Model (diagram: jobs submitted through Resource Brokers, which dispatch them to the Computing Elements).
OSG/PANDA: PANDA is an integrated production and distributed analysis system. It is pilot-job based, similar to DIRAC and AliEn, and uses simple file catalogs at the sites. It will be supported by GANGA in release 4.3.
Three grids: ATLAS is using three large infrastructures: EGEE, OSG and Nordugrid. The grids have different middleware, different software to submit jobs and different catalogs to store the data. We have to aim to hide these differences from the ATLAS user.
PANDA Model (diagram: a central task queue feeding pilot jobs on the Computing Elements).
Nordugrid: ARC middleware for job submission, RLS file catalog; powerful and simple. It will be supported by GANGA in release 4.3.
ARC Model (diagram: job submission to the Computing Elements).
How can we live with that? A data management layer hides these differences (Don Quixote 2), and tools aim to hide the difficulties of submitting jobs: pathena/PANDA on OSG, GANGA on LCG. In the future, better interoperability at the level of the ATLAS tools and at the level of the middleware.
pathena/PANDA: Lightweight client, integrated into the Athena release; very nice work. A lot of work has been done to better support user jobs: short queues, multitasking pilots, etc. A large set of data is available, and the system has been available for some time.
GANGA/LCG: Text UI & GUI; a pathena-like interface is available. Multiple backends: LCG/EGEE, LSF (works also with CAT queues), PBS, PANDA and Nordugrid for 4.3, and others.
Dashboard Monitoring: We are setting up a framework to monitor distributed analysis jobs, based on MonALISA (OSG, LCG), R-GMA, the Imperial College DB and the production system: http://dashboard.cern.ch/atlas. GANGA has been instrumented to understand its usage.
Since September 1st …
Dataset distribution: In principle data should be everywhere; AOD & ESD during this year is ~30 TB at most. Three steps: not all data can be consolidated (other grids, Tier-2s); distribution between the Tier-1s is not yet perfect; distribution to the Tier-2s can only be the next step.
Latest numbers by Alexei – Feb 27:
Site     Files requested  Files copied  Copied (%)  Waiting(*)  Transferred in 7 days
ASGC     5604             1883          33.6        53          1883
BNL      1891             1532          81.0         5            24
CERN     5587             5489          98.2         1          2581
CNAF     5610             2801          49.9        12          1111
FZK      5645             5541          98.2         0          2668
LYON     5529             5464          98.8         0          2643
NDGF     4822             3116          64.6        10           893
NIKHEF   5700             5471          96.0         1          2563
PIC      5787             2362          40.8        32          2617
RAL      5763             3903          67.7        12            30
TRIUMF   5744             3740          65.1        13           843
The mileage varies between 33.6% and 98.8%.
Monitoring of transfers
Why can I not send the jobs to the data automatically? I will advise you to send jobs to selected sites. This is not the final word, it is just a way to address the current situation. ATLAS is using a dataset concept: datasets have a content, datasets have one or more locations, and datasets can be complete or incomplete at a location. Only complete datasets can be used in a dataset-based brokering process (sketched below). We are currently trying to understand how much data is available as complete datasets and whether we can do file-based brokering for incomplete datasets. We have made big progress in the last months, but not everything is working as we would like yet.
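As an illustration only (not actual DQ2 or GANGA code; the helper function and its input are hypothetical), the brokering distinction described above can be sketched in plain Python:

    # Hypothetical sketch of dataset- vs file-based brokering; not DQ2/GANGA code.
    def candidate_sites(dataset_locations):
        """dataset_locations maps site name -> 'complete' or 'incomplete'."""
        complete = [site for site, state in dataset_locations.items() if state == 'complete']
        if complete:
            # Dataset-based brokering: any site holding the complete dataset qualifies.
            return complete
        # Fallback under investigation: file-based brokering over incomplete replicas.
        return list(dataset_locations)

    print(candidate_sites({'CERN': 'complete', 'RAL': 'incomplete'}))   # -> ['CERN']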
How to find out which data exists AMI Metadata http://lpsc1168x.in2p3.fr:8080/opencms/opencms/AMI/www/index.html Prodsys database http://cern.ch/atlas-php/DbAdmin/Ora/php-4.3.4/proddb/monitor/Datasets.php Dataset browser http://panda.atlascomp.org/?overview=dslist
How to access data? You can download with dq2_get and analyze locally: this works (sometimes) but is not scalable. Or: the data is distributed to sites and jobs are sent to the sites to analyze the data; DA is promoting this way of working. The process of finding the data will be fully automated in due course.
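A minimal CLIP sketch of the second way of working (sending the job to the data), assuming the GangaAtlas Athena and DQ2Dataset plugins; the dataset name is the one used in the script example later, the DQ2Dataset attribute name is an assumption, and preparation of the user area is omitted:

    j = Job()
    j.application = Athena()                      # analysis code runs inside Athena
    j.application.option_file = 'AnalysisSkeleton_topOptions.py'
    j.inputdata = DQ2Dataset()                    # input resolved through DQ2
    j.inputdata.dataset = 'trig1_misal1_csc11.005033.Jimmy_jetsJ4.recon.AOD.v12000601'
    j.backend = LCG()                             # job is sent to a site holding the data
    j.submit()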
POSIX-like IO: DA wants to read data directly from the SE (Prodsys downloads the data using gridftp). We want to use POSIX-like IO via rfio, dcap, GFAL or xrootd, because of the size of the local disk available to the job and because we need neither the full event nor all events. As of today ATLAS AOD jobs read data at ~2 MB/s.
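For illustration, direct reading from an SE amounts to opening the file through a protocol URL instead of copying it first; a minimal PyROOT sketch with a made-up file path (dcap:// or root:// URLs work the same way):

    import ROOT
    # POSIX-like access through the rfio protocol; only the parts actually read are transferred
    f = ROOT.TFile.Open('rfio:///castor/cern.ch/user/s/someuser/example.AOD.pool.root')
    if f and not f.IsZombie():
        print(f.GetSize())
        f.Close()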
Analysis jobs: Today one job reads 10 to 100 AOD files of 130 MB each. For one year of LHC running there are 150 TB of AOD according to the ATLAS computing model; even with a file size of 10 GB that is still of the order of 10000 files (see the estimate below). Back-navigation reduces IO but increases the load on the SE due to more "open" calls.
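A back-of-the-envelope check of the file count, in plain Python, using the numbers quoted above:

    aod_per_year = 150e12     # 150 TB of AOD per year of LHC running
    file_size    = 10e9       # assumed file size of 10 GB
    print(aod_per_year / file_size)   # 15000 files, i.e. of the order of 10000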
Some measurements (standard analysis example, 10 files of 130 MB each): local: 14:02 min; DPM using rfio: 16:30 min; Castor-2: 20:29 min. Extrapolated to 150 TB: about 1000 days.
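The "about 1000 days" figure follows directly from the measured rate of a single job; a rough check with the slide's numbers:

    data_read = 10 * 130e6            # 10 files of 130 MB each
    t_local   = 14 * 60 + 2           # 14:02 min in seconds
    rate      = data_read / t_local   # ~1.5 MB/s, consistent with the ~2 MB/s quoted earlier
    days      = 150e12 / rate / 86400 # one year of AOD read by a single job
    print(round(days))                # ~1100 days, i.e. about 1000 days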
DPM in Glasgow
Athena jobs Athena uses POOL/ROOT Many issues concerning plugins and current configuration See Wiki page https://twiki.cern.ch/twiki/bin/view/Atlas/IssuesWithPosixIO
Highlights: dCache: wrong dCache library (except at BNL). DPM: need to provide a symbolic link (libdpm.so -> libshift.so); broken RFIO plugin; DPM URLs not supported. Castor: new Castor syntax not supported; no files larger than 2 GB. Some issues will go away with release 13, but the RFIO plugin will still be outdated and the new rfio library is not yet released. We need to do systematic tests, as proposed by Stephane.
Backporting the ROOT RFIO plugin: Advantages: new syntax à la Castor-2, large files > 2 GB. Problems with DPM: a different URL format, some problems querying the file attributes. Several patches are required to make it work: a security context is required, but since last week the Grid UI clashes with Athena due to the Python version. A new RFIO plugin is under development inside ROOT; in general, new ROOT IO plugins should be backported to the agreed ROOT versions.
Short queues: Distributed Analysis competes with Production, and short queues can be used to speed up the analysis. There is a lot of discussion going on about how useful short queues are; empirically, I prefer to send jobs to short queues. https://twiki.cern.ch/twiki/bin/view/Atlas/DAGangaFAQ#How_to_find_out_suited_Computing Selecting the queues is the easy part; selecting the dataset location is the complicated aspect (fully automatic for complete datasets).
Summary: Several tools are available to perform Distributed Analysis, integrated with DQ2. Data is being collected and also distributed. There is still a lot of work in front of us. We are learning how to access data everywhere: how to find data and how to read it; not fully automatic yet, but we aim for that. We are learning how to handle user jobs: job priorities on LCG, short queues.
Next steps: Increase the number of sites; we have to push getting the data to all Tier-1s, as they are the backbone of the ATLAS data distribution. Interoperability will for sure be an issue this year: GANGA will send jobs to other sites, PANDA will run on LCG, and Cronus wants to bridge all resources.
GANGA Introduction
Who is ATLAS GANGA? GANGA Core: Ulrik Egede, Karl Harrison, Jakub Moscicki, A. Soroko, V. Romanovsky, Adrina Murao. GANGA GUI: Chun Lik Tan. Athena AOD analysis: Johannes Elmsheuser. Tag Navigator: Mike Kenyon, Caitherina Nicholson. User production: Fredric Brochu. EGEE/LCG: Hurng-Chun Lee, Dietrich Liko. Nordugrid: Katarina Pajchel, Bjoern Hallvard. PANDA: Dietrich Liko, with support from PANDA. Cronus: Rod Walker. AMI integration: Farida Fassi, Chun Lik Tan, with support from AMI. MonALISA monitoring: Benjamin Gaidioz, Jae Yu, Tummalapalli Reddy.
What is GANGA? Ganga is an easy-to-use frontend for job definition and management. It allows simple switching between testing on a local batch system and large-scale data processing on distributed resources (Grid). Developed in the context of ATLAS and LHCb. For ATLAS it supports the Athena framework, JobTransformations, the DQ2 data-management system and EGEE/LCG, and for release 4.3 AMI, PANDA/OSG, Nordugrid and Cronus. The component architecture readily allows extension. Implemented in Python.
Users
Domains
GANGA Job Abstraction: a Job consists of an Application (what to run), a Backend (where to run), an Input Dataset (data read by the application), an Output Dataset (data written by the application), a Splitter (rule for dividing into subjobs) and a Merger (rule for combining outputs).
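A schematic CLIP example of how this abstraction maps onto a job object; apart from Athena, DQ2Dataset and LCG, the class names are assumptions used only to illustrate the slots:

    j = Job()
    j.application = Athena()                 # what to run
    j.backend     = LCG()                    # where to run
    j.inputdata   = DQ2Dataset()             # data read by the application
    j.outputdata  = ATLASOutputDataset()     # data written by the application (name is an assumption)
    j.splitter    = AthenaSplitterJob()      # rule for dividing into subjobs (name is an assumption)
    j.merger      = AthenaOutputMerger()     # rule for combining outputs (name is an assumption)
    j.submit()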
Framework for plugins (diagram): plugins derive from GangaObject and implement one of the interfaces IApplication, ISplitter, IDataset, IMerger or IBackend. Example plugins and schemas: the Athena application (atlas_release, max_events, options, option_file, user_setupfile, user_area) and the LCG backend (CE, requirements, jobtype, middleware, id, status, reason, actualCE, exitcode).
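A schematic sketch of what such a plugin looks like; the module paths and schema calls follow later Ganga releases and are an assumption here, shown only to illustrate how a plugin declares its schema (compare the Athena schema above):

    # Assumed Ganga plugin API; paths and signatures may differ in release 4.2/4.3.
    from Ganga.GPIDev.Base import GangaObject
    from Ganga.GPIDev.Schema import Schema, Version, SimpleItem

    class MyApplication(GangaObject):
        _schema = Schema(Version(1, 0), {
            'atlas_release': SimpleItem(defvalue='', doc='ATLAS release to use'),
            'max_events':    SimpleItem(defvalue=-1, doc='number of events to process'),
            'option_file':   SimpleItem(defvalue='', doc='Athena job options file'),
        })
        _category = 'applications'   # which plugin category/interface it belongs to
        _name = 'MyApplication'      # name exposed to the user interface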
Backends and Applications (diagram of which application runs on which backend): applications are Executable, Athena (simulation/digitisation/reconstruction/analysis), AthenaMC (production) and, for LHCb, Gauss/Boole/Brunel/DaVinci (simulation/digitisation/reconstruction/analysis); backends include PBS, LSF, OSG/PANDA (the US-ATLAS WMS) and the LHCb WMS; some combinations are implemented, others are coming soon.
Status: Current version 4.2.11: AOD analysis, TAG-based analysis, MonALISA-based monitoring, LCG/EGEE, batch handlers. Upcoming version 4.3: Tag Navigator, AMI integration, PANDA, Nordugrid, Cronus.
How do the elements work together? Ganga has built-in support for ATLAS and LHCb; the component architecture allows customisation for other user groups. (Diagram: applications (ATLAS, LHCb, other); metadata catalogues, data storage and retrieval, and file tools for data management; GANGA as the user interface for job definition and management, with local and remote repositories for the Ganga job archives and the Ganga monitoring loop; processing systems (backends): experiment-specific workload-management systems, local batch systems, distributed (Grid) systems.)
Different working styles: The Command Line Interface in Python (CLIP) provides interactive job definition and submission from an enhanced Python shell (IPython); it is especially good for trying things out and seeing how the system works. Scripts, which may contain any Python/IPython or CLIP commands, allow automation of repetitive tasks; the scripts included in the distribution enable the kind of approach traditionally used when submitting jobs to a local batch system. The Graphical User Interface (GUI) allows job management based on mouse selections and field completion, with lots of configuration possibilities.
Scripts provide a pathena-like interface:
ganga athena --inDS trig1_misal1_csc11.005033.Jimmy_jetsJ4.recon.AOD.v12000601 --outputdata AnalysisSkeleton.aan.root --split 3 --maxevt 100 --lcg --ce ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas AnalysisSkeleton_topOptions.py
The job status can then be monitored, for example, using the GUI or the CLI.
IPython and CLIP: IPython is a comfortable Python shell with many useful extensions (http://ipython.scipy.org/). CLIP is the GANGA command line interface. How to define a job:
j = Job()
j.application = Executable()
j.application.exe = '/bin/echo'
j.application.args = ['Hello World']
j.backend = LCG()
j.submit()
Other commands: jobs, jobs[20].kill(), jobs[20].copy()
GUI
Exercises Subset adapted for today https://cern.ch/twiki/bin/view/Atlas/GangaTutorialAtCCIN2P3 Current Tutorial that explains more features https://cern.ch/twiki/bin/view/Atlas/GangaGUITutorial427 FAQ https://cern.ch/twiki/bin/view/Atlas/DAGangaFAQ User Support using hypernews https://hypernews.cern.ch/HyperNews/Atlas/get/GANGAUserDeveloper.html