Distributed Data Analysis with GANGA (Tutorial) Alexander Zaytsev Budker Institute of Nuclear Physics (BudkerINP), Novosibirsk On the basis of GANGA EGEE.

Slides:



Advertisements
Similar presentations
13/05/2004Janusz Martyniak Imperial College London 1 Using Ganga to Submit BaBar Jobs Development Status.
Advertisements

GANGA Overview Germán Carrera, Alfredo Solano (CNB/CSIC) EMBRACE COURSE Monday 19th of February to Friday 23th. CNB-CSIC Madrid.
Computing Lectures Introduction to Ganga 1 Ganga: Introduction Object Orientated Interactive Job Submission System –Written in python –Based on the concept.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE Tutorial How to get started.
GRID INTEROPERABILITY USING GANGA Soonwook Hwang (KISTI) YoonKee Lee and EunSung Kim (Seoul National Uniersity) KISTI-CCIN2P3 FKPPL Workshop December 1,
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Introduction to GANGA Hurng-Chun Lee 27 Feb.
Israel Cluster Structure. Outline The local cluster Local analysis on the cluster –Program location –Storage –Interactive analysis & batch analysis –PBS.
DIRAC API DIRAC Project. Overview  DIRAC API  Why APIs are important?  Why advanced users prefer APIs?  How it is done?  What is local mode what.
Analysis demos from the experiments. Analysis demo session Introduction –General information and overview CMS demo (CRAB) –Georgia Karapostoli (Athens.
GETTING STARTED ON THE GRID: W. G. SCOTT (RAL/PPD) RAL PHYSICS MEETING TUES 15 MAY GENERATED 10K SAMPLES IN EACH CHANNEL ON LXPLUS (IN 2006) SIMULATED/DIGITISDED.
Ganga 3 CLIP Tutorial Jakub T. Moscicki ARDA/LHCb Ganga Tutorial, April 2005.
SA1 / Operation & support Enabling Grids for E-sciencE Integration of heterogeneous computational resources in.
Nightly Releases and Testing Alexander Undrus Atlas SW week, May
INFSO-RI Enabling Grids for E-sciencE The GENIUS Grid portal Tony Calanducci INFN Catania - Italy First Latin American Workshop.
Distributed Analysis using Ganga I.Ideas behind Ganga II.Getting started III.Running ATLAS applications Distributed Analysis Tutorial ATLAS Computing &
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Interoperability Shibboleth - gLite Christoph.
How to Install and Use the DQ2 User Tools US ATLAS Tier2 workshop at IU June 20, Bloomington, IN Marco Mambelli University of Chicago.
Building an Athena Job with GANGA a step-by-step GUI approach Tutorial Material by C L Tan.
Belle MC Production on Grid 2 nd Open Meeting of the SuperKEKB Collaboration Soft/Comp session 17 March, 2009 Hideyuki Nakazawa National Central University.
Grid job submission using HTCondor Andrew Lahiff.
DIRAC Review (13 th December 2005)Stuart K. Paterson1 DIRAC Review Exposing DIRAC Functionality.
Job handling in Ganga Jakub T. Moscicki ARDA/LHCb GANGA-DIRAC Meeting, June, 2005.
Ganga A quick tutorial Asterios Katsifodimos Trainer, University of Cyprus Nicosia, Feb 16, 2009.
Enabling Grids for E-sciencE EGEE-III INFSO-RI Using DIANE for astrophysics applications Ladislav Hluchy, Viet Tran Institute of Informatics Slovak.
INFSO-RI Enabling Grids for E-sciencE ATLAS Distributed Analysis A. Zalite / PNPI.
INFSO-RI Enabling Grids for E-sciencE WMS + LB Installation Emidio Giorgio Giuseppe La Rocca INFN EGEE Tutorial, Rome November.2005.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Introduction to GILDA and gaining access.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Apr. 25, Grid Computing Hands On Training for Users Faculty of Sciences, University.
INFSO-RI Enabling Grids for E-sciencE SCDB C. Loomis / Michel Jouvin (LAL-Orsay) Quattor Tutorial LCG T2 Workshop June 16, 2006.
Introduction to Alexander Richards Thanks to Mike Williams, ICL for many of the slides content.
Introduction to Ganga Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007
EGEE-II INFSO-RI Enabling Grids for E-sciencE The GILDA training infrastructure.
A PanDA Backend for the Ganga Analysis Interface J. Elmsheuser 1, D. Liko 2, T. Maeno 3, P. Nilsson 4, D.C. Vanderster 5, T. Wenaus 3, R. Walker 1 1: Ludwig-Maximilians-Universität.
Ganga 4 Basics - Tutorial Jakub T. Moscicki ARDA/LHCb Ganga Tutorial, November 2005.
Introduction to Taverna Online and Interaction service Aleksandra Pawlik University of Manchester.
INFSO-RI Enabling Grids for E-sciencE User Interface (UI) Installation Giuseppe La Rocca INFN Catania - Italy First Latin American.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
Ganga 4 Basics - Tutorial Jakub T. Moscicki ARDA/LHCb Ganga Tutorial, September 2006.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ganga Tutorial From: Jakub T. Moscicki (CERN)
April 27, 2006 The New GANGA GUI 26th LHCb Software Week C L Tan
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
Distributed Computing and Ganga Karl Harrison (University of Cambridge) 3rd LHCb-UK Software Course National e-Science Centre, Edinburgh, 8-10 January.
INFSO-RI Enabling Grids for E-sciencE Charon Extension Layer. Modular environment for Grid jobs and applications management Jan.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ganga User Interface EGEE Review Jakub Moscicki.
K. Harrison CERN, 22nd September 2004 GANGA: ADA USER INTERFACE - Ganga release status - Job-Options Editor - Python support for AJDL - Job Builder - Python.
Using Ganga for physics analysis Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007
2 June 20061/17 Getting started with Ganga K.Harrison University of Cambridge Tutorial on Distributed Analysis with Ganga CERN, 2.
ATLAS-specific functionality in Ganga - Requirements for distributed analysis - ATLAS considerations - DIAL submission from Ganga - Graphical interfaces.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The LCG interface Stefano BAGNASCO INFN Torino.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
Ganga development - Theory and practice - Ganga 3 - Ganga 4 design - Ganga 4 components and framework - Conclusions K. Harrison CERN, 25th May 2005.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
Distributed Analysis Tutorial Dietrich Liko. Overview  Three grid flavors in ATLAS EGEE OSG Nordugrid  Distributed Analysis Activities GANGA/LCG PANDA/OSG.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 Technical Overview Jakub T. Moscicki, CERN.
A GANGA tutorial Professor Roger W.L. Jones Lancaster University.
ATLAS Physics Analysis Framework James R. Catmore Lancaster University.
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Towards an Information System Product Team.
Seven things you should know about Ganga K. Harrison (University of Cambridge) Distributed Analysis Tutorial ATLAS Software & Computing Workshop, CERN,
Starting Analysis with Athena (Esteban Fullana Torregrosa) Rik Yoshida High Energy Physics Division Argonne National Laboratory.
User view Ganga classes and functions can be used interactively at a Python prompt, can be referenced in scripts, or can be used indirectly via a Graphical.
Grid User Interface:Ganga Farida Fassi Master de Physique Informatique Rabat, Maroc th, May, 2011.
Distrubuited Analysis using GANGA
Dan van der Ster for the Ganga Team
Submit BOSS Jobs on Distributed Computing System
Jakub T. Moscicki (KUBA) CERN
gLite User Interface Installation and configuration
Presentation transcript:

Distributed Data Analysis with GANGA (Tutorial) Alexander Zaytsev Budker Institute of Nuclear Physics (BudkerINP), Novosibirsk On the basis of GANGA EGEE Tutorial RDIG-ATLAS IHEP (20 September 2007)

20/09/20072 Overview Logging to the LCG UI, install usercert and tuning GRID environment Short introduction to Ganga Introduction to Ganga CLIP and GUI Submitting the simple local jobs Setup Athena environment and tune Ganga to submit Athena jobs Submitting the test jobs Cleaning GRID environment

20/09/20073 Setting up GRID Environment Log to the ui0001.m45.ihep.su under a test account or lxplus.cern.ch under your CERN account Copy your usercert.pem and userkey.pem to the local ~/.globus directory (if necessary) Tune LCG tools (if necessary), e.g.: bash> which grid-proxy-init bash> /afs/cern.ch/project/gd/LCG-share/sl3/etc/profile.d/grid_env.sh bash> export LFC_HOST='prod-lfc-atlas-local.cern.ch‘ bash> export LCG_CATALOG_TYPE=lfc Create a GRID proxy : bash> grid-proxy-init bash> grid-proxy-info

Enabling Grids for E-sciencE EGEE-II INFSO-RI Ganga Ganga: Job Management Interface –a utility which you download to your computer  or it is already installed in your institute in a shared area for example: /nfs/sw/ganga/install/  it is an add-on to installed software –comes with a set of plugins for Atlas and LHCb  open - other applications and backend may be easily added even by users GangaFramework.... Backend Plugins Application Plugins Ganga Application Software LSF Client LCG UI

Enabling Grids for E-sciencE EGEE-II INFSO-RI Ganga Job Where the Ganga journey starts … Mandatory Optional

Enabling Grids for E-sciencE EGEE-II INFSO-RI Plug-in based design Common interface Specific implementation ✦ Ease user’s experience in switching between different technologies ✦ Concentrate developer’s effort in specific domain

Enabling Grids for E-sciencE EGEE-II INFSO-RI Applications & backends

20/09/20078 Basic Ganga Installation From AFS at CERN (lxplus): bash> source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh or explicitely bash> source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh From AFS at ui0001.m45: bash> export GANGA_CONFIG_PATH=GangaAtlas/Atlas.ini bash> export PATH=/afs/cern.ch/sw/ganga/install/4.4.1/bin:$PATH From /tmp test installation at ui0001.m45: bash> export GANGA_CONFIG_PATH=GangaAtlas/Atlas.ini bash> export PATH=/tmp/atlasprd/opt/ganga/install/4.4.1/bin:$PATH Try to start Ganga CLIP: bash> ganga

Enabling Grids for E-sciencE EGEE-II INFSO-RI Download, Install, First launch wget python ganga-install \ --prefix=~/opt/ganga \ --extern=GangaAtlas,GangaGUI,GangaPlotter \ Download & Install export PATH $HOME/opt/ganga/install/4.4.1/bin:$PATH ganga -o’[LCG]ENABLE_EDG=False’ -o’[LCG]ENABLE_GLITE=False’ *** Welcome to Ganga *** Version: Ganga Documentation and support: Type help() or help('index') for online help. In [1]: Do you really want to exit ([y]/n)? First Launch download installer installation prefix Installation of external modules Ganga version start Ganga with inline configurations Ganga CLIP -D to exit Ganga CLIP

Enabling Grids for E-sciencE EGEE-II INFSO-RI Explore basic features of the CLIP  Job().submit() submit and run a test job on local machine  Job(backend=LCG()).submit() submit and run a a test job on LCG  jobs browse the created jobs (job history)  j = jobs[1] get the first job from the job history  j print the details of the job and see what you can set for a job  j.copy().submit() make a copy of the job and submit the new job  j. see what you can do with the job

Enabling Grids for E-sciencE EGEE-II INFSO-RI Configurations [Configuration] TextShell = IPython... [LCG] EDG_ENABLE = True... Syntax Hardcoded configurations export GANGA_CONFIG_PATH = /some/physics/subgroup.ini:GangaLHCb/LHCb.ini ganga --config-path=/some/phsycis/subgroup.ini:GangaLHCb/LHCb.ini ~/.gangarc ganga -o How to set configurations user config > site config > release config Override sequence Python ConfigParser standard release config site config user config

Enabling Grids for E-sciencE EGEE-II INFSO-RI The ‘gangadir’ Created at the first launch within $HOME directory To locate it in different directory: – [DefaultJobRepository] local_root = /alternative/gangadir/repository – [FileWorkspace] topdir = /alternative/gangadir/workspace Metadata of jobs Data of jobs

Enabling Grids for E-sciencE EGEE-II INFSO-RI *** Welcome to Ganga *** Version: Ganga Documentation and support: Type help() or help('index') for online help. In [1]: jobs Out[1]: Statistics: 1 jobs # id status name subjobs application backend backend.actualCE # 1 completed Executable LCG lcg- compute.hpc.unimelb.edu.au:2119/jobmanage CLIP User interfaces GUI #!/usr/bin/env ganga #-*-python-*- import time j = Job() j.backend = LCG() j.submit() while not j.status in [‘completed’,’failed’]: print(‘job still running’) time.sleep(30)./myjob.exec ganga./myjob.exec In [1]:execfile(“myjob.exec”) GPI & Scripting

Enabling Grids for E-sciencE EGEE-II INFSO-RI Some handy functions completion for cmd history system command integration Job template In[1]: plugins() – plugins(‘backends’) In[2]: help() etc. In[1]: j = jobs[1] In[2]: cat $j.outputdir/stdout Hello World In[1]: t = JobTemplate(name=’lcg_simple’) In[2]: t.backend = LCG(middleware=’EDG’) In[3]: templates Out[3]: Statistics: 1 templates # id status name subjobs application backend backend.actualCE # 3 template lcg_simple Executable LCG In[4]: j = Job(templates[3]) In[5]: j.submit()

Enabling Grids for E-sciencE EGEE-II INFSO-RI Step 1: Your first Ganga job - an arbitrary shell script In [1]: !vi myscript.sh In [2]: !chmod +x myscript.sh In [2]: j = Job() In [3]: j.application = Executable() In [4]: j.application.exe = File(‘myscript.sh’) In [5]: j.application.args = [‘ganga’] In [6]: j.backend = Local() In [7]: j.submit() In [8]: jobs In [9]: j.peek() In [10]:cat $j.outputdir/stdout #!/bin/sh echo "hello! ${1}” echo $HOSTNAME cat /proc/cpuinfo | grep 'model name’ cat /proc/meminfo | grep 'MemTotal'./myscript.sh ganga

Enabling Grids for E-sciencE EGEE-II INFSO-RI Step 2: your first Ganga job on the Grid In [11]:j = j.copy() In [12]:j.backend = LCG() In [13]:j.application.args = [‘grid’] In [14]:j.submit() In [15]:j In [16]:cat $j.backend.loginfo(verbosity=1) In [17]:jobs More samples on

20/09/ Tuning Ganga to Submit ATLAS Jobs Edit $HOME/.gangarc as follows: –In the section labelled [Configuration] add the line: RUNTIME_PATH = GangaAtlas –In the section labelled [LCG] add the line: VirtualOrganisation = atlas –In the section labeled [Athena] add the lines: # local path to base paths of dist-kits (lxplus example) ATLAS_SOFTWARE = /afs/cern.ch/project/gd/apps/atlas/slc3/software Setup Athena environment (if necessary): Tune up DQ2 environment: bash> source /afs/usatlas.bnl.gov/Grid/Don-Quijote/dq2_user_client/setup.[c,z]sh.CERN bash> dq2_ls -f users.JoeUser.ganga bash> dq2_get -r users.JoeUser.ganga Many job examples on:

Enabling Grids for E-sciencE EGEE-II INFSO-RI j = Job() j.application = Athena() j.application.option_file = ‘myOpts.py’ j.application.prepare(athena_compile = False) j.inputdata = DQ2Dataset() j.inputdata.dataset = ‘interestingDataset.AOD.v ’ j.inputdata.type = ‘DQ2_Local’ j.outputdata = AthenaOutputDataset() j.outputdata.outputdata = ‘myOutput.root’ j.splitter = AthenaSplitterJob(numsubjobs=2) j.merger = AthenaOutputMerger() j.backend = LCG( CE=’ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas’ ) j.submit() Real Application: the ATLAS data analysis application Application Input data Output dataSplitter & Merger Scripting mode CLIP

Enabling Grids for E-sciencE EGEE-II INFSO-RI Behind the scene...

20/09/ Submitting ATLAS LCG Jobs Submitting Local Athena Analysis Job: Submitting DQ2Dataset input and ATLASOutputDataset Job Monitoring job execution Retrieving output

20/09/ Cleaning up GRID Environment Destroy your GRID proxy : bash> grid-proxy-destroy Remove your usercert.pem and userkey.pem out of local ~/.globus directory

20/09/ The End