Jakub T. Moscicki (KUBA) CERN

Slides:



Advertisements
Similar presentations
User view Ganga classes and functions can be used interactively at a Python prompt, can be referenced in scripts, or can be used indirectly via a Graphical.
Advertisements

GANGA Overview Germán Carrera, Alfredo Solano (CNB/CSIC) EMBRACE COURSE Monday 19th of February to Friday 23th. CNB-CSIC Madrid.
Computing Lectures Introduction to Ganga 1 Ganga: Introduction Object Orientated Interactive Job Submission System –Written in python –Based on the concept.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Introduction to GANGA Hurng-Chun Lee 27 Feb.
Experiment Support Introduction to HammerCloud for The LHCb Experiment Dan van der Ster CERN IT Experiment Support 3 June 2010.
DIRAC API DIRAC Project. Overview  DIRAC API  Why APIs are important?  Why advanced users prefer APIs?  How it is done?  What is local mode what.
Analysis demos from the experiments. Analysis demo session Introduction –General information and overview CMS demo (CRAB) –Georgia Karapostoli (Athens.
LHC Experiment Dashboard Main areas covered by the Experiment Dashboard: Data processing monitoring (job monitoring) Data transfer monitoring Site/service.
The SAM-Grid Fabric Services Gabriele Garzoglio (for the SAM-Grid team) Computing Division Fermilab.
QCDgrid Technology James Perry, George Beckett, Lorna Smith EPCC, The University Of Edinburgh.
DIANE Overview Germán Carrera, Alfredo Solano (CNB/CSIC) EMBRACE COURSE Monday 19th of February to Friday 23th. CNB-CSIC Madrid.
LCG Middleware Testing in 2005 and Future Plans E.Slabospitskaya, IHEP, Russia CERN-Russia Joint Working Group on LHC Computing March, 6, 2006.
Bookkeeping Tutorial. Bookkeeping & Monitoring Tutorial2 Bookkeeping content  Contains records of all “jobs” and all “files” that are created by production.
DIRAC Review (13 th December 2005)Stuart K. Paterson1 DIRAC Review Exposing DIRAC Functionality.
K. Harrison CERN, 25th September 2003 GANGA: GAUDI/ATHENA AND GRID ALLIANCE - Project news - Ganga release 1 - Work towards Ganga release 2 - Interaction.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
Job handling in Ganga Jakub T. Moscicki ARDA/LHCb GANGA-DIRAC Meeting, June, 2005.
Ganga A quick tutorial Asterios Katsifodimos Trainer, University of Cyprus Nicosia, Feb 16, 2009.
Enabling Grids for E-sciencE EGEE-III INFSO-RI Using DIANE for astrophysics applications Ladislav Hluchy, Viet Tran Institute of Informatics Slovak.
Introduction to Ganga Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007
Ganga 4 Basics - Tutorial Jakub T. Moscicki ARDA/LHCb Ganga Tutorial, November 2005.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 – The Ganga Evolution Andrew Maier.
Distributed Analysis K. Harrison LHCb Collaboration Week, CERN, 1 June 2006.
Ganga 4 Basics - Tutorial Jakub T. Moscicki ARDA/LHCb Ganga Tutorial, September 2006.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ganga Tutorial From: Jakub T. Moscicki (CERN)
April 27, 2006 The New GANGA GUI 26th LHCb Software Week C L Tan
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Ganga User Interface EGEE Review Jakub Moscicki.
1 DIRAC Job submission A.Tsaregorodtsev, CPPM, Marseille LHCb-ATLAS GANGA Workshop, 21 April 2004.
K. Harrison CERN, 3rd March 2004 GANGA CONTRIBUTIONS TO ADA RELEASE IN MAY - Outline of Ganga project - Python support for AJDL - LCG analysis service.
K. Harrison CERN, 22nd September 2004 GANGA: ADA USER INTERFACE - Ganga release status - Job-Options Editor - Python support for AJDL - Job Builder - Python.
Distributed Data Analysis with GANGA (Tutorial) Alexander Zaytsev Budker Institute of Nuclear Physics (BudkerINP), Novosibirsk On the basis of GANGA EGEE.
Using Ganga for physics analysis Karl Harrison (University of Cambridge) ATLAS Distributed Analysis Tutorial Milano, 5-6 February 2007
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
2 June 20061/17 Getting started with Ganga K.Harrison University of Cambridge Tutorial on Distributed Analysis with Ganga CERN, 2.
ATLAS-specific functionality in Ganga - Requirements for distributed analysis - ATLAS considerations - DIAL submission from Ganga - Graphical interfaces.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
Ganga development - Theory and practice - Ganga 3 - Ganga 4 design - Ganga 4 components and framework - Conclusions K. Harrison CERN, 25th May 2005.
ATLAS Distributed Analysis Dietrich Liko IT/GD. Overview  Some problems trying to analyze Rome data on the grid Basics Metadata Data  Activities AMI.
Distributed Analysis Tutorial Dietrich Liko. Overview  Three grid flavors in ATLAS EGEE OSG Nordugrid  Distributed Analysis Activities GANGA/LCG PANDA/OSG.
K. Harrison CERN, 21st February 2005 GANGA: ADA USER INTERFACE - Ganga release Python client for ADA - ADA job builder - Ganga release Conclusions.
K. Harrison BNL, 29th August 2003 THE GANGA PROJECT -Project objectives and organisation - Ganga design - Current status of software - Conclusions.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 Technical Overview Jakub T. Moscicki, CERN.
A GANGA tutorial Professor Roger W.L. Jones Lancaster University.
Simulation Production System Science Advisory Committee Meeting UW-Madison March 1 st -2 nd 2007 Juan Carlos Díaz Vélez.
INFSO-RI Enabling Grids for E-sciencE File Transfer Software and Service SC3 Gavin McCance – JRA1 Data Management Cluster Service.
Breaking the frontiers of the Grid R. Graciani EGI TF 2012.
Geant4 GRID production Sangwan Kim, Vu Trong Hieu, AD At KISTI.
Acronyms GAS - Grid Acronym Soup, LCG - LHC Computing Project EGEE - Enabling Grids for E-sciencE.
Seven things you should know about Ganga K. Harrison (University of Cambridge) Distributed Analysis Tutorial ATLAS Software & Computing Workshop, CERN,
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
User view Ganga classes and functions can be used interactively at a Python prompt, can be referenced in scripts, or can be used indirectly via a Graphical.
Grid User Interface:Ganga Farida Fassi Master de Physique Informatique Rabat, Maroc th, May, 2011.
Scientific Data Processing Portal and Heterogeneous Computing Resources at NRC “Kurchatov Institute” V. Aulov, D. Drizhuk, A. Klimentov, R. Mashinistov,
Advanced Computing Facility Introduction
Fundamental of Databases
Practical using C++ WMProxy API advanced job submission
Distrubuited Analysis using GANGA
Simulation Production System
Integrating Scientific Tools and Web Portals
Dan van der Ster for the Ganga Team
Getting Started with R.
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
LCG middleware and LHC experiments ARDA project
The Ganga User Interface for Physics Analysis on Distributed Resources
Module 01 ETICS Overview ETICS Online Tutorials
Presentation transcript:

Jakub T. Moscicki (KUBA) CERN Ganga Tutorial Jakub T. Moscicki (KUBA) CERN

Agenda Part I: Ganga introduction Part II: Ganga hands-on Part III: More about Ganga EGEE - SEE Tutorial, Budapest

Part I: Ganga Overview

The Ganga Project DIRAC Batch Grids Panda “configure once, run anywhere” user applications resources GUI Ganga Interface Localhost Batch Condor Grids DIRAC Panda cmd line shell scripting jobs Interface helping users to configure applications and submit jobs to the Grid The primary focus is on the Atlas and LHCb applications. The project was initiated by HEP collaborations because they appreciated the need of a consistent and easy to use environment which would allow users to focus on their tasks and hide all technicalities of the underlying systems. Ganga is an open system which interfaces to generic resources and middleware (Batch FARMS, Grid MIDDLEWARE, Condor...) as well as the experiment specific systems (DIRAC, Panda). user may move between different resources and systems easily tool is adapted to the needs of the users and their different working styles and habits: GUI shell scripts Ganga provides access to auxiliary services such as: data management monitoring Data Job Parameters Monitoring

Introduction Goals: provide a simple and consistent way of preparing, organising and executing jobs on different computing infrastructures provide a clean interface which can be used: interactively (CLI / python interpreter) as a Python API for scripting through a GUI Make it easy and integrated with application environment Allow quick transition between local PC, cluster, Grid... Organize work, keep history of jobs,... EGEE - SEE Tutorial, Budapest

Motivation FULL RUN TEST DEBUG In practice users deal with multiple computing backends FULL RUN PANDA TEST PB S SG E LS F Local PC DEBUG EGEE - SEE Tutorial, Budapest

Motivation FAQ: running applications on multiple computing backends I must learn many interfaces PB S LocalP C LS F SG E PANDA How to configure my applications? Do I get a consistent view on all my jobs? EGEE - SEE Tutorial, Budapest 7

Ganga Ganga: Job Management Tool a utility which you download to your computer or it is already installed in your institute in a shared area for example: /nfs/sw/ganga/install/4.3.2 it is an add-on to installed software comes with a set of plugins for some applications open - other applications and backend may be easily added even by users Ganga Application Software LSF Client LCG UI GangaFramework .... Backend Plugins Application Plugins EGEE - SEE Tutorial, Budapest 8

Why not portal? Many (not all) scientific applications are: developed/modified/extended on local machine use local resources (files, environment) are scripted (e.g. higher-level logic is built around them) What user mobility (desktop vs laptop)? Ganga may use remote repository (and workspace) So user may see his jobs submitted from a different machine and interact with them (resubmit, kill)... Ganga approach: lean and neat utility with interface in Python + GUI if you like + scripting if you like more EGEE - SEE Tutorial, Budapest 9

User interfaces CLIP GUI GPI & Scripting *** Welcome to Ganga *** Version: Ganga-4-2-8 Documentation and support: http://cern.ch/ganga Type help() or help('index') for online help. In [1]: jobs Out[1]: Statistics: 1 jobs -------------- # id status name subjobs application backend backend.actualCE # 1 completed Executable LCG lcg-compute.hpc.unimelb.edu.au:2119/jobmanage CLIP GUI #!/usr/bin/env ganga #-*-python-*- import time j = Job() j.backend = LCG() j.submit() while not j.status in [‘completed’,’failed’]: print(‘job still running’) time.sleep(30) ./myjob.exec ganga ./myjob.exec In [1]:execfile(“myjob.exec”) GPI & Scripting We will focus on the Ganga CLIP (Command Line Interface for Python) EGEE - SEE Tutorial, Budapest

What Ganga can do for you help configure applications if you teach it about your application organize work job history: keep track of what user did save job outputs in a consistent way reuse configuration of previously submitted jobs you do not have to learn many interfaces one set of (simple) commands which are always the same localy and on the Grid Ganga will not hide anything from you you still have access to automatically generated jdls, job id etc. ganga will tell you what it does to submit jobs Works with no change on Grid production systems and GILDA, ... EGEE - SEE Tutorial, Budapest 11

GANGA User Communities More than 900 users use Ganga Garfield HARP EGEE - SEE Tutorial, Budapest

The Ganga development team Ganga is supported by HEP Support for development work Core team: F.Brochu (Cambridge), U.Egede (Imperial), J. Elmsheuser (Munich), K.Harrison (Cambridge), H.C.Lee (ASGC Taipei), D.Liko (CERN), A.Maier (CERN), J.T.Moscicki (CERN), A.Muraru (Bucharest), W.Reece (Imperial), A.Soroko (Oxford), CL.Tan (Birmingham) EGEE - SEE Tutorial, Budapest

Part I: Practical Ganga

Download, Install, First launch wget http://cern.ch/ganga/download/ganga-install python ganga-install \ --prefix=/usr/local/ganga/prefix \ --extern=GangaGUI,GangaPlotter \ 4.3.2 Download & Install download installer installation prefix Installation of external modules Ganga version export PATH=$HOME/opt/ganga/install/4.3.2/bin:$PATH $ ganga *** Welcome to Ganga *** Version: Ganga-4-3-2 Documentation and support: http://cern.ch/ganga Type help() or help('index') for online help. In [1]: Do you really want to exit ([y]/n)? First Launch start Ganga with inline configurations Ganga CLIP <ctrl>-D to exit Ganga CLIP EGEE - SEE Tutorial, Budapest

Where the Ganga journey starts … Ganga Job Where the Ganga journey starts … Mandatory Executable EGEE Optional The first thing to talk is about the Ganga job Ganga adopts the traditional way the scientists run their application: everything starts from “Job” But what’s different is that this time Ganga puts on top the physical job a layer of abstraction. The abstraction allows users to decided how their job should be executed. There are 6 building blocks of the job abstraction EGEE - SEE Tutorial, Budapest

Get your hands dirty ... submit and run a test job on local machine: Job().submit() submit and run a a test job on EGEE/LCG: Job(backend=LCG()).submit() browse the created jobs (job history): jobs get the first job from the job history: j = jobs[0] print the details of the job and see what you can set for a job: j make a copy of the job and submit the new job: j.copy().submit() see what you can do with the job: j.<tab> get interactive help: help EGEE - SEE Tutorial, Budapest

Python ConfigParser standard Configurations [Configuration] TextShell = IPython ... ... [LCG] EDG_ENABLE = True Syntax Python ConfigParser standard Hard-coded default configurations export GANGA_CONFIG_PATH=/usr/local/ganga/prefix/install/etc/Gilda.ini:GangaTutorial/Tutorial.ini ganga --config-path= =/usr/local/ganga/prefix/install/etc/Gilda.ini:GangaTutorial/Tutorial.ini ~/.gangarc ganga -o How to set configurations release config site config user config user config > site config > release config Override sequence EGEE - SEE Tutorial, Budapest

`gangadir` gangadir folder is created at the first launch within $HOME directory To locate it in different directory: [DefaultJobRepository] local_root = /alternative/gangadir/repository [FileWorkspace] topdir = /alternative/gangadir/workspace Job Repository may also be stored remotely in a database Metadata of jobs Data of jobs Possible to use the same Ganga instance to maintain multiple repositories (quite useful to separate project jobs) EGEE - SEE Tutorial, Budapest

Some handy functions <tab> completion <page up/down> for cmd history system command integration Job template In[1]: plugins() plugins(‘backends’) In[2]: help() etc. In[1]: j = jobs[1] In[2]: cat $j.outputdir/stdout Hello World In[1]: t = JobTemplate(name=’lcg_simple’) In[2]: t.backend = LCG(middleware=’EDG’) In[3]: templates Out[3]: Statistics: 1 templates -------------- # id status name subjobs application backend backend.actualCE # 3 template lcg_simple Executable LCG In[4]: j = Job(templates[3]) In[5]: j.submit() EGEE - SEE Tutorial, Budapest

Behind the scenes ... User has access to functionality of GANGA components through GUI, CLI or batch scripts storage and recovery of job information in a local or remote DB storage of sandbox files What happens when user types some job operation commands on the Ganga user interface. application configuration Data Management input data selection, output location specification job submission: selection between Grid and local system EGEE - SEE Tutorial, Budapest

What's next? NOT FOR THIS TUTORIAL Changes in Ganga 4.4.2 release jobs(id) instead of jobs[id] Ganga 5.0 many useful improvements and some interface changes EGEE - SEE Tutorial, Budapest

Part II: Ganga hands-on

Step 0: launch Ganga CLIP https://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGEETutorialPackage Skip the installation step Start your Ganga CLIP session: shell> ganga EGEE - SEE Tutorial, Budapest

Step 1: Your first Ganga job - an arbitrary shell script In [1]: !pico myscript.sh In [2]: !chmod +x myscript.sh In [2]: j = Job() In [3]: j.application = Executable() In [4]: j.application.exe = File(‘myscript.sh’) In [5]: j.application.args = [‘Budapest’] In [6]: j.backend = Interactive() In [7]: j.submit() In [8]: jobs #!/bin/sh echo “Hello ${1} !” echo $HOSTNAME cat /proc/cpuinfo | grep 'model name’ cat /proc/meminfo | grep 'MemTotal‘ echo “Run on `date`” EGEE - SEE Tutorial, Budapest

Step 2: Your first Ganga job - an arbitrary shell script In [9]: j = j.copy() In [10]: j.backend = Local() In [11]: j.submit() In [12]: jobs In [13]: j.peek() In [14]:cat $j.outputdir/stdout ./myscript.sh Budapest EGEE - SEE Tutorial, Budapest

Step 3: your first Ganga job on the Grid In [15]:j = j.copy() In [16]:j.backend = LCG() In [17]:j.application.args = [‘Somewhere in the world...’] In [18]:j.submit() In [19]:j In [20]:cat $j.backend.loginfo(verbosity=1) In [21]:jobs EGEE - SEE Tutorial, Budapest

Exercise: Prime factorization Please follow the instructions from: https://twiki.cern.ch/twiki/bin/view/ArdaGrid/EGEETutorialPackage#Exercise_Prime_number_factorizat Table lookup EGEE - SEE Tutorial, Budapest

Part III: More

GANGA User Communities Garfield HARP EGEE - SEE Tutorial, Budapest

More than 900 unique Users ~ till June: 650 different users, ~120 users weekly the monitoring started end 2006 ~60% Atlas ~25% LHCb ~15% others - This plot shows the establishment of the Ganga user community. - ATLAS and LHCb users dominated the user community, the users coming from other communities are growing up - on average, we have roughly 40-50 unique users using Ganga everyday . Based on the trend, the number is growing. - Overall 300 unique users start using Ganga in recent 2 months EGEE - SEE Tutorial, Budapest

Ganga usage over 50 local sites CLIP and scripts most popular EGEE - SEE Tutorial, Budapest

Real Application: the ATLAS data analysis application $ ganga athena \ --inDS myInputDataset.txt\ --outputdata myOutput.root \ --split 3 \ --maxevt 100 \ --lsf \ jobOptions.py Scripting mode quick mix j = Job() j.application=Athena() j.application.prepare() j.application.option_file='jobOptions.py‘ j.inputdata=DQ2Dataset() j.inputdata.type='DQ2_LOCAL' j.inputdata.dataset=“myInputDataset.txt” j.outputdata=DQ2OutputDataset() j.outputdata.outputdata=[‘myOutput.root'] j.splitter = AthenaSplitterJob(numsubjobs=3) j.merger = AthenaOutputMerger() j.backend = LSF() j.submit() j2 = j.copy() j2.backend=LCG( CE=’ce102.cern.ch:2119/jobmanager-lcglsf-grid_2nh_atlas’ ) j2.submit() CLIP mode application inputdata flexible How a typical “real application” job looks like. Very easy to translate it to “scripting mode”, which immediately give users another way to run their applications outputdata Splitter & Merger EGEE - SEE Tutorial, Budapest

More than job submission: Monitoring & Accounting This is application level monitoring and accounting … how many jobs are submitted by whom, how many dataset has been analyzed … EGEE - SEE Tutorial, Budapest

Dashboard monitoring analysis and simulations jobs run in ATLAS one month (mid-March to mid-April 2007) 20K analysis jobs, 10K simulation jobs only LCG jobs, others not shown data collected by a ARDA Dashboard sensor integrated with Ganga EGEE - SEE Tutorial, Budapest

Integration with frameworks 200K short jobs in 12 hours (500CPUh) - Diane is an grid job optimization layer (application level scheduling, pull-mode job scheduling …) which has been used for application tasks should be done in a short deadline. The weakest part of DIANE is that it doesn’t have the information about the Grid jobs … it has just the working information … so for some statistic analysis, the DIANE worker has to be instrumented to collect the Grid job information … instead of doing that, we simply delegate the Grid job management work to Ganga since Ganga has implemented a quite nice mechanism to monitor and to keep track the progress of the Grid jobs. At the end, we could simply do some statistic analysis using Ganga and its job repository. parallelism on task level (beyond DAG) customized failure recovery semi-interactivity EGEE - SEE Tutorial, Budapest

Web portal for biologists Interface created by biologists (Model-View-Controller design pattern) The Model makes use of Ganga as a submission tool and DIANE to better handle docking jobs on the Grid The Controller organizes a set of actions to perform the virtual screening pipeline; The View represents biological aspects EGEE - SEE Tutorial, Budapest

specific backend binding Extending Ganga Splitter Merger Datasets Application plugin generic logic specific backend binding Detailed guide available at: https://twiki.cern.ch/twiki/bin/view/ArdaGrid/HowToPlugInNewApplicationType A good example: python/GangaTutorial/Lib/*.py EGEE - SEE Tutorial, Budapest

More info. Ganga Home: http://cern.ch/ganga Official Ganga User’s Guide: http://cern.ch/ganga/user/html/GangaIntroduction/ GangaTutorial GPI Reference Manual : http://ganga.web.cern.ch/ganga/release/4.3.2/reports/html/Manuals/GangaTutorialManual.html Looking for help: project-ganga-developers@cern.ch EGEE - SEE Tutorial, Budapest