Distributed Analysis using GANGA

Presentation transcript:

Distributed Analysis using GANGA. Farida Fassi, CCIN2P3/CNRS, Lyon, France

Outline
  Ganga Overview
  Ganga Architecture
  How to use Ganga
  More on Ganga usage

Ganga Overview: Ganga is a Gaudi/Athena and Grid Alliance project jointly developed by the ATLAS and LHCb experiments. Ganga is an easy-to-use front-end for job definition and management, enabling a user to configure, prepare, submit and monitor jobs. Ganga tries to answer the question: how can the user’s effort in running applications be minimized?

Ganga Overview: The naive idea of submitting jobs to the Grid assumes the following steps: preparing the “Job Description Language” file for job configuration; finding a suitable Athena software application; locating the datasets on different storage elements; job splitting, monitoring and book-keeping. Ganga combines these components to provide a front-end client for interacting with Grid infrastructures.

Ganga

Ganga Overview: Ganga allows simple switching between testing on a local batch system and large-scale data processing on distributed Grid resources. Jobs look the same whether they run locally or on the Grid: configure once, run anywhere.
[Figure: GANGA submitting to the local machine, to local batch systems (LSF, PBS) and to the Grid (EGEE, OSG, NorduGrid)]

Architecture: The Job object is where the Ganga journey starts. A job in Ganga is constructed from a set of building blocks, some mandatory and some optional; not all are required for every job. Ganga adopts the traditional way scientists run their applications: everything starts from a “job”. What is different is that Ganga puts a layer of abstraction on top of the physical job. The abstraction lets users decide how their job should be executed. There are 6 building blocks in the job abstraction.
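The idea of a job assembled from building blocks can be sketched in plain Python. This is only a stand-in model, not the Ganga Job class: the class name ToyJob is hypothetical, the six field names are taken from the attributes used in the Athena example later in this talk, and treating application and backend as the mandatory blocks is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ToyJob:
    """Stand-in for a job description; NOT the Ganga Job class."""
    application: str                   # assumed mandatory: what to run
    backend: str                       # assumed mandatory: where to run it
    inputdata: Optional[str] = None    # optional building blocks
    outputdata: Optional[str] = None
    splitter: Optional[str] = None
    merger: Optional[str] = None


# Define the job once, for a quick test on the local machine.
local_test = ToyJob(application="Athena", backend="Local")

# Move the same job to a batch system and then to the Grid by changing
# only the backend block; every other block is carried over unchanged.
batch_run = replace(local_test, backend="LSF")
grid_run = replace(local_test, backend="LCG", splitter="AthenaSplitterJob")

for job in (local_test, batch_run, grid_run):
    print(job)
```

The strings stand in for plug-in objects; the point is only that the backend is one replaceable block among six, which is what makes "configure once, run anywhere" possible.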

Architecture: Customized applications and a plug-in based design ease job creation. An analysis can be developed incrementally by switching between different technologies: first a test on the local machine, then an intermediate sample analyzed on a batch system, then the full sample run using Grid backends. Plug-ins share a common interface: the common part is handled by ordinary objects, while the specific part is based on a “schema”, a Python mechanism that exposes a specific implementation through its abstraction.

User interfaces: CLIP, GUI, GPI & Scripting

CLIP session:
  *** Welcome to Ganga ***
  Version: Ganga-4-4-2
  Documentation and support: http://cern.ch/ganga
  Type help() or help('index') for online help.
  In [1]: jobs
  Out[1]: Statistics: 1 jobs
  --------------
  # id status name subjobs application backend backend.actualCE
  # 27 completed TestGroupArea-ific Athena LCG ce01.ific.uv.es:2119/jobmanager-pbs-short

GPI & Scripting (file myjob.exec):
  #!/usr/bin/env ganga
  #-*-python-*-
  import time
  j = Job()
  j.backend = LCG()
  j.submit()
  while j.status not in ['completed', 'failed']:
      print('job still running')
      time.sleep(30)

Ways to run the script:
  ./myjob.exec
  ganga ./myjob.exec
  In [1]: execfile("myjob.exec")

We will focus on the Ganga CLIP (Command Line Interface for Python).
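The polling loop in the script above runs forever if the job never reaches a final state. A possible variant adds a timeout. The sketch below is plain Python and does not use Ganga: the helper name wait_for_final_status and its parameters are hypothetical, and the status sequence is simulated. Inside a Ganga script one would presumably pass something like lambda: j.status as the status source.

```python
import time


def wait_for_final_status(get_status, poll_seconds=30.0, timeout_seconds=3600.0,
                          final_states=("completed", "failed")):
    """Poll get_status() until it returns a final state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in final_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job still in state %r after timeout" % status)
        print("job still running")
        time.sleep(poll_seconds)


# Simulated status source standing in for a real job (hypothetical values).
statuses = iter(["submitted", "running", "running", "completed"])
final_status = wait_for_final_status(lambda: next(statuses), poll_seconds=0.0)
print("final status:", final_status)
```

Passing the status source as a callable keeps the waiting logic independent of where the status comes from, so the same helper can be exercised without a Grid connection.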

How to use Ganga

Configurations
Syntax: Python ConfigParser standard
  [configuration]
  TextShell = IPython
  ...
  [LCG]
  VirtualOrganisation=atlas
  [athena]
  LCGOutputLocation = srm://lsrm.ific.uv.es/lustre/ific.uv.es/grid/atlas/dq2/users/
  LocalOutputLocation = srm://lsrm.ific.uv.es/lustre/ific.uv.es/grid/atlas/dq2/users/
  ATLAS_SOFTWARE = /opt/exp_software/atlas/prod/releases/rel_12-0_2
  ...
How to set configurations:
  release config: hardcoded configurations
  site config: setenv GANGA_CONFIG_PATH GangaAtlas/Atlas.ini
               set path = (/afs/ific.uv.es/project/atlas/software/ganga/install/4.4.2/bin/ $path)
  user config: ~/.gangarc (ganga -g)
Sequence: user config > site config > release config

Configurations: Ganga processes, in the order they are specified, any configuration files pointed to by the environment variable GANGA_CONFIG_PATH, and then processes the “.gangarc” configuration file. This makes it possible to use group configuration files while still allowing their settings to be overridden by the user configuration.
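The override sequence can be illustrated with the standard-library configparser module, which implements the same file syntax. This is a sketch of the layering principle only, not of how Ganga itself loads its configuration: the section and option names come from the slide, while the release-level and user-level values ("dteam", "Console") are invented for the illustration.

```python
from configparser import ConfigParser

# Hypothetical contents of the three layers. Only the section and option
# names are from the slide; the differing values are made up.
RELEASE_CONFIG = """
[configuration]
TextShell = IPython
[LCG]
VirtualOrganisation = dteam
"""

SITE_CONFIG = """
[LCG]
VirtualOrganisation = atlas
"""

USER_CONFIG = """
[configuration]
TextShell = Console
"""


def load_layers(*layers):
    """Read the layers in order; a later layer overrides an earlier one."""
    parser = ConfigParser()
    for text in layers:
        parser.read_string(text)
    return parser


config = load_layers(RELEASE_CONFIG, SITE_CONFIG, USER_CONFIG)
print(config["LCG"]["VirtualOrganisation"])   # site value replaces release value
print(config["configuration"]["TextShell"])   # user value replaces release value
```

Because the user layer is read last, anything set there wins, which matches the sequence user config > site config > release config.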

Ganga Workspace: At the first launch, Ganga creates a directory gangadir in your home directory and uses it for storing job-related files and information (metadata of jobs and data of jobs). An alternative location can be given in the configuration: [DefaultJobRepository] local_root = /alternative/gangadir [ It is possible to use the same Ganga instance to maintain multiple repositories (quite useful for keeping the jobs of different projects separate).

“Hello World” example: CLIP
From a Ganga CLIP session, a job that writes “Hello World” can be created, and submitted to LCG, as follows:
  app = Executable()
  app.exe = "/bin/echo"
  app.env = {}
  app.args = ["Hello World"]
  # Property values set above are in fact the defaults
  # for the Executable application
  j = Job(application = app, backend = LCG())
  j.submit()
  # Check on job progress
  jobs
  # When the job has completed, check the output
  j.peek("stdout")

ATLAS Analysis Job
See Santi’s talk for the ATLAS Analysis Model and Data Format.
ATLAS applications: Athena and AthenaMC
Data input:
  DQ2Dataset: all DQ2 dataset handling in the client, LFC/SE interaction on the worker node, used by all backends
  ATLASDataset: LFC file access
  ATLASLocalDataset: local file system, Local/Batch backend
Data output:
  DQ2OutputDataset: stores files on a Grid SE, with registration in DQ2
  AtlasOutputDataset: multipurpose, for Grid and Local output

Athena example: CLIP
This assumes that you are in the ATLAS VO, that your cmt area is set up, and that you have checked out and built your package in a work area (see the demo next).
  j = Job()
  j.name='Test-AthenaJob-IFIC'
  # Application
  j.application = Athena()
  j.application.exclude_from_user_area=["*.o","*.root.*","*.exe"]
  j.application.prepare(athena_compile=False)
  j.application.option_file='$HOME/AthenaTerstArea/12.0.6/PhysicsAnalysis/AnalysisCommon/UserAnalysis/UserAnalysis-00-09-10/run/AnalysisSkeleton_topOptions.py'
  j.application.atlas_release='12.0.6'
  j.application.max_events='10'
  # InputData (the dataset object must exist before its type is set)
  j.inputdata=DQ2Dataset()
  j.inputdata.type='DQ2_LOCAL'
  j.inputdata.dataset="trig1_misal1_mc12.005186.PythiaZmumu_pt100_fixed.recon.AOD.v12000601_tid005906"
  # Splitter & Merger
  j.splitter = AthenaSplitterJob(numsubjobs=2)
  j.merger = AthenaOutputMerger()
  # OutputData
  j.outputdata=DQ2OutputDataset()
  j.outputdata.outputdata=['AnalysisSkeleton.aan.root']
  # Submission
  j.backend=LCG()
  j.backend.CE='ce01.ific.uv.es:2119/jobmanager-pbs-short'
  j.submit()
This is what a typical “real application” job looks like. It is very easy to translate it to “scripting mode”, which immediately gives users another way to run their applications.
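What a splitter and a merger do conceptually can be sketched without Ganga. The functions below are hypothetical helpers, not the behaviour of AthenaSplitterJob or AthenaOutputMerger: the round-robin policy is an assumption chosen for simplicity, the input file names are invented, and "merging" here only collects output file names rather than combining ROOT files.

```python
def split_round_robin(files, num_subjobs):
    """Distribute input files over at most num_subjobs subjobs."""
    if num_subjobs < 1:
        raise ValueError("num_subjobs must be at least 1")
    chunks = [files[i::num_subjobs] for i in range(num_subjobs)]
    # Drop empty chunks when there are fewer files than subjobs.
    return [chunk for chunk in chunks if chunk]


def merge_outputs(per_subjob_outputs):
    """Collect the outputs of all subjobs into one flat list."""
    merged = []
    for outputs in per_subjob_outputs:
        merged.extend(outputs)
    return merged


# Hypothetical input files of a dataset.
input_files = ["AOD._%05d.pool.root" % n for n in range(1, 6)]

subjobs = split_round_robin(input_files, num_subjobs=2)
for index, chunk in enumerate(subjobs):
    print("subjob", index, "processes", chunk)

# Each subjob is assumed to produce one output file in its own directory.
outputs = [["subjob%d/AnalysisSkeleton.aan.root" % i] for i in range(len(subjobs))]
all_outputs = merge_outputs(outputs)
print(all_outputs)
```

The master job stays a single object from the user's point of view; the splitter decides how the work is divided and the merger brings the per-subjob results back together.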

Ganga CLIP commands (1)
Useful commands:
  list_plugins("type")       # List plugins of specified type:
                             # "applications", "backends", etc
  j1 = Job(backend = LSF())  # Create a new job for LSF
  a1 = Executable()          # Create Executable application
  j1.application = a1        # Set value for job's application
  j1.backend = LCG()         # Change job's backend to LCG
  export(j1, "myJob.py")     # Write job to specified file
  load("myJob.py")           # Load job(s) from specified file
  j2 = j1.copy()             # Create j2 as a copy of job j1
  jobs                       # List jobs
  jobs[i].subjobs            # List subjobs for split job i

Ganga CLIP commands (2)
When a job j has been defined, the following methods can be used:
  j.submit()   # Submit the job
  j.kill()     # Kill the job (if running)
  j.remove()   # Kill the job and delete associated files
  j.peek()     # List files in job's output directory
Once a job has been submitted, it can no longer be modified or resubmitted, but it can be copied, and the copy can be modified and submitted.
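The rule that a submitted job is frozen while its copy is editable can be modelled in a few lines. The class below is a hypothetical stand-in written for this illustration and is not how Ganga implements the rule; in a real session the corresponding steps would be j2 = j.copy(), a change to j2, and j2.submit().

```python
class ToyJob:
    """Stand-in illustrating the rule; NOT the Ganga Job class."""

    def __init__(self, backend):
        # Bypass __setattr__ while the object is being built.
        object.__setattr__(self, "status", "new")
        object.__setattr__(self, "backend", backend)

    def __setattr__(self, name, value):
        if self.status != "new":
            raise RuntimeError("job already submitted: copy it and edit the copy")
        object.__setattr__(self, name, value)

    def submit(self):
        if self.status != "new":
            raise RuntimeError("job already submitted: it cannot be resubmitted")
        object.__setattr__(self, "status", "submitted")

    def copy(self):
        # The copy starts again in the editable "new" state.
        return ToyJob(self.backend)


j = ToyJob(backend="LCG")
j.submit()

try:
    j.backend = "LSF"      # rejected: j has already been submitted
except RuntimeError as err:
    print(err)

j2 = j.copy()              # fresh, editable copy
j2.backend = "LSF"
j2.submit()
print(j.backend, j.status, j2.backend, j2.status)
```

Keeping submitted jobs immutable means the stored record of a job always matches what was actually run.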

Hands-on: using Ganga CLIP. The hands-on session will be based on the following URL: https://twiki.ific.uv.es/twiki/bin/view/Atlas/AtlasTier2

More about Ganga: Ganga beyond ATLAS and LHCb

GANGA Activities: main users and other activities (Garfield, HARP)

Ganga Activities: about 50 domains

More than 300 Ganga users
  This plot shows the establishment of the Ganga user community.
  ATLAS and LHCb users dominate the user community, while the number of users from other communities is growing.
  On average, roughly 40-50 unique users use Ganga every day, and the trend shows this number growing.
  Overall, 300 unique users have started using Ganga in the last 2 months.

More info
  Ganga Home: http://cern.ch/ganga
  Official Ganga User’s Guide: http://ganga.web.cern.ch/ganga/user/html/GangaIntroduction/
  Tutorial for ATLAS data analysis using Ganga: https://twiki.cern.ch/twiki/bin/view/Atlas/DistributedAnalysisUsingGanga
Looking for help
  ATLAS user support: hn-atlasGANGAUserDeveloper@cern.ch
  Direct support from developers: project-ganga-developers@cern.ch