L’analisi in LHCb Angelo Carbone INFN Bologna

Introduction
Analysis in LHCb is handled by Ganga, an ATLAS/LHCb project enabling a user to perform the complete life cycle of a job: Build – Configure – Prepare – Submit – Monitor – Merge – Plot.
It allows jobs to be run on the local machine (interactively or in the background), on batch systems (LSF, PBS, …), and on the Grid.
Jobs look the same whether they run locally or on the Grid.
Workshop CCR e INFN-GRID 2009, 13th May 2009 - Angelo Carbone

LHCb jobs
For LHCb the main use of Ganga is running Gaudi jobs. This means:
Configuring the analysis applications
Specifying the datasets
Splitting and submitting the jobs
Managing the output data
Merging n-tuples and histogram files

The Ganga job object


Application
There is a specific application handler for each Gaudi app: ['Brunel', 'Moore', 'DaVinci', 'Gauss', 'Boole', 'Root', …]

# Define a DaVinci application object
d = DaVinci()
d.optsfile = d.user_release_area + '/myopts.py'

myopts.py contains the configuration of the user analysis (algorithms, variable cuts, input data sets, etc.), e.g.:

ApplicationMgr().EvtMax = 1000
HistogramPersistencySvc().OutputFile = 'DVHistos_1.root'
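The idea of one handler per Gaudi application can be illustrated with a minimal, self-contained sketch. This is not Ganga code: the class and registry names below are invented for illustration only.

```python
# Toy illustration of per-application handlers: a registry maps each
# Gaudi application name to a handler that knows how to configure it.
class GaudiHandler:
    def __init__(self, app):
        self.app = app
        self.optsfile = None  # path to the user's options file

    def configure(self):
        return f"{self.app} configured with options {self.optsfile}"

# One handler per supported Gaudi application (names from the slide).
HANDLERS = {app: GaudiHandler
            for app in ["Brunel", "Moore", "DaVinci", "Gauss", "Boole", "Root"]}

d = HANDLERS["DaVinci"]("DaVinci")
d.optsfile = "myopts.py"
print(d.configure())
```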


Backend
There are four backends of interest for running LHCb jobs:
Interactive – in the foreground on the client
Local – in the background on the client
LSF – on the LSF batch system (SGE/PBS/Condor systems are supported as well)
Dirac – on the Grid

# Define a Dirac backend object
d = Dirac()
print d
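The point of the backend abstraction is that the job definition stays the same and only the backend object changes. A minimal, self-contained sketch (not Ganga code; class and method names are invented for illustration):

```python
# Toy illustration of Ganga's backend abstraction: the same job
# dictionary is submitted unchanged through different backends.
class Backend:
    def submit(self, job):
        raise NotImplementedError

class Local(Backend):
    def submit(self, job):
        return f"running '{job['application']}' in the background on the client"

class Dirac(Backend):
    def submit(self, job):
        return f"submitted '{job['application']}' to the DIRAC WMS"

def submit(job, backend):
    # The job does not know or care which backend executes it.
    return backend.submit(job)

job = {"application": "DaVinci", "inputdata": ["LFN:/lhcb/..."]}
print(submit(job, Local()))
print(submit(job, Dirac()))
```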

Access to the Grid
The user sends the job to DIRAC
The WMS sends a pilot agent as a WLCG job
When the pilot agent runs safely on a worker node, it fetches the job from DIRAC
Small output files are returned in the sandbox
Large files are registered in the LFC file catalogue
The user queries DIRAC for the status and finally retrieves the output
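The pull model described above can be sketched as a toy simulation. The names below are invented and the real DIRAC behaviour is far more involved; this only shows the order of events: the user job waits in the central queue until a pilot agent, already running on a worker node, fetches it.

```python
from collections import deque

# Toy simulation of the DIRAC pilot-agent pull model.
class DiracWMS:
    def __init__(self):
        self.queue = deque()  # central task queue of user jobs

    def submit(self, job):
        self.queue.append(job)

    def fetch(self):
        # Called by a pilot agent once it runs safely on a worker node.
        return self.queue.popleft() if self.queue else None

class PilotAgent:
    def __init__(self, site):
        self.site = site  # pilot itself was submitted as an ordinary WLCG job

    def run(self, wms):
        job = wms.fetch()
        if job is None:
            return f"{self.site}: no job in queue, pilot exits"
        return f"{self.site}: running {job}"

wms = DiracWMS()
wms.submit("DaVinci analysis job")
print(PilotAgent("CNAF").run(wms))
```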


Input dataset
Use the LHCb bookkeeping to get a list of files to run over:
j.inputdata = browseBK() # opens the bookkeeping browser
Only LFNs are accessible


Output dataset
When a job has finished, the output directory contains the stdout and stderr of the job and your output-sandbox files.
Large output data files are uploaded to a storage element on the Grid:
Download them with j.backend.getOutputData()
Build a list of their LFNs with j.backend.getOutputDataLFNs()


Job splitting and data-driven submission
[Diagram: Ganga's splitter divides the main job, resolving its list of LFNs through the LFC catalogue into lists of PFNs, and submits the resulting sub-jobs to the sites hosting the data: CNAF, RAL, CERN, IN2P3, GRIDKA, PIC, NIKHEF.]
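The data-driven splitting in the diagram can be sketched as a small, self-contained example: group a job's input LFNs by the site that hosts a replica, producing one sub-job per site. This is not the real Ganga splitter; the catalogue contents and function name are invented for illustration.

```python
from collections import defaultdict

# Toy replica catalogue, standing in for what the LFC would report.
replica_catalogue = {
    "LFN:/lhcb/data/f1.dst": "CNAF",
    "LFN:/lhcb/data/f2.dst": "CNAF",
    "LFN:/lhcb/data/f3.dst": "RAL",
    "LFN:/lhcb/data/f4.dst": "IN2P3",
}

def split_by_site(lfns):
    # One sub-job per site, each carrying only the files local to it.
    subjobs = defaultdict(list)
    for lfn in lfns:
        subjobs[replica_catalogue[lfn]].append(lfn)
    return dict(subjobs)

subjobs = split_by_site(list(replica_catalogue))
for site, files in sorted(subjobs.items()):
    print(site, files)
```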

Merging
Jobs produce many output files that need to be merged together to obtain the final results. There is a different merger for each file type:
root  → RootMerger
text  → TextMerger
DST   → DSTMerger
Want something really special? CustomMerger
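In the spirit of the TextMerger above, a minimal sketch of what such a merger does: concatenate the text outputs of several sub-jobs into one file. The function and file names are invented for illustration; the real Ganga mergers handle ROOT and DST files as well.

```python
import os
import tempfile

def merge_text(inputs, output):
    # Concatenate sub-job text outputs, in order, into a single file.
    with open(output, "w") as out:
        for path in inputs:
            with open(path) as f:
                out.write(f.read())

tmp = tempfile.mkdtemp()
parts = []
for i in range(3):
    path = os.path.join(tmp, f"subjob{i}.txt")
    with open(path, "w") as f:
        f.write(f"events processed by sub-job {i}\n")
    parts.append(path)

merged = os.path.join(tmp, "merged.txt")
merge_text(parts, merged)
print(open(merged).read())
```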

Monitoring

Ganga end-users
Over 1000 unique users in the past 6 months: generally 50% ATLAS (blue), 25% LHCb (green), 25% other.
About 500 unique users monthly; about 2000 unique users since January 2007.
(The dip in the plot was caused by a monitoring outage.)

Job efficiency

Failures
Data-access failure (19%). There are two main causes for the 19% of jobs failing to access input data from the worker node. The first is instability in the SRM layer at the Tier-1 sites, which prevents the construction of TURLs for the software application to access the input datasets. The other is zero-size or incorrectly registered dataset replicas, for which it is impossible to obtain a correct TURL.

Stalled (8%)
A job is 'stalled' if the Job Monitoring Service stops receiving its signs of life. One of the main causes is user proxy expiration on the worker node: submitted pilot agents may wait in a site batch queue for several hours, a significant portion of the default (12-hour) proxy validity. Other causes are application failures, loss of open data connections at sites, and user code crashes, all of which can exhaust the available wall-clock time of the resource.

Other minor failures
Failed to upload output data (1%). This is caused by the transfer-and-register operation to the LFC failing. It can happen due to network outages, power cuts, site misconfigurations, and LFC downtime.
Application failure (1%). The job wrapper can identify the exit state of the software applications running on the Grid. A common cause of this type of failure is corrupted software shared areas at the sites.

Conclusion
The LHCb distributed analysis framework allows users to transparently submit jobs to the Grid.
The real job efficiency measured so far is ~70%. The main sources of failure are data inconsistencies and service instabilities.
Although usable (and used), Grid analysis for LHCb is not yet at production quality: still far from 99.99999999999%…