Bookkeeping Tutorial

Bookkeeping & Monitoring Tutorial

Bookkeeping content
- Contains records of all “jobs” and all “files” that are created by production jobs
- Job:
  - Technically a “step” in a workflow
    - E.g. “Gauss step”, “Brunel step”…
  - For real RAW data: the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. may have no replica)
    - All files are registered in order to keep the full history
  - Has metadata
    - Location, production number, application, CPUTime, etc.
- Files:
  - Always defined as output of a “job”
  - Files are defined by an LFN (Logical File Name)
  - Contain metadata
    - Number of events, size, event type, etc.
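The job/file relationship described on this slide can be pictured with a small Python sketch. This is only an illustration of the data model, not the actual bookkeeping schema: the class names, field names and LFNs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRecord:
    """A file is always the output of a job and is identified by its LFN."""
    lfn: str
    # Back-reference to the producing job; excluded from repr/eq to avoid cycles.
    produced_by: Optional["JobRecord"] = field(default=None, repr=False, compare=False)
    has_replica: bool = False  # registered files are not necessarily kept
    metadata: Dict[str, object] = field(default_factory=dict)


@dataclass
class JobRecord:
    """A 'job' is one step of a workflow (or a DAQ run for real RAW data)."""
    application: str
    inputs: List[FileRecord] = field(default_factory=list)
    outputs: List[FileRecord] = field(default_factory=list)
    metadata: Dict[str, object] = field(default_factory=dict)

    def add_output(self, lfn: str, has_replica: bool = False, **metadata) -> FileRecord:
        """Register an output file, whether or not a replica is kept."""
        record = FileRecord(lfn, self, has_replica, dict(metadata))
        self.outputs.append(record)
        return record


# Hypothetical two-step chain: a Gauss step has no input files.
gauss = JobRecord("Gauss")
sim = gauss.add_output("/lhcb/example/0001.sim", events=500)
boole = JobRecord("Boole", inputs=[sim])
digi = boole.add_output("/lhcb/example/0001.digi", has_replica=True, events=500)
```

Here the intermediate file is registered without a replica, so the history stays complete even though the file itself is not kept.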

Bookkeeping purpose
- Provenance database
  - Contains the full history of productions
  - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
  - Generate Gaudi configuration file
  - Also gives access to the job/file tree
    - E.g. investigate history of a file
- Production datasets search
  - Select the dataset to be processed by production jobs
  - Ensures consistency of input files for a production
  - Uses directly the BK API to get the list of files
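As a rough sketch of what "investigate history of a file" and "only files with a replica" mean in practice, the following Python walks a toy provenance table. The table layout, the function names and the LFNs are assumptions made for illustration; this is not the BK API.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical provenance table: LFN -> (producing step, input LFNs).
PROVENANCE: Dict[str, Tuple[str, List[str]]] = {
    "/lhcb/example/0001.sim": ("Gauss", []),
    "/lhcb/example/0001.digi": ("Boole", ["/lhcb/example/0001.sim"]),
    "/lhcb/example/0001.dst": ("Brunel", ["/lhcb/example/0001.digi"]),
}
# Hypothetical set of LFNs that still have a replica.
REPLICAS: Set[str] = {"/lhcb/example/0001.dst"}


def history(lfn: str, provenance=PROVENANCE) -> List[Tuple[str, str]]:
    """Return (step, lfn) pairs from the given file back to its origin."""
    chain, stack, seen = [], [lfn], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        step, inputs = provenance[current]
        chain.append((step, current))
        stack.extend(inputs)
    return chain


def searchable(lfns, replicas=REPLICAS) -> List[str]:
    """A user dataset search only returns files that have a replica."""
    return [lfn for lfn in lfns if lfn in replicas]
```

The ancestors remain traceable through `history` even though `searchable` would no longer offer them to a user.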

Bookkeeping partitioning
- Configuration Name / version
  - Real data
    - /
  - Simulated data
    - “MC” / : “DC06” / “MC09” …
- Conditions
  - Parameters of initial data
  - All subsequent processed data inherit the “conditions”
  - Real data
    - DAQ conditions
    - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data
    - Simulation conditions
    - Beam energy, magnetic field, luminosity, generator settings…

Processing pass
- Associated to a level of processing
  - Within a given partition (config name / version + conditions)
- Corresponds to the whole processing workflow
  - Single workflow for a given processing pass
  - Compatible versions of applications
  - Specifies the processing pass of input data when applicable
- Sequence of processing
  - Re-processing creates branches

[Diagram: Gauss (SIM), Boole (DIGI), Brunel (DST), DaVinci (ETC), with a second Brunel (DST) branch; labels: SimReco, Stripping]

Other query parameters
- Event type
  - File property
  - Real data
    - : real data full stream
    - : real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
  - Format not yet used
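Taken together, the partition (configuration name / version and conditions), the processing pass, the event type and the file type act as successive selection criteria. The sketch below illustrates that idea on an in-memory list. The field names, condition strings, event type labels and LFNs are hypothetical placeholders, not real bookkeeping values.

```python
from typing import List, NamedTuple


class BkEntry(NamedTuple):
    config_name: str
    config_version: str
    conditions: str
    processing_pass: str
    event_type: str
    file_type: str
    lfn: str


def select(entries: List[BkEntry], **criteria: str) -> List[BkEntry]:
    """Keep the entries whose fields match every given criterion."""
    return [
        entry for entry in entries
        if all(getattr(entry, name) == value for name, value in criteria.items())
    ]


# Hypothetical catalogue content.
ENTRIES = [
    BkEntry("MC", "MC09", "sim-cond-A", "SimReco", "evt-A", "DST", "/lhcb/example/a.dst"),
    BkEntry("MC", "MC09", "sim-cond-A", "SimReco", "evt-B", "DST", "/lhcb/example/b.dst"),
    BkEntry("MC", "DC06", "sim-cond-B", "Stripping", "evt-A", "ETC", "/lhcb/example/c.etc"),
]

mc09_dst = select(ENTRIES, config_name="MC", config_version="MC09", file_type="DST")
```

Each extra keyword narrows the selection further, much like descending one more level in a query tree.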

Running the bookkeeping GUI
- Needs a valid Grid certificate
- Needs an X server
- On lxplus: lhcb-bkk
- SetupProject Dirac
  - Sets up the environment
- If needed: lhcb-proxy-init
  - Creates a valid Grid proxy
- dirac-bookkeeping-gui
  - Individual commands can be issued from the prompt!
- You can also install Dirac locally on your Linux machine:
  - Installing_DIRAC_on_non_CERN_mac

The query tree

More info
- Right click on
  - Conditions
  - Processing pass

Event type and file type

Dataset selection
[Screenshot label: Logical File name]

Limit number of files per page

Saving configuration (a.k.a. options) file
- Python configuration (default)
- Still possible to create .opts (discouraged!)
- .txt file for just a list of LFNs
- All files or selected files (if any)
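To give an idea of the two main output flavours, here is a hedged sketch that turns a list of LFNs into a Python options snippet and into a plain .txt list. The slides do not show the text the GUI actually writes: the `EventSelector().Input` layout below is an assumption based on common Gaudi conventions of that period, and the function names and LFN are hypothetical.

```python
from typing import Iterable


def gaudi_options(lfns: Iterable[str]) -> str:
    """Build the text of a Python options file listing the input LFNs.

    Assumed layout: one DATAFILE string per LFN in EventSelector().Input.
    """
    lines = ["from Gaudi.Configuration import *", "", "EventSelector().Input = ["]
    for lfn in lfns:
        lines.append(f"    \"DATAFILE='LFN:{lfn}' TYP='POOL_ROOTTREE' OPT='READ'\",")
    lines.append("]")
    return "\n".join(lines) + "\n"


def lfn_list(lfns: Iterable[str]) -> str:
    """Build the text of a .txt file: one LFN per line."""
    return "".join(f"{lfn}\n" for lfn in lfns)


options_text = gaudi_options(["/lhcb/example/a.dst"])
```

The generated text is only a string here; writing it to a file and feeding it to an application is left out on purpose.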

Advanced saving
- Select files for a site (for local usage, not Grid job)
- LFN + XML catalog
  - Next slide
- PFN

Advanced saving (LFN)
- LFNs + XML catalog
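An XML catalog essentially maps each LFN to a physical file name (PFN) at the chosen site. The sketch below builds a simplified catalog in the general style of a POOL XML file catalog. The element layout is reduced to the LFN/PFN pairing (real catalogs carry more attributes, such as file identifiers), and the PFN shown is a made-up placeholder: in practice it would be obtained from the storage system rather than typed in.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_catalog(replicas: Dict[str, str]) -> str:
    """Return a simplified XML catalog mapping each LFN to one PFN."""
    root = ET.Element("POOLFILECATALOG")
    for lfn, pfn in replicas.items():
        file_el = ET.SubElement(root, "File")
        physical = ET.SubElement(file_el, "physical")
        ET.SubElement(physical, "pfn", name=pfn)
        logical = ET.SubElement(file_el, "logical")
        ET.SubElement(logical, "lfn", name=lfn)
    return ET.tostring(root, encoding="unicode")


# Hypothetical LFN -> PFN pair for a single site.
catalog_xml = build_catalog(
    {"/lhcb/example/a.dst": "protocol://storage.example.org/lhcb/example/a.dst"}
)
```

With such a catalog, a job can keep referring to files by LFN while the catalog resolves them to local copies.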

Other queries
- Select another “tree”
  - Different order for the query
- Production lookup
  - If you are interested in a particular production number
- Run lookup
  - For real data (currently FEST)

Dealing with PFNs or XML catalogs
- Using ganga + DIRAC
  - Bookkeeping integrated in ganga:
    - dataset = browseBK()
    - LFN handling is then automatic…
- genXMLCatalog
  - Same functionality as “Advanced save” of GUI
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”

DIRAC Monitoring web portal

General information
- Entry point to the DIRAC web portal
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows actions to be taken on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal information
  - Get a certificate and load it in your browser
- DIRAC authorisation through “DIRAC groups”
  - Default: lhcb_user
  - Other groups: lhcb_prod, dirac_admin…
  - Future: specific groups per physics group, PPG (for production authorisation)…
  - Capabilities depend on the group

The DIRAC portal home page
[Screenshot labels: Identity, DIRAC group, DIRAC instance, Menus]

Job Monitoring
[Screenshot labels: Selection, Monitoring info, Actions]

Job Monitoring (cont’d)
- Selection
  - For group lhcb_user, only see your own jobs
  - Can select with
    - Status
    - Site
    - Date
    - …
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sorting in the column
- Rows
  - Jobs displayed in pages (default 25 rows, don’t exceed 100)
  - Can scroll pages
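The selection, sorting and paging behaviour listed above can be mimicked on a plain list of job records. This is only a sketch of the logic, not the portal's code: the function names, the record fields, the owner and the site name are hypothetical, and clamping the page size at 100 is one possible reading of "don't exceed 100".

```python
from typing import Dict, List

DEFAULT_PAGE_SIZE = 25   # default number of rows per page
MAX_PAGE_SIZE = 100      # upper bound suggested on the slide


def select_jobs(jobs: List[Dict], **criteria) -> List[Dict]:
    """Keep the jobs matching every criterion (owner, status, site, ...)."""
    return [job for job in jobs if all(job.get(k) == v for k, v in criteria.items())]


def sort_jobs(jobs: List[Dict], column: str, descending: bool = False) -> List[Dict]:
    """Sort by one column; flipping 'descending' mimics clicking the header."""
    return sorted(jobs, key=lambda job: job[column], reverse=descending)


def page(jobs: List[Dict], number: int, size: int = DEFAULT_PAGE_SIZE) -> List[Dict]:
    """Return page 'number' (0-based), never more than MAX_PAGE_SIZE rows."""
    size = max(1, min(size, MAX_PAGE_SIZE))
    start = number * size
    return jobs[start:start + size]


# Hypothetical job list for one user.
jobs = [
    {"id": i, "owner": "alice", "status": "Done" if i % 2 == 0 else "Failed", "site": "Site.A"}
    for i in range(60)
]
failed = select_jobs(jobs, owner="alice", status="Failed")
first_page = page(sort_jobs(failed, "id", descending=True), 0)
```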

Logging info

Output peeking

Attributes

Parameters

Job statistics

Accounting
- Gives you access to your jobs
- Select parameters:
  - Plot
  - Time range
  - Item to plot against (site, status…)
- Selection criteria
  - Site
  - (Final) Status
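An accounting plot of this kind boils down to counting jobs in a time range, grouped by the item plotted against and restricted by the selection criteria. The sketch below shows that aggregation on a few made-up records. The record fields, site names and dates are hypothetical and unrelated to the portal's actual data.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List


def jobs_per(records: List[Dict], key: str, start: datetime, end: datetime,
             **criteria) -> Counter:
    """Count jobs ending in [start, end), grouped by 'key' (site, status...)."""
    counts: Counter = Counter()
    for record in records:
        if not (start <= record["end_time"] < end):
            continue
        if any(record.get(name) != value for name, value in criteria.items()):
            continue
        counts[record[key]] += 1
    return counts


# Hypothetical accounting records.
RECORDS = [
    {"site": "Site.A", "status": "Done", "end_time": datetime(2009, 3, 1)},
    {"site": "Site.A", "status": "Failed", "end_time": datetime(2009, 3, 2)},
    {"site": "Site.B", "status": "Done", "end_time": datetime(2009, 3, 3)},
    {"site": "Site.B", "status": "Done", "end_time": datetime(2009, 4, 1)},
]

march = (datetime(2009, 3, 1), datetime(2009, 4, 1))
per_site = jobs_per(RECORDS, "site", *march)
done_per_site = jobs_per(RECORDS, "site", *march, status="Done")
```

Swapping `key` between "site" and "status" corresponds to changing the item plotted against, while the keyword arguments play the role of the selection criteria.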

Accounting screenshots

Accounting (cont’d)

Job CPU efficiency