Bookkeeping Tutorial

Bookkeeping content
- Contains records of all “jobs” and all “files” that are produced by production jobs
- Job:
  - Technically a “step” in a workflow
    - E.g. “Gauss step”, “Brunel step”…
  - For real RAW data: the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. may have no replica)
    - All files are registered in order to keep the full history
  - Has metadata
    - Location, production number, application, CPUTime, etc…
- Files:
  - Always the output of a “job”
  - Files are defined by an LFN (Logical File Name)
  - Contain metadata
    - Number of events, size, event type, etc…
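
The job and file records described on this slide can be pictured with a small data model. This is only an illustrative sketch: the class names, field names and LFNs below are hypothetical and are not the actual bookkeeping schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileRecord:
    """A file, identified by its LFN; always the output of one job."""
    lfn: str
    has_replica: bool = True          # False when the file was not kept
    metadata: Dict[str, object] = field(default_factory=dict)
    # Back-reference to the producing job; left out of repr/compare so that
    # printing or comparing records does not walk the job <-> file cycle.
    produced_by: Optional["JobRecord"] = field(default=None, repr=False, compare=False)


@dataclass
class JobRecord:
    """A 'job': one workflow step, or a DAQ run for real RAW data."""
    name: str
    metadata: Dict[str, object] = field(default_factory=dict)
    inputs: List[FileRecord] = field(default_factory=list)    # empty for runs and Gauss
    outputs: List[FileRecord] = field(default_factory=list)

    def add_output(self, lfn: str, has_replica: bool = True, **metadata) -> FileRecord:
        """Register an output file, even one that will not be kept."""
        record = FileRecord(lfn, has_replica, dict(metadata), produced_by=self)
        self.outputs.append(record)
        return record


# Hypothetical two-step chain: Gauss has no input, Boole reads the Gauss output.
gauss = JobRecord("Gauss step", {"application": "Gauss"})
sim = gauss.add_output("/lhcb/example/00000001.sim", has_replica=False, events=500)
boole = JobRecord("Boole step", {"application": "Boole"}, inputs=[sim])
digi = boole.add_output("/lhcb/example/00000001.digi", events=500)

print(digi.produced_by.name)            # the job that produced the DIGI file
print(digi.produced_by.inputs[0].lfn)   # its input, recorded although it has no replica
```

Registering the SIM file even though it has no replica is what keeps the history of the DIGI file complete.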

Bookkeeping purpose
- Provenance database
  - Contains the full history of productions
  - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
  - Generate a Gaudi configuration file
  - Also gives access to the job/file tree
    - E.g. to investigate the history of a file
- Production dataset search
  - Select the dataset to be processed by production jobs
    - Ensures consistency of input files for a production
  - Uses the BK API directly to get the list of files
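
The two uses named above, selecting files that have a replica and walking back through a file's history, can be sketched as follows. The catalogue layout, the function names and the LFNs are assumptions made for illustration; this is not the BK API.

```python
from typing import Dict, List

# Hypothetical catalogue: LFN -> record. "parents" lists the input LFNs of
# the job that produced the file.
CATALOGUE: Dict[str, dict] = {
    "/lhcb/example/1.sim": {"file_type": "SIM", "has_replica": False, "parents": []},
    "/lhcb/example/1.digi": {"file_type": "DIGI", "has_replica": False,
                             "parents": ["/lhcb/example/1.sim"]},
    "/lhcb/example/1.dst": {"file_type": "DST", "has_replica": True,
                            "parents": ["/lhcb/example/1.digi"]},
    "/lhcb/example/2.dst": {"file_type": "DST", "has_replica": False, "parents": []},
}


def select_lfns(catalogue: Dict[str, dict], **criteria) -> List[str]:
    """Return LFNs matching every criterion, keeping only files with a replica."""
    return sorted(
        lfn
        for lfn, record in catalogue.items()
        if record["has_replica"]
        and all(record.get(key) == value for key, value in criteria.items())
    )


def ancestors(catalogue: Dict[str, dict], lfn: str) -> List[str]:
    """Walk the parent links breadth-first, nearest ancestor first."""
    found: List[str] = []
    pending = list(catalogue[lfn]["parents"])
    while pending:
        parent = pending.pop(0)
        if parent not in found:
            found.append(parent)
            pending.extend(catalogue[parent]["parents"])
    return found


dataset = select_lfns(CATALOGUE, file_type="DST")
history = ancestors(CATALOGUE, dataset[0])
print(dataset)
print(history)
```

The history can still be followed through files that were not kept, because they remain registered in the catalogue.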

Bookkeeping partitioning
- Configuration name / version
  - Real data
    - /
  - Simulated data
    - “MC” /
    - : “2008” / “DC06” / …
- Conditions
  - Parameters of initial data
    - All subsequent processed data inherit the “conditions”
  - Real data
    - DAQ conditions
    - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data
    - Simulation conditions
    - Beam energy, magnetic field, luminosity, generator settings…

Processing pass
- Associated with a level of processing
  - Within a given partition (config name / version + conditions)
- Corresponds to the whole processing workflow
  - Single workflow for a given processing pass
    - Compatible versions of applications
  - Specifies the processing pass of the input data when applicable
- Sequence of processing
  - Re-processing creates branches

[Diagram: applications and their output file types: Gauss (SIM), Boole (DIGI), Brunel (DST), DaVinci (ETC), Brunel (DST); processing pass labels: Sim, Reco, Stripping]
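
One way to picture "sequence of processing" and "re-processing creates branches" is a tree in which each pass records the pass whose output it reads. The class below is a hypothetical sketch, and the pass name "Reco-v2" is invented for the example.

```python
from typing import Dict, List, Optional


class ProcessingPasses:
    """Tree of processing passes; each pass points to its input pass."""

    def __init__(self) -> None:
        self._input_of: Dict[str, Optional[str]] = {}

    def add(self, name: str, input_pass: Optional[str] = None) -> None:
        if name in self._input_of:
            raise ValueError(f"pass {name!r} already exists")
        if input_pass is not None and input_pass not in self._input_of:
            raise KeyError(f"unknown input pass {input_pass!r}")
        self._input_of[name] = input_pass

    def sequence(self, name: str) -> List[str]:
        """Full chain of passes leading to `name`, first pass first."""
        chain: List[str] = []
        current: Optional[str] = name
        while current is not None:
            chain.append(current)
            current = self._input_of[current]
        return chain[::-1]

    def branches(self, name: str) -> List[str]:
        """Passes that take the output of `name` as their input."""
        return sorted(p for p, parent in self._input_of.items() if parent == name)


passes = ProcessingPasses()
passes.add("Sim")
passes.add("Reco", "Sim")
passes.add("Stripping", "Reco")
passes.add("Reco-v2", "Sim")   # hypothetical re-processing of the same Sim output

print(passes.sequence("Stripping"))
print(passes.branches("Sim"))
```

Adding the re-processing as a second child of "Sim" leaves the original "Reco" branch and everything derived from it untouched.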

Other query parameters
- Event type
  - File property
  - Real data
    - : real data full stream
    - : real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
    - Format not yet used

Running the bookkeeping GUI
- Needs a valid Grid certificate
- Needs an X server
- lhcb-bkk
  - SetupProject Dirac
    - Sets up the environment
  - If needed: lhcb-proxy-init
    - Creates a proxy
  - dirac-bookkeeping-gui
- Individual commands can be issued from the prompt!

The query tree

More info
- Right click on
  - Conditions
  - Processing pass

Event type and file type

Dataset selection
[Screenshot label: Logical File Name]

Saving configuration (a.k.a. options) file
- Python configuration (default)
- Still possible to create .opts (discouraged!)
- .txt file for just a list of LFNs
- All files or selected files (if any)
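
The simplest of these outputs, a .txt file holding one LFN per line for either the selected files or all files, can be sketched as below. The function name, file name and LFNs are hypothetical; the GUI's own writer is not shown in the slides.

```python
import os
import tempfile
from typing import Iterable, List, Optional


def save_lfn_list(path: str, lfns: Iterable[str],
                  selected: Optional[Iterable[str]] = None) -> List[str]:
    """Write one LFN per line: the selected files if any, otherwise all files."""
    picked = list(selected or [])
    chosen = picked or list(lfns)
    with open(path, "w") as out:
        out.writelines(lfn + "\n" for lfn in chosen)
    return chosen


all_lfns = [
    "/lhcb/example/1.dst",
    "/lhcb/example/2.dst",
    "/lhcb/example/3.dst",
]

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "dataset.txt")
    written = save_lfn_list(path, all_lfns, selected=all_lfns[:1])
    with open(path) as handle:
        read_back = handle.read().splitlines()

print(read_back)
```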

Dealing with PFNs or XML catalogs
- Using ganga + DIRAC
  - Bookkeeping integrated in ganga:
    - dataset = browseBK()
  - LFN handling is then automatic…
- If you really need an XML catalog or PFNs, use genXMLCatalog
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”

Dealing with XML catalog and PFNs

DIRAC Monitoring web portal

General information
- Entry point to the DIRAC web portal
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows actions to be taken on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal access
  - Get a certificate and load it in your browser
  - DIRAC authorisation through “DIRAC groups”
    - Default: lhcb_user
    - Other groups: lhcb_prod, dirac_admin…
    - Future: specific groups per physics group, PPG (for production authorisation)…
    - Capabilities depend on the group

The DIRAC portal home page
[Screenshot labels: Identity, DIRAC group, DIRAC instance, Menus]

Job Monitoring
[Screenshot labels: Selection, Monitoring info, Actions]

Job Monitoring (cont’d)
- Selection
  - For group lhcb_user, only see your own jobs
  - Can select with
    - Status
    - Site
    - Date
    - …
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sorting in the column
- Rows
  - Jobs displayed in pages (default 25 rows, don’t exceed 100)
  - Can scroll pages
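
The selection, sorting and paging behaviour listed above can be mimicked on a plain list of job records. This is only a sketch of the logic and not the portal's implementation: the field names, user names and site name are invented, and the 100-row cap is enforced here as an error although the slide only gives it as advice.

```python
from typing import Dict, List

MAX_PAGE_SIZE = 100   # the slide advises not to exceed 100 rows per page


def select_jobs(jobs: List[Dict], owner: str, **criteria) -> List[Dict]:
    """Keep the owner's own jobs that match every criterion (status, site, ...)."""
    return [
        job for job in jobs
        if job["owner"] == owner
        and all(job.get(key) == value for key, value in criteria.items())
    ]


def sort_jobs(jobs: List[Dict], column: str, descending: bool = False) -> List[Dict]:
    """Sort on one column; flipping `descending` mimics clicking the header again."""
    return sorted(jobs, key=lambda job: job[column], reverse=descending)


def page(rows: List[Dict], number: int, size: int = 25) -> List[Dict]:
    """Return page `number` (1-based) of `rows`."""
    if not 1 <= size <= MAX_PAGE_SIZE:
        raise ValueError(f"page size must be between 1 and {MAX_PAGE_SIZE}")
    start = (number - 1) * size
    return rows[start:start + size]


# Hypothetical job table: 120 jobs shared between two users.
STATUSES = ["Done", "Failed", "Running"]
jobs = [
    {"id": i, "owner": "alice" if i % 2 == 0 else "bob",
     "status": STATUSES[i % 3], "site": "LCG.CERN.ch"}
    for i in range(1, 121)
]

mine = select_jobs(jobs, "alice")
done = select_jobs(jobs, "alice", status="Done")
newest_first = sort_jobs(mine, "id", descending=True)
print(len(mine), len(done), len(page(mine, 1)), len(page(mine, 3)))
```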

Logging info

Output peeking

Attributes

Parameters