LHCb Development. Glenn Patrick, Rutherford Appleton Laboratory. GridPP9, 4th February 2004.

Presentation transcript:

Slide 1: LHCb Development. Glenn Patrick, Rutherford Appleton Laboratory.

Slide 2: LHCb - Reminder. [Detector schematic: VELO, RICH1, RICH2, tracking stations (inner and outer), magnet, calorimeters, muon system; the detector is about 20 m long, has 1.2M electronic channels and weighs ~4,000 tonnes; B meson and anti-B meson (b, d quark content) production is illustrated.]

Slide 3: LHCb GridPP Development. LHCb development has been taking place on three fronts:
- MC Production Control and Monitoring: Gennady Kuznetsov (RAL)
- Data Management: Carmine Cioffi (Oxford), Karl Harrison (Cambridge)
- GANGA: Alexander Soroko (Oxford), Karl Harrison (Cambridge)
All developed in tandem with the LHCb Data Challenges.

Slide 4: Data Challenge DC03. A "Physics" Data Challenge, used to redesign and optimise the detector.
- 65M events processed.
- Distributed over 19 different centres.
- Averaged 830,000 events/day.
- Equivalent to 2,300 × 1.5 GHz computers.
- 34% processed in the UK at 7 different institutes.
- All data written to CERN.

Slide 5: The LHCb Detector. Changes were made for material reduction and L1 trigger improvement:
- Reduced number of layers for M1 (4 → 2).
- Reduced number of tracking stations behind the magnet (4 → 3).
- No tracking chambers in the magnet.
- No B-field shielding plate.
- Full Si station.
- Reoptimised RICH-1 design.
- Reduced number of VELO stations (25 → 21).

Slide 6: "Detector" TDRs completed. Only the Computing TDR remains.

Slide 7: Data Challenge 2004. A "Computing" Data Challenge, April - June 2004.
- Produce 10× more events.
- At least 50% to be done via LCG.
- Store data at the nearest Tier-1 (i.e. RAL for UK institutes).
- Try out distributed analysis.
- Test the computing model and write the Computing TDR.
- Requires a stable LCG2 release with SRM interfaced to the RAL DataStore.

Slide 8: DC04 - UK Tier-2 Centres.
- NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
- SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD
- ScotGrid: Durham, Edinburgh, Glasgow
- LondonGrid: Brunel, Imperial, QMUL, RHUL, UCL

Slide 9: DIRAC Architecture. [Architecture diagram: User Interface and API over DIRAC components for Information Service, Authentication, Authorisation, Auditing, Grid Monitoring, Workload Management, Metadata Catalogue, File Catalogue, Data Management, Computing Element, Storage Element, Job Provenance, Package Manager and Accounting; other project components (AliEn, LCG, ...) are reused, and the resources are LCG and the LHCb production sites.]

Slide 10: MC Control Status (Gennady Kuznetsov). A control toolkit breaking the production workflow down into components: modules and steps. To be deployed in DC04. SUCCESS! DIRAC = Distributed Infrastructure with Remote Agent Control.

Slide 11: DIRAC v1.0, original scheme: "pull" rather than "push". [Diagram: agents running at Sites A-D get jobs from the central Production service and send monitoring information and bookkeeping data back to the Monitoring and Bookkeeping services.]
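To make the "pull" model concrete, the following is a minimal sketch of an agent loop; the service URL and the request_job/report_status method names are hypothetical stand-ins, not the actual DIRAC interfaces.

```python
# Minimal sketch of a "pull" agent, assuming a hypothetical XML-RPC
# production service; method and field names are illustrative only.
import time
import xmlrpc.client

SERVICE_URL = "http://example.org:8080/production"  # hypothetical endpoint
SITE = "Site-A"

def run_agent():
    service = xmlrpc.client.ServerProxy(SERVICE_URL)
    while True:
        job = service.request_job(SITE)           # the agent asks for work ("pull")
        if not job:
            time.sleep(60)                        # nothing queued, try again later
            continue
        service.report_status(job["id"], "running")
        # ... execute the job script locally, e.g. via the site batch system ...
        service.report_status(job["id"], "done")  # monitoring/bookkeeping update

if __name__ == "__main__":
    run_agent()
```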

Slide 12: Components - MC Control (Gennady Kuznetsov). The component hierarchy is Module, Step, Workflow, Job, Production. Levels of usage: 1. Module - programmer; 2. Step - Production Manager; 3. Workflow - user / Production Manager. The Module is the basic component of the architecture, and each step generates a job as a Python program. This structure allows the Production Manager to construct any algorithm as a combination of modules (see the sketch below).
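The following is a minimal sketch of how such a hierarchy might be composed; the class and method names are hypothetical and only illustrate modules being combined into steps and a workflow that emits a job as a Python script.

```python
# Hypothetical sketch of the Module / Step / Workflow hierarchy described
# above; names are illustrative, not the real DIRAC production-tools API.
class Module:
    def __init__(self, name, body):
        self.name = name
        self.body = body            # Python source fragment for this module

class Step:
    def __init__(self, name, modules):
        self.name = name
        self.modules = modules      # ordered list of Module instances

class Workflow:
    def __init__(self, name, steps, variables=None):
        self.name = name
        self.steps = steps
        self.variables = variables or {}

    def generate_job(self):
        """Concatenate the module code of every step into one job script."""
        lines = [f"# workflow: {self.name}", f"VARS = {self.variables!r}"]
        for step in self.steps:
            lines.append(f"# step: {step.name}")
            lines.extend(m.body for m in step.modules)
        return "\n".join(lines)

# Example: a simulation step followed by a reconstruction step.
sim = Step("Simulate", [Module("RunGauss", "print('run Gauss')")])
rec = Step("Reconstruct", [Module("RunBrunel", "print('run Brunel')")])
job_script = Workflow("MC-production", [sim, rec], {"events": 500}).generate_job()
```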

Slide 13: Module Editor (Gennady Kuznetsov). [Screenshot: the Python code of a single module (which may contain many classes), the module variables, a description and the module name; the module is stored as an XML file.]

Slide 14: Step Editor (Gennady Kuznetsov). [Screenshot: the step name, description, step variables, definitions and instances of modules, the selected instance and the variables of the currently selected instance; the step is stored as an XML file in which all modules are embedded.]

Slide 15: Workflow Editor (Gennady Kuznetsov). [Screenshot: the workflow name, description, workflow variables, step definitions, step instances, the selected step instance and its variables; the workflow is stored as an XML file.]

Slide 16: Job Splitting (Gennady Kuznetsov). The input value for job splitting is a Python list object. Each top-level element of this list is applied to the Workflow Definition, propagates through the code, and generates a single element of the production (one or several jobs), as sketched below. [Diagram: Python list → Workflow Definition → Step → Job → Production.]
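A small illustration of that splitting idea, with hypothetical function and field names: each top-level element of the list becomes one job derived from the same workflow definition.

```python
# Illustrative sketch of job splitting: each top-level element of a Python
# list is applied to a workflow definition and yields one production job.
def split_production(workflow_template, split_list):
    jobs = []
    for i, element in enumerate(split_list):
        job = dict(workflow_template)        # copy the workflow definition
        job["job_id"] = i + 1
        job["parameters"] = element          # e.g. event range, random seed
        jobs.append(job)
    return jobs

template = {"workflow": "MC-production", "application": "Gauss"}
# Three top-level elements -> three jobs, each with its own event range.
production = split_production(template, [{"first_event": 1,    "num_events": 500},
                                          {"first_event": 501,  "num_events": 500},
                                          {"first_event": 1001, "num_events": 500}])
```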

Slide 17: Future - Production Console. Once an agent has received a workflow, the Production Manager has no control over any function in a remote centre; the local manager must perform all configuration and interventions at each individual site. The plan is to develop a "Production Console" providing extensive control and monitoring functions for the Production Manager: monitoring and configuring remote agents, and controlling data replication. This is an intrusive system, so Grid security mechanisms must be addressed and a robust environment provided.

Slide 18: DIRAC v1.0 Architecture. [Architecture diagram, centred on the Production Manager.]

Slide 19: DIRAC v2.0 WMS Architecture. [Architecture diagram showing the Production Service; the system is based on a central queue service, and data can also be stored remotely.]

Slide 20: Data Management Status (Carmine Cioffi). File catalogue browser for POOL. Integration of the POOL persistency framework into GAUDI → new EventSelector interface. SUCCESS!

Slide 21: Main Panel, LFN Mode Browsing. The POOL file catalogue provides the LFN-PFN association; the browser allows the user to interact with the catalogue via a GUI and can save a list of LFNs for the job sandbox. [Screenshot: list of LFNs; tabs for LFN/PFN mode selection; list of PFNs associated with the LFN selected in the left sub-panel; buttons to read the next and previous bunch of files from the catalogue; write-mode selection; import of a catalogue fragment; catalogue reload; display of the metadata schema, with the possibility to change it; listing of all metadata values in the catalogue; list of selected files; search and filter text bars.]

Slide 22: Main Panel, PFN Mode Browsing. In PFN mode, files are browsed in the same way as in Windows Explorer: folders are shown in the left sub-panel and the contents of the selected folder in the right sub-panel. A sub-menu offers three operations on the selected file, and the write-mode button opens the WrFCBrowser frame, allowing the user to write to the catalogue.

Slide 23: Write Mode Panel. [Screenshot: operations to register a PFN, add a PFN replica, delete a PFN, add an LFN, remove an LFN, add a metadata value, rollback and commit, plus a view showing the actions performed.]
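For orientation, the following is a minimal sketch of the kind of catalogue session those panels drive (lookup, register, commit/rollback); the FileCatalogue class and its methods are hypothetical stand-ins, not the real POOL catalogue API.

```python
# Hypothetical file-catalogue session mirroring the browser operations above;
# staged writes are only applied on commit and discarded on rollback.
class FileCatalogue:
    def __init__(self):
        self._entries = {}                    # LFN -> list of PFNs
        self._pending = []                    # uncommitted registrations

    def lookup(self, lfn):
        return self._entries.get(lfn, [])

    def register(self, lfn, pfn):
        self._pending.append((lfn, pfn))      # staged until commit

    def commit(self):
        for lfn, pfn in self._pending:
            self._entries.setdefault(lfn, []).append(pfn)
        self._pending = []

    def rollback(self):
        self._pending = []

catalogue = FileCatalogue()
catalogue.register("LFN:/lhcb/dc04/sim/file001.sim", "srm://ral.example/file001.sim")
catalogue.commit()
print(catalogue.lookup("LFN:/lhcb/dc04/sim/file001.sim"))
```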

Slide 24: PFN Register Frame. [Screenshots: a frame to show and change the metadata schema of the catalogue, and a frame that allows setting of the metadata values.]

Slide 25. [Screenshots: a frame showing the metadata values of the PFN "Myfile", the list of files selected, and a frame showing the attribute values of the PFN.]

Slide 26: GAUDI/POOL Integration. Benefit from the investment in LCG: retire parts of Gaudi → reduced maintenance. A new interface for the LHCb EventSelector has been designed and implemented. Selection criteria may be:
- one or more "datasets" (e.g. a list of runs, or a list of files matching given criteria);
- one or more "EventTagCollections", with extra selection based on tag values;
- one or more physical files.
The result of an event selection is a virtual list of event pointers (illustrated below).
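A small illustration of that concept in plain Python (not Gaudi job options): the three kinds of criteria are resolved to events lazily, so the selection behaves like a virtual list of event pointers; all names here are hypothetical.

```python
# Illustrative model of the EventSelector idea: criteria (datasets, tag
# collections, physical files) resolve lazily to (file, event) pointers.
def resolve_criteria(datasets=(), tag_collections=(), files=()):
    """Yield (file, event_number) pairs without loading events up front."""
    for ds in datasets:                       # e.g. all files of a run list
        for f in ds["files"]:
            for evt in range(ds["events_per_file"]):
                yield (f, evt)
    for coll in tag_collections:              # pre-selected event pointers
        for f, evt in coll["pointers"]:
            if coll["tag_filter"](f, evt):    # extra selection on tag values
                yield (f, evt)
    for f in files:                           # explicit physical files
        yield (f, 0)

selection = resolve_criteria(
    datasets=[{"files": ["raw_0001.dat"], "events_per_file": 3}],
    files=["dst_0002.dat"],
)
for pointer in selection:
    print(pointer)
```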

Slide 27: Physicist's View of Event Data. [Diagram: collection sets (e.g. "B -> ππ Candidates (Phy)", "B -> J/Ψ(μ+μ-) Candidates") and event tag collections of tags group datasets of events; through the bookkeeping and Gaudi, datasets map onto physical files (e.g. RAW2 - 1/1/2008, RAW3 - 22/9/2007, RAW4 - 2/2/2008), each containing events 1..N.]

Slide 28: Future - Data to Metadata. The file catalogue holds only a minimal amount of metadata, so LHCb deploys a separate "bookkeeping" database service to store the metadata for datasets and event collections. It is based on a central ORACLE server at CERN with a query service through an XML-RPC interface. This is not scalable, particularly for the Grid, and a completely new metadata solution is required; an ARDA-based system will be investigated. It is vital that this development is optimised for LHCb and synchronised with the data challenges. This corresponds to the ARDA Job Provenance DB and Metadata Catalogue.
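As an illustration of the current access pattern, a client query against such an XML-RPC bookkeeping service might look like the sketch below; the endpoint URL, method name and query fields are hypothetical, not the actual LHCb bookkeeping interface.

```python
# Hypothetical query against an XML-RPC bookkeeping service of the kind
# described above; endpoint, method and field names are illustrative only.
import xmlrpc.client

bookkeeping = xmlrpc.client.ServerProxy("http://bookkeeping.example.org/xmlrpc")

# Ask for datasets from a given production and event type, then print the
# logical file names recorded for each matching dataset.
datasets = bookkeeping.query_datasets({"production": "DC04", "event_type": "min-bias"})
for ds in datasets:
    print(ds["name"], ds["lfns"])
```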

Slide 29: DIRAC Metadata - Data Production Information Flow. [Diagram: the Production Manager selects defaults and builds a new configuration; production jobs are described by Job.xml; information flows between the Configuration, Bookkeeping and File Catalogue services and the running data production until the production is done.]

Slide 30: Metadata - Data Analysis, User Job Information Flow. [Diagram: the user picks up the default configuration, modifies the defaults (Job.opts) and selects input data; the user job draws on the Configuration, Bookkeeping and File Catalogue services and runs through DIRAC.]

Slide 31: LHCb GANGA Status (Alexander Soroko, Karl Harrison; plus Alvin Tan and Janusz Martyniak). User Grid interface, shared between LHCb, ATLAS and BaBar. First prototype released in April; to be deployed for the LHCb 2004 Data Challenge. SUCCESS!

Slide 32: GANGA for LHCb. GANGA will allow the LHCb user to perform standard analysis tasks (a usage sketch follows below):
- data queries;
- configuration of jobs, defining the job splitting/merging strategy;
- submitting jobs to the chosen Grid resources;
- following the progress of jobs;
- retrieval of job output;
- job bookkeeping.
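To give a flavour of how those tasks might look from a command-line session, here is a hedged sketch; the Job class, its attributes and the split/submit calls are hypothetical stand-ins rather than the real GANGA prototype API.

```python
# Hypothetical GANGA-style session covering the tasks listed above (data
# selection, configuration, splitting, submission, monitoring); names are
# illustrative, not the real GANGA API.
class Job:
    def __init__(self, application, options, input_data, backend):
        self.application = application        # e.g. "DaVinci"
        self.options = options                # job-options file
        self.input_data = input_data          # LFNs returned by a data query
        self.backend = backend                # "LSF", "DIRAC", "LCG", ...
        self.status = "new"

    def split(self, files_per_job):
        chunks = [self.input_data[i:i + files_per_job]
                  for i in range(0, len(self.input_data), files_per_job)]
        return [Job(self.application, self.options, c, self.backend) for c in chunks]

    def submit(self):
        self.status = "submitted"             # would hand the job to the backend
        return self

jobs = Job("DaVinci", "analysis.opts",
           ["LFN:/lhcb/dc04/dst/0001.dst", "LFN:/lhcb/dc04/dst/0002.dst"],
           backend="LCG").split(files_per_job=1)
for j in jobs:
    j.submit()
print([j.status for j in jobs])               # follow the progress of the jobs
```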

Slide 33: GANGA User Interface. [Diagram: a Job Factory (job registry class) builds a Ganga job object from the Job Options Editor backed by a database of standard job options, strategy selection from a strategy database of splitting scripts, data selection (input/output files) and job requirements (LSF resources, etc.). From the local client the job is submitted to the Grid or batch system: the job script, JDL file and job-options file are sent to the gatekeeper, the job runs on worker nodes, monitoring information is retrieved, job output is returned, and files are transferred to and from a Storage Element.]

Slide 34: Software Bus. The user has access to the functionality of Ganga components through the GUI and CLI, layered one over the other above a Software Bus; the Software Bus is itself a Ganga component implemented in Python. Components used by Ganga fall into three categories (a bus sketch follows below):
- Ganga components of general applicability, or Core Components (to the right in the diagram): Job Definition, Job Registry, Job Handling, File Transfer;
- Ganga components providing specialised functionality (to the left in the diagram): BaBar Job Definition and Splitting, Gaudi/Athena Job Options Editor, Gaudi/Athena Job Definition;
- external components (at the bottom in the diagram): Python Native, Python Root, Gaudi Python, PyCMT, PyAMI, PyMagda.
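A minimal sketch of the software-bus idea, with hypothetical component names: components register themselves on a shared bus, and both the GUI and the CLI retrieve them from it.

```python
# Minimal sketch of a software bus: components register themselves and the
# GUI/CLI look them up by name. Component names here are hypothetical.
class SoftwareBus:
    def __init__(self):
        self._components = {}

    def register(self, name, component):
        self._components[name] = component

    def get(self, name):
        return self._components[name]

bus = SoftwareBus()
bus.register("JobRegistry", {"jobs": []})             # core component
bus.register("JobOptionsEditor", lambda opts: opts)   # specialised component

# Both the CLI and the GUI would talk to the same bus:
registry = bus.get("JobRegistry")
registry["jobs"].append("analysis-job-1")
print(registry["jobs"])
```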

Slide 35: GUIs Galore. [Screenshots of the various graphical user interfaces.]

Slide 36: DIRAC WMS Architecture. [Diagram of the DIRAC WMS architecture, with GANGA shown as a front-end.]

Slide 37: Future Plans. Refactorisation of Ganga, with submission on a remote client; the 2nd GANGA prototype is expected around April 2004. Motivation: ease integration of external components; facilitate multi-person, distributed development; increase customisability and flexibility; make it simpler for GANGA components to be used externally. [Diagram: a Job Factory (machinery for generating XML descriptions of multiple jobs) combines the Job-Options Editor (with job-options templates, a job-options knowledge base and a database of standard job options), dataset selection from a dataset catalogue, strategy selection from a strategy database of splitter algorithms, and job requirements (user requirements, derived requirements and a database of job requirements expressed as JDL, ClassAds, LSF resources, etc.) into a job collection (XML description). A dispatcher passes jobs to schedulers (proxy scheduler service, remote-client scheduler, Grid/batch-system schedulers for NorduGrid, Local, DIAL, DIRAC, LSF, PBS, EDG, USG and others); an agent runs and validates each job on the execution node, supported by software and component caches fed from a software/component server, with both local and remote clients.]

Slide 38: Future - GANGA. Develop GANGA into a generic front-end capable of submitting a range of applications to the Grid. This requires a central core and modular structure (started with the version 2 refactorisation) to allow new frameworks to be plugged in. Enable GANGA to be used in a complex analysis environment over many years and by many users: hierarchical structure, import/export facility, schema evolution, etc. Interact with multiple Grids (e.g. LCG, NorduGrid, EGEE, ...), keeping pace with the development of Grid services and synchronising with ARDA developments. Interactive analysis? ROOT, PROOF.