DIRAC Infrastructure For Distributed Analysis
Stuart K. Paterson, CHEP 2006 (13th–17th February 2006), Mumbai, India

    from DIRAC.Client.Dirac import *

    dirac = Dirac()
    job = Job()
    job.setApplication('DaVinci', 'v12r11')
    job.setInputSandbox(['Application_DaVinci_v12r11/DV_Pi0Calibr.opts',
                         'Application_DaVinci_v12r11/lib'])
    job.setInputData(['/lhcb/production/DC04/v2/DST/ _ _9.dst',
                      '/lhcb/production/DC04/v2/DST/ _ _9.dst'])
    job.setOutputSandbox(['pi0calibr.hbook', 'pi0histos.hbook'])
    jobid = dirac.submit(job, verbose=1)
    print "Job ID = ", jobid

Overview
- Introduction to DIRAC
- DIRAC API
- Workload Management System
- WMS Strategies & Performance
- Outlook

Introduction to DIRAC
- The DIRAC Workload & Data Management System is made up of Central Services and Distributed Agents
- It realizes the PULL scheduling paradigm: agents request jobs whenever the corresponding resource is available (see the sketch below)
- The execution environment is checked before a job is delivered to the Worker Node (WN)
- A Service Oriented Architecture masks the underlying complexity
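The pull paradigm can be illustrated with a minimal sketch; the class and method names below (matcher, request_job, is_available) are illustrative stand-ins, not the actual DIRAC agent code. An agent running close to a resource first checks that the resource can accept work and only then asks the central service for a job.

    import time

    class PullAgent:
        """Toy pull-scheduling agent: work is requested by the resource, never pushed to it."""

        def __init__(self, matcher, resource, poll_interval=60):
            self.matcher = matcher            # stand-in for a central matcher service client
            self.resource = resource          # stand-in for a local resource probe
            self.poll_interval = poll_interval

        def run_once(self):
            # The execution environment is checked *before* any job is delivered.
            if not self.resource.is_available():
                return None
            job = self.matcher.request_job(self.resource.description())
            if job is not None:
                job.execute()                 # run the payload on the local resource
            return job

        def run_forever(self):
            while True:
                self.run_once()
                time.sleep(self.poll_interval)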

The DIRAC API

    from DIRAC.Client.Dirac import *

    dirac = Dirac()
    job = Job()
    job.setApplication('DaVinci', 'v12r15')
    job.setInputSandbox(['DaVinci.opts', 'lib'])
    job.setInputData(['/lhcb/production/DC04/v2/DST/ _ _10.dst'])
    job.setOutputSandbox(['DVNtuples.root', 'DaVinci_v12r15.log'])
    jobid = dirac.submit(job, verbose=1)
    print "Job ID = ", jobid

- The DIRAC API provides a transparent way for users to submit production or analysis jobs to LCG
- Jobs can be a single application or complicated DAGs
- While it may be exploited directly, the DIRAC API also serves as the interface for the GANGA Grid front-end to perform distributed user analysis for LHCb
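As a usage sketch built only from the API calls shown above, several analysis jobs can be submitted from one script and their identifiers kept for later bookkeeping; the option file and LFNs here are placeholders, not real datasets.

    from DIRAC.Client.Dirac import *

    dirac = Dirac()
    jobids = []

    # Placeholder LFNs: one job per input dataset.
    datasets = ['/lhcb/production/DC04/v2/DST/exampleA.dst',
                '/lhcb/production/DC04/v2/DST/exampleB.dst']

    for lfn in datasets:
        job = Job()
        job.setApplication('DaVinci', 'v12r15')
        job.setInputSandbox(['DaVinci.opts', 'lib'])
        job.setInputData([lfn])
        job.setOutputSandbox(['DVNtuples.root', 'DaVinci_v12r15.log'])
        jobids.append(dirac.submit(job, verbose=1))

    print "Submitted job IDs = ", jobids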

LHCb DIRAC Monitoring
(Two slides of monitoring screenshots; figures only.)

DIRAC Workload Management System
- The DIRAC API interfaces with the client securely (see talk by A. Casajus [112])
- The Data Optimizer queries the LCG File Catalogue (LFC) to determine suitable sites for job execution (see the sketch below)
- The Agent Director and Agent Monitor services handle submission to LCG
- Pilot Agents go through the LCG Resource Broker as normal jobs
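The site-selection step performed by the Data Optimizer can be sketched as follows; the catalogue client and its get_replica_sites method are assumptions standing in for a real LFC lookup. The idea is to keep only the sites that hold replicas of every requested input file.

    def candidate_sites(lfns, catalogue):
        """Return the sites holding replicas of *all* requested input files.

        `catalogue.get_replica_sites(lfn)` is a hypothetical call standing in
        for an LCG File Catalogue (LFC) query; it returns a set of site names.
        """
        sites = None
        for lfn in lfns:
            replicas = set(catalogue.get_replica_sites(lfn))
            sites = replicas if sites is None else sites & replicas
        return sites or set()

    # A job with an empty candidate-site set cannot be matched to a resource
    # and would be flagged by the optimizer rather than submitted.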

DIRAC Agent on WN
- Dynamically creates a Job Wrapper
- Handles Input/Output Sandboxes
- Transfers and registers Output Data as required, according to LHCb conventions
- Sends accounting and site-specific information back to the DIRAC WMS
- Runs a Watchdog process in parallel to the application (see the sketch below)
- Provides a 'heartbeat' for DIRAC Monitoring
- Provides access to any requested Input Data
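The Watchdog idea, a light thread running next to the application and reporting a periodic 'heartbeat', can be sketched like this; the monitoring client and its report_heartbeat call are assumptions for illustration, and app_process is taken to be a subprocess.Popen running the payload.

    import threading
    import time

    def run_with_watchdog(app_process, monitor, job_id, interval=300):
        """Send a heartbeat while the application process is still running."""

        def heartbeat():
            while app_process.poll() is None:         # payload still alive
                monitor.report_heartbeat(job_id)      # assumed monitoring-service call
                time.sleep(interval)

        watchdog = threading.Thread(target=heartbeat)
        watchdog.daemon = True                        # do not outlive the wrapper
        watchdog.start()
        return app_process.wait()                     # exit code of the payload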

Comparison of DIRAC Modes
- A study was performed to compare the possible WMS strategies for analysis jobs
- We know DIRAC can deal with long production jobs; what about the other extreme?
- Consider the shortest possible 'useful' job: analysis of 500 events
- These short jobs are of high priority since results must be back as soon as possible

DIRAC WMS Pilot Agent Strategies
- There are several ways to use the DIRAC infrastructure, but the end goal is to minimise the start time of user analysis jobs
- The following explores some of the possibilities

DIRAC modes of submission (sketched in code below):
- 'Resubmission': Pilot Agent submission to LCG with monitoring; multiple Pilot Agents may be sent in case of LCG failures
- 'Filling Mode': Pilot Agents may request several jobs from the same user, one after the other
- 'Multi-Threaded': the same as 'Filling Mode', except that two jobs can be run in parallel on the Worker Node
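A minimal sketch of the 'Filling' and 'Multi-Threaded' behaviour, under the assumption of a matcher client with a request_job call (not the real WMS interface): the pilot keeps asking for jobs belonging to the same user until none are left, and in multi-threaded mode keeps up to two of them running at once.

    from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

    def run_pilot(matcher, owner, slots=1):
        """slots=1 reproduces 'Filling' mode, slots=2 the 'Multi-Threaded' mode."""
        running = set()
        with ThreadPoolExecutor(max_workers=slots) as pool:
            while True:
                # Only ask for another job when a slot on the worker node is free.
                while len(running) >= slots:
                    _done, running = wait(running, return_when=FIRST_COMPLETED)
                job = matcher.request_job(owner)      # assumed WMS call
                if job is None:                       # no more waiting jobs for this user
                    break
                running.add(pool.submit(job.execute))
            wait(running)                             # let the last payloads finish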

Experiments Performed
- Short (5 minute) analysis jobs running at Tier-1 centres
- Jobs for the different modes were intermixed in order to ensure the same Grid 'weather'
- Submission of jobs was at the pace of the Resource Broker
- Job start time is defined as the time between submission and the application starting on the WN (see the helper below)
- For each experiment, 3 users submitted 100 jobs each, one user per mode described above
- These experiments were repeated several times to check that the results are reproducible
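The start-time metric is simply the gap between two timestamps; a small helper makes the definition explicit and summarises a set of jobs (the pairing of timestamps is illustrative, not the format used in the actual tests).

    from statistics import mean, median

    def start_time_minutes(submitted_at, started_on_wn_at):
        """Job start time = application start on the WN minus submission time."""
        return (started_on_wn_at - submitted_at).total_seconds() / 60.0

    def summarise(jobs):
        """`jobs` is a list of (submitted_at, started_on_wn_at) datetime pairs."""
        times = [start_time_minutes(s, t) for s, t in jobs]
        return {'mean_minutes': mean(times), 'median_minutes': median(times)}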

Start Times for 10 Experiments, 30 Users
(Two figure slides; the LCG limit on start time gave a minimum of about 9 minutes.)

Result of 10 Experiments from 30 Users
(Figure slide.)

Analysis of Results
- All 3000 jobs completed successfully within ~40 minutes of starting
- There is a clear improvement in start times for the Multi-Threaded and Filling modes
- Even when LCG performance is optimal, a significant reduction is seen
- A factor of 3 to 4 fewer LCG jobs need to be sent, so these modes also reduce the load on LCG
- Optimization of the workload is performed for each user independently (here each user submitted 100 jobs)
- Can this be further improved?

Same Number of Jobs with Fewer Users
- Two cases were compared: 1 user submitting 1000 jobs, and 10 users submitting 100 jobs each
- The single user shows a considerable improvement in average job start time compared to the 10 users

Analysis of Results
- There are clear gains in efficiency when the number of users is reduced
- Thus, workload optimization per user is effective, but less effective than optimization per Virtual Organisation could be

Added Value of DIRAC for Users
- DIRAC masks the inefficiencies of LCG; in these tests LCG performed well, but this is not always the case
- LHCb VO policy can be executed in one central place for the benefit of all users, e.g. troublesome LCG sites can be banned for all users at once
- The DIRAC WMS has demonstrated ~1 Hz throughput for analysis jobs with a 150 kB Input Sandbox and 10 Input Data files (see talk by U. Egede [317])
- The actual throughput is limited by the available LCG resources and the capacity of the Resource Broker

Conclusions
- The DIRAC API provides a simple yet powerful tool for users; access to LCG resources is provided in a simple and transparent way
- The DIRAC Multi-Threaded and Filling modes show significant reductions in job start times and also reduce the load on LCG
- Workload management at the level of the user is effective, and can be more powerful at the level of the VO
- The DIRAC infrastructure for distributed analysis is in place and now has real users

Backup Slide: Total Times for 30 Users
(Figure slide.)