RunJob in CMS Greg Graham Discussion Slides. RunJob in CMS RunJob is an Application Configuration and Job Creation Tool –RunJob uses metadata to abstract.

Slides:



Advertisements
Similar presentations
1 OBJECTIVES To generate a web-based system enables to assemble model configurations. to submit these configurations on different.
Advertisements

CMS Applications Towards Requirements for Data Processing and Analysis on the Open Science Grid Greg Graham FNAL CD/CMS for OSG Deployment 16-Dec-2004.
1 Generic logging layer for the distributed computing by Gene Van Buren Valeri Fine Jerome Lauret.
A. Arbree, P. Avery, D. Bourilkov, R. Cavanaugh, S. Katageri, G. Graham, J. Rodriguez, J. Voeckler, M. Wilde CMS & GriPhyN Conference in High Energy Physics,
Runjob: A HEP Workflow Planner for Production Processing Greg Graham CD/CMS Fermilab CD Project Status Meeting 13-Nov-2003.
ProActive Task Manager Component for SEGL Parameter Sweeping Natalia Currle-Linde and Wasseim Alzouabi High Performance Computing Center Stuttgart (HLRS),
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 13 Introduction to SQL Programming Techniques.
David Adams ATLAS DIAL Distributed Interactive Analysis of Large datasets David Adams BNL March 25, 2003 CHEP 2003 Data Analysis Environment and Visualization.
Usage of the Python Programming Language in the CMS Experiment Rick Wilkinson (Caltech), Benedikt Hegner (CERN) On behalf of CMS Offline & Computing 1.
Chapter 10: Architectural Design
Chapter 1 Overview of Databases and Transaction Processing.
The SAM-Grid Fabric Services Gabriele Garzoglio (for the SAM-Grid team) Computing Division Fermilab.
CONDOR DAGMan and Pegasus Selim Kalayci Florida International University 07/28/2009 Note: Slides are compiled from various TeraGrid Documentations.
KARMA with ProActive Parallel Suite 12/01/2009 Air France, Sophia Antipolis Solutions and Services for Accelerating your Applications.
LexEVS 101 Craig Stancl Rick Kiefer February, 2010.
Microsoft ® Official Course Module XA Using Windows PowerShell ®
MCRunjob: An HEP Workflow Planner for Grid Production Processing Greg Graham CD/CMS Fermilab Iain Bertram and Dave Evans Lancaster university CHEP 2003.
Capture and Replay Often used for regression test development –Tool used to capture interactions with the system under test. –Inputs must be captured;
The SAMGrid Data Handling System Outline:  What Is SAMGrid?  Use Cases for SAMGrid in Run II Experiments  Current Operational Load  Stress Testing.
Copyright © 2007, Oracle. All rights reserved. Managing Concurrent Requests.
Flexibility and user-friendliness of grid portals: the PROGRESS approach Michal Kosiedowski
MySQL and GRID Gabriele Carcassi STAR Collaboration 6 May Proposal.
Distribution After Release Tool Natalia Ratnikova.
Nick Brook Current status Future Collaboration Plans Future UK plans.
Claudio Grandi INFN Bologna CHEP'03 Conference, San Diego March 27th 2003 Plans for the integration of grid tools in the CMS computing environment Claudio.
Bookkeeping Tutorial. Bookkeeping & Monitoring Tutorial2 Bookkeeping content  Contains records of all “jobs” and all “files” that are created by production.
DIRAC Review (13 th December 2005)Stuart K. Paterson1 DIRAC Review Exposing DIRAC Functionality.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
NOVA Networked Object-based EnVironment for Analysis P. Nevski, A. Vaniachine, T. Wenaus NOVA is a project to develop distributed object oriented physics.
Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS: the CMS interface for job summission, monitoring and bookkeeping W. Bacchi, P.
CPT Demo May Build on SC03 Demo and extend it. Phase 1: Doing Root Analysis and add BOSS, Rendezvous, and Pool RLS catalog to analysis workflow.
LHCb Software Week November 2003 Gennady Kuznetsov Production Manager Tools (New Architecture)
CHEP /21/03 Detector Description Framework in LHCb Sébastien Ponce CERN.
The CMS Simulation Software Julia Yarba, Fermilab on behalf of CMS Collaboration 22 m long, 15 m in diameter Over a million geometrical volumes Many complex.
David Adams ATLAS DIAL: Distributed Interactive Analysis of Large datasets David Adams BNL August 5, 2002 BNL OMEGA talk.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Ricardo Rocha CERN (IT/GS) EGEE’08, September 2008, Istanbul, TURKEY Experiment.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
Bookkeeping Tutorial. 2 Bookkeeping content  Contains records of all “jobs” and all “files” that are produced by production jobs  Job:  In fact technically.
Transformation System report Luisa Arrabito 1, Federico Stagni 2 1) LUPM CNRS/IN2P3, France 2) CERN 5 th DIRAC User Workshop 27 th – 29 th May 2015, Ferrara.
Analysis Tools at D0 PPDG Analysis Grid Computing Project, CS 11 Caltech Meeting Lee Lueking Femilab Computing Division December 19, 2002.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
T EST T OOLS U NIT VI This unit contains the overview of the test tools. Also prerequisites for applying these tools, tools selection and implementation.
STAR Scheduler Gabriele Carcassi STAR Collaboration.
: Information Retrieval อาจารย์ ธีภากรณ์ นฤมาณนลิณี
DZero Monte Carlo Production Ideas for CMS Greg Graham Fermilab CD/CMS 1/16/01 CMS Production Meeting.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
David Adams ATLAS ATLAS Distributed Analysis and proposal for ATLAS-LHCb system David Adams BNL March 22, 2004 ATLAS-LHCb-GANGA Meeting.
CMS Production Management Software Julia Andreeva CERN CHEP conference 2004.
Ganga/Dirac Data Management meeting October 2003 Gennady Kuznetsov Production Manager Tools and Ganga (New Architecture)
Simulation Production System Science Advisory Committee Meeting UW-Madison March 1 st -2 nd 2007 Juan Carlos Díaz Vélez.
1 DIRAC Data Management Components A.Tsaregorodtsev, CPPM, Marseille DIRAC review panel meeting, 15 November 2005, CERN.
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe.
Chapter 1 Overview of Databases and Transaction Processing.
DZero Monte Carlo Status, Performance, and Future Plans Greg Graham U. Maryland - Dzero 10/16/2000 ACAT 2000.
Shahkar/MCRunjob: An HEP Workflow Planner for Grid Production Processing Greg Graham CD/CMS Fermilab GruPhyN 15 October 2003.
 1- Definition  2- Helpdesk  3- Asset management  4- Analytics  5- Tools.
Introduction to DBMS Purpose of Database Systems View of Data
Simulation Production System
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
Vandy Berten Luc Goossens Alvin Tan
Chapter 2 Database Environment Pearson Education © 2009.
Chapter 2 Database Environment.
Data, Databases, and DBMSs
Design Model Like a Pyramid Component Level Design i n t e r f a c d s
 YongPyong-High Jan We appreciate that you give an opportunity to have this talk. Our Belle II computing group would like to report on.
Introduction to DBMS Purpose of Database Systems View of Data
Chapter 10 ADO.
Chapter 2 Database Environment Pearson Education © 2009.
Presentation transcript:

RunJob in CMS Greg Graham Discussion Slides

RunJob in CMS RunJob is an Application Configuration and Job Creation Tool –RunJob uses metadata to abstract application steps and bolt them together into a workflow (Configurators) –RunJob uses metadata to abstract service invocations and iterators to parallelize job creation (other Configurators) –RunJob has an abstract model of jobs. These can be translated into concrete jobs for a variety of environments (local or grid) later depending on what modules are plugged in

RunJob in CMS RunJob does job creation in scripts (parallelized or not) –RunJob scripts support external service invocations, looping over data things, inline job submission, metadata constraint modeling among applications and services, persistence, and conventional provenance. –Scripts can be factored into simpler scripts and contexts. Contexts contain site level details, administrative details, logical details about constraints on the simpler scripts. Saving the context scripts can enable “meta-provenance” in the sense that recording constraints provides clues as to why certain parameters are set the way they are instead of just their value.

RunJob in CMS Specific Uses –Official Monte Carlo Generation Over 30 sites, over 75 million GEANT events, controlled centrally by central RefDB (pull model) –Data Challenge 2004 Used at CERN to govern online reconstruction of simulated data at 25 Hz. Used at FNAL to create physics plots from reconstructed data transferred there.

RunJob in CMS Extensions to RunJob –XMLP: An XML persistency mechanism –shREEK: Runtime extensions to RunJob, monitoring, and late job building. –logger: An event logger –scriptObjects: Abstract descriptions of jobs that serve as input to code generators producing actual jobs for specific environments –CAST: A GUI for construction Configurators and workflows.

Databases & Catalogs Local/Grid Resources Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts Configurator: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments Applications Execution Managers RunJob ScriptsRunJob Contexts UserAdmin ScriptObjects

Relationship to Physics Physics Parameters Specification –By physics group: in a context file written by a special user designated by the physics group Members working within a physics group will inherit these parameters by using a common context Constrains multiple applications according to the same set of constants –By knowledgeable user: in a personal MCRunjob script file or in a personal context file. –By novice user: on the command line of a pre-existing MCRunjob script written by a knowledgeable user. In this case, the novice is restricted to specifying parameters foreseen by the author of the MCRunjob script in question.

Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts RunJob ScriptsRunJob Contexts User Admin -User’s Job Description eg- attach Pythia attach BatchManager -Physics Parameters of User Interest eg- TopMass=275 GeV/c2 -Specification of Which Context To Use -Physics Parameters set by Physics Group eg- HiggsMass=115 GeV/c2 TopMass=273 GeV/c2 Decay Parameters from a Control DB -Physical Description of Site Resources eg- BatchManager=FBS -Configuration of Site Resources eg- FBSQueueName=CMS; Submit Jobs Inline -Combine all specified contexts -Fully Specified Workflow Description eg- attach ControlDB attach Pythia HiggsMass=115 GeV/c2 TopMass=275 GeV/c2 Decay Parameters from a Control DB attach FBS FBS QueueName=CMS; Submit Jobs Inline -Read User Script-Read Context(s) specified by UserResult Text Results from Configurators

Control DB Linker: A container for Configurators that enables constraint propagation, workflow modeling, provenance, scripting, and context resolution of scripts Configurators: An Object Oriented, Metadata based description of services or applications with APIs for extending behaviors to many environments PythiaFBS -Fully Specified Workflow Description eg- attach ControlDB attach Pythia HiggsMass=115 GeV/c2 TopMass=275 GeV/c2 Decay Parameters from a Control DB attach FBS FBS QueueName=CMS; Submit Job MySQL attach ControlDB Get decay parameters return decay parameters: {x} attach Pythia HiggsMass=115 TopMass=275 DecayParameters {x} return Pythia scriptObject attach FBS FBSQueueName=CMS Submit Job Now FBS Submit Results FBS Submit Results Queue CMS

Relationship to Physics Provenance –By specification of fully reduced workflow specification (all contexts resolved, job level) –By specification of contexts actually used in modifying a workflow (why does a constant have value x?) –By specification of