Production Tools in ATLAS. RWL Jones, GridPP EB, 24th June 2003.


RWL Jones, Lancaster University
Grid in ATLAS
• ATLAS is a global collaboration, so the various Grid flavours are important
• Both US ATLAS and NorduGrid provide their own production tools
[Diagram labels: US-ATLAS, EDG Testbed Prod, NorduGrid]

• All the services are either taken from Globus, or written using Globus libraries and API
  – Should be fairly compatible with Globus-based solutions
• Information system knows everything
  – Substantially re-worked and patched Globus MDS
  – Distributed and multi-rooted
  – Allows for a mesh topology
• The server (“Grid manager”) on each gatekeeper does most of the job
  – No need for a centralized broker
  – Pre- and post-stages files
  – Interacts with PBS
  – Keeps track of job status
  – Cleans up the mess
  – Sends mails to users
• The client (“User Interface”) does the Grid job submission, monitoring, termination, retrieval, cleaning etc.
  – Interprets user’s job task
  – Gets the testbed status from the information system
  – Forwards the task to the best Grid Manager
  – Does some file uploading, if requested
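The client-side selection described above (get the testbed status from the information system, then forward the task to the best Grid Manager) might be sketched roughly as follows. This is only an illustration: the `ClusterStatus` fields, the ranking rule (most free CPUs, then shortest queue), the cluster names and the `ATLAS-RELEASE` runtime environment label are all assumptions, not the middleware's actual logic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ClusterStatus:
    """Hypothetical snapshot of one cluster, as read from the information system."""
    name: str
    free_cpus: int
    queued_jobs: int
    runtime_envs: frozenset


def pick_grid_manager(clusters: Iterable[ClusterStatus],
                      required_env: str) -> Optional[ClusterStatus]:
    """Return the cluster the client would forward the task to, or None."""
    # Keep only clusters that advertise the runtime environment the job needs.
    candidates = [c for c in clusters if required_env in c.runtime_envs]
    if not candidates:
        return None
    # Assumed ranking: most free CPUs first, then the shortest queue.
    return max(candidates, key=lambda c: (c.free_cpus, -c.queued_jobs))


# Made-up testbed status, for illustration only.
testbed = [
    ClusterStatus("cluster-a", 4, 10, frozenset({"ATLAS-RELEASE"})),
    ClusterStatus("cluster-b", 12, 2, frozenset({"ATLAS-RELEASE"})),
    ClusterStatus("cluster-c", 50, 0, frozenset()),  # no suitable runtime environment
]

best = pick_grid_manager(testbed, "ATLAS-RELEASE")
print(best.name if best else "no suitable cluster")
```

Because every client ranks clusters itself from the published status, no central broker is needed, which matches the design point made on the slide.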

Features and problems
• Features:
  – Relatively simple to join, expands rapidly
  – Installation is done on a single machine
  – Hides complexity of the distributed resources
  – Very convenient Replica Catalog implementation
  – Highly stable and reliable
  – Non-intrusive middleware
  – Accepts EDG certificates
  – Almost any runtime environment can be set up
• Problems:
  – Standard (à la Globus2) authentication and authorization mechanisms
  – Simplified (not more than in Globus2) data management system
  – No persistent book-keeping service
  – Simplified recovery mechanisms (as much as LRMS provides)
  – Lacks big storage facilities
  – Only command-line interface
  – No standardized procedure for runtime environment installation and validation

US GRAT Software
• GRid Applications Toolkit
• Used for U.S. Data Challenge production
• Based on Globus, Magda, AMI & MySQL
• Shell & Python scripts, modular design
• Rapid development platform
  – Essentially scripts
  – Quickly develop packages as needed by DC: single particle production; Higgs & SUSY production; pileup production & data management; reconstruction
  – Test grid middleware, test grid performance
• Modules can be easily enhanced or replaced by Condor-G, EDG resource broker, Chimera, replica catalogue, OGSA… (in progress)

GRAT Execution Model
1. Resource Discovery
2. Partition Selection
3. Job Creation
4. Pre-stage
5. Batch Submission
6. Job Parameterization
7. Simulation
8. Post-stage
9. Cataloging
10. Monitoring
[Diagram labels: DC1 Prod. (UTA), Remote Gatekeeper, Replica (local), MAGDA (BNL), Param (CERN), Batch Execution, scratch]
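The execution model can be read as an ordered pipeline. The sketch below is a minimal illustration of that idea, not GRAT code: the step names are taken from the slide, while `run_partition`, the handler signature and the stop-on-first-failure policy are assumptions. Running the steps strictly one after another is also a simplification, since monitoring in a real system would presumably run alongside the job.

```python
from typing import Callable, Dict, List

# The steps of the execution model, in the order given on the slide.
STEPS: List[str] = [
    "Resource Discovery", "Partition Selection", "Job Creation", "Pre-stage",
    "Batch Submission", "Job Parameterization", "Simulation", "Post-stage",
    "Cataloging", "Monitoring",
]

Handler = Callable[[dict], None]


def run_partition(handlers: Dict[str, Handler], context: dict) -> List[str]:
    """Run the steps in order and return those that completed.

    Assumed policy: stop at the first step whose handler raises, and record
    the failing step and error message in `context`.
    """
    completed: List[str] = []
    for step in STEPS:
        # Steps without a registered handler are treated as no-ops.
        handler = handlers.get(step, lambda ctx: None)
        try:
            handler(context)
        except Exception as exc:
            context["failed_step"] = step
            context["error"] = str(exc)
            break
        completed.append(step)
    return completed


def failing_simulation(ctx: dict) -> None:
    """Stand-in handler that simulates a crashed job."""
    raise RuntimeError("simulation crashed")


ctx: dict = {}
done = run_partition({"Simulation": failing_simulation}, ctx)
print(done, ctx["failed_step"])
```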

US Middleware Evolution
[Diagram of middleware components; status categories from the legend:]
• Used in current production software (GRAT & Grappa)
• Tested successfully (not yet used for large scale production)
• Under development and testing
• Tested for simulation (may be used for large scale reconstruction)

• What is the Atlas Commander?
  – graphical interactive tool to support the production manager
    • define jobs in large quantities
    • submit and monitor progress
    • scan log files for (un)known errors
    • update bookkeeping databases (AMI, Magda)
    • clean up in case of failures
  – test bed for GANGA MC production components
• AtCom has its own web site
  – atlas-project-atcom/
  – contains user guide, developer’s guide, documentation, downloads, relevant contacts, etc.

• Architecture: application + plug-ins
  [Diagram labels: AtCom core; AMIMgt, MagdaMgt; Bookkeeping DBs (Magda, AMI); Plug-ins: LSFComputingSystem, EDGComputingSystem, NGComputingSystem, PBSComputingSystem, ...; Clusters]
• Two main functions of AtCom
  – definition of jobs
  – job submission/monitoring

• Architecture (continued)
  – a plug-in implements the abstract ‘cluster’ interface for specific clusters, e.g. LSF
  – a plug-in is a Java class + configuration parameters e.g.
  – the AtCom configuration file defines all existing plug-ins and allows each to have its own configuration section
  – they are loaded at run-time
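The plug-in pattern described here (an abstract cluster interface, one class per batch system, and a configuration file with one section per plug-in, loaded at run time) can be illustrated with a short sketch. AtCom plug-ins are Java classes; the Python below only mirrors the pattern. The class names reuse labels from the architecture diagram, but the `submit` method, the `queue` option, the queue values and the config layout are assumptions made for illustration.

```python
import configparser
from abc import ABC, abstractmethod
from typing import Dict


class ComputingSystem(ABC):
    """Abstract 'cluster' interface; the method set here is hypothetical."""

    def __init__(self, options: Dict[str, str]):
        self.options = options

    @abstractmethod
    def submit(self, job_name: str) -> str:
        """Submit a job and return a system-specific job identifier."""


class LSFComputingSystem(ComputingSystem):
    def submit(self, job_name: str) -> str:
        # A real plug-in would call the batch system; this only formats an id.
        return f"lsf:{self.options.get('queue', 'default')}:{job_name}"


class PBSComputingSystem(ComputingSystem):
    def submit(self, job_name: str) -> str:
        return f"pbs:{self.options.get('queue', 'default')}:{job_name}"


# Registry of known plug-in classes, keyed by class name.
PLUGIN_CLASSES = {cls.__name__: cls for cls in (LSFComputingSystem, PBSComputingSystem)}

# Hypothetical configuration: one section per plug-in, with its own options.
CONFIG_TEXT = """
[LSFComputingSystem]
queue = long

[PBSComputingSystem]
queue = atlas
"""


def load_plugins(config_text: str) -> Dict[str, ComputingSystem]:
    """Instantiate, at run time, every plug-in named in the configuration."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        section: PLUGIN_CLASSES[section](dict(parser[section]))
        for section in parser.sections()
    }


plugins = load_plugins(CONFIG_TEXT)
print(plugins["LSFComputingSystem"].submit("partition-0001"))
```

The core only ever talks to the `ComputingSystem` interface, so supporting a new batch system means adding one class and one configuration section.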

• Available plug-ins
  – LSF: well understood and supported
  – NorduGrid: development suspended
  – PBS: developed by Alvin Tan
  – EDG: working, but no EDG-based clusters used in production
  – BQS: developed by Jerome Fulachier

• Bookkeeping databases
  – 5 logical database domains, two physical databases
  – Logical domains: physics meta-data, permanent production log, recipe catalog, transient production log, replica catalog
  – AMI (Atlas Meta-data Interface): MySQL DB hosted at Grenoble
  – Magda (Manager for grid-based data): MySQL DB hosted at BNL

• Monitoring
  – jobs you submit are automatically added to the list of monitored jobs
  – running jobs can be recovered from the part_run_info table if needed
    • e.g. after having closed AtCom
  – any other partition can be added to the list as well
    • using the SQL query composer
    • allows you to “see” also finished, defined jobs
    • for the bar charts of course

• When a job moves from RUNNING to DONE, post-processing commences
  – resolve the validation script's logical name into a physical name and apply it to stdout/stderr in temp locations
    • returns 1=OK, 2=Undecided or 3=Failed
  – if OK
    • register output files with the Magda replica catalog
    • resolve the extract script and apply it to stdout
    • copy/move logfiles to final destination
    • set status of partition to Validated
  – if Failed
    • delete output files
  – if Undecided
    • mark job as such
    • production manager can look at the output of the validation script or at the logfiles themselves and then force a decision as OK or Failed
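The branching on the validation script's return code can be summarised in a small sketch. The return codes (1=OK, 2=Undecided, 3=Failed), the listed actions and the `Validated` status come from the slide. The function names, the `Failed` and `Undecided` status strings, and the representation of actions as plain strings are assumptions made for illustration, not AtCom's actual code.

```python
from enum import IntEnum
from typing import List, Tuple


class Verdict(IntEnum):
    """Return codes of the validation script, as listed on the slide."""
    OK = 1
    UNDECIDED = 2
    FAILED = 3


def post_process(verdict: Verdict) -> Tuple[str, List[str]]:
    """Return (partition status, ordered actions) for a job that reached DONE."""
    if verdict is Verdict.OK:
        return "Validated", [
            "register output files with Magda replica catalog",
            "resolve extract script and apply it to stdout",
            "copy/move logfiles to final destination",
        ]
    if verdict is Verdict.FAILED:
        # The status name "Failed" is an assumption; the slide only names the action.
        return "Failed", ["delete output files"]
    # Undecided: no automatic action, wait for the production manager.
    return "Undecided", []


def force_decision(decision: Verdict) -> Tuple[str, List[str]]:
    """Production manager overrides an Undecided job as OK or Failed."""
    if decision is Verdict.UNDECIDED:
        raise ValueError("a forced decision must be OK or Failed")
    return post_process(decision)


status, actions = post_process(Verdict(2))
print(status, actions)
print(force_decision(Verdict.OK)[0])
```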

The Future
• GANGA is starting to provide the required functionality
• For DC2, a new tool is being built, and the GANGA core should be its basis
• DCs require immediate solutions
• Robust tools require slow development