Tech talk, 20th June 2003, Andrey Shevel
Grid architecture at PHENIX: job monitoring and related stuff in a multi-cluster environment


Plan
- General PHENIX grid scheme
- Available Grid components
- Concepts and scenario for a multi-cluster environment
- Job submission and job monitoring
- Live demonstration

General scheme: jobs are planned to go where the data are and to less loaded clusters
[Diagram with labels: SUNY (RAM), RCF, Main Data Repository, Partial Data Replica, File Catalog]

Base subsystems for PHENIX Grid
- GridFTP (globus-url-copy)
- Globus job-manager/fork
- Package GSUNY
- Cataloging engine
- BOSS
- BODE
- User jobs
- GT (latest)

Concepts
- Input/output sandbox(es)
- Master job (script): submitted by the user
- Major data sets (physics or simulated data)
- Minor data sets (parameters, scripts, etc.)
- Satellite job (script): submitted by the master job

The job submission scenario at a remote Grid cluster
- Determine a qualified computing cluster: available disk space, installed software, etc.
- Copy/replicate the major data sets to the remote cluster.
- Copy the minor data sets (scripts, parameters, etc.) to the remote cluster.
- Start the master job (script), which will submit many jobs through the default batch system.
- Watch the jobs with the monitoring system (BOSS/BODE).
- Copy the result data from the remote cluster to the target destination (desktop or RCF).

Master job script
- The master script is submitted from your desktop and executed on the Globus gateway (possibly under a group account), under the control of the monitoring tool (BOSS is assumed).
- The master script is expected to find the following information in environment variables:
  - CLUSTER_NAME: name of the cluster;
  - BATCH_SYSTEM: name of the batch system;
  - BATCH_SUBMIT: command for job submission through BATCH_SYSTEM.
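A minimal sketch of such a master script, assuming only the three environment variables named above; the satellite-script name and the number of submitted jobs are hypothetical:

    #!/bin/sh
    # Minimal master-script sketch: CLUSTER_NAME, BATCH_SYSTEM and BATCH_SUBMIT
    # are expected in the environment, as described above.
    echo "Master job on cluster $CLUSTER_NAME (batch system: $BATCH_SYSTEM)"
    for n in 1 2 3 4 5; do
        $BATCH_SUBMIT ~/satellite-job.sh   # submit one satellite job (hypothetical script)
    done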

Job submission scenario
[Diagram: the MASTER job is submitted from the local desktop to the Globus gateway through globus-jobmanager/fork; the MASTER job executes on the Globus gateway and submits jobs to the remote cluster with the command $BATCH_SUBMIT]

Transfer of the major data sets
There are a number of methods to transfer the major data sets:
- the utility bbftp (without GSI) can be used to transfer data between clusters;
- the utility gcopy (with GSI) can be used to copy data from one cluster to another;
- any third-party data transfer facility (e.g. HRM/SRM).
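Since GridFTP (globus-url-copy) is one of the base subsystems, a transfer of a major data set could look like the sketch below; the gateway names are taken from the status slide at the end of the talk, the file paths are hypothetical, and a valid GSI proxy is assumed:

    # Third-party copy of one data file between two gateways over GridFTP.
    globus-url-copy \
        gsiftp://rserver1.i2net.sunysb.edu/data/run123/events.root \
        gsiftp://loslobos.alliance.unm.edu/users/shevel/data/events.root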

Copying the minor data sets
There are at least two alternative methods to copy the minor data sets (scripts, parameters, constants, etc.):
- copy the data to /afs/rhic.bnl.gov/phenix/users/user_account/…
- copy the data with the utility CopyMinorData (part of the package gsuny).
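As an illustration, the CopyMinorData invocation from the live-demo slide later in this talk copies the minor data sets from the local account to the UNM cluster; the cluster alias 'unm' is presumably resolved through the GPARAM configuration:

    # Copy scripts/parameters from the local account to the UNM gateway
    # (invocation taken verbatim from the demo slide below).
    CopyMinorData local:andrey.shevel unm: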

Package gsuny: list of scripts
General commands (ftp://ram3.chem.sunysb.edu/pub/suny-gt-2/gsuny.tar.gz):
- GPARAM: configuration description for the set of remote clusters;
- gsub: submit a job to the less loaded cluster;
- gsub-data: submit a job where the data are;
- gstat: get the status of a job;
- gget: get the standard output;
- ghisj: show the job history (which job was submitted, when, and where);
- gping: test the availability of the Globus gateways.

Package gsuny: list of scripts (continued)
- GlobusUserAccountCheck: check the Globus configuration for the local user account;
- gdemo: see the load of the remote clusters;
- gcopy: copy data from one cluster (or the local host) to another;
- CopyMinorData: copy minor data sets from cluster (or the local host) to cluster.
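A possible working session with these commands is sketched below; the master-script name TbossSuny comes from the demo slide later in this talk, while the exact argument forms of gstat and gget are assumptions:

    gping            # test which Globus gateways respond
    gdemo            # look at the current load of the remote clusters
    gsub TbossSuny   # submit the master script to the less loaded cluster
    gstat            # get the status of the job (argument form assumed)
    gget             # fetch the standard output (argument form assumed)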

Job monitoring
After an initial description of the required monitoring tool had been developed ( eeting4Aug2003/jobsub.pdf ), the following packages were found:
- Batch Object Submission System (BOSS) by Claudio Grandi;
- the web interface BOSS DATABASE EXPLORER (BODE) by Alexei Filine.

Basic BOSS components
- boss executable: the BOSS interface to the user;
- MySQL database: where BOSS stores job information;
- jobExecutor executable: the BOSS wrapper around the user job;
- dbUpdator executable: the process that writes to the database while the job is running;
- interface to the local scheduler.

Basic job flow
[Diagram: the user runs 'gsub master-script'; on cluster N the master job enters through the Globus gateway and issues 'boss submit', 'boss query' and 'boss kill'; BOSS records jobs in the BOSS DB (browsable through the BODE web interface) and passes them to the local scheduler, whose wrapper runs them on execution nodes n and m in Globus space]
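In terms of the commands in this diagram, a job's life cycle might look like the sketch below; the submit syntax follows the demo slide, while the argument forms of boss query and boss kill are assumptions:

    # Register and submit the user job through BOSS:
    boss submit -jobtype ram3master -executable ~/TestRemoteJobs.pl \
        -stdout ~/master.out -stderr ~/master.err
    boss query    # inspect the job status stored in the BOSS DB (arguments assumed)
    boss kill     # remove a running job (job identifier argument assumed)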

gsub TbossSuny   # submit to the less loaded cluster

shevel]$ cat TbossSuny
. /etc/profile
. ~/.bashrc
echo " This is master JOB"
printenv
boss submit -jobtype ram3master -executable ~/andrey.shevel/TestRemoteJobs.pl -stdout \
~/andrey.shevel/master.out -stderr ~/andrey.shevel/master.err

shevel]$ CopyMinorData local:andrey.shevel unm:
YOU are copying THE minor DATA sets
            --FROM--                     --TO--
Gateway   = 'localhost'                  'loslobos.alliance.unm.edu'
Directory = '/home/shevel/andrey.shevel' '/users/shevel/.'
Transfer of the file '/tmp/andrey.shevel.tgz5558' was succeeded

Status of the PHENIX Grid
- Live info is available on the page.
- The group account 'phenix' is now available at:
  - SUNYSB (rserver1.i2net.sunysb.edu);
  - UNM (loslobos.alliance.unm.edu);
  - IN2P3 (in progress now).

Andrey Meeting 13 October 2003 OrganizationGrid gatewayContact person Status BNL PHENIX (RCF) phenixgrid01.rcf.bnl.gov GT 2.2.4; LSF Dantong Yutested SUNYSB (RAM) rserver1.i2net.sunysb.edu GT 2.2.3; PBS Andrey Shevel tested New Mexico loslobos.alliance.unm.edu GT 2.2.4; PBS Tim Thomas No PHENIX software. IN2P3 (France) ccgridli03.in2p3.fr GT 2.2.3; BQS Albert Romana tested Vanderbilt Grid gateway is not yet available for testing Indrani Ojha Not tested

Live demo of BOSS job monitoring
User: guest
Pass: Guest101