BaBar WEB job submission with Globus authentication and AFS access
T. Adye, R. Barlow, A. Forti, A. McNab, S. Salih, D. H. Smith, on behalf of the BaBar computing group


Introduction
BaBar computing is evolving towards a distributed model rather than a centralized one. The main goal is to allow all the physicists of the collaboration to have access to all the resources.
As an exercise to highlight what we need from more sophisticated middleware, we have tried to solve some of these problems with existing technology, in two ways.

Introduction
The first way, which resulted in the BaBarGrid demonstrator, is run through a WEB browser from the user's laptop or desktop and doesn't require supplementary software on that platform.
The second way is to use Globus as an extended batch-system command line on a system with AFS access; the aim is to simplify the input/output sandbox problem through a shared file system. The AFS tokens are maintained using gsiklog.

Components
Common to both:
- BaBar VO
- Generic accounts
- Globus authentication and authorization
- globus command line tools
- Data location according to user specifications, done with the BaBar metadata catalog
Different:
- WEB browser and http server
- AFS

BaBar VO (Virtual Organization)
Any BaBar grid user has, by definition, a Grid certificate from an accepted authority and an account on the central SLAC system with BaBar authorisation in the AFS ACL list.
Users can register for BaBarGrid use simply by copying their DN (Distinguished Name) into a file in their home area at SLAC.
A cron job then picks this up and sends it to the central BaBar VO machine after checking the AFS ACL lists.
With another cron job, all participating sites pick up the list of authorized BaBar users and insert it into their grid map files with the generic userid babar.
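For illustration, this is roughly what a site's grid map file looks like once the list has been pulled: every authorized DN maps to the same generic account. The DNs below are invented.

```bash
# Sketch of /etc/grid-security/grid-mapfile after the pull
# (DNs invented; all BaBar users map to the generic 'babar' account):
cat /etc/grid-security/grid-mapfile
# "/C=UK/O=eScience/OU=Manchester/CN=A Physicist"  babar
# "/C=US/O=DOE/OU=SLAC/CN=Another Physicist"       babar
```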

VO maintenance (2)
The local system manager retains the power to modify the cron job that pulls the grid map file.
With the generic userids there is no need to create accounts for each user at each site.
It is straightforward to ensure that these generic accounts have low levels of privilege, and that local users are given priority over ones from outside.
This system has proved easy to operate and reliable.
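A minimal sketch of the site-side pull, assuming a hypothetical update-gridmap.sh script; the schedule and path are illustrative only, and the local manager is free to edit or disable the entry.

```bash
# Hypothetical crontab entry at a participating site: refresh the
# grid map file from the central BaBar VO machine once per hour.
0 * * * * /opt/babar/sbin/update-gridmap.sh
```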

input sandbox (1)
For each job one requires (a hypothetical layout is sketched below):
- the binary and the data files
- a set of .tcl files:
  - a .tcl file specifying all the data files for this job
  - a small .tcl file that pulls in the others
  - a large .tcl file containing standard procedural stuff
- various other .dat files
- the calibration (conditions) database
- the setting of appropriate environment variables
- the presence of some dynamic (shared) libraries
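The file names below are invented; only the categories come from the list above.

```bash
# Hypothetical input sandbox for one job:
ls -R sandbox/
# sandbox/MyAnalysis       - the binary
# sandbox/run-job.tcl      - small .tcl that pulls in the others
# sandbox/framework.tcl    - large .tcl with the standard procedural stuff
# sandbox/data-files.tcl   - .tcl specifying all the data files for this job
# sandbox/constants.dat    - various other .dat files
# sandbox/conditions/      - calibration (conditions) database
# Environment variables and shared libraries are set up separately.
```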

input sandbox (2)
For BaBar this is a particular problem, because it is assumed that the job runs in a 'test release' directory in which all these files are made available through pointers to a parent release.
Alternatives for this problem are:
- Only run at sites where the desired parent release is available. Too restrictive.
- Provide these files and ship them (demonstrator).
- Run from within an AFS directory: use gsiklog to gain access to the test and parent releases, and cd to the test release as the very first step of each job (job submission within AFS).
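A rough sketch of the test-release idea, assuming the pointers are symbolic links into an AFS-resident parent release; the paths and the link name are invented, not the actual BaBar convention.

```bash
# Sketch only: a test release resolving files through a parent release.
cd /afs/slac.stanford.edu/u/ph/user/myanalysis        # test release
ln -s /afs/slac.stanford.edu/g/babar/releases/14.5.2 PARENT
# Binaries, .tcl files and shared libraries missing locally are then
# found via the PARENT link; a job cd's here as its first step.
```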

data location (1)
Data location is done through a metadata catalog.
Each site has a slightly modified replica of the central catalog in which the collections (root files or Objectivity collections) on local disk are flagged. Each catalog allows read access from outside.
Users make their own specification for the data and then provide an ordered list of sites. The system locates the matching data available at the first site by querying its catalog; it then asks the second site for the matching data that wasn't at the first one, and this is repeated through the site list.
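In pseudo-shell, the fall-through over the ordered site list might look as follows; query_catalog and subtract_matched are hypothetical helpers standing in for the remote read-only catalog query and the list arithmetic.

```bash
#!/bin/sh
# Sketch: walk the user's ordered site list, keeping only the data
# not yet located (helper commands are hypothetical).
cp user-selection.spec remaining.spec
for site in ral in2p3 slac; do           # user-supplied order
    query_catalog "$site" remaining.spec > "matched-$site.list"
    subtract_matched remaining.spec "matched-$site.list" > tmp.spec
    mv tmp.spec remaining.spec
    [ -s remaining.spec ] || break       # everything has been located
done
```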

data location (2)
The previous method has been improved by adding to the metadata catalog the list of sites and the indexes that uniquely identify, in the catalog, each collection on disk at each site.
This has improved the speed of the query, because the data selection is done only once, on a local database.
A user no longer has to give a list of sites, but can manually exclude sites if needed.
Jobs are split according to the sites holding the data and to user specifications such as the number of events to be processed in each job.
If some data exist at more than one site, this is reported in an index file that maps the tcl files to site names.
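The index file could be as simple as the sketch below; the format is invented for illustration, with replicated collections showing up as multiple site entries.

```bash
# Hypothetical index file: job tcl file -> site(s) holding the data.
cat jobs.index
# job-001.tcl  ral
# job-002.tcl  ral in2p3     <- data replicated at two sites
# job-003.tcl  slac
```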

demonstrator job submission
The user creates a grid proxy and then uploads it to the server. This provides a single entry authorisation point, as the server then uses this certificate to authenticate the globus job submission.
The server can then submit jobs to the remote sites on behalf of the user using globus-job-submit.
Job submission is done by a CGI Perl script running in a web server.
There is no resource matching other than on the data.
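The underlying command sequence is plain Globus; a sketch, with the host name and job manager invented:

```bash
# User side: create a short-lived proxy, then upload it to the
# demonstrator server through the web page.
grid-proxy-init

# Server side (inside the CGI Perl script; shown here as the
# equivalent shell command, host and jobmanager invented):
globus-job-submit farm.site.example/jobmanager-pbs \
    /scratch/babar/sandbox/run-job.sh
# globus-job-submit prints a job contact URL that the server keeps
# for later status queries and output retrieval.
```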

demonstrator job submission
Data selection in this case is done by querying all the sites.
Jobs are grouped according to the sites where they will be submitted, for convenience when collecting the output.
Each of these groups is called a superjob and is assigned a superjobid. The totality of the superjobs is called a hyperjob and is assigned a unique hyperjobid.
For each superjob there is a Job0, in which the input sandbox is copied to the remote site. The other jobs of the group just follow.
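A sketch of the Job0 staging step using GridFTP; hosts and paths are invented.

```bash
# Job0 (sketch): stage the superjob's input sandbox to the remote
# site before the other jobs of the group start.
globus-url-copy \
    file:///home/user/hyperjob-42/superjob-1/sandbox.tar \
    gsiftp://farm.site.example/scratch/babar/superjob-1/sandbox.tar
```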

demonstrator output sandbox
As each job finishes, it moves its output file to one directory /path/ and then tars together all the files there.
The user can then request that the outputs be collected on a machine local to the http server, which has spare disk space and can run grid-ftp. This machine copies all the superjob outputs into one directory /path/.
A link is provided and a specific MIME type is given. When the link is clicked, the hyperjob directory is downloaded to the desktop machine, where the application associated with this type has been arranged to unpack the directory and run a standard analysis job on it to draw the desired histograms.
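A sketch of the packing step and of the desktop-side MIME association; the directories, MIME type and helper application are invented for illustration.

```bash
# At the remote site, as each job of a superjob finishes (sketch;
# directories invented):
mv job-003.root /scratch/babar/superjob-1/
tar -C /scratch/babar -cf superjob-1.tar superjob-1/

# On the user's desktop, a mailcap entry ties the (invented) MIME
# type of the download link to an unpack-and-plot helper:
#   ~/.mailcap
#   application/x-babar-hyperjob; babar-unpack-and-plot %s
```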

[Diagram: demonstrator job submission. The user's WEB browser talks to the http server, which performs data location against the metadata catalog, transfers the input sandbox, and issues globus_job_submit to the remote site; an output collector handles output retrieval.]

job submission with afs
All BaBar software and the user's test release (working directory) are in AFS.
There is no need to provide and ship data or tcl files, because those can be accessed through links to the parent releases.
The user locates data with the previously described method, and the data tcl files are stored in the test release.
The user creates a proxy.
Jobs are submitted to different sites according to how the data have been split.
There is no need to categorize jobs into hyper, super and effective jobs for collecting the output.

job submission with afs
Job submission requires (a sketch of a job wrapper follows the list):
- gsiklog is copied and executed to gain access to the working directory and all the other software.
- Some environment variables, such as LD_LIBRARY_PATH, are redefined to override the remote batch node's setup. This happens also in the demonstrator.
- The output sandbox is simply written back to the working directory and doesn't require any special treatment.
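Putting the three steps together, a wrapper might look like this; the AFS paths are invented and the gsiklog invocation is not verified against the actual scripts.

```bash
#!/bin/sh
# Sketch of an AFS job wrapper (paths invented).
./gsiklog                      # AFS token from the GSI proxy

# Override the batch node's setup with the AFS-resident release
# (also done in the demonstrator).
export LD_LIBRARY_PATH=/afs/slac.stanford.edu/g/babar/lib:$LD_LIBRARY_PATH

# cd to the test release as the very first step; the output sandbox
# is simply written back into this working directory.
cd /afs/slac.stanford.edu/u/ph/user/myanalysis
./MyAnalysis run-job.tcl
```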

[Diagram: BaBar job submission with AFS. Users' desktops perform data location against the metadata catalog and issue globus_job_submit to the local site farm and to remote site farms; the farm nodes run gsiklog to obtain a token for the AFS cell, read the user area in the AFS cell, and write their output back there.]

Conclusions
The use of a shared file system such as AFS has resulted in a great simplification of the input/output sandbox, especially in a complicated case like user analysis.
There might be concern about performance, but the comparison here should be made between running on an overloaded local system and running on a non-overloaded shared system.
The experience with the demonstrator has resulted in a nice GUI, but it lacks flexibility, because the http server has to be set up on purpose and a three-step data transfer is required to bring the output back to the user's desktop.