
1 BaBar WEB job submission with Globus authentication and AFS access
T. Adye, R. Barlow, A. Forti, A. McNab, S. Salih, D. H. Smith, on behalf of the BaBar computing group

2 Introduction
- BaBar computing is evolving towards a distributed model rather than a centralized one. The main goal is to give all physicists in the collaboration access to all the resources.
- As an exercise to highlight what we need from more sophisticated middleware, we have tried to solve some of these problems with existing technology in two ways.

3 Introduction
- The first way, which resulted in the BaBarGrid demonstrator, runs through a web browser on the user's laptop or desktop and requires no supplementary software on that platform.
- The second way uses Globus as an extended batch-system command line on a system with AFS access; the aim is to simplify the input/output sandbox problem through a shared file system. The AFS tokens are maintained using gsiklog.

4 Components
Common to both:
- BaBar VO
- Generic accounts
- Globus authentication and authorization
- Globus command line tools
- Data location according to user specifications, done with the BaBar metadata catalog
Different:
- Web browser and HTTP server
- AFS

5 BaBar VO (Virtual Organization)
- Any BaBar grid user has, by definition, a Grid certificate from an accepted authority and an account on the central SLAC system with BaBar authorisation in the AFS ACL.
- Users can register for BaBarGrid use just by copying their DN (Distinguished Name) into a file in their home area at SLAC.
- A cron job then picks this up and, after checking the AFS ACLs, sends it to the central BaBar VO machine.
- With another cron job, all participating sites pick up the list of authorized BaBar users and insert it into their gridmap files with the generic userid .babar.
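
The last step above can be sketched as follows. This is a minimal illustration, not the collaboration's actual cron job: it assumes the usual grid-mapfile layout of a quoted DN followed by a local account name, assumes the generic account is written `.babar`, and uses made-up DNs. The function name `gridmap_lines` is hypothetical.

```python
def gridmap_lines(dns, account=".babar"):
    """Return grid-mapfile lines mapping each DN to one generic account.

    The account name and the DNs used below are illustrative assumptions.
    """
    unique = sorted({dn.strip() for dn in dns if dn.strip()})
    return [f'"{dn}" {account}' for dn in unique]


# Hypothetical list of authorized users pulled from the VO machine.
authorized = [
    "/C=UK/O=ExampleCA/OU=Manchester/CN=Jane Physicist",
    "/C=US/O=ExampleCA/OU=SLAC/CN=John Physicist",
    "/C=UK/O=ExampleCA/OU=Manchester/CN=Jane Physicist",  # duplicate entry
]

entries = gridmap_lines(authorized)
print("\n".join(entries))
```

Because every DN maps to the same generic account, a site never has to create a per-user account; it only has to refresh these lines.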

6 VO maintenance (2)
- The local system manager retains the power to modify the cron job that pulls the gridmap file.
- With the generic userids there is no need to create accounts for each user at each site.
- It is straightforward to ensure that these generic accounts have low levels of privilege and that local users are given priority over outside users.
- This system has proved easy to operate and reliable.

7 input sandbox (1)
For each job one requires:
- The binary, the data files
- A set of .tcl files:
  - a .tcl file specifying all the data files for this job
  - a small .tcl file that pulls in the others
  - a large .tcl file containing standard procedural stuff
- Various other .dat files
- The calibration (conditions) database
- The setting of appropriate environment variables
- The presence of some dynamic (shared) libraries

8 input sandbox (2)
- For BaBar this is a particular problem because the job is assumed to run in a 'test release directory' in which all these files are made available through pointers to a parent release.
- Alternatives for this problem are:
  - Run only at sites where the desired parent release is available. Too restrictive.
  - Provide these files and ship them (demonstrator).
  - Run from within an AFS directory. Use gsiklog to gain access to the test and parent releases and cd to the test release as the very first step of each job (job submission within AFS).

9 data location (1)
- Data location is done through a metadata catalog.
- Each site has a slightly modified replica of the central catalog in which collections (ROOT files or Objectivity collections) on local disk are flagged. Each catalog allows read access from outside.
- Users can make their own specification for the data.
- They then provide an ordered list of sites. The system locates the matching data available at the first site by querying its catalog. It then queries the second site for the matching data that was not at the first one, and this is repeated through the site list.
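
The ordered-site lookup described above can be sketched as below. This is only an illustration under stated assumptions: each site's catalog is modelled as a plain set of collection names rather than a database query, and the names `locate_by_site_order`, `siteA`, `coll1` and so on are hypothetical.

```python
def locate_by_site_order(wanted, site_order, catalogs):
    """Assign each wanted collection to the first site in the list that has it."""
    remaining = set(wanted)
    assignment = {}
    for site in site_order:
        # Only ask this site for what earlier sites could not provide.
        found = remaining & catalogs.get(site, set())
        if found:
            assignment[site] = sorted(found)
            remaining -= found
    return assignment, sorted(remaining)


# Stand-in for the per-site catalog replicas (collections flagged as on local disk).
catalogs = {
    "siteA": {"coll1", "coll2"},
    "siteB": {"coll2", "coll3"},
}

assignment, missing = locate_by_site_order(
    ["coll1", "coll2", "coll3", "coll4"], ["siteA", "siteB"], catalogs
)
print(assignment)
print(missing)
```

A collection held at several sites goes to whichever of them comes first in the user's list, so the order of the list acts as a site preference.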

10 data location (2)
- The previous method has been improved by adding to the metadata catalog the list of sites and the list of indexes that uniquely identify, in the catalog, each collection on disk at each site.
- This has improved the speed of the query because the data selection is done only once, on a local database.
- A user no longer has to give a list of sites, but can manually exclude sites if needed.
- Jobs are split according to the sites holding the data and to user specifications such as the number of events to be processed in each job.
- If some data exist at more than one site, this is reported in an index file that maps the tcl files to site names.
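
One way to picture the job splitting is the greedy sketch below. It is an assumption-laden illustration rather than the BaBar tool: it treats a collection as indivisible, takes the per-collection event counts as given, and starts a new job whenever adding a collection would exceed the requested number of events. All names and numbers are hypothetical.

```python
def split_jobs(collections_by_site, events, max_events_per_job):
    """Group each site's collections into jobs of at most max_events_per_job events.

    A single collection larger than the limit still gets a job of its own.
    """
    jobs = []
    for site in sorted(collections_by_site):
        current, total = [], 0
        for coll in collections_by_site[site]:
            n = events[coll]
            if current and total + n > max_events_per_job:
                jobs.append({"site": site, "collections": current, "events": total})
                current, total = [], 0
            current.append(coll)
            total += n
        if current:
            jobs.append({"site": site, "collections": current, "events": total})
    return jobs


by_site = {"siteA": ["coll1", "coll2", "coll3"], "siteB": ["coll4"]}
events = {"coll1": 400, "coll2": 500, "coll3": 300, "coll4": 900}

jobs = split_jobs(by_site, events, max_events_per_job=1000)
for job in jobs:
    print(job)
```

Each resulting job would correspond to one data tcl file, and the index file would record which site each of those files belongs to.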

11 demonstrator job submission
- The user creates a grid proxy and then uploads it to the server.
- This provides a single-entry authorisation point, as the browser then uses this certificate to authenticate the Globus job submission.
- The server can then submit jobs to the remote sites on behalf of the user using globus-job-submit.
- Job submission is done by a CGI Perl script running in a web server.
- The only resource matching performed is on the data.

12 demonstrator job submission
- Data selection in this case is done by querying all the sites.
- Jobs are grouped according to the sites to which they will be submitted, for convenience when collecting the output.
- Each of these groups is called a superjob and is assigned a superjobid.
- The totality of superjobs is called a hyperjob and is assigned a unique id, the hyperjobid.
- For each superjob there is a Job0 in which the input sandbox is copied to the remote site. The other jobs in the group just follow.
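
The hyperjob and superjob bookkeeping can be sketched as a small data structure. This is an illustration only: the id format, the field names and the choice to mark the first job of each group as the Job0 that copies the input sandbox are assumptions, not details taken from the demonstrator.

```python
def build_hyperjob(hyperjob_id, jobs):
    """Group jobs by target site into superjobs under one hyperjob."""
    by_site = {}
    for job in jobs:
        by_site.setdefault(job["site"], []).append(job)

    superjobs = []
    for index, site in enumerate(sorted(by_site)):
        members = [
            {
                "name": f"Job{n}",
                "tcl": job["tcl"],
                # Assumption: only Job0 copies the input sandbox to the site.
                "ships_sandbox": n == 0,
            }
            for n, job in enumerate(by_site[site])
        ]
        superjobs.append(
            {"superjobid": f"{hyperjob_id}.{index}", "site": site, "jobs": members}
        )
    return {"hyperjobid": hyperjob_id, "superjobs": superjobs}


jobs = [
    {"site": "siteB", "tcl": "data-3.tcl"},
    {"site": "siteA", "tcl": "data-1.tcl"},
    {"site": "siteA", "tcl": "data-2.tcl"},
]

hyperjob = build_hyperjob("hyper-001", jobs)
for sj in hyperjob["superjobs"]:
    print(sj["superjobid"], sj["site"], [j["name"] for j in sj["jobs"]])
```

Grouping by site means the sandbox is shipped once per site rather than once per job, and all outputs of one superjob can be collected from a single place.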

13 demonstrator output sandbox
- As each job finishes, it moves its output file to one directory /path/ and then tars together all the files there.
- The user can then request that the outputs be collected on a machine local to the HTTP server which has spare disk space and can run grid-ftp. This machine copies all the superjobs' outputs into one directory /path/.
- A link is provided and a specific MIME type given. When the link is clicked, the hyperjob directory is downloaded to the desktop machine, where the application associated with this type has been arranged to unpack the directory and run a standard analysis job on it to draw the desired histograms.
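
The "tar together all the files there" step can be illustrated with Python's standard `tarfile` module. This is a sketch under assumptions: the demonstrator's own scripts are not shown in the slides, and the directory layout and file names below are invented for the example.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_outputs(output_dir, archive_path):
    """Tar every regular file found directly in output_dir."""
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(Path(output_dir).iterdir()):
            if path.is_file():
                tar.add(path, arcname=path.name)


def list_archive(archive_path):
    """Return the member names stored in the archive."""
    with tarfile.open(archive_path, "r") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    outputs = Path(tmp, "outputs")
    outputs.mkdir()
    # Hypothetical per-job output files.
    (outputs / "job0.out").write_text("output of job 0")
    (outputs / "job1.out").write_text("output of job 1")

    # The archive is kept outside the output directory so it never packs itself.
    archive = Path(tmp, "superjob.tar")
    pack_outputs(outputs, archive)
    names = list_archive(archive)

print(names)
```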

14 [Diagram: demonstrator architecture. Components: user web browser, demonstrator HTTP server, metadata catalog, remote site, output collector. Labelled flows: data location, input sandbox transfer, globus_job_submit, output retrieval.]

15 job submission with afs
- All BaBar software and the user test release (working directory) are in AFS.
- There is no need to provide and ship data or tcl files because those can be accessed through links to the parent releases.
- The user locates data with the previously described method, and the data tcl files are stored in the test release.
- The user creates a proxy.
- Jobs are submitted to different sites according to how the data have been split.
- There is no need to categorize jobs into hyper, super and effective jobs for collecting the output.

16 job submission with afs
Job submission requires:
- gsiklog is copied and executed to gain access to the working directory and all the other software.
- Some environment variables, such as LD_LIBRARY_PATH, are redefined to override the remote batch nodes' setup. This also happens in the demonstrator.
- The output sandbox is simply written back to the working directory and requires no special treatment.
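
The environment override can be sketched as below. This is a minimal illustration, assuming the job wrapper wants the release's library directories searched before whatever the batch node already had configured; the slides do not say whether the variable is replaced outright or prepended to. The AFS path and the function name `job_environment` are hypothetical.

```python
def job_environment(base_env, release_lib_dirs):
    """Return a copy of base_env with release_lib_dirs placed first in LD_LIBRARY_PATH."""
    env = dict(base_env)  # never modify the caller's environment in place
    inherited = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    env["LD_LIBRARY_PATH"] = ":".join(list(release_lib_dirs) + inherited)
    return env


# Stand-in for the environment found on a remote batch node.
node_env = {"LD_LIBRARY_PATH": "/usr/local/lib", "HOME": "/home/babar"}

env = job_environment(node_env, ["/afs/example.org/babar/release/lib"])
print(env["LD_LIBRARY_PATH"])
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when launching the analysis binary, after gsiklog has obtained the AFS token.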

17 BaBar job submission with afs
[Diagram. Components: users' desktops, user area in AFS cell, metadata catalog, local site farm, remote site farms, remote site AFS cell. Labelled flows: data location, globus_job_submit, gsiklog, output.]

18 Conclusions
- The use of a shared file system such as AFS has greatly simplified the input/output sandbox, especially in a complicated case like user analysis.
- There might be concern about performance, but the comparison here should be between running on an overloaded local system and running on a non-overloaded shared system.
- The experience with the demonstrator has produced a nice GUI, but it lacks flexibility because the HTTP server has to be set up on purpose and a three-step data transfer is required to bring the output back to the user's desktop.

