Glexec/SCAS Pilot: IN2P3-CC status (2009/04/08)
Pierre Girard, CCIN2P3
T1-T2, 2009-02-03

Content
- Grid deployment at CCIN2P3
- Initial plan for the Glexec/SCAS pilot
- Setting-up issues
- Conclusion

Grid Job Management at CCIN2P3
- Several grid WN versions available at the same time
- No MW installed locally on the workers
- [Diagram labels: Computing Elements; BQS; Anastasie; WNs; shared FS; AFS; WN profiles Glite-WN-3.1.26-glexec, Glite-WN-3.1.26-prod, Glite-WN-3.1.19-prod, Glite-WN-3.1.666-pps, Globus4-WN]

Overview of grid job submission
- [Diagram: a grid job (credentials, RSL, U-job) is submitted to the Computing Element; the Job Manager is spawned, wraps the job locally and passes it to BQS with qsub; the U-job runs on a WN (SL4.5) with Glite-WN]
- Local job wrapping (excerpt):
    #!/bin/sh
    #PBS -q T
    #PBS -l M=2200MB
    #PBS -l T=3801600
    #PBS -l scratch=16250MB
    #PBS -l platform=LINUX
    #PBS --share T1prod
    …
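
The local job wrapping step shown on this slide can be illustrated with a minimal sketch that renders the BQS directive header from a set of submission parameters. The function name `build_job_header` and its parameter names are hypothetical; the directive lines are copied from the slide, and reading `M` as the memory limit and `T` as the CPU limit is an assumption.

```python
def build_job_header(job_class, mem_mb, cpu_limit, scratch_mb, platform, share):
    """Render the directive header seen on the slide (hypothetical helper).

    Assumption: M is the memory limit and T the CPU limit; the slide
    does not spell out the meaning of each resource name.
    """
    lines = [
        "#!/bin/sh",
        f"#PBS -q {job_class}",
        f"#PBS -l M={mem_mb}MB",
        f"#PBS -l T={cpu_limit}",
        f"#PBS -l scratch={scratch_mb}MB",
        f"#PBS -l platform={platform}",
        f"#PBS --share {share}",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the slide's example wrapper.
header = build_job_header("T", 2200, 3801600, 16250, "LINUX", "T1prod")
print(header, end="")
```

Keeping the header generation in one function means the submission parameters decided by the job manager only need to be passed in; the user job payload would be appended after this header.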

WN profile selection by BQS JobManager
- [Diagram: same submission flow as on the previous slide; the BQS JM applies the rules of the BQS-JM config and sets the WN profile during local job wrapping, before qsub; on the WN (SL4.5), the U-job is dynamically linked to the selected WN profile on AFS, here Glite-WN-3.1.26-glexec]
- Available WN profiles: Globus4-WN, Glite-WN-3.1.666-pps, Glite-WN-3.1.19-prod, Glite-WN-3.1.26-prod, Glite-WN-3.1.26-glexec
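
As an illustration of "dynamically link to WN profile", the sketch below points a per-job link at the selected profile directory. The slide does not say how the link is made, so the use of a symbolic link, the helper name `link_profile`, the link name `glite-wn` and the temporary directories standing in for the AFS profiles directory are all assumptions.

```python
import os
import tempfile


def link_profile(profiles_dir, profile_name, job_dir, link_name="glite-wn"):
    """Create job_dir/link_name pointing at profiles_dir/profile_name.

    Hypothetical helper: a symlink is only one possible way to "link"
    a job to a WN profile.
    """
    target = os.path.join(profiles_dir, profile_name)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"unknown WN profile: {profile_name}")
    link = os.path.join(job_dir, link_name)
    os.symlink(target, link)
    return link


# Demo with temporary directories in place of the real profiles directory.
profiles_dir = tempfile.mkdtemp(prefix="profiles-")
job_dir = tempfile.mkdtemp(prefix="job-")
os.mkdir(os.path.join(profiles_dir, "Glite-WN-3.1.26-glexec"))

link = link_profile(profiles_dir, "Glite-WN-3.1.26-glexec", job_dir)
resolved = os.path.realpath(link)
print(os.path.basename(resolved))
```

With this layout the job only ever refers to the link, so switching a job to another middleware version means changing the link target rather than the job itself.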

Last BQS JM enhancements
- BQS JM control
  - Submission policy (deny, accept)
  - Forbearance management if BQS becomes unresponsive
- BQS JM outputs
  - BQS submission parameters
    - Class: A (= short), G (= medium), T (= long), J (= verylong)
    - Amount of {Mem, CPU, Scratch}
    - Farm name
    - Platform (SL3, SL4, SL5)
    - Logical resources (list of): u_dcache_atlas, u_dcache_alice, u_OracleStress_atlas, …
    - VO share
  - Wrapped data
    - WN profile to be used
        profilesDirectory = /afs/
    - Site name
    - AFS token (or not)

Last BQS JM enhancements
- BQS JM configuration capabilities
  - (Most of) the BQS JM outputs are determined by configuration rules
  - A rule is basically an assignment
      Ex.: SubmissionPolicy = ACCEPT
  - But a rule can be conditioned on some job input data (in precedence order)
    - Mapped account
    - Mapped group
    - CE queue
      Ex.: UserSubmissionPolicy_atlas050 = DENY
      # Specific requirements for ATLAS with queue verylong
      GroupVirtualQueueMaxMem_atlas_verylong = default
      GroupVirtualQueueMaxCPU_atlas_verylong = max
      GroupVirtualQueueMaxScratch_atlas_verylong = default
- Configuration syntax
  - Is quite ugly
  - Does not allow conditions to be combined
  - But seems sufficient for now
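
The precedence order described on this slide (mapped account, then mapped group, then CE queue, then the unconditioned rule) can be sketched as a simple lookup. This is not the BQS JM implementation: the function `resolve_rule` is hypothetical, only the `User<Key>_<account>` key form is taken from the slide's examples, and the `Group<Key>_<group>` and `Queue<Key>_<queue>` forms are assumptions made for illustration.

```python
def resolve_rule(config, key, account=None, group=None, queue=None):
    """Return the value of the most specific rule defined for `key`.

    Lookup order follows the slide: mapped account, mapped group,
    CE queue, then the plain (unconditioned) assignment.
    The Group*/Queue* key names are assumed, not taken from the source.
    """
    cap = key[0].upper() + key[1:]
    candidates = []
    if account:
        candidates.append(f"User{cap}_{account}")
    if group:
        candidates.append(f"Group{cap}_{group}")   # assumed naming
    if queue:
        candidates.append(f"Queue{cap}_{queue}")   # assumed naming
    candidates.append(key)
    for name in candidates:
        if name in config:
            return config[name]
    return None


# Rules from the slide, held in a plain dictionary.
config = {
    "SubmissionPolicy": "ACCEPT",
    "UserSubmissionPolicy_atlas050": "DENY",
}

denied = resolve_rule(config, "SubmissionPolicy", account="atlas050", group="atlas")
accepted = resolve_rule(config, "SubmissionPolicy", account="atlas051", group="atlas")
print(denied, accepted)
```

Because each candidate key carries exactly one condition, a lookup like this cannot express "group AND queue" rules, which matches the limitation noted on the slide.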

Glexec deployment at IN2P3-CC
- Glexec is a tool deployed on the WN and used by the VOs to manage the "real user jobs" within a pilot job
  - With a setuid capability (the pilot job forks the "real user job" under another account)
  - Site authorization per "real user job", based on the real user's proxy
- How the deployment was planned
  - Deploy the Glite-WN/Glexec relocated on AFS
  - Use the configuration capabilities to redirect the pilot jobs to this deployment
      profilesDirectory = /afs/
      UserProfilesDirectory_dteam049 = /afs/
  - Sounded easy…

Glexec deployment issues at IN2P3-CC
- Glexec must be installed locally on the worker
  - The absolute path of the configuration file is hardcoded: /opt/glite/etc/glite.conf
    - Only one MW configuration possible
  - Dynamic library configuration (due to "setuid"): /etc/
    - Only one MW installation possible
  - Log configuration (syslog)
    - Not so problematic for now

Glexec deployment in use at IN2P3-CC
- We are part of the "SCAS Pilot Service"
  - Asked to provide SCAS/glexec in production
  - Load test of the SCAS services by ATLAS and LHCb
- Deployment done
  - Usable by both LHCb and Dteam
  - Through the T1 CEs
  - According to specific VOMS roles/groups
- But
  - Deployment issues
    - Break our WN setup strategy
    - Relocatable distribution was not ready (home-made)
  - First tests with LHCb
    - Were not satisfactory
    - Raised some questions

Glite-3.2.0 (SL5) at IN2P3-CC
- Glite-WN only
  - Deployed on AFS
  - Tested with a test CE on BQS farm "lcg"
- Will be activated
  - As soon as SL5 workers enter production (done)
  - A queue will be added to the T2 (T1?) CEs