Published by Bryce Hood. Modified over 9 years ago.
1
EGEE-II INFSO-RI-031688
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE and gLite are registered trademarks

A GSI-secured job manager for connecting PBS servers in independent administrative domains
John Walsh, Brian Coghlan, Stephen Childs, Eamonn Kenny (Trinity College Dublin/EGEE)
EGEE 2nd User Forum – Manchester, May 2007
2
Introduction
RemotePBS
  – Based on/extends lcgpbs job manager on LCG-CE
  – Implements secure execution of grid jobs on remote batch systems (RBS)
  – Separate administrative domains
  – Single gatekeeper, multiple RBS model
  – RBS head/submit node
      • Installed with gLite WN (+ YAIM/Quattor)
      • Lightweight
      • Additional “mini” information provider (IP)
      • Remote access uses grid credentials
  – A work in progress, but used at three production EGEE sites
      • Restricted VO/users
EGEE 2nd User Forum, Manchester, May 11th 2007
3
Current GK problems
lcgpbs
  – Allows “remote” execution using routing queues
  – Requires /etc/hosts.equiv authentication
      • Known PBS issue
      • Remote batch submit node → gatekeeper
      • Weak security model
gLite-CE
  – Separate CE/RBS possible
  – RBS requires /etc/hosts.equiv
  – Same administrative domain
4
Mini IP
# gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan, mpUCDie, local, grid
dn: GlueCEUniqueID=gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan,mds-vo-name=mpUCDie,mds-vo-name=local,o=grid
GlueCEHostingCluster: gridgate.ucd.ie
GlueCEName: rowan
GlueCEUniqueID: gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan
GlueCEInfoGatekeeperPort: 2119
GlueCEInfoHostName: gridgate.ucd.ie
GlueCEInfoLRMSType: remotepbs
GlueCEInfoLRMSVersion: 2.1.8
GlueCEInfoTotalCPUs: 194
GlueCEInfoJobManager: remotepbs
GlueCEInfoContactString: gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan
GlueCEInfoApplicationDir: /home/

# cosmo, gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan, mpUCDie, local, grid
dn: GlueVOViewLocalID=cosmo,GlueCEUniqueID=gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan,mds-vo-name=mpUCDie,mds-vo-name=local,o=grid
GlueVOViewLocalID: cosmo
GlueCEAccessControlBaseRule: VO:cosmo
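As an illustration of what a "mini" information provider has to emit, the sketch below renders a GlueCE entry like the one on this slide from a handful of parameters. It is not the RemotePBS code: the function name `render_glue_ce` and its argument layout are hypothetical, and only the attribute names and sample values are taken from the slide.

```python
def render_glue_ce(host, port, jobmanager, queue, site, extra_attrs):
    """Render one GlueCE entry as LDIF text (illustrative only)."""
    # The unique ID combines gatekeeper host, port, job manager and queue name.
    unique_id = f"{host}:{port}/jobmanager-{jobmanager}-{queue}"
    dn = f"GlueCEUniqueID={unique_id},mds-vo-name={site},mds-vo-name=local,o=grid"
    lines = [
        f"dn: {dn}",
        f"GlueCEHostingCluster: {host}",
        f"GlueCEName: {queue}",
        f"GlueCEUniqueID: {unique_id}",
        f"GlueCEInfoGatekeeperPort: {port}",
        f"GlueCEInfoHostName: {host}",
        f"GlueCEInfoLRMSType: {jobmanager}",
        f"GlueCEInfoJobManager: {jobmanager}",
        f"GlueCEInfoContactString: {unique_id}",
    ]
    # Remaining attributes (LRMS version, CPU count, ...) are passed through.
    lines += [f"{key}: {value}" for key, value in extra_attrs.items()]
    return "\n".join(lines) + "\n"


# Sample values copied from the slide.
entry = render_glue_ce(
    "gridgate.ucd.ie", 2119, "remotepbs", "rowan", "mpUCDie",
    {"GlueCEInfoLRMSVersion": "2.1.8", "GlueCEInfoTotalCPUs": 194},
)
print(entry)
```

Deriving the DN, unique ID and contact string from the same four values keeps them consistent, which matters because the queue name embedded in the contact string is what the job manager later uses as its lookup key.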
5
RemotePBS network architecture
[Figure: RemotePBS network architecture diagram]
6
Job execution flow
Remote PBS queue info published by site BDII
TLGS/RB ↔ GK interaction remains the same
However, no local queue required on GK
  – Queue name used by remotepbs as lookup to config data
      • Remote submission node name
      • Remote gsisshd port
      • Real remote queue name on RBS
      • Additional PBS server directives (PPN etc)
Job Script/Data constructed on GK
  – Minor modifications
      • Symbolic links are now relative
  – Copied to remote submission node via gsissh
  – Job submitted via gsissh using qsub on remote submission node
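The lookup-then-submit flow on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the RemotePBS implementation: the `REMOTE_QUEUES` table, its keys, the host name and the port are hypothetical; using `gsiscp` for the copy step is an assumption (the slide only says the data is copied via gsissh); and the `-p`/`-P` port options are assumed to behave as in OpenSSH. The code only builds the command lines and does not run them.

```python
import shlex

# Hypothetical per-queue configuration: the grid-visible queue name maps to
# the remote submission node, its gsisshd port, the real PBS queue and any
# extra qsub directives.
REMOTE_QUEUES = {
    "rowan": {
        "host": "rbs.example.org",
        "gsissh_port": 2222,
        "remote_queue": "batch",
        "directives": ["-l", "nodes=1:ppn=2"],
    },
}


def build_commands(queue_name, local_job_dir, remote_job_dir, script_name):
    """Return (copy_cmd, submit_cmd) as argv lists for the given grid queue."""
    cfg = REMOTE_QUEUES[queue_name]
    host = cfg["host"]
    port = str(cfg["gsissh_port"])
    # Step 1: copy the job script and data to the remote submission node.
    copy_cmd = ["gsiscp", "-r", "-P", port,
                local_job_dir, f"{host}:{remote_job_dir}"]
    # Step 2: run qsub on the remote node against the real queue name.
    qsub = ["qsub", "-q", cfg["remote_queue"], *cfg["directives"],
            f"{remote_job_dir}/{script_name}"]
    remote_command = " ".join(shlex.quote(arg) for arg in qsub)
    submit_cmd = ["gsissh", "-p", port, host, remote_command]
    return copy_cmd, submit_cmd


copy_cmd, submit_cmd = build_commands("rowan", "/tmp/job1", "jobs/job1", "job.sh")
print(copy_cmd)
print(submit_cmd)
```

Because the grid-visible queue name is only a key into this table, the gatekeeper needs no local PBS queue of its own, which matches the point made on the slide.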
7
Job status
Job ID tracked by GK
Monitor process on GK looks up all jobs
  – Iterates over all remote jobs
  – Gets unique remote host/queuename pairs
  – Gsissh qstats to all unique hosts for user jobs
  – Removes completed jobs
      • Safe clean up job data on RBS
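The monitoring loop described above can be sketched as below. This is an illustrative outline, not the actual monitor process: the job record fields and the `query_active_ids` callable are hypothetical. In a real deployment that callable would presumably wrap a `gsissh ... qstat` call and parse its output; here it is injected so the grouping logic can be shown without assuming a particular qstat output format.

```python
from collections import defaultdict


def find_completed(tracked_jobs, query_active_ids):
    """Return tracked jobs that are no longer known to their remote host.

    tracked_jobs: list of dicts with 'host', 'queue' and 'job_id' keys.
    query_active_ids: callable(host) -> set of job IDs still queued/running.
    """
    # Group jobs so that each unique remote host is queried only once.
    jobs_by_host = defaultdict(list)
    for job in tracked_jobs:
        jobs_by_host[job["host"]].append(job)

    completed = []
    for host, jobs in jobs_by_host.items():
        active = query_active_ids(host)
        # A tracked job missing from the active set is treated as completed.
        completed += [job for job in jobs if job["job_id"] not in active]
    return completed


# Stand-in for the remote query, with made-up hosts and job IDs.
def fake_query(host):
    return {"rbs1.example.org": {"101.rbs1"}, "rbs2.example.org": set()}[host]


jobs = [
    {"host": "rbs1.example.org", "queue": "batch", "job_id": "101.rbs1"},
    {"host": "rbs1.example.org", "queue": "batch", "job_id": "102.rbs1"},
    {"host": "rbs2.example.org", "queue": "long", "job_id": "7.rbs2"},
]
done = find_completed(jobs, fake_query)
print([job["job_id"] for job in done])
```

Querying once per host rather than once per job keeps the number of gsissh connections proportional to the number of remote batch systems, not to the number of jobs.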
8
RBS setup
Gsisshd from VDT
  – RBS needs host cert
  – Config can limit connection to only those from GK
Shared home directory on RBS/W
Modules (optional)
  – User Grid Context can be determined at login to RBS
  – Grid environment set up
      • “module load grid.ie” implicit with gsissh connection
  – Allows static user to use local batch + grid access
9
Current issues
JM doesn’t yet implement Access Control on Users/VO
  – Globus monitoring process connects for invalid user
Lifetime of LCG-CE
  – Move to gLite-CE(?)
  – Timeframe to implement equivalent + improvements
      • gLite-CE BLAHP could simplify matter
Independent pool accounts not yet possible
  – Username and $HOME must be same on GK and RBS
  – Use static accounts
  – Need to implement pool on CE + pool or static on RBS
gsissh needs quick timeout
  – RBS responsive?
APEL accounting records
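One way to approach the "quick timeout" issue is to bound every remote call from the gatekeeper side, so that an unresponsive RBS cannot stall the monitor. The sketch below is an assumption about how this could be done, not something the slides describe: `run_with_timeout` is a hypothetical helper, and the demonstration runs the local Python interpreter rather than gsissh so that it works anywhere.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; return (ok, stdout). ok is False on timeout or non-zero exit."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before the exception is raised.
        return False, ""
    return proc.returncode == 0, proc.stdout


# Stand-ins for a responsive and an unresponsive remote command.
fast = run_with_timeout([sys.executable, "-c", "print('up')"], 30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
print(fast, slow)
```

In practice `argv` would be a gsissh command line. A connection-level limit may also be available through the underlying SSH client options, but that depends on the installed gsissh build and should be checked locally.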
10
Summary
RemotePBS
  – Implements secure execution of grid jobs on RBS
  – Separate administrative domains
  – Single gatekeeper, multiple RBS model
  – Accommodates Compute Centres with headnode-only model
  – A work in progress, but used at three production EGEE sites
Acknowledgements
  – David Golden (UCD & DIAS)
  – Maarten Litmaath & David Smith (CERN)
  – Alastair McKinstry (ICHEC)
  – Stephane Dudzinski (DIAS & TCD)
  – CosmoGrid project consortium