
1 A GSI-secured job manager for connecting PBS servers in independent administrative domains
John Walsh, Brian Coghlan, Stephen Childs, Eamonn Kenny (Trinity College Dublin/EGEE)
EGEE 2nd User Forum – Manchester, May 2007
EGEE-II INFSO-RI-031688, Enabling Grids for E-sciencE, www.eu-egee.org
EGEE and gLite are registered trademarks

2 Introduction
RemotePBS
  – Based on/extends lcgpbs job manager on LCG-CE
  – Implements secure execution of grid jobs on remote batch systems (RBS)
  – Separate administrative domains
  – Single gatekeeper, multiple RBS model
  – RBS head/submit node
    • Installed with gLite WN (+ YAIM/Quattor)
    • Lightweight
    • Additional “mini” information provider (IP)
    • Remote access uses grid credentials
  – A work in progress, but used at three production EGEE sites
    • Restricted VO/users
EGEE 2nd User Forum, Manchester, May 11th 2007

3 Current GK problems
lcgpbs
  – Allows “remote” execution using routing queues
  – Requires /etc/hosts.equiv authentication
    • Known PBS issue
    • Remote batch submit node → gatekeeper
    • Weak security model
gLite-CE
  – Separate CE/RBS possible
  – RBS requires /etc/hosts.equiv
  – Same administrative domain

4 Mini IP
# gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan, mpUCDie, local, grid
dn: GlueCEUniqueID=gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan,mds-vo-name=mpUCDie,mds-vo-name=local,o=grid
GlueCEHostingCluster: gridgate.ucd.ie
GlueCEName: rowan
GlueCEUniqueID: gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan
GlueCEInfoGatekeeperPort: 2119
GlueCEInfoHostName: gridgate.ucd.ie
GlueCEInfoLRMSType: remotepbs
GlueCEInfoLRMSVersion: 2.1.8
GlueCEInfoTotalCPUs: 194
GlueCEInfoJobManager: remotepbs
GlueCEInfoContactString: gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan
GlueCEInfoApplicationDir: /home/

# cosmo, gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan, mpUCDie, local, grid
dn: GlueVOViewLocalID=cosmo,GlueCEUniqueID=gridgate.ucd.ie:2119/jobmanager-remotepbs-rowan,mds-vo-name=mpUCDie,mds-vo-name=local,o=grid
GlueVOViewLocalID: cosmo
GlueCEAccessControlBaseRule: VO:cosmo
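The entries above follow a regular pattern, so a mini information provider can be little more than a script that prints them. The following is a minimal sketch of that idea, not the RemotePBS implementation; the function names `render_ldif_entry` and `glue_ce_entry` are hypothetical, and only attributes visible on the slide are emitted.

```python
def render_ldif_entry(dn, attributes):
    """Render one LDIF entry: a 'dn:' line followed by 'name: value' lines."""
    lines = [f"dn: {dn}"]
    for name, value in attributes:
        lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


def glue_ce_entry(gatekeeper, port, queue, site, total_cpus, lrms_version):
    """Build a GlueCE entry shaped like the one shown on the slide.

    Hypothetical helper: the attribute set is copied from the slide and is
    not a complete Glue schema record.
    """
    unique_id = f"{gatekeeper}:{port}/jobmanager-remotepbs-{queue}"
    dn = (
        f"GlueCEUniqueID={unique_id},"
        f"mds-vo-name={site},mds-vo-name=local,o=grid"
    )
    attributes = [
        ("GlueCEHostingCluster", gatekeeper),
        ("GlueCEName", queue),
        ("GlueCEUniqueID", unique_id),
        ("GlueCEInfoGatekeeperPort", port),
        ("GlueCEInfoHostName", gatekeeper),
        ("GlueCEInfoLRMSType", "remotepbs"),
        ("GlueCEInfoLRMSVersion", lrms_version),
        ("GlueCEInfoTotalCPUs", total_cpus),
        ("GlueCEInfoJobManager", "remotepbs"),
        ("GlueCEInfoContactString", unique_id),
    ]
    return render_ldif_entry(dn, attributes)


if __name__ == "__main__":
    # Values taken from the slide.
    print(glue_ce_entry("gridgate.ucd.ie", 2119, "rowan", "mpUCDie", 194, "2.1.8"))
```

In a real provider the CPU count and LRMS version would be queried from the batch system rather than passed in as constants.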

5 RemotePBS network architecture
[Figure: RemotePBS network architecture]

6 Job execution flow
Remote PBS queue info published by site BDII
TLGS/RB ↔ GK interaction remains the same
However, no local queue required on GK
  – Queue name used by remotepbs as lookup to config data
    • Remote submission node name
    • Remote gsisshd port
    • Real remote queue name on RBS
    • Additional PBS server directives (PPN etc)
Job Script/Data constructed on GK
  – Minor modifications
    • Symbolic links are now relative
  – Copied to remote submission node via gsissh
  – Job submitted via gsissh using qsub on remote submission node
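The lookup-then-submit flow on this slide can be sketched as follows. This is an illustration under stated assumptions, not the remotepbs job manager itself: the `REMOTE_QUEUES` table, its keys, the host name `rbs.example.org`, the port, and the function name are all hypothetical, and `gsiscp` is assumed as the copy tool (the slide says only that the copy goes over gsissh).

```python
import shlex

# Hypothetical per-queue configuration; the real remotepbs config format
# is not shown in the slides.
REMOTE_QUEUES = {
    "rowan": {
        "submit_host": "rbs.example.org",           # remote submission node name
        "gsissh_port": 2222,                         # remote gsisshd port
        "remote_queue": "grid",                      # real queue name on the RBS
        "pbs_directives": ["-l", "nodes=1:ppn=2"],   # extra PBS directives (PPN etc)
    },
}


def build_submit_commands(queue_name, local_job_dir, remote_job_dir, script_name):
    """Return (copy_cmd, submit_cmd) argument lists for one grid job.

    The queue name advertised by the gatekeeper is only a key into the
    config table; no PBS queue of that name needs to exist on the GK.
    """
    cfg = REMOTE_QUEUES[queue_name]
    host = cfg["submit_host"]
    port = str(cfg["gsissh_port"])

    # Copy the job script/data directory to the remote submission node.
    copy_cmd = ["gsiscp", "-P", port, "-r", local_job_dir, f"{host}:{remote_job_dir}"]

    # Run qsub on the remote submission node against the real remote queue.
    qsub = [
        "qsub", "-q", cfg["remote_queue"],
        *cfg["pbs_directives"],
        f"{remote_job_dir}/{script_name}",
    ]
    submit_cmd = ["gsissh", "-p", port, host, shlex.join(qsub)]
    return copy_cmd, submit_cmd


if __name__ == "__main__":
    for cmd in build_submit_commands("rowan", "/tmp/job1", "jobs/job1", "job.sh"):
        print(shlex.join(cmd))
```

Building argument lists instead of shell strings keeps the local side free of quoting problems; only the remote `qsub` command is joined into a single string, because the remote shell receives it as one argument.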

7 Job status
Job ID tracked by GK
Monitor process on GK looks up all jobs
  – Iterates over all remote jobs
  – Gets unique remote host/queue name pairs
  – Runs qstat via gsissh on each unique host for the user's jobs
  – Removes completed jobs
    • Safely cleans up job data on RBS
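The monitoring loop above amounts to grouping tracked jobs by remote host/queue pair so that each pair is queried only once. A minimal sketch follows; the job record fields (`host`, `queue`, `remote_id`) and the `query_active_ids` callback are hypothetical. In practice the callback would run `qstat` over gsissh. The sketch also assumes that a job no longer listed by the remote batch system has completed.

```python
from collections import defaultdict


def group_jobs_by_host(jobs):
    """Map each unique (host, queue) pair to the remote job IDs tracked for it."""
    grouped = defaultdict(list)
    for job in jobs:
        grouped[(job["host"], job["queue"])].append(job["remote_id"])
    return dict(grouped)


def find_completed(jobs, query_active_ids):
    """Return remote job IDs that the remote batch system no longer reports.

    query_active_ids(host, queue) must return the set of job IDs still
    known to that host; it is called once per unique (host, queue) pair.
    """
    completed = []
    for (host, queue), job_ids in group_jobs_by_host(jobs).items():
        active = query_active_ids(host, queue)
        completed.extend(job_id for job_id in job_ids if job_id not in active)
    return completed


if __name__ == "__main__":
    tracked = [
        {"host": "rbs-a", "queue": "grid", "remote_id": "101.rbs-a"},
        {"host": "rbs-a", "queue": "grid", "remote_id": "102.rbs-a"},
        {"host": "rbs-b", "queue": "grid", "remote_id": "7.rbs-b"},
    ]

    # Stand-in for a remote qstat: only job 101 is still active on rbs-a.
    def fake_query(host, queue):
        return {"101.rbs-a"} if host == "rbs-a" else set()

    print(find_completed(tracked, fake_query))
```

Passing the query function in as a parameter keeps the grouping logic testable without a live RBS.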

8 RBS setup
gsisshd from VDT
  – RBS needs host cert
  – Config can limit connection to only those from GK
Shared home directory on RBS/W
Modules (optional)
  – User Grid Context can be determined at login to RBS
  – Grid environment set up
    • “module load grid.ie” implicit with gsissh connection
  – Allows static user to use local batch + grid access

9 Current issues
JM doesn’t yet implement Access Control on Users/VO
  – Globus monitoring process connects for invalid user
Lifetime of LCG-CE
  – Move to gLite-CE(?)
  – Timeframe to implement equivalent + improvements
    • gLite-CE BLAHP could simplify matters
Independent pool accounts not yet possible
  – Username and $HOME must be same on GK and RBS
  – Use static accounts
  – Need to implement pool on CE + pool or static on RBS
gsissh needs quick timeout
  – RBS responsive?
APEL accounting records
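For the "gsissh needs quick timeout" item, one possible approach is to bound each remote call on the gatekeeper side so that an unresponsive RBS cannot stall the monitor. The sketch below is an assumption about how that could look, not the authors' fix; the function names are hypothetical. It relies on gsissh accepting the standard OpenSSH `-o ConnectTimeout` and `-o BatchMode` options, which holds insofar as gsissh is a patched OpenSSH client.

```python
import subprocess


def gsissh_command(host, port, remote_command, connect_timeout=10):
    """Build a gsissh argument list with a connection timeout (assumed options)."""
    return [
        "gsissh",
        "-p", str(port),
        "-o", f"ConnectTimeout={connect_timeout}",
        "-o", "BatchMode=yes",   # never block waiting for interactive input
        host,
        remote_command,
    ]


def run_with_timeout(command, timeout_seconds):
    """Run a command; return the CompletedProcess, or None if it timed out."""
    try:
        return subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # hard limit on total run time
        )
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Hypothetical host and port, for illustration only.
    print(gsissh_command("rbs.example.org", 2222, "qstat"))
```

A caller could treat a `None` result as "RBS not responsive", skip that host for the current monitoring pass, and retry on the next one.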

10 Summary
RemotePBS
  – Implements secure execution of grid jobs on RBS
  – Separate administrative domains
  – Single gatekeeper, multiple RBS model
  – Accommodates Compute Centres with headnode-only model
  – A work in progress, but used at three production EGEE sites
Acknowledgements
  – David Golden (UCD & DIAS)
  – Maarten Litmaath & David Smith (CERN)
  – Alastair McKinstry (ICHEC)
  – Stephane Dudzinski (DIAS & TCD)
  – CosmoGrid project consortium

