INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org EGEE is a project funded by the European Union under contract IST-2003-508833 Job sandboxes.


1 Job sandboxes management with WMProxy
Fabrizio Pacini (email: fabrizio.pacini@datamat.it)
EGEE JRA1 All Hands Meeting, Brno, 20-22 June 2005
INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org
EGEE is a project funded by the European Union under contract IST-2003-508833

2 Outline
  - WMProxy intro
  - New request types
  - Sandboxes management
  - Demo
  - Possible extensions

3 WMProxy (1/2)
WMProxy is the web-service-based interface to the WMS
  - WS-I compliant WSDL description of the services made available by the WMS
  - Developed in C++ using gSOAP 2.7.0 as the SOAP stub generator
Not only a WS interface: the NS component has been almost completely refactored
  - to include some of the logic “embedded” on the client side
  - to provide new functionality
  - to provide better error reporting
  - to improve usability and scalability

4 WMProxy (2/2)
WMProxy runs as a FastCGI script in an Apache + GridSite container
  - FastCGI is a language-independent, scalable, open extension to CGI that provides high performance and persistence. FastCGI applications use TCP or Unix sockets to communicate with the web server
AuthN/delegation provided by GridSite
  - AuthN: mod_gridsite Apache module
  - Delegation: libgridsite and the delegation port type
LCMAPS for user mapping
  - the GSI-free flavour of LCMAPS
FQAN-based AuthZ
  - using GridSite GACL

5 New request types
Parametric jobs:
  - attributes in the JDL vary their values according to a parameter
  - submitting a parametric job generates the submission of several instances of the same job, differing only in the value of the parameter
  - job instances are submitted as nodes of a DAG without dependencies
Job collections: sets of independent jobs to be submitted in one shot
  - jobs of a collection are submitted as nodes of a DAG without dependencies
Support for the new types relies strongly on newly developed JDL converters and on the DAG submission support
  - all JDL conversions are performed on the server
“Smarter” WMS client commands/API
  - allow submission of DAGs, collections and parametric jobs exploiting the concept of “shared sandbox”
  - allow automatic generation and submission of collections from sets of JDL files located in user-specified directories on the UI
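As a rough illustration of the parametric-job expansion on this slide, the Python sketch below turns one job template into several instances that differ only in the parameter value. It is not the WMProxy JDL converter: the function name `expand_parametric`, the `_PARAM_` placeholder token and the dict-based job description are assumptions made for this example.

```python
from typing import Dict, List

# Assumed placeholder token, chosen for illustration only.
PLACEHOLDER = "_PARAM_"


def expand_parametric(template: Dict[str, str], values: List[int]) -> List[Dict[str, str]]:
    """Return one job description per parameter value.

    Every attribute value containing the placeholder is rewritten with the
    concrete parameter; all other attributes are copied unchanged.
    """
    instances = []
    for value in values:
        instance = {
            name: text.replace(PLACEHOLDER, str(value))
            for name, text in template.items()
        }
        instances.append(instance)
    return instances


# Hypothetical template: attribute names mimic JDL, values are plain strings.
template = {
    "Executable": "sim.exe",
    "Arguments": "--event _PARAM_",
    "StdOutput": "out_PARAM_.txt",
}

# Each instance would become one node of a DAG without dependencies.
jobs = expand_parametric(template, [1, 2, 3])
for job in jobs:
    print(job["Arguments"], job["StdOutput"])
```

The instances share every attribute that does not mention the placeholder, which is what makes a shared sandbox attractive for this kind of request.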

6 Shared Sandboxes (for compound jobs)
How it was:
  - No sandbox for compound jobs, only for their sub-jobs
  - All sub-job sandboxes are transferred separately, no matter which files they contain
How it is:
  - JDL has been extended to allow specification of the input sandbox at the level of the compound request (i.e. DAGs, collections and parametric jobs)
  - This input sandbox is “inherited” by all sub-jobs of the compound job that do not specify their own sandbox
  - This input sandbox is transferred only once by the new WMS client commands but can be accessed by all sub-jobs of the compound job
  - Sub-job sandboxes can also refer to individual files of the “shared sandbox” (e.g. InputSandbox = root.InputSandbox[0];)
  - Sub-job sandboxes can refer to sandboxes of other sub-jobs
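The inheritance rules on this slide can be sketched in Python as follows. This is a simplified model, not the WMS implementation: `resolve_input_sandbox`, the node names and the file names are hypothetical, references are treated as plain strings of the form `root.InputSandbox[i]`, and references to other sub-jobs' sandboxes are not modelled.

```python
from typing import Dict, List, Optional


def resolve_input_sandbox(
    root_sandbox: List[str],
    node_sandbox: Optional[List[str]],
) -> List[str]:
    """Resolve the effective input sandbox of one sub-job.

    - A sub-job without its own sandbox inherits the shared one.
    - An entry of the form "root.InputSandbox[i]" is replaced by the i-th
      file of the shared sandbox; other entries are kept as they are.
    """
    if node_sandbox is None:
        return list(root_sandbox)

    prefix, suffix = "root.InputSandbox[", "]"
    resolved = []
    for entry in node_sandbox:
        if entry.startswith(prefix) and entry.endswith(suffix):
            index = int(entry[len(prefix):-len(suffix)])
            resolved.append(root_sandbox[index])
        else:
            resolved.append(entry)
    return resolved


# Hypothetical compound job: one shared sandbox, two sub-jobs.
shared = ["common.exe", "calib.dat"]
nodes: Dict[str, Optional[List[str]]] = {
    "nodeA": None,                                    # inherits everything
    "nodeB": ["root.InputSandbox[0]", "private.cfg"],  # one shared file + own file
}

effective = {name: resolve_input_sandbox(shared, sb) for name, sb in nodes.items()}
print(effective)
```

Because sub-jobs only hold references, the shared files need to be transferred once, whatever the number of sub-jobs.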

7 Sparse Input Sandboxes (1/2)
How it was:
  - The input sandbox is a list of files (absolute and relative paths) located on the UI file system which are needed by the job when running on the WN
    InputSandbox = {"/tmp/ns.log", "mytest.exe", "myscript.sh", "data/event1.txt"};
  - Files arrive on the WN in two steps
      * from the UI to the job sandbox area on the WMS node (upload performed by the WMS client)
      * from the job sandbox area on the WMS node to the job working dir on the WN (download performed by the JobWrapper)

8 Sparse Input Sandboxes (2/2)
How it is:
  - The input sandbox can contain
      * file paths on the UI machine (i.e. the usual way)
      * URIs pointing to files on a remote gridFTP/HTTPS server
    InputSandbox = {
      "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
      "https://ghemon.cnaf.infn.it:8443/data/idat_1",
      "file:///home/pacio/myconf"
    };
  - A base URI to be applied to all ISB files can also be specified
    InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";
  - Only local files (file://) are copied to the WMS node
  - Files pointed to by URIs are copied directly to the WN by the JobWrapper just before the job is started
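A minimal Python sketch of the routing described on this slide: local entries go through the WMS node, while remote URIs are left for the JobWrapper to fetch directly on the WN. The function `split_input_sandbox` is hypothetical. How the base URI combines with individual entries is an assumption here (entries without a URI scheme are appended to it); the slide only says that it is applied to all ISB files.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

# Schemes named on the slide as remote sources.
REMOTE_SCHEMES = {"gsiftp", "https"}


def split_input_sandbox(
    entries: List[str],
    base_uri: Optional[str] = None,
) -> Tuple[List[str], List[str]]:
    """Split entries into (uploaded via the WMS node, fetched directly on the WN)."""
    via_wms: List[str] = []
    direct_to_wn: List[str] = []
    for entry in entries:
        scheme = urlsplit(entry).scheme
        if scheme in REMOTE_SCHEMES:
            direct_to_wn.append(entry)
        elif scheme == "file" or base_uri is None:
            # Local file on the UI: the client uploads it to the WMS node.
            via_wms.append(entry)
        else:
            # Assumption: a scheme-less entry is resolved against the base URI.
            direct_to_wn.append(base_uri.rstrip("/") + "/" + entry.lstrip("/"))
    return via_wms, direct_to_wn


# The InputSandbox from the slide.
isb = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "https://ghemon.cnaf.infn.it:8443/data/idat_1",
    "file:///home/pacio/myconf",
]
via_wms, direct_to_wn = split_input_sandbox(isb)
print("via WMS node:", via_wms)
print("direct to WN:", direct_to_wn)
```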

9 Sparse Output Sandboxes (1/2)
How it was:
  - The output sandbox is a list of files (file names and paths relative to the job working dir) generated by the job while running
    OutputSandbox = {"myjobOutput", "myjobError", "run/result1", "run/result2"};
  - These files arrive on the UI in two steps
      * from the WN to the job sandbox area on the WMS node (upload performed by the JobWrapper)
      * from the job sandbox area on the WMS node to the UI (download performed by the WMS client on user request)

10 Sparse Output Sandboxes (2/2)
How it is:
  - JDL has been enriched with a new attribute specifying the destinations for the files listed in the OutputSandbox attribute list
    OutputSandbox = {"myjobOutput", "run1/event1", "myjobError"};
    OutputSandboxDestURI = {
      "gsiftp://matrix.datamat.it:/var/myjobOutput",
      "https://grid003.ct.infn.it:8443/home/cms/event1",
      "gsiftp://matrix.datamat.it:/var/myjobError"
    };
  - Files are copied by the JobWrapper to the specified destination without transiting on the WMS node
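The pairing between the two attributes can be sketched in Python as below. The positional pairing (i-th file goes to the i-th URI) is an assumption inferred from the ordering of the slide's example, and `map_output_destinations` is a hypothetical helper, not part of the JobWrapper.

```python
from typing import Dict, List


def map_output_destinations(output_sandbox: List[str], dest_uris: List[str]) -> Dict[str, str]:
    """Pair each output file with its destination URI, by position."""
    if len(output_sandbox) != len(dest_uris):
        raise ValueError("OutputSandbox and OutputSandboxDestURI differ in length")
    return dict(zip(output_sandbox, dest_uris))


# Values copied from the slide.
output_sandbox = ["myjobOutput", "run1/event1", "myjobError"]
dest_uris = [
    "gsiftp://matrix.datamat.it:/var/myjobOutput",
    "https://grid003.ct.infn.it:8443/home/cms/event1",
    "gsiftp://matrix.datamat.it:/var/myjobError",
]

plan = map_output_destinations(output_sandbox, dest_uris)
for local_file, uri in plan.items():
    print(local_file, "->", uri)
```

Rejecting lists of different length is a design choice of this sketch; the slide does not say how such a mismatch is handled.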

11 Demo
[Diagram: UI (trinity.datamat.it), WMS (tigerman.cnaf.infn.it), CE, and gridFTP + HTTP server (ghemon.cnaf.infn.it). Arrow labels: JDL + exe, Job, ISB files, OSB files, Access job output.]

12 Demo

13 Pros
  - Allows inclusion of bigger files in the ISB
  - Allows saving disk space on the WMS node
  - JW independence from the way the WMS manages the job sandbox area (no matter if local or remote)
  - Allows sharing and reuse of sandboxes between different jobs, and passing files between nodes of a compound job, through very simple JDL descriptions
Note that the WMS does not manage the areas where sparse sandboxes are stored: it just uses them.
The old approach to sandboxes is still supported, however.

14 Extensions
Support for ‘sparse’ sandboxes located on SEs providing a gridFTP interface is there
We are also able to support sandboxes located on GridSite HTTPS servers (htcp, curl)
  - Some problems still remain to be fixed
  - htcp is not installed by default on WNs
Can this mechanism be extended to SEs not providing a gridFTP interface?
  - gLite I/O is not a transport protocol, so it can’t be used for this
  - Could tools like glite-url-copy be used for this purpose?

