London e-Science Centre Session 6: Distributed Computation Practical issues & Examples A. Stephen McGough Imperial College London Practical issues & Examples.

Slides:



Advertisements
Similar presentations
W w w. h p c - e u r o p a. o r g HPC-Europa Portal: Uniform Access to European HPC Infrastructure Ariel Oleksiak Poznan Supercomputing.
Advertisements

WS-JDML: A Web Service Interface for Job Submission and Monitoring Stephen M C Gough William Lee London e-Science Centre Department of Computing, Imperial.
Current status of grids: the need for standards Mike Mineter TOE-NeSC, Edinburgh.
Legacy code support for commercial production Grids G.Terstyanszky, T. Kiss, T. Delaitre, S. Winter School of Informatics, University.
OMII-UK Steven Newhouse, Director. © 2 OMII-UK aims to provide software and support to enable a sustained future for the UK e-Science community and its.
A JSDL Applications Repository and Data Staging Portal: Some New Parameter Sweep Developments and Data transfer Requirements David Meredith STFC e-Science.
Universität Dortmund Robotics Research Institute Information Technology Section Grid Metaschedulers An Overview and Up-to-date Solutions Christian.
Workload Management Massimo Sgaravatto INFN Padova.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
W w w. h p c - e u r o p a. o r g Single Point of Access to Resources of HPC-Europa Krzysztof Kurowski, Jarek Nabrzyski, Ariel Oleksiak, Dawid Szejnfeld.
QCDgrid Technology James Perry, George Beckett, Lorna Smith EPCC, The University Of Edinburgh.
Grid Computing 7700 Fall 2005 Lecture 17: Resource Management Gabrielle Allen
- 1 - Grid Programming Environment (GPE) Ralf Ratering Intel Parallel and Distributed Solutions Division (PDSD)
1 Dr. Markus Hillenbrand, ICSY Lab, University of Kaiserslautern, Germany A Generic Database Web Service for the Venice Service Grid Michael Koch, Markus.
Connecting OurGrid & GridSAM A Short Overview. Content Goals OurGrid: architecture overview OurGrid: short overview GridSAM: short overview GridSAM: example.
Workload Management WP Status and next steps Massimo Sgaravatto INFN Padova.
WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski Poznan Supercomputing.
Nicholas LoulloudesMarch 3 rd, 2009 g-Eclipse Testing and Benchmarking Grid Infrastructures using the g-Eclipse Framework Nicholas Loulloudes On behalf.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Grids and Portals for VLAB Marlon Pierce Community Grids Lab Indiana University.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
Grid Resource Allocation and Management (GRAM) Execution management Execution management –Deployment, scheduling and monitoring Community Scheduler Framework.
COMP3019 Coursework: Introduction to GridSAM Steve Crouch School of Electronics and Computer Science.
QCDGrid Progress James Perry, Andrew Jackson, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
Grid Workload Management & Condor Massimo Sgaravatto INFN Padova.
Leading the pervasive adoption of grid computing for research and industry © 2005 Global Grid Forum The information contained herein is subject to change.
Enabling Grids for E-sciencE ENEA and the EGEE project gLite and interoperability Andrea Santoro, Carlo Sciò Enea Frascati, 22 November.
Condor Birdbath Web Service interface to Condor
1 Overview of the Application Hosting Environment Stefan Zasada University College London.
© 2008 Open Grid Forum Independent Software Vendor (ISV) Remote Computing Primer Steven Newhouse.
London e-Science Centre GridSAM A Standards Based Approach to Job Submission A. Stephen M C Gough Imperial College London A Standards Based Approach to.
Grid Technologies  Slide text. What is Grid?  The World Wide Web provides seamless access to information that is stored in many millions of different.
CSF4 Meta-Scheduler Name: Zhaohui Ding, Xiaohui Wei
London e-Science Centre GridSAM Job Submission and Monitoring Web Service William Lee, Stephen McGough.
Grid Workload Management Massimo Sgaravatto INFN Padova.
Resource Brokering in the PROGRESS Project Juliusz Pukacki Grid Resource Management Workshop, October 2003.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
EGEE-Forum – May 11, 2007 Enabling Grids for E-sciencE EGEE and gLite are registered trademarks A gateway platform for Grid Nicolas.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks, An Overview of the GridWay Metascheduler.
GridSAM - A Standards Based Approach to Job Submission Through Web Services William Lee and Stephen McGough London e-Science Centre Department of Computing,
Ames Research CenterDivision 1 Information Power Grid (IPG) Overview Anthony Lisotta Computer Sciences Corporation NASA Ames May 2,
Grid Security: Authentication Most Grids rely on a Public Key Infrastructure system for issuing credentials. Users are issued long term public and private.
July 11-15, 2005Lecture3: Grid Job Management1 Grid Compute Resources and Job Management.
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
EGEE-II INFSO-RI Enabling Grids for E-sciencE The GILDA training infrastructure.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
Conference name Company name INFSOM-RI Speaker name The ETICS Job management architecture EGEE ‘08 Istanbul, September 25 th 2008 Valerio Venturi.
Campus grids: e-Infrastructure within a University Mike Mineter National e-Science Centre 14 February 2006.
Easy Access to Grid infrastructures Dr. Harald Kornmayer (NEC Laboratories Europe) Dr. Mathias Stuempert (KIT-SCC, Karlsruhe) EGEE User Forum 2008 Clermont-Ferrand,
Globus and PlanetLab Resource Management Solutions Compared M. Ripeanu, M. Bowman, J. Chase, I. Foster, M. Milenkovic Presented by Dionysis Logothetis.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Grid Interoperability Update on GridFTP tests Gregor von Laszewski
GridSAM: an Introduction Mike Mineter.
EGI Technical Forum Amsterdam, 16 September 2010 Sylvain Reynaud.
The NGS Grid Portal David Meredith NGS + Grid Technology Group, e-Science Centre, Daresbury Laboratory, UK
GridSAM and the Job Submission Description Language Presented by Mike Mineter (Most) slides from Stephen.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite – UNICORE interoperability Daniel Mallmann.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
London e-Science Centre Activity Schema What we’ve discussed already A. Stephen M C Gough Imperial College London What we’ve discussed already A. Stephen.
Claudio Grandi INFN Bologna Virtual Pools for Interactive Analysis and Software Development through an Integrated Cloud Environment Claudio Grandi (INFN.
London e-Science Centre Session 4: The GridSAM service A. Stephen McGough Imperial College London A. Stephen McGough Imperial College London.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Leading the pervasive adoption of grid computing for research and industry © 2005 Global Grid Forum The information contained herein is subject to change.
Introduction to the Application Hosting Environment
OGSA Data Architecture Scenarios
Wide Area Workload Management Work Package DATAGRID project
Grid Systems: What do we need from web service standards?
Presentation transcript:

London e-Science Centre Session 6: Distributed Computation Practical issues & Examples A. Stephen McGough Imperial College London Practical issues & Examples A. Stephen McGough Imperial College London

London e-Science Centre 2 Outline  Overview  DRM Systems  Condor  Globus (GT4)  gLite  Other Way  JSDL  GridSAM  Overview  DRM Systems  Condor  Globus (GT4)  gLite  Other Way  JSDL  GridSAM

London e-Science Centre Overview Running Jobs on the Grid

London e-Science Centre 4 Context Middleware Map to resources jobs / legacy code / binary executables Resources

London e-Science Centre 5 security Stages to using the Grid – Classical View write (code) to solve problem “compile” against middleware advertise Select resources Deploy to resources middleware submit to Grid accounting Steering and visualisation Stage data

London e-Science Centre 6 To make life easy We want to hide the heterogeneity of the Grid User Grid resources Hide heterogeneity by tight abstraction here

London e-Science Centre 7 Common Grid Systems There are many Grid Systems.  Here we illustrate three.  Globus  Condor  gLite There are many Grid Systems.  Here we illustrate three.  Globus  Condor  gLite

London e-Science Centre 8 Globus Execute work on remote resources Without the need to log into the resource Execute work on remote resources Without the need to log into the resource User Site boundary Resources Globus

London e-Science Centre 9 Globus Toolkit ™ A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications Offer a modular “ bag of technologies ” Enable incremental development of grid-enabled tools and applications Implement standard Grid protocols and APIs Make available under liberal open source license Used as a gateway to other resources A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications Offer a modular “ bag of technologies ” Enable incremental development of grid-enabled tools and applications Implement standard Grid protocols and APIs Make available under liberal open source license Used as a gateway to other resources

London e-Science Centre 10 Four Key Protocols The Globus Toolkit ™ centers around four key protocols Connectivity layer: Security: Control access but allow collaboration Resource layer: Resource Management: Grid Resource Allocation Management (WS-GRAM) Information: Information Index Data Transfer: Grid File Transfer Protocol (GridFTP)

London e-Science Centre 11 High-Throughput Computing High-performance: CPU cycles/second under ideal circumstances. “ How fast can I run simulation X on this machine? ” “ How big a simulation can I run? ” High-throughput: CPU cycles/day (week, month, year?) under non-ideal circumstances. “ How far can I progress simulation X on this machine? ” “ How many times can I run simulation X in the next month using all available machines? ” High-performance: CPU cycles/second under ideal circumstances. “ How fast can I run simulation X on this machine? ” “ How big a simulation can I run? ” High-throughput: CPU cycles/day (week, month, year?) under non-ideal circumstances. “ How far can I progress simulation X on this machine? ” “ How many times can I run simulation X in the next month using all available machines? ”

London e-Science Centre 12 Condor Perform high throughput jobs across many resources User Resources Condor

London e-Science Centre 13 Condor Designed as a cycle-stealing middleware Uses idle resource time to perform tasks Converts collections of computers into clusters If user takes back control of a resource then Condor job will either migrate or terminate Provides reliable job completion Re-run jobs that didn’t complete Selects best resource for job based on requirements U ses ClassAd Matchmaking to make sure that everyone is happy. Designed as a cycle-stealing middleware Uses idle resource time to perform tasks Converts collections of computers into clusters If user takes back control of a resource then Condor job will either migrate or terminate Provides reliable job completion Re-run jobs that didn’t complete Selects best resource for job based on requirements U ses ClassAd Matchmaking to make sure that everyone is happy.

London e-Science Centre 14 gLite Execute work on many distributed resources Without the need to log into the resource or selecting which one Execute work on many distributed resources Without the need to log into the resource or selecting which one User Site boundary Resources gLite

London e-Science Centre 15 EGEE (gLite) Mission Infrastructure Manage and operate production Grid for European Research Area Interoperate with e-Infrastructure projects around the globe Contribute to Grid standardisation efforts Support applications from diverse communities High Energy Physics Biomedicine Earth Sciences Astrophysics Computational Chemistry Fusion Geophysics Finance, Multimedia … Business Forge links with the full spectrum of interested business partners + Disseminate knowledge about the Grid through training + Prepare for sustainable European Grid Infrastructure Infrastructure Manage and operate production Grid for European Research Area Interoperate with e-Infrastructure projects around the globe Contribute to Grid standardisation efforts Support applications from diverse communities High Energy Physics Biomedicine Earth Sciences Astrophysics Computational Chemistry Fusion Geophysics Finance, Multimedia … Business Forge links with the full spectrum of interested business partners + Disseminate knowledge about the Grid through training + Prepare for sustainable European Grid Infrastructure

London e-Science Centre 16 gLite Combines much of the other two architectures (Globus, Condor)  Along with other functionality  Brokering service (WMS)  Data Storage (SE)  Deployed over a vast range of sites  Based in Europe  But spreading fast Combines much of the other two architectures (Globus, Condor)  Along with other functionality  Brokering service (WMS)  Data Storage (SE)  Deployed over a vast range of sites  Based in Europe  But spreading fast

London e-Science Centre 17 Features in a Grid Architecture Specification Submission Discovery Selection Staging Security Specification Submission Discovery Selection Staging Security

London e-Science Centre 18 Specification The ability to specify the job you want run and how you want it run Languages to specify what is required by the user All systems have their own language The ability to specify the job you want run and how you want it run Languages to specify what is required by the user All systems have their own language CondorComplex almost programming language (ClassAds) GlobusSimple description language (RSL) gLiteVariation on the Condor ClassAds language

London e-Science Centre 19 Submission The mechanism for submitting jobs to the Grid What mechanisms does the system support for job submission The mechanism for submitting jobs to the Grid What mechanisms does the system support for job submission CondorCommand line, Web Service, port, Standard DRMAA and Web Service GlobusCommand line, Web Service gLiteCommand line, API, (Some) Web Service

London e-Science Centre 20 Discovery The process of discovering resources as they become available and determining when they disappear Having a good knowledge of the current state of the resources helps in selection The process of discovering resources as they become available and determining when they disappear Having a good knowledge of the current state of the resources helps in selection CondorResources advertise themselves to the scheduler GlobusResources advertise themselves to a service that the scheduler can query gLiteResources advertise themselves to an information service that the WMS can query

London e-Science Centre 21 Selection The process used to select the best resources for the job to run on Mechanisms provided to ensure that each job is placed on the most appropriate resource The process used to select the best resources for the job to run on Mechanisms provided to ensure that each job is placed on the most appropriate resource CondorJobs and resources are “matched” together. Jobs will be launched when an idle resource matching the requirements is found GlobusMost of the selection is done by the user who specifies the resource, third party schedulers are available gLiteWorkload Management Services are used to select the best CE to send a job to

London e-Science Centre 22 Staging The process of getting data to resources so that they can perform the required tasks May be sending whole files in advance or streaming data The process of getting data to resources so that they can perform the required tasks May be sending whole files in advance or streaming data CondorJobs are given a virtual file space with read and write operations being passed back to the submission node GlobusJobs can be staged out or provided by streams gLiteJobs can be staged out or provided by streams. Storage elements can hold files.

London e-Science Centre 23 Security (the three A’s) We have lots of users of the Grid and many resources. How do we positively identify users and resources? Authentication Not all users will be able to use all resources. Authorisation Requirement to keep records of what users have done. Accounting We have lots of users of the Grid and many resources. How do we positively identify users and resources? Authentication Not all users will be able to use all resources. Authorisation Requirement to keep records of what users have done. Accounting

London e-Science Centre 24 Security Preventing inappropriate use of the resources Authentication and Authorisation are key Need to develop a level of trust for both users and the resource owners Preventing inappropriate use of the resources Authentication and Authorisation are key Need to develop a level of trust for both users and the resource owners CondorUses public key infrastructure x509 & Proxy GlobusUses public key infrastructure x509 & Proxy gLiteUses public key infrastructure x509 & Proxy + Annotations on the certificates

London e-Science Centre 25 Working Together These systems don’t interoperate  May use the same technologies though they can’t understand each other  To get them to work together wrappers are needed  Can’t submit direct from one to the other  Though wrappers exist between them These systems don’t interoperate  May use the same technologies though they can’t understand each other  To get them to work together wrappers are needed  Can’t submit direct from one to the other  Though wrappers exist between them

London e-Science Centre Other Way… Standards Based Job Submission

London e-Science Centre 27 If all DRM systems supported the same interface…  If we had:  One interface definition for job submission  One job description language  Then life would be easier!  We’re getting there  JSDL is a proposed standard job submission description language  OGSA-BES are proposing a basic execution service interface  One day hopefully everyone will support this  Till then…  If we had:  One interface definition for job submission  One job description language  Then life would be easier!  We’re getting there  JSDL is a proposed standard job submission description language  OGSA-BES are proposing a basic execution service interface  One day hopefully everyone will support this  Till then…

London e-Science Centre JSDL 1.0 Primer Ali Anjomshoaa, Fred Brisard, Michel Drescher, Donal K. Fellows, William Lee, An Ly, Steve McGough, Darren Pulsipher, Andreas Savva, Chris Smith

London e-Science Centre 29 JSDL Introduction JSDL stands for Job Submission Description Language A language for describing the requirements of computational jobs for submission to Grids and other systems. A JSDL document describes the job requirements What to do, not how to do it No Defaults All elements must be satisfied for the document to be satisfied JSDL does not define a submission interface or what the results of a submission look like JSDL 1.0 is published as GFD-R-P.56 Includes description of JSDL elements and XML Schema Available at JSDL stands for Job Submission Description Language A language for describing the requirements of computational jobs for submission to Grids and other systems. A JSDL document describes the job requirements What to do, not how to do it No Defaults All elements must be satisfied for the document to be satisfied JSDL does not define a submission interface or what the results of a submission look like JSDL 1.0 is published as GFD-R-P.56 Includes description of JSDL elements and XML Schema Available at

London e-Science Centre 30 JSDL Document A JSDL document is an XML document It may contain Generic (job) identification information Application description Resource requirements (main focus is computational jobs) Description of required data files It is a template language Open content language – compose-able with others Out of scope, for JSDL version 1.0 Scheduling Workflow Security … A JSDL document is an XML document It may contain Generic (job) identification information Application description Resource requirements (main focus is computational jobs) Description of required data files It is a template language Open content language – compose-able with others Out of scope, for JSDL version 1.0 Scheduling Workflow Security …

London e-Science Centre 31 Workflow Job JSDL RRL SDL WS-A JLM JPL … … Job JSDL RRL SDL WS-A JLM JPL … … Job JSDL RRL SDL WS-A JLM JPL … … Job JSDL RRL SDL WS-A JLM JPL … … JSDL: Conceptual relation with other standards RRL - Resource Requirements Language SDL – Scheduling Description Language WS-A – WS-Agreement JLM – Job Lifetime Management JPL – Job Policy Language

London e-Science Centre 32 OGSA BES system WS Gateway WS Clients Job Manager A Grid Information Service Local Information Service Local resource (e.g., Supercomputer) Existing DRM Super Scheduler, or Broker, or … JSDL Here And Here JSDL Here And Here JSDL Document Usage

London e-Science Centre 33 BES JSDL Document Life Cycle A JSDL document may be Abstract Only the minimum information necessary For example, application name and input files Runnable at sites that understand this level of description Refined More detail provided Target site, number of CPUs, which data source May be refined several times Tied to a specific site/system Incarnated (Unicore speak); or Grounded (Globus speak) This model is supported/allowed but not required by JSDL A JSDL document may be Abstract Only the minimum information necessary For example, application name and input files Runnable at sites that understand this level of description Refined More detail provided Target site, number of CPUs, which data source May be refined several times Tied to a specific site/system Incarnated (Unicore speak); or Grounded (Globus speak) This model is supported/allowed but not required by JSDL

London e-Science Centre 34 A few words on JSDL and BES JSDL is a language No submission interface defined (on purpose) JSDL is independent of submission interfaces BES is defining a Web Service interface which consumes JSDL documents This is not the only use of JSDL Though we do like it JSDL is a language No submission interface defined (on purpose) JSDL is independent of submission interfaces BES is defining a Web Service interface which consumes JSDL documents This is not the only use of JSDL Though we do like it BES Container JSDL

London e-Science Centre 35 JSDL Document Structure Overview ? * ? * Note: None[1..1] ?[0..1] * [0..n] + [1..n]

London e-Science Centre 36 Job Identification Element ? * ? ? * ? Example: My Gnuplot invocation Simple application … Extensibility point

London e-Science Centre 37 Application Element ? * ? * Example: gnuplot 5.7 Use the gnuplot application v5.7 regardless where it is installed on the target system How do I define an executable explicitly?

London e-Science Centre 38 Application: POSIXApplication extension * ? * … * ? * … POSIXApplication is a normative JSDL extension Defines standard POSIX elements stdin, stdout, stderr Working directory Command line arguments Environment variables POSIX limits (not shown here)

London e-Science Centre 39 Hello World <jsdl:JobDefinition xmlns:jsdl=“ xmlns:jsdl-posix= “ /bin/echo hello world <jsdl:JobDefinition xmlns:jsdl=“ xmlns:jsdl-posix= “ /bin/echo hello world

London e-Science Centre 40 Resource description requirements Support simple descriptions of resource requirements NOT a comprehensive resource requirements language Avoided explicit heterogeneous or hierarchical descriptions Can be extended with other elements for richer or more abstract descriptions Main target is compute jobs CPU, Memory, Filesystem/Disk, Operating system requirements Allow some flexibility for aggregate (Total*) requirements “I want 10 CPUs in total and each resource should have 2 or more” Very basic support for network requirements Support simple descriptions of resource requirements NOT a comprehensive resource requirements language Avoided explicit heterogeneous or hierarchical descriptions Can be extended with other elements for richer or more abstract descriptions Main target is compute jobs CPU, Memory, Filesystem/Disk, Operating system requirements Allow some flexibility for aggregate (Total*) requirements “I want 10 CPUs in total and each resource should have 2 or more” Very basic support for network requirements

London e-Science Centre 41 Resources Element ? * ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? * * ? * ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? * * Example: One CPU and at least 2 Megabytes of memory

London e-Science Centre 42 Relation of Individual* and Total* Resources elements It is possible to combine Individual* and Total* elements to specify complex requirements “I want a total of 10 CPUs, 2 or more per resource” Caveat: Not all Individual/Total combinations make sense It is possible to combine Individual* and Total* elements to specify complex requirements “I want a total of 10 CPUs, 2 or more per resource” Caveat: Not all Individual/Total combinations make sense

London e-Science Centre 43 RangeValues Example: Between 512MB and 2GB of memory (inclusive) exactepsilon intervalsranges Define exact values (with an optional “epsilon” argument), left- open or right-open intervals and ranges. Example: Between 2 and 16 processors

London e-Science Centre 44 JSDL Type Definitions Example: OperatingSystemTypeEnumeration JSDL defines a small number of types As far as possible re-use existing standards Example: OperatingSystemTypeEnumeration Basic value set defined based on CIM: Windows_XP, JavaVM, OS_390, LINUX, MACOS, Solaris, … CIM defines these as numbers; JSDL provides an XML definition Watching WS-CIM work Similarly for values of other types: ProcessorArchitectureEnumeration based on ISA values JSDL defines a small number of types As far as possible re-use existing standards Example: OperatingSystemTypeEnumeration Basic value set defined based on CIM: Windows_XP, JavaVM, OS_390, LINUX, MACOS, Solaris, … CIM defines these as numbers; JSDL provides an XML definition Watching WS-CIM work Similarly for values of other types: ProcessorArchitectureEnumeration based on ISA values

London e-Science Centre 45 Data Staging Requirement Previous statements included: “A JSDL document describes the job requirements What to do, not how to do it *” “Workflow is out of scope.” But … data staging is a common requirement for any meaningful job submission Especially for batch job submission No standard to describe such data movements Our solution Assume simple model: Stage-in – Execute – Stage-Out Files required for execution Files are staged-in before the job can start executing Files to preserve Files are staged-out after the job finishes execution More complex approaches can be used But this is outside JSDL You don’t need to use the JSDL Data Staging Previous statements included: “A JSDL document describes the job requirements What to do, not how to do it *” “Workflow is out of scope.” But … data staging is a common requirement for any meaningful job submission Especially for batch job submission No standard to describe such data movements Our solution Assume simple model: Stage-in – Execute – Stage-Out Files required for execution Files are staged-in before the job can start executing Files to preserve Files are staged-out after the job finishes execution More complex approaches can be used But this is outside JSDL You don’t need to use the JSDL Data Staging Stage-In Execute Stage-Out

London e-Science Centre 46 DataStaging Element ? ? * ? ? * Example: Stage in a file (from a URL) and name it “control.txt”. In case it already exists, simply overwrite it. After the job is done, delete this file. control.txt overwrite true

London e-Science Centre 47 JSDL Adoption The following projects have presented at GGF JSDL sessions and are known to have implementations of some version of JSDL; not necessarily 1.0. Business Grid Grid Programming Environment (GPE) GridSAM HPC-Europa Market for Computational Services NAREGI UniGrids The following groups also said they are or will be implementing JSDL: DEISA GridBus Project (see OGSA Roadmap, section 8) gridMatrix (Cadence) (presentation) Nordugrid Also within GGF a number of groups either use directly or have a strong interest or connection with JSDL: BES-WG, CDDLM-WG, DRMAA-WG, GRAAP-WG, OGSA-WG, RSS-WG An up-to-date version of this list is on Gridforge: The following projects have presented at GGF JSDL sessions and are known to have implementations of some version of JSDL; not necessarily 1.0. Business Grid Grid Programming Environment (GPE) GridSAM HPC-Europa Market for Computational Services NAREGI UniGrids The following groups also said they are or will be implementing JSDL: DEISA GridBus Project (see OGSA Roadmap, section 8) gridMatrix (Cadence) (presentation) Nordugrid Also within GGF a number of groups either use directly or have a strong interest or connection with JSDL: BES-WG, CDDLM-WG, DRMAA-WG, GRAAP-WG, OGSA-WG, RSS-WG An up-to-date version of this list is on Gridforge:

London e-Science Centre 48 JSDL Mappings ARC (NorduGrid) Condor eNANOS Fork Globus 2 GRIA provider Grid Resource Management System (GRMS) ARC (NorduGrid) Condor eNANOS Fork Globus 2 GRIA provider Grid Resource Management System (GRMS) JOb Scheduling Hierarchically (JOSH) LSF Sun Grid Engine Unicore

London e-Science Centre GridSAM Job Submission and Monitoring Web Service Other way…

London e-Science Centre 50 GridSAM Overview Grid Job Submission and Monitoring Service  What is GridSAM?  A Job Submission and Monitoring Web Service  Funded by the Open Middleware Infrastructure Institute (OMII) managed programme  V1.0 Available as part of the OMII 2.x release (v soon to be released)  Open source (BSD)  One of the first system to support the GGF Job Submission Description Language (JSDL)  What is GridSAM?  A Job Submission and Monitoring Web Service  Funded by the Open Middleware Infrastructure Institute (OMII) managed programme  V1.0 Available as part of the OMII 2.x release (v soon to be released)  Open source (BSD)  One of the first system to support the GGF Job Submission Description Language (JSDL)

London e-Science Centre 51 GridSAM Overview Grid Job Submission and Monitoring Service  What is GridSAM to the resource owners?  A Web Service to expose heterogeneous execution resources uniformly  Single machine through Forking or SSH  Condor Pool  Grid Engine 6 through DRMAA  Globus exposed resources  OR use our plug-in API to implement …  What is GridSAM to the resource owners?  A Web Service to expose heterogeneous execution resources uniformly  Single machine through Forking or SSH  Condor Pool  Grid Engine 6 through DRMAA  Globus exposed resources  OR use our plug-in API to implement …

London e-Science Centre 52 GridSAM Overview Grid Job Submission and Monitoring Service  What is GridSAM to end-users?  A set of end-user tools and client-side APIs to interact with a GridSAM web service  Submit and Start Jobs  Monitor Jobs  Terminate Jobs  File transfer  Client-side submission scripting  Client-side Java API  What is GridSAM to end-users?  A set of end-user tools and client-side APIs to interact with a GridSAM web service  Submit and Start Jobs  Monitor Jobs  Terminate Jobs  File transfer  Client-side submission scripting  Client-side Java API

London e-Science Centre 53 What’s not?  GridSAM is not  a scheduling service  That’s the role of the underlying launching mechanism  That’s the role of a super-scheduler that brokers jobs to a set of GridSAM services  a provisioning service  GridSAM runs what’s been told to run  GridSAM does not resolve software dependencies and resource requirements  GridSAM is not  a scheduling service  That’s the role of the underlying launching mechanism  That’s the role of a super-scheduler that brokers jobs to a set of GridSAM services  a provisioning service  GridSAM runs what’s been told to run  GridSAM does not resolve software dependencies and resource requirements

London e-Science Centre 54 Deployment Scenario: Forking HTTP + WS-Sec./ HTTPS + WS- Sec. / HTTPS mutual. Local FS Local FS GSIFTP FTP WEBDAV HTTP …

London e-Science Centre 55 Deployment Scenario: Secure Shell (SSH) HTTP + WS-Sec./ HTTPS + WS- Sec. / HTTPS mutual. GSIFTP FTP WEBDAV HTTP … SFTP - FS SFTP - FS

London e-Science Centre 56 Deployment Scenario: Condor Pool Condor command- line wrapper HTTP + WS-Sec./ HTTPS + WS-Sec. / HTTPS mutual. GSIFTP FTP WEBDAV HTTP … Network FS Network FS

London e-Science Centre 57 Deployment Scenario: Globus 2.4.3

London e-Science Centre 58 Deployment Scenario: Grid Engine 6 GSIFTP FTP WEBDAV HTTP … Network FS Network FS

London e-Science Centre 59 Latest Features  Available in v2.0.0-rc1 (released 1/7/06)  MPI Application through GT2 plugin  Simple non-standard JSDL extension that extends with a element  Authorisation based on JSDL structure  Allow / deny submission based on a set of XPath rules and the identities of the submitter (e.g. distinguished name).  Prototype Basic Execution Service (ogsa-bes) interface  Demonstrated in the mini face-to-face in London last December  Shown interoperability with the Uni. Of Virginia BES (.NET based) implementation.  Available in v2.0.0-rc1 (released 1/7/06)  MPI Application through GT2 plugin  Simple non-standard JSDL extension that extends with a element  Authorisation based on JSDL structure  Allow / deny submission based on a set of XPath rules and the identities of the submitter (e.g. distinguished name).  Prototype Basic Execution Service (ogsa-bes) interface  Demonstrated in the mini face-to-face in London last December  Shown interoperability with the Uni. Of Virginia BES (.NET based) implementation.

London e-Science Centre 60 Upcoming Features  Job State Notification  Integrate with FINS (WS-Eventing)  Resource Usage Service  GGF RUS compliant service implementation for recording and querying usages  Integrate with GridSAM to account for job resource usage  Basic Execution Service  Continue tracking the changes in the ogsa-bes specification  Support dual submission WS-interfaces  Job State Notification  Integrate with FINS (WS-Eventing)  Resource Usage Service  GGF RUS compliant service implementation for recording and querying usages  Integrate with GridSAM to account for job resource usage  Basic Execution Service  Continue tracking the changes in the ogsa-bes specification  Support dual submission WS-interfaces

London e-Science Centre 61 Further Information Official Download Project Information and Documentation Official Download Project Information and Documentation

London e-Science Centre Application Wrapping Don’t forget…

London e-Science Centre 63 Application Wrapping My Job (BLAST) This is the environment the job expects to see Library Input Output Database Files Environment variables My Job (BLAST) Library Input Output Database Files Job Wrapper Environment variables This is what will be invoked remotely We need to ensure that everything goes

London e-Science Centre Questions?