Grid Resource Allocation and Management (GRAM)

2 Grid Resource Allocation and Management (GRAM)
Execution management
–Deployment, scheduling and monitoring
Community Scheduler Framework (CSF): Provides a single interface to different resource schedulers.
–PBS, Condor(G).
Workspace management
–Dynamically create and manage workspaces on remote hosts.
Grid Telecontrol Protocol
–WSRF-enabled service interface for control of remote instruments.
Remote goldfish surgical procedures.

3 Jobs are computational tasks that may perform input/output operations while running.
Affect the state of the computational resource and its associated file systems.
May require coordinated staging of data into the resource prior to job execution and out of the resource following execution.
Some users, particularly interactive ones, benefit from accessing output data files as the job is running.
Monitoring consists of querying and subscribing for status information such as job state changes.
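The two monitoring styles named above, querying and subscribing, can be sketched as follows. This is a minimal in-memory illustration, not GRAM code: the class `JobStatusMonitor`, its methods, and the state names are all hypothetical.

```python
from typing import Callable, Dict, List

StateCallback = Callable[[str, str], None]


class JobStatusMonitor:
    """Hypothetical in-memory monitor; not part of any GRAM API."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._subscribers: Dict[str, List[StateCallback]] = {}

    def query(self, job_id: str) -> str:
        """Pull model: the client asks for the current state."""
        return self._state.get(job_id, "Unknown")

    def subscribe(self, job_id: str, callback: StateCallback) -> None:
        """Push model: the client registers to be told about changes."""
        self._subscribers.setdefault(job_id, []).append(callback)

    def set_state(self, job_id: str, new_state: str) -> None:
        """Record a state and notify subscribers only on a real change."""
        if self._state.get(job_id) == new_state:
            return
        self._state[job_id] = new_state
        for callback in self._subscribers.get(job_id, []):
            callback(job_id, new_state)


monitor = JobStatusMonitor()
seen: List[str] = []
monitor.subscribe("job-1", lambda job_id, state: seen.append(state))
for state in ("Pending", "Active", "Active", "Done"):  # assumed state names
    monitor.set_state("job-1", state)
```

A subscriber is notified once per state change, while `query` can be called at any time without prior registration.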

4 Monitoring consists of querying and subscribing for status information such as job state changes.
Operated under the control of a scheduler which implements allocation and prioritization policies (i.e., priorities).
GRAM is not a resource scheduler but a protocol engine for communicating with different local resource schedulers.
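One way to picture "a protocol engine for communicating with different local resource schedulers" is a thin adapter layer that turns one request into a scheduler-specific submit command. The sketch below is an illustration under that assumption, not how GRAM is implemented; the class and function names are hypothetical, while `qsub` and `condor_submit` are the usual PBS and Condor submit commands.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SchedulerAdapter(ABC):
    """Hypothetical adapter interface; not an actual GRAM class."""

    @abstractmethod
    def submit_command(self, script_path: str) -> List[str]:
        """Return the local command line that would submit the script."""


class PbsAdapter(SchedulerAdapter):
    def submit_command(self, script_path: str) -> List[str]:
        return ["qsub", script_path]


class CondorAdapter(SchedulerAdapter):
    def submit_command(self, script_path: str) -> List[str]:
        return ["condor_submit", script_path]


ADAPTERS: Dict[str, SchedulerAdapter] = {
    "pbs": PbsAdapter(),
    "condor": CondorAdapter(),
}


def build_submit_command(scheduler: str, script_path: str) -> List[str]:
    """Pick the adapter for the named scheduler; scheduling stays local."""
    try:
        adapter = ADAPTERS[scheduler]
    except KeyError:
        raise ValueError(f"no adapter for scheduler {scheduler!r}") from None
    return adapter.submit_command(script_path)
```

The sketch only builds the command line. Queueing and prioritization decisions remain with the local scheduler, which is the point the slide makes.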

5 Conceptual Details
Targeted Job Types
–Not “RPC”
–Reliable operation, stateful monitoring, credential management, and file staging are important (i.e., the performance is horrible so only use if necessary).

6 Component Architecture
Based on component architecture
–Job management services represent, monitor, and control the overall job life cycle. These services are the job-management specific software provided by the GRAM solution.
–File transfer services support staging of files into and out of compute resources.

7 Component Architecture
–Credential management services are used to control the delegation of rights among distributed elements of the GRAM architecture based on users' application requirements.

8 Security
Secure Operation
–WS GRAM utilizes WSRF functionality to provide for authentication of job management requests as well as to protect job requests from malicious interference.
Local system protection domains
–Jobs are executed in appropriate local security contexts, e.g. under specific Unix user IDs based on details of the job request and authorization policies.

9 Credential delegation and management
–Client may delegate some of its rights to GRAM services, e.g. rights for GRAM to access data on a remote storage element as part of the job execution.
Audit
–To assist with normal accounting functions as well as to further mitigate risks from abuse or malfunction.

10 Job Management
Reliable job submission.
–“At most once” semantics
Job Cancellation
–A mechanism for clients to cancel (abort) their jobs at any point in the job life cycle.
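"At most once" submission is commonly achieved by having the client attach a unique submission ID, so that a retried request cannot start a second job. The sketch below shows that general idea only; the slides do not say how GRAM implements it, and `JobService`, `submit`, and the ID format are hypothetical.

```python
import uuid
from typing import Dict


class JobService:
    """Hypothetical service that deduplicates submissions by client ID."""

    def __init__(self) -> None:
        self._jobs_by_submission: Dict[str, str] = {}
        self.started = 0  # number of jobs actually started

    def submit(self, submission_id: str, description: str) -> str:
        # A repeated submission ID returns the existing job instead of
        # starting a duplicate, so a client may safely retry after a timeout.
        if submission_id in self._jobs_by_submission:
            return self._jobs_by_submission[submission_id]
        job_id = f"job-{len(self._jobs_by_submission) + 1}"
        self._jobs_by_submission[submission_id] = job_id
        self.started += 1
        return job_id


service = JobService()
submission_id = str(uuid.uuid4())  # generated once by the client
first = service.submit(submission_id, "/bin/hostname")
retry = service.submit(submission_id, "/bin/hostname")  # e.g. after a lost reply
```

Here `first` and `retry` name the same job, and only one job was started.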

11 Data Management
Reliable Data Staging
–Reliable, high-performance transfers of files between the compute resource and external (GridFTP) data storage elements before and after the job execution.
Output Monitoring
–Mechanism for incrementally transferring output file contents from the computation resource while the job is running.
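Incremental output transfer can be thought of as repeatedly reading whatever has been appended to the output file since the last read. The sketch below does this on a local file with a byte offset; it is an illustration only, and `read_new_output` and `demo` are hypothetical helpers rather than GRAM or GridFTP functions.

```python
import os
import tempfile
from typing import List, Tuple


def read_new_output(path: str, offset: int) -> Tuple[bytes, int]:
    """Return the bytes appended since `offset` and the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data, offset + len(data)


def demo() -> List[bytes]:
    """Simulate a job appending output while a client polls for new bytes."""
    chunks: List[bytes] = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stdout.txt")
        offset = 0
        for line in (b"step 1\n", b"step 2\n"):
            with open(path, "ab") as out:  # the "job" writes more output
                out.write(line)
            data, offset = read_new_output(path, offset)
            chunks.append(data)
    return chunks
```

Each poll returns only the newly written part, so the client never re-transfers output it already has.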

12 Task Coordination
Parallel Jobs
Task rendezvous
–Mechanism for task rendezvous which job applications may use if they do not have another more appropriate solution
–Usually done in MPI
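A rendezvous is a point that no task passes until every task has reached it. The sketch below shows the concept with threads and the standard library's `threading.Barrier`; it is an analogy for the idea, not the GRAM rendezvous mechanism or MPI, and `run_tasks` is a hypothetical helper.

```python
import threading
from typing import List


def run_tasks(task_count: int) -> List[int]:
    """Start tasks that all wait at a barrier before continuing."""
    barrier = threading.Barrier(task_count)
    lock = threading.Lock()
    arrived: List[int] = []
    seen_after_barrier: List[int] = []

    def task(rank: int) -> None:
        with lock:
            arrived.append(rank)  # announce arrival
        barrier.wait()  # rendezvous: block until every task has arrived
        with lock:
            seen_after_barrier.append(len(arrived))

    threads = [threading.Thread(target=task, args=(rank,)) for rank in range(task_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen_after_barrier
```

After the barrier, every task observes that all `task_count` tasks have arrived, which is the guarantee a rendezvous provides.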

13 WS-GRAM (Web Services version).
Designed to support job execution with coordinated file staging.
Uses a set of Web services in the GT4 WSRF core.
–ManagedJob: Provides an interface to monitor the status of the job and to terminate it. Each submitted job is a distinct resource.
–ManagedJobFactory: Interface to create ManagedJob resources of the appropriate type to perform a job in that local scheduler.
ManagedJob resource creation: ManagedJobFactory::createManagedJob invocation.
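The factory/resource split can be modelled in a few lines: one factory per local scheduler type, and one separately addressable resource per submitted job. This is a plain Python model of the pattern, not the WSRF service interface; the method names, key format, and state names are assumptions.

```python
import itertools
from typing import Dict


class ManagedJob:
    """Hypothetical model of one job resource with its own key and state."""

    def __init__(self, key: str, scheduler: str, executable: str) -> None:
        self.key = key
        self.scheduler = scheduler
        self.executable = executable
        self.state = "Pending"  # assumed state name

    def terminate(self) -> None:
        self.state = "Terminated"  # assumed state name


class ManagedJobFactory:
    """Hypothetical factory bound to one local scheduler type."""

    def __init__(self, scheduler: str) -> None:
        self.scheduler = scheduler
        self._counter = itertools.count(1)
        self.resources: Dict[str, ManagedJob] = {}

    def create_managed_job(self, executable: str) -> ManagedJob:
        # Every call yields a new, distinct resource addressed by its key.
        key = f"{self.scheduler}-{next(self._counter)}"
        job = ManagedJob(key, self.scheduler, executable)
        self.resources[key] = job
        return job


pbs_factory = ManagedJobFactory("pbs")
job_a = pbs_factory.create_managed_job("/bin/hostname")
job_b = pbs_factory.create_managed_job("/bin/date")
job_a.terminate()  # affects only job_a
```

Because each job is its own resource, terminating or monitoring one job does not touch any other.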

15 Creation of Job
–ManagedJobFactory::createManagedJob invocation.
–A meaningful WS GRAM client MUST create a job that will then go through a life cycle where it eventually completes execution and the resource is eventually destroyed.
Optional Staging Credentials
–Must be performed before the call to createManagedJob.
Optional Job Credential
–Stored into the user account for use by the job process.

16 Optional Credential Refresh
–Credentials delegated may be refreshed.
Optional Hold of Cleanup
–User wants to directly access output files without waiting for stage-out.
ManagedJob Destruction
–Can explicitly destroy job.
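The life cycle described on the last two slides (creation, an optional hold before cleanup, and eventual destruction) can be sketched as a small state machine. The phase names and the `JobResource` class are a simplified, hypothetical model and are not the actual WS GRAM state set.

```python
class JobResource:
    """Hypothetical, simplified job life cycle with an optional cleanup hold."""

    PHASES = ("StageIn", "Active", "StageOut", "CleanUp", "Done")

    def __init__(self, hold_before_cleanup: bool = False) -> None:
        self.phase = self.PHASES[0]
        self.hold_before_cleanup = hold_before_cleanup
        self.destroyed = False

    def release(self) -> None:
        """Client is done with the output files; allow cleanup to proceed."""
        self.hold_before_cleanup = False

    def advance(self) -> str:
        """Move to the next phase unless finished, held, or destroyed."""
        if self.destroyed:
            raise RuntimeError("resource already destroyed")
        index = self.PHASES.index(self.phase)
        if index == len(self.PHASES) - 1:
            return self.phase  # already Done
        next_phase = self.PHASES[index + 1]
        if next_phase == "CleanUp" and self.hold_before_cleanup:
            return self.phase  # held: files stay in place for the client
        self.phase = next_phase
        return self.phase

    def destroy(self) -> None:
        """Explicit destruction ends the resource's life cycle."""
        self.destroyed = True


job = JobResource(hold_before_cleanup=True)
trace = [job.advance() for _ in range(3)]  # stops before CleanUp
job.release()
trace += [job.advance(), job.advance()]
job.destroy()
```

With the hold set, the job stays in its pre-cleanup phase until the client calls `release`, after which it proceeds to completion and can be destroyed.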

17 Globus Toolkit Components used by WS GRAM
Reliable File Transfer (RFT)
–For file staging before and after job completes.
GridFTP
–Supports retry
–Partial file transfer
–Third-party file transfer
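Retry and partial file transfer work together: after a failure, a transfer can resume from the last byte received instead of starting over. The sketch below simulates this with an in-memory source that drops one request; it does not use GridFTP or RFT, and `FlakySource` and `copy_with_resume` are hypothetical names.

```python
from typing import Iterable


class FlakySource:
    """Simulated remote file whose reads fail on chosen call numbers."""

    def __init__(self, data: bytes, fail_on_calls: Iterable[int]) -> None:
        self.data = data
        self.fail_on_calls = set(fail_on_calls)
        self.calls = 0

    def read(self, offset: int, size: int) -> bytes:
        self.calls += 1
        if self.calls in self.fail_on_calls:
            raise ConnectionError("simulated connection drop")
        return self.data[offset:offset + size]


def copy_with_resume(source: FlakySource, total_size: int,
                     chunk_size: int = 4, max_failures: int = 3) -> bytes:
    """Fetch a file in chunks, resuming at the current offset after errors."""
    received = bytearray()
    failures = 0
    while len(received) < total_size:
        try:
            # Partial transfer: request only the bytes not yet received.
            chunk = source.read(len(received), chunk_size)
        except ConnectionError:
            failures += 1
            if failures >= max_failures:
                raise
            continue  # retry from the same offset
        if not chunk:
            raise IOError("source ended before total_size bytes arrived")
        received.extend(chunk)
    return bytes(received)


payload = b"0123456789"
source = FlakySource(payload, fail_on_calls={2})
result = copy_with_resume(source, len(payload))
```

The failed second request is retried at offset 4, so the bytes already received are not transferred again.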

18 GridFTP (figure; host labels: FOO1, FOO2)

19 GridFTP (figure; host labels: FOO1, FOO2)

20 Delegation Services
A client can delegate credentials to any service that is deployed in the same container as the delegation service.
–The client tells the delegation service it wants to delegate its credentials.
–The service that wants to use them must contact the delegation service to acquire them.
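The two-step handoff above (the client deposits a credential, and a co-located service later fetches it) can be sketched as below. This is a toy model with opaque handles, not the GT4 delegation service API; the class, its methods, and the credential strings are hypothetical, and real delegation uses proxy certificates rather than plain strings.

```python
import secrets
from typing import Dict


class DelegationService:
    """Hypothetical in-container store for delegated credentials."""

    def __init__(self) -> None:
        self._credentials: Dict[str, str] = {}

    def delegate(self, credential: str) -> str:
        """Step 1: the client deposits a credential and gets a handle."""
        handle = secrets.token_hex(8)
        self._credentials[handle] = credential
        return handle

    def refresh(self, handle: str, new_credential: str) -> None:
        """Replace a delegated credential, e.g. before it expires."""
        if handle not in self._credentials:
            raise KeyError(handle)
        self._credentials[handle] = new_credential

    def acquire(self, handle: str) -> str:
        """Step 2: a service in the same container fetches the credential."""
        return self._credentials[handle]


delegation = DelegationService()
handle = delegation.delegate("proxy-credential-v1")  # placeholder value
staging_credential = delegation.acquire(handle)      # done by the consuming service
delegation.refresh(handle, "proxy-credential-v2")
```

Because the consuming service looks the credential up by handle each time, a refresh by the client takes effect without resubmitting the job.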

21 External Components Used by WS GRAM
Local job scheduler:
–PBS, LSF, Condor
Sudo
–Access to user accounts without having root privilege.
