Published by Frederica Maxwell. Modified over 9 years ago.
1
Grid Workload Management
Massimo Sgaravatto, INFN Padova
2
Grid Workload Management
- WP goal: define and implement a suitable architecture for distributed scheduling and resource management in a GRID environment
  - Large heterogeneous environment
  - Large numbers (thousands) of independent users
- Many challenging issues:
  - Optimizing the choice of execution location based on the availability of data, computation and network resources
  - Uniform interface to possibly different local resource management systems under different administrative domains
  - Priorities, policies on resource usage
  - Reliability, scalability, …
  - …
- http://www.infn.it/workload-grid
3
Approach
- We need much more experience with the various grid issues
- The application requirements are not completely defined yet
  - They will evolve as more familiarity with the grid model is acquired
- Fast prototyping instead of a classic top-down approach
4
Current activities
- Report on current technology on Grid scheduling and resource management
  - Globus resource management
  - Condor
  - Survey on Grid scheduling systems
- Focus on the implementation of a first prototype workload management system
  - This part will be plugged together with the parts implemented by the other WPs to form the project month 9 (September) deliverable
- Grid accounting
5
Functionalities foreseen for the 1st release
- First version of job description language (JDL)
- First version of resource broker
- Job submission service
- First version of bookkeeping and logging services
- First user interface
6
Block diagram of the currently foreseen components of the workload management system
- Not a real architecture
- Functional interactions among the various components
- Dependencies on “external” functionalities
8
Job Description Language (JDL)
- First release of the job description language (JDL), used when the job is submitted to specify the job characteristics (application, input data set id, resources [required and preferable], …)
- A document describing the syntax and semantics of a “prototype” JDL, based on Condor ClassAds, was prepared
- Ready to collect feedback from applications
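As an illustration of the job characteristics listed above, here is a minimal Python sketch that renders a job description as ClassAd-style `Name = value;` pairs. The class, the attribute names (`InputDataSet`, `FreeCPUs`, and so on) and the sample values are assumptions made for this example; they are not taken from the prototype JDL document.

```python
from dataclasses import dataclass, field


@dataclass
class JobDescription:
    """Hypothetical job description; every field name is illustrative."""
    executable: str
    input_data_set: str
    requirements: str   # constraint a resource must satisfy
    rank: str           # preference among the resources that qualify
    arguments: list = field(default_factory=list)

    def to_classad_text(self) -> str:
        """Render the description as ClassAd-style 'Name = value;' pairs."""
        args = " ".join(self.arguments)
        lines = [
            f'Executable = "{self.executable}";',
            f'Arguments = "{args}";',
            f'InputDataSet = "{self.input_data_set}";',
            f"Requirements = {self.requirements};",
            f"Rank = {self.rank};",
        ]
        return "[\n" + "\n".join("  " + line for line in lines) + "\n]"


# Assumed sample job: required resources go in Requirements,
# preferable ones are expressed through Rank.
job = JobDescription(
    executable="simulate",
    input_data_set="dataset-001",
    requirements='other.OpSys == "LINUX" && other.FreeCPUs >= 1',
    rank="other.FreeCPUs",
    arguments=["--events", "1000"],
)
jdl_text = job.to_classad_text()
print(jdl_text)
```

Keeping requirements and rank as separate expressions mirrors the "required and preferable" distinction on the slide.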
9
Resource Broker
- First version of the resource broker, which chooses the computing resources (queues or “single” nodes) to which jobs are submitted, considering:
  - Access policies (grid-mapfiles in the Globus-based prototype)
  - Characteristics and status of resources
  - Availability of input data set
  - Availability of the required run time/application environments
- Resources required specified in the JDL
- Resources required published in an Information Space (Globus GIS in the first prototype) + Replica Catalog
- Ongoing implementation based on the Condor matchmaking library (Salvatore’s presentation)
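The selection criteria above can be sketched as a filter-then-rank step. This is a plain Python approximation under stated assumptions, not the Condor matchmaking library: the dictionary keys, the site names and the "most free CPUs wins" ranking rule are all hypothetical.

```python
def matches(job, resource):
    """Return True if the resource satisfies every criterion for the job."""
    return (
        job["user"] in resource["authorized_users"]           # access policy
        and job["input_data_set"] in resource["datasets"]     # data availability
        and job["runtime_env"] in resource["runtime_envs"]    # run time environment
        and resource["free_cpus"] >= job["min_cpus"]          # resource status
    )


def choose_resource(job, resources):
    """Pick the matching resource with the most free CPUs, or None."""
    candidates = [r for r in resources if matches(job, r)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["free_cpus"])


# Assumed snapshot of what an information service might publish.
resources = [
    {"name": "site1-condor", "authorized_users": {"alice"},
     "datasets": {"dataset-001"}, "runtime_envs": {"sim-v1"}, "free_cpus": 4},
    {"name": "site2-lsf", "authorized_users": {"alice", "bob"},
     "datasets": {"dataset-001", "dataset-002"}, "runtime_envs": {"sim-v1"},
     "free_cpus": 16},
    {"name": "site3-pbs", "authorized_users": {"bob"},
     "datasets": {"dataset-001"}, "runtime_envs": {"sim-v1"}, "free_cpus": 64},
]

job = {"user": "alice", "input_data_set": "dataset-001",
       "runtime_env": "sim-v1", "min_cpus": 2}

chosen = choose_resource(job, resources)
print(chosen["name"])
```

In this sample, `site3-pbs` has the most free CPUs but is excluded because the user is not authorized there, so access policy is applied before any ranking.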
10
Information Service
- All the information needed by the broker published in one Grid Information Space (Globus GIS/MDS for the first release)
- New MDS 2 alpha release soon available
  - Should address some of the existing shortcomings
- Necessary to implement plug-in modules
  - Index (for a first-level query, to identify a set of candidate resources)
  - Information providers (to publish needed information about resources)
11
Job submission service
- Job submission service based (for the first release) on:
  - Globus GRAM
  - Condor-G on top of Globus GRAM (to implement a reliable job submission service)
- Globus GRAM
  - Comprehensive evaluation already done (collaboration with the “Evaluation of the Globus toolkit” WP)
  - Globus GRAM as uniform interface to different underlying resource management systems (LSF, Condor, PBS)
  - GRAM reporter (GRAM – GIS interaction)
  - RSL
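GRAM requests are written in RSL, a string of `(attribute=value)` relations joined by `&`. The following Python sketch builds such a string from a dictionary. It is a simplified illustration: the helper names are hypothetical, only a small subset of RSL is covered, and no escaping of quotes inside values is attempted.

```python
def rsl_value(value):
    """Format one RSL value: integers bare, strings quoted, lists space-joined."""
    if isinstance(value, (list, tuple)):
        return " ".join(rsl_value(v) for v in value)
    if isinstance(value, int):
        return str(value)
    return '"' + str(value) + '"'


def to_rsl(attributes):
    """Build a conjunctive RSL request string from attribute/value pairs."""
    relations = "".join(
        f"({name}={rsl_value(value)})" for name, value in attributes.items()
    )
    return "&" + relations


# Assumed sample request; the file names are placeholders.
rsl = to_rsl({
    "executable": "/bin/hostname",
    "count": 1,
    "stdout": "job.out",
})
print(rsl)
```

Because the same request string is accepted regardless of whether LSF, Condor or PBS sits behind the gatekeeper, a broker or Condor-G only has to produce this one format.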
12
Job submission service
- Condor-G
  - First prototype implementation already tested
    - Promising, but many problems to fix
  - New Condor-G implementation under testing
    - Many problems fixed, but still other open issues
  - Another new Condor-G implementation hopefully released in a few weeks
    - Exploitation of a new persistent Globus jobmanager
- Active in following the developments of Globus GRAM and Condor-G, and in implementing the required customizations
13
Bookkeeping & Logging
- Job monitoring and control
  - Job status
  - Used resources
  - Start time
  - End time
  - …
- Record of significant events occurring in the workload management system
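The two roles above (current job state versus a record of events) could be held in one structure. The sketch below is a hypothetical Python data model: the class, its fields and the status names `SUBMITTED`, `RUNNING` and `DONE` are assumptions for illustration, not the prototype's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class JobRecord:
    """Hypothetical bookkeeping entry plus its log of events."""
    job_id: str
    status: str = "SUBMITTED"
    resource: Optional[str] = None
    start_time: Optional[datetime] = None
    end_time: Optional[datetime] = None
    events: list = field(default_factory=list)

    def log(self, status, when, resource=None):
        """Append an event (logging) and update the summary (bookkeeping)."""
        self.events.append((when, status))
        self.status = status
        if resource is not None:
            self.resource = resource
        if status == "RUNNING":
            self.start_time = when
        elif status == "DONE":
            self.end_time = when


# Assumed sample history with fixed timestamps.
record = JobRecord(job_id="job-0001")
record.log("RUNNING", datetime(2001, 6, 1, 10, 0, tzinfo=timezone.utc),
           resource="site2-lsf")
record.log("DONE", datetime(2001, 6, 1, 10, 30, tzinfo=timezone.utc))
elapsed = record.end_time - record.start_time
print(record.status, record.resource, elapsed)
```

The event list is append-only, while the summary fields are overwritten, so a status query is cheap and the full history stays available.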
14
User interface
- Command-line, for job management operations
  - List of resources “suitable” to run a job
  - Job submission (with the possibility to specify where to submit the job, or leaving this choice to the broker)
  - Job status monitoring
  - Job removal
  - Access to bookkeeping info for the job
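One way to picture these operations is as subcommands of a single command-line tool. The Python `argparse` sketch below is illustrative only: the program name `wms`, the subcommand names and the `--resource` option are invented for this example and do not reflect the prototype's actual commands.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per job management operation."""
    parser = argparse.ArgumentParser(prog="wms")  # hypothetical tool name
    sub = parser.add_subparsers(dest="command", required=True)

    # List the resources "suitable" to run a job.
    list_match = sub.add_parser("list-match")
    list_match.add_argument("jdl_file")

    # Submit a job; without --resource the choice is left to the broker.
    submit = sub.add_parser("submit")
    submit.add_argument("jdl_file")
    submit.add_argument("--resource", default=None)

    # Status monitoring, removal and bookkeeping info all take a job id.
    for name in ("status", "cancel", "info"):
        command = sub.add_parser(name)
        command.add_argument("job_id")

    return parser


# Parse sample argument lists instead of sys.argv so the sketch is testable.
explicit = build_parser().parse_args(
    ["submit", "job.jdl", "--resource", "site1-queue"]
)
brokered = build_parser().parse_args(["submit", "job.jdl"])
print(explicit.command, explicit.resource, brokered.resource)
```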
15
Workload management system (1st prototype)
[Diagram components: Submit jobs (using JDL [ClassAds]); Broker; Resource Discovery; GIS + Replica Catalog (information on characteristics and status of local resources, other info); Job submission service (Condor-G); Globus GRAM at Site1, Site2 and Site3; Local Resource Management Systems (Condor, LSF, PBS); Farms]
- Globus GRAM as uniform interface to different local resource management systems
- Condor-G able to provide a reliable/crash-proof job submission service
- Broker chooses in which Globus resources the jobs must be submitted
16
Grid Accounting
- New problem
  - Working systems (even prototype implementations) don’t exist yet
- Economy-based model for Grid accounting?
- See Stefano’s presentation
17
Deliverables foreseen in the INFN-GRID proposal
- D2.1.1 Technical assessment about Globus and Condor, interactions and usage (5/2001)
  - Done
- D2.1.2 First resource broker implementation for high throughput applications (7/2001)
  - The resource broker should be easily customizable for high throughput applications
  - Usable after M9 release
18
Deliverables foreseen in the INFN-GRID proposal
- D2.1.3 Comparison of different local resource managers (10/2001)
  - Condor, LSF, PBS
  - Farms with these resource management systems already in place and instrumented with the Globus software
- D2.1.4 Study of the three workload systems and implementation of the workload system for Monte Carlo productions (12/2001)
  - Should be achievable