AN INGENIOUS APPROACH FOR IMPROVING TURNAROUND TIME OF GRID JOBS WITH RESOURCE ASSURANCE AND ALLOCATION MECHANISM
  Shikha Mehrotra
  Centre for Development of Advanced Computing (CDAC), Bangalore, India
  10-12 September 2012, IEEE HPEC'12
Outline
  Indian National grid GARUDA
  Need for Reservation in Grid
  Approach followed in realizing reservation in Garuda Grid
    – Architecture
    – Features
  Performance analysis
    – Job flow in Garuda grid
    – Performance metrics
    – Turnaround time of grid jobs
    – Case-study
        Turn-around time without reservation
        Turn-around time with reservation
        Data analysis
  Results
  Conclusion
Grid Computing
  Distributed Computing taken to the next level
  Aggregation of resources from many participants (geographically distributed in general)
    – Compute resources
    – Data resources
    – Special instruments (telescopes, microscopes, and so on)
  Unified, seamless access to these resources
    – Analogous to the Power Grid
India's National Grid Computing Initiative: GARUDA
  Motivation
    – To collaborate on research and engineering of technologies, architectures, standards and applications in Grid Computing
    – To contribute to the aggregation of resources in the Grid
  Production infrastructure with Gigabit networking backbone (NKN)
    – Large HPC computing resources
    – Massive storage
    – Tools and services for unified access
  Currently connects more than 60 institutions
    – Academic & Research labs
  Spans 17 cities of India
  Supports 10 Virtual Organizations
    – Bioinformatics, Seismic engineering, Climate modeling, Drug discovery, …
Problem Statement
  As demand for resources grows, it becomes difficult to manage jobs and allocate resources to them. As a result, most jobs sit in the queued state, waiting for a resource to become free.
Our Approach
  Reduce waiting time
  Solution: Advance Reservation of resources
    – An advance reservation is a reservation that a user or administrator can request and that the scheduler can create.
    – It guarantees the availability of resources in a specified future time slot.
Compute Reservation
  An advance reservation is essentially defined by the following:
    – A start time, defined using the standard date-time format
    – An end time, either defined using the standard date-time format or computed as the start time plus a duration value
    – The number and type of resources to be reserved
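The three defining fields above can be sketched as a small Python data structure. This is a minimal illustration, not the Garuda implementation: the class name `ComputeReservation`, its field names, the `"cpu-core"` resource type, and the `overlaps` helper are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ComputeReservation:
    """Hypothetical model of an advance reservation (names are assumptions)."""
    start: datetime
    resource_count: int
    resource_type: str
    end: Optional[datetime] = None        # explicit end time, or ...
    duration: Optional[timedelta] = None  # ... a duration added to the start time

    def __post_init__(self) -> None:
        # Exactly one of end / duration must be supplied.
        if (self.end is None) == (self.duration is None):
            raise ValueError("give exactly one of 'end' or 'duration'")
        if self.end is None:
            self.end = self.start + self.duration
        if self.end <= self.start:
            raise ValueError("end time must be after start time")
        if self.resource_count <= 0:
            raise ValueError("resource_count must be positive")

    def overlaps(self, other: "ComputeReservation") -> bool:
        # Two half-open slots [start, end) overlap if each starts before the other ends.
        return self.start < other.end and other.start < self.end


# Usage with made-up values: reserve 16 cores for two hours.
r1 = ComputeReservation(datetime(2012, 9, 10, 9, 0), 16, "cpu-core",
                        duration=timedelta(hours=2))
r2 = ComputeReservation(datetime(2012, 9, 10, 10, 30), 8, "cpu-core",
                        end=datetime(2012, 9, 10, 12, 0))
print(r1.end, r1.overlaps(r2))
```

A scheduler could use a check like `overlaps` to decide whether two requests compete for the same time slot before it confirms a reservation; how Garuda actually resolves such conflicts is not described on the slide.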
Garuda Reservation Architecture
  [Figure: architecture diagram. Component labels:]
    – Applications
    – Commands
    – API
    – Garuda Grid Level Reservation Component
    – Gridway Meta-scheduler
    – Globus Middleware
    – Garuda Middleware Reservation Component
    – Garuda LRM Reservation Component
    – Reservation Manager and Scheduler
    – Local Resource Manager
    – Reservation DB
    – Reservation Replica DB
    – Failover
Garuda Reservation Features
    – Advance and immediate reservation of resources across multiple clusters
    – Ensure resource availability
    – GSI based reservation: Garuda Reservation
    – Grid Reservation Failover mechanism
    – Application Programming Interface
    – Intelligent resource allocation based on QoS parameters
    – Virtual Organization support
    – Avoiding resource under-utilization
    – Integration with Gridway Meta-scheduler and Globus Middleware
Performance Analysis
Performance Metrics
    – Mean waiting time
    – Execution time
    – Turnaround time
Turnaround Time
  The total time between the submission of a job (a program, process, thread, or task in Linux terms) for execution and the return of the complete output to the user.
  [Figure labels: User, Job Submission, Job Output]
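As a minimal sketch, the three metrics can be computed from per-job timestamps. The `JobTimeline` class and its four timestamp fields are hypothetical, and the definitions of waiting time (submission to start) and execution time (start to finish) are the usual ones, assumed here rather than taken from the slides. Only the turnaround definition follows the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobTimeline:
    """Hypothetical timestamps recorded for one grid job."""
    submitted: datetime        # user submits the job
    started: datetime          # job leaves the queue and begins running
    finished: datetime         # job stops running
    output_returned: datetime  # complete output is back with the user

    @property
    def waiting_time(self) -> timedelta:
        # Assumed definition: time spent queued before execution starts.
        return self.started - self.submitted

    @property
    def execution_time(self) -> timedelta:
        # Assumed definition: time spent running on the resource.
        return self.finished - self.started

    @property
    def turnaround_time(self) -> timedelta:
        # Slide definition: submission until the complete output is returned.
        return self.output_returned - self.submitted


# Usage with made-up timestamps.
job = JobTimeline(
    submitted=datetime(2012, 9, 10, 10, 0, 0),
    started=datetime(2012, 9, 10, 10, 4, 0),
    finished=datetime(2012, 9, 10, 10, 21, 0),
    output_returned=datetime(2012, 9, 10, 10, 22, 0),
)
print(job.waiting_time, job.execution_time, job.turnaround_time)
```

Under these definitions turnaround time can exceed waiting plus execution time, because it also covers the interval between the end of execution and the return of the output.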
Performance Analysis
Turn-around time without reservation
  Job Set      Waiting    Execution   Turnaround
  Job Set 1    0:04:00    0:17:16     0:22:02
  Job Set 2    0:06:00    0:17:27     0:24:14
  Job Set 3    0:44:00    0:18:31     1:02:49
  Job Set 4    1:11:00    0:17:27     1:38:42
  Job Set 5    1:20:00    0:18:26     1:37:41
Turn-around time without reservation
Turn-around time with reservation
  Job Set      Waiting    Execution   Turnaround
  Job Set 1    0:00:09    0:08:03     0:08:32
  Job Set 2    0:00:09    0:08:05     0:08:35
  Job Set 3    0:00:09    0:08:07     0:08:37
  Job Set 4    0:00:09    0:08:05     0:08:37
  Job Set 5    0:00:08    0:07:15     0:07:45
Turn-around time with reservation
Comparison of Turnaround times
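The comparison can be reproduced from the turnaround columns of the two case-study tables. The sketch below assumes the values are in h:mm:ss; the helper names `parse_hms` and `mean` are introduced only for this example.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'h:mm:ss' string (assumed format of the table values)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def mean(durations: list) -> timedelta:
    """Arithmetic mean of a non-empty list of timedeltas."""
    return sum(durations, timedelta()) / len(durations)


# Turnaround columns copied from the two tables (Job Sets 1 to 5).
WITHOUT_RESERVATION = ["0:22:02", "0:24:14", "1:02:49", "1:38:42", "1:37:41"]
WITH_RESERVATION = ["0:08:32", "0:08:35", "0:08:37", "0:08:37", "0:07:45"]

mean_without = mean([parse_hms(t) for t in WITHOUT_RESERVATION])
mean_with = mean([parse_hms(t) for t in WITH_RESERVATION])

# Dividing one timedelta by another yields a plain float ratio.
speedup = mean_without / mean_with
reduction = 1 - mean_with / mean_without

print(f"mean turnaround without reservation: {mean_without}")
print(f"mean turnaround with reservation:    {mean_with}")
print(f"ratio: {speedup:.2f}x, reduction: {reduction:.1%}")
```

The same helpers apply unchanged to the waiting and execution columns if a per-metric comparison is wanted.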
Conclusion
    – Guarantees the availability of resources
    – Nearly eliminates waiting time (8 to 9 seconds per job set in the case study)
    – Reduces turnaround time considerably
    – Integrates well into the Grid Middleware
    – Built for the production infrastructure
    – The analysis shows encouraging results
Thank You