
1 Batch Scheduling at LeSC with Sun Grid Engine. David McBride, Systems Programmer, London e-Science Centre, Department of Computing, Imperial College

2 Overview ● End-user requirements ● Brief description of compute hardware ● Sun Grid Engine software deployment ● Tweaks to the default SGE configuration ● Future changes ● References for more information and questions.

3 End-User Requirements ● We have many different users: high-energy physicists, bioinformaticians, chemists, parallel software researchers. ● Jobs are many and varied: – Some users run relatively few long-running tasks, while others submit large batches of shorter jobs. – Some require several cluster nodes to be co-allocated at runtime (16, 32+ MPI hosts); others simply use a single machine. – Some require large amounts of RAM (1, 2, 4, 8 GB+ per machine). ● In general, users are fairly happy as long as they get a reasonable response time.
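For illustration, a submission script for the MPI-style jobs described above might look like the sketch below; the parallel environment name ("mpi"), the script and binary names, and the resource values are hypothetical and depend on the local configuration.

    #!/bin/bash
    #$ -N example_job        # job name
    #$ -cwd                  # run in the submission directory
    #$ -pe mpi 16            # request 16 slots in a site-defined "mpi" parallel environment
    #$ -l h_vmem=4G          # request 4 GB of memory per slot
    #$ -l h_rt=02:00:00      # hard run-time limit of two hours

    mpirun -np $NSLOTS ./my_simulation    # NSLOTS is set by SGE to the number of granted slots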

4 Hardware ● Saturn: 24-way 750 MHz UltraSPARC III Sun E6800 – 36 GB RAM, ~20 TB online RAID storage, – 24 TB tape library to support long-term offline backups. – Running Solaris 8 ● Viking cluster: 260-node dual P4 Xeon 2 GHz+ – 128 machines with Fast Ethernet; 2 x 64 machines also with Myrinet – 2 front-end nodes & 2 development nodes. – Running Red Hat Linux 7.2 (plus local additions and updates) ● Mars cluster: 204-node dual AMD Opteron 1.8 GHz+ – 128 machines with Gigabit Ethernet; 72 machines also with InfiniBand. – Running Red Hat Enterprise Linux 3 (plus local refinements) – 4 front-end interactive nodes.

5 Sun Grid Engine Deployment ● Two separate logical SGE installations (cells) – Saturn acts as the master node for both cells. – However, Viking is running SGE 5.3 and Mars is running SGE 6.0. ● Mars is still ‘in beta’; Viking provides the main production service. ● When Mars’s configuration is finalized, end-users will be migrated to Mars – Viking will then be reinstalled with the new configuration.
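As a sketch of how a client could address the two cells, SGE selects its cell via the SGE_ROOT and SGE_CELL environment variables; the installation paths and cell names below are hypothetical, and each SGE version would normally have its own installation directory.

    # Hypothetical paths and cell names; not taken from the slides.
    export SGE_ROOT=/opt/sge53
    export SGE_CELL=viking      # the SGE 5.3 cell serving the Viking cluster
    qstat -f                    # query queue status in that cell

    export SGE_ROOT=/opt/sge60
    export SGE_CELL=mars        # the SGE 6.0 cell serving the Mars cluster
    qsub myjob.sh               # submit into the Mars cell instead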

6 Changes to Default Configuration ● Issue 1: – If all the available worker nodes are running long-lived jobs, then a new short-lived job added to the queue will not execute until one of the long-lived jobs has completed. (SGE does not provide a job checkpoint-and-preempt facility.) – Resolution: a subset of nodes is configured to only run short-lived jobs. – This trades slightly reduced cluster utilization for a shorter average-case response time for short-lived jobs. – End users only benefit if they specify at submission time that the job will finish quickly.
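One way such a policy could be expressed, sketched below with a hypothetical queue name and illustrative limits, is a dedicated queue on the short-job nodes whose hard run-time limit (h_rt) only admits jobs that declare a short run time when they are submitted.

    # Administrator side (opens an editor; set h_rt to e.g. 01:00:00 and
    # list only the designated short-job hosts for this queue):
    qconf -aq short.q

    # User side: the job is only eligible for the short-job queue if it
    # declares a run time within that queue's h_rt limit:
    qsub -l h_rt=00:30:00 quick_analysis.sh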

7 Changes to Default Configuration ● Issue 2: – The clusters are internally heterogeneous; e.g. some nodes have more memory, faster processors, or bigger local disks than others. – Sometimes a low-requirement job is allocated to one of these more capable machines unnecessarily because the submitter has not specified the job’s requirements. – This can prevent a job which genuinely does have high requirements from being run as quickly. – Resolution: experiment with changing the SGE configuration so that a job will, by default, only request the resources of the least-capable node. – Again, this places the onus on the user to request extra resources if needed.
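One place such a default could live, shown as a sketch with illustrative values rather than the centre's actual settings, is the cluster-wide default request file $SGE_ROOT/$SGE_CELL/common/sge_request, which supplies qsub options for every job unless the submitter overrides them.

    # $SGE_ROOT/$SGE_CELL/common/sge_request
    # Illustrative defaults matching the least-capable node:
    -l h_vmem=1G
    -l h_rt=24:00:00

    # A job that genuinely needs a larger machine must say so explicitly:
    qsub -l h_vmem=8G big_memory_job.sh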

8 Changes to Default Configuration ● Issue 3: – If a job is submitted that requires the co-allocation of several cluster nodes simultaneously (e.g. for a 16-way MPI job), then that job can be starved by a larger number of single-node jobs. – Resolution (SGE 5.3): manually intervene to manipulate queues so that the large 16-way job will be scheduled. – Resolution: upgrade to SGE 6, which uses a more advanced scheduling algorithm (advance reservation with backfill).
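Under SGE 6, resource reservation has to be enabled in the scheduler configuration and then requested per job; the sketch below assumes a hypothetical "mpi" parallel environment and script name.

    # Administrator: enable reservation in the scheduler configuration
    # (opens an editor; set max_reservation to a non-zero value):
    qconf -msconf

    # User: ask the scheduler to reserve slots for the 16-way job so that
    # it is not starved by a stream of single-node jobs:
    qsub -pe mpi 16 -R y parallel_job.sh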

9 Future Changes: LCG ● We are participating in the LHC Computing Grid (LCG) as part of the London Tier-2. ● This has been non-trivial; the standard LCG distribution only supports PBS-based clusters. – We’ve developed SGE-specific Globus JobManager and Information Reporter components for use with LCG. – We have also been working with the developers to address issues with running on 64-bit Linux distributions. ● Currently deploying front-end nodes (CE, SE, etc.) to expose Mars as an LCG compute site. ● We are also joining the LCG Certification Testbed to provide an SGE-based test site to help ensure future support.

10 References ● London e-Science Centre homepage: – http://www.lesc.ic.ac.uk/ ● SGE integration tools for Globus Toolkit 2, 3, 4 and LCG: – http://www.lesc.ic.ac.uk/projects/sgeindex.html

11 Q&A

