Platform LSF6: What’s new in LSF6
© Platform Computing Inc What is the Platform LSF Family of Products?
How It Works - Platform LSF
[Architecture diagram. Components shown: Load Information Manager, Host Workload Manager, Cluster Workload Manager, Job Queue, LSF Web Services Broker, Web Application, Job Submission API, Intelligent Scheduler, Plugin Schedulers (Fairshare, Preemption, Resource Reservation, Advance Reservation, License Scheduling, SLA Scheduling, MultiCluster, other scheduling modules)]
Key Features - Platform LSF
- High Performing, Open, Scalable Architecture
  - Scalable scheduler architecture
  - External executable support
  - OGSI compliance
- Intelligent Scheduling Policies
  - Fairshare (user & project-based)
  - Policy-based preemption
  - Goal-oriented SLA scheduling
  - Job groups
- Advanced Self-Management
  - Flexible, comprehensive resource definitions
  - Job-level exception management
  - Automatic job migration and requeue
  - Master scheduler failover
- Heterogeneous Platform Support
- Extensive Application Support
- Comprehensive, Extensible and Standards-based Security
Key Features – Platform LSF: High Performing, Open, Scalable Architecture
Scalable Scheduler Architecture
- Modularized into manager and scheduler plug-ins
- Supports over 500,000 active jobs per cluster
- Supports more than 2,000 multi-processor hosts per cluster
- Process 5x more work
- Achieve 100% utilization
- Scale with your challenges
External Executable Support
- Collects information from multiple external resources to track site-specific local and global resources
- Extends out-of-the-box capabilities to manage additional resources and customer application execution
- Differentiation: multiple external resource collectors rather than a single one
Job Groups
- Organize jobs into higher-level work units in a hierarchical tree, similar to the directory structure of a file system
- Easy to manage and control work
- Increases manageability by reducing complexity
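The directory-like hierarchy described on this slide can be illustrated with a small sketch. This is not Platform LSF code: the `JobGroupTree` class, its method names, and the example group paths are all hypothetical and only show how path-style group names let one operation reach a whole subtree of work.

```python
class JobGroupTree:
    """Hypothetical in-memory model of job groups addressed by path-like names."""

    def __init__(self):
        # Maps a group path such as "/risk/daily" to the job IDs attached directly to it.
        self.groups = {"/": []}

    def add_group(self, path):
        # Create the group and any missing ancestors, much like `mkdir -p`.
        current = ""
        for part in (p for p in path.split("/") if p):
            current += "/" + part
            self.groups.setdefault(current, [])

    def attach(self, path, job_id):
        # Attach a job to a group, creating the group on demand.
        self.add_group(path)
        self.groups[path].append(job_id)

    def jobs_under(self, path):
        # Collect jobs in the group itself and in every descendant group.
        prefix = path.rstrip("/") + "/"
        found = []
        for group, jobs in sorted(self.groups.items()):
            if group == path or group.startswith(prefix):
                found.extend(jobs)
        return found
```

Selecting `jobs_under("/risk")` returns every job in `/risk` and its subgroups, which is what makes a control action on a parent group cheaper than addressing jobs one by one.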
OGSI Compliance
- CSF (Community Scheduler Framework): “OGSI-compliant” and “Web Services enabled”
- Future-proof and protect grid investment using standards-based solutions
- Standardized approach to access Platform LSF
- Interoperate with third-party systems
- Differentiation: first to comply
Key Features – Platform LSF: Intelligent Scheduling Policies
Fairshare (User & Project-based)
- Ensures job resources are used for the right work
- Guarantees that resource allocations among users and projects are met
- Coordinates access to the right number of resources for different users and projects according to predefined shares
- Differentiation: hierarchical and guaranteed
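How predefined shares translate into dispatch decisions can be sketched as follows. This is a deliberately simplified model, not the Platform LSF priority formula: the rule `shares / (1 + slots_in_use)` and the function names are assumptions chosen only to show that heavier consumers lose priority until usage matches the share ratio.

```python
def dynamic_priority(shares, slots_in_use):
    # Simplifying assumption: priority falls as a user holds more slots.
    return shares / (1 + slots_in_use)


def next_user(shares, usage, pending):
    # Among users that still have pending jobs, pick the highest priority.
    candidates = [u for u in shares if pending.get(u, 0) > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda u: dynamic_priority(shares[u], usage.get(u, 0)))


def dispatch(shares, pending, free_slots):
    # Hand out free slots one at a time, re-evaluating priority after each.
    usage = {u: 0 for u in shares}
    pending = dict(pending)
    for _ in range(free_slots):
        user = next_user(shares, usage, pending)
        if user is None:
            break
        usage[user] += 1
        pending[user] -= 1
    return usage
```

With shares of 3 and 1 and plenty of pending work on both sides, eight free slots end up split 6 to 2. If one user has nothing pending, the other may use the idle slots.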
Policy-based Preemption
- Maximizes throughput of high-priority critical work based on priority and load conditions
- Prevents starvation of lower-priority work
- Differentiation: Platform LSF supports multiple preemption policies
Goal-oriented SLA-driven Policies
- Based on customer SLA-driven goals: Deadline, Velocity, Throughput
- Guarantees projects are completed on time
- Reduces project and administration costs
- Provides visibility into the progress of projects
- Allows the administrator to focus on “what work” needs to be done and “when”, not on “how” the resources are to be allocated
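The relationship between a deadline goal and the concurrency needed to meet it can be shown with a little arithmetic. The sketch below is an illustration, not the scheduler's algorithm: it takes "velocity" to mean the number of jobs running at once (an interpretation, since the slide does not define the term), assumes every job takes the same average run time, and uses a hypothetical function name.

```python
import math


def required_velocity(jobs_remaining, avg_run_minutes, minutes_to_deadline):
    """Estimate how many jobs must run concurrently to finish before the deadline."""
    if jobs_remaining == 0:
        return 0
    # Number of back-to-back jobs one slot can complete before the deadline.
    rounds = minutes_to_deadline // avg_run_minutes
    if rounds <= 0:
        raise ValueError("deadline is shorter than a single job's run time")
    return math.ceil(jobs_remaining / rounds)
```

For 100 jobs of 30 minutes each and 10 hours remaining, each slot completes 20 jobs, so 5 concurrent jobs are enough. As the deadline approaches with work outstanding, the required velocity rises.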
Key Features – Platform LSF: Advanced Self-Management
Flexible, Comprehensive Resource Definitions
- Resources are defined per node, across an entire cluster or a subset of its nodes
- Auto-detectable or user-defined resources
- Adaptive membership: nodes join and leave Platform LSF clusters dynamically and automatically, without administration effort
- Dynamic or static resources
- Heterogeneous support
- Enables dynamic scheduling
Job-level Exception Management
- Exception-based error detection to take automatic, configurable, corrective actions
- Increased job reliability and predictability
- Improved visibility on job and system errors
- Reduced administration overhead and costs
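The idea of detecting an exception and mapping it to a configurable action can be sketched as below. Everything here is hypothetical: the `JobStatus` fields, the "overrun" and "idle" exception names, the thresholds, and the action names are illustrative assumptions, not Platform LSF configuration parameters.

```python
from dataclasses import dataclass


@dataclass
class JobStatus:
    job_id: int
    run_minutes: float  # wall-clock time since the job started
    cpu_minutes: float  # CPU time consumed so far


# Hypothetical mapping from exception type to corrective action.
ACTIONS = {"overrun": "notify_admin", "idle": "requeue"}


def detect_exceptions(job, overrun_minutes=120, idle_ratio=0.1):
    # Flag jobs that run too long or burn almost no CPU for their run time.
    exceptions = []
    if job.run_minutes > overrun_minutes:
        exceptions.append("overrun")
    if job.run_minutes > 0 and job.cpu_minutes / job.run_minutes < idle_ratio:
        exceptions.append("idle")
    return exceptions


def corrective_actions(job, **thresholds):
    # Translate each detected exception into its configured action.
    return [ACTIONS[e] for e in detect_exceptions(job, **thresholds)]
```

Keeping detection separate from the action table is what makes the response configurable: a site could change `ACTIONS` without touching the detection rules.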
Automatic Job Migration and Requeue
- Automatically migrates and requeues jobs, based on policies, in the event of host or network failures
- Reduces user and administrator overhead in managing failures
- Reduces the risk of running critical workloads
Master Scheduler Failover
- Automatically fails over to another host if the master host is unavailable
- Continuous scheduling service and execution of jobs
- Eliminates manual intervention
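A minimal sketch of the failover decision, assuming an ordered list of master candidates and an availability probe supplied by the caller. Both are assumptions for illustration; the slide does not say how candidates are ordered or how availability is detected, and `elect_master` is a hypothetical name.

```python
def elect_master(candidates, is_available):
    """Return the first available host from an ordered candidate list, or None."""
    for host in candidates:
        if is_available(host):
            return host
    return None


def current_master(candidates, up_hosts):
    # Convenience wrapper: treat membership in `up_hosts` as "available".
    return elect_master(candidates, lambda host: host in up_hosts)
```

When the first candidate drops out of `up_hosts`, the next call returns the following candidate, so scheduling can continue without manual intervention.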
Key Features – Platform LSF: Heterogeneous Platform Support
Heterogeneous Platform Support
- UNIX: Compaq Alpha Tru64, IBM AIX, HP HP-UX, SGI IRIX, Sun Solaris
- Linux (IA32/64 & AMD64): Debian, Caldera, RedHat, SuSE, TurboLinux
- Windows: 98, 2000, NT, XP
- Other: NEC, Mac OS, Cray
- Mainframe: Linux on IBM zSeries
- DCE, AFS and DFS environments
Key Features – Platform LSF: Extensive Application Support
Extensive Application Support
- Electronics
- Industrial Manufacturing
- Life Sciences
Other Application Integration Initiative
- Offer the market a competent Grid vision
- Grid Computing solutions showroom
- Allow customers to “test drive” Grid Computing and Itanium2 from their own desks
Applications are the key to Grid success!!!
We are involved in multiple convergent efforts in Research and Industry:
- EGEE/LCG/GENIUS: Life Sciences, Chemistry, Rendering, Earth Observation, etc.
- GridAge: Engineering
- EnginFrame integrations: Engineering, Oil&Gas, Electronics, Telecom, etc.
- LSF integrations: Electronics, Engineering, Oil&Gas, Finance, etc.
What about working together on a joint, pragmatic, research- and industry-proven, common (and open?) standard?
What for?
- Standardized user interface
  - Relocatable between GENIUS and EnginFrame (inherits layout, AAA, user mapping, etc.)
  - GridML to allow generic job & resource monitoring
- Standardized scripting kit
  - Relocatable submission to LSF, LCG, GLOBUS, etc. (needs work)
  - Relocatable job control via Grid plug-ins in EnginFrame
- Standardized application packaging?
  - Is the LCG work reusable?
Key Features – Platform LSF: Comprehensive, Extensible and Standards-based Security
Additional New Features in Platform LSF V6.0
- Administration & Diagnostics: administrator action messages; scheduler dynamic debug
- Job Limit Enhancements: resource allocation limit display; non-normalized job run limit
- Scheduling: job starvation prevention plug-in; queue priority-based user Fairshare; queue-based Fairshare
Q & A