SPECIFYING AND MONITORING GUARANTEES IN COMMERCIAL GRIDS THROUGH SLA
Akhil Sahai, Sven Graupner, Vijay Machiraju, Aad van Moorsel
Hewlett-Packard Laboratories
IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003
Presented by: Yun Liaw
Outline
- Introduction
- SLA and the Grid
- Grid Deployment Infrastructure
- Grid Management Architecture
- Specifying and Monitoring SLAs
- Conclusions & Comments
2015/5/18
Introduction
- “Best effort” was a sufficient policy for committing resources in academic grid environments
- When moving into a commercial space, stricter guarantees must hold
- Two problems arise:
  - At any given point in time, hundreds of SLAs may exist, with a large number of metrics to be observed
    - SLAs need a formalized representation so that SLA evaluation can be automated
  - For a given application context, multiple resource providers and consumers are involved
    - The SLA management system (Grid Proxy) must be able to combine the distributed states of SLAs to provide a consolidated view in the embracing application context
SLA and the Grid
- Negotiating an SLA is an exchange (protocol) of messages between user and provider, potentially involving some form of middleman or broker
- SNAP (Service Negotiation and Acquisition Protocol) [11]
  - Designed for distributed systems
  - Three types of SLA are supported in SNAP:
    - Resource acquisition agreements (the user’s right to use the resource)
    - Task submission agreements (inform the needed resources of the existence of a user’s task)
    - Task/resource binding agreements (enable the task to consume an agreed quantity of a resource)
  - SNAP does not address the quality aspect or the maintenance of an SLA over its life span
- It is important to understand the SLA hosting environment
  - in order to understand how SLAs may be specified and monitored

[11] K. Czajkowski, et al., “SNAP: A Protocol for Negotiating Service Level Agreements and Coordinated Resource Management in Distributed Systems,” JSSPP, 2002
Grid Deployment Infrastructure
- HP’s UDC (Utility Data Center): Farm
  - A programmable hosting environment for applications
- Globus Resource Specification Language (RSL)
  - A language for specifying the resources in a grid, including the resource topology
  - Used by the UDC resource manager to configure resources
- To protect different farm instances from one another, two types of resources are virtualized for farms:
  - Network resources
  - Storage resources
RSL Example
Grid Management Architecture
- OGSA Grid Conceptual Architecture: based on web services (.NET or J2EE based)
- SLA management needs:
  1. Factory and registry and discovery (R&D) services to find resources based on QoS requirements
  2. Life-cycle management and manageability services to collect measurement data
  3. Reliable invocation for controlling resources
  4. Notification to inform impacted parties
Grid Management Proxy
- Grid Proxy:
  - Corresponds to a particular Grid deployment infrastructure
  - Proxies interact with each other, forming a Grid management proxy overlay
- Protocols the grid community has agreed on for proxy communication:
  - GRAAP: Grid Resource Allocation Agreement Protocol
  - GIS: Grid Information Service
  - GASS: Global Access to Secondary Storage
  - GSI: Grid Security Infrastructure
SLA Management Protocols between Grid Proxies
- Why are additional protocols needed?
  - Not all data required for managing SLAs can be measured locally
    - A provider’s behavior depends on the user’s behavior
    - A provider’s behavior depends on another provider’s behavior
  - Not all the controls needed to manage SLAs are available locally
    - SLA assurance may also be accomplished by management systems in multiple domains exchanging messages that invoke a limited set of explicitly stated control actions
SLA Definition
- Purpose: the reasons behind the creation of the SLA
- Parties: the parties involved in the SLA and their respective roles
- Validity Period: the period during which the SLA is valid
- Scope: the services covered by the SLA
- Restrictions: the necessary steps to be taken for the requested service levels to be provided
- Service Level Objectives: the service levels that both the users and the provider have agreed on
SLA Definition (cont’d)
- Service Level Indicators: the means by which these levels can be measured
- Penalties: what happens if the service provider is unable to meet the SLO
- Optional services: services that the user does not normally require but that may be needed as an exception
- Exclusions: what is not covered by the SLA
- Administration: the processes created in the SLA to meet and measure its objectives
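The components listed on the two SLA Definition slides can be pictured as a single record. The following is only a minimal sketch: the class name, field names, types, and the example dates are all assumptions made for illustration, not the representation used in the paper.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SLADefinition:
    """Hypothetical container mirroring the SLA components on the slides."""
    purpose: str
    parties: dict[str, str]              # party name -> role
    valid_from: date                     # validity period start (inclusive)
    valid_until: date                    # validity period end (inclusive)
    scope: str
    restrictions: list[str] = field(default_factory=list)
    service_level_objectives: list[str] = field(default_factory=list)
    service_level_indicators: list[str] = field(default_factory=list)
    penalties: list[str] = field(default_factory=list)
    optional_services: list[str] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)
    administration: str = ""

    def is_valid_on(self, day: date) -> bool:
        """Return True if the given day falls inside the validity period."""
        return self.valid_from <= day <= self.valid_until


# Example values are made up; only the party names come from the slides.
sla = SLADefinition(
    purpose="Guarantee availability of a hosted farm",
    parties={"myASP.com": "user", "myUDC.com": "provider"},
    valid_from=date(2003, 1, 1),
    valid_until=date(2003, 12, 31),
    scope="Farm hosting",
    service_level_objectives=["availability >= 99.9%"],
    service_level_indicators=["farm availability"],
)
print(sla.is_valid_on(date(2003, 6, 15)))
```

Keeping the objectives and indicators as separate lists reflects the split on the slides between what is promised and how it is measured.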
SLA specification
- An SLA is specified over a set of measurable data
- Date constraint (start date, end date, nextEvalDate)
- SLOs
  - Day-time constraint
  - MeasuredItems: a set of clauses based on measured data
    - Contains many items
  - evalWhen: the trigger time of this SLO evaluation
  - evalOn: determines how the sample data is computed for the evaluation
  - evalFunc: a mathematical function expressed in terms of its inputs and logic
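One way to read the evalWhen / evalOn / evalFunc split is as a trigger label plus two functions: one that selects the samples and one that decides over them. The sketch below assumes exactly that reading; the class, the helper functions, and the 0.9 threshold are hypothetical and are not taken from the paper's specification language.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SLO:
    """Hypothetical SLO clause with the three evaluation parts from the slide."""
    measured_item: str
    eval_when: str                                          # trigger, e.g. "month-end"
    eval_on: Callable[[Sequence[float]], Sequence[float]]   # picks the samples to use
    eval_func: Callable[[Sequence[float]], bool]            # decides pass or fail

    def evaluate(self, samples: Sequence[float]) -> bool:
        # First narrow the data with evalOn, then apply evalFunc to the result.
        return self.eval_func(self.eval_on(samples))


def last_five(samples: Sequence[float]) -> Sequence[float]:
    """Illustrative evalOn: keep only the five most recent samples."""
    return samples[-5:]


def mean_at_least_0_9(selected: Sequence[float]) -> bool:
    """Illustrative evalFunc: mean of the selected samples is at least 0.9.

    Assumes at least one sample was selected.
    """
    return sum(selected) / len(selected) >= 0.9


slo = SLO("availability", "month-end", last_five, mean_at_least_0_9)
print(slo.evaluate([0.5, 1.0, 1.0, 1.0, 1.0, 1.0]))
```

Separating selection from decision means the same evalFunc can be reused with a different sampling rule.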
SLA specification Example
- Scenario:
  - SLO clause: At month-end, the availability of the farm allocated to the user myASP.com, measured on myUDC.com Monday to Friday from 9 AM to 5 PM, should be at least 99.9%
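The clause above could be checked roughly as follows. This is a sketch under stated assumptions: availability is taken to be the fraction of "up" probe samples inside the Monday to Friday, 9 AM to 5 PM window, and the function names and the sample data are invented for illustration.

```python
from datetime import datetime


def in_business_hours(ts: datetime) -> bool:
    """Day-time constraint: Monday to Friday, 9 AM (inclusive) to 5 PM (exclusive)."""
    return ts.weekday() < 5 and 9 <= ts.hour < 17


def availability(samples) -> float:
    """Fraction of 'up' samples among those inside the day-time constraint.

    samples is an iterable of (timestamp, is_up) pairs.
    """
    relevant = [up for ts, up in samples if in_business_hours(ts)]
    if not relevant:
        raise ValueError("no samples inside the day-time constraint")
    return sum(relevant) / len(relevant)


def slo_met(samples, threshold: float = 0.999) -> bool:
    """Month-end check: availability must be at least the agreed threshold."""
    return availability(samples) >= threshold


# Made-up probe results; 2003-03-03 is a Monday, 2003-03-08 a Saturday.
samples = [
    (datetime(2003, 3, 3, 10, 0), True),
    (datetime(2003, 3, 3, 11, 0), False),
    (datetime(2003, 3, 8, 10, 0), False),   # weekend, ignored
    (datetime(2003, 3, 3, 20, 0), False),   # after hours, ignored
]
print(availability(samples), slo_met(samples))
```

Outages outside the window do not count against the provider, which is the point of the day-time constraint in the clause.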
SLA Monitoring
SLA Measurement Protocol
- Init: sent from the measurement proxy to the evaluator proxy
- Request: the evaluator site decides the exact measurement specification and sends it to the measurement proxy
- Agreement: sent by the measurement proxy to the evaluator if it agrees to the request
- Start: sent by the evaluator to commence reporting
- Report: the actual measurement report
- Close: terminates the exchange
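The message order on this slide can be sketched as a small state machine. The transition table below is an assumption: it allows repeated Report messages and permits Close only after at least one Report, neither of which the slide states explicitly.

```python
# Allowed next messages after each message; None marks the start of an exchange.
# The table is an illustrative assumption, not the protocol definition.
TRANSITIONS = {
    None: {"Init"},
    "Init": {"Request"},
    "Request": {"Agreement"},
    "Agreement": {"Start"},
    "Start": {"Report"},
    "Report": {"Report", "Close"},
    "Close": set(),
}


def is_valid_exchange(messages) -> bool:
    """Return True if the messages follow the assumed order and end with Close."""
    previous = None
    for message in messages:
        if message not in TRANSITIONS.get(previous, set()):
            return False
        previous = message
    return previous == "Close"


exchange = ["Init", "Request", "Agreement", "Start", "Report", "Report", "Close"]
print(is_valid_exchange(exchange))
```

A table-driven check like this makes it easy to adjust the assumed order, for example to let a proxy close the exchange when it does not agree to a request.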
Conclusions and Comments
- Conclusions:
  - Applying the grid model to commercial environments requires specification, monitoring, and assurance of SLAs
  - The paper defines a specification language and a framework for monitoring
- Comments:
  - No implementation details
  - Hand-waving