Service Level Status Overview project. Sebastian Lopienski, CERN IT/FIO. HEPiX meeting, Jefferson Lab, October 10th, 2006.


1 Service Level Status Overview project. Sebastian Lopienski, CERN IT/FIO. HEPiX meeting, Jefferson Lab, October 10th, 2006

2 Agenda
Overview of the project
Concepts
  – service, subservice, metaservice
  – availability vs. status
  – Key Performance Indicators
Demonstration
Your own SLS instance?

3 The need
What is the current availability of the CVS service?
Which services are still affected by last night's power cut?
If my service is in maintenance, what other services will be affected?
What is the overall status of all services used by the ATLAS experiment?

4 Service Level Status Overview (SLS)
Aim:
  – To provide a web-based tool that dynamically shows availability, basic information and statistics about IT services, as well as dependencies between them.
For whom?
  – service users
  – department and CERN management
  – other service providers
  – manager of the given service

5 First insight

6 Features
collecting and displaying service information, status and availability
dependencies and reverse dependencies
service incidents, scheduled interventions
hierarchical structure of services
configurable views of services
charts of availability trends over time
statistics of availability (and other values)
Key Performance Indicators (KPIs)

7 Architecture
We collect and display information, but we don’t generate it!

8 Architecture

9 Agenda
Overview of the project
Concepts
  – service, subservice, metaservice
  – availability vs. status
  – Key Performance Indicators
Demonstration
Your own SLS instance?

10 Services, metaservices etc.

11 What is service availability?
Service availability indicates to what extent a given service is accessible and useful for its users
Services should be monitored from the users’ point of view
  – a user doesn’t care about alarms on machines running the service
In SLS, service availability is a number N: 0 ≤ N ≤ 100

12 Service availability and status
Service fully (100%) available
Service available at 95%, still marked as fully available
  – above the highest threshold
Service available at 87%, marked as affected
  – below the highest threshold
Service available at 50%, marked as degraded
  – below the medium threshold
Service available at 13%, marked as not available
  – below the lowest threshold
Service info expired, update not available
Scheduled outage or maintenance
Different status thresholds mean different status for services with the same availability (more at http://cern.ch/SLS/help.php)
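The mapping from availability to status on the slide above can be sketched as follows. This is a minimal illustration, not SLS code: the `Thresholds` class, the `status_for` function, the threshold values 90/70/30, and the choice to treat a value equal to a threshold as being above it are all assumptions. The "expired" and "scheduled maintenance" states are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical container for the three per-service status thresholds."""
    highest: float
    medium: float
    lowest: float


def status_for(availability: float, t: Thresholds) -> str:
    """Map an availability percentage (0..100) to a status label.

    Assumption: a value exactly equal to a threshold counts as above it.
    """
    if not 0 <= availability <= 100:
        raise ValueError("availability must be between 0 and 100")
    if availability >= t.highest:
        return "available"
    if availability >= t.medium:
        return "affected"
    if availability >= t.lowest:
        return "degraded"
    return "not available"


# Illustrative thresholds chosen so that the slide's examples come out as stated.
strict = Thresholds(highest=90, medium=70, lowest=30)
for value in (100, 95, 87, 50, 13):
    print(value, status_for(value, strict))

# The same availability can yield a different status under other thresholds.
lenient = Thresholds(highest=80, medium=70, lowest=30)
print(87, status_for(87, lenient))
```

With the first set of thresholds, 87% is reported as affected; with the second set the same 87% is reported as available, which is the point made in the last line of the slide.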

13 Key Performance Indicators
KPIs are metrics that indicate whether a service meets its requirements (performance or other)
Examples of Key Performance Indicators:
  – % of availability of CPU servers (how many machines in production out of total)
  – % of AFS volumes and servers available, also breakdown by VO
  – CPU delivered to VO as compared to quota, % of usage from Grid
A KPI is a pair of values: measured and target
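A KPI as a measured/target pair could be represented as below. This is only a sketch: the `Kpi` class, the `meets_target` method, the "higher is better" comparison, and the sample figures are assumptions made for illustration, not part of SLS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kpi:
    """Hypothetical KPI record: a measured value paired with its target."""
    name: str
    measured: float
    target: float

    def meets_target(self) -> bool:
        # Assumption: higher is better, so the KPI is met when measured >= target.
        return self.measured >= self.target


# Illustrative numbers only; they are not taken from the talk.
cpu_servers = Kpi("CPU servers in production (%)", measured=97.5, target=95.0)
print(cpu_servers.name, cpu_servers.meets_target())
```

A KPI where lower is better (for example a latency) would need the comparison reversed.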

14 Agenda
Overview of the project
Concepts
  – service, subservice, metaservice
  – availability vs. status
  – Key Performance Indicators
Demonstration
Your own SLS instance?

15 SLS instance at CERN
http://cern.ch/SLS (NICE password required)
  – all availabilities shown there are real and up to date
inline SLS view for a given service (e.g. at http://cern.ch/CVS)

16 SLS instance at CERN
Most IT services are covered by SLS:
  – Administrative applications
  – Windows, Mail, Web services
  – AFS, lxbatch, lxplus, Backup, Tapes, Remedy, Lemon
  – CVS services, J2EE Public Service, EDMS databases
  – LCG Tier-0 and 1 sites
  – Indico, CDS, CRBS, VRVS etc.
Metaservices and views: logical structure, group structure, VO-oriented structure

17 Agenda
Overview of the project
Concepts
  – service, subservice, metaservice
  – availability vs. status
  – Key Performance Indicators
Demonstration
Your own SLS instance?

18 Setting up an SLS instance
Simple installation from an RPM
  – for SLC3 and SLC4
  – see: https://twiki.cern.ch/twiki/bin/view/FIOgroup/SLSAdminDocumentation
No CERN-specific dependencies
Requirements
  – Apache, Python, PHP (with DOM and OCI8 extensions)
  – Xerces-C >= 2.3
  – JpGraph and GD library
  – cx_Oracle (for the database functionality)
Comes with one service predefined: SLS itself
Released under the EU DataGrid software license
  – a BSD-style license

19 Adding a new service
The service manager has to:
  – have an idea of how to measure service availability
  – and have a piece of code that calculates the availability percentage value (0..100)
Then, follow two simple steps:
  – prepare a static service description XML file and send it to us (once)
  – make service update XMLs available via HTTP
The SLS Manual for Service Managers provides detailed instructions and many examples of XMLs: https://twiki.cern.ch/twiki/bin/view/FIOgroup/SLSManualForSM
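The "piece of code that calculates the availability percentage" depends entirely on the service. One simple possibility, borrowed from the CPU-server KPI mentioned earlier in the talk (machines in production out of total), is sketched here. The function name `availability_percent`, its rounding to a whole number, and the sample counts are assumptions.

```python
def availability_percent(in_production: int, total: int) -> int:
    """Return availability as a whole percentage in the range 0..100.

    Assumption: availability is the share of machines currently in
    production out of the total number of machines behind the service.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= in_production <= total:
        raise ValueError("in_production must be between 0 and total")
    return round(100 * in_production / total)


# Illustrative figures only: 47 of 50 machines in production.
print(availability_percent(47, 50))
```

A real service would more likely probe itself the way a user would (log in, check out a file, send a test mail), in line with the talk's point that availability should be measured from the users' point of view.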

20 Minimal static XML example
Values from the example (XML tags not preserved in this transcript): DFS, DFS (Distributed File System), https://websvc02.cern.ch/winservices-soap/...
Example of static service description XML with more information: https://twiki.cern.ch/twiki/bin/view/FIOgroup/SLSManualForSM#Static_XML_with_more_information

21 Service managers
… Carlos Ungil, Maciej Stepniewski, William Tomlin …
Contact data from LDAP

22 Service dependencies
Two different levels of dependency:
  – dependson: the service will not work if AFS is down
  – uses: the service uses Castor (for example for backup), but will work fine (or almost fine) even if Castor is not available
… AFS Castor …
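Reverse dependencies answer the question raised earlier in the talk: if one service goes down, which others are affected? A minimal sketch follows, assuming the hard `dependson` links are held in a dictionary and that the effect propagates transitively. The function `affected_by`, the data layout, the example edges, and the `BuildService` name are hypothetical; `uses` links are deliberately ignored because such services keep working.

```python
from collections import deque


def affected_by(service: str, dependson: dict[str, set[str]]) -> set[str]:
    """Return every service that transitively depends on `service`.

    `dependson` maps a service name to the set of services it cannot
    work without (hypothetical in-memory form of the dependson links).
    """
    # Invert the graph: for each service, who depends on it directly?
    reverse: dict[str, set[str]] = {}
    for svc, deps in dependson.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(svc)

    # Breadth-first walk over the reverse edges.
    affected: set[str] = set()
    queue = deque([service])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent != service and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical dependency data, for illustration only.
links = {
    "CVS": {"AFS"},
    "lxplus": {"AFS"},
    "BuildService": {"CVS"},
}
print(sorted(affected_by("AFS", links)))
```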

23 Status thresholds
… 80 70 30 …
Scale: 0, 30, 70, 80, 100

24 Minimal update XML example
<serviceupdate xmlns="http://sls.cern.ch/SLS/XML/update"> CVS 100 2006-03-14T14:20:27+01:00
Example of availability update XML with more information: https://twiki.cern.ch/twiki/bin/view/FIOgroup/SLSManualForSM#Update_XML_with_more_information
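An update document like the one on the slide above could be produced with Python's standard library. The root element and namespace are taken from the slide; the child element names `id`, `availability` and `timestamp` are assumptions, since the transcript preserves only their values, and should be checked against the SLS Manual for Service Managers. The function name `build_update_xml` is also hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Namespace as shown on the slide.
NS = "http://sls.cern.ch/SLS/XML/update"
# Serialise the namespace as the default one (xmlns="...") instead of ns0:.
ET.register_namespace("", NS)


def build_update_xml(service_id: str, availability: int, when: datetime) -> str:
    """Build a minimal service update document as a string.

    Assumption: the child elements are named id, availability, timestamp.
    """
    if not 0 <= availability <= 100:
        raise ValueError("availability must be between 0 and 100")
    if when.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    root = ET.Element(f"{{{NS}}}serviceupdate")
    ET.SubElement(root, f"{{{NS}}}id").text = service_id
    ET.SubElement(root, f"{{{NS}}}availability").text = str(availability)
    ET.SubElement(root, f"{{{NS}}}timestamp").text = when.isoformat(timespec="seconds")
    return ET.tostring(root, encoding="unicode")


# Reproduce the values from the slide: CVS, 100, 2006-03-14T14:20:27+01:00.
cet = timezone(timedelta(hours=1))
print(build_update_xml("CVS", 100, datetime(2006, 3, 14, 14, 20, 27, tzinfo=cet)))
```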

25 Making update XML accessible via HTTP
Generate update XMLs with any server-side language / technology / platform:
  – PHP, Perl, Python, CGI, ASP
  – .Net: C#, J2EE: Servlets, JSP
or: Periodically refresh a file (from a cron job) and make it available via HTTP
or: Write a Lemon sensor providing service availability
Advice and examples in the SLS Manual for Service Managers
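For the cron-based option, the one practical concern is that the web server should never serve a half-written file. A common approach, sketched here under the assumption that the file lives in a directory already exported by the web server, is to write to a temporary file and rename it into place. The function name `publish_atomically` and the file name in the example are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def publish_atomically(xml_text: str, target: Path) -> None:
    """Write xml_text to target so that readers see either the old or the new file."""
    target = Path(target)
    # Create the temporary file in the same directory so the rename stays
    # on one filesystem and is therefore atomic.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(xml_text)
        # mkstemp creates the file readable by the owner only; the web
        # server usually runs as another user and needs read access.
        os.chmod(tmp_name, 0o644)
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


if __name__ == "__main__":
    # Hypothetical location; in practice this would be under the web server's document root.
    publish_atomically("<serviceupdate/>", Path(tempfile.gettempdir()) / "cvs_update.xml")
```

A cron entry would then run the script that measures availability, builds the update XML and calls this function every few minutes.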

26 Observations
Trusting service managers
  – there is no way to cross-check availability figures provided by services
User expectations
  – Is it really real-time?
  – My mailbox/CVS repository/J2EE container doesn’t work, but the service is green!
Surprisingly, convincing service managers to join in was not that difficult

27 Summary
SLS shows availability and status of services as seen by users
SLS is a flexible and informative display covering the entirety of computing services
SLS collects and displays information provided by the services
SLS is available for use outside CERN

28 Thank you!
SLS instance at CERN (password protected): http://cern.ch/SLS
Sebastian.Lopienski@cern.ch
Questions?

