Performance and Exception Monitoring Project Tim Smith CERN/IT.


1 Performance and Exception Monitoring Project. Tim Smith, CERN/IT

2 2000/11/02 Tim Smith: HEPiX @ JLab. Overview
- Motivation
- Objectives
- Analysis and Design
- Prototyping
- Perspective and Future

3 Motivation
- Alarm
- Recovery action
- Monitoring
- System
- Local
- Remote
- Process killer
- Console
- Resource planning
- Accounting
- Security
- Inventory
- Independent systems
- No single overview
- Duplicated collection
- Host based: Want Service
- Perceived problems not real
- Scalability

4 Motivation
- Alarm
- Recovery action
- Monitoring
- System
- Local
- Remote
- Console
- Resource planning
- Accounting
- Security
- Inventory
- Configuration
- Collection
- Transport
- Repository mgmt
- Display

5 Objectives
- To provide tools in which the alarms and displays are orientated to the overall service provided:
  - User end-to-end views, Quality of service views
  - Managerial views of resource usage / evolution / failure rates
  - Service provider views, and detailed machine views
- Link the alarms to both the monitoring and corrective actions
- To provide service level metrics
- To provide a uniform monitoring infrastructure
  - Coordinated central repositories + Common logging format
  - Averaging and archiving of logged information
  - Correlations between logged information
  - Multiple input routes; extensible monitoring clients
  - Modular tools; demonstrated scalability

6 Process
- Analysis
  - User Requirements Document
  - Current Tools survey: Enterprise/Cluster mgmt, Pub domain, other labs, building blocks, DAQ, Run Control, Slow Control
  - Goal / Question / Metric formalism
  - System Requirements Document
- Design
  - Interfaces Document
- Prototyping

7 Goal / Question / Metric
- Ensure quality of Interactive Service
- Sufficient nodes?
- Low enough load?
- Slow to respond to commands?
- Contactable via network
- Network daemons alive
- No nologin
- Free ptys
- Connection test from remote node
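
The Goal / Question / Metric breakdown on this slide can be pictured as a small tree. The Java sketch below is only an illustration under stated assumptions: the class names (`GqmSketch`, `Metric`, `Question`, `Goal`), the grouping of the last four items as metrics under the question "Contactable via network", and the pass/fail model are all hypothetical and not taken from the PEM design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of the Goal / Question / Metric formalism.
class GqmSketch {
    // A metric is a named check; the supplier stands in for a real probe.
    static final class Metric {
        final String name;
        final BooleanSupplier check;
        Metric(String name, BooleanSupplier check) {
            this.name = name;
            this.check = check;
        }
    }

    // A question is answered by evaluating the metrics attached to it.
    static final class Question {
        final String text;
        final List<Metric> metrics;
        Question(String text, Metric... metrics) {
            this.text = text;
            this.metrics = Arrays.asList(metrics);
        }
        // Names of the metrics whose check currently fails.
        List<String> failing() {
            List<String> failed = new ArrayList<>();
            for (Metric m : metrics) {
                if (!m.check.getAsBoolean()) failed.add(m.name);
            }
            return failed;
        }
    }

    // A goal is met when no question has a failing metric.
    static final class Goal {
        final String text;
        final List<Question> questions;
        Goal(String text, Question... questions) {
            this.text = text;
            this.questions = Arrays.asList(questions);
        }
        boolean satisfied() {
            for (Question q : questions) {
                if (!q.failing().isEmpty()) return false;
            }
            return true;
        }
    }

    // Assumed grouping: the four checks answer "Contactable via network?".
    // The boolean flags replace real measurements for the sake of the sketch.
    static Goal interactiveService(boolean daemonsAlive, boolean noNologin,
                                   boolean freePtys, boolean remoteConnectOk) {
        Question contactable = new Question("Contactable via network?",
                new Metric("Network daemons alive", () -> daemonsAlive),
                new Metric("No nologin", () -> noNologin),
                new Metric("Free ptys", () -> freePtys),
                new Metric("Connection test from remote node", () -> remoteConnectOk));
        return new Goal("Ensure quality of Interactive Service", contactable);
    }

    public static void main(String[] args) {
        Goal goal = interactiveService(true, true, false, true);
        Question q = goal.questions.get(0);
        System.out.println(goal.text + ": "
                + (goal.satisfied() ? "OK" : "failing " + q.failing()));
    }
}
```

Keeping the metric as a named check makes it possible to report which measurement broke a goal, which is the kind of service-oriented alarm the objectives ask for.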

8 PEM Architecture (diagram)
- Components: User Interface, Monitoring Agent, Monitoring Broker, Measurement Repository, Configuration Repository, Correlation Engine, Access Server
- Associations carry multiplicities of 1 and 1..n
- A separate region is labelled Outside PEM

9 Configuration Repository: Loading the DB (diagram)
- Components: Parser, XML-DBMS, jdbc, RDBMS, Viewers
- Parser: Xerces from Apache
- XML-DBMS: freeware (tried XSU from Oracle)
- XML Schema
- Contents: Host, Host type, Metrics, Services
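
To make the parsing step concrete, here is a minimal Java sketch that reads a configuration document into plain objects. It is an assumption-laden illustration: the XML layout (`config`, `host`, `metric` elements and their attributes), the `ConfigParseSketch` and `HostConfig` names, and the use of the JDK's built-in JAXP DOM parser are all hypothetical choices; the slide's own chain uses Xerces with XML-DBMS and jdbc, and the database load is left out here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch: parse host configuration XML into Java objects.
class ConfigParseSketch {
    // One configured host with its type and the metrics to collect.
    static final class HostConfig {
        final String name;
        final String type;
        final List<String> metrics;
        HostConfig(String name, String type, List<String> metrics) {
            this.name = name;
            this.type = type;
            this.metrics = metrics;
        }
    }

    // Invented sample document; the real PEM schema is not shown in the slides.
    static final String SAMPLE =
            "<config>"
          + "<host name=\"node01\" type=\"interactive\">"
          + "<metric name=\"load\"/><metric name=\"free_ptys\"/>"
          + "</host>"
          + "</config>";

    static List<HostConfig> parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<HostConfig> hosts = new ArrayList<>();
            NodeList hostNodes = doc.getElementsByTagName("host");
            for (int i = 0; i < hostNodes.getLength(); i++) {
                Element host = (Element) hostNodes.item(i);
                // Collect the metric names declared under this host.
                List<String> metrics = new ArrayList<>();
                NodeList metricNodes = host.getElementsByTagName("metric");
                for (int j = 0; j < metricNodes.getLength(); j++) {
                    metrics.add(((Element) metricNodes.item(j)).getAttribute("name"));
                }
                hosts.add(new HostConfig(
                        host.getAttribute("name"), host.getAttribute("type"), metrics));
            }
            return hosts;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("cannot parse configuration", e);
        }
    }

    public static void main(String[] args) {
        for (HostConfig h : parse(SAMPLE)) {
            System.out.println(h.name + " (" + h.type + ") -> " + h.metrics);
        }
    }
}
```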

10 Configuration Repository: Querying the DB (diagram)
- Components: Parser, XML-DBMS, jdbc, RDBMS, XML DB, jdbc
- Results: Configuration Items, Java Objects

11 Correlation Engine
- To correlate metrics from the MRS according to configuration in the CRS
- Metric collections: trends + multiple machines
- Samplings: Union for read efficiency from MRS
- Example Java Classes:
  - Correlation coordinator
  - Sampling cache
  - Evaluators
  - Timers
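
One reading of "Union for read efficiency" is that several evaluators ask for overlapping samplings and the coordinator reads each distinct one from the MRS only once. The Java sketch below illustrates that reading; it is an assumption, and the names `SamplingUnionSketch`, `Sampling` and `union` are hypothetical rather than the actual PEM classes.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: merge the samplings requested by several evaluators.
class SamplingUnionSketch {
    // A sampling identifies one metric on one host.
    static final class Sampling {
        final String host;
        final String metric;
        Sampling(String host, String metric) {
            this.host = host;
            this.metric = metric;
        }
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Sampling)) return false;
            Sampling other = (Sampling) o;
            return host.equals(other.host) && metric.equals(other.metric);
        }
        @Override
        public int hashCode() {
            return Objects.hash(host, metric);
        }
        @Override
        public String toString() {
            return host + "/" + metric;
        }
    }

    // Union of all requests, in first-seen order, so each is read only once.
    static Set<Sampling> union(List<List<Sampling>> requestsPerEvaluator) {
        Set<Sampling> distinct = new LinkedHashSet<>();
        for (List<Sampling> requests : requestsPerEvaluator) {
            distinct.addAll(requests);
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<Sampling> loadTrend = Arrays.asList(
                new Sampling("node01", "load"), new Sampling("node02", "load"));
        List<Sampling> nodeHealth = Arrays.asList(
                new Sampling("node02", "load"), new Sampling("node02", "free_ptys"));
        // Four requests collapse into three distinct reads.
        System.out.println(union(Arrays.asList(loadTrend, nodeHealth)));
    }
}
```

A sampling cache, as named on the slide, could then hold the values fetched for this set and serve every evaluator from memory.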

12 Events
- Publish / Subscribe: Java RMI
- Interfaces Document
- Diagram components: User Interface, Monitoring Agent, Monitoring Broker, Measurement Repository, Configuration Repository, Correlation Engine, Access Server
- Event types: metric stream, metric value, exception, configuration
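
A publish / subscribe contract over Java RMI is usually expressed as interfaces that extend `java.rmi.Remote` and declare `RemoteException`. The sketch below shows that shape and runs in a single JVM, without a registry or exported stubs. The interface and class names (`MetricListener`, `MetricPublisher`, `Broker`) and the method signatures are hypothetical; the real contracts would be those in the project's Interfaces Document, which the slides do not reproduce.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of RMI-style publish / subscribe interfaces.
class PubSubSketch {
    // Callback implemented by subscribers (for example a repository or UI).
    interface MetricListener extends Remote {
        void onMetric(String host, String metric, double value) throws RemoteException;
    }

    // Subscription endpoint offered by the publishing side.
    interface MetricPublisher extends Remote {
        void subscribe(MetricListener listener) throws RemoteException;
    }

    // In-process broker; exporting it with UnicastRemoteObject.exportObject
    // would make the same interface callable from other JVMs.
    static final class Broker implements MetricPublisher {
        private final List<MetricListener> listeners = new CopyOnWriteArrayList<>();

        @Override
        public void subscribe(MetricListener listener) {
            listeners.add(listener);
        }

        // Delivers one value to every subscriber; returns how many accepted it.
        int publish(String host, String metric, double value) {
            int delivered = 0;
            for (MetricListener listener : listeners) {
                try {
                    listener.onMetric(host, metric, value);
                    delivered++;
                } catch (RemoteException e) {
                    // A fuller broker might drop a subscriber that keeps failing.
                }
            }
            return delivered;
        }
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.subscribe((host, metric, value) ->
                System.out.println(host + " " + metric + " = " + value));
        broker.publish("node01", "load", 1.5);
    }
}
```

Because subscribers only depend on the listener interface, the same code path can serve local objects during prototyping and remote stubs later.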

13 Monitoring Agent/Broker I
- SNMP
  - Extended existing infrastructure
  - Multithreaded broker loading DB
- JMX / JDMK
  - JMX public specification: managed resources
  - Pluggable agents
  - Reported several important bugs
  - Demo at JavaOne conference
  - Linux/NT remote reset
  - Netlogger instrumentation
  - Opened up license negotiations

14 Monitoring Agent/Broker II
- C
  - Low overhead
- Diagram components: SNMP, /proc, netlogger, Script, Spool, Monitoring Process, Spool Manager, Monitoring Broker
- Not yet … DMTF
  - DMI, CMI

15 PEM Futures
- Today: CERN CC needs it
  - Prototype for ALICE MDC III in January
- Tomorrow: Tier-0 RC / GRID node need it
  - More complete management solutions
  - Integrate into the Fabric Management WP
- ‘GRIDification’
  - Rapidly evolving technologies
  - Lots of middleware
  - Lots of companies wanting collaboration
  - Still need framework

16 PEM in Perspective (diagram)
- Configuration Management
- Alarm
- Recovery Actions
- Inventory
- Resource Planning
- Security
- PC Hardware
- Console Mgmt
- Power Mgmt/Remote Reset
- OS Installation/Update
- OS Configuration/Update
- Application Inst/Update
- Monitoring

