Published by Garey Cooper. Modified over 9 years ago.
1
Marcelo Alcocer KIP / ICL CBM Conference 2006 Cluster Monitoring with EPICS and SNMP 1
2
Motivation
We wish to monitor the ALICE HLT analysis cluster, which consists of 500 PCs.
Analysing the data obtained from the ALICE experiment will take a long time, so a stable analysis cluster is needed.
To ensure stability, this cluster must be monitored constantly.
Using the EPICS architecture with SNMP support, it is possible to monitor such a PC cluster.
3
Contents
Cluster Management
–SNMP: MIB Trees, SNMP Operations, Using Data from SNMP
–EPICS: Overview, Channel Access, Record Display, Device Support
–devSNMP
Management Possibilities
Test Implementation
–Overview
–Software
–Monitored Resources
–Example Implementation
–Extended Implementation
Extension Possibilities
Current State
Summary
4
Cluster Management
PC clusters are now widely used for data analysis in many settings, such as physics experiments and commercial organisations.
These clusters often consist of hundreds to thousands of individual PCs (nodes).
To maintain a healthy, efficient cluster, key resources of the nodes must be monitored, e.g.:
–Hard disk usage
–Processor usage
–Running processes, etc.
What is the best way of obtaining this information from the nodes?
–Self monitoring?
–Operating system logging?
–SNMP?
5
Simple Network Management Protocol
The Simple Network Management Protocol (SNMP) is a management protocol for gathering statistical data about network and host traffic and about the behaviour of network components.
It is a telecom industry standard, so most standards organizations and major vendors support it.
An SNMP agent maintains an extensive Management Information Base (MIB) on the host system: a database of information useful for network management.
MIB objects are organised in a tree structure that includes public (standard) and private branches.
These MIBs contain key system resource information that can be used for monitoring.
6
MIB Tree - Graphical View
+--iso(1)
   +--org(3)
      +--dod(6)
         +--internet(1)
            +--mgmt(2)
            |  +--mib-2(1)
            |     +--system(1)
            |        +--sysDescr(1)
            |        +--sysUpTime(3)
            +--private(4)
               +--enterprises(1)
                  +--ucdavis(2021)
                     +--dskTable(9)
                        +--dskEntry(1)
                           +--dskTotal(6)
                           +--dskAvail(7)
Objects in the MIB tree can be referred to symbolically or numerically.
–E.g.: iso.org.dod.internet.mgmt.mib-2.system.sysUpTime = 1.3.6.1.2.1.1.3
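The correspondence between symbolic and numeric names can be sketched in a few lines of Python. This is only an illustration: the table `MIB_NODES` and the helper names are hypothetical and contain just the nodes named on the slide, whereas a real SNMP tool set reads this information from MIB definition files.

```python
# Hypothetical lookup table with only the nodes shown on the slide.
# Each entry maps a symbolic name to (parent name, sub-identifier).
MIB_NODES = {
    "iso": (None, 1),
    "org": ("iso", 3),
    "dod": ("org", 6),
    "internet": ("dod", 1),
    "mgmt": ("internet", 2),
    "mib-2": ("mgmt", 1),
    "system": ("mib-2", 1),
    "sysDescr": ("system", 1),
    "sysUpTime": ("system", 3),
    "private": ("internet", 4),
    "enterprises": ("private", 1),
    "ucdavis": ("enterprises", 2021),
    "dskTable": ("ucdavis", 9),
    "dskEntry": ("dskTable", 1),
    "dskTotal": ("dskEntry", 6),
    "dskAvail": ("dskEntry", 7),
}


def path_to_root(name):
    """Return the list of node names from the root down to `name`."""
    names = []
    while name is not None:
        names.append(name)
        name = MIB_NODES[name][0]  # step up to the parent node
    return list(reversed(names))


def symbolic_oid(name):
    """Dotted symbolic form, e.g. iso.org.dod...."""
    return ".".join(path_to_root(name))


def numeric_oid(name):
    """Dotted numeric form, e.g. 1.3.6...."""
    return ".".join(str(MIB_NODES[n][1]) for n in path_to_root(name))


if __name__ == "__main__":
    for leaf in ("sysUpTime", "dskTotal"):
        print(symbolic_oid(leaf), "=", numeric_oid(leaf))
```

Walking from a leaf up to the root and reversing the result gives both forms from the same table, which mirrors how the two notations name the same position in the tree.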
7
MIB Tree - Output View
+--iso(1)
   |
   +--org(3)
      |
      +--dod(6)
         |
         +--internet(1)
            |
            +--directory(1)
            |
            +--mgmt(2)
            |  |
            |  +--mib-2(1)
            |     |
            |     +--system(1)
            |     |  |
            |     |  +-- -R-- String    sysDescr(1)
            |     |  |        Textual Convention: DisplayString
            |     |  |        Size: 0..255
            |     |  +-- -R-- ObjID     sysObjectID(2)
            |     |  +-- -R-- TimeTicks sysUpTime(3)
            |     |  +-- -RW- String    sysContact(4)
            |     |  |        Textual Convention: DisplayString
            |     |  |        Size: 0..255
            |     |  +-- -RW- String    sysName(5)
            |     |  |        Textual Convention: DisplayString
            |     |  |        Size: 0..255
            |     |  +-- -RW- String    sysLocation(6)
            |     |  |        Textual Convention: DisplayString
            |     |  |        Size: 0..255
            |     |  +-- -R-- INTEGER   sysServices(7)
            |     |  |        Range: 0..127
            |     |  +-- -R-- TimeTicks sysORLastChange(8)
            |     |  |        Textual Convention: TimeStamp
            |     |  |
            |     |  +--sysORTable(9)
            |     |     |
            |     |     +--sysOREntry(1)
            |     |        |  Index: sysORIndex
            |     |        |
            |     |        +-- ---- INTEGER   sysORIndex(1)
            |     |        |        Range: 1..2147483647
            |     |        +-- -R-- ObjID     sysORID(2)
            |     |        +-- -R-- String    sysORDescr(3)
            |     |        |        Textual Convention: DisplayString
            |     |        |        Size: 0..255
            |     |        +-- -R-- TimeTicks sysORUpTime(4)
            |     |                 Textual Convention: TimeStamp
            |     |
            |     +--interfaces(2)
            |     |  |
            |     |  +-- -R-- Integer32 ifNumber(1)
            |     |  |
            |     |  +--ifTable(2)
            |     |     |
            |     |     +--ifEntry(1)
            |     |        |  Index: ifIndex
            |     |        |
            |     |        +-- -R-- Integer32 ifIndex(1)
            |     |        |        Textual Convention: InterfaceIndex
            |     |        |        Range: 1..2147483647
            |     |        +-- -R-- String    ifDescr(2)
            |     |        |        Textual Convention: DisplayString
            |     |        |        Size: 0..255
            |     |        +-- -R-- EnumVal   ifType(3)
8
SNMP Operations - Overview
SNMP has simple client-server interactions, with few operations to access the information held in the MIB tree:
–{Get} {Set} {GetNext} {Walk} {Table} {Trap} {Translate}
These operations can query local MIB trees, or those of networked machines.
[Diagram: several managed devices, each running an SNMP agent with its own MIB, are reached over the network by SNMP operations.]
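To make the relationship between {Get}, {GetNext} and {Walk} concrete, here is a minimal Python sketch that runs against an in-memory dictionary rather than a real agent. The contents of `TOY_MIB` are invented for illustration, and the function names are hypothetical. The sketch relies on one general property of SNMP: objects are ordered lexicographically by OID, {GetNext} returns the first object after a given OID, and a walk repeats {GetNext} until it leaves the requested subtree.

```python
def parse_oid(text):
    """Turn '1.3.6.1' into a tuple of ints so that OIDs sort lexicographically."""
    return tuple(int(part) for part in text.split("."))


# Invented sample data standing in for an agent's MIB (not real measurements).
TOY_MIB = {
    parse_oid("1.3.6.1.2.1.1.1.0"): "Linux node01",  # sysDescr.0
    parse_oid("1.3.6.1.2.1.1.3.0"): 123456,          # sysUpTime.0
    parse_oid("1.3.6.1.2.1.1.5.0"): "node01",        # sysName.0
    parse_oid("1.3.6.1.2.1.2.1.0"): 2,               # ifNumber.0
}


def snmp_get(mib, oid):
    """{Get}: return the value stored at exactly this OID, or None."""
    return mib.get(oid)


def snmp_get_next(mib, oid):
    """{GetNext}: return (oid, value) of the first object after `oid`, or None."""
    later = sorted(key for key in mib if key > oid)
    if not later:
        return None
    return later[0], mib[later[0]]


def snmp_walk(mib, root):
    """{Walk}: repeat {GetNext} while the result stays below `root`."""
    results = []
    current = root
    while True:
        found = snmp_get_next(mib, current)
        if found is None or found[0][:len(root)] != root:
            break
        results.append(found)
        current = found[0]
    return results


if __name__ == "__main__":
    system_subtree = parse_oid("1.3.6.1.2.1.1")
    for oid, value in snmp_walk(TOY_MIB, system_subtree):
        print(".".join(map(str, oid)), "=", value)
```

Representing OIDs as integer tuples lets Python's built-in tuple comparison provide the lexicographic ordering, so no custom comparison is needed.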
9
SNMP Operations - Command Struct.
Typical SNMP {get} command structure:
–Operation
–Community
–PC to query
–MIB object to query
Output:
–MIB object queried
–Object type
–Object value
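The command and output structure listed above can be illustrated with a small Python sketch built around the NET-SNMP `snmpget` tool. The sketch only assembles the argument list and parses a sample output line; it does not contact an agent. The protocol version `2c`, the sample line and the helper names are assumptions made for illustration and are not taken from the slides.

```python
import re


def build_snmpget(host, community, mib_object, version="2c"):
    """Assemble an snmpget argument list: operation, community, host, object.

    The version default is an assumption; the slides do not state which
    SNMP version was used.
    """
    return ["snmpget", "-v", version, "-c", community, host, mib_object]


# Matches lines of the form "<object> = <type>: <value>".
OUTPUT_PATTERN = re.compile(r"^(?P<object>\S+) = (?P<type>[^:]+): (?P<value>.*)$")


def parse_snmpget_output(line):
    """Split one output line into (object queried, object type, object value)."""
    match = OUTPUT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised snmpget output: {line!r}")
    return match.group("object"), match.group("type"), match.group("value")


if __name__ == "__main__":
    print(" ".join(build_snmpget("localhost", "public", "system.sysUpTime.0")))
    # Invented sample line in the usual NET-SNMP output layout.
    sample = "SNMPv2-MIB::sysUpTime.0 = Timeticks: (123456) 0:20:34.56"
    print(parse_snmpget_output(sample))
```

On a machine with NET-SNMP installed and an agent running, the list returned by `build_snmpget` could be passed to `subprocess.run`, and each output line fed to `parse_snmpget_output`.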
10
Using Data from SNMP
Once the information has been obtained from the MIB trees, it must be fed into a control system to be useful in a management context.
The control system might process the information, store it for later analysis, or simply display it using a Graphical User Interface (GUI).
Many such systems currently exist:
–EPICS
–Ganglia
–Lemon
11
EPICS - Overview
One such system is the Experimental Physics and Industrial Control System (EPICS).
–www.aps.anl.gov/epics
It is currently in use in over 12 organizations to control devices in major projects such as particle accelerators, telescopes, and large experiments.
–GSI, SLAC, ANL, DESY, LANL,...
It therefore has a huge support and knowledge base.
It is based on a client/server network model, with servers holding information in records that can be accessed by the clients.
12
EPICS - Architecture
[Diagram: EPICS clients and EPICS servers connected over the network. Each server holds records, and each record has fields (Field 1: x, Field 2: y, Field 3: z).]
13
EPICS - Channel Access
Remote access to EPICS records is achieved through the Channel Access (CA) protocol.
This requires a CA server running on the EPICS server and a CA client running on the EPICS client.
These are usually already integrated into EPICS clients and servers when they are created.
14
EPICS - Architecture
[Diagram: as before, with a CA client added to each EPICS client and a CA server added to each EPICS server. Records with fields (Field 1: x, Field 2: y, Field 3: z) are accessed over the network.]
15
EPICS - Record Display
The information from EPICS records can be displayed by a GUI.
[Screenshot: MEDM]
16
EPICS - Record Display
[Screenshot: GumTree]
17
EPICS - Device Support
Records can be interfaced to numerous devices.
These devices can be hardware or software.
Interfacing allows information from a device to be input into EPICS records.
This interfacing is known as device support.
18
EPICS - Architecture
[Diagram: as before, with device support added to the records on the EPICS servers. CA clients reach the CA servers over the network.]
19
Device Support for SNMP - devSNMP
devSNMP is the device support for SNMP.
It allows data from SNMP to be input into EPICS records.
–Sets the input field of a record to an SNMP {get} operation
It is configured for the open source product NET-SNMP.
–This is simply one particular implementation of SNMP
–www.net-snmp.org
20
Device Support for SNMP - devSNMP
[Figure: SNMP {get} command]
Record definition file:
record (stringin, "System_Description"){
    # Use the SNMP device support
    field (DTYP,"Snmp")
    # Host, community, MIB object and value type to read
    field (INP,"@localhost public system.sysUpTime.0 STRING:100")
    # Re-read the value every 5 seconds
    field (SCAN,"5 second")
}
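When many nodes are monitored, record definitions of this shape could be generated rather than written by hand. The Python sketch below is an illustration only: it copies the field layout from the record definition on the slide, while the host names, the record naming scheme and the helper functions are hypothetical.

```python
# Template following the record definition shown on the slide.
# Doubled braces are literal braces in str.format templates.
RECORD_TEMPLATE = '''record (stringin, "{name}"){{
    field (DTYP,"Snmp")
    field (INP,"@{host} {community} {mib_object} STRING:100")
    field (SCAN,"{scan}")
}}
'''


def make_record(name, host, mib_object, community="public", scan="5 second"):
    """Return one record definition as text."""
    return RECORD_TEMPLATE.format(
        name=name, host=host, community=community,
        mib_object=mib_object, scan=scan,
    )


def make_database(hosts, mib_object, suffix):
    """Return one record per host; names follow a hypothetical host_suffix scheme."""
    return "\n".join(
        make_record(f"{host}_{suffix}", host, mib_object) for host in hosts
    )


if __name__ == "__main__":
    # Hypothetical node names, used for illustration only.
    nodes = ["node01", "node02", "node03"]
    print(make_database(nodes, "system.sysUpTime.0", "UpTime"))
```

The generated text could be saved to a database file and loaded by a soft IOC in the usual way; only the host and record name change from node to node.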
21
Management Possibilities
EPICS records can carry out simple calculations and conditional logic, but nothing very complicated.
The data from SNMP can therefore be used to control other devices interfaced with EPICS records.
One possible reaction is an SNMP {set} operation, which writes values to a MIB.
However, the current release of devSNMP supports only the {get} operation.
Support for other SNMP commands is planned for the future.
22
Test Implementation - Overview
Carried out on the Linux PC cluster at the Kirchhoff Institute for Physics, University of Heidelberg.
32 PCs running the SuSE 9 Linux OS.
23
Test Implementation - Software
EPICS Servers:
–30 cluster nodes (2.4 and 2.6 kernels) running EPICS soft IOCs with devSNMP
–NET-SNMP tool set and libraries installed on each node
EPICS Clients:
–Two cluster nodes (2.6 kernel) running an installation of the Motif Editor and Display Manager (MEDM) on an EPICS base
24
Test Implementation - Architecture
[Diagram: MEDM with its CA client connects over the network to the CA servers. Each server holds records whose input is SNMP, supplied through devSNMP from the SNMP agents and their MIBs.]
25
Test Implementation - Info. Flow
[Diagram: information flows from the records (input: SNMP) through their CA servers to the CA clients of the two MEDM installations.]
26
Test Implementation - Mon. Resources
Some resources monitored:
–Hard disk partition usage (total, available, used, percentage used, alarm limit)
–Average CPU usage over 1 min
–System up time (from SNMP daemon start)
–Inbound packet errors
–Unicast outbound packets
–SNMP daemon process check
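The hard disk quantities in this list are related by simple arithmetic, which the following Python sketch illustrates. It assumes that the total and available sizes come from the dskTotal and dskAvail objects of the ucdavis MIB branch (both reported in kBytes); the alarm limit, the sample numbers and the function name are hypothetical.

```python
def disk_usage(total_kb, avail_kb, alarm_limit_percent):
    """Derive used space, percentage used and alarm state for one partition.

    total_kb and avail_kb are assumed to be dskTotal and dskAvail values;
    alarm_limit_percent is a hypothetical, operator-chosen threshold.
    """
    if total_kb <= 0:
        raise ValueError("total size must be positive")
    used_kb = total_kb - avail_kb
    percent_used = 100.0 * used_kb / total_kb
    return {
        "total_kb": total_kb,
        "avail_kb": avail_kb,
        "used_kb": used_kb,
        "percent_used": percent_used,
        "alarm": percent_used >= alarm_limit_percent,
    }


if __name__ == "__main__":
    # Invented sample values, not measurements from the test cluster.
    print(disk_usage(total_kb=1000, avail_kb=250, alarm_limit_percent=70))
```

A calculation of this size is within what the talk describes EPICS records as being able to do themselves; the Python form is used here only to spell out the arithmetic.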
27
Example Implementation - DESY
EPICS with devSNMP is currently being used at DESY to monitor key switches and routers:
–Network traffic
–Status
Solaris and Linux PC clusters are to be monitored in the future.
In total there are around 25 managed devices, and this number is increasing all the time.
More information on EPICS/devSNMP at DESY:
–http://www-mks2.desy.de/content/e4/e40/e41/e12212/index_ger.html
28
Extension Possibilities
EPICS has limitations as a management system:
–EPICS is a static system
–Records have limited analysis and reaction capabilities; in particular, there are no rule-based events
For dynamic management, we can forward information from EPICS records to an expert management system: SysMES (Camilo Lara, et al.).
This allows complex analysis of, and reaction to, the data obtained from SNMP.
The management system must have a CA client to communicate with EPICS records.
29
Current State
The interface between the CA client and SysMES has been written.
Interfaces to the cluster monitoring systems LEMON and Ganglia have been defined and are being implemented.
30
Current State - Architecture
[Diagram: the test implementation architecture (MEDM, CA clients and servers, records with SNMP input, devSNMP, SNMP agents and MIBs) extended with a SysMES client interface that uses its own CA client on the network.]
31
Summary
SNMP:
–Is the standard for network management in almost all modern networked devices (e.g. PCs, workstations, bridges, switches, routers,...)
–Widely implemented protocol with a large knowledge base
–Very low system resource usage
–A lot of system information is stored in node MIB trees (which SNMP can access)
EPICS:
–Widely implemented control system with a huge support base
–Allows input and output to a vast array of devices
Through device support for SNMP, these can be combined to create a monitoring system.
This can be extended by forwarding the monitoring data to an expert management system (such as SysMES).
32
Thanks
Many thanks to all who have helped, but especially:
–Camilo Lara: Coordinator, KIP
–Albert Kagarmanov: devSNMP at DESY
33
The End
Thank you for your attention.
Any questions?