Connect. Communicate. Collaborate GÉANT2 monitoring Otto Kreiter, DANTE Navneet Daga, DANTE LHC Monitoring Workshop, Munich, 19.07.2006.


1 Connect. Communicate. Collaborate
GÉANT2 monitoring
Otto Kreiter, DANTE
Navneet Daga, DANTE
LHC Monitoring Workshop, Munich, 19.07.2006

2 Agenda
Extraction of monitoring information from the GÉANT2 network
External application developed by DANTE for JRA-4
Demonstration of a home-grown weather-map
Conclusion

3 Network Element Manager
All network elements communicate with the NM separately
The NM's task is to configure and monitor each NE one by one
It is not service aware: it has no knowledge of the intra-domain e2e path status

4 Regional Network Manager (RM)
Topology Services Correlation “User” interface

5 How we export data!
Alarms
Perf. Meas.
Rem. Inv.

6 Status via alarms
[Diagram labels: Alarms; SNMPTrapD; Alarms; Monitoring station]

7 Alarm content
From the NM:
  – Information about interfaces and associated signal status, SDH timing problems
  – NE and ILA status
From the RM:
  – Information related to services
  – Information related to paths, trails and physical connections at all layers

8 One hop case
NMS vs JRA-4
[Diagram labels: Path – gen_mil_CERN; OCH trail; Phys-link; Phys link; Domain link; P. ID link; BOL-CERN-LHC-001]

9 Multiple hop case
NMS vs JRA-4
[Diagram labels: Path – gen_mil_CERN; OCH trail; Phys-link; Phys link; Domain link; P. IDLink; CERN-SARA-LHC-001; OCH trail; Phys-link; P. IDLink]

10 Alarm processing
SNMP traps from the Alcatel IOO module
Alcatel Enterprise v1/v2c MIB
SNMP traps received by a Linux station
  – snmptrapd to pick up all alarms
  – For each trap a bash script is called which performs:
      Analysis
      Selection
      Action

11 Alarm type & information
Alarm Raise:
  – friendlyName
  – probableCause
  – perceivedSeverity
  – currentAlarmId
  – eventTime
  – acknowledgementStatus
  – additionalInformation
  – eventType
  – snmpTrapAddress
Alarm Clear:
  – friendlyName
  – probableCause
  – currentAlarmId
  – eventTime
  – snmpTrapAddress

12 Used alarm information
Alarm Raise:
  – friendlyName
  – probableCause
  – perceivedSeverity
  – currentAlarmId
  – eventTime
  – acknowledgementStatus
  – additionalInformation
  – eventType
  – snmpTrapAddress
Alarm Clear:
  – friendlyName
  – probableCause
  – currentAlarmId
  – eventTime
  – snmpTrapAddress

13 Alarm analyzer process
SNMP trap received
snmpTrapAddress: must be registered
Check for type of alarm
  – Raise
      Additional Info: path, clientpath, ochtrail, omstrail, physicallink
      recordAlarm
      Call external program
  – Clear
      alarmID
      Read recordAlarm
      Call external program
Record all traps
delete recordAl
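
The flowchart on this slide can be read as a small state machine. The talk's analyzer is a bash script whose source is not shown, so what follows is only a minimal Python sketch of the same flow. The decoded-trap dictionary, its "type" key, the on-disk layout under state_dir and the notify callback (standing in for the external program) are all assumptions made for illustration, as is the point at which every trap is logged.

```python
import json
from pathlib import Path

# Object kinds listed under "Additional Info" on the slide.
TRACKED_KINDS = {"path", "clientpath", "ochtrail", "omstrail", "physicallink"}


def handle_trap(trap, registered_sources, state_dir, notify):
    """Process one decoded trap (a dict) and return a short outcome string.

    All parameter names are hypothetical; notify(name, status) stands in
    for the external program called by the real script.
    """
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)

    # Record all traps (assumption: done before any filtering).
    with open(state_dir / "all_traps.log", "a") as log:
        log.write(json.dumps(trap, sort_keys=True) + "\n")

    # The sender address must be registered.
    if trap.get("snmpTrapAddress") not in registered_sources:
        return "ignored: unregistered source"

    record = state_dir / f"alarm-{trap['currentAlarmId']}.json"

    if trap["type"] == "raise":
        # Assumption: additionalInformation carries the object kind.
        if trap.get("additionalInformation") not in TRACKED_KINDS:
            return "ignored: untracked object kind"
        record.write_text(json.dumps(trap))      # record the alarm
        notify(trap["friendlyName"], "DOWN")     # call external program
        return "raised"

    if trap["type"] == "clear":
        if not record.exists():
            return "ignored: no matching raise"
        raised = json.loads(record.read_text())  # read the recorded alarm
        notify(raised["friendlyName"], "UP")     # call external program
        record.unlink()                          # delete the record
        return "cleared"

    return "ignored: unknown type"
```

Keeping one record file per alarm ID means a clear can be matched to its raise even if the analyzer restarts in between.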

14 Alarm analyzer
Called every time a trap is received
Written in bash
Each trap is analyzed separately; if a new trap arrives in the meantime, it waits in the queue (snmptrapd)
  – Possible problem: if an external program gets stuck, the script hangs and the alarms remain unprocessed in the queue
Must maintain state
  – SNMP traps may get lost, so a program needs to check from time to time whether the monitoring station is in sync with the NMS
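
Both risks named on this slide have common mitigations: put a time limit on the external program, and periodically compare the locally recorded alarms with the NMS's list of active alarms. The Python sketch below illustrates the idea only; it is not the talk's bash code. The function names and the 10-second default are hypothetical, and how the active-alarm list would be obtained from the NMS is left open.

```python
import subprocess


def run_external(cmd, timeout_s=10.0):
    """Run cmd (a list of arguments); return True only on a clean exit.

    A stuck program is abandoned after timeout_s seconds, so later traps
    are not left waiting in the queue behind it.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False
    except OSError:
        # The program is missing or not executable.
        return False
    return result.returncode == 0


def out_of_sync(local_alarm_ids, nms_alarm_ids):
    """Return (missed_raises, missed_clears) as two sets of alarm IDs."""
    local, remote = set(local_alarm_ids), set(nms_alarm_ids)
    # Active on the NMS but unknown locally: a raise trap was lost.
    # Known locally but no longer active on the NMS: a clear trap was lost.
    return remote - local, local - remote
```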

15 External applications
JRA-4 monitoring (XML file generation)
perfSonar DB feeder
Project weather-map: LHC

16 JRA-4 monitoring (XML file generation)

17 E2E Data transformation
Prototype applications developed in Java:
  – E2EXMLWriter
  – XMLGenerator
E2EXMLWriter takes in a template XML and produces an XML file containing live e2e path status information conforming to the JRA4 e2e data model
  – Triggered by a script listening to SNMP alarms
  – Parameters passed:
      Trail ID
      Status
XMLGenerator produces the template XML that E2EXMLWriter uses to export the domain’s e2e information

18 Design of E2EXMLWriter
Relies on 2 configuration files to produce live XML status information
  – Properties file (links.properties)
      Contains key = value entries
      Each key is one e2e path name
      The value of each key is a comma-separated list of the trails that form one path
      Currently manually maintained
  – Alarm register
      A simple CSV file
      Maintained by the application
      An “alarm raise” registers the associated path
      An “alarm clear” de-registers the associated path
(contd.)

19 Design (contd.)
The application sets every path’s default status to UP with admin state NORMALOPERATION
Only the paths “registered” in the alarm-register CSV file are set to DOWN with admin state MAINTENANCE
The status DEGRADED is not implemented at the moment
Other admin states are not implemented at the moment
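
The status rule described on the last two slides is compact enough to sketch. The prototype itself is written in Java and its source is not shown, so the Python below is only an illustration. The function names, the "raise"/"clear" status strings and the one-path-per-row register layout are assumptions.

```python
import csv
import io


def load_links(properties_text):
    """Parse 'path = trail1,trail2' lines into {path: [trails]}."""
    links = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        links[key.strip()] = [t.strip() for t in value.split(",") if t.strip()]
    return links


def load_register(csv_text):
    """Return the path names found in the first column of the register."""
    rows = csv.reader(io.StringIO(csv_text))
    return {row[0].strip() for row in rows if row and row[0].strip()}


def apply_alarm(registered, links, trail_id, status):
    """Return the register after a 'raise' or 'clear' on trail_id."""
    affected = {p for p, trails in links.items() if trail_id in trails}
    if status == "raise":
        return registered | affected
    # As on the slide, a clear simply de-registers the associated path.
    return registered - affected


def path_states(links, registered):
    """Default every path to UP; registered paths are DOWN."""
    return {
        path: ("DOWN", "MAINTENANCE") if path in registered
        else ("UP", "NORMALOPERATION")
        for path in links
    }
```

One consequence of this rule is that a path with two failed trails is reported UP again as soon as the first of them clears. Tracking registrations per trail rather than per path would avoid that.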

20 Design of XMLGenerator
Relies on 3 configuration files:
  – Properties file (init.properties)
      Contains a key = value entry
      Key = DOMAIN
      Value = the domain name (enables on-the-fly domain name configuration)
  – Config file (config.csv)
      A simple CSV file
      Contains node-link-node information
  – A sample XML file containing “pieces of XML” to be replicated for each node and link in the final output “template XML”
All configuration files are currently manually maintained
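
To illustrate how node-link-node rows could be expanded into a template, here is a minimal Python sketch. It is not the Java prototype, and the element and attribute names (domain, node, link, start, end) are placeholders, not the JRA4 e2e data model. It also builds elements directly instead of replicating pieces from a sample XML file.

```python
import csv
import io
import xml.etree.ElementTree as ET


def build_template(domain, config_csv_text):
    """Build a template XML string from 'nodeA,link,nodeB' rows."""
    root = ET.Element("domain", name=domain)
    seen_nodes = set()
    for row in csv.reader(io.StringIO(config_csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        node_a, link, node_b = (cell.strip() for cell in row)
        # Emit each node once, the first time it appears.
        for node in (node_a, node_b):
            if node not in seen_nodes:
                seen_nodes.add(node)
                ET.SubElement(root, "node", name=node)
        ET.SubElement(root, "link", name=link, start=node_a, end=node_b)
    return ET.tostring(root, encoding="unicode")
```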

21 Data Provision
Currently, the final XML containing live e2e path status information is written to a URL for export
  – http://unix.dante.org.uk/~otto/jra4-cbf.xml
Later, possibly integration with the perfSONAR framework

22 perfSonar feeder
Enters data in the perfSonar MA
Takes as input:
  – Type of logical link: trunk, trail, physical link or path
  – Name: friendlyName
  – Time: the time when the event occurred
  – Status: UP/DOWN
  – Alarm ID
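
The five inputs listed on this slide map naturally onto a small validated record. The Python sketch below shows one way to represent them. The class, its field names and the ISO 8601 time format are assumptions, and nothing here reflects the actual perfSonar MA interface.

```python
from dataclasses import dataclass
from datetime import datetime

LINK_TYPES = {"trunk", "trail", "physical link", "path"}
STATUSES = {"UP", "DOWN"}


@dataclass(frozen=True)
class LinkEvent:
    link_type: str   # trunk, trail, physical link or path
    name: str        # friendlyName from the alarm
    time: datetime   # when the event occurred
    status: str      # UP or DOWN
    alarm_id: str

    def __post_init__(self):
        if self.link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {self.link_type!r}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def parse_event(link_type, name, time_iso, status, alarm_id):
    """Build a LinkEvent from strings, normalizing the status to upper case."""
    return LinkEvent(
        link_type=link_type.strip().lower(),
        name=name,
        time=datetime.fromisoformat(time_iso),
        status=status.strip().upper(),
        alarm_id=str(alarm_id),
    )
```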

23 LHC weather-map live demonstration
1. CERN user-side down
2. CERN user-side up
3. GEN-MIL lambda down
4. GARR user-side down
5. Back-to-back interconnection in DE broken
6. AMS-FRA lambda down
7. DE interconnection up
8. AMS-FRA lambda up
9. GARR user-side up
10. GEN-MIL lambda up

24 Conclusion
Status monitoring via alarms is in an advanced phase and well understood.
  – Once the characteristics of the equipment, alarms and faults were understood, the development was easy.
The alarm collector can be reused by NRENs using Alcatel equipment.
XMLGenerator and the perfSonar feeder are not bound to specific equipment.

25 Questions?
Otto.Kreiter@dante.org.uk
Navneet.Daga@dante.org.uk

26 Backup

27 CERN user side down

28 Lambda CH-IT down

29 Lambda and user failure in IT

30 Lambda + POP interconnect failure

31 Multiple Lambda, user and POP interconnect failure

