1 AI Approaches to Network Fault Management Andrew Learn 29 Nov 2001.

Presentation transcript:

1 AI Approaches to Network Fault Management Andrew Learn 29 Nov 2001

2 Outline
Fault Management Process
AI Approaches
– Expert Systems
– Neural Networks
– Case-based Reasoning

3 Network Faults
Hardware
– Wear and tear
– Cut cables
– Improper installation
Software
– Incorrect design
– Bugs
– Incorrect data (e.g. routing tables)

4 Fault Management Process
1. Collect alarms
2. Filter and correlate alarms
3. Diagnose faults
4. Restoration and repair
5. Evaluate effectiveness

5 1. Collect Alarms
Types of alarms
– Physical: Failure in communication, e.g. loss of signal, CRC failure
– Logical: Statistical values exceed threshold, e.g. number of packets dropped
Communication with components
– Control protocol: Simple Network Management Protocol (SNMP)
– Data format: Management Information Base (MIB-II, 1990) has ~170 manageable objects
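A logical alarm of the kind listed above can be sketched as a threshold check on a polled counter. This is only an illustration: the `Alarm` record, the function name, and the threshold value are hypothetical, and counter wrap-around is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alarm:
    source: str   # network element that raised the alarm
    kind: str     # "physical" or "logical"
    detail: str


def check_dropped_packets(source: str, previous: int, current: int,
                          threshold: int) -> Optional[Alarm]:
    """Raise a logical alarm when the packets dropped between two polls
    exceed the threshold; return None otherwise."""
    dropped = current - previous
    if dropped > threshold:
        return Alarm(source, "logical",
                     f"{dropped} packets dropped (threshold {threshold})")
    return None


# Hypothetical polls of a dropped-packet counter on one router.
alarm = check_dropped_packets("router-1", previous=100, current=350, threshold=100)
```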

6 Sample MIB Entry
ipInReceives OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "The total number of input datagrams received from interfaces, including those received in error."
    ::= { ip 3 }
Sample SNMP “get” call
snmpget netdev-kbox.cc.cmu.edu public system.sysUpTime.0
Name: system.sysUpTime.0
Timeticks: ( ) 6:18:23
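SNMP reports `sysUpTime` as TimeTicks, which count hundredths of a second; the "get" output above shows that raw value next to a human-readable duration. A minimal sketch of the conversion follows. The function name is hypothetical, the tick value used is made up for illustration (it is not the value elided on the slide), and days are not split out.

```python
def format_timeticks(ticks: int) -> str:
    """Convert SNMP TimeTicks (hundredths of a second) to H:MM:SS."""
    seconds = ticks // 100            # drop the fractional part
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical value: 360000 ticks = 3600 seconds.
print(format_timeticks(360000))  # prints 1:00:00
```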

7 2. Filter and Correlate Alarms
Filter
– Eliminate redundant alarms
– Suppress noncritical alarms
– Inhibit low-priority alarms in presence of high-priority alarms
Correlate
– Analyze and interpret multiple alarms to assign new meaning (derived alarm)
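The filter and correlate steps above can be sketched in a few lines. This is one possible reading, not a reference design: the `Alarm` fields, the numeric priority scale, and the "all children of a parent are alarming" correlation rule are assumptions made for illustration.

```python
from collections import namedtuple

# priority: higher number = more urgent (assumed scale)
Alarm = namedtuple("Alarm", ["source", "kind", "priority"])


def filter_alarms(alarms, min_priority=1):
    """Deduplicate, suppress noncritical alarms, inhibit low-priority ones."""
    # 1. Eliminate redundant alarms (same source and kind), keeping the first.
    seen, unique = set(), []
    for a in alarms:
        key = (a.source, a.kind)
        if key not in seen:
            seen.add(key)
            unique.append(a)
    # 2. Suppress noncritical alarms below the minimum priority.
    kept = [a for a in unique if a.priority >= min_priority]
    # 3. Inhibit lower-priority alarms from a source that also has a
    #    higher-priority alarm.
    top = {}
    for a in kept:
        top[a.source] = max(top.get(a.source, a.priority), a.priority)
    return [a for a in kept if a.priority == top[a.source]]


def correlate(alarms, topology):
    """Emit a derived alarm for each parent whose children are all alarming.

    topology maps a parent element to the list of its child elements.
    """
    alarmed = {a.source for a in alarms}
    return [Alarm(parent, "derived", 3)
            for parent, children in topology.items()
            if children and all(c in alarmed for c in children)]
```

With two base stations that both report loss of signal under one controller, `correlate` would produce a single derived alarm on the controller in place of two unrelated station alarms.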

8 3. Diagnose Faults
May require additional tests/diagnostics on circuits or components
– Automated or manual
Analyze all info from alarms, tests, performance monitoring
Identify smallest system module that needs to be repaired or replaced

9 4. Restoration and Repair
Restoration: Continue service in presence of fault
– Switch over to spares
– Reroute around trouble spot
– Restore software or data from backup
Repair
– Replace parts
– Repair cables
– Debug software
Retest to verify fault is eliminated
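"Reroute around trouble spot" can be illustrated with a breadth-first search that ignores failed links. This is a sketch under simple assumptions: links are undirected and unweighted, there are no self-loops, and the node names are hypothetical. Real routing protocols are considerably more involved.

```python
from collections import deque


def reroute(links, failed, src, dst):
    """Return a shortest path from src to dst that avoids failed links,
    or None if no such path exists."""
    usable = {frozenset(l) for l in links} - {frozenset(f) for f in failed}
    neighbors = {}
    for link in usable:
        a, b = tuple(link)
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue, parent = deque([src]), {src: None}
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


links = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
print(reroute(links, failed=[("A", "B")], src="A", dst="C"))  # prints ['A', 'D', 'C']
```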

10 5. Evaluate Effectiveness
Questions to answer:
– How often do faults occur?
– How many faults affect service?
– How long is service interrupted?
– How long to repair?
Provides assessment of:
– Performance of fault management system
– Reliability of equipment
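The four questions above map directly onto simple statistics over a fault log. The sketch below assumes a hypothetical `FaultRecord` with times in hours since the start of the reporting period; the field and key names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class FaultRecord:
    detected_h: float        # when the fault was detected
    restored_h: float        # when service was restored
    repaired_h: float        # when the repair was verified
    service_affecting: bool


def evaluate(records, period_h):
    """Summarize fault frequency, service impact, outage, and repair time."""
    n = len(records)
    affecting = [r for r in records if r.service_affecting]
    return {
        # How often do faults occur?
        "faults_per_hour": n / period_h,
        # How many faults affect service?
        "service_affecting_fraction": len(affecting) / n if n else 0.0,
        # How long is service interrupted?
        "mean_outage_h": (sum(r.restored_h - r.detected_h for r in affecting)
                          / len(affecting)) if affecting else 0.0,
        # How long to repair?
        "mean_time_to_repair_h": (sum(r.repaired_h - r.detected_h for r in records)
                                  / n) if n else 0.0,
    }
```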

11 AI Approaches to Fault Management
Well-developed approach:
– Expert systems
New approaches:
– Neural networks
– Case-based reasoning
– Other

12 Why AI?
Need for intelligence
– Data analysis
– Pattern recognition
– Clustering and categorization
– Problem solving
Need for automation
– Manual analysis/solution takes time
– Limited manpower
– Limited expertise

13 Well-developed approach: Expert Systems
Expert systems = Rule-base + Working Memory
Three parts to rules:
1. Context trigger (when should rule be considered)
2. Condition (if X ...)
3. Conclusion (... then Y)
Used since the 1980s by major telecom companies
– Bell: Automated Cable Expertise (ACE) system
– GTE: Central Office Maintenance Printout Analysis & Suggestion System (COMPASS)
– AT&T: Network Management Expert System (NEMESYS)
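The rule-base plus working-memory structure can be sketched as a tiny forward-chaining loop. This is a toy illustration of the three-part rule format, not a description of how ACE, COMPASS, or NEMESYS work; the rule names and facts are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Rule:
    name: str
    context: str                            # when the rule should be considered
    condition: Callable[[Set[str]], bool]   # if X ...
    conclusion: str                         # ... then Y


def run(rules, working_memory: Set[str], context: str) -> Set[str]:
    """Fire rules for the given context until working memory stops changing."""
    facts = set(working_memory)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if (rule.context == context
                    and rule.conclusion not in facts
                    and rule.condition(facts)):
                facts.add(rule.conclusion)
                changed = True
    return facts


RULES = [
    Rule("link-down", "diagnosis",
         lambda f: {"loss_of_signal", "crc_failure"} <= f, "suspect_cable_fault"),
    Rule("dispatch", "diagnosis",
         lambda f: "suspect_cable_fault" in f, "dispatch_technician"),
    Rule("report", "evaluation",
         lambda f: "dispatch_technician" in f, "log_repair_time"),
]

facts = run(RULES, {"loss_of_signal", "crc_failure"}, context="diagnosis")
```

The "report" rule never fires here because its context trigger is "evaluation", which shows how the context part narrows the set of rules considered.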

14 Need for New Approaches
Weaknesses of expert systems
– Brittle in unforeseen situations
– Cannot learn from experience
– Hard to maintain (adding/deleting/modifying rules)
– Knowledge acquisition bottleneck
– Can’t handle incomplete or probabilistic data
Factors driving new approach
– Rapidly changing technology
– Dynamic network topology
– Network complexity
– Competition, demand for QoS

15 Neural Nets
Structure: input, hidden, output layers
Training
– Supervised: Input pattern & desired output
– Unsupervised: Clustering of similar inputs
[Diagram: input, hidden, and output layers connected by weights]
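The input, hidden, and output structure can be made concrete with a forward pass through one hidden layer. This sketch assumes sigmoid activations and fully connected layers, neither of which the slide specifies, and it covers only inference; training (how the weights are learned) is omitted.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then a sigmoid.

    weights holds one row of input weights per unit in the layer.
    """
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Propagate an input pattern through the hidden layer to the outputs."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Three alarm inputs, two hidden units, one output; weights are arbitrary.
out = forward([1.0, 0.0, 1.0],
              hidden_w=[[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], hidden_b=[0.0, 0.0],
              output_w=[[1.0, -1.0]], output_b=[0.0])
```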

16 Neural Nets
Advantages
– Pattern matching & generalization
– Fast & efficient
– Trainable
– Handles incomplete, ambiguous data
Disadvantages
– Black box
– Lack of training data

17 Neural Net Example
Example: Alarm correlation in cell phone networks (University of Hannover, Germany)
[Diagram labels: Mobile units; Base Stations (BS1, BS2); Microwave Links; Base Station Controller (BSC); Switching Centers; Maintenance Center; MC]

18 Neural Net Example
[Diagram labels: BS-1 alarms, BS-2 alarms, BSC alarms; Initial Cause: ML-1 fault, ML-2 fault]
Test Results:
– 94 alarms
– 99.76% correct classification with up to 25% noise

19 Case-Based Reasoning
Case-based reasoning = matching previous examples
– Case library: Set of previous faults, diagnoses, solutions
– Usually based on “trouble ticket” help-desk databases
Design considerations:
– What are key attributes of a case?
– What attributes will be used to index & access a case?

20 Case-Based Reasoning
Advantages
– Easier knowledge acquisition than expert systems
– Can learn by adding new cases
– Doesn’t require extensive maintenance
Disadvantages
– Requires time-consuming user interaction
– No help for first-time problems

21 Case-Based Reasoning Example
Case 134
Problem Type: Performance
Description: High error rate in comm between POA-SP & DF
No access: Intermittent
Retrieval: Case 103 [Similarity = 0.69]
Description: 64kb line from VendorX drops big datagrams.
Additional Info requested: Is there loss of big datagrams in ping test? (Result: Yes)
Cause: Link 34 inside Bldg 207 was defective
Solution: Vendor replaced cabling.
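Retrieval from a case library can be sketched as a weighted attribute match. The attribute names, weights, and stored cases below are hypothetical, and the scoring function is only one plausible choice; it is not the measure that produced the 0.69 similarity in the example above.

```python
def similarity(query: dict, case: dict, weights: dict) -> float:
    """Weighted fraction of indexed attributes on which query and case agree."""
    total = sum(weights.values())
    matched = sum(w for attr, w in weights.items()
                  if attr in query and query[attr] == case.get(attr))
    return matched / total if total else 0.0


def retrieve(query: dict, library: list, weights: dict):
    """Return (case, score) for the most similar case in a non-empty library."""
    return max(((c, similarity(query, c, weights)) for c in library),
               key=lambda pair: pair[1])


# Hypothetical index attributes and weights.
WEIGHTS = {"problem_type": 2, "symptom": 1, "access": 1}
LIBRARY = [
    {"id": "A", "problem_type": "performance", "symptom": "drops big datagrams",
     "access": "intermittent", "solution": "replace cabling"},
    {"id": "B", "problem_type": "connectivity", "symptom": "high error rate",
     "access": "none", "solution": "reset interface"},
]
query = {"problem_type": "performance", "symptom": "high error rate",
         "access": "intermittent"}
best, score = retrieve(query, LIBRARY, WEIGHTS)  # case "A", score 0.75
```

Choosing which attributes go into `WEIGHTS` is the indexing decision raised under the design considerations: attributes left out of the index cannot influence retrieval.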

22 Summary of 3 AI Methods
Expert systems
– If / then rules
– Well-developed technology
– Brittle, hard to maintain
Neural networks
– Output = weighted transform of inputs
– Fast pattern matching, robust to noise
– Black box, lack of training data
Case-based systems
– Trouble-ticket retrieval
– Easy to build, maintain
– Slower diagnosis, takes time to build

23 Other Approaches
Bayesian networks
– Model statistical probabilities and dependence of faults
Mobile intelligent agents
– Independent software agents cooperate to collect info, suggest solutions
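The smallest building block of a Bayesian network for fault diagnosis is a single fault node with one dependent alarm node, where Bayes' rule gives the probability of the fault once the alarm is seen. The sketch below covers only that two-node case, and all probabilities are made-up illustrative values.

```python
def posterior_fault(prior: float, p_alarm_given_fault: float,
                    p_alarm_given_ok: float) -> float:
    """P(fault | alarm) by Bayes' rule for one fault node and one alarm node.

    Assumes the alarm has non-zero overall probability.
    """
    evidence = (p_alarm_given_fault * prior
                + p_alarm_given_ok * (1.0 - prior))
    return p_alarm_given_fault * prior / evidence


# Hypothetical numbers: the fault is rare (1%), the alarm fires for 90% of
# faults and for 5% of healthy links.
p = posterior_fault(prior=0.01, p_alarm_given_fault=0.9, p_alarm_given_ok=0.05)
```

With these numbers the posterior stays well below one half, because false alarms on the many healthy links outnumber true alarms on the few faulty ones. A full network chains many such dependencies together.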

24 Future Trends
Proactive fault detection
– Recognizing trouble signs and taking corrective action before service degrades
Hybrid systems
– Multiple AI methods integrated
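One simple way to recognize a trouble sign before service degrades is to extrapolate a monitored metric and estimate when it will reach its alarm threshold. The sketch below fits a least-squares line through equally spaced samples; the function name and the sample values are hypothetical, and a linear trend is a strong simplifying assumption.

```python
def steps_until_threshold(samples, threshold):
    """Estimate how many more polling intervals until the metric reaches
    the threshold. Returns None if the trend is flat or falling, or if
    there are fewer than two samples."""
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing = (threshold - intercept) / slope  # sample index where the line hits the threshold
    return max(0.0, crossing - (n - 1))


# Hypothetical error counts per polling interval, rising steadily.
print(steps_until_threshold([1, 2, 3, 4], threshold=10))  # prints 6.0
```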