FMS: A COMPUTER NETWORK FAULT MANAGEMENT SYSTEM BASED ON THE OSI STANDARDS Norleyza Jailani and Ahmed Patel Malaysian Journal of Computer Science, Vol.

Slides:



Advertisements
Similar presentations
Chapter 19: Network Management Business Data Communications, 5e.
Advertisements

CIS : Network Management. Introduction Network, associated resources and distributed applications indispensable Complex systems —More things can.
Telecommunications Management /635 Network Management.
Chapter 13 Managing Computer and Data Resources. Introduction A disciplined, systematic approach is needed for management success Problem Management,
Chapter 19: Network Management Business Data Communications, 4e.
Distributed components
Network Operating Systems Users are aware of multiplicity of machines. Access to resources of various machines is done explicitly by: –Logging into the.
1 ITC242 – Introduction to Data Communications Week 12 Topic 18 Chapter 19 Network Management.
Protocols and the TCP/IP Suite
SIM5102 Software Evaluation
16: Distributed Systems1 DISTRIBUTED SYSTEM STRUCTURES NETWORK OPERATING SYSTEMS The users are aware of the physical structure of the network. Each site.
Network Management Management Tools –Desirable features Management Architectures Simple Network Management Protocol.
Page 1 Copyright © Alexander Allister Shvartsman CSE 6510 (461) Fall 2010 Selected Notes on Fault-Tolerance (12) Alexander A. Shvartsman Computer.
Remote Monitoring and Desktop Management Week-7. SNMP designed for management of a limited range of devices and a limited range of functions Monitoring.
FIREWALL TECHNOLOGIES Tahani al jehani. Firewall benefits  A firewall functions as a choke point – all traffic in and out must pass through this single.
Software Testing Verification and validation planning Software inspections Software Inspection vs. Testing Automated static analysis Cleanroom software.
Term 2, 2011 Week 3. CONTENTS The physical design of a network Network diagrams People who develop and support networks Developing a network Supporting.
Chapter 2: Computer-System Structures 2.1 Computer System Operation 2.5 Hardware Protection 2.6 Network Structure.
1 Network Monitoring Mi-Jung Choi Dept. of Computer Science KNU
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 22 Slide 1 Software Verification, Validation and Testing.
CE Operating Systems Lecture 3 Overview of OS functions and structure.
Management Platforms. SMFA In TMN the intent is to define a realitivly common set of what can be supported by CMISE network elements and managed via a.
Distributed System Concepts and Architectures 2.3 Services Fall 2011 Student: Fan Bai
Distributed System Concepts and Architectures Services
An Introduction to Networking
Distributed System Services Fall 2008 Siva Josyula
Open System Interconnection Describe how information from a software application in one computer moves through a network medium to a software application.
Operating Systems Distributed-System Structures. Topics –Network-Operating Systems –Distributed-Operating Systems –Remote Services –Robustness –Design.
Manajemen Jaringan, Sukiswo ST, MT 1 Network Monitoring Sukiswo
Manajemen Jaringan, Sukiswo ST, MT 1 OSI Management Framework: Overview Sukiswo
PART1 Data collection methodology and NM paradigms 1.
Chapter 25 – Configuration Management 1Chapter 25 Configuration management.
OSI Model OSI MODEL. Communication Architecture Strategy for connecting host computers and other communicating equipment. Defines necessary elements for.
OSI Model OSI MODEL.
SQL Database Management
Chapter 19: Network Management
Chapter 2: Computer-System Structures
Credits: 3 CIE: 50 Marks SEE:100 Marks Lab: Embedded and IOT Lab
Software Architecture in Practice
THE OSI MODEL By: Omari Dasent.
Instructor Materials Chapter 9: Testing and Troubleshooting
RMON.
Fault Tolerance & Reliability CDA 5140 Spring 2006
How SCADA Systems Work?.
Lecturer, Department of Computer Application
IEEE Std 1074: Standard for Software Lifecycle
DEPARTMENT OF COMPUTER SCIENCE
CHAPTER 3 Architectures for Distributed Systems
Network Administration CNET-443
Chapter 16: Distributed System Structures
Protocols and the TCP/IP Suite
Ch 15 –part 3 -design evaluation
Lecture 09:Software Testing
Fault Tolerance Distributed Web-based Systems
Fundamentals of Network Management
Software Architecture
Introduction to Fault Tolerance
OSI Model OSI MODEL.
Chapter 2: Operating-System Structures
System Testing.
Protocols and the TCP/IP Suite
Chapter 4 Network Management Standards and Models
LO1 – Understand Computer Hardware
Chapter 2: Computer-System Structures
Chapter 2: Computer-System Structures
Chapter 2: Operating-System Structures
Chapter 4 Network Management Standards and Models
Chapter 13: I/O Systems “The two main jobs of a computer are I/O and [CPU] processing. In many cases, the main job is I/O, and the [CPU] processing is.
Infokall Enterprise Solutions
Standards, Models and Language
Presentation transcript:

FMS: A COMPUTER NETWORK FAULT MANAGEMENT SYSTEM BASED ON THE OSI STANDARDS Norleyza Jailani and Ahmed Patel Malaysian Journal of Computer Science, Vol. 11 No. 1, June 1998, pp

Understanding the need for Fault Management Fault management is usually mentioned as the first concern in network management. Its main role is to ensure high availability of a network Hence, involving a procedure to anticipate and avoid network failures In the case where a failure cannot be avoided, the necessary steps are required to contain the damage and resolve the effects on the network

Defining fault management

From the 1 st week : Fault Management The goal of fault management detect, log, notify users of, and (to the extent possible) automatically fix network problems to keep the network running effectively. Because faults can cause downtime or unacceptable network degradation, fault management is perhaps the most widely implemented of the ISO network management elements.

From the 1 st week : Fault Management  Fault management involves first. First, determining symptoms and isolating the problem Second, Then the problem is fixed and the solution is tested on all-important subsystems Finally, The detection and resolution of the problem is recorded.

FAULT MANAGEMENT – Defining the terms errors Direct or indirect cause fault Overall result failure manifestation A fault is a software or hardware defect in a system that disrupts communication or degrades performance An error is the incorrect output of a system component. If a component presents an error, we say the component fails  This is a component failure

Fault symptoms can be associated to four types of error An output with an expected value comes either too early or too late timing error An output with an unexpected value within the specified time interval timely error An output with an unexpected value outside the specified time interval  response produced commissi on error These symptoms may take one of the following forms : An output with an unexpected value outside the specified time interval  no response is produced omission error

network faults can be generally divided according to their duration into three groups: permanent, intermittent and transient: 1 st : A permanent fault will exist in the network until a repair action has been taken. This results in permanent maximum degradation of the service 2 nd :An intermittent fault occurs in a discontinuous and a periodic manner. Its outcome will be failures in current processes. This implies maximum degradation of the service level for a short period of time. 3 rd :A transient fault will momentarily cause a minor degradation of the service.

FAULT: For faults of the first type, it will cause an event report to be sent out and changes made in the network configuration to prohibit further utilisation of this resource. For a fault of the second type, the severity of the fault may transfer from being intermittent to being permanent if an excessive occurrence of this kind of fault becomes significant. For a fault of the third type, it will usually be masked by the error recovery procedures of network protocols and therefore may not be observed by the users.

Fault Management Process Various Process scenarios: trouble ticketing loggingreportingMonitoringrecovery activities diagnosiscorrelating filtering Errror detectionIsolationAdministration

Typical fault  Observable fault  Unobservable fault  For example, the existence of a deadlock between co-operating distributed processes may not be observable locally.  Other faults may not be observable because the vendor equipment is not instrumented to record the occurrence of a fault  Too many related observation:  A single failure can affect many active communication paths. The failure of a WAN back-bone will affect all active communication between the token-ring stations and stations on the Ethernet LANs, as well as voice communication between the PBXs.  Propagation of failures  a failure in one layer of the communications architecture can cause degradation or failures in all the dependent higher layers.

FMS: A FAULT MANAGEMENT SYSTEM BASED ON THE OSI STANDARDS FMA deals with event reports from the moment they are received until recovery action takes place. Fault management procedures are realised through management function events handling, event correlation, fault diagnosis and fault recovery. It may be invoked by the user in an administrative role. In such cases, when the network manager’s intervention is needed to perform a recovery action, output is given in the form of an advice or a suggestion. Otherwise, fault recovery is carried out automatically.

FMS: A FAULT MANAGEMENT SYSTEM BASED ON THE OSI STANDARDS LMA This logger accomplishes the task of maintaining a log of event reports. It is a tool for trend analysis and reporting, and provides library for future diagnosis of the same or similar problems. LMA is realised in three separate programs. Facilities for logging and setting log attribute values are realised by program LogSet. Facilities for deleting log and logRecord are realised by program DelLogRec. The facility for reviewing logRecords is realised by program RevRec.

FMS: A FAULT MANAGEMENT SYSTEM BASED ON THE OSI STANDARDS DTA This tool exists for the convenience of the user with the responsibility of a network administrator or technician. It allows the user to perform diagnostic tests on a particular network resource. Its purpose is either to check that the diagnostic tests provided function correctly or to perform a diagnostic routine as a result of reviewing the history of event logs recorded by a log at the SMA. DTA comprises two applications: ViewTest and ScheExec. View Test is the program that allows the list of existing diagnostic tests to be displayed. ScheExec allows diagnostic tests to be scheduled and executed.

FMS functional components

Since the users of the FMS can have different responsibilities such as a network administrator or a network operator, the user interface provides a focused interface to the various fault management applications intended for different users. The network administrator is allowed to initiate or terminate the FMA.

FMS functional components There are three groups of Management Application, FMA, LMA and DTA. All of these applications are in executable form. The Management Application uses the OSI System Manager to send management operations to the OSI System Management Agent.

FMS functional components The OSI System Manager is an application which has responsibility for issues management operation requests (action) and receives notifications (events) on behalf of the Management application function. Five OSI System Managers have been implemented which perform the M-SET primitive, M-GET primitive, M-EVENTREPORT primitive, M-SET primitive for logging event records at the OSI System Management Agent and MDELETE primitive for deleting existing event records at the OSI System Management Agent respectively.

FMS functional components The access to network elements defined for the MAs is through the agent. Access to a network element involves collecting its statistical information and manipulating it for management activities. The SMA is realised as a single UNIX process. SMA is the OSI agent.

FMS functional components Management Information Base (MIB) is implemented for each participating protocol i.e. X.25 MIB. Facilities include the ability to both read and write specific managed object attributes, ability to add and remove elements to/from “set-valued” attributes, spontaneous generation of event reports and management of associations with remote manager processes. The MIB communicates with the FMS through the ISO CMIS and CMIP.

FMS functional components It must be tailored not only to the real resource type but also to the means the of how the real resource is used to present management information. This is a part specific to a particular managed object representing a real resource. These real resources may reside in the operating system’s kernel, on communication boards, in user- space processes or even at remote systems which are managed via a proxy management.