Text TCS Internal October 17, 2014 Introduction to Ethernet OAM
Document Name CONFIDENTIAL
Course Objectives
–To understand the basics of Ethernet OAM
Course Prerequisites
–Basic understanding of Ethernet transport
–Fault and Performance Management
Ethernet OAM
Ethernet OAM is the protocol for monitoring and troubleshooting Ethernet metropolitan area networks (MANs) and Ethernet WANs. It relies on a new, optional sublayer in the data link layer of the Open Systems Interconnection (OSI) model.
Ethernet OAM deals with the Fault and Performance management of Carrier Ethernet services.
There are two flavours of Ethernet fault management:
–“Link OAM”, defined by IEEE 802.3ah
–“Service OAM”, for which there are two standards, IEEE 802.1ag and ITU-T Y.1731
The Y.1731 standard additionally provides capabilities for performance management.
IEEE 802.3ah Link OAM
Defined by IEEE 802.3ah (Ethernet in the First Mile).
The functionality of Link OAM can be summarized under the following categories:
–Discovery: Discovery is the mechanism to detect the presence of an OAM sublayer on the remote device. During the discovery process, information about OAM entities, capabilities and configuration is exchanged.
–Link monitoring: This process is used to detect link faults and to provide information about the number of frame errors and coding symbol errors.
–Remote fault detection: Provides a mechanism for an OAM entity to convey error conditions to its peer via a flag in the OAMPDUs.
–Remote loopback: This mechanism is used to troubleshoot networks and to isolate problem segments in a large network by sending test segments.
Link OAM – Discovery Process
This is the first phase of EFM-OAM. In this phase, EFM-OAM identifies network devices along with their OAM capabilities.
The Discovery process relies on Information OAMPDUs. During discovery, the following information is advertised through the TLVs within periodic Information OAMPDUs:
–OAM configuration (capabilities): Advertises the capabilities of the local OAM entity. Using this information, a peer can determine what functions are supported and accessible (e.g. loopback capability).
–OAM mode: This is conveyed to the remote OAM entity. The mode can be either active or passive, and can also be used to determine a device’s functionality.
–OAMPDU configuration: This includes the maximum OAMPDU size for delivery. In combination with the rate limit of ten frames/sec, this information can be used to limit the bandwidth allocated to OAM traffic.
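The bandwidth bound mentioned under OAMPDU configuration is simple arithmetic: maximum OAMPDU size times the ten frames/sec limit. A minimal sketch follows; the function name is hypothetical, and the figure counts frame octets only (preamble and inter-frame gap are ignored, which is an assumption of this sketch).

```python
# Rate limit taken from the slide text: at most ten OAMPDUs per second.
MAX_OAMPDUS_PER_SECOND = 10


def max_oam_bandwidth_bps(max_oampdu_size_octets: int) -> int:
    """Upper bound on OAM traffic in bits per second.

    Assumption: only the octets of the OAMPDU frame itself are counted.
    """
    if max_oampdu_size_octets <= 0:
        raise ValueError("OAMPDU size must be positive")
    return max_oampdu_size_octets * 8 * MAX_OAMPDUS_PER_SECOND


# Example: a peer advertising a 1518-octet maximum OAMPDU size.
print(max_oam_bandwidth_bps(1518), "bit/s")
```

A smaller advertised maximum size lowers the bound proportionally, which is why the two values together cap the OAM overhead on the link.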
Link OAM – Link Monitoring Process
The Link Monitoring Process is used for detecting and indicating link faults under a variety of circumstances.
Link monitoring uses the Event Notification OAMPDU, and sends events to the remote OAM entity when problems are detected on the link.
The error events defined in the standard are:
–Errored Symbol Period (errored symbols per second): the number of symbol errors that occurred during a specified period exceeded a threshold. These are coding symbol errors (for example, a violation of 4B/5B coding).
–Errored Frame (errored frames per second): the number of frame errors detected during a specified period exceeded a threshold.
–Errored Frame Period (errored frames per N frames): the number of frame errors within the last N frames has exceeded a threshold.
–Errored Frame Seconds Summary (errored secs per M seconds): the number of errored seconds (one-second intervals with at least one frame error) among the last M seconds has exceeded a threshold.
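As an illustration of the last event, the sketch below keeps a sliding window of the last M seconds and flags the event when the count of errored seconds exceeds the threshold. The class and method names are hypothetical, and "exceeded" is read as strictly greater than the threshold, which is an assumption about the comparison.

```python
from collections import deque


class ErroredFrameSecondsSummary:
    """Sliding-window detector for the Errored Frame Seconds Summary event."""

    def __init__(self, window_seconds: int, threshold: int):
        # Each entry is 1 for an errored second, 0 otherwise; the deque
        # drops the oldest second automatically once M entries are stored.
        self.window = deque(maxlen=window_seconds)
        self.threshold = threshold

    def record_second(self, frame_errors: int) -> bool:
        """Record one second of observation; return True if the event fires."""
        self.window.append(1 if frame_errors > 0 else 0)
        return sum(self.window) > self.threshold


# Example: M = 5 seconds, threshold = 2 errored seconds.
monitor = ErroredFrameSecondsSummary(window_seconds=5, threshold=2)
for errors in [1, 0, 3, 0, 2]:
    fired = monitor.record_second(errors)
print("event raised after fifth second:", fired)
```

The other three events follow the same pattern with a different unit for the window (symbols, seconds, or frames) and a different quantity being counted.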
Link OAM – Remote Failure Indication Process
Faults in Ethernet that are caused by slowly deteriorating quality are more difficult to detect than completely disconnected links.
A flag in the OAMPDU allows an OAM entity to send failure conditions to its peer.
The failure conditions are defined as follows:
–Link Fault: The Link Fault condition is detected when the receiver loses the signal. This condition is sent once per second in the Information OAMPDU.
–Dying Gasp: This condition is detected when the receiver goes down. The Dying Gasp condition is considered unrecoverable, for example a device power down (incidental or deliberate).
–Critical Event: When a critical event occurs, the device is unavailable as a result of a malfunction and must be restarted by the user. Critical events can be sent immediately and continually.
Link OAM – Remote Loopback Process
An OAM entity can put its remote entity into loopback mode using a loopback control OAMPDU. This helps users ensure the quality of links during installation or when troubleshooting.
In loopback mode, each frame received is transmitted back on the same port, except for OAMPDUs and pause frames.
The periodic exchange of OAMPDUs must continue while in the loopback state to maintain the OAM session.
The loopback command is acknowledged by responding with an Information OAMPDU with the loopback state indicated in the state field.
Loopback testing makes it possible to estimate whether a network segment can satisfy an SLA.
IEEE 802.3ah Link OAM
Figure: Example deployment of 802.3ah OAM
IEEE 802.1ag – Connectivity Fault Management
Defined by the IEEE 802.1ag standard.
Specifies protocols, procedures, and managed objects to support transport fault management.
End-to-end service-level OAM technology.
These protocols allow discovery and verification of the path, through bridges and LANs, taken by frames addressed to and from specified network users, as well as detection and isolation of a connectivity fault to a specific bridge or LAN.
Maintenance Domain (MD)
The network or the part of the network for which faults in connectivity can be managed.
A domain is owned and operated by a single entity and defined by the set of ports internal to it and at its boundary.
A unique maintenance level in the range of 0 to 7 is assigned to each domain by a network administrator.
The hierarchical relationship of domains parallels the structure of customer, service provider, and operator. The larger the domain, the higher the level value.
Domains should not intersect, because intersecting would mean management by more than one entity, which is not allowed.
Domains may nest or touch, but when two domains nest, the outer domain must have a higher maintenance level than the domain nested within it.
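The level rules above lend themselves to a small configuration check. The sketch below is illustrative only: the Domain type, its fields, and the example domain names are hypothetical, and only the two rules stated on this slide (level range 0 to 7, outer domain strictly higher than a nested one) are checked.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Domain:
    name: str
    level: int
    parent: Optional["Domain"] = None  # the enclosing (outer) domain, if nested


def validate(domain: Domain) -> List[str]:
    """Return a list of rule violations for one domain (empty if valid)."""
    errors = []
    if not 0 <= domain.level <= 7:
        errors.append(f"{domain.name}: level {domain.level} is outside 0..7")
    if domain.parent is not None and domain.parent.level <= domain.level:
        errors.append(
            f"{domain.name}: outer domain {domain.parent.name} must have a higher level"
        )
    return errors


customer = Domain("customer", 7)
provider = Domain("provider", 4, parent=customer)
operator = Domain("operator", 5, parent=provider)  # misconfigured on purpose
for d in (customer, provider, operator):
    print(d.name, validate(d) or "ok")
```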
Maintenance Association (MA)
A set of MEPs, each configured with the same MAID and MD Level, established to verify the integrity of a single service instance.
Maintenance Entity (ME): A point-to-point relationship between two MEPs within a single MA.
Maintenance Point (MP)
A maintenance point is a demarcation point on an interface that participates in CFM within a maintenance domain.
Maintenance points on device ports act as filters that confine CFM frames within the bounds of a domain by dropping frames that do not belong to the correct level.
An MP can be either a Maintenance association End Point (MEP) or a Maintenance domain Intermediate Point (MIP).
Maintenance association End Point (MEP)
A MEP is an end point of a single MA and is an end point of a separate Maintenance Entity for each of the other MEPs in the same MA.
A MEP is associated with exactly one MA.
Down MEP:
–A MEP residing in a Bridge that receives CFM PDUs from, and transmits them towards, the direction of the LAN.
Up MEP:
–A MEP residing in a Bridge that transmits CFM PDUs towards, and receives them from, the direction of the Bridge Relay Entity.
Maintenance domain Intermediate Point (MIP)
MIPs are CFM entities internal to an MD, not at its boundary.
A MIP is a CFM entity consisting of two MIP Half Functions (MHFs).
An MHF is a CFM entity, associated with a single Maintenance Domain, and thus with a single MD Level and a set of VIDs, that can generate CFM PDUs, but only in response to received CFM PDUs.
Down MHF:
–An MHF residing in a Bridge that receives CFM PDUs from, and transmits them towards, the direction of the LAN.
Up MHF:
–An MHF residing in a Bridge that transmits CFM PDUs towards, and receives them from, the direction of the Bridge Relay Entity.
MEPs, MIPs
Figure: MEPs, MIPs, and MD Levels.
CFM Messages
CFM uses standard Ethernet frames. CFM frames are distinguishable by EtherType and reserved multicast MAC address.
CFM frames are sourced, terminated, processed, and relayed by bridges.
Bridges that cannot interpret CFM messages forward them as normal data frames.
Three types of CFM messages are supported:
–Continuity Check
–Loopback
–Linktrace
Common CFM Header format:
CFM Messages - contd
MD Level
–(most-significant 3 bits of the first octet) Integer identifying the Maintenance Domain Level (MD Level) of the packet.
–Higher numbers correspond to higher Maintenance Associations, those with the greatest physical reach, with the highest values for customers’ CFM packets.
–Lower numbers correspond to lower Maintenance Associations, those with more limited physical reach, with the lowest values for single Bridges or physical links.
Version
–(least-significant 5 bits of the first octet) The protocol version number, always 0.
OpCode
–(1 octet) The OpCode field specifies the format and meaning of the remainder of the CFM PDU.
Flags
–(1 octet) The use of the Flags field is defined separately for each OpCode.
First TLV Offset
–(1 octet) The offset, starting from the first octet following the First TLV Offset field, up to the first TLV in the CFM PDU.
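The four-octet common header described above can be decoded with a few bit operations. This is a minimal sketch based only on the field widths listed on this slide; the type and function names are hypothetical, and the example octets are arbitrary illustrative values rather than a captured frame.

```python
import struct
from typing import NamedTuple


class CfmHeader(NamedTuple):
    md_level: int          # most-significant 3 bits of octet 1
    version: int           # least-significant 5 bits of octet 1
    opcode: int            # octet 2
    flags: int             # octet 3
    first_tlv_offset: int  # octet 4


def parse_cfm_header(pdu: bytes) -> CfmHeader:
    """Decode the common CFM header from the start of a CFM PDU."""
    if len(pdu) < 4:
        raise ValueError("CFM PDU shorter than the 4-octet common header")
    first, opcode, flags, offset = struct.unpack("!BBBB", pdu[:4])
    return CfmHeader(first >> 5, first & 0x1F, opcode, flags, offset)


def first_tlv_index(header: CfmHeader) -> int:
    """Index of the first TLV within the PDU.

    The offset is counted from the octet following the First TLV Offset
    field, which is octet index 4 of the PDU.
    """
    return 4 + header.first_tlv_offset


header = parse_cfm_header(bytes([0xA0, 0x01, 0x04, 70]))
print(header, "first TLV at octet index", first_tlv_index(header))
```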
Continuity Check Protocol
CFM continuity check messages (CCMs) are multicast heartbeat messages exchanged periodically among MEPs. They allow MEPs to discover other MEPs within a domain and allow MIPs to discover MEPs.
The Continuity Check Message provides a means to detect connectivity failures in an MA.
CFM CCMs have the following characteristics:
–Transmitted at a configurable periodic interval by MEPs
–Terminated by remote MEPs at the same maintenance level
–Catalogued by MIPs at the same or higher maintenance level
–Unidirectional and do not solicit a response
–Carry the status of the port on which the MEP is configured
A connectivity failure is defined as either:
–Inability of a MEP to receive three consecutive CCMs from any one of the other MEPs in its MA, indicating either a MEP failure or a network failure;
–Reception by a MEP of a CCM with an incorrect transmission interval, indicating a configuration error;
–Reception by a MEP of a CCM with an incorrect MEPID or MAID, indicating a configuration error or a cross connect error;
–Reception by a MEP of a CCM with an MD Level lower than that of the MEP, indicating a configuration error or a cross connect error; or
–Reception by a MEP of a CCM containing a Port Status TLV or Interface Status TLV indicating a failed Bridge Port or aggregated port.
Continuity Check Protocol - contd
Figure: CCM Message format
Continuity Check Protocol - contd
Flags
–(1 octet) The Flags field of the Common CFM Header is split into three parts for the CCM as follows:
–RDI field
–Reserved field
–CCM Interval field
Sequence Number
Maintenance association End Point Identifier
–(2 octets) Contains an integer value and specifies from which MEP the CCM was transmitted.
Maintenance Association Identifier
–(48 octets) This field contains the MAID of the transmitting MEP.
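The three-part split of the CCM Flags octet can be sketched as below. The slide does not give the bit widths; the layout used here (RDI in the most-significant bit, four reserved bits, CCM Interval in the three least-significant bits) and the interval code table are taken from IEEE 802.1ag as an assumption of this sketch, and the function and type names are hypothetical.

```python
from typing import NamedTuple, Optional

# Assumed 802.1ag encoding of the CCM Interval field, in seconds.
# Code 0 is treated as invalid and therefore has no entry.
CCM_INTERVAL_SECONDS = {
    1: 1 / 300,  # 3 1/3 ms
    2: 0.01,
    3: 0.1,
    4: 1.0,
    5: 10.0,
    6: 60.0,
    7: 600.0,
}


class CcmFlags(NamedTuple):
    rdi: bool
    reserved: int
    interval_code: int
    interval_seconds: Optional[float]


def parse_ccm_flags(flags: int) -> CcmFlags:
    """Split the CCM Flags octet into RDI, Reserved and CCM Interval."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("Flags must fit in one octet")
    code = flags & 0x07
    return CcmFlags(
        rdi=bool(flags & 0x80),
        reserved=(flags >> 3) & 0x0F,
        interval_code=code,
        interval_seconds=CCM_INTERVAL_SECONDS.get(code),
    )


print(parse_ccm_flags(0x84))
```

A receiving MEP would compare the decoded interval with its own configured interval; a mismatch is one of the configuration errors listed for the Continuity Check Protocol.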
Loopback Protocol
A unicast Loopback Message (LBM) is used for fault verification and isolation.
To verify the connectivity between MEPs and MIPs, a MEP can be instructed by a system administrator to issue one or more LBMs.
The LBM is initiated by a MEP with specified destination address, priority, and drop eligible parameters. The destination address is the Individual MAC address of another MP within the same Maintenance Association as the transmitting MEP.
The receiving MP responds to the LBM with a unicast Loopback Reply (LBR).
Loopback Protocol - contd
Figure: LBM, LBR Message format
Loopback Protocol - contd
Flags
–(1 octet) In an LBM, the Flags field of the Common CFM Header is set to 0 by the transmitting MP, and is not examined by the receiving MP.
Loopback Transaction Identifier
Linktrace Protocol
An LTM is transmitted by a MEP in order to perform path discovery and fault isolation. The LTM carries a target MAC address as part of its payload.
It is carried in a multicast frame, with a destination address according to the MD Level of the transmitting MEP, and is relayed as such through the Bridged Network until it reaches an MP at the appropriate MD Level.
That MP intercepts the LTM and determines whether its Bridge’s MAC Relay Entity would forward an ordinary data frame with the specified target MAC address to a single egress Bridge Port, or would filter or flood it.
If the single egress port is found, or if the receiving MP is the terminating MP, the Linktrace Responder sends a unicast Linktrace Reply (LTR) to the originator of the LTM, whose MAC address was also carried as payload in the LTM.
In addition, if the MP through which the LTM was received was an MHF, the Linktrace Responder forwards an altered version of the LTM out of a single Bridge Port in the direction of the target MAC address.
The MEP Linktrace Initiator that originated the initial LTM collects the LTRs. These provide sufficient information to construct the sequence of MPs that would be traversed by a data frame sent to the target MAC address.
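The last step, constructing the sequence of MPs from the collected LTRs, can be sketched by ordering replies on their Reply TTL: each responder decrements the TTL by one, so MPs closer to the initiator report higher values. The simulation below is a simplified model, not the standard's state machines; the hop labels, type names and function names are hypothetical, and forwarding decisions (single egress port, filter, flood) are not modelled.

```python
from typing import List, NamedTuple


class Ltr(NamedTuple):
    responder: str  # hypothetical label for the replying MP
    reply_ttl: int  # one less than the TTL of the LTM that triggered it


def simulate_linktrace(hops: List[str], initial_ttl: int) -> List[Ltr]:
    """Model the LTRs produced as an LTM travels along an ordered list of MPs."""
    replies = []
    ttl = initial_ttl
    for hop in hops:
        if ttl == 0:
            break  # an LTM arriving with TTL 0 triggers no LTR
        reply_ttl = ttl - 1
        replies.append(Ltr(hop, reply_ttl))
        ttl = reply_ttl  # the forwarded LTM carries the decremented TTL
    return replies


def order_path(ltrs: List[Ltr]) -> List[str]:
    """Rebuild the path from LTRs that may have arrived in any order."""
    return [r.responder for r in sorted(ltrs, key=lambda r: r.reply_ttl, reverse=True)]


replies = simulate_linktrace(["mip-a", "mip-b", "mep-z"], initial_ttl=64)
print(order_path(list(reversed(replies))))
```

A gap in the ordered Reply TTL values would point at the hop where the trace stopped, which is how the function supports fault isolation.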
Linktrace Protocol - contd
Figure: LTM Message format
Linktrace Protocol - contd
Flags
–(1 octet) In the LTM, the Flags field of the Common CFM Header specifies the following options:
–UseFDBonly
–Reserved
LTM Transaction Identifier
LTM TTL
–(1 octet) The number of hops remaining to this LTM.
–Decremented by 1 by each Linktrace Responder that handles the LTM.
–One less than this value is returned in the LTR.
Original MAC Address
–(6 octets) The MAC address of the MEP that originated the LTM.
–This can be different from the source MAC address of an LTM because each MIP along the path puts its own MAC address in the source MAC address field, while retaining the Original MAC Address field.
Target MAC Address
–(6 octets) Specifies an Individual MAC address, the path to which the LTM is intended to trace.
Linktrace Protocol - contd
Figure: LTR Message format
Linktrace Protocol - contd
Flags
–(1 octet) In the LTR, the Flags field of the Common CFM Header specifies the following options:
–UseFDBonly
–FwdYes
–TerminalMEP
–Reserved
LTR Transaction Identifier
Reply TTL
–(1 octet) One less than the value from the LTM TTL field in the LTM that triggered the transmission of this LTR.
–If the LTM TTL field contained a 0, no LTR is transmitted.
Relay Action
–RlyHit
–RlyFDB
–RlyMPDB
Y.1731 SOAM
ITU-T Y.1731 offers Fault and Performance Ethernet OAM capabilities.
Terminology differences between CFM and Y.1731:
–MEG (Y.1731) corresponds to MA (CFM)
–MEG ID (Y.1731) corresponds to MAID (CFM)
–MEG Level (Y.1731) corresponds to MD Level (CFM)
Y.1731 OAM Functions for FM
Ethernet continuity check (ETH-CC)
–The Ethernet continuity check function (ETH-CC) is used for proactive OAM.
–It is used to detect loss of continuity (LOC) between any pair of MEPs in a MEG.
–ETH-CC also allows detection of unintended connectivity between two MEGs (mismerge), unintended connectivity within the MEG with an unexpected MEP (unexpected MEP), and other defect conditions (e.g., unexpected MEG level, unexpected period, etc.).
–When a MEP does not receive ETH-CC information from a peer MEP, in the list of peer MEPs, within an interval of 3.5 times the ETH-CC transmission period, it detects loss of continuity to that peer MEP.
Ethernet loopback (ETH-LB)
–The Ethernet loopback function (ETH-LB) is used to verify connectivity of a MEP with a MIP or peer MEP(s). There are two ETH-LB types:
–Unicast ETH-LB.
–Multicast ETH-LB.
Ethernet link trace (ETH-LT)
–The Ethernet link trace function (ETH-LT) is an on-demand OAM function that can be used for fault localization and also for retrieving the adjacency relationship between a MEP and a remote MEP or MIP.
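The loss-of-continuity rule for ETH-CC (no ETH-CC information from a peer within 3.5 times the transmission period) can be sketched with a per-peer timestamp table. The class and method names are hypothetical, timestamps are passed in explicitly so the example stays deterministic, and frames from MEPs that are not in the peer list are simply ignored here rather than raised as an unexpected-MEP defect.

```python
from typing import Dict, Iterable, List


class PeerMepMonitor:
    """Tracks the last ETH-CC reception per peer MEP and reports LOC."""

    def __init__(self, period_seconds: float, peer_mep_ids: Iterable[int],
                 start_time: float = 0.0):
        self.loc_interval = 3.5 * period_seconds
        self.last_seen: Dict[int, float] = {m: start_time for m in peer_mep_ids}

    def on_ccm(self, mep_id: int, now: float) -> None:
        """Record reception of ETH-CC information from a configured peer."""
        if mep_id in self.last_seen:
            self.last_seen[mep_id] = now

    def peers_in_loc(self, now: float) -> List[int]:
        """Peers from which nothing has been received for the LOC interval."""
        return sorted(m for m, t in self.last_seen.items()
                      if now - t >= self.loc_interval)


monitor = PeerMepMonitor(period_seconds=1.0, peer_mep_ids=[101, 102])
monitor.on_ccm(101, now=1.0)
monitor.on_ccm(102, now=1.0)
monitor.on_ccm(101, now=4.0)  # peer 102 has gone silent
print(monitor.peers_in_loc(now=4.6))
```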
Y.1731 OAM Functions for FM
Ethernet alarm indication signal (ETH-AIS)
–The Ethernet alarm indication signal function (ETH-AIS) is used to suppress alarms following detection of defect conditions at the server (sub) layer.
Ethernet remote defect indication (ETH-RDI)
–The Ethernet remote defect indication function (ETH-RDI) can be used by a MEP to communicate to its peer MEPs that a defect condition has been encountered.
–ETH-RDI is used only when ETH-CC transmission is enabled.
Ethernet locked signal (ETH-LCK)
–The Ethernet locked signal function (ETH-LCK) is used to communicate the administrative locking of a server (sub) layer MEP and consequential interruption of data traffic forwarding towards the MEP expecting this traffic.
Ethernet test signal (ETH-Test)
–The Ethernet test signal function (ETH-Test) is used to perform one-way on-demand in-service or out-of-service diagnostic tests.
–This includes verifying bandwidth throughput, frame loss, bit errors, etc.
Ethernet automatic protection switching (ETH-APS)
–The Ethernet automatic protection switching function (ETH-APS) is used to control protection switching operations to enhance reliability.
Ethernet maintenance communication channel (ETH-MCC)
–The Ethernet maintenance communication channel function (ETH-MCC) provides a maintenance communication channel between a pair of MEPs.
–ETH-MCC can be used to perform remote management.
Y.1731 OAM Functions for FM
Ethernet experimental OAM (ETH-EXP)
–ETH-EXP is used for experimental OAM functionality which can be used within an administrative domain on a temporary basis.
Ethernet vendor-specific OAM (ETH-VSP)
–ETH-VSP is used for vendor-specific OAM functionality which may be used by a vendor across its equipment.
Ethernet client signal fail (ETH-CSF)
–The Ethernet client signal fail function (ETH-CSF) is used by a MEP to propagate to a peer MEP the detection of a failure or defect event in an Ethernet client signal when the client itself does not support appropriate fault or defect detection or propagation mechanisms, such as ETH-CC or ETH-AIS.
–ETH-CSF is only applicable to point-to-point Ethernet transport applications.
Y.1731 OAM Functions for PM
Frame loss measurement (ETH-LM)
–ETH-LM is used to collect counter values applicable for ingress and egress service frames where the counters maintain a count of transmitted and received data frames between a pair of MEPs.
–ETH-LM is performed by sending frames with ETH-LM information to a peer MEP and similarly receiving frames with ETH-LM information from the peer MEP.
–Each MEP performs frame loss measurements which contribute to unavailable time.
–ETH-LM can be performed in two ways:
–Dual-ended ETH-LM.
–Single-ended ETH-LM.
Frame delay measurement (ETH-DM)
–ETH-DM can be used for on-demand or proactive OAM to measure frame delay and frame delay variation.
–Frame delay and frame delay variation measurements are performed by sending periodic frames with ETH-DM information to the peer MEP and receiving frames with ETH-DM information from the peer MEP during the proactive measurement session and/or the diagnostic interval.
–Each MEP may perform frame delay and frame delay variation measurement.
–ETH-DM can be performed in two ways:
–One-way ETH-DM
–Two-way ETH-DM
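The arithmetic behind these measurements can be sketched as follows. Frame loss is taken as the difference between frames transmitted and frames received over the same interval, computed from two counter samples. Two-way delay subtracts the responder's processing time from the round-trip time, which is how Y.1731 two-way ETH-DM is commonly described. All function and parameter names are hypothetical, counter wrap-around is not handled, and timestamps are plain integers in microseconds for the example.

```python
from typing import List


def frame_loss(tx_prev: int, tx_curr: int, rx_prev: int, rx_curr: int) -> int:
    """Frames lost between two samples of a transmit and a receive counter."""
    return (tx_curr - tx_prev) - (rx_curr - rx_prev)


def two_way_delay(t_request_sent: int, t_request_received: int,
                  t_reply_sent: int, t_reply_received: int) -> int:
    """Round-trip delay excluding the time spent inside the responder.

    The first and last timestamps come from the initiator's clock, the
    middle two from the responder's clock, so no clock sync is needed.
    """
    round_trip = t_reply_received - t_request_sent
    responder_time = t_reply_sent - t_request_received
    return round_trip - responder_time


def delay_variation(delays: List[int]) -> List[int]:
    """Frame delay variation as the difference between consecutive delays."""
    return [abs(b - a) for a, b in zip(delays, delays[1:])]


print(frame_loss(1000, 1500, 990, 1485))
print(two_way_delay(1000, 5400, 5900, 6700))
print(delay_variation([5200, 5350, 5100]))
```

One-way delay would instead subtract the sender's transmit timestamp from the receiver's receive timestamp, which only gives a meaningful value when the two clocks are synchronized.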
Y.1731 OAM Functions for PM
Throughput measurement
–Throughput is measured by sending frames at an increasing rate (up to the theoretical maximum), graphing the percentage of frames received, and reporting the rate at which frames start being dropped.
–Unicast ETH-LB (e.g., LBM and LBR frames with the data field) and ETH-Test (e.g., TST frames with the data field) can be used for performing the throughput measurements.
Synthetic loss measurement (ETH-SLM)
–Synthetic loss measurement is a mechanism to measure frame loss using synthetic frames, rather than data traffic.
–A number of synthetic frames are sent and received, and the number of those that are lost is calculated from these counts.
–This can be treated as a statistical sample, and used to approximate the frame loss ratio of data traffic.
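Both procedures reduce to short calculations, sketched below. The function names are hypothetical, and received_fraction stands in for whatever test mechanism (ETH-LB or ETH-Test frames) reports the share of frames received at a given rate; the lambda in the example is a made-up link model, not measured data.

```python
from typing import Callable, Iterable, Optional


def frame_loss_ratio(sent: int, received: int) -> float:
    """Loss ratio of a synthetic-frame sample, used to approximate data traffic."""
    if sent <= 0:
        raise ValueError("at least one synthetic frame must be sent")
    return (sent - received) / sent


def first_lossy_rate(rates_mbps: Iterable[int],
                     received_fraction: Callable[[int], float]) -> Optional[int]:
    """Step through increasing rates and return the first one that drops frames.

    Returns None if no tested rate shows any loss.
    """
    for rate in sorted(rates_mbps):
        if received_fraction(rate) < 1.0:
            return rate
    return None


# Hypothetical link that starts dropping frames above 600 Mbit/s.
print(first_lossy_rate(range(100, 1100, 100), lambda r: 1.0 if r <= 600 else 0.9))
print(frame_loss_ratio(sent=1000, received=997))
```

Because the synthetic frames are only a sample, a larger number of frames per measurement interval gives a closer approximation of the loss ratio seen by data traffic.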