Network Survivability
- Goal: high availability for connections
  - A common requirement: 99.999% availability
- Network links, nodes, and individual channels can fail
- Need a survivable network that can continue providing service in the presence of failures
- Survivability can be provided by protection switching
  - Provide some redundant capacity in the network
  - Automatically reroute around a failure using the redundant capacity
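
To make the 99.999% figure concrete, the sketch below converts an availability target into a yearly downtime budget. The function name and the 365-day year are assumptions chosen for illustration; five nines works out to a little over five minutes of downtime per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare three, four, and five nines.
    for target in (0.999, 0.9999, 0.99999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target:.3%} availability allows {minutes:.2f} min/year of downtime")
```
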
Protection
- Usually implemented in a distributed manner to ensure fast restoration of service after a failure
- In most cases, protection schemes are engineered to protect against a single failure event
- How to deal with more than one concurrent failure?
  - Divide the network into smaller subnetworks and restrict the protection scheme to within a subnetwork
  - Ensure that the mean time to repair a failure is much smaller than the mean time between failures
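
The role of the repair-time condition can be illustrated with a rough calculation. The sketch below assumes two links that fail independently and uses the standard steady-state unavailability MTTR / (MTBF + MTTR); the function names and the sample numbers are hypothetical.

```python
def unavailability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state fraction of time a single element is down."""
    return mttr_hours / (mtbf_hours + mttr_hours)


def double_failure_probability(mtbf_hours: float, mttr_hours: float) -> float:
    """Probability that two identical, independent elements are down at once."""
    u = unavailability(mtbf_hours, mttr_hours)
    return u * u


if __name__ == "__main__":
    mtbf = 10_000.0  # hypothetical mean time between failures, in hours
    for mttr in (100.0, 10.0, 1.0):  # hypothetical mean times to repair
        single = unavailability(mtbf, mttr)
        double = double_failure_probability(mtbf, mttr)
        print(f"MTTR={mttr:>5} h  single={single:.2e}  double={double:.2e}")
```

Under these assumptions, shrinking the repair time by a factor of ten shrinks the chance of an overlapping second failure by roughly a factor of one hundred, which is why a single-failure design can be adequate when repairs are fast.
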
Basic Concepts
- Working path: carries traffic under normal operation
- Protection path: an alternate path that carries the traffic in case of failures
- Working and protection paths are diversely routed so that both paths are not lost in case of a single failure
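
A simple way to check diverse routing against single link failures is to confirm that the two paths have no link in common. The sketch below assumes paths are given as lists of node names and treats links as undirected; the helper names are hypothetical.

```python
from typing import List, Set, FrozenSet


def links_of(path: List[str]) -> Set[FrozenSet[str]]:
    """Return the set of undirected links traversed by a node-list path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def shared_links(working: List[str], protection: List[str]) -> Set[FrozenSet[str]]:
    """Links used by both paths; any such link is a single point of failure."""
    return links_of(working) & links_of(protection)


def is_link_diverse(working: List[str], protection: List[str]) -> bool:
    """True when no single link failure can take down both paths."""
    return not shared_links(working, protection)


if __name__ == "__main__":
    working = ["A", "B", "D"]
    print(is_link_diverse(working, ["A", "C", "D"]))       # disjoint links
    print(is_link_diverse(working, ["A", "C", "B", "D"]))  # reuses link B-D
```

This only covers link failures. Protecting against node failures would also require that the paths share no intermediate nodes.
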
Dedicated vs. Shared Protection
- Dedicated protection: each working connection is assigned its own dedicated protection bandwidth
- Shared protection: if a set of working connections will not fail simultaneously, they can share protection bandwidth
  - Reduces the bandwidth needed for protection
- Protection bandwidth can be used to carry low-priority traffic under normal conditions
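
The bandwidth saving from sharing can be sketched for a single protection link. The example assumes that at most one of the listed working connections fails at a time, so the shared pool only has to cover the largest one; the function names and demand values are hypothetical.

```python
from typing import Sequence


def dedicated_protection_bandwidth(demands: Sequence[float]) -> float:
    """Every working connection reserves its own protection capacity."""
    return sum(demands)


def shared_protection_bandwidth(demands: Sequence[float]) -> float:
    """Assuming no two connections fail together, reserve only the largest demand."""
    return max(demands, default=0.0)


if __name__ == "__main__":
    demands_gbps = [10.0, 10.0, 2.5]  # hypothetical working connections
    print("dedicated:", dedicated_protection_bandwidth(demands_gbps))
    print("shared:   ", shared_protection_bandwidth(demands_gbps))
```
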
Nonrevertive vs. Revertive Protection
- Traffic is switched from the working path to the protection path when a failure occurs
- Nonrevertive protection: traffic remains on the protection path until it is manually switched back onto the working path
- Revertive protection: once the working path is repaired, traffic is automatically switched back from the protection path onto the working path
- Shared protection schemes are usually revertive
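
The difference between the two modes shows up only at repair time, as the small state model below illustrates. The class and method names are hypothetical, and the model ignores timers and signaling that real equipment would use.

```python
from dataclasses import dataclass


@dataclass
class ProtectedConnection:
    revertive: bool
    active_path: str = "working"
    working_ok: bool = True

    def on_failure(self) -> None:
        """A failure on the working path moves traffic to the protection path."""
        self.working_ok = False
        self.active_path = "protection"

    def on_repair(self) -> None:
        """After repair, only a revertive connection switches back automatically."""
        self.working_ok = True
        if self.revertive:
            self.active_path = "working"

    def manual_switch_back(self) -> None:
        """Operator action needed by nonrevertive connections."""
        if self.working_ok:
            self.active_path = "working"


if __name__ == "__main__":
    for mode in (True, False):
        conn = ProtectedConnection(revertive=mode)
        conn.on_failure()
        conn.on_repair()
        print(f"revertive={mode}: traffic on {conn.active_path} path after repair")
```
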
Unidirectional vs. Bidirectional Protection Switching
- Unidirectional protection switching
  - When a fiber is cut, only the affected direction of traffic is switched over to the protection fiber
  - Used in conjunction with dedicated protection schemes
    - Traffic is transmitted simultaneously on the working and protection paths
    - The receiver at the end of the paths simply selects the better of the two arriving signals
  - Does not require a signaling protocol between the receiver and the transmitter
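
The receiver-side selection can be sketched as a purely local decision. The example assumes signal quality is summarized by an estimated bit error rate, with `None` standing for loss of signal; this representation and the function name are assumptions made for illustration.

```python
from typing import Optional


def select_signal(working_ber: Optional[float],
                  protection_ber: Optional[float]) -> Optional[str]:
    """Pick the better of the two copies; None means loss of signal on that fiber."""
    if working_ber is None and protection_ber is None:
        return None  # both copies lost, nothing to select
    if working_ber is None:
        return "protection"
    if protection_ber is None:
        return "working"
    # Lower bit error rate is better; ties stay on the working path.
    return "working" if working_ber <= protection_ber else "protection"


if __name__ == "__main__":
    print(select_signal(1e-12, 1e-12))  # both healthy
    print(select_signal(None, 1e-12))   # working fiber cut
    print(select_signal(1e-4, 1e-12))   # working signal degraded
```

Because the decision uses only what the receiver observes, the transmitter never needs to be told about the failure.
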
Unidirectional vs. Bidirectional Protection Switching (continued)
- Bidirectional protection switching
  - When a fiber is cut, both directions of traffic are switched over to the protection fibers
  - The receiver needs to inform the transmitter of the cut
  - Requires a signaling protocol called an automatic protection switching (APS) protocol
- How does an APS protocol work?
  - If a receiver in a node detects a fiber cut, the node turns off its transmitter on the working fiber and then switches over to the protection fiber to transmit traffic
  - The receiver at the other node detects the loss of signal on the working fiber and then switches its traffic over to the protection fiber
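
The two-step behavior described above can be sketched as a toy simulation of two nodes joined by one working fiber per direction. This is a simplified model, not an implementation of any standardized APS protocol; the `Node` class, `simulate_cut` function, and fiber naming are all hypothetical.

```python
from typing import List, Set, Tuple


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.tx_fiber = "working"  # fiber this node currently transmits on
        self.peer: "Node" = self   # replaced when the pair is wired up

    def receives_working_signal(self, cut: Set[Tuple[str, str]]) -> bool:
        """Signal arrives only if the peer transmits on an intact working fiber."""
        return self.peer.tx_fiber == "working" and (self.peer.name, self.name) not in cut

    def react(self, cut: Set[Tuple[str, str]]) -> bool:
        """On loss of signal, stop using the working fiber and switch to protection."""
        if self.tx_fiber == "working" and not self.receives_working_signal(cut):
            self.tx_fiber = "protection"
            return True
        return False


def simulate_cut(cut: Set[Tuple[str, str]]) -> Tuple[str, str, List[str]]:
    """Return both nodes' final transmit fibers and the order in which they switched."""
    a, b = Node("A"), Node("B")
    a.peer, b.peer = b, a
    order: List[str] = []
    changed = True
    while changed:  # repeat until no node changes state
        changed = False
        for node in (a, b):
            if node.react(cut):
                order.append(node.name)
                changed = True
    return a.tx_fiber, b.tx_fiber, order


if __name__ == "__main__":
    # Cut only the working fiber carrying traffic from A to B.
    print(simulate_cut({("A", "B")}))
```

In this model node B notices the cut first, and its act of leaving the working fiber is what signals node A to switch, so both directions end up on the protection fibers even though only one fiber was cut.
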
Path/Span/Ring Switching
- Path switching: the connection is rerouted end to end on an alternate path
- Span switching: the connection is rerouted on a spare link between the nodes adjacent to the failure
- Ring switching: the connection is rerouted on a ring between the nodes adjacent to the failure
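
The difference between path and ring switching is easiest to see on a small ring. The sketch below assumes a five-node ring listed in clockwise order and a working path that runs clockwise; the function names are hypothetical. Span switching is not modeled because it keeps the same node sequence and only moves traffic to a spare link on the failed span.

```python
from typing import List, Tuple


def ring_path(ring: List[str], src: str, dst: str, clockwise: bool = True) -> List[str]:
    """Walk the ring from src to dst in one direction and return the nodes visited."""
    i = ring.index(src)
    step = 1 if clockwise else -1
    path = [src]
    while ring[i] != dst:
        i = (i + step) % len(ring)
        path.append(ring[i])
    return path


def path_switch(ring: List[str], src: str, dst: str) -> List[str]:
    """Reroute end to end in the direction opposite to a clockwise working path."""
    return ring_path(ring, src, dst, clockwise=False)


def ring_switch(ring: List[str], working: List[str], failed: Tuple[str, str]) -> List[str]:
    """Replace the failed hop with the long way around the ring between its end nodes."""
    u, v = failed
    k = working.index(u)
    if working[k + 1] != v:
        raise ValueError("failed link is not a hop of the working path")
    hop_is_clockwise = ring[(ring.index(u) + 1) % len(ring)] == v
    detour = ring_path(ring, u, v, clockwise=not hop_is_clockwise)
    return working[:k] + detour + working[k + 2:]


if __name__ == "__main__":
    ring = ["A", "B", "C", "D", "E"]
    working = ring_path(ring, "A", "C")  # A -> B -> C
    print("path switching:", path_switch(ring, "A", "C"))
    print("ring switching:", ring_switch(ring, working, ("B", "C")))
```

With ring switching the traffic still travels to the node just before the failure and loops back from there, so the restored route is longer than the end-to-end alternative but only the nodes adjacent to the failure need to act.
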