Congestion Mitigation

Trying to maximise performance between MFN's network and a peer's network over some busy PNIs
Hello, Good Evening

Joe Abley
Toolmaker, Token Canadian
Metromedia Fiber Network
Background

There are frequent periods of congestion between our network and a peer's network in Europe
The peer is a major operator in the region, and evil forces are preventing us from simply lighting up new fibre
We need to work with what we have in place already
Characteristics of Problem

The peer is home to a lot of broadband subscribers
Transient hot-spots in content hosted within MFN's network cause localised congestion on some PNIs, while other PNIs show headroom
We get paid for shifting packets (if we don't carry the packets, we don't get paid)
State of Play: Day One

Location    Capacity  Utilisation
Amsterdam   STM-1     75M
Frankfurt   STM-1     140M
Vienna      STM-1     90M
Goals

Identify the important consumers of traffic in and beyond the peer's network
Once we can characterise the major traffic sinks, we can try to balance them out across our various PNIs
Hopefully this will make the PNIs less sensitive to bursty traffic
We expect to have to keep measuring and rebalancing
Tools

Ixia IxTraffic
–Gumption: Unix-based BGP speaker that participates in the IBGP mesh; gives us route history
–SeeFlow: smart NetFlow collector which talks to Gumption
Awk
–We always have awk
Infrastructure

NetFlow traffic to be collected in-band from GSR12012s
Single IxTraffic box:
–FreeBSD 4.5, i386, dual 700MHz P3, 2GB RAM
–Overkill: load average occasionally peaks above 0.07
–10GB filesystem for storing routing and flow data
–Located in Virginia
MRTG-like thing (duck), which lives in VA on a different box, gives us nice visibility of congestion trends
Exporting Flow Data

Flow switching needs to be turned on in a maintenance window, because it makes the routers belch impolitely
–All interfaces that can contribute towards traffic sent towards the peer get "ip route-cache flow sampled"
–See kiddie script! See router die!
Export config is trivial:

  ip flow-export source Loopback0
  ip flow-export version 5 peer-as
  ip flow-export destination
  ip flow-sampling-mode packet-interval 1000

Note the low sampling rate of 1:1000.
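With 1:1000 packet sampling, the byte counts in the exported flow records have to be scaled back up before they mean anything. A minimal sketch of that scale-up (the record layout and traffic figures are invented for illustration; only the 1000x factor comes from the config above):

```python
# Scale sampled NetFlow byte counts up to an estimate of real traffic.
# With "ip flow-sampling-mode packet-interval 1000", each exported flow
# represents roughly 1000x the packets that were actually inspected.
SAMPLING_INTERVAL = 1000  # matches packet-interval 1000 above

def estimate_bytes(sampled_records):
    """Sum sampled byte counts and scale by the sampling interval."""
    return sum(r["bytes"] for r in sampled_records) * SAMPLING_INTERVAL

# Hypothetical sampled flows towards one destination net
records = [
    {"dst": "A.A.0.0", "bytes": 1_500},
    {"dst": "A.A.0.0", "bytes": 2_500},
]
print(estimate_bytes(records))  # 4,000 sampled bytes -> ~4,000,000 on the wire
```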
Collecting Flow Data

SeeFlow is configured to populate net2net and aspath matrices ("buckets")
–We suspect that a lot of data is getting sunk within the peer network, hence net2net
–We could be wrong, and aspath matrices are cool, so we collect those too
Buckets chew up about 50MB of disk per day (all routers)
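The net2net bucketing idea can be sketched as folding flow records into a (source net, destination net) -> bytes matrix. SeeFlow does the prefix lookups against the BGP table it learns from Gumption; this toy version does a longest-match over a hand-written prefix list, and every prefix and flow here is made up:

```python
# Toy net2net bucketing: aggregate flows into per-(src net, dst net) byte counts.
import ipaddress
from collections import defaultdict

# Stand-in for the routing table learned via Gumption
prefixes = [ipaddress.ip_network(p) for p in ("10.0.0.0/13", "10.8.0.0/13")]

def bucket_of(addr):
    """Longest matching prefix for addr, or None if nothing covers it."""
    matches = [p for p in prefixes if ipaddress.ip_address(addr) in p]
    return max(matches, key=lambda p: p.prefixlen, default=None)

net2net = defaultdict(int)
flows = [("10.1.2.3", "10.9.8.7", 4096),   # (src, dst, bytes) - invented
         ("10.1.2.4", "10.9.8.7", 1024)]
for src, dst, nbytes in flows:
    net2net[(bucket_of(src), bucket_of(dst))] += nbytes

for (s, d), b in net2net.items():
    print(f"{s} -> {d}  {b} bytes")
```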
Initial Discoveries

All the traffic is being sunk within the peer network, and not in a downstream network
–Damn
All the traffic is being sunk into a single /12 advertisement
–Damn
We need better granularity if we are going to be able to spread the demand across our PNIs
ASPATH Matrices

seeasp -s 3320 ./dtag.agg | more

Facets:
  TimeInterval : 05/09/ :00: /05/ :57: PDT
  RouterName : pr1.fra1.de.mfnx.net
  RouterName : mpr2.vie3.at.mfnx.net
  RouterName : mpr1.ams1.nl.mfnx.net

AS     PktsThru  BytesThru  PktsTo   BytesTo  PktsTotal  BytesTotal
[per-AS packet and byte counts for ten anonymised ASes, AAAA through JJJJ]
Net2Net Matrices

[source net -> destination net byte matrix: anonymised destination nets A.A.0.0 through F.F.0.0, fed by source prefixes ranging from /13 to /27]
Destination Prefix Histogram

destination net   megabytes   proportion
[megabyte counts and proportions for the ten heaviest anonymised destination nets, A.A.0.0/12 through K.K.0.0; the A.A.0.0/12 advertisement dominates]
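The proportion column is just each destination's share of the total bytes seen. A minimal sketch (the byte counts are invented; only A.A.0.0/12 dominating matches the findings above):

```python
# Turn per-destination megabyte totals into a ranked histogram with proportions.
megabytes = {"A.A.0.0/12": 900, "B.B.0.0/16": 60, "C.C.0.0/16": 40}  # invented

total = sum(megabytes.values())
for net, mb in sorted(megabytes.items(), key=lambda kv: -kv[1]):
    print(f"{net:<14} {mb:>6} MB  {100 * mb / total:5.1f}%")
```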
Drilling down into A.A.0.0/12

10 Ask peer to advertise longer prefixes within the /12, so we can measure the traffic per prefix
20 Wait for response
30 GOTO 10

Maybe we can fix this ourselves
/home/dlr/bin/bgpd

We injected 15 covered /16 prefixes into IBGP, with a NEXT_HOP that lay within the remaining /16
All tagged no-export, to avoid messing with the peer's public route policy
Strictly local-use within AS6461
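The arithmetic behind the trick: a /12 contains exactly sixteen /16s, so injecting fifteen of them and letting the original /12 cover the last one accounts for the whole block. A sketch, using 10.16.0.0/12 as a stand-in for the peer's real advertisement:

```python
# Enumerate the sixteen /16s inside a covering /12; inject fifteen of them
# and leave the last reachable via the original /12 route.
import ipaddress

covering = ipaddress.ip_network("10.16.0.0/12")   # stand-in for the peer's /12
sixteens = list(covering.subnets(new_prefix=16))  # 2**(16-12) == 16 subnets

injected, remainder = sixteens[:15], sixteens[15]
for net in injected:
    # Each is announced into IBGP tagged no-export, with a NEXT_HOP
    # inside `remainder`, per the slide above.
    print(f"announce {net} community no-export")
```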
More Collection

The increased granularity gives us better visibility into the traffic sinks within the peer network
We will try to spread the traffic over the available PNIs so we can weather bursts of demand more effectively
We will also continue to let the peer know what we are doing
–You never know, they may be listening
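The rebalancing step itself can be sketched as a greedy assignment: take the heaviest traffic sinks first and pin each one to the least-loaded PNI. The capacities and per-prefix demands below are invented (three STM-1s at roughly 155 Mbit/s each matches the deck; the rest is illustration), and in practice the assignment is realised with routing policy, not Python:

```python
# Greedy balancing: assign heaviest prefixes to the least-loaded PNI first.
capacity = {"ams": 155, "fra": 155, "vie": 155}   # STM-1 ~155 Mbit/s each
demand = {"A.16.0.0/16": 90, "A.17.0.0/16": 70,   # Mbit/s per sink - invented
          "B.B.0.0/16": 60, "C.C.0.0/16": 40}

load = {pni: 0 for pni in capacity}
plan = {}
for prefix, mbps in sorted(demand.items(), key=lambda kv: -kv[1]):
    pni = min(load, key=load.get)   # least-loaded PNI so far
    plan[prefix] = pni
    load[pni] += mbps

print(plan)
print(load)
```

Heaviest-first placement is the classic greedy heuristic for bin balancing; it will not be optimal in general, but it keeps any single PNI from soaking up all the hot prefixes.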
New Dest Prefix Histogram

destination net   megabytes   proportion
[megabyte counts and proportions for the ten heaviest destinations: B.B.0.0 and C.C.0.0 alongside eight of the injected A /16s, the /12's traffic now split across them]
State of Play: Day N

Location    Capacity  Utilisation
Amsterdam   STM-1     110M
Frankfurt   STM-1     105M
Vienna      STM-1     85M
Conclusions

"Light more fibre" is not always a realistic strategy
You are not always your peer's number one priority, so it's nice to be able to take matters into your own hands
Distributing the heavy traffic sinks across different PNIs makes bursty demand less unpleasant
Routing plus flow data = Power. Or something.