ABwE: Available Bandwidth Estimator
Jiri Navratil, R. Les Cottrell
Stanford Linear Accelerator Center (SLAC), 2575 Sand Hill Road, Menlo Park, California 94025
jiri@slac.stanford.edu, cottrell@slac.stanford.edu
ABwE: Available Bandwidth Estimator
- Introduction (motivation, needs)
- Basic principles
- Path characteristics and examples of packet-pair dispersion delays
- Bandwidth estimation
- ABwE versus Iperf
- Conclusions
Introduction 1
- The HEP community is increasingly dependent on networking as international cooperation grows: huge amounts of data must be transferred between experimental sites such as SLAC and CERN and home institutes spread around the world.
- Our main task is to give physicists reliable access to the network; network monitoring is an integral part of this activity.
- We have several monitoring systems in operation: active (such as ping or iperf) and passive (reading SNMP counters or using NetFlow data).
Introduction 2
- Network administrators and users need to know the RTT, losses, routing path, and estimates of the available bandwidth to our partners.
- Currently we have such information only for limited sampling periods.
- The big question: do we have valid information if we measure once per 90 minutes, and can we run measurements with tools such as Iperf, or transfer test files, more frequently? Probably not.
- We need a tool that can run continuously, 24 hours a day, 7 days a week, and that can quickly and non-intrusively detect changes on multiple paths.
Specification
- A tool based on dispersion techniques that does not pollute the Internet (or overload an entry point to the Internet) with a huge number of test packets.
- It gets the result for one path within a few seconds and produces results that can easily be preprocessed by graphical tools or fed into other systems (prediction, warning, etc.).
- Easily configurable and manageable from one site.
- We evaluated several tools that use dispersion techniques, but none of them, in its current implementation, met our demands: some were slow, some failed on high-capacity paths, and some were simply too complicated technically.
Basic principles of ABwE
- ABwE is based on the simplest way of probing, using only packet pairs.
- Evaluation is based on a detailed technical analysis of how packets pass through queuing devices.
- A complete path is a cascade of queuing devices with different capacities.
- The probing packets are separated even if there is no cross traffic.
- The final dispersion between PP1 and PP2 is the result of the superposition of many factors.
How we measure the dispersion time
- Using the Netdyn package (from the University of Maryland, 1991).
- 20 packet-pair probes for each path.
- Probes are repeated with a period of 20 msec, once a minute for each path.
- A set of 20 probes is called a bunch. The bunch is evaluated as one statistical set of measurements.
[Figure: timeline of one bunch on Path-1 and Path-2, probes 1 to 20 spaced 20 ms apart; the Linux sender emits PP1 at t1 and PP2 at t2, separated by dt_send (7-25 ms); Lpp/C at the NIC.]
ABwE basic principles (the simple linear situation)
[Figure: the probe sender S emits PP1 and PP2 separated by Td-send; cross-traffic (CT) packets enter at the hop; the probe receiver R timestamps PP1 and PP2 and measures Td-receive. Figure notes: real ratio between generally used long and short packets; Dynamic Dispersion Delay.]
PPD = Td-receive - Td-send (Td-receive >= Td-send)
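The PPD relation above can be illustrated with a minimal sketch. The function name and the use of integer microsecond timestamps are assumptions made for this illustration; they are not taken from the ABwE or Netdyn code.

```python
def packet_pair_dispersion(t_send_pp1, t_send_pp2, t_recv_pp1, t_recv_pp2):
    """Return PPD = Td-receive - Td-send for one packet pair.

    All timestamps are assumed to be in the same unit (here: integer
    microseconds). Sender and receiver clocks need not be synchronized,
    because only differences taken on the same host are used.
    """
    td_send = t_send_pp2 - t_send_pp1        # spacing when leaving the sender
    td_receive = t_recv_pp2 - t_recv_pp1     # spacing when reaching the receiver
    return td_receive - td_send


# Hypothetical timestamps: the pair leaves 10 us apart and arrives 80 us apart.
ppd = packet_pair_dispersion(0, 10, 50_000, 50_080)
print(ppd)
```

Only same-host differences appear in the result, which is why the two hosts do not need a common clock in this sketch.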
[Figure: panels for CERN, APAN, CESNET, SLAC, NERSC, CALTECH]
Detailed timing for PP on the way: Static Dispersion Delay via an experimental path (stretching, compressing and contracting)
[Figure: probe sender S, hops H1 to H4 and receiver R, with link capacities 622, 155, 622, 1000 and 1000 Mbps and cross-traffic entering at each hop; PP1 and PP2 are shown at the input and output of each hop, stretched at Output-H2 and contracted at Output-H3, with free space for cross-traffic between them.]
Td23 = LPP/C23
Static Dispersion Delay: Td-receive = Td34 = Td23
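The static case (no cross-traffic) can be sketched as follows. The model is an assumption of this sketch: each hop leaves the pair spacing unchanged unless its own transmission time LPP/C is larger, which reproduces Td-receive = Td34 = Td23 for the capacities shown on the slide. The 1500-byte probe size and the function name are also assumptions.

```python
def static_dispersion(packet_bits, capacities_bps, td_send=0.0):
    """Pair spacing at the receiver when there is no cross-traffic.

    Assumed model: a hop cannot emit the second packet sooner than
    packet_bits / capacity after the first, and a faster hop downstream
    does not shrink a spacing that already exists.
    """
    td = td_send
    for capacity in capacities_bps:
        td = max(td, packet_bits / capacity)
    return td


# Link capacities from the slide, in bit/s; the 1500-byte probe is an assumption.
path = [622e6, 155e6, 622e6, 1000e6, 1000e6]
lpp_bits = 1500 * 8
td_receive = static_dispersion(lpp_bits, path)
print(f"static dispersion: {td_receive * 1e6:.1f} us")
```

In this model the 155 Mbps link alone sets the static dispersion; the faster links after it leave the spacing as it is.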
NTT - Normalized Transfer Time
What type of traffic can we expect on the path?
ABwE “Narrow Band” hop characteristics
Ex. 1: “Stretching + absorbing Td effect”
[Figure: PP1 and PP2 cross two hops on links labeled 155 Mbps and 622 Mbps, with cross-traffic on 622 Mbps lines; the pair is separated by Td12 at Input-H2 and by Td23 at Output-H2 (stretching PP).]
a) Different input Td12, same output Td23 (the hop absorbs CT)
b) Td not changed
The principle of gradually narrowing bandwidth
[Figure: light-beam analogy with a light source and links labeled 1000, 622 and 155; load is marked "Impact" on some links and "No impact" on others.]
Remarks:
- Fully valid and easily applicable for continuous streams or for data with a strong source.
- Not easily applicable to the instantaneous situation on a path with Poisson traffic or heavy bursty traffic with light periods.
ABwE “Narrow Band” hop characteristics
Ex. 2: “Multiplication effect”
[Figure: PP1 and PP2 cross two hops on links labeled 622 Mbps and 155 Mbps, with cross-traffic on 622 Mbps lines; the pair leaves Output-H2 stretched to Td23. With no cross-traffic from H2 at Input-H3, Output-H3 keeps Td23; with CT packets from H2 between PP1 and PP2 at Input-H3, Output-H3 shows Td3.]
Td3 ~ n*Td23
Multiplication factor in Td
Example of a very low NB (Narrow Band)
Example of superposition on a low NB
[Figure: plot labels 16, 100 and 622 Mbits and n*64 Mbits; Tdmin values 726,6 and 551,0]
Traceroute graph of the monitoring paths
According to the previous slides, each path has:
- a Static Dispersion Delay, given by the I/O capacities of the hops in the tree
- a Dynamic Dispersion Delay, caused by CT
Can we convert Td to bandwidth (capacity)? What does it represent?
- We know that Td ~ r (utilization factor): if r grows, then Td grows.
- If the relation is linear, then C ~ K * 1/Td.
- If it is non-linear, we have a problem.
- But we know that we are dealing with a bottleneck hop, which can:
  - eliminate the previous Td
  - replace it with its own Td ~ Lpacket/C
- Queuing starts to play an important role, so we use a non-linear solution.
Most of the time only one queue dominates
[Figure: the principle of gradually narrowing bandwidth, light-beam analogy with links labeled 1000, 622 and 155 and load marked "Impact" or "No impact".]
We assume that, most of the time, only one queue dominates at the instant of our measurements.
All probing packets (PP) share the same queue with outside cross-traffic. This means that the Td caused by queuing does not depend only on the packet lengths of the PP.
Open question: what should be used to estimate CT, the average pkt_length or a pkt_length close to the MTU?
[Figure: queue in Node-x]
From queuing theory for M/M/1: $T_{sojourn} = (1 + E(N))\,T_{service}$. In this we replace $T_{sojourn} \sim \mathrm{Td}$ and $T_{service} = L_p/C$, and use $L_{PP}$ and $L_{CT}$ instead of $L_p$:
$\mathrm{Td}_i = L_{PP}/C_i + E(N) \cdot L_{CT}/C_i$, that is, $\mathrm{Td} = \mathrm{Td}_{init} + \mathrm{Td}_{var}$   (1)
For bunch $j$ with probes $i$:
$\mathrm{Td}^{init}_j = \min_i(\mathrm{Td}_{ij}),\ 1 \le i \le 20$
$\mathrm{QDF}_i = (\mathrm{Td}_{ij} - \mathrm{Td}^{init}_j)/\mathrm{NTT}_{class}$
This allows us to replace $E(N)$ in formula (1) by QDF:
$\mathrm{Td}_{ij} = L_{PP}/C_i + \mathrm{QDF}_i \cdot L_{CT}/C_i$
From this we can calculate $C_i$ for each singleton in one bunch $j$:
$C_i = (L_{PP} + \mathrm{QDF}_i \cdot L_{CT})/\mathrm{Td}_{ij}$   (2)
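Formula (2) can be applied to one bunch as in the sketch below. This is an illustration under stated assumptions, not the ABwE implementation: the function name is hypothetical, NTTclass is treated as a caller-supplied time in seconds (the slides do not say how it is chosen), and the packet lengths are example values.

```python
def estimate_capacities(td_bunch, l_pp_bits, l_ct_bits, ntt_class):
    """Apply formula (2) to every dispersion sample of one bunch.

    td_bunch   : dispersion delays Td_ij of the bunch, in seconds
    l_pp_bits  : probe packet length L_PP in bits
    l_ct_bits  : assumed cross-traffic packet length L_CT in bits
    ntt_class  : NTT_class in seconds (assumption: supplied by the caller)
    Returns (Td_init, list of C_i in bit/s).
    """
    td_init = min(td_bunch)                   # Td_init: smallest delay in the bunch
    capacities = []
    for td in td_bunch:
        qdf = (td - td_init) / ntt_class      # QDF_i stands in for E(N)
        capacities.append((l_pp_bits + qdf * l_ct_bits) / td)
    return td_init, capacities


# Hypothetical bunch: one undisturbed sample and one delayed by a single CT packet.
td_init, caps = estimate_capacities([100e-6, 200e-6], 12000, 12000, 100e-6)
print(td_init, [round(c / 1e6, 1) for c in caps])
```

With L_CT equal to L_PP and NTTclass equal to the undisturbed delay, both samples give the same capacity estimate, which is the behaviour formula (2) is meant to provide when a single queue dominates.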
Graphical interpretation of the formula
$C_i = L_{PP}/\mathrm{Td}_i + \mathrm{QDF}_i \cdot L_{CT}/\mathrm{Td}_i$
$C_{max} = L_{PP}/\mathrm{Td}_{min}$ (when there is no CT, QDF = 0)
[Figure: $C_i$ in Mbits/s versus time in s]
EWMA: filtration characteristics ($\mathrm{avg}_i = (1 - a)\,y_i + a\,\mathrm{avg}_{i-1}$)
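The EWMA filter above can be sketched in a few lines. Seeding the average with the first sample is an assumption of this sketch; the slide does not say how the filter is initialized.

```python
def ewma(values, a):
    """Exponentially weighted moving average: avg_i = (1 - a) * y_i + a * avg_{i-1}.

    a close to 1 gives heavy smoothing, a close to 0 follows the raw samples.
    The first output equals the first sample (assumed initialization).
    """
    smoothed = []
    avg = None
    for y in values:
        avg = y if avg is None else (1 - a) * y + a * avg
        smoothed.append(avg)
    return smoothed


# Hypothetical samples: a step from 10 to 20 is followed gradually.
print(ewma([10, 20, 20], 0.5))
```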
Monitoring sites (average values)
Iperf versus ABwE (a few unclear points)
- How should Iperf be configured to achieve maximum performance in a changing environment? (difference ~ 10 - 100 %)
- Limitations at the entry points to the Internet (SLAC: 622 Mbits, customer load 10% - 40%)
- Machine performance (400-550 Mbits)
- Iperf aggressiveness: it suppresses the bandwidth of other running applications and reports everything that Iperf transferred
- Synchronization problem (to avoid dependency)
ABwE compared with Iperf
Conclusions
- We have demonstrated several network analyses and a new method for monitoring ABw and bottleneck capacity in the range from several Mbits to 1000 Mbits.
- ABwE is a non-intrusive method that can run continuously, 24 hours a day, 7 days a week.
- It can detect changes in path capacity caused by heavy traffic and also discover dramatic changes in routing.
- The usefulness of ABwE has been proven several times since last summer.
- Unfortunately, ABwE does not yet exist as a publicly used tool.