Available Bandwidth Estimation for Gb/s-Class Applications Using 10-Gb/s Network Interface Card Hardware Assistance
Kenji SHIMIZU, NTT Network Innovation Labs.
The 16th Asia-Pacific Conference on Communications, Nov. 1st, 2010
This work was partially supported by the National Institute of Information and Communications Technology.
OUR MOTIVATION: NETWORK MEASUREMENT
We have been engaged for several years in R&D of technologies for sharing high-quality video over IP networks, including 1.5-/6.4-Gbps uncompressed HDTV/4K video streaming systems.
[Figure: a video transmitting PC, a video receiving PC, and video archiving storage connected over the network. Trouble spots: packet drops caused by traffic burstiness; traffic congestion at switches; degraded traffic characteristics caused by router misconfiguration.]
Many problems arise when starting a high-quality streaming service and while it is running. Therefore, we have to:
1. verify that the network quality meets the application's requirements.
2. keep observing the network quality during the service to avoid trouble.
OUR ONGOING PROJECT: PRESTA 10G
PRESTA: PRotocol Engine for Streaming Acceleration
PRESTA 10G offers high-speed packet processing and traffic monitoring functions for 10-Gb/s IP networks, based on the PRESTA 10G network interface card (NIC) and an off-the-shelf PC running Linux.
Less expensive but high-performance systems.
Flexible software development that takes advantage of the hardware assistance.
Many high-performance monitoring systems can be deployed across the network.
Our key device: PRESTA 10G NIC
10-Gbps wire-rate packet capture and generation.
Well-controlled packet generation at up to 10 Gbps. Inter-packet gaps can be configured packet by packet at nanosecond resolution.
High-resolution, precise timestamps generated in the NIC are appended to all sent and received packets. The timestamps are globally synchronized using GPS and have a fine resolution of 100 ns.
GOAL
Implementing an available bandwidth estimation (ABE) system
1. to verify that the network can satisfy the application's requirements before starting the service.
2. using an off-the-shelf PC because of its lower cost.
3. for 10-Gbps high-speed networks carrying Gb/s-class streams.
Available bandwidth estimation technique
Available bandwidth (Ba) is: Ba = Bl - Bc
Bl: physical bottleneck speed of the network
Bc: bandwidth of the competing traffic
[Figure: competing traffic and traffic with pre-defined attributes (probe packets) are multiplexed into the resulting traffic.]
When the incoming traffic bandwidth exceeds the available bandwidth, the pre-defined attributes of the probe packets are affected. Therefore, by detecting these effects while changing the probe packets' attributes, we can determine the available bandwidth.
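The definition Ba = Bl - Bc can be illustrated with a small calculation. This is only a sketch: the function names and the 4-Gb/s competing-traffic figure are hypothetical and chosen for illustration; the 1.5- and 6.4-Gb/s stream rates are the ones mentioned in the motivation slide.

```python
def available_bandwidth_bps(bottleneck_bps, competing_bps):
    """Ba = Bl - Bc, clamped at zero (hypothetical helper name)."""
    return max(0.0, bottleneck_bps - competing_bps)


def can_carry(stream_bps, bottleneck_bps, competing_bps):
    """True if a stream of the given rate fits into the available bandwidth."""
    return stream_bps <= available_bandwidth_bps(bottleneck_bps, competing_bps)


if __name__ == "__main__":
    bl = 10e9  # 10-Gb/s bottleneck link
    bc = 4e9   # assumed competing traffic, for illustration only
    print("Ba [Gb/s]:", available_bandwidth_bps(bl, bc) / 1e9)
    print("1.5-Gb/s stream fits:", can_carry(1.5e9, bl, bc))
    print("6.4-Gb/s stream fits:", can_carry(6.4e9, bl, bc))
```

In practice Bc is not known in advance, which is why the probing technique described above is needed.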
CHALLENGES
Our scope is not the development of a new ABE algorithm but how to apply an existing ABE algorithm to 10-Gb/s networks for Gb/s-class applications.
IGI algorithm: Initial Gap Increasing
Points:
It gradually increases the inter-packet gaps in the packet trains.
The algorithm terminates when a packet train passes through the network under test without any change in its inter-packet gaps.
[Figure: an ABE system (TX) sends a packet train through the network under test, which also carries competing traffic, to an ABE system (RX). The RX side observes the inter-packet gaps of the packet trains and produces the ABE results.]
Difficulties:
1. How to control the inter-packet gaps
2. How to observe the inter-packet gaps accurately
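The gap-increasing idea can be sketched as a loop that widens the input gap until the output gap no longer differs from it. The sketch below is not the authors' implementation: the network under test is replaced by a hypothetical single-bottleneck fluid model (`output_gap_ns`), and the packet size, step, and tolerance values are assumptions made for illustration.

```python
def output_gap_ns(input_gap_ns, packet_bits, link_bps, competing_bps):
    """Hypothetical fluid model of one bottleneck link.

    During one input gap the link must drain one probe packet plus the
    competing bits that arrived in that gap. If that takes longer than
    the input gap, the gap is stretched; otherwise it is preserved.
    """
    competing_bits = competing_bps * input_gap_ns * 1e-9
    busy_ns = (packet_bits + competing_bits) / link_bps * 1e9
    return max(input_gap_ns, busy_ns)


def estimate_available_bps(packet_bits, link_bps, competing_bps,
                           step_ns=5.0, tolerance_ns=0.5, max_gap_ns=1e6):
    """Increase the input gap until it passes through unchanged.

    Returns the probe rate at that turning point, or None if the gap
    limit is reached first.
    """
    gap_ns = packet_bits / link_bps * 1e9  # start back-to-back at wire rate
    while gap_ns <= max_gap_ns:
        stretched = output_gap_ns(gap_ns, packet_bits, link_bps, competing_bps)
        if stretched - gap_ns <= tolerance_ns:
            return packet_bits / (gap_ns * 1e-9)
        gap_ns += step_ns
    return None


if __name__ == "__main__":
    # 1500-byte probes, 10-Gb/s link, 4 Gb/s of assumed competing traffic.
    estimate = estimate_available_bps(1500 * 8, 10e9, 4e9)
    print("estimated available bandwidth [Gb/s]:", estimate / 1e9)
```

In the model, the turning point is reached when the probe rate equals Bl - Bc, which is why the probe rate at termination serves as the estimate. In a real system the output gaps come from measurements and are noisy, so the termination test would need averaging over a packet train.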
HARDWARE ASSISTANCE FOR OUR ABE SYSTEM (1)
How to control the inter-packet gaps (IPGs) to generate probe packets with arbitrary IPGs
There are two difficulties in accomplishing this on an off-the-shelf PC.
1. Generating the minimum inter-packet gaps corresponding to the 10-Gb/s wire rate. The maximum bit rate at which probe packets can be generated degrades for various reasons: insufficient CPU performance, bus bandwidth limitations, and software scheduling, all of which get worse when the probe packets are relatively short.
2. Controlling inter-packet gaps at a resolution on the order of nanoseconds. General-purpose operating systems schedule all execution at a resolution on the order of milliseconds.
To reach the wire rate:
1. Software passes only the packet header part (hatched in the figure) to reduce the amount of data that software has to process.
2. Our NIC appends padding data to generate probe packets of arbitrary length.
To control the gaps:
1. Software passes all the packet data, stored in a single buffer, to our NIC at once.
2. Our NIC splits the buffer into individual packets and sends them while controlling the inter-packet gaps at 5-nanosecond resolution.
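To give a feel for what a 5-nanosecond gap resolution means for the achievable probe rates, the following sketch computes the packet spacing for a target rate and rounds it up to the 5-ns grid. It makes simplifying assumptions that are not from the slides: the spacing is measured from packet start to packet start, Ethernet preamble and inter-frame gap overhead are ignored, and the function names are hypothetical.

```python
GAP_RESOLUTION_NS = 5  # gap granularity stated for the NIC


def quantized_spacing_ns(packet_bytes, target_bps):
    """Packet start-to-start spacing for a target rate, rounded up to the
    5-ns grid so that the achieved rate never exceeds the target."""
    numerator = packet_bytes * 8 * 10**9      # bits * ns per second
    denominator = target_bps * GAP_RESOLUTION_NS
    ticks = -(-numerator // denominator)      # integer ceiling division
    return ticks * GAP_RESOLUTION_NS


def achieved_bps(packet_bytes, spacing_ns):
    """Bit rate that results from sending one packet every spacing_ns."""
    return packet_bytes * 8 * 10**9 / spacing_ns


if __name__ == "__main__":
    for rate in (5 * 10**9, 9 * 10**9):
        spacing = quantized_spacing_ns(1500, rate)
        print(rate / 1e9, "Gb/s target ->", spacing, "ns spacing,",
              round(achieved_bps(1500, spacing) / 1e9, 3), "Gb/s achieved")
```

With 1500-byte packets, a 5-ns step changes the rate by well under 1% at these speeds, whereas millisecond-level scheduling could not express these spacings at all.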
HARDWARE ASSISTANCE FOR OUR ABE SYSTEM (2)
How to observe inter-packet gaps accurately in order to detect changes in them
Two difficulties:
1. The timestamps on an off-the-shelf PC are not very accurate. Generating timestamps from on-board hardware clocks gives accurate measurement results.
2. Packet drops on the receiver side make the inter-packet gap calculation inaccurate.
Three steps to capture packets with their arrival timestamps:
1. Our NIC extracts only the header part of each packet and discards the rest, which is unnecessary for ABE analysis.
2. Our NIC appends a hardware-generated timestamp to each extracted header.
3. After collecting multiple header-timestamp pairs, our NIC builds a single data block containing them for efficient transfer from the NIC's memory to the PC's memory.
[Figure: a series of captured packets in the PRESTA 10G NIC, processed through the three steps above.]
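The effect of receiver-side drops on gap calculation can be sketched as follows. The sketch assumes each probe packet carries a sequence number and that the capture yields (sequence number, timestamp) pairs in arrival order; both are assumptions for illustration and are not described in the slides. Timestamps are in nanoseconds on the 100-ns grid.

```python
def inter_packet_gaps_ns(records):
    """Compute gaps only between consecutively numbered packets.

    records: list of (sequence_number, rx_timestamp_ns) in arrival order.
    Returns (gaps, lost_count). A gap that spans a missing packet is
    skipped, because it would look like an artificially large gap.
    """
    gaps = []
    lost = 0
    for (prev_seq, prev_ts), (seq, ts) in zip(records, records[1:]):
        if seq == prev_seq + 1:
            gaps.append(ts - prev_ts)
        elif seq > prev_seq + 1:
            lost += seq - prev_seq - 1
        # Reordered or duplicated packets (seq <= prev_seq) are ignored here.
    return gaps, lost


if __name__ == "__main__":
    # Hypothetical capture in which packet 3 was dropped at the receiver.
    capture = [(0, 0), (1, 2000), (2, 4100), (4, 8000), (5, 10000)]
    print(inter_packet_gaps_ns(capture))
```

Without the sequence check, the 3900-ns interval between packets 2 and 4 would be counted as one stretched gap and would bias the estimate.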
EVALUATION
Using PRESTA 10G hardware-assisted ABE systems and software ABE systems, we calculated the available bandwidth with the IGI (Initial Gap Increasing) algorithm.
Configuration
[Figure: ABE systems and competing-traffic sources connected through network switches over a shared 10-Gb/s link.]
Competing traffic
[Figure: bandwidth versus elapsed time for two types of competing traffic. Type 1: constant bit rate traffic. Type 2: uncompressed HDTV stream (bursty).]
RESULTS (1/3)
Comparison between the newly developed hardware-assisted ABE and software-based ABE
[Figure: estimation results [Mb/s] versus actual available bandwidth [Mb/s]. Series: actual available bandwidth, software, hardware-assisted.]
1. The software-only system cannot produce ABE results above 1 Gb/s of available bandwidth.
2. Our ABE system gives more accurate results even when the actual available bandwidth is as high as 10 Gb/s.
RESULTS (2/3)
Estimation accuracy degraded when the competing traffic (an uncompressed HDTV stream) was bursty.
1. When the competing traffic was realistic HDTV streaming traffic, a larger number of probe packets was required to obtain nearly accurate ABE results.
2. This is because, with only a small number of probe packets, the probes can coincide with momentary drops in the available bandwidth.
RESULTS (3/3)
Packets of the uncompressed HDTV stream were dropped because of the incoming probe packets.
The probe packets were generated with small inter-packet gaps corresponding to rates from 5 Gb/s to 9 Gb/s, which caused buffer overflow in the network switch.
FUTURE WORKS
In this work, we clarified
1. what is required to build an accurate ABE system on an off-the-shelf PC with hardware assistance from the NIC.
2. how the ABE results were improved by the implementation.
3. the problems that arise when the ABE system is applied to a network carrying an actual competing HDTV streaming application: packet drops and degraded ABE accuracy.
In our future work, we will try to
1. avoid packet drops and degraded ABE accuracy by using the application's own packets (for example, video signal packets) as probe packets.
2. integrate our high-accuracy ABE into the transmission layer, for example into a transmission control layer for congestion control.
Our ongoing projects of multi-layer traffic monitoring system including passive and active measurements
The multi-layer traffic monitoring system runs multiple network monitoring programs on a single platform, taking advantage of our PRESTA 10G platform.
Layer-7: SAA (streaming analyzing agent) collects streaming headers such as RTP and, in our case, i-Visto, and reports the stability of video playback from multiple sites. The GUI shows whether video can be played back stably and whether there are any dropped packets, packet reorders, or delays at each measurement site. It also saves history data.
Layer-3/4: Open-source NetFlow probes show the amount of traffic and the protocol distribution of each flow. We used Softflowd (compliant with the NetFlow v9 protocol).
Layer-1: High-resolution perfSONAR-HRA, introduced at Winter Joint Techs in Feb, shows the traffic bit rate at microsecond resolution using a modified perfSONAR-PS and the HRA daemon. The in-service QoS monitor shows the real-time behavior of delays and jitter at microsecond resolution.
All monitoring software can run in a single PRESTA box without performance degradation, thanks to the hardware assistance.
[Figure: a video stream from TX to RX passing several PRESTA boxes, each running NF probes, SAA, PS-HRA, the QoS monitor, and tcpdump, together with screenshots of the perfSONAR-HRA viewer and the in-service QoS monitor.]
Visualization example
Layer-1: One-way delay distribution measured at the site closest to the video sender.
Layer-1: One-way delay at the site closest to the video receiver. Traffic burstiness is calculated for 100-ms time slots.
Layer-7: Availability of traffic playback at the three measurement sites.
Use case of high-resolution perfSONAR in Snow festa 2010, Japan
Dynamic configuration of layer-1 path
Initial configuration. Stable: all green. In trouble: packet reorders between site B and site C (not losses).
After dynamic configuration. Stable: all green again. No delay jitter was observed; therefore, the layer-7 status can be expected to become stable.
[Figure: sites A, B, and C before and after the dynamic configuration; figure annotation: 16.70 ms.]