SCAP: Smart Caching in Wireless Access Points to Improve P2P Streaming

Enhua Tan (1), Lei Guo (1), Songqing Chen (2), Xiaodong Zhang (1). (1) The Ohio State University, (2) George Mason University. ICDCS’07, Toronto, Canada.

Good afternoon, everyone. I’m Enhua Tan from the Ohio State University. Today I am going to present a paper on the performance optimization of P2P streaming applications in a wireless environment. This is joint work with Lei Guo at the Ohio State University, Songqing Chen at George Mason University, and my advisor Xiaodong Zhang at the Ohio State University. Lei Guo is currently an employee at Yahoo! Inc.

Background

Wireless access to the Internet is pervasive: on campus, in offices, at home, and in public places. Most of it is supported by wireless LANs. Peer-to-peer applications are widely used: streaming (PPLive, Joost, etc.), VoIP (Skype, etc.), and large file distribution (BitTorrent, etc.). Our focus: the interaction between wireless users and P2P streaming applications.

I will first introduce the background of this work. As we know, wireless access to the Internet is becoming more and more pervasive on campus, in offices, at home, and in public places such as coffee shops. Most of this access uses Wireless Local Area Networks, or WLANs for short. At the same time, peer-to-peer applications are commonly used in our daily life: for example PPLive, a popular P2P live streaming application, and Joost, a beta-version P2P on-demand streaming application. A lot of users also use Skype for VoIP and BitTorrent for large file downloads. The focus of our study is the interaction between wireless users and P2P applications, specifically P2P live streaming applications. To the best of our knowledge, few studies have examined the performance issues of this interaction.

Wired/wireless Communications
[Figure labels: Internet, WLAN, Access Point (AP), wired users, wireless users]

So, what is the difference between wired and wireless users accessing the Internet? Generally, wireless users reach the Internet through the Access Point (AP) of a wireless LAN using the 802.11a/b/g protocols, while wired users connect directly to an Internet switch, just as the Access Point does.

P2P Streaming for Wired/wireless Users: Workflow
[Figure labels: Internet, Source Peer, Access Point, Viewing Peer, Wireless Peer, WLAN]

Now consider a wireless user running a P2P streaming application; we call this user a wireless peer. Typically, in a peer-to-peer streaming application, each peer needs to relay streaming content for the P2P network. We call the peers that relay content to this wireless peer source peers. The content streams from the source peers across the Internet to the local switch and is then forwarded by the Access Point to the wireless peer. The wireless peer in turn uploads the content it has just downloaded to other viewing peers on the Internet, again through the Access Point.

P2P Streaming for Wired/wireless Users: Problems
[Figure labels: Internet, Source Peer, WLAN, Viewing Peer, Wireless Peer (Relay/Viewing), streaming content, other packets. Annotations: "Generating upstream traffic", "Downstream traffic for other wireless users AFFECTED", "Streaming quality degraded"]

Suppose the wireless peer is not the only user in this WLAN. The Access Point may then be serving at least three kinds of traffic simultaneously: the downstream multimedia content, the upstream multimedia content, and other content delivered to other wireless users in the wireless LAN. The wireless peer, as both a relay and a viewing peer, generates upstream traffic, which affects the downstream traffic of other wireless users because of channel competition. The Internet viewing peers served by this wireless peer may also experience degraded streaming quality caused by the competition in the WLAN.

Problem Summary

Peers in a WLAN may relay streaming content by uploading a lot of traffic, which congests the WLAN due to channel competition and provides low quality of service to the Internet peers. Downstreams have lower priority than upstreams. Extra upstream traffic further increases the number of transmission errors and the cost of contention-window back-off. Major problem source: upstream relay traffic. Can we minimize upstream traffic with low overhead, to improve WLAN throughput and to improve service quality for Internet peers?

Here is a formal summary of the problems we just presented. 1) Wireless peers may upload a lot of traffic, which overloads the Access Point or congests the WLAN due to channel competition, and may also lead to low quality of service for the Internet viewing peers. 2) Downstream traffic has a lower priority than upstream traffic because of the competition among physical stations, specifically the Access Point and the uploading wireless station. 3) Extra upstream traffic generated by the wireless peer can increase the number of transmission errors in the WLAN and may further increase the channel back-off time. In fact, the major source of these problems is the upstream relay traffic. So can we minimize the upstream traffic with trivial overhead, so that WLAN throughput improves and the streaming quality for Internet peers is maintained?

P2P Streaming for Wired/wireless Users: Workflow
The same content is transferred twice in the WLAN: duplicated traffic.

[Figure labels: Internet, Source Peer, Access Point, Viewing Peer, Wireless Peer, WLAN]

Back to the illustration of the workflow for wireless P2P streaming users. We observe that, from the point of view of the Access Point and the wireless peer, the same streaming content is transferred twice in the network because of the successive downloading and uploading. In other words, the upstream traffic is largely duplicated from the downstream traffic. This observation provides a unique opportunity to minimize the upstream traffic by adding a cache at the access point.

Contributions

Our measurements show that more than 75% of upstream traffic duplicates downstream traffic for three representative applications. SCAP: Smart Caching in the Access Point for minimizing upstream traffic (design and prototype implementation). Evaluation results show that SCAP can improve the throughput of the WLAN by up to 88%; SCAP also reduces the delay to Internet peers.

Specifically, the cache is designed as a FIFO buffer for P2P live streaming applications, because the upstream traffic is always duplicated from the latest downstream traffic, and its efficiency depends heavily on the temporal locality of successive downstream and upstream traffic. So we first carried out a measurement study of typical P2P streaming applications and found that an intelligent duplication detection method can detect more than 75% duplication between the upstream and downstream traffic. Motivated by our measurement results, we designed and implemented the SCAP scheme, namely smart caching in the access point, to minimize upstream traffic in a WLAN for P2P streaming applications. The evaluation of our prototype implementation shows that SCAP can improve WLAN throughput by up to 88% and can also effectively reduce the streaming packet delay to the Internet peers.

Outline

Problem Summary and Contributions; Measurement & Analysis of P2P Streaming Traffic; SCAP Design & Implementation; Evaluation; Summary.

Having introduced the problems we study and our contributions, I will next present the measurement and analysis results for P2P streaming traffic, and then give an overview of the SCAP scheme followed by its detailed design and implementation. Finally, I will talk about the evaluation and summarize this presentation.

Measurement & Analysis of P2P Streaming Traffic
We aim to answer two questions: How much duplicated traffic is there in practice? How much overhead does identifying such duplication incur? Measurement: we collect traces of three representative P2P live streaming applications (PPLive, ESM, and TVAnts) in a LAN (100Mbps) and a WLAN (802.11b).

In the measurement study of P2P streaming traffic, we aim to answer two questions. First, we want to know how much duplicated traffic exists in practice. Second, we want to know the processing speed and memory space required to detect such duplication. We collected traces for three representative P2P live streaming applications: PPLive, ESM, and TVAnts. PPLive is one of the most popular P2P streaming applications. ESM was developed at CMU and has served the streaming for a number of computer conferences, such as SIGCOMM 2002 &. We also chose TVAnts, developed at Zhejiang University in China, because of its unique traffic pattern (mostly UDP). Our trace collection environment is a 100Mbps LAN and an 802.11b WLAN.

Workload Statistics

Downstream throughput is typically 300~400Kbps. Ratio of upstream traffic to downstream traffic: can be as large as 10 times for PPLive due to its popularity; between 2 and 4 times for TVAnts; not much for ESM. PPLive and ESM: mostly TCP. TVAnts: 74% UDP in the WLAN.

We first briefly introduce the statistics of our collected workloads. The downstream throughput, that is, the streaming throughput, is typically between 300 Kbps and 400 Kbps. The upstream traffic for PPLive can be as large as 10 times the downstream traffic because of the high bandwidth of the LAN and the popularity of PPLive. For TVAnts, it is mostly between 2 and 4 times. For ESM, there is not much upstream traffic. Regarding the network protocols in use, we found that more than 70% of TVAnts traffic is UDP, while the other applications mostly use TCP.

Duplication Detection Methods: Fixed Hashing
Offline workload analysis: Fixed Hashing (FH). Compute only 1 fingerprint (hash value) for a downstream packet; store this fingerprint in a hash table, and cache the packet in a FIFO buffer. For each upstream packet, also compute the fingerprint and look it up in the hash table to locate the duplicated downstream packet. If the same fingerprint is found, do a further byte-by-byte comparison.

[Figure labels: downstream packet, downstream packet fingerprint hash table, downstream packet FIFO buffer, upstream packet, upstream packet fingerprint, lookup]

To detect the duplication we anticipated in the offline analysis of the workloads, we tried two methods. The first one is called Fixed Hashing, or FH for short. It simply computes one fingerprint (hash value) for a downstream packet, over its first 64 bytes after the application-level header. The fingerprints are stored in a hash table, and the downstream packets are cached in a FIFO buffer. For each upstream packet, we look up its fingerprint in the hash table of downstream fingerprints to locate the duplicated downstream packet. If an upstream fingerprint is found in the hash table, a further byte-by-byte comparison is performed to identify the exact duplicated bytes.
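The Fixed Hashing steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: SHA-1 truncated to 8 bytes stands in for the unspecified hash function, the payload is assumed to have the application-level header already stripped, and a single ordered dictionary plays the role of both the hash table and the FIFO buffer. The class and function names are hypothetical.

```python
import hashlib
from collections import OrderedDict

FP_BYTES = 64  # bytes hashed per packet, as stated in the talk


def fingerprint(payload: bytes) -> bytes:
    """Hash the first FP_BYTES of the payload (SHA-1 is an assumed choice)."""
    return hashlib.sha1(payload[:FP_BYTES]).digest()[:8]


class FixedHashingCache:
    """FIFO buffer of downstream payloads, indexed by one fingerprint each."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.packets: "OrderedDict[bytes, bytes]" = OrderedDict()

    def add_downstream(self, payload: bytes) -> None:
        fp = fingerprint(payload)
        self.packets.pop(fp, None)            # re-inserting refreshes FIFO order
        self.packets[fp] = payload
        if len(self.packets) > self.capacity:
            self.packets.popitem(last=False)  # evict the oldest packet

    def duplicated_bytes(self, upstream: bytes) -> int:
        """Length of the common prefix with the cached packet, or 0 on a miss."""
        cached = self.packets.get(fingerprint(upstream))
        if cached is None:
            return 0
        count = 0
        for up_byte, down_byte in zip(upstream, cached):  # byte-by-byte check
            if up_byte != down_byte:
                break
            count += 1
        return count
```

A packet that is uploaded unchanged is found in full, while the same content shifted by even one byte produces a different fingerprint and is missed.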

Duplication Detection Methods: Rabin Fingerprinting
Rabin Fingerprinting (RF). A unique hash function: produces fingerprints for a continuous data stream quickly (NSDI’07 BitTyrant). We scan the whole packet and only store fingerprints ending with 8 zeros, each over 64 bytes of content: on average 5 fingerprints for a 1,400-byte packet (1/2^8). FIFO buffer: stores the latest 50,000 downstream packets. Buffer + hash table: need about 75MB of memory in total.

We also used another, more intelligent method for duplication detection, called Rabin Fingerprinting, or RF for short. Compared with other hash functions, RF can produce fingerprints for a continuous data stream quickly because of its unique mathematical properties. This method was recently used by a BitTorrent paper at NSDI 2007 for identifying duplicated file blocks. We use RF to scan the whole packet and selectively store the fingerprints that satisfy our criteria. Our typical settings lead to about 5 fingerprints for a 1,400-byte packet. These extra fingerprints are worthwhile because they lead to higher detected duplication ratios. As we just mentioned, in order to detect duplication we need to cache the latest downstream packets in a FIFO buffer. If we store up to 50,000 packets, the total memory cost for the buffer and the hash table is about 75MB, and most of the memory is used by the buffer.
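To make the selection rule concrete, here is a minimal Python sketch of sliding a 64-byte window over a packet and keeping only the fingerprints whose low 8 bits are zero. It is an assumption-laden stand-in: it uses a Rabin-Karp style rolling hash over integers rather than true Rabin polynomial arithmetic, and the multiplier `BASE`, the modulus `MOD`, and the function name are illustrative choices that do not come from the talk.

```python
from typing import List, Tuple

WINDOW = 64           # bytes covered by each fingerprint, as in the talk
BASE = 257            # assumed multiplier
MOD = (1 << 61) - 1   # assumed prime modulus; fingerprints fit in 8 bytes
SELECT_MASK = 0xFF    # keep fingerprints whose low 8 bits are all zero


def selected_fingerprints(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, fingerprint) pairs for the selected windows of `data`."""
    if len(data) < WINDOW:
        return []
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    fp = 0
    for byte in data[:WINDOW]:        # fingerprint of the first window
        fp = (fp * BASE + byte) % MOD
    selected = [(0, fp)] if fp & SELECT_MASK == 0 else []
    for i in range(WINDOW, len(data)):
        # Roll the window one byte: drop data[i - WINDOW], append data[i].
        fp = ((fp - data[i - WINDOW] * top) * BASE + data[i]) % MOD
        if fp & SELECT_MASK == 0:
            selected.append((i - WINDOW + 1, fp))
    return selected
```

A 1,400-byte packet has 1,337 windows, so with a selection probability of 1/2^8 roughly 5 fingerprints are expected per packet, in line with the figure quoted in the talk. Because selection depends only on window content, the same fingerprints reappear when the content is shifted to a different offset.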

Dup Ratio & Tput

RF can detect more duplication than FH. All the duplication ratios detected by RF are larger than 75%. The offline-analysis processing throughput of RF is lower than that of FH, but still large enough (> 90Mbps) to process P2P streaming (400 Kbps).

In this slide, we show the duplication ratios detected by our two methods and also their processing speed. The duplication ratio is the percentage of upstream traffic that is duplicated in the downstream buffer, which is directly related to how much traffic we can remove. As shown in the left figure, RF can detect much more duplication than FH for PPLive and TVAnts. All the duplication ratios detected by RF are larger than 75%, which is a promising result. The right figure shows that the processing throughput of RF is lower than that of FH because of its extra costs. However, RF’s throughput for all three applications is larger than 90 Mbps, which implies that the pure CPU cost of using RF to detect duplicated traffic is very low when processing a typical 400Kbps P2P stream. RF’s high performance and low overhead motivated us to use it in the implementation of our traffic reduction scheme.

Duplication Beginning Offset
FH can only detect the duplication when the offsets in the upstream and downstream packets are the same (no re-packetizing). ESM does not have any offset differences, so FH performs well. TVAnts has a lot of re-packetizing, so FH performs the worst.

We also measured the offset in the upstream and downstream packets at which the duplicated content begins. If the two offsets differ, which usually happens with TCP re-packetizing, the Fixed Hashing (FH) method will not be able to find the duplication. RF, however, can effectively detect this kind of duplication thanks to the multiple fingerprints it generates. The cumulative distribution function of these offsets for ESM shows that they are exactly the same for upstream and downstream packets, which explains why FH works well for ESM. For TVAnts the offsets differ widely between upstream and downstream packets, which leads to the worst FH performance among the three applications.
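The effect of re-packetizing on Fixed Hashing can be reproduced with a toy example. The sketch below is purely illustrative: the synthetic byte stream, the packet boundaries, the 100-byte shift, and the use of SHA-1 are all assumptions chosen for the demonstration, not values from the traces.

```python
import hashlib


def fh_fingerprint(payload: bytes) -> bytes:
    """Fixed Hashing: fingerprint only the first 64 bytes of a packet."""
    return hashlib.sha1(payload[:64]).digest()


# 3,200 bytes of non-repeating content (4-byte counters) standing in for a stream.
stream = b"".join(i.to_bytes(4, "big") for i in range(800))

# Downstream packetization as received by the wireless peer.
downstream = [stream[0:1400], stream[1400:2800]]
# The relay re-packetizes: same bytes, but boundaries shifted by 100 bytes.
upstream = [stream[100:1500], stream[1500:2800]]

cached_fps = {fh_fingerprint(p) for p in downstream}
fh_hits = sum(fh_fingerprint(p) in cached_fps for p in upstream)

# Ground truth: every upstream packet is fully contained in the cached bytes.
all_cached = b"".join(downstream)
truly_duplicated = sum(p in all_cached for p in upstream)

print(f"FH matches: {fh_hits} of {len(upstream)}; "
      f"actually duplicated: {truly_duplicated} of {len(upstream)}")
```

Both upstream packets consist entirely of previously downloaded bytes, yet neither starts at a cached packet boundary, so the fixed-offset fingerprint finds nothing. A content-defined scheme that fingerprints many windows per packet does not depend on packet boundaries in this way.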

Forwarding Delay

[Figure: forwarding delay plots; axis labels 200 seconds, 200 seconds, 10 seconds, 20 seconds, 10 ms]

PPLive and TVAnts: most upstream packets are forwarded within 200 seconds, and within 20 seconds for 70%. ESM: within 10 ms. This implies the downstream buffer can be quite small.

A very important metric related to the memory cost is the forwarding delay between two duplicated packets, that is, the time between a packet being downloaded and being uploaded again. A small forwarding latency means that a small buffer for downstream packets is sufficient for duplication detection. The two figures above show that the forwarding delay for PPLive and TVAnts is within 200 seconds, which also means their content delivery liveness is within roughly 3 minutes. For PPLive, more than 70% of the content is actually forwarded within 10 seconds. For TVAnts, most is forwarded within 20 seconds. ESM has a very small forwarding delay. These results suggest that a small downstream buffer is enough for identifying the duplicated traffic of P2P live streaming applications.

Outline: Measurement & Analysis of P2P Streaming Traffic; SCAP Design & Implementation; Evaluation; Summary.

The measurement and analysis results above further motivated our design and implementation of the SCAP scheme. I will first give an overview of the scheme.

SCAP (Smart Caching in Access Points) Overview
[Figure labels: Internet, Access Point, downstream buffer (at the AP), downstream buffer (at the peer), Relay/Viewing Peer, WLAN, original upstream packet, metadata upstream packet (if duplication is found in the downstream buffer)]

To detect duplicated upstream packets, we first deploy a downstream packet buffer in the Access Point and a smaller buffer at the wireless peer. When the peer needs to upload a packet, SCAP detects duplication with the help of the buffer. If duplication is found, the upstream packet is compressed into a nearly empty one carrying extra meta-information (we call it a metadata upstream packet) and is then sent to the access point. The access point locates the duplicated packets in its buffer, reassembles the original upstream packet by copying the duplicated bytes from the buffer, and then sends the restored packet to the Internet for the viewing peers being served.
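The compress-and-reassemble exchange can be illustrated with a small Python sketch. Everything here is hypothetical: the talk does not give the metadata packet layout, so the `MetadataPacket` fields are invented for illustration, and a brute-force substring search replaces the fingerprint lookup to keep the example short. Both caches are modeled as dictionaries from a packet identifier to the cached payload.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MetadataPacket:
    packet_id: int     # identifies the cached downstream packet
    cache_offset: int  # where the duplicated run starts in that packet
    length: int        # number of duplicated bytes left out of the upload
    prefix: bytes      # upstream bytes before the run, sent as-is
    suffix: bytes      # upstream bytes after the run, sent as-is


def compress(upstream: bytes, cache: Dict[int, bytes],
             min_run: int = 64) -> Optional[MetadataPacket]:
    """Station side: replace a run shared with a cached packet by a reference."""
    for start in range(len(upstream) - min_run + 1):
        probe = upstream[start:start + min_run]
        for packet_id, cached in cache.items():
            pos = cached.find(probe)  # stand-in for a fingerprint lookup
            if pos < 0:
                continue
            length = min_run
            while (start + length < len(upstream)
                   and pos + length < len(cached)
                   and upstream[start + length] == cached[pos + length]):
                length += 1           # extend the match byte by byte
            return MetadataPacket(packet_id, pos, length,
                                  upstream[:start], upstream[start + length:])
    return None                       # no duplication: send the packet unchanged


def reassemble(meta: MetadataPacket, cache: Dict[int, bytes]) -> Optional[bytes]:
    """AP side: rebuild the original upstream packet from the cached bytes."""
    cached = cache.get(meta.packet_id)
    if cached is None:
        return None                   # cache miss: the station must resend in full
    run = cached[meta.cache_offset:meta.cache_offset + meta.length]
    return meta.prefix + run + meta.suffix
```

In this sketch only `prefix`, `suffix`, and a few integers would cross the wireless link, while the duplicated run is copied from the AP's own buffer.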

Design Issues

Buffer size: 7.5MB is needed to store the most recent 200 seconds of traffic (at a 300Kbps rate), which is affordable for a wireless station. But the AP needs to buffer for multiple stations: the AP should dynamically adjust the buffer space for each station according to its duplication ratio, in order to achieve the highest traffic reduction with limited buffer space. Buffer synchronization between AP and station: if a metadata upstream packet cannot be reassembled at the AP due to a cache miss, the TCP flow will stall. The wireless station caches copies of recently sent upstream packets and resends the uncompressed packet when needed.

Although our scheme is not complex, there are still several challenges in the design. First, if we only store the latest 200 seconds of downstream content in the buffer, as suggested by our measurement results, the buffer size is about 7.5MB, which is affordable for a wireless station. However, the access point may need to serve several wireless peers at the same time, so its buffer should be designed more carefully in order to achieve the best performance with a limited buffer size. Another issue raised by the limited buffer size is that the buffer state of the AP and the wireless station should be synchronized, to ensure that the metadata upstream packet can be reassembled into the original upstream packet at the access point. Otherwise the TCP flow may be delayed by a corrupted packet. A simple way to avoid this situation is to let the station detect such an abnormal delay and resend the original upstream packet, which it keeps cached in a small buffer.
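The 7.5MB figure follows directly from the stream rate and the retention window, as the short Python sketch below shows. The second function is only one possible reading of "adjust the buffer space according to duplication ratios": a proportional split is an assumed policy for illustration, since the talk does not specify the allocation rule, and both function names are hypothetical.

```python
from typing import Dict


def buffer_bytes(rate_kbps: float, seconds: float) -> float:
    """Bytes needed to hold `seconds` of a stream at `rate_kbps` (1 Kbps = 1000 bit/s)."""
    return rate_kbps * 1000 * seconds / 8


def allocate(total_bytes: int, dup_ratio: Dict[str, float]) -> Dict[str, int]:
    """Assumed policy: split the AP buffer in proportion to each station's duplication ratio."""
    total_ratio = sum(dup_ratio.values())
    if total_ratio == 0:
        return {station: 0 for station in dup_ratio}
    return {station: int(total_bytes * ratio / total_ratio)
            for station, ratio in dup_ratio.items()}


# 300 Kbps for 200 seconds: 300 * 1000 * 200 / 8 = 7,500,000 bytes = 7.5 MB.
station_mb = buffer_bytes(300, 200) / 1e6
print(station_mb)
```

Under the proportional policy, a station whose upstream traffic is mostly duplicated receives most of the AP buffer, while a station that relays nothing receives none.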

Prototype Implementation
Modified the HostAP driver in the Linux kernel for the AP and stations. The wireless card is based on the Intersil Prism 2.5 chipset (802.11b). Identification of the downstream packet, needed for the AP to locate the packet when decompressing the upstream packet: cannot use the Sequence Control field (2 bytes) because it is filled in by the firmware; have to use the first fingerprint value (8 bytes).

In the prototype implementation of SCAP, we modified the HostAP driver in the Linux kernel for the access point and the wireless stations. This driver needs a wireless card based on the Intersil Prism 2.5 chipset, which only supports 802.11b networks. (In the future, we aim to port SCAP to the Madwifi driver, which can support 802.11g networks.) During the implementation, we also found that we cannot simply use the Sequence Control field in the MAC header to identify the same downstream packet at the access point and the wireless station, because this field is generated by the firmware. Taking advantage of the characteristics of our scheme, we use the first fingerprint of a packet as its identifier.

Outline: Measurement & Analysis of P2P Streaming Traffic; SCAP Overview; Design & Implementation; Evaluation; Summary.

Next, I will present the performance evaluation results of our prototype.

Performance Evaluation: LAN Experiment
[Figure: downstream throughput (left) and upstream throughput (right); bar labels 4.50, 4.43, 4.7, and 8.9 Mbps]

The station first receives a file from a server, then sends it back. RF: little overhead for the downstream throughput (1.5% decrease), and 88% improvement for the upstream throughput. FH: no improvement, due to constant TCP re-packetizing.

We first ran an experiment using a file server within our LAN. A wireless station establishes a TCP connection with the file server and then receives a sized file from the server. After that, the station uploads the file back to the server. This experiment emulates a typical P2P operation. The buffer size is set to about 70MB. As shown in the left figure, RF introduces a very small overhead for the downstream throughput (1.5% decrease), and the right figure shows that RF can improve the upstream throughput from nearly 5Mbps to 9Mbps, which is about an 88% improvement. For the FH method, the cost reflected by the decrease in downstream throughput is nearly 0, but because FH cannot detect duplication under the constant TCP re-packetizing in this experiment, the upstream throughput is not improved at all.

Performance Evaluation: Internet Experiment
Evaluate PPLive, TVAnts, and ESM. Run the applications in a VMWare-based Windows XP guest OS so that the HostAP driver can work. Measurement methods: because P2P streaming is a constant-bit-rate stream, the upstream throughput will not change even if we reduce its traffic. Run iperf on another wireless station to observe the impact on WLAN TCP throughput. Run Ping to observe the impact on response time. Run multiple trials to obtain comparable P2P downstream throughput for comparison. Each trial runs for 600 seconds.

We then evaluated our scheme in a real Internet experiment. We again chose the three representative P2P applications for our evaluation. However, because all these applications only have Windows versions, we had to run them in a Windows XP virtual machine on a Linux host so that the modified HostAP driver could work. In this experiment, we cannot simply measure the upstream throughput to determine the throughput improvement, because P2P streaming is a constant-bit-rate stream: even if its upstream traffic is heavily reduced, it still holds to the original throughput. We ran iperf (network throughput measurement software) on another wireless station to passively observe the impact of our scheme on the WLAN TCP throughput. We also ran a ping session in order to observe the impact on packet response time. For each application, we ran multiple trials of the experiment with the original HostAP driver and with our modified driver to obtain comparable P2P downstream throughput, and each trial ran for 10 minutes.

Internet Experiment: Evaluation Results
RF/FH performs best for TVAnts since it has the largest volume of upstream traffic: increases TCP throughput by Mbps (54% of upstream traffic); decreases Ping round-trip time by 83 ms (-26%). Also performs well for PPLive/ESM.

In our Internet experiment, TVAnts has the largest volume of upstream traffic, which results in the best performance among the three applications. For example, the RF method improves the iperf-measured WLAN TCP throughput by nearly 1 Mbps, which is 54% of the upstream throughput, and also decreases the Ping round-trip time by 83 ms, which is about 1/4 of the original Ping RTT. For PPLive and ESM, the WLAN throughput and response time are also improved because of the upstream traffic reduction.

Summary

With the increasing popularity of P2P streaming applications and the pervasive deployment of WLANs, more peers will be connected wirelessly. We study the impact of wireless peers on the performance of wireless and Internet users. Without proper control of P2P traffic, the performance of both parties can be significantly affected. We designed and implemented SCAP (Smart Caching in Access Points) to reduce the upstream traffic of P2P live streaming applications. Our prototype-based evaluation shows the effectiveness of SCAP: SCAP improves the throughput of the WLAN by up to 88%, and SCAP reduces the response delay to Internet peers as well.

Now I would like to summarize this talk. With the pervasive deployment of WLANs, more and more peers are wireless users. Without proper control of the P2P traffic, the performance of the other users in the WLAN and of the Internet peers being served will be significantly affected. Motivated by our measurement and analysis of P2P streaming traffic, we designed and implemented a prototype system called SCAP, Smart Caching in Access Points, to reduce the upstream traffic of P2P live streaming applications. Our evaluation results show that SCAP can effectively improve WLAN throughput by up to 88% and can decrease the streaming packet delay to Internet peers as well.

Thank you. Enhua Tan: etan@cse. ohio-state. edu http://www. cse
Thanks for your time, any questions?

SCAP (Smart Caching in Access Points) – Basic Idea
The AP stores downstream data in buffer (1). The station stores downstream data in buffer (2). The station compares the upstream packet (3) with (2) and uploads the difference (4). The AP assembles the upstream packet with the data in (1) and sends it to the Internet.

Workflow of SCAP

Rabin Fingerprinting

Rabin Fingerprinting (RF) can produce fingerprints for a continuous data stream quickly: advancing the fingerprint requires only an addition, a multiplication, and a mask. Other hash functions such as MD5/SHA lack this property (and they are also more complex).

Some Related Work

XORs in the Air: Practical Wireless Network Coding (Sigcomm’06): utilizes the broadcast nature of wireless networks to improve the throughput of multi-hop networks (instead of application characteristics); our scheme utilizes the traffic pattern of P2P applications. A Protocol-Independent Technique for Eliminating Redundant Network Traffic (Sigcomm’00): reduces redundant traffic using Rabin Fingerprinting. A Low-bandwidth Network File System (SOSP’01): exploits similarities between different versions of a file to reduce update traffic.

The most recent work, published at Sigcomm 2006, namely XORs in the Air, aims to improve the throughput of multi-hop networks by utilizing the broadcast nature of wireless networks instead of application characteristics. Our work differs from it in that we observe and exploit the unique duplicated traffic pattern of P2P applications.
