REDUNDANCY IN NETWORK TRAFFIC: FINDINGS AND IMPLICATIONS Ashok Anand Ramachandran Ramjee Chitra Muthukrishnan Microsoft Research Lab, India Aditya Akella.


REDUNDANCY IN NETWORK TRAFFIC: FINDINGS AND IMPLICATIONS Ashok Anand Ramachandran Ramjee Chitra Muthukrishnan Microsoft Research Lab, India Aditya Akella University of Wisconsin, Madison

Redundancy in network traffic
- Redundancy in network traffic
  - Popular objects, partial content matches, headers
- Redundancy elimination (RE) for improving network efficiency
  - Application layer object caching: Web proxy caches
  - Recent protocol independent RE approaches: WAN optimizers, de-duplication, WAN backups, etc.

Protocol independent RE
- Message granularity: packet or object chunk
- Different RE systems operate at different granularity
Diagram label: WAN link

RE applications
- Enterprise and data centers
  - Accelerate WAN performance
- As a primitive in network architecture
  - Packet Caches [SIGCOMM 2008]
  - Ditto [MobiCom 2008]

Protocol independent RE in enterprises
Diagram labels: Enterprises, WAN Opt, ISP, Data centers
- Globalized enterprise dilemma
  - Centralized servers: simple management, hit on performance
  - Distributed servers: direct request to closest servers, complex management
- RE gives benefits of both worlds
  - Deployed in network middle-boxes
  - Accelerate WAN traffic while keeping management simple
- RE for accelerating WAN backup applications

Recent proposals for protocol independent RE
Diagram labels: Enterprises, University, ISP, Web content
- RE deployment on ISP access links to improve capacity
  - Reduce load on ISP access links
  - Improve effective capacity
- Packet caches [SIGCOMM 2008]
  - RE on all routers
- Ditto [MobiCom 2008]
  - Use RE on nodes in wireless mesh networks to improve throughput

Understanding protocol independent RE systems
- Currently little insight into these RE systems
  - How far are these RE techniques from optimal? Are there other better schemes?
  - When is network RE most effective?
  - Do end-to-end RE approaches offer performance close to network RE?
  - What fundamental redundancy patterns drive the design and bound the effectiveness?
- Important for effective design of current systems as well as future architectures, e.g. Ditto, packet caches

Large scale trace-driven study
- First comprehensive study
  - Traces from multiple vantage points
  - Focus on packet level redundancy elimination
- Performance comparison of different RE algorithms
  - Average bandwidth savings
  - Bandwidth savings in peak and 95th percentile utilization
  - Impact on burstiness
- Origins of redundancy
  - Intra-user vs. inter-user
  - Different protocols
- Patterns of redundancy
  - Distribution of match lengths
  - Hit distribution
  - Temporal locality of matches

Data sets
- Enterprise packet traces (3 TB) with payload
  - 11 enterprises: Small (10-50 IPs), Medium ( IPs), Large (100+ IPs)
  - 2 weeks
  - Protocol composition
    - HTTP (20-55%); Spring et al. (64%)
    - File sharing (25-70%); centralization of servers
- UW Madison packet traces (1.6 TB) with payload
  - IPs; trace collected at campus border router
  - Outgoing /24, web server traffic
  - 2 different periods of 2 days each
  - Protocol composition
    - Incoming, HTTP 60%
    - Outgoing, HTTP 36%

Evaluation methodology
- Emulate memory-bound (500 MB - 4 GB) WAN optimizer
  - Entire cache resides in DRAM (packet-level RE)
- Emulate only redundancy elimination
  - WAN optimizers do other optimizations also
- Deployment across both ends of access links
  - Enterprise to data center
  - All traffic from University to one ISP
- Replay packet trace
  - Compute bandwidth savings as (saved bytes / total bytes)
  - Includes packet headers in total bytes
  - Includes overhead of shim headers used for encoding
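
The savings metric on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `PacketResult` record, the function name, and the 12-byte shim size are hypothetical and do not come from the talk. The slide states only that headers count toward total bytes and that shim overhead is charged against the savings.

```python
from dataclasses import dataclass

# Hypothetical per-match encoding overhead; the talk does not give a value.
SHIM_BYTES_PER_MATCH = 12


@dataclass
class PacketResult:
    wire_bytes: int     # original packet size, headers included
    matched_bytes: int  # payload bytes replaced by references to the cache
    num_matches: int    # number of encoded regions in this packet


def bandwidth_savings(results, shim_bytes_per_match=SHIM_BYTES_PER_MATCH):
    """Return saved bytes / total bytes over a replayed trace."""
    total = sum(r.wire_bytes for r in results)
    if total == 0:
        return 0.0
    # Each match removes matched bytes but adds one shim header.
    saved = sum(r.matched_bytes - r.num_matches * shim_bytes_per_match
                for r in results)
    return saved / total


trace = [PacketResult(1000, 500, 1), PacketResult(1000, 0, 0)]
print(f"{bandwidth_savings(trace):.3f}")
```

Charging the shim per match means that very short matches can cost more than they save, which is one reason a minimum match size matters.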

Large scale trace-driven study
- Performance comparison of different RE algorithms
- Origins of redundancy
- Patterns of redundancy
  - Distribution of match lengths
  - Hit distribution

Redundancy elimination algorithms
- Redundancy suppression across different packets (uses history)
  - MODP (Spring et al.)
  - MAXP (new algorithm)
- Data compression only within packets (no history)
  - GZIP and other variants

MODP
- Spring et al. [SIGCOMM 2000]
- Compute fingerprints
  - Rabin fingerprinting over a sliding window of the packet payload
  - Value sampling: sample those fingerprints whose value is 0 mod p
- Lookup fingerprints in fingerprint table
  - Fingerprint table entries point into the packet store (Payload-1, Payload-2)
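
The MODP pipeline on this slide (sliding-window fingerprints, value sampling, fingerprint table, packet store) might look roughly like the following. It is a minimal sketch, not the authors' implementation: a simple polynomial rolling hash stands in for true Rabin fingerprints, and the names `rolling_fingerprints`, `modp_sample` and `PacketCache`, along with the default window and `p`, are assumptions chosen for illustration.

```python
def rolling_fingerprints(payload, window=32, base=257, mod=(1 << 61) - 1):
    """Yield (offset, fingerprint) for every `window`-byte substring.

    A polynomial rolling hash is used as a stand-in for Rabin fingerprints.
    """
    if len(payload) < window:
        return
    top = pow(base, window - 1, mod)  # weight of the byte leaving the window
    fp = 0
    for b in payload[:window]:
        fp = (fp * base + b) % mod
    yield 0, fp
    for i in range(window, len(payload)):
        fp = ((fp - payload[i - window] * top) * base + payload[i]) % mod
        yield i - window + 1, fp


def modp_sample(payload, p=32, window=32):
    """Value sampling: keep only fingerprints whose value is 0 mod p."""
    return [(off, fp) for off, fp in rolling_fingerprints(payload, window)
            if fp % p == 0]


class PacketCache:
    """Fingerprint table plus packet store (unbounded in this sketch)."""

    def __init__(self, p=32, window=32):
        self.p, self.window = p, window
        self.table = {}  # fingerprint -> (packet id, offset in that packet)
        self.store = []  # previously seen payloads

    def process(self, payload):
        """Return (offset, packet id, stored offset) for each verified match."""
        samples = modp_sample(payload, self.p, self.window)
        matches = []
        for off, fp in samples:
            hit = self.table.get(fp)
            if hit is None:
                continue
            pid, poff = hit
            # Compare the bytes to rule out fingerprint collisions.
            if (self.store[pid][poff:poff + self.window]
                    == payload[off:off + self.window]):
                matches.append((off, pid, poff))
        pid = len(self.store)
        self.store.append(payload)
        for off, fp in samples:
            self.table[fp] = (pid, off)
        return matches
```

A real encoder would also extend each match to the left and right of the sampled window and evict old packets once the store fills up. Both steps are omitted here.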

MAXP
- Similar to MODP
- Only selection criteria changes
  - MODP: sample those fingerprints whose value is 0 mod p; can leave a region (shaded in the figure) with no fingerprint to represent it
  - MAXP: choose fingerprints that are local maxima (or minima) for a p-byte region; gives uniform selection of fingerprints
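
The local-maximum selection rule can be illustrated on a plain list of values. This is a hedged sketch: the function name `maxp_select`, the centered neighborhood of `p // 2` positions on each side, and the strict-maximum tie rule are assumptions made here, since the slide states only that local maxima (or minima) over a p-byte region are chosen.

```python
def maxp_select(values, p):
    """Return indices whose value is the unique maximum of its neighborhood.

    The neighborhood spans p // 2 positions on each side (an assumption);
    the quadratic scan is for clarity, not speed.
    """
    half = p // 2
    n = len(values)
    selected = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        region = values[lo:hi]
        if values[i] == max(region) and region.count(values[i]) == 1:
            selected.append(i)
    return selected


print(maxp_select([1, 5, 2, 3, 9, 4, 0, 7, 6], p=4))  # [1, 4, 7]
```

Because selection depends on neighboring values rather than on an absolute value test, two selected positions can never fall inside each other's neighborhood. Selected positions are therefore always more than `p // 2` apart and cannot cluster.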

Optimal
- Approximate upper bound on optimal
  - Store every fingerprint in a Bloom filter
  - Identify fingerprint match if Bloom filter contains the fingerprint
  - Low false positive rate for Bloom filter: 0.1%
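
A rough sketch of the Bloom-filter bookkeeping behind this upper bound follows. The class, its sizing formulas (the standard ones for a target false-positive rate), the SHA-256 double hashing, and the `count_repeats` helper are illustrative assumptions. The slide specifies only that every fingerprint is stored and that the false-positive rate is 0.1%.

```python
import hashlib
import math


class BloomFilter:
    """Small Bloom filter sized for `capacity` items at `fp_rate`."""

    def __init__(self, capacity, fp_rate=0.001):
        # Standard sizing: m bits and k hash functions.
        self.m = max(8, math.ceil(-capacity * math.log(fp_rate)
                                  / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Double hashing derived from one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def count_repeats(fingerprints, bloom):
    """Count fingerprints reported as already seen, recording new ones."""
    repeats = 0
    for fp in fingerprints:
        key = fp.to_bytes(8, "big")  # assumes fingerprints fit in 64 bits
        if key in bloom:
            repeats += 1
        else:
            bloom.add(key)
    return repeats


print(count_repeats([1, 2, 3, 1, 2], BloomFilter(capacity=100)))
```

A Bloom filter has no false negatives, so no true repeat is missed. False positives can only inflate the count, which is why the result is an approximate upper bound rather than an exact optimum.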

Comparison of MODP, MAXP and optimal
- MAXP outperforms MODP by 5-10% in most cases
  - Uniform sampling approach of MAXP
  - MODP loses due to non-uniform clustering of fingerprints
- New RE algorithm which performs better than classical MODP

Comparison of different RE algorithms ("->" means "followed by")
- GZIP offers 3-15% benefit
  - (10 ms buffering) -> GZIP increases benefit up to 5%
- MAXP significantly outperforms GZIP, offers 15-60% bandwidth savings
  - MAXP -> (10 ms) -> GZIP further enhances benefit up to 8%
- We can use combination of RE algorithms to enhance the bandwidth savings
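
The difference between compressing each packet on its own and compressing after a short buffering interval can be sketched with Python's `zlib` module (DEFLATE, the algorithm gzip uses). The function names, the fallback to the raw payload when compression does not help, and the way packets are grouped into 10 ms batches by timestamp are assumptions made for illustration. The talk does not describe its buffering mechanism in this detail.

```python
import zlib


def per_packet_bytes(payloads):
    """Bytes sent when each payload is compressed independently.

    A payload is sent uncompressed if compression would enlarge it.
    """
    return sum(min(len(p), len(zlib.compress(p))) for p in payloads)


def buffered_bytes(timed_payloads, window_s=0.010):
    """Bytes sent when payloads arriving within `window_s` are compressed together.

    `timed_payloads` is a list of (timestamp in seconds, payload),
    sorted by timestamp.
    """
    total = 0
    batch, batch_start = [], None
    for ts, payload in timed_payloads:
        if batch_start is not None and ts - batch_start >= window_s:
            total += len(zlib.compress(b"".join(batch)))
            batch, batch_start = [], None
        if batch_start is None:
            batch_start = ts
        batch.append(payload)
    if batch:
        total += len(zlib.compress(b"".join(batch)))
    return total


request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
timed = [(i * 0.001, request) for i in range(20)]
print(per_packet_bytes([p for _, p in timed]), buffered_bytes(timed))
```

Buffering lets the compressor see repetition across packets in the same batch, at the cost of added latency. It still keeps no history beyond the batch, unlike MODP or MAXP.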

Large scale trace-driven study
- Performance study of different RE algorithms
- Origins of redundancy
- Patterns of redundancy
  - Distribution of match lengths
  - Match distribution

Origins of redundancy
Diagram labels: Enterprise, Middlebox, Flow-1, Flow-2, Flow-3, Middlebox, Data Centers
- Different users accessing the same content, or same content being accessed repeatedly by same user?
- Middle-box deployments can eliminate bytes shared across users
  - How much sharing across users in practice?
- INTER-USER: sharing across users
  - (a) INTER-SRC (b) INTER-DEST (c) INTER-NODE
- INTRA-USER: redundancy within same user
  - (a) INTRA-FLOW (b) INTER-FLOW

Study of composition of redundancy
Chart legend: Inter User, Intra User
- 90% savings is across destinations for Uout/24
- For Uin/Uout, 30-40% savings is due to intra-user
- For enterprises, 75-90% savings is due to intra-user

Implication: End-to-end RE as a promising alternative
Diagram labels: Enterprise, Middlebox, Middlebox, Data Centers
- End-to-end RE as a compelling design choice
  - Similar savings
  - Deployment requires just software upgrade
    - Middle-boxes are expensive
  - Middle-boxes may violate end-to-end semantics

Large scale trace-driven study
- Performance study of different RE algorithms
- End-to-end RE versus network RE
- Patterns of redundancy
  - Distribution of match lengths
  - Hit distribution

Match length analysis
- Do most of the savings come from full packet matches?
  - If so, a simple technique of indexing full packets would be good enough
- For partial packet matches, what should be the minimum window size?

Match length analysis for enterprise
Chart axes: bins of different match lengths (in bytes) vs. percentage
- 70% of the matches are less than 150 bytes and contribute 20% of savings
- 10% of the matches come from full matches and contribute 50% of savings
- Need to index small chunks of size <= 150 bytes for maximum benefit

Hit distribution
- Contributors of redundancy
  - Few pieces of content repeated multiple times: small packet store would be sufficient
  - Many pieces of content repeated few times: large packet store needed

Zipf-like distribution for chunk matches
- Chunk ranking
  - Unique chunk matches sorted by their hit counts
  - Straight line shows the Zipfian distribution
  - Similar to web page access frequency
- How much do popular chunks contribute to savings?

Savings due to hit distribution
- 80% of savings come from 20% of chunks
- Need to index 80% of chunks for remaining 20% of savings
- Diminishing return for cache size
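
The "share of savings from the most popular chunks" calculation can be sketched as below. The function name and the simplifying assumption that all chunks have equal size (so savings are proportional to hit counts) are hypothetical, and the hit counts in the usage line are made-up illustrative numbers, not data from the traces.

```python
def savings_share(hit_counts, top_fraction):
    """Fraction of all hits contributed by the top `top_fraction` of chunks.

    Assumes equal-sized chunks, so hits stand in for bytes saved.
    """
    if not hit_counts:
        return 0.0
    ranked = sorted(hit_counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Illustrative Zipf-like hit counts (roughly 100 / rank), not measured data.
hits = [100, 50, 33, 25, 20, 17, 14, 12, 11, 10]
print(f"{savings_share(hits, 0.2):.2f}")
```

Evaluating the function at increasing values of `top_fraction` traces the diminishing-returns curve the slide describes: each additional slice of less popular chunks adds a smaller share of the savings.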

Savings vs. cache size
- Small packet caches (250 MB) provide significant percentage of savings
- Diminishing returns for increasing packet cache size after 250 MB

Conclusion
- First comprehensive study of protocol independent RE systems
- Key results
  - 15-60% savings using protocol independent RE
  - A new RE algorithm, which performs 5-10% better than the Spring et al. approach
  - Zipfian distribution of chunk hits; small caches are sufficient to extract most of the redundancy
  - End-to-end RE solutions are promising alternatives to memory-bound WAN optimizers for enterprises

Thank you! Questions?

Backup slides

Peak and 95th percentile savings

Effect on burstiness
- Wavelet based multi-resolution analysis
  - Energy plot: higher energy means more burstiness
- Compared with uniform compression
- Results
  - Enterprise: no reduction in burstiness; peak savings lower than average savings
  - University: reduction in burstiness; positive correlation of link utilization with redundancy

Redundancy across protocols
- Large enterprise
  - Table columns: Protocol, Percentage volume, Percentage redundancy
  - Rows: HTTP, SMB, LDAP, Src code ctrl
- University
  - Table columns: Protocol, Percentage volume, Percentage redundancy
  - Rows: HTTP, DNS, RTSP 3.382, FTP