Towards a Scalable, Adaptive and Network-aware Content Distribution Network
Yan Chen, EECS Department, UC Berkeley
Outline
– Motivation and Challenges
– Our Contributions: the SCAN system
– Case Study: Tomography-based Overlay Network Monitoring
– Conclusions

Motivation
The Internet has evolved into a commercial infrastructure for service delivery
– Web delivery, VoIP, streaming media, …
Challenges for Internet-scale services
– Scalability: 600M users, 35M Web sites, 2.1 Tb/s
– Efficiency: bandwidth, storage, management
– Agility: dynamic clients/network/servers
– Security, etc.
Focus on content delivery: Content Distribution Networks (CDNs)
– 4 billion Web pages in total, with daily growth of 7M pages
– Annual traffic growth of 200% for the next 4 years

How CDN Works

Challenges for CDN
Replica Location
– Find nearby replicas with good DoS-attack resilience
Replica Deployment
– Dynamics, efficiency
– Client QoS and server capacity constraints
Replica Management
– Scalability of replica index state maintenance
Adaptation to Network Congestion/Failures
– Overlay monitoring scalability and accuracy

SCAN: Scalable Content Access Network (overview diagram)
– Provision: dynamic replication + update multicast tree building
– Replica management: (incremental) content clustering
– Network end-to-end distance monitoring: Internet Iso-bar (latency), TOM (loss rate)
– Network DoS-resilient replica location: Tapestry

Replica Location
Existing Work and Problems
– Centralized, replicated, and distributed directory services
– No security benchmarking: which one has the best DoS-attack resilience?
Solution
– Proposed the first simulation-based network DoS resilience benchmark
– Applied it to compare the three directory services
– DHT-based distributed directory services have the best resilience in practice
Publication
– 3rd Int. Conf. on Information and Communications Security (ICICS), 2001

Replica Placement/Maintenance
Existing Work and Problems
– Static placement
– Dynamic but inefficient placement
– No coherence support
Solution
– Dynamically place a close-to-optimal number of replicas under client QoS (latency) and server capacity constraints
– Self-organize replicas into a scalable application-level multicast tree for disseminating updates
– Requires only the overlay network topology
Publication
– IPTPS 2002, Pervasive Computing 2002

Replica Management
Existing Work and Problems
– Cooperative access for good efficiency requires maintaining replica indices
– Per-Website replication: scalable, but poor performance
– Per-URL replication: good performance, but unscalable
Solution
– Clustering-based replication reduces the overhead significantly without sacrificing much performance
– Proposed a unique online Web-object popularity prediction scheme based on hyperlink structures
– Online incremental clustering and replication to push replicas before they are accessed
Publication
– ICNP 2002, IEEE J-SAC 2003

Adaptation to Network Congestion/Failures
Existing Work and Problems
– Latency estimation
» Clustering-based: relies on network proximity, inaccurate
» Coordinate-based: assumes symmetric distances, unscalable to update
– General metrics: O(n²) measurements for n end hosts
Solution
– Latency: Internet Iso-bar, clustering based on latency similarity to a small number of landmarks
– Loss rate: Tomography-based Overlay Monitoring (TOM), selectively monitor a basis set of O(n log n) paths to infer the loss rates of all other paths
Publication
– Internet Iso-bar: SIGMETRICS PER 2002
– TOM: SIGCOMM IMC 2003

SCAN Architecture
Leverages a Distributed Hash Table (Tapestry) for
– Distributed, scalable location with guaranteed success
– Search with locality
(Diagram: a data plane of clients, caches, and replicas over a network plane of the data source, Web server, and SCAN servers connected by a Tapestry mesh; annotated with replica location, dynamic replication/update and replica management, adaptive coherence, and overlay network monitoring)

Methodology
Iterate among analytical evaluation, algorithm design, realistic simulation, and PlanetLab tests, driven by:
– Network topology
– Web workload
– Network end-to-end latency measurements

Case Study: Tomography-based Overlay Network Monitoring

TOM Outline
– Goal and Problem Formulation
– Algebraic Modeling and Basic Algorithms
– Scalability Analysis
– Practical Issues
– Evaluation
– Application: Adaptive Overlay Streaming Media
– Conclusions

Existing Work
General metrics: RON (O(n²) measurements)
Latency estimation
– Clustering-based: IDMaps, Internet Iso-bar, etc.
– Coordinate-based: GNP, ICS, Virtual Landmarks
Network tomography
– Focuses on inferring the characteristics of physical links rather than end-to-end paths
– Limited measurements lead to an under-constrained system with unidentifiable links
Goal: a scalable, adaptive, and accurate overlay monitoring system to detect end-to-end congestion/failures

Problem Formulation
Given an overlay of n end hosts and O(n²) paths, how do we select a minimal subset of paths to monitor so that the loss rates/latencies of all other paths can be inferred?
Assumptions:
– The topology is measurable
– We can only measure end-to-end paths, not individual links

Our Approach
– Select a basis set of k paths that fully describes all O(n²) paths (k ≪ O(n²))
– Monitor the loss rates of the k paths, and infer the loss rates of all other paths
– Applicable to any additive metric, such as latency
(Diagram: an overlay network operation center collects topology information and measurements from the end hosts)

Algebraic Model
For a path with loss rate p traversing links with loss rates l_1, …, l_m, losses compose multiplicatively:

  1 − p = (1 − l_1)(1 − l_2) ⋯ (1 − l_m)

Taking logarithms makes this linear: with x_j = log(1 − l_j) for each link j and b_i = log(1 − p_i) for each path i, every path contributes one equation b_i = Σ_j G_ij x_j, where G_ij = 1 if path i traverses link j and 0 otherwise.
(Diagram: a 4-node example with hosts A, B, C, D and path p_1)
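To make the model concrete, here is a minimal numpy sketch of the log-linearization on a hypothetical 4-node topology; the link and path names are illustrative, not from the deck:

```python
import numpy as np

links = ["A-B", "B-C", "B-D"]          # hypothetical physical links
paths = {                              # each overlay path -> links it traverses
    "A->C": ["A-B", "B-C"],
    "A->D": ["A-B", "B-D"],
}

# Routing matrix G: G[i, j] = 1 iff path i traverses link j
G = np.zeros((len(paths), len(links)))
for i, plinks in enumerate(paths.values()):
    for l in plinks:
        G[i, links.index(l)] = 1

# Simulated link loss rates, and the induced path loss rates
link_loss = np.array([0.01, 0.05, 0.02])
x = np.log(1 - link_loss)              # x_j = log(1 - l_j)
b = G @ x                              # b_i = log(1 - p_i): losses multiply
path_loss = 1 - np.exp(b)
print(dict(zip(paths, path_loss)))     # loss rate of each overlay path
```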

Putting All Paths Together
With r = O(n²) paths in total and s links (s ≪ r), stacking the per-path equations gives the linear system G x = b, where G is the r × s routing matrix.
(Diagram: the same 4-node example, with all path equations stacked into matrix form)

Sample Path Matrix
– x_1 − x_2 is unknown, so x_1 and x_2 cannot be computed individually
– The vectors invisible to measurement form the null space of G
– To separate identifiable from unidentifiable components, decompose x = x_G + x_N, with x_G in the path/row space (measured) and x_N in the null space (unmeasured)
(Diagram: for a 3-link example, (1,1,0) lies in the row space and (1,−1,0) in the null space)
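A small numpy sketch of this decomposition, using the single measured path (1,1,0) from the figure; the link values are made up:

```python
import numpy as np

G = np.array([[1.0, 1.0, 0.0]])   # one monitored path covering links 1 and 2
x = np.array([0.02, 0.04, 0.03])  # hypothetical true link quantities

P = np.linalg.pinv(G) @ G         # orthogonal projector onto the row space of G
x_G = P @ x                       # identifiable component
x_N = x - x_G                     # unidentifiable (null-space) component

assert np.allclose(G @ x_N, 0)    # the null-space part is invisible to measurements
print(x_G, x_N)
```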

Intuition through Topology Virtualization
– Virtual links: minimal path segments whose loss rates are uniquely identifiable
– Virtual links can fully describe all paths; x_G is composed of virtual links
– All end-to-end paths lie in the path space, i.e., G x_N = 0
(Diagram: links 1 and 2 collapse into a single virtual link under virtualization)

More Examples
(Diagram: real links (solid) and all of the overlay paths (dotted) traversing them; after virtualization, consecutive links shared by the same paths collapse into virtual links, e.g., rank(G) = 2 in one topology and rank(G) = 3 in another)

Basic Algorithms
Select k = rank(G) linearly independent paths to monitor
– Use QR decomposition
– Leverage sparse matrices: O(rk²) time and O(k²) memory
» E.g., 79 sec for n = 300 (r = 44,850) and k = 2,541
Compute the loss rates of the other paths
– O(k²) time and O(k²) memory
» E.g., 1.89 sec for the example above
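A minimal sketch of the selection and inference steps, assuming a dense routing matrix and scipy's pivoted QR; the actual system uses sparse, incremental QR to reach the stated complexity:

```python
import numpy as np
from scipy.linalg import qr

def select_basis_paths(G):
    """Indices of k = rank(G) linearly independent rows (paths) via pivoted QR."""
    _, R, piv = qr(G.T, mode="economic", pivoting=True)
    tol = abs(R[0, 0]) * max(G.shape) * np.finfo(float).eps
    k = int(np.sum(np.abs(np.diag(R)) > tol))     # numerical rank
    return piv[:k]

def infer_all_paths(G, basis_idx, b_measured):
    """Infer b = log(1 - p) for every path from the k monitored paths."""
    x_G = np.linalg.pinv(G[basis_idx]) @ b_measured   # minimum-norm solution
    return G @ x_G

# Toy usage: a 2-link chain A-B-C with 3 overlay paths; A->C is the
# sum of A->B and B->C, so k = 2 monitored paths suffice.
G = np.array([[1., 0.],    # A->B
              [0., 1.],    # B->C
              [1., 1.]])   # A->C
basis = select_basis_paths(G)
b_true = G @ np.log(1 - np.array([0.01, 0.05]))
b_all = infer_all_paths(G, basis, b_true[basis])
print(basis, 1 - np.exp(b_all))   # inferred loss rate of every path
```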

Scalability Analysis
Is k ≪ O(n²)? Consider a power-law Internet topology.
When the majority of end hosts are on the overlay:
– If the Internet were a pure hierarchy (a tree): k = O(n)
– If the Internet had no hierarchy at all (worst case, a clique): k = O(n²)
– The Internet has a moderate hierarchical structure [TGJ+02], so k = O(n) (with proof)
When only a small portion of end hosts are on the overlay:
– For reasonably large n (e.g., 100), k = O(n log n) (extensive linear regression tests on both synthetic and real topologies)
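The regression test is straightforward to reproduce in spirit; a toy sketch with hypothetical (n, k) samples, fitting k ≈ c·nᵃ on a log-log scale, where an exponent near 1 is consistent with the claimed O(n log n) growth:

```python
import numpy as np

n = np.array([50, 100, 200, 400, 800])
k = np.array([180, 420, 950, 2100, 4600])   # hypothetical measured ranks

# Linear fit in log-log space: log k = a * log n + log c
a, log_c = np.polyfit(np.log(n), np.log(k), 1)
print(f"k ~ {np.exp(log_c):.2f} * n^{a:.2f}")
```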

TOM Outline
– Goal and Problem Formulation
– Algebraic Modeling and Basic Algorithms
– Scalability Analysis
– Practical Issues
– Evaluation
– Application: Adaptive Overlay Streaming Media
– Summary

Practical Issues
Tolerance of topology measurement errors
– Router aliases
– Incomplete routing information
Measurement load balancing
– Randomly order the paths for scanning and selection
Adaptation to topology changes
– Designed efficient algorithms for incremental updates
– Add/remove a path: O(k²) time (vs. O(n²k²) to re-initialize)
– Add/remove end hosts, and routing changes
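A sketch of the add-a-path decision, reusing the dense matrices from the earlier examples; the real system updates a QR factorization incrementally in O(k²) rather than recomputing a projector:

```python
import numpy as np

def add_path(Gk, new_row, tol=1e-10):
    """Decide whether a newly appeared path must be monitored.

    Gk: current k x s basis matrix; new_row: routing vector of the new path.
    """
    P = np.linalg.pinv(Gk) @ Gk            # projector onto current row space
    residual = new_row - P @ new_row
    if np.linalg.norm(residual) > tol:     # path adds rank: monitor it
        return np.vstack([Gk, new_row]), True
    return Gk, False                       # loss rate inferable from the basis
```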

Evaluation Metrics
Path loss rate estimation accuracy
– Absolute error |p − p′|
– Error factor [BDPT02]
– Lossy path inference: coverage and false positive ratio
Measurement load balancing
– Coefficient of variation (CV)
– Maximum-to-mean ratio (MMR)
Speed of setup, update, and adaptation
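A small sketch of these metrics; the exact error-factor definition (ratio of clipped loss rates, in the spirit of [BDPT02]) and the clipping floor eps are assumptions:

```python
import numpy as np

def absolute_error(p, p_hat):
    return np.abs(p - p_hat)

def error_factor(p, p_hat, eps=1e-3):
    # Clip tiny rates so the ratio stays meaningful near zero (assumed floor)
    p, p_hat = np.maximum(p, eps), np.maximum(p_hat, eps)
    return np.maximum(p / p_hat, p_hat / p)

def load_balance_stats(counts):
    """CV and max/mean ratio of per-path measurement counts."""
    counts = np.asarray(counts, dtype=float)
    return counts.std() / counts.mean(), counts.max() / counts.mean()
```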

Evaluation
Extensive simulations
Experiments on PlanetLab
– 51 hosts, each from a different organization
– 51 × 50 = 2,550 paths
– On average k = 872
Host distribution (areas and domains):
– US (40): .edu 33, .org 3, .net 2, .gov 1, .us 1
– International (11): Europe (6): France 1, Sweden 1, Denmark 1, Germany 1, UK 2; Asia (2): Taiwan 1, Hong Kong 1; Canada 2, Australia 1
Results on accuracy
– Avg real loss rate: …
– Absolute error mean: …, …% < …
– Error factor mean: …, …% < 2.0
– On average, 248 out of 2,550 paths have no or incomplete routing information
– No router aliases were resolved

Evaluation (cont'd)
Results on speed
– Path selection (setup): 0.75 sec
– Path loss rate calculation: 0.16 sec for all 2,550 paths
Results on load balancing
– Significantly reduces CV and MMR, by up to a factor of 7.3
(Figure: measurement load distribution with and without load balancing)

TOM Outline
– Goal and Problem Formulation
– Algebraic Modeling and Basic Algorithms
– Scalability Analysis
– Practical Issues
– Evaluation
– Application: Adaptive Overlay Streaming Media
– Conclusions

Motivation
– Traditional streaming media systems treat the network as a black box
– Adaptation is performed only at the transmission end points
– Overlay relays can effectively bypass congestion/failures
Built an adaptive streaming media system that leverages:
– TOM for real-time path information
– An overlay network for adaptive packet buffering and relay

Adaptive Overlay Streaming Media
– Implemented with a Winamp client and a SHOUTcast server
– Congestion introduced with a Packet Shaper
– Skip-free playback: server buffering and rewinding
– Total adaptation time < 4 seconds
(Demo topology: UC Berkeley, UC San Diego, Stanford, and HP Labs; a congested link on the direct path is bypassed via an overlay relay)

Adaptive Streaming Media Architecture

Summary
A tomography-based overlay network monitoring system
– Selectively monitors a basis set of O(n log n) paths to infer the loss rates of all O(n²) paths
– Works in real time, adapts to topology changes, balances measurement load, and tolerates topology errors
Both simulations and real Internet experiments are promising
Built an adaptive overlay streaming media system on top of TOM
– Bypasses congestion/failures for smooth playback within seconds

Tie Back to SCAN (overview diagram)
– Provision: dynamic replication + update multicast tree building
– Replica management: (incremental) content clustering
– Network end-to-end distance monitoring: Internet Iso-bar (latency), TOM (loss rate)
– Network DoS-resilient replica location: Tapestry

Contribution of My Thesis
Replica location
– Proposed the first simulation-based network DoS resilience benchmark and quantified three types of directory services
Dynamically place a close-to-optimal number of replicas
– Self-organize replicas into a scalable app-level multicast tree for disseminating updates
Cluster objects to significantly reduce management overhead with little performance sacrifice
– Online incremental clustering and replication to adapt to changes in users' access patterns
Scalable overlay network monitoring

Thank you!

Backup Materials

Existing CDNs Fail to Address these Challenges
– Non-cooperative replication is inefficient
– No coherence for dynamic content
– Unscalable network monitoring: O(M × N), where M is the # of client groups and N the # of server farms

Network Topology and Web Workload
Network topology
– Pure-random, Waxman, and transit-stub synthetic topologies
– An AS-level topology from 7 widely-dispersed BGP peers
Web workload (avg – min – max):
– MSNBC: Aug–Oct 1999, 10–11am; 1.5M–642K–1.7M requests; 129K–69K–150K clients; 15.6K–10K–17K client groups
– NASA: Jul–Aug 1995, all day; 79K–61K–101K requests
– Aggregate MSNBC Web clients by BGP prefix
» BGP tables from a BBNPlanet router
– Aggregate NASA Web clients by domain names
– Map the client groups onto the topology

Network E2E Latency Measurement
NLANR Active Measurement Project data set
– 111 sites in America, Asia, Australia, and Europe
– Round-trip time (RTT) between every pair of hosts every minute
– 17M measurements daily
– Raw data: Jun.–Dec. 2001, Nov. …
Keynote measurement data
– Measures TCP performance from about 100 worldwide agents
– Heterogeneous core network: various ISPs
– Heterogeneous access networks:
» Dial-up 56K, DSL, and high-bandwidth business connections
– Targets:
» 40 most popular Web servers + 27 Internet Data Centers
– Raw data: Nov.–Dec. 2001, Mar.–May 2002

Internet Content Delivery Systems
Systems compared: Web caching (client initiated); Web caching (server initiated); conventional CDNs (Akamai); SCAN
– Replica access: non-cooperative; cooperative (Bloom filter); non-cooperative; cooperative
– Load balancing: no for Web caching; yes for conventional CDNs and SCAN
– Pull/push: pull; push; pull; push
– Transparency to clients: no for client-initiated caching; yes for the others
– Coherence support: no, except for SCAN
– Network-awareness: no for Web caching; yes for conventional CDNs, but with an unscalable monitoring system; yes for SCAN, with a scalable monitoring system

Absolute and Relative Errors
For each experiment, take the 95th percentile of the absolute and relative errors over the estimates for all 2,550 paths

Lossy Path Inference Accuracy
– 90 out of 100 runs have coverage over 85% and a false positive rate under 10%
– Many of the misses are caused by boundary effects at the 5% lossy-path threshold

PlanetLab Experiment Results
Metrics
– Absolute error |p − p′|: averaged over all paths, and over lossy paths
– Relative error [BDPT02]
– Lossy path inference: coverage and false positive ratio
On average, k = 872 paths monitored out of 2,550
Loss rate distribution:
– Non-lossy [0, 0.05): 95.9% of paths
– Lossy [0.05, 1.0]: 4.1% of paths, distributed as
» [0.05, 0.1): 15.2%
» [0.1, 0.3): 31.0%
» [0.3, 0.5): 23.9%
» [0.5, 1.0): 4.3%
» 1.0: 25.6%

Experiments on PlanetLab
– 51 hosts, each from a different organization
– 51 × 50 = 2,550 paths
Host distribution (areas and domains):
– US (40): .edu 33, .org 3, .net 2, .gov 1, .us 1
– International (11): Europe (6): France 1, Sweden 1, Denmark 1, Germany 1, UK 2; Asia (2): Taiwan 1, Hong Kong 1; Canada 2, Australia 1
Simultaneous loss rate measurement
– 300 trials, 300 msec each
– In each trial, send a 40-byte UDP packet to every other host
Simultaneous topology measurement
– Traceroute
Experiments: 6/24 – 6/27
– 100 experiments in peak hours

Motivation
With a single-node relay:
Loss rate improvement
– Among 10,980 lossy paths:
» 5,705 paths (52.0%) have loss rate reduced by 0.05 or more
» 3,084 paths (28.1%) change from lossy to non-lossy
Throughput improvement
– Estimated with a TCP throughput model as a function of RTT and loss rate
– 60,320 paths (24%) have non-zero loss rate, so throughput is computable
– Among them, 32,939 paths (54.6%) have improved throughput, and 13,734 paths (22.8%) have throughput doubled or more
Implication: use overlay paths to bypass congestion or failures
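The deck elides the throughput formula; a sketch assuming the widely used Mathis et al. model, throughput ≈ (MSS/RTT)·√(3/2)/√p, with a simple min-of-hops rule for relayed paths (both are assumptions, not necessarily the slide's exact formula):

```python
import math

def tcp_throughput_bps(loss_rate: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Mathis model: throughput <= (MSS/RTT) * sqrt(3/2) / sqrt(p)."""
    if loss_rate <= 0:
        raise ValueError("model needs a non-zero loss rate")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)

# Relaying through an intermediate node helps when the weaker of the two
# overlay hops still beats the direct path (hypothetical numbers):
direct = tcp_throughput_bps(loss_rate=0.05, rtt_s=0.08)
relay = min(tcp_throughput_bps(0.01, 0.05), tcp_throughput_bps(0.005, 0.06))
print(relay > direct)
```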

SCAN
– Coherence for dynamic content
– Cooperative clustering-based replication
– Scalable network monitoring: O(M + N)
(Diagram: replicas pushed to selected servers s1, s4, s5; the congested path is bypassed)

Problem Formulation
Subject to a total replication cost constraint (e.g., the number of URL replicas), find a scalable, adaptive replication strategy that reduces the average access cost.
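As one illustration of this formulation (not the dissertation's algorithm), a greedy sketch that keeps placing the replica with the largest weighted access-cost reduction until the budget is spent; the distance/demand inputs and the unit-cost model are hypothetical:

```python
import numpy as np

def greedy_place(dist, demand, budget, origin=0):
    """dist[c, s]: distance from client group c to server s;
    demand[c]: request rate of group c; budget: max # of extra replicas;
    origin: server already holding the content."""
    placed = [origin]
    best = dist[:, origin].astype(float)   # best distance each group sees now
    for _ in range(budget):
        # Gain of each candidate = total weighted distance saved
        gains = (demand[:, None] * np.maximum(best[:, None] - dist, 0)).sum(axis=0)
        s = int(np.argmax(gains))
        if gains[s] <= 0:
            break                          # no further improvement possible
        placed.append(s)
        best = np.minimum(best, dist[:, s])
    return placed[1:]                      # replicas placed beyond the origin

# Toy usage: 3 client groups, 4 candidate servers, budget of 2 replicas
dist = np.array([[5., 1., 4., 9.],
                 [6., 8., 2., 3.],
                 [7., 6., 5., 1.]])
demand = np.array([10., 5., 2.])
print(greedy_place(dist, demand, budget=2))
```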

SCAN: Scalable Content Access Network (component diagram; red: my work, black: out of scope)
– CDN applications (e.g., streaming media) run on top of SCAN
– Provision: cooperative clustering-based replication, driven by user behavior/workload monitoring
– Coherence: update multicast tree construction
– Network performance monitoring: network distance/congestion/failure estimation

Evaluation of Internet-scale System
Analytical evaluation
Realistic simulation
– Network topology
– Web workload
– Network end-to-end latency measurements
Network topology
– Pure-random, Waxman, and transit-stub synthetic topologies
– A real AS-level topology from 7 widely-dispersed BGP peers

Web Workload
(avg – min – max)
– MSNBC: Aug–Oct 1999, 10–11am; 1.5M–642K–1.7M requests; 129K–69K–150K clients; 15.6K–10K–17K client groups
– NASA: Jul–Aug 1995, all day; 79K–61K–101K requests
– World Cup: May–Jul 1998, all day; 29M–1M–73M requests; 103K–13K–218K clients; client groups N/A
Aggregate MSNBC Web clients by BGP prefix
– BGP tables from a BBNPlanet router
Aggregate NASA Web clients by domain names
Map the client groups onto the topology

Online Incremental Clustering
– Predict access patterns based on semantics
– Simplify to popularity prediction
– How to find groups of URLs with similar popularity? Use hyperlink structures!
» Groups of siblings
» Groups at the same hyperlink depth: the smallest number of links from the root
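A minimal sketch of the hyperlink-depth grouping, assuming the site's link graph is available as an adjacency dict; the URLs are illustrative:

```python
from collections import defaultdict, deque

def group_by_depth(links, root):
    """links: {url: [urls it links to]}; returns {depth: [urls at that depth]}."""
    depth = {root: 0}
    queue = deque([root])
    while queue:                          # BFS gives the smallest link distance
        u = queue.popleft()
        for v in links.get(u, []):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    groups = defaultdict(list)
    for url, d in depth.items():
        groups[d].append(url)
    return dict(groups)

site = {"/": ["/news", "/sports"],
        "/news": ["/news/a", "/news/b"],
        "/sports": ["/news/a"]}
print(group_by_depth(site, "/"))          # {0: ['/'], 1: [...], 2: [...]}
```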

Challenges for CDN
Over-provisioning for replication
– Provide good QoS to clients (e.g., latency bounds, coherence)
– Small number of replicas, with small delay and bandwidth consumption for updates
Replica management
– Scalability: billions of replicas if replicating per URL
» O(10^4) URLs/server across O(10^5) CDN edge servers in O(10^3) networks
– Adaptation to the dynamics of content providers and customers
Monitoring
– User workload monitoring
– End-to-end network distance/congestion/failure monitoring
» Measurement scalability
» Inference accuracy and stability

SCAN Architecture
Leverages Decentralized Object Location and Routing (DOLR) - Tapestry for
– Distributed, scalable location with guaranteed success
– Search with locality
Soft-state maintenance of the dissemination tree (for each object)
(Diagram: a data plane of clients, caches, and replicas over a network plane of the data source, Web server, and SCAN servers connected by a Tapestry mesh; annotated with request location, dynamic replication/update and content management, and adaptive coherence)

Wide-area Network Measurement and Monitoring System (WNMMS)
– Select a subset of SCAN servers to be monitors
– End-to-end estimation of distance, congestion, and failures
(Diagram, network plane: clients and SCAN edge servers grouped into clusters A, B, and C; distances are measured from each host to its monitor, and among the monitors)

Dynamic Provisioning
Dynamic replica placement
– Meets clients' latency and servers' capacity constraints
– Close-to-minimal number of replicas
Self-organizes replicas into an application-level multicast tree
– Small delay and bandwidth consumption for update multicast
– Each node maintains state only for its parent and direct children
Evaluated with simulations of
– Synthetic traces with various sensitivity analyses
– Real traces from NASA and MSNBC
Publication
– IPTPS 2002
– Pervasive Computing 2002

Effects of the Non-Uniform Size of URLs
– Replication cost constraint measured in bytes
– Similar trends exist:
» Per-URL replication outperforms per-Website replication dramatically
» Spatial clustering with Euclidean distance and popularity-based clustering are very cost-effective

Access/Deployment Mechanisms (design space)
– Provisioning (replica placement): cooperative push (SCAN) vs. non-cooperative pull (existing CDNs)
– Granularity: per object, per Website, or per cluster
– Network monitoring: ad hoc pair-wise monitoring, O(M×N), vs. tomography-based monitoring, O(M+N)
– Coherence support: unicast, IP multicast, or app-level multicast on a P2P DHT

Web Proxy Caching
(Diagram: clients in ISP 1 and ISP 2, each with a local DNS server and a proxy cache server, plus the Web content server)
1. GET request
2. GET request forwarded to the Web content server on a cache miss
3. Response
4. Response returned to the client

Conventional CDN: Non-cooperative Pull
(Diagram: clients in ISP 1 and ISP 2, with local DNS servers, local CDN servers, the CDN name server, and the Web content server)
1. GET request
2. Request for hostname resolution
3. Reply: local CDN server IP address
4. Local CDN server IP address returned to the client
5. GET request to the local CDN server
6. GET request forwarded to the Web content server on a cache miss
7. Response
8. Response returned to the client
Result: inefficient replication

SCAN: Cooperative Push
(Diagram: the same setting, with replicas pushed before requests arrive)
0. Push replicas
1. GET request
2. Request for hostname resolution
3. Reply: nearby replica server or Web server IP address
4. Redirected server IP address returned to the client
5. GET request (to the Web server if there is no replica yet)
6. Response
Result: significantly reduces the number of replicas and the update cost

Internet Content Delivery Systems
Systems compared: Web caching (client initiated); Web caching (server initiated); pull-based CDNs (Akamai); push-based CDNs; SCAN
– Scalability of request redirection: pre-configured in browser; Bloom filters to exchange replica locations; centralized CDN name server (pull- and push-based CDNs); decentralized P2P location (SCAN)
– Efficiency (# of caches or replicas): no cache sharing among proxies; cache sharing; no replica sharing among edge servers; replica sharing (push-based CDNs and SCAN)
– Network-awareness: no for Web caching and push-based CDNs; yes for pull-based CDNs, but with an unscalable monitoring system; yes for SCAN, with a scalable monitoring system
– Coherence support: no for client-initiated caching; yes for server-initiated caching; no for pull- and push-based CDNs; yes for SCAN

Previous Work: Update Dissemination
– No inter-domain IP multicast
– Application-level multicast (ALM) is unscalable:
» The root maintains state for all children (Narada, Overcast, ALMI, RMX)
» The root handles all "join" requests (Bayeux)
» Root splitting is the common remedy, but it incurs consistency overhead

Comparison of Content Delivery Systems (cont'd)
Systems compared: Web caching (client initiated); Web caching (server initiated); pull-based CDNs (Akamai); push-based CDNs; SCAN
– Distributed load balancing: no for Web caching; yes for pull-based CDNs; no for push-based CDNs; yes for SCAN
– Dynamic replica placement: yes for Web caching and pull-based CDNs; no for push-based CDNs; yes for SCAN
– Network-awareness: no for Web caching and push-based CDNs; yes for pull-based CDNs, but with an unscalable monitoring system; yes for SCAN, with a scalable monitoring system
– No global network topology assumption: yes for Web caching and pull-based CDNs; no for push-based CDNs; yes for SCAN