Large-Scale Web Caching and Content Delivery Jeff Chase CPS 212: Distributed Information Systems Fall 2000.

Caching for a Better Web

Performance is a major concern in the Web. Proxy caching is the most widely used method to improve Web performance:
- Duplicate requests to the same document are served from the cache.
- Hits reduce latency, network utilization, and server load.
- Misses increase latency (extra hops).

[Figure: clients send requests through a proxy cache to servers across the Internet; hits are answered by the proxy, misses are forwarded.]

[Source: Geoff Voelker]

Cache Effectiveness

Previous work has shown that hit rate increases with population size [Duska et al. 97, Breslau et al. 98]. However, single proxy caches have practical limits: load, network topology, organizational constraints. One technique to scale the client population is to have proxy caches cooperate.

[Source: Geoff Voelker]

Cooperative Web Proxy Caching

Sharing and/or coordination of cache state among multiple Web proxy cache nodes. The effectiveness of proxy cooperation depends on:
- inter-proxy communication distance
- size of the client population served
- proxy utilization and load balance

[Figure: groups of clients behind proxies, connected across the Internet.]

[Source: Geoff Voelker]

Hierarchical Caches

Idea: place caches at exchange or switching points in the network, and cache at each level of the hierarchy. Misses resolve upstream through the parent.

[Figure: clients at the downstream edge; caches at each level up to the origin Web site (e.g., U.S. Congress) across the Internet.]
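The resolve-through-the-parent behavior can be sketched in a few lines; this is an illustrative Python model (the `Cache` class and the `fetch_from_origin` stand-in are hypothetical, not any real proxy's API), showing how a single client fetch leaves a copy at every level of the hierarchy:

```python
class Cache:
    """Toy model of one node in a cache hierarchy (hypothetical API)."""
    def __init__(self, parent=None):
        self.store = {}          # object id -> content
        self.parent = parent     # None at the root

    def fetch_from_origin(self, oid):
        # Stand-in for an HTTP fetch from the origin server.
        return "content-" + oid

    def get(self, oid):
        if oid in self.store:
            return self.store[oid]        # hit at this level
        if self.parent is not None:
            obj = self.parent.get(oid)    # miss: resolve through the parent
        else:
            obj = self.fetch_from_origin(oid)
        self.store[oid] = obj             # replicate on the way downstream
        return obj

root = Cache()
mid = Cache(parent=root)
leaf = Cache(parent=mid)
leaf.get("page1")   # one client fetch...
# ...leaves "page1" cached at leaf, mid, and root.
```

Note how the replication on the return path is exactly the storage cost called out on the next slide.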

Content-Sharing Among Peers

Idea: since siblings are "close" in the network, allow them to share their cache contents directly.

[Figure: clients and sibling caches connected across the Internet.]

Harvest-Style ICP Hierarchies

Idea: multicast probes within each "family": pick the first hit response, or wait for all miss responses.

Examples: Harvest [Schwartz96], Squid (NLANR), NetApp NetCache.

[Figure: a client's request triggers query (probe) / query response exchanges among siblings, followed by an object request / object response.]

Issues for Cache Hierarchies

- With ICP: query traffic within "families" (size n). Inter-sibling ICP traffic (and aggregate overhead) is quadratic in n; query-handling overhead grows linearly with n.
- Miss latency: the object passes through every cache from origin to client. Deeper hierarchies scale better, but impose higher latencies.
- Storage: a recently-fetched object is replicated at every level of the tree.
- Effectiveness: interior cache benefits are limited by capacity if objects are not likely to live there long (e.g., under LRU).
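The quadratic growth of inter-sibling ICP traffic is simple arithmetic: every local miss probes the other n-1 siblings, and each probe draws one response. A back-of-the-envelope sketch (the function name and per-node miss count are illustrative):

```python
def icp_sibling_messages(n, misses_per_node):
    """Count aggregate inter-sibling ICP messages for a family of n
    caches, each taking `misses_per_node` local misses: every miss
    probes the other n - 1 siblings, and each probe draws a response,
    so the total grows quadratically in n."""
    probes = n * misses_per_node * (n - 1)
    responses = probes
    return probes + responses

small = icp_sibling_messages(4, 100)   # 2400 messages
big = icp_sibling_messages(8, 100)     # 11200 messages: ~4.7x for 2x the siblings
```

Doubling the family size here more than quadruples the probe traffic, which is why large flat ICP families do not scale.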

Hashing: Cache Array Routing Protocol (CARP)

[Figure: a "GET" request is routed by URL hash to one of an array of Microsoft Proxy Servers, each owning a partition of the hash space (a-f, g-p, q-u, v-z).]

Advantages:
1. single-hop request resolution
2. no redundant caching of objects
3. allows client-side implementation
4. no new cache-cache protocols
5. reconfigurable hash function

Issues for CARP

- No way to exploit network locality at each level; e.g., relies on local browser caches to absorb repeats.
- Load balancing: the hash can be balanced and/or weighted with a load factor reflecting the capacity/power of each server.
- Must rebalance on server failures: reassigns (1/n)th of cached URLs for array size n; URLs from the failed server are evenly distributed among the remaining n-1 servers.
- Miss penalty and cost to compute the hash: in CARP, hash cost is linear in n: hash with each node and pick the "winner".
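CARP's "hash with each node and pick the winner" scheme is essentially highest-random-weight (rendezvous) hashing. A Python sketch (MD5 is a stand-in for CARP's actual hash combination; the server names and URLs are made up) shows both the linear-in-n lookup cost and the failure property above: removing a server moves only the URLs it owned, roughly 1/n of the total:

```python
import hashlib

def carp_route(url, servers):
    """Rendezvous (highest-random-weight) hashing in the spirit of CARP:
    score the URL against every server and pick the winner. Lookup cost
    is linear in len(servers), but resolution is single-hop and needs
    no cache-to-cache protocol. MD5 here is a stand-in hash."""
    def score(server):
        return int(hashlib.md5((server + "|" + url).encode()).hexdigest(), 16)
    return max(servers, key=score)

servers = ["proxy-a", "proxy-b", "proxy-c", "proxy-d"]   # hypothetical array
urls = ["http://example.com/page%d" % i for i in range(1000)]
assignment = {u: carp_route(u, servers) for u in urls}

# On a server failure, only the URLs the failed server owned move:
# roughly (1/n)th of the total, spread over the n - 1 survivors.
survivors = [s for s in servers if s != "proxy-a"]
moved = sum(1 for u in urls if carp_route(u, survivors) != assignment[u])
```

Because the winner for every other URL is still in the array, `moved` equals exactly the number of URLs that proxy-a owned.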

Directory-based: Summary Cache for ICP

Idea: each caching server replicates the cache directory ("summary") of each of its peers (e.g., siblings) [Cao et al., SIGCOMM 98]. Query a peer only if its local summary indicates a hit. To reduce the storage overhead for summaries, implement them compactly using Bloom filters. This may yield false hits (e.g., 1%), but never false misses. Each summary is three orders of magnitude smaller than the cache itself, and can be updated by multicasting just the flipped bits.
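A minimal Bloom filter illustrates why summaries can be so compact, and why they can report false hits but never false misses (the parameters m and k below are illustrative values, not the paper's):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact set summary. Membership tests
    can be wrong only in the false-hit direction; an added item is
    always reported present."""
    def __init__(self, m=8192, k=4):
        self.m, self.k = m, k            # m bits, k hash functions
        self.bits = bytearray(m // 8)

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(("%d:%s" % (i, item)).encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

summary = BloomFilter()
summary.add("http://example.com/index.html")
# A peer checks its local copy of a summary like this before sending
# any ICP query; only summary hits generate inter-proxy traffic.
```

The filter occupies m/8 bytes regardless of how many URLs it summarizes, which is the source of the three-orders-of-magnitude size reduction.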

A Summary-ICP Hierarchy

Summary caches at each level of the hierarchy reduce inter-sibling miss queries by 95+% (e.g., Squid configured to use cache digests).

[Figure: a client's query is answered from a summary hit, or resolved as a miss with an object request / object response.]

Issues for Directory-Based Caches

- Servers update their summaries lazily: update when "new" entries exceed some threshold percentage. Update delays may yield false hits and/or false misses.
- Other ways to reduce directory size? Vicinity cache [Gadde/Chase/Rabinovich98]; subsetting by popularity [Gadde/Chase/Rabinovich97].
- What are the limits to scalability, if we grow the number of peers, or grow the cache sizes?

On the Scale and Performance....

[Wolman/Voelker/.../Levy99] is a key paper in this area over the last few years.
- First negative result in SOSP (?)
- Illustrates tools for evaluating wide-area systems: simulation and analytical modeling.
- Illustrates fundamental limits of caching: benefits are dictated by reference patterns and object rate of change. Forget about capacity, and assume ideal cooperation.
- Ties together previous work in the field: wide-area cooperative caching strategies, analytical models for Web workloads, the best traces.

UW Trace Characteristics [Source: Geoff Voelker]

A Multi-Organization Trace

The University of Washington (UW) is a large and diverse client population: approximately 50K people. The UW client population contains 200 independent campus organizations: Museums of Art and Natural History; Schools of Medicine, Dentistry, and Nursing; Departments of Computer Science, History, and Music. A trace of UW is effectively a simultaneous trace of 200 diverse client organizations. Key: clients are tagged according to their organization in the trace.

[Source: Geoff Voelker]

Cooperation Across Organizations

Treat each UW organization as an independent "company" and evaluate cooperative caching among these organizations. How much Web document reuse is there among them? Place a proxy cache in front of each organization: what is the benefit of cooperative caching among these 200 proxies?

[Source: Geoff Voelker]

Ideal Hit Rates for UW Proxies

Ideal hit rate: infinite storage, ignoring cacheability and expirations. The average ideal local hit rate is 43%. Exploring the benefits of perfect cooperation (rather than a particular algorithm), the average ideal hit rate increases from 43% to 69% with cooperative caching.

[Source: Geoff Voelker]

Sharing Due to Affiliation

UW organizational sharing vs. random organizations: the difference in weighted averages across all orgs is ~5%.

[Source: Geoff Voelker]

Cacheable Hit Rates for UW Proxies

Cacheable hit rate: same as ideal, but does not ignore cacheability. Cacheable hit rates are much lower than ideal (the average is 20%). The average cacheable hit rate increases from 20% to 41% with (perfect) cooperative caching.

[Source: Geoff Voelker]

Scaling Cooperative Caching

Organizations of this size can benefit significantly from cooperative caching. But we don't need cooperative caching to handle the entire UW population: a single proxy (or small cluster) can handle it all! There is no technical reason to use cooperative caching for this environment; in the real world, decisions of proxy placement are often political or geographical. How effective is cooperative caching at scales where a single cache cannot be used?

[Source: Geoff Voelker]

Hit Rate vs. Client Population

The curves are similar to other studies [e.g., Duska97, Breslau98]. Small organizations see a significant increase in hit rate as the client population increases: this is why cooperative caching is effective for UW. Large organizations see only a marginal increase in hit rate as the client population increases.

[Source: Geoff Voelker]

In the Paper

1. Do we believe this? What are some possible sources of error in this tracing/simulation study? What impact might they have?
2. Why are "ideal" hit rates so much higher for the MS trace, but the cacheable hit rates the same? What is the correlation between sharing and cacheability?
3. Why report byte hit rates as well as object hit rates? Is the difference significant? What does this tell us about reference patterns?
4. How can it be that byte hit rate increases with population, while bandwidth consumed is linear?

Trace-Driven Simulation: Sources of Error

1. End effects: is the trace interval long enough? Need adequate time for steady-state behavior to become apparent.
2. Sample size: is the population large enough? Is it representative?
3. Completeness: does the sample accurately capture the client reference streams? What about browser caches and lower-level proxies? How would they affect the results?
4. Client subsets: how to select clients to represent a subpopulation?
5. Is the simulation accurate/realistic? Cacheability, capacity/replacement, expiration, latency.

What about Latency?

From the client's perspective, latency matters far more than hit rate. How does latency change with population? Median latencies improve by only a few hundred milliseconds with ideal caching compared to no caching.

[Source: Geoff Voelker]

Questions/Issues

1. How did they obtain these reported latencies?
2. Why report median latency instead of mean? Is the difference significant? What does this tell us? Is it consistent with the reported byte hit ratios?
3. Why does the magnitude of the possible error decrease with population?
4. What about the future? What changes in Web behavior might lead to different conclusions? Will latency be as important? Bandwidth?

Large Organization Cooperation

What is the benefit of cooperative caching among large organizations? Explore three ways:
- linear extrapolation of the UW trace
- a simultaneous trace of two large organizations (UW and MS)
- an analytic model for populations beyond the trace limits

[Source: Geoff Voelker]

Extrapolation to Larger Client Populations

Use a least-squares fit to create a linear extrapolation of hit rates. Hit rate increases logarithmically with client population; e.g., to increase the hit rate by 10% you need 8 UWs (ideal) or 11 UWs (cacheable). There is a "low ceiling", though: 61% at 2.1M clients (UW cacheable). A city-wide cooperative cache would get all the benefit.

[Source: Geoff Voelker]

UW & Microsoft Cooperation

Use traces of two large organizations to evaluate caching systems at medium-scale client populations. We collected a Microsoft proxy trace during the same time period as the UW trace. The combined population is ~80K clients: this increases the UW population by a factor of 3.6 and the MS population by a factor of 1.4. Cooperation among the UW & MS proxies gives only a marginal benefit (2-4%); the benefit matches the "hit rate vs. population" curve.

[Source: Geoff Voelker]

UW & Microsoft Traces [Source: Geoff Voelker]

UW & MS Cooperative Caching Is this worth it? [Source: Geoff Voelker]

Analytic Model

Use an analytic model to evaluate caching systems at very large client populations. Parameterize it with trace data, then extrapolate beyond the trace limits. It is a steady-state model: it assumes caches are in steady state and do not start cold, and it accounts for document rate of change. Explore growth of the Web, variation in document popularity, and rate of change. The results agree with the trace extrapolations: 95% of the maximum benefit is achieved at the scale of a medium-large city (500,000).

[Source: Geoff Voelker]

Inside the Model

[Wolman/Voelker/Levy et al., SOSP 1999] refines [Breslau/Cao et al., 1999] and others. It approximates asymptotic cache behavior, assuming Zipf-like object popularity and caches with sufficient capacity. Parameters:
λ = per-client request rate
μ = rate of object change
p_c = percentage of objects that are cacheable
α = Zipf parameter (object popularity)

Zipf

[Breslau/Cao99] and others observed that Web accesses can be modeled using Zipf-like probability distributions. Rank objects by popularity: lower rank i ==> more popular. The probability that any given reference is to the ith most popular object is p_i (not to be confused with p_c, the percentage of cacheable objects). Zipf says: "p_i is proportional to 1/i^α, for some α with 0 < α < 1". Higher α gives more skew: popular objects are way popular. Lower α gives a more heavy-tailed distribution. In the Web, α ranges from 0.6 to 0.8 [Breslau/Cao99]. With α = 0.8, 0.3% of the objects get 40% of requests.
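The Zipf-like skew is easy to explore numerically. This sketch (the function name and parameter values are illustrative) computes the share of requests captured by the most popular objects; note the exact share depends on the universe size n as well as α:

```python
def zipf_share(alpha, n, top_fraction):
    """Share of requests captured by the top `top_fraction` of an
    n-object universe under a Zipf-like law p_i proportional to
    1/i**alpha (illustrative sketch, not the paper's code)."""
    weights = [1.0 / i ** alpha for i in range(1, n + 1)]
    top = max(1, int(top_fraction * n))
    return sum(weights[:top]) / sum(weights)

# Higher alpha concentrates requests on the head of the distribution.
share = zipf_share(0.8, 100_000, 0.003)
```

Comparing values of α for the same top slice confirms the skew effect: a more skewed distribution sends a larger fraction of all requests to the few most popular objects.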

Cacheable Hit Ratio: the Formula

C_N is the hit ratio for cacheable objects achievable by a population of size N with a universe of n objects:

C_N = \sum_{i=1}^{n} p_i \cdot \frac{\lambda N p_i}{\lambda N p_i + \mu}

Inside the Hit Ratio Formula

The formula approximates a sum over a universe of n objects... of the probability of access to each object i... times the probability that i was accessed since its last change. C is just a normalizing constant for the Zipf-like popularity distribution, which must sum to 1: p_i = 1/(C i^α), with C = \sum_{i=1}^{n} 1/i^α. C is not to be confused with C_N. (C = 1/Ω in [Breslau/Cao 99], where 0 < α < 1.)

Inside the Hit Ratio Formula, Part 2

What is the probability that object i was accessed since its last invalidate?

= (rate of accesses to i) / (rate of accesses or changes to i)
= \lambda N p_i / (\lambda N p_i + \mu)

Divide through by \lambda N p_i, and note that by Zipf p_i = 1/(C i^α), so 1/(\lambda N p_i) = C i^α / (\lambda N).
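The model can be evaluated numerically by summing over the object universe. This sketch is a direct transcription of the model's sum under illustrative parameter values (λ and μ are in requests and changes per unit time; none of these numbers come from the paper):

```python
def cacheable_hit_ratio(N, n, lam, mu, alpha):
    """Numerically evaluate the steady-state model: sum, over a universe
    of n objects, of (probability of access to object i) times (the
    probability i was accessed since its last change). lam is the
    per-client request rate and mu the rate of object change, in the
    same time unit. Parameter values below are illustrative."""
    C = sum(1.0 / i ** alpha for i in range(1, n + 1))   # Zipf normalizer
    total = 0.0
    for i in range(1, n + 1):
        p_i = 1.0 / (C * i ** alpha)
        total += p_i * (lam * N * p_i) / (lam * N * p_i + mu)
    return total

# Hit ratio grows with population N but flattens out: most of the
# benefit arrives well before the population gets very large.
small_pop = cacheable_hit_ratio(N=1_000, n=100_000, lam=10.0, mu=1.0, alpha=0.8)
large_pop = cacheable_hit_ratio(N=100_000, n=100_000, lam=10.0, mu=1.0, alpha=0.8)
```

Each term is strictly less than p_i, so the hit ratio is always below 1; raising μ (faster-changing objects) pushes every term down, which is the rate-of-change limit the slides return to below.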

Hit Rates From Model

Focus on cacheable objects: the four curves correspond to different rate-of-change distributions. We believe even the Slow and Mid-Slow distributions are generous. The knee is at 500K - 1M clients.

[Graph: cacheable hit rate vs. population. Source: Geoff Voelker]

Extrapolating UW & MS Hit Rates

These are from the simulation results, ignoring rate of change (compare to the graphs from the analytic model). What is the significance of the slope?

[Graph from Geoff Voelker]

Latency From Model Straightforward calculation from the hit rate results [Source: Geoff Voelker]

Rate of Change

What is more important: the rate of change of popular objects, or the rate of change of unpopular objects? Separate popular from unpopular objects, and look at the sensitivity of hit rate to variations in rate of change.

[Source: Geoff Voelker]

Rate of Change Sensitivity

Popular-document sensitivity (top curve; unpopular documents held to a low rate of change): the issue is minutes to hours. Unpopular-document sensitivity (bottom curve; popular documents held to a low rate of change): days to weeks to a month. Unpopular objects are more sensitive than popular ones! Compare the differences in hit rates between curves A,C and B,C.

[Source: Geoff Voelker]

Hierarchical Caches and CDNs

What are the implications of this study for hierarchical caches and Content Delivery Networks (e.g., Akamai)? Demand-side proxy caches are widely deployed and are likely to become ubiquitous. What is the marginal benefit from a supply-side CDN cache, given ubiquitous demand-side proxy caching? What effect would we expect to see in a trace gathered at an interior cache? CDN interior caches can be modeled as upstream caches in a hierarchy, given some simplifying assumptions.

An Idealized Hierarchy

[Figure: a two-level tree; the Level 1 (root) cache covers N_1 clients, and each Level 2 cache covers N_2 clients.]

Assume the trees are symmetric to simplify the math. Ignore individual caches and solve for each level.

Hit Ratio at Interior Level i

C_N gives us the hit ratio for a complete subtree covering population N. The hit ratio predicted at level i (or at any cache in level i) over R requests is:

(C_{N_i} - C_{N_{i+1}}) / (1 - C_{N_{i+1}})

that is, "the hits for N_i (at level i) minus the hits captured by level i+1, over the miss stream from level i+1".

Root Hit Ratio

Predicted hit ratio for cacheable objects, observed at the root of a two-level cache hierarchy (i.e., where r_2 = R p_c):

(C_{N_1} - C_{N_2}) / (1 - C_{N_2})
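The level-i and root expressions reduce to one small helper. A sketch (names are illustrative), plugging in the UW averages from earlier slides (20% cacheable hit rate at a leaf, 41% for a perfectly cooperating subtree) purely as sample numbers:

```python
def interior_hit_ratio(c_subtree, c_downstream):
    """Hit ratio observed at an interior cache: the whole subtree's hits
    minus those already captured downstream, over the downstream miss
    stream. c_subtree = C_{N_i}, c_downstream = C_{N_{i+1}}."""
    return (c_subtree - c_downstream) / (1.0 - c_downstream)

# Sample numbers only: if leaves capture 20% of cacheable requests and
# the subtree as a whole could capture 41%, the interior cache sees a
# hit ratio of about 26% on the miss stream that reaches it.
root_share = interior_hit_ratio(0.41, 0.20)
```

The helper also makes the limiting case obvious: when the downstream caches already capture everything the subtree could (c_subtree == c_downstream), the interior cache contributes nothing.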

Generalizing to CDNs

[Figure: leaf caches (demand side), each covering N_L clients, route requests through a request-routing function ƒ(leaf, object, state) to interior caches (supply-side "reverse proxies"), each covering N_I clients.]

Symmetry assumption: ƒ is stable and "balanced".

Hit Ratio in CDN Caches

Given the symmetry and balance assumptions, the cacheable hit ratio at the interior (CDN) nodes is:

(C_{N_I} - C_{N_L}) / (1 - C_{N_L})

N_I is the covered population at each CDN cache; N_L is the population at each leaf cache.

[Graph: cacheable hit ratio at interior caches, for increasing N_I and N_L with fixed fanout N_I/N_L.]

Interior hit rates improve as leaf populations increase....

[Graph: interior hit ratio as a percentage of all cacheable requests, for increasing N_I and N_L.]

....but the interior cache sees a declining share of traffic.