1 11 Web Caching Web Protocols and Practice

2 Topics
• Cache Definition
• Goals of Web Caching
• Motivations for Caching
• What is Cacheable?
• Protocol-specific Considerations
• Content-specific Considerations
• Where is Caching Done?
• How is Caching Done?
• Returning a Cached Response
• Maintaining a Cache
• Cache Replacement
• Cache Coherency

3 Cache Definition
• With the rapid increase of traffic on the Web, caching was the first major technique that attempted to
  • reduce user-perceived latency
  • reduce transmission of redundant traffic on the network
• A cache is a local store of response messages.
• Caching is the movement of Web content closer to the users.

4 Goals of Web Caching
• The goals of caching are to reduce:
  • The user-experienced latency between the time of the initial Web request and the time the response is displayed by the user agent
    » Reducing user-perceived latency has important implications not just for the user’s Web experience, but also for content developers.
  • The load on the network, which could be a local area network or the Internet, by avoiding repeated transmission of the same response
    » Transferring only necessary information reduces the overall congestion in the network.

5 Goals of Web Caching
    » Reduction of congestion leads to improved performance for everyone using the network, because fewer packets are lost and there is less need for retransmission resulting from packet drops.
  • The load on the origin server, by having an intermediary on the path between the client and the origin server handle the requests
    » The origin server can handle more requests from a diverse set of clients.

6 Motivations for Caching
• Web-hosting companies must pay for the bandwidth they use and might want to increase cacheability to reduce costs.
• The end users gain significantly from caching, because their latency in obtaining a response is lowered.
• Reducing traffic or moving it to the edge of the network and away from the backbone would be beneficial:
  • Only necessary data traverses the network
  • There is bandwidth available for other data

7 Motivations for Caching
• The following delay factors affect the time to fetch a resource:
  • The network connectivity of the user to their ISP and the connection between the ISP and the Internet
  • Unless the DNS lookup is cached, the DNS lookup time to locate the server to contact, even if the server being contacted is a proxy
  • The congestion in the network and the bandwidth available on the path between user and origin server

8 Motivations for Caching
  • The load on the origin server
  • The time to generate the response
  • The time to render the response by the browser

9 What is Cacheable?
• A cache can decide whether a response is cacheable based on two factors:
  • Protocol-specific considerations
    » Protocol-specific caching considerations require that a cache obey the various directives regarding cacheability of a message.
  • Content-specific considerations
    » The content-specific requirements are affected by the business requirements of a cache and policies that affect the frequency of cache revalidation.
    » The policies in turn may be affected by attributes of the message, such as size or content type.

10 Protocol-specific Considerations
• The request method, request header fields, response status, and response headers all have to indicate that the response is cacheable.
• Responses to the OPTIONS, PUT, and DELETE methods are not cacheable.
• Responses to the POST method are not cacheable unless the response includes appropriate Cache-Control or Expires headers.
• If a cache does not support the Range header, any response that has a response status code of 206 Partial Content cannot be cached.

11 Protocol-specific Considerations
• Some responses include resource-specific information from the origin server that may preclude caching of the message. Such information is of two kinds:
  • Cacheability information
    » If the response includes cacheability information, the decision to cache should be driven by that.
    » For example, the server might provide an explicit freshness duration via headers such as Expires.
    » If the time specified in Expires is only a short time away from the time the response was received, the resource may not be cached.

12 Protocol-specific Considerations
  • Cache directives
    » The Cache-Control directive may preclude caching of certain responses.
      • Cache-Control: private – A shared cache must not cache the response.
      • Cache-Control: no-store – A cache must not store a response message. This directive can appear in a request or response.
      • Cache-Control: no-cache – A cache must not return the response without first revalidating it with the origin server. Because every possible cache hit would require revalidation, a cache may choose not to store the response at all.
• The Authorization request header indicates that the requested resource is not available to everyone, so the response generally cannot be cached by a shared cache.

13 Protocol-specific Considerations
• The Vary header indicates that a cached response is acceptable only for requests whose values for the request headers named in Vary match those of the original request.
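
The protocol-specific rules on slides 10 to 13 can be sketched as a single storage check. This is a minimal illustration, not a complete HTTP cache policy: the function names `parse_cache_control` and `may_store` are hypothetical, headers are assumed to be plain dicts with canonical capitalization, the cache is assumed not to support Range, and `no-cache` is treated conservatively as "do not store".

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(method, status, request_headers, response_headers, shared=True):
    """Return True if this simplified policy allows storing the response."""
    method = method.upper()
    req_cc = parse_cache_control(request_headers.get("Cache-Control", ""))
    resp_cc = parse_cache_control(response_headers.get("Cache-Control", ""))

    # Responses to these methods are never cached.
    if method in {"OPTIONS", "PUT", "DELETE"}:
        return False
    # POST responses need explicit freshness information.
    explicit_freshness = "max-age" in resp_cc or "Expires" in response_headers
    if method == "POST" and not explicit_freshness:
        return False
    # Assumption: this cache does not support Range, so 206 is not stored.
    if status == 206:
        return False
    # no-store may appear in the request or the response.
    if "no-store" in req_cc or "no-store" in resp_cc:
        return False
    # Conservative choice of this sketch: do not store no-cache responses.
    if "no-cache" in resp_cc:
        return False
    # private and Authorization only restrict shared caches here.
    if shared and "private" in resp_cc:
        return False
    if shared and "Authorization" in request_headers:
        return False
    return True
```

For example, `may_store("GET", 200, {}, {"Cache-Control": "private"})` is rejected for a shared cache but accepted when `shared=False`, which models a browser cache.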

14 Content-specific Considerations
• Just because a resource is cacheable does not mean that it will be cached.
• Messages could be large, dynamically generated, or include cookies, all of which could affect cacheability of a message.
• Cache policy may be driven by factors such as attributes of a message and the frequency with which caches revalidate resources with the origin server.

15 Content-specific Considerations
• A shared cache may not want to cache responses to queries that have personal information.
• Active Server Pages (ASP) and requests for documents triggering authentication are not good candidates for caching.
• Large resources may not be cached even though they may be cacheable.

16 Content-specific Considerations
• The basic assumption in caching is that the same response is likely to be generated in the future, and a request for such a response might occur in the near future.
• The presence of cacheability information in a dynamic response, such as an Expires or ETag header, may indicate that the resource is actually cacheable.

17 Content-specific Considerations
• Responses that include data tailored to a specific user may be viewed as uncacheable.
• Responses with cookie information in them are considered uncacheable.
• The decision to cache is affected by the rate of change of resources.
• Examining the rate of change of a resource is a valid metric for deciding cacheability.

18 Content-specific Considerations
• One early heuristic for deciding on the cacheability of a resource was the last modification time of a resource.
• The load on a cache may also have an impact on whether a response should be cached.

19 Where is Caching Done?
• Caches are found in browsers and in any of the Web intermediaries between the user agent and the origin server.
• A cache can be located in a proxy as well as in a browser.
• A browser cache can avoid having to refetch pages the user examined during the same session. However, a browser cache does not take advantage of resources frequently requested by other users in the same local environment.

20 Where is Caching Done?
• A caching proxy can help dozens of users.
• A browser cache can store a reasonable set of recently received responses for a longer time than a caching proxy.
• A caching proxy, being a resource shared by hundreds of users, may have to evict some responses sooner than a browser cache.
• A regional cache can help several geographically colocated caches in one or more administrative entities.

21 Where is Caching Done?
• A national cache can group a set of regional caches and help reduce costs in countries facing high traffic for moving data across national boundaries.
• In a reverse proxy, caching occurs on behalf of origin servers and not on behalf of users.
• Interception proxies can be placed anywhere on the network and can examine the network and transport layer of the protocol stack.

22 How is Caching Done?
• First, a cache must decide whether a message is cacheable, then decide if space is available and, if not, how to replace some of the existing cached objects.
• The cache, upon receiving a request, must decide whether it can satisfy the request and, if so, return the cached response while updating some information.
• The cache must have a coherency policy for maintaining freshness information of the cached resource.

23 How is Caching Done?
• The common criteria used to decide on cacheability of a message are as follows:
  • Are there protocol requirements that prevent the response from being cached?
  • Is the content typically uncacheable?
  • Is the cached response likely to be reused?
  • Will the decision to cache a particular response lead to replacement of one or more resources?

24 How is Caching Done?
• After deciding to store the message, the cache checks to see whether the message can be stored without evicting other objects from the cache. If not, the cache replacement algorithm is triggered.
• Often, resources known to be stale are evicted from a cache even if the cache is not full.

25 How is Caching Done?
• Evicting stale resources early reduces the need for triggering the cache replacement algorithm at the time a request is being handled, thus lowering user-perceived latency.
• Once space becomes available, the cache extracts information about the message, such as last modification time and expiry or staleness-related information.
• Message headers like Expires and Cache-Control (for example max-age in responses, or max-stale in requests) carry information about expiration and acceptable staleness.

26 How is Caching Done?
• Expires and Cache-Control header fields help the cache comply with restrictions on the length of time a cached response can be returned as a valid response.
• In the absence of specific expiration time information in the message, the cache uses a heuristic expiration time to decide when the message becomes stale.
• The heuristic expiration time could be based on the Last-Modified time associated with the resource.

27 How is Caching Done?
• A cache could add a fixed amount of time, say ten minutes, to the Last-Modified value and use that as a freshness interval.
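
The order in which a cache might derive a freshness lifetime (explicit information first, heuristic last) can be sketched as follows. This is an illustration under stated assumptions: `freshness_lifetime` is a hypothetical helper, headers are a plain dict, and the Last-Modified heuristic uses a fraction (10%) of the time since last modification, a commonly used alternative to the fixed increment mentioned above. The ten-minute value appears only as the fallback when no header helps.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, received_at, fraction=0.1,
                       default=timedelta(minutes=10)):
    """Return how long a response may be served without revalidation."""
    # 1. Explicit lifetime from Cache-Control: max-age.
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return timedelta(seconds=int(arg))
    # 2. Explicit expiry time from Expires.
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(expires - received_at, timedelta(0))
    # 3. Heuristic: a fraction of the time since the last modification.
    if "Last-Modified" in headers:
        last_modified = parsedate_to_datetime(headers["Last-Modified"])
        return max((received_at - last_modified) * fraction, timedelta(0))
    # 4. Assumed fallback when the response carries no usable information.
    return default


received = datetime(2024, 1, 11, tzinfo=timezone.utc)
lifetime = freshness_lifetime(
    {"Last-Modified": "Mon, 01 Jan 2024 00:00:00 GMT"}, received)
```

Here the resource was last modified ten days before it was received, so the heuristic treats it as fresh for one day. `received_at` must be timezone-aware so it can be compared with the parsed HTTP dates.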

28 Returning a Cached Response
• When a response is found in the cache, a “cache hit” has occurred.
• A revalidation may be performed to ensure that the cached response is still fresh.
• If revalidation indicates that the response is still fresh, the request is satisfied from the cache.
• Otherwise, the cache gets a new copy of the resource and applies its caching policy while forwarding it to the client.

29 Returning a Cached Response
• If the requested response is not found in the cache (i.e., a “cache miss”), the request is forwarded.
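
The hit, revalidation, and miss paths described on slides 28 and 29 can be sketched as a small in-memory cache. Everything here is hypothetical scaffolding for illustration: `SimpleCache`, the `fetch` and `revalidate` callbacks, and the injectable `clock` are assumptions, and a real proxy would also apply the cacheability and replacement policies covered elsewhere in this deck.

```python
import time


class SimpleCache:
    """Toy cache keyed by URL, with a per-entry time to live."""

    def __init__(self, fetch, revalidate, clock=time.time):
        self.fetch = fetch            # fetch(url) -> (body, ttl_seconds)
        self.revalidate = revalidate  # revalidate(url, body) -> True if unchanged
        self.clock = clock
        self.entries = {}             # url -> (body, expires_at, ttl)

    def get(self, url):
        entry = self.entries.get(url)
        if entry is None:
            # Cache miss: forward the request to the origin.
            return self._store(url), "miss"
        body, expires_at, ttl = entry
        if self.clock() < expires_at:
            # Cache hit on a fresh entry: serve directly.
            return body, "hit"
        if self.revalidate(url, body):
            # Stale but unchanged: extend freshness and serve from cache.
            self.entries[url] = (body, self.clock() + ttl, ttl)
            return body, "revalidated"
        # Stale and changed: fetch a new copy and cache it.
        return self._store(url), "refetched"

    def _store(self, url):
        body, ttl = self.fetch(url)
        self.entries[url] = (body, self.clock() + ttl, ttl)
        return body


# Usage with a fake origin and a controllable clock.
now = [0.0]
origin = {"/a": "v1"}
cache = SimpleCache(
    fetch=lambda url: (origin[url], 10),
    revalidate=lambda url, body: origin[url] == body,
    clock=lambda: now[0],
)
first = cache.get("/a")
second = cache.get("/a")
```

Passing the clock in as a parameter keeps the sketch deterministic, which makes the stale and fresh paths easy to exercise without sleeping.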

30 Maintaining a Cache
• Periodically, a cache may check to see if the objects in the cache are still fresh and trigger eviction of stale objects.
• A cache might want to prevalidate popular objects to ensure that more frequently requested objects are fresh.
• Prevalidation could be done via the HTTP HEAD request.

31 Maintaining a Cache
• A cache could also contact the origin server to see if the resource has changed and, if so, prefetch it to update its cache.
• Such approaches trade off bandwidth against latency.

32 Cache Replacement
• Once the cache is full, objects must be removed to make room to cache new responses.
• Replacement approaches combine a set of metrics that includes the size of cached objects, their content type, and even a notion of network distance to the origin server.
• The usefulness of retaining a response in the cache can be gauged by the following factors:
  • Cost of fetching the resource
    » Keep resources that were expensive to fetch.

33 Cache Replacement
  • Cost of storing the resource
    » Large resources take more space, but if they were replaced, fetching them again would also be more expensive.
  • The number of accesses to the resource in the past
    » Keep objects that have been accessed many times in the past.
  • The probability of the resource being accessed in the future
    » If a resource is likely to be retrieved in the near future, it would not make sense to remove it from the cache.

34 Cache Replacement
  • The time since the last modification of the resource
    » Keep resources that have not been modified for a long time.
  • The heuristic expiration time
    » Remove resources that are close to their expiration time.

35 Cache Replacement
• Several algorithms for replacement have been proposed:
  • Least Recently Used (LRU)
    » Removes the oldest object (in terms of the time at which it was last accessed) from the cache.
    » Objects that have been accessed more recently are likely to be accessed again, and so less recently accessed objects should be evicted.
  • Least Frequently Used (LFU)
    » Ranks the objects in terms of frequency of access.
    » Removes the object that is the least frequently used.

36 Cache Replacement
  • Size of object (SIZE)
    » Deletes the largest object in the cache.
  • Hyper-G (LFU/LRU/SIZE)
    » Combines the LFU, LRU, and SIZE policies.
    » The first consideration for replacement is LFU, then LRU, then SIZE.
  • GreedyDual-Size
    » Associates a utility value with each resource.
    » Replaces the resource that has the lowest utility.
    » The utility uses the cost of fetching the resource, its size, and an age that is updated as resources leave the cache.
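
As one concrete instance of the algorithms listed above, GreedyDual-Size can be sketched in a few lines. This follows the commonly described formulation (utility = age + cost / size, with the age raised to the utility of each evicted object); the class name, the `access` method, and the default cost of 1.0 are assumptions of this sketch, and the linear scan for the victim stands in for the priority queue a real cache would use.

```python
class GreedyDualSizeCache:
    """Toy GreedyDual-Size cache tracking sizes only (no bodies)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.age = 0.0      # inflation term, updated as objects leave the cache
        self.entries = {}   # key -> (utility, size, cost)

    def _utility(self, size, cost):
        # Small or expensive-to-fetch objects get a higher utility.
        return self.age + cost / size

    def access(self, key, size, cost=1.0):
        """Record a request for key; return True on a cache hit."""
        if key in self.entries:
            # On a hit, refresh the utility using the current age.
            _, size, cost = self.entries[key]
            self.entries[key] = (self._utility(size, cost), size, cost)
            return True
        if size > self.capacity:
            return False  # object can never fit; do not cache it
        while self.used + size > self.capacity:
            # Evict the lowest-utility object and raise the age to its utility.
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            utility, victim_size, _ = self.entries.pop(victim)
            self.age = utility
            self.used -= victim_size
        self.entries[key] = (self._utility(size, cost), size, cost)
        self.used += size
        return False


cache = GreedyDualSizeCache(capacity=10)
cache.access("big", size=8)
cache.access("small", size=2)
cache.access("new", size=4)   # forces an eviction
```

With equal fetch costs, the large object has the lowest utility and is evicted first, so `"small"` and `"new"` remain. Setting the cost to the fetch latency or to the object size changes which objects the policy favors.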

37 Cache Replacement
• Cache replacement has generally faded from the practical arena for the following four main reasons:
  • The steadily falling cost of storage leads to caches large enough to hold most of the resources requested.
  • An overall reduction in the fraction of traffic that is cacheable.
  • The existence of “good-enough” algorithms that satisfy most situations in which cache replacement is used. Algorithms such as GreedyDual-Size and Hyper-G are in the good-enough category.

38 Cache Replacement
  • Change in resources over time reduces the value of having a large cache that can store them longer.

39 Cache Coherency
• A cache may have to ensure that a cached response is still fresh before returning it to the client requesting the resource.
• Caches may simply return an older cached value when:
  • The connection to the origin server is down
  • The cache is busy
• The most common approach in the Web to check coherency is to send a GET or a HEAD request with an If-Modified-Since request header.

40 Cache Coherency
• Entity tags, in conjunction with the If-None-Match and If-Match headers, can be used to perform coherency checks against specific variants of a resource.
• If a caching proxy sends a revalidation request each time a cache hit occurs, the policy is called strong consistency.
• If the cache uses a heuristic to decide whether the cached response is still fresh, without consulting the origin server each time a cache hit occurs, the policy is called weak consistency.
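
A revalidation request can be sketched as a conditional GET built from the validators stored with the cached response. The helper names below are hypothetical and headers are assumed to be plain dicts. For entity tags this sketch uses If-None-Match, the request header HTTP defines for cache validation, alongside If-Modified-Since; a 304 Not Modified status means the cached copy may be reused.

```python
from http import HTTPStatus


def revalidation_headers(cached_headers):
    """Build conditional request headers from a cached response's validators."""
    headers = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


def apply_validation_result(status, cached_body, new_body):
    """Pick the body to serve after the origin answers the conditional GET."""
    if status == HTTPStatus.NOT_MODIFIED:
        return cached_body  # 304: the cached copy is still valid
    return new_body         # e.g. 200: the origin sent a new representation


conditional = revalidation_headers({
    "ETag": '"abc123"',
    "Last-Modified": "Mon, 01 Jan 2024 00:00:00 GMT",
})
```

Under strong consistency a cache would send such a request on every hit; under weak consistency it would send one only after its freshness heuristic says the entry is stale.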

41 Cache Coherency
• The following two heuristics are among the weak consistency approaches:
  • A lease-based approach
    » A cache agrees to store a response for a fixed amount of time (the lease period) without revalidating.
    » The server promises to notify the cache if a cached resource changes within the lease period.
  • A time to live (TTL) approach
    » Responses have a cache expiration time associated with them.
    » When the time interval passes, the responses are considered stale.

42 Cache Coherency
    » The TTL value can vary with the response and can be based on the following factors:
      • The expiration time specified in the response header field
      • The frequency of requests for a cached resource
      • Mobile environment
      • The last modification time of the resource
• Maintaining consistency can have a serious impact on cache response time because each revalidation request has the overhead of contacting the origin server.

43 Cache Coherency
• The dominance of the connection cost to the origin server points to the need for reducing the number of revalidation requests.
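
The effect of the consistency policy on origin traffic can be illustrated with a small count. This is a toy model under stated assumptions: `origin_contacts` is a hypothetical function, it considers a single resource, and a TTL of zero stands in for strong consistency (every request contacts the origin).

```python
def origin_contacts(request_times, ttl):
    """Count origin contacts (first fetch plus revalidations) for one resource."""
    contacts = 0
    fresh_until = None
    for t in sorted(request_times):
        if fresh_until is None or t >= fresh_until:
            # Entry is missing or stale: contact the origin and reset freshness.
            contacts += 1
            fresh_until = t + ttl
    return contacts


requests = [0, 1, 2, 3, 10, 11, 25]
strong = origin_contacts(requests, ttl=0)   # revalidate on every request
weak = origin_contacts(requests, ttl=5)     # TTL-based weak consistency
```

For this request pattern, strong consistency contacts the origin seven times while a five-second TTL contacts it three times, at the cost of possibly serving a response that changed within the TTL.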