11 Web Caching
Web Protocols and Practice
Topics
  Cache Definition
  Goals of Web Caching
  Motivations for Caching
  What is Cacheable?
    »Protocol-specific Considerations
    »Content-specific Considerations
  Where is Caching Done?
  How is Caching Done?
    »Returning a Cached Response
    »Maintaining a Cache
  Cache Replacement
  Cache Coherency
Cache Definition
With the rapid increase of traffic on the Web, caching was the first major technique that attempted to:
  reduce user-perceived latency
  reduce transmission of redundant traffic on the network
A cache is a local store of response messages.
Caching moves Web content closer to the users.
Goals of Web Caching
The goals of caching are to reduce:
  The user-experienced latency between the time of the initial Web request and the time the response is displayed by the user agent
    »Reducing user-perceived latency has important implications not just for the user's Web experience, but also for content developers.
  The load on the network, which could be a local area network or the Internet, by avoiding repeated transmission of the same response
    »Transferring only necessary information reduces the overall congestion in the network.
Goals of Web Caching
    »Reduction of congestion leads to improved performance for everyone using the network, because fewer packets are lost and there is less need for retransmission resulting from packet drops.
  The load on the origin server, by having an intermediary on the path between the client and the origin server handle the requests
    »The origin server can handle more requests from a diverse set of clients.
Motivations for Caching
Web-hosting companies must pay for the bandwidth they use and might want to increase cacheability to reduce costs.
End users gain significantly from caching, because their latency in obtaining a response is lowered.
Reducing traffic, or moving it to the edge of the network and away from the backbone, would be beneficial:
  Only necessary data traverses the network.
  Bandwidth is available for other data.
Motivations for Caching
The following factors contribute to the delay in fetching a resource:
  The network connectivity of the user to their ISP, and the connection between the ISP and the Internet
  The DNS lookup time to locate the server to contact (unless the DNS lookup is cached), even if the server being contacted is a proxy
  The congestion in the network and the bandwidth available on the path between user and origin server
Motivations for Caching
  The load on the origin server
  The time to generate the response
  The time for the browser to render the response
What is Cacheable?
A cache can decide whether a response is cacheable based on two factors:
  Protocol-specific considerations
    »Protocol-specific caching considerations require that a cache obey the various directives regarding cacheability of a message.
  Content-specific considerations
    »The content-specific requirements are affected by the business requirements of a cache and policies that affect the frequency of cache revalidation.
    »The policies in turn may be affected by attributes of the message, such as size or content type.
Protocol-specific Considerations
The request method, request header fields, response status, and response headers all have to indicate that the response is cacheable.
Responses to the OPTIONS, PUT, and DELETE methods are not cacheable.
Responses to the POST method are not cacheable unless the response has the appropriate Cache-Control or Expires headers.
If a cache does not support the Range header, it cannot cache any response with the status code 206 Partial Content.
Protocol-specific Considerations
Some responses include resource-specific information from the origin server that may preclude caching of the message.
Such information is of two kinds:
  Cacheability information
    »If the response includes cacheability information, the decision to cache should be driven by it.
    »For example, the server might provide an explicit freshness duration via headers such as Expires.
    »If the time specified in Expires is only a short time after the time the response was received, the response may not be cached.
Protocol-specific Considerations
  Cache directives
    »The Cache-Control directive may preclude caching of certain responses.
      Cache-Control: private – A shared cache must not cache the response.
      Cache-Control: no-store – A cache must not store the message. This directive can appear in a request or a response.
      Cache-Control: no-cache – A cache must not return the cached response without first revalidating it with the origin server. Because every use requires revalidation, storing such a response saves little.
The Authorization request header indicates that the requested resource is not available to everyone; a shared cache must not cache the response unless the response explicitly permits it.
Protocol-specific Considerations
The Vary header lists the request header fields whose values constrain which cached response is acceptable: a cached response can be returned only if the new request matches the original request on those fields.
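
The protocol-specific rules above can be sketched as a single storage check. This is a minimal illustration, not a complete HTTP implementation: the function name `may_store`, the `supports_range` flag, and the use of plain dictionaries with exact-case header names are all assumptions made for the example. It conservatively declines to store `no-cache` responses, and it does not handle Vary.

```python
UNCACHEABLE_METHODS = {"OPTIONS", "PUT", "DELETE"}


def parse_cache_control(value):
    """Split a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def may_store(method, status, request_headers, response_headers,
              shared=True, supports_range=False):
    """Return True if the protocol rules sketched here allow storing the response."""
    req_cc = parse_cache_control(request_headers.get("Cache-Control", ""))
    resp_cc = parse_cache_control(response_headers.get("Cache-Control", ""))

    if method in UNCACHEABLE_METHODS:
        return False
    if "no-store" in req_cc or "no-store" in resp_cc:
        return False
    # Assumption: treat no-cache conservatively and do not store at all.
    if "no-cache" in resp_cc:
        return False
    if shared and "private" in resp_cc:
        return False
    # POST responses need explicit freshness information.
    if method == "POST" and not ("max-age" in resp_cc or "Expires" in response_headers):
        return False
    if status == 206 and not supports_range:
        return False
    # A shared cache needs explicit permission to store an authorized response.
    if shared and "Authorization" in request_headers:
        if not (resp_cc.keys() & {"public", "must-revalidate", "s-maxage"}):
            return False
    return True
```

A browser cache would call this with `shared=False`, which is why a `private` response can still be stored there.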
Content-specific Considerations
Just because a resource is cacheable does not mean that it will be cached.
Messages could be large, dynamically generated, or include cookies, all of which could affect cacheability of a message.
Cache policy may be driven by factors such as:
  Attributes of a message
  The frequency with which caches revalidate resources with the origin server
Content-specific Considerations
A shared cache may not want to cache responses to queries that contain personal information.
Active Server Pages (ASP) and requests for documents that trigger authentication are not good candidates for caching.
Large resources may not be cached even though they may be cacheable.
Content-specific Considerations
The basic assumption in caching is that the same response is likely to be generated in the future, and that a request for such a response might occur in the near future.
The presence of cacheability information in a dynamic response, such as an Expires or ETag header, may indicate that the resource is actually cacheable.
Content-specific Considerations
Responses that include data tailored to a specific user may be viewed as uncacheable.
Responses with cookie information in them are considered uncacheable.
The decision to cache is affected by the rate of change of resources.
Examining the rate of change of a resource is a valid metric for deciding cacheability.
Content-specific Considerations
One early heuristic for deciding on the cacheability of a resource was the last modification time of the resource.
The load on a cache may also have an impact on whether a response should be cached.
Where is Caching Done?
Caches are found in browsers and in any of the Web intermediaries between the user agent and the origin server.
A cache can be located in a proxy as well as in a browser.
A browser cache can avoid having to refetch pages the user examined during the same session.
However, a browser cache does not take advantage of resources frequently requested by other users in the same local environment.
Where is Caching Done?
A caching proxy can help dozens of users.
A browser cache can store a reasonable set of recently received responses for a longer time than a caching proxy.
A caching proxy, being a resource shared by hundreds of users, may have to evict some responses sooner than a browser cache.
A regional cache can help several geographically colocated caches in one or more administrative entities.
Where is Caching Done?
A national cache can group a set of regional caches and help reduce costs in countries where moving data across national boundaries is expensive.
In a reverse proxy, caching occurs on behalf of origin servers and not on behalf of users.
Interception proxies can be placed anywhere on the network and can examine the network and transport layers of the protocol stack.
How is Caching Done?
First, a cache must decide whether a message is cacheable, then decide whether space is available and, if not, how to replace some of the existing cached objects.
Upon receiving a request, the cache must decide whether it can satisfy the request and, if so, return the cached response while updating some information.
The cache must have a coherency policy for maintaining freshness information about the cached resource.
How is Caching Done?
The common criteria used to decide on cacheability of a message are as follows:
  Are there protocol requirements that prevent the response from being cached?
  Is the content typically uncacheable?
  Is the cached response likely to be reused?
  Will the decision to cache a particular response lead to replacement of one or more resources?
How is Caching Done?
After deciding to store the message, the cache checks whether the message can be stored without evicting other objects from the cache.
If not, the cache replacement algorithm is triggered.
Often, resources known to be stale are evicted from a cache even if the cache is not full.
How is Caching Done?
Evicting stale resources early reduces the need to trigger the cache replacement algorithm while a request is being handled, thus lowering user-perceived latency.
Once space becomes available, the cache extracts information about the message, such as last modification time and expiry or staleness-related information.
Message headers like Expires and Cache-Control: max-age carry information about expiration.
How is Caching Done?
The Expires and Cache-Control header fields help the cache comply with restrictions on the length of time a cached response can be returned as a valid response.
In the absence of specific expiration time information in the message, the cache uses a heuristic expiration time to decide when the message becomes stale.
The heuristic expiration time could be based on the Last-Modified time associated with the resource.
How is Caching Done?
A cache could add a fixed amount of time, say ten minutes, to the Last-Modified value and use that as a freshness interval.
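
The order in which a cache might consult expiration information can be sketched as follows. This is an illustration under stated assumptions: the function name `freshness_lifetime` and the dictionary-based headers are hypothetical, Expires is measured against the time the response was received, and the fallback is a fraction-based Last-Modified heuristic (the HTTP/1.1 specification mentions 10% of the time since last modification as a typical setting) rather than a fixed increment.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, received_at, fraction=0.1, default=timedelta(0)):
    """Estimate how long a response may be served without revalidation.

    headers     -- dict of response headers (exact-case names assumed)
    received_at -- timezone-aware datetime at which the response arrived
    """
    # 1. Explicit lifetime from Cache-Control: max-age.
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return timedelta(seconds=int(arg))

    # 2. Explicit expiration time from Expires.
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(expires - received_at, timedelta(0))

    # 3. Heuristic: a fraction of the time since the last modification.
    if "Last-Modified" in headers:
        last_modified = parsedate_to_datetime(headers["Last-Modified"])
        return max((received_at - last_modified) * fraction, timedelta(0))

    # 4. No information at all: treat as immediately stale by default.
    return default
```

With this heuristic, a resource that has not changed for ten days is considered fresh for one day, while a resource modified a minute ago is fresh for only a few seconds.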
Returning a Cached Response
When a response is found in the cache, a "cache hit" has occurred.
A revalidation may be performed to ensure that the cached response is still fresh.
If revalidation indicates that the response is still fresh, the request is satisfied from the cache.
Otherwise, the cache gets a new copy of the resource and applies its caching policy while forwarding it to the client.
Returning a Cached Response
If the requested response is not found in the cache (i.e., a "cache miss"), the request is forwarded toward the origin server.
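
The hit, revalidate, and miss paths can be sketched in a few lines. Everything here is hypothetical scaffolding for illustration: the class name `SimpleCache` and the injected `fetch` and `revalidate` callables stand in for real network requests to the origin server, and the cache only revalidates once its stored freshness interval has passed.

```python
import time


class SimpleCache:
    """Toy cache showing the hit / revalidate / miss paths."""

    def __init__(self, fetch, revalidate, clock=time.time):
        self._fetch = fetch            # fetch(url) -> (body, ttl_seconds)
        self._revalidate = revalidate  # revalidate(url, body) -> True if unchanged
        self._clock = clock
        self._store = {}               # url -> (body, ttl, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        now = self._clock()
        entry = self._store.get(url)
        if entry is None:
            # Cache miss: forward the request and store the response.
            self.misses += 1
            return self._fetch_and_store(url, now)

        self.hits += 1
        body, ttl, expires_at = entry
        if now < expires_at:
            return body                # still fresh, serve from cache
        if self._revalidate(url, body):
            # Origin confirms the copy is current: extend its lifetime.
            self._store[url] = (body, ttl, now + ttl)
            return body
        # Stale and changed: fetch a new copy.
        return self._fetch_and_store(url, now)

    def _fetch_and_store(self, url, now):
        body, ttl = self._fetch(url)
        self._store[url] = (body, ttl, now + ttl)
        return body
```

Passing the clock in as a parameter is only a convenience for exercising the expiry path without waiting.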
Maintaining a Cache
Periodically, a cache may check whether the objects in the cache are still fresh and trigger eviction of stale objects.
A cache might want to prevalidate popular objects to ensure that more frequently requested objects are fresh.
Prevalidation could be done via the HTTP HEAD request.
Maintaining a Cache
A cache could also contact the origin server to see if the resource has changed and, if so, prefetch it to update its cache.
Such approaches trade off bandwidth against latency.
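
A prevalidation pass over the most popular objects might look like the sketch below. The names `prevalidate`, `head`, and `fetch` are hypothetical: `head(url)` stands in for an HTTP HEAD request that returns the current Last-Modified value, and `fetch(url)` stands in for a full GET. The sketch makes the trade-off concrete: every popular object costs one small request per pass, in exchange for fewer stale hits at request time.

```python
def prevalidate(entries, access_counts, head, fetch, top_n=10):
    """Refresh the top_n most requested entries whose origin copy has changed.

    entries       -- dict: url -> {"body": ..., "last_modified": ...}, updated in place
    access_counts -- dict: url -> number of requests seen so far
    head          -- callable(url) -> current Last-Modified value at the origin
    fetch         -- callable(url) -> current body at the origin
    Returns the list of URLs that were prefetched.
    """
    popular = sorted(entries, key=lambda u: access_counts.get(u, 0), reverse=True)[:top_n]
    refreshed = []
    for url in popular:
        current = head(url)
        if current != entries[url]["last_modified"]:
            # The resource changed: prefetch it so the next request is a fresh hit.
            entries[url] = {"body": fetch(url), "last_modified": current}
            refreshed.append(url)
    return refreshed
```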
Cache Replacement
Once the cache is full, objects must be removed to make room for new responses.
Replacement approaches combine a set of metrics that includes the size of cached objects, their content type, and even a notion of network distance to the origin server.
The usefulness of retaining a response in the cache can be gauged by the following factors:
  Cost of fetching the resource
    »Keep resources that were expensive to fetch.
Cache Replacement
  Cost of storing the resource
    »Large resources take more space, but if they were replaced, fetching them again would also be more expensive.
  The number of accesses to the resource in the past
    »Keep objects that have been accessed many times in the past.
  The probability of the resource being accessed in the future
    »If a resource is likely to be retrieved in the near future, it would not make sense to remove it from the cache.
Cache Replacement
  The time since the last modification of the resource
    »Keep resources that have not been modified for a long time.
  The heuristic expiration time
    »Remove resources that are close to their expiration time.
Cache Replacement
Several algorithms for replacement have been proposed:
  Least Recently Used (LRU)
    »Removes the object whose last access is the oldest.
    »Objects that have been accessed more recently are likely to be accessed again, so less recently accessed objects should be evicted.
  Least Frequently Used (LFU)
    »Ranks the objects in terms of frequency of access.
    »Removes the object that is the least frequently used.
Cache Replacement
  Size of object (SIZE)
    »Deletes the largest object in the cache.
  Hyper-G (LFU/LRU/SIZE)
    »Combines the LFU, LRU, and SIZE policies.
    »The first consideration for replacement is LFU, then LRU, then SIZE.
  GreedyDual-Size
    »Associates a utility value with each resource.
    »Replaces the resource that has the lowest utility.
    »The utility uses the cost of fetching the resource, its size, and an age factor that is updated as resources leave the cache.
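
Two of the policies listed above can be sketched compactly. The class and attribute names are hypothetical, chosen for this illustration. The LRU version relies on `collections.OrderedDict` to track access order. The GreedyDual-Size version follows the commonly described formulation in which each object's utility is an inflation value plus cost divided by size, and the inflation value is raised to the utility of each evicted object; this plays the role of the age factor.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the entry whose last access is the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity           # maximum number of entries
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)       # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


class GreedyDualSizeCache:
    """Evicts the entry with the lowest utility = inflation + cost / size."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0               # rises as objects leave the cache
        self._items = {}                   # key -> (value, size, cost, utility)

    def get(self, key):
        item = self._items.get(key)
        if item is None:
            return None
        value, size, cost, _ = item
        # A hit restores the object's utility relative to the current inflation.
        self._items[key] = (value, size, cost, self.inflation + cost / size)
        return value

    def put(self, key, value, size, cost=1.0):
        """Store an object; size must be positive. Returns False if it cannot fit."""
        if size > self.capacity:
            return False
        if key in self._items:
            self.used -= self._items.pop(key)[1]
        while self.used + size > self.capacity:
            victim = min(self._items, key=lambda k: self._items[k][3])
            self.inflation = self._items[victim][3]
            self.used -= self._items.pop(victim)[1]
        self._items[key] = (value, size, cost, self.inflation + cost / size)
        self.used += size
        return True
```

With a uniform cost of 1, GreedyDual-Size favors small objects, which tends to keep more objects in the cache; passing a measured fetch latency as `cost` instead biases it toward keeping resources that were expensive to fetch.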
Cache Replacement
Cache replacement has generally faded from the practical arena for the following four main reasons:
  The steadily falling cost of storage leads to caches large enough to hold most of the resources requested.
  There has been an overall reduction in the fraction of traffic that is cacheable.
  "Good-enough" algorithms satisfy most situations in which cache replacement is used.
    »Algorithms such as GreedyDual-Size and Hyper-G are in the good-enough category.
Cache Replacement
  Change in resources over time reduces the value of having a large cache that can store them longer.
Cache Coherency
A cache may have to ensure that a cached response is still fresh before returning it to the client requesting the resource.
Caches may simply return an older cached value when:
  The connection to the origin server is down
  The cache is busy
The most common approach on the Web to checking coherency is to send a GET or a HEAD request with an If-Modified-Since request header.
Cache Coherency
Entity tags, in conjunction with the If-None-Match header, can be used to perform coherency checks against specific variants of a resource.
If a caching proxy sends a revalidation request each time a cache hit occurs, the policy is called strong consistency.
If the cache uses a heuristic to decide whether the cached response is still fresh, without consulting the origin server each time a cache hit occurs, the policy is called weak consistency.
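
A revalidation exchange can be sketched as two small helpers: one that builds the conditional request headers from the validators stored with the cached response, and one that interprets the status code of the reply. The function names and the dictionary layout of a cached entry are assumptions made for this example. The sketch uses If-None-Match, the header HTTP/1.1 defines for validating a cached copy against an entity tag.

```python
def build_revalidation_headers(cached_headers):
    """Build conditional request headers from the validators of a cached response."""
    headers = {}
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    return headers


def apply_revalidation(cached, status, new_headers, new_body=None):
    """Return the entry to keep after the origin server answers.

    cached -- dict with "headers" and "body" for the stored response
    status -- status code of the revalidation response
    """
    if status == 304:
        # Not Modified: keep the stored body, refresh the stored headers.
        merged = dict(cached["headers"])
        merged.update(new_headers)
        return {"headers": merged, "body": cached["body"]}
    # Any other response replaces the stored copy in this simplified sketch.
    return {"headers": dict(new_headers), "body": new_body}
```

Under strong consistency a cache would run this exchange on every hit; under weak consistency it would run it only when its freshness heuristic says the entry may be stale.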
Cache Coherency
The following two heuristics are among the weak consistency approaches:
  A lease-based approach
    »A cache agrees to store a response for a fixed amount of time (the lease period) without revalidating.
    »The server promises to notify the cache if a cached resource changes within the lease period.
  A time to live (TTL) approach
    »Responses have a cache expiration time associated with them.
    »When the time interval passes, the responses are considered stale.
Cache Coherency
    »The TTL value can vary with the response and can be based on the following factors:
      The expiration time specified in the response header field
      The frequency of requests for a cached resource
      Mobile environment
      The last modification time of the resource
Maintaining consistency can have a serious impact on cache response time because each revalidation request has the overhead of contacting the origin server.
Cache Coherency
The dominance of the connection cost to the origin server points to the need for reducing the number of revalidation requests.
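
The server side of a lease-based scheme can be sketched as a table of outstanding leases. This is an illustrative model only: the class name `LeaseServer`, its methods, and the cache identifiers are hypothetical, and the actual notification message is left out. The point is that the origin server, not the cache, carries the bookkeeping that lets caches skip revalidation during the lease period.

```python
import time


class LeaseServer:
    """Tracks which caches hold an unexpired lease on each resource."""

    def __init__(self, lease_seconds, clock=time.time):
        self.lease_seconds = lease_seconds
        self._clock = clock
        self._leases = {}  # url -> {cache_id: expires_at}

    def grant(self, url, cache_id):
        """Record a lease and return its expiration time."""
        expires_at = self._clock() + self.lease_seconds
        self._leases.setdefault(url, {})[cache_id] = expires_at
        return expires_at

    def modify(self, url):
        """Called when a resource changes; returns the caches to notify."""
        now = self._clock()
        holders = self._leases.pop(url, {})
        # Only caches whose lease is still running were promised a notification.
        return sorted(c for c, expires_at in holders.items() if expires_at > now)
```

A longer lease period means fewer revalidation requests from caches but more state on the server and more notifications when popular resources change.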