Web Cache Replacements
張燕光, Department of Computer Science and Information Engineering, National Cheng Kung University


2 Introduction
Which object should be removed from the cache?
–The goal is to find a replacement algorithm that yields a high hit rate.
Differences from traditional caching:
–Nonhomogeneity of object sizes.
–With equal access frequency but different sizes, a policy that considers only hit rate favors smaller objects; byte hit rate accounts for this.

3 Introduction
Other considerations:
–Transfer time (cost)
–Expiration time
–Access frequency
Additional questions: Which measurement metrics? Is admission control needed? When, or how often, should replacement be performed? How many documents should be removed?

4 Measurement Metrics
Hit Rate (HR):
–% of requests satisfied by the cache
–(shows the fraction of requests not sent to the server)
Volume measures:
–Weighted hit rate (WHR), also called Byte Hit Ratio: % of client-requested bytes returned by the proxy (shows the fraction of bytes not sent by the server)
–Fraction of packets not sent
–Reduction in distance traveled (e.g., hop count)
Latency (response time)
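A minimal Python sketch of how these two metrics are computed from a request log (illustrative only; it assumes a trace of (url, size_in_bytes) pairs and an abstract cache_lookup() check, neither of which is specified on the slide):

    def measure(requests, cache_lookup):
        # requests: iterable of (url, size_bytes); cache_lookup(url) -> True on a cache hit
        hits = bytes_hit = total = total_bytes = 0
        for url, size in requests:
            total += 1
            total_bytes += size
            if cache_lookup(url):
                hits += 1
                bytes_hit += size
        hr = hits / total                 # fraction of requests not sent to the server
        bhr = bytes_hit / total_bytes     # byte hit ratio: fraction of bytes not sent by the server
        return hr, bhr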

5 Three Categories
Traditional replacement policies and their direct extensions:
–LRU, LFU, …
Key-based replacement policies
Cost-based replacement policies

6 Traditional replacement
Least Recently Used (LRU) evicts the object that was requested least recently.
–Prune as many of the least recently used objects as necessary to make room for the newly accessed object.
–This may involve zero, one, or many evictions.
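A minimal LRU sketch (assuming whole objects are cached and their sizes are known; names are illustrative) that prunes from the least recently used end until the new object fits:

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity_bytes):
            self.capacity = capacity_bytes
            self.used = 0
            self.items = OrderedDict()          # url -> size, oldest first

        def access(self, url, size):
            if url in self.items:
                self.items.move_to_end(url)     # hit: mark as most recently used
                return True
            while self.used + size > self.capacity and self.items:
                _, old_size = self.items.popitem(last=False)   # evict the LRU object
                self.used -= old_size
            if size <= self.capacity:           # zero, one, or many evictions may have occurred
                self.items[url] = size
                self.used += size
            return False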

7 Traditional replacement
Least Frequently Used (LFU) evicts the object that is accessed least frequently.
Pitkow/Recker evicts objects in LRU order, except when all objects were accessed within the same day, in which case the largest one is removed.

8 Key-based Replacement
The idea in key-based policies is to sort objects by a primary key, break ties with a secondary key, break remaining ties with a tertiary key, and so on.

9 Key-based Replacement
LRUMIN:
–This policy is biased in favor of smaller objects so as to minimize the number of objects replaced.
–Let the size of the incoming object be S, and suppose it does not fit in the cache. If any cached object has size at least S, we remove the least recently used such object. If no object has size at least S, we remove objects of size at least S/2 in LRU order, then objects of size at least S/4, and so on, until enough free cache space has been created.
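A minimal sketch of this eviction loop (assuming each cached object carries a size and a last-access timestamp; the names and data layout are illustrative, not from the source):

    def lrumin_evict(cache, incoming_size, free_space):
        # cache: dict url -> (size, last_access); returns the list of URLs to evict
        victims = []
        threshold = incoming_size
        while free_space < incoming_size and cache:
            # candidates whose size is at least the current threshold, LRU first
            candidates = sorted(
                (u for u, (sz, _) in cache.items() if sz >= threshold),
                key=lambda u: cache[u][1])
            for u in candidates:
                if free_space >= incoming_size:
                    break
                sz, _ = cache.pop(u)
                free_space += sz
                victims.append(u)
            threshold /= 2            # halve the size threshold: S, S/2, S/4, ...
        return victims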

10 Key-based Replacement
SIZE policy:
–Objects are removed in order of size, with the largest object removed first.
–Ties on size are somewhat rare, but when they occur they are broken by the time since last access: objects with a longer time since last access are removed first.

11 Key-based Replacement
LRU-Threshold is the same as LRU, but objects larger than a certain threshold size are never cached.
Hyper-G is a refinement of LFU that breaks ties using the recency of last use and the size.
Lowest Latency First minimizes average latency by evicting the document with the lowest download latency first.

12 Cost-based Replacement
Employ a potential cost function derived from factors such as:
–time since last access,
–entry time of the object in the cache,
–transfer time (cost),
–object expiration time, and so on.
GreedyDual-Size (GD-Size) associates a cost with each object and evicts the object with the lowest cost/size.
Hybrid associates a utility function with each object and evicts the one with the least utility, to reduce the total latency.

13 Cost-based Replacement
Lowest Relative Value evicts the object with the lowest utility value.
Least Normalized Cost Replacement (LCN-R) employs a rational function of the access frequency, the transfer time cost, and the size.
Bolot/Hoschka employs a weighted rational function of the transfer time cost, the size, and the time since last access.

14 Cost-based Replacement
Size-Adjusted LRU (SLRU) orders objects by the ratio of cost to size and evicts those with the best cost-to-size ratio.
Server-assisted scheme models the value of caching an object in terms of its fetching cost, size, next request time, and cache prices during the period between requests; it evicts the object of least value.
Hierarchical GreedyDual (Hierarchical GD) performs object placement and replacement cooperatively in a hierarchy.

15 GreedyDual
GreedyDual was originally proposed by Young and Tarjan for the case where pages in a cache have the same size but incur different costs to fetch from secondary storage.
A value H is initialized for each cached page p when it is brought into the cache.
–H is set to the cost of bringing p into the cache.
–The cost is always nonnegative.
(1) The page with the lowest H value (min_H) is replaced, and (2) all remaining pages then reduce their H values by min_H.

16 GreedyDual
If a page is accessed, its H value is restored to the cost of bringing it into the cache.
Thus the H values of recently accessed pages retain a larger portion of the original cost than those of pages that have not been accessed for a long time.
By reducing H values as time goes on and restoring them upon access, GreedyDual integrates locality and cost concerns in a seamless fashion.

17 GreedyDual-Size
Set H to cost/size upon each access to a document, where cost is the cost of bringing in the document and size is its size in bytes.
–This extended version is called GreedyDual-Size.
The definition of cost depends on the goal of the replacement algorithm. Cost is set to:
–1 if the goal is to maximize hit ratio,
–the downloading latency if the goal is to minimize average latency,
–the network cost if the goal is to minimize the total cost.

18 GreedyDual-Size
Implementation:
–A naive implementation must decrement the H values of all cached pages by min_H every time a page is replaced, which may be very inefficient.
–The improved algorithm on the next page avoids this by keeping an inflation value L and maintaining a priority queue based on H.
–Handling a hit requires O(log k) time and handling an eviction requires O(log k) time, since in both cases the queue must be updated.

19 GreedyDual-Size
Algorithm GreedyDual(document p):
/* Initialize L ← 0 */
(1) If p is already in memory,
(2)     H(p) ← L + cost(p)/size(p)
(3) If p is not in memory,
(4)     while there is not enough room in memory for p,
(5)         let L ← min H(q) over all q in cache
(6)         evict q such that H(q) = L
(7)     put p into memory and set H(p) ← L + cost(p)/size(p)
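A runnable sketch of this algorithm (a minimal version under assumed inputs: each access supplies the object's size and cost; lazy deletion of stale heap entries stands in for the priority-queue update, so hits and evictions cost O(log k) as noted above):

    import heapq

    class GreedyDualSize:
        def __init__(self, capacity):
            self.capacity = capacity
            self.used = 0
            self.L = 0.0                      # inflation value
            self.H = {}                       # url -> current H value
            self.size = {}                    # url -> size
            self.heap = []                    # (H, url); may hold stale entries

        def access(self, url, size, cost):
            if url in self.H:                 # hit: restore H relative to the current L
                self.H[url] = self.L + cost / size
                heapq.heappush(self.heap, (self.H[url], url))
                return True
            while self.used + size > self.capacity and self.H:
                h, victim = heapq.heappop(self.heap)
                if victim not in self.H or self.H[victim] != h:
                    continue                  # stale heap entry, skip
                self.L = h                    # L <- min H(q)
                self.used -= self.size.pop(victim)
                del self.H[victim]            # evict the object with the lowest H
            if size <= self.capacity:
                self.H[url] = self.L + cost / size
                self.size[url] = size
                self.used += size
                heapq.heappush(self.heap, (self.H[url], url))
            return False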

20 Hybrid Algorithm (HYB)
Motivated by Bolot and Hoschka's algorithm.
HYB is a hybrid of several factors, considering not only download time but also the number of references to a document and the document size.
HYB selects for replacement the document i with the lowest value of the following expression:

21 HYB
The utility function is defined as follows:
    utility(p) = (C_s + W_b / b_s) * (n_p)^W_n / Z_p
–C_s is the estimated time to connect to the server
–b_s is the estimated bandwidth to the server
–Z_p is the size of the document
–n_p is the number of times the document has been referenced
–W_b and W_n are constants that set the relative importance of the variables b_s and n_p, respectively
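A minimal sketch of HYB eviction under these definitions (per-server estimates C_s and b_s and per-document n_p and Z_p are assumed to be tracked elsewhere; W_b and W_n are tunable constants, not fixed here):

    def hyb_utility(c_s, b_s, z_p, n_p, w_b, w_n):
        # documents with low utility are evicted first
        return (c_s + w_b / b_s) * (n_p ** w_n) / z_p

    def hyb_select_victim(docs, w_b, w_n):
        # docs: iterable of (url, c_s, b_s, z_p, n_p); returns the URL with the lowest utility
        return min(docs, key=lambda d: hyb_utility(d[1], d[2], d[3], d[4], w_b, w_n))[0]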

22 Latency Estimation Algo. (LAT)
Motivated by estimating the time required to download a document, and then replacing the document with the smallest download time.
Apply some function to combine (e.g., smooth) the measured time samples to form an estimate of how long it will take to download the document.
–Keeping a per-document estimate is probably not practical.
–Alternative: keep statistics of past downloads on a per-server basis, rather than a per-document basis (less storage).
For each server j, the proxy maintains:
–clat_j: estimated latency (time) to open a connection to the server
–cbw_j: estimated bandwidth of the connection (in bytes/second)

23 Latency Estimation Algo. (LAT)
–When a new document is received from a server, the connection establishment latency (s_clat) and the bandwidth for that document (s_cbw) are measured, and the estimates are updated as follows:
    clat_j = (1 - ALPHA) * clat_j + ALPHA * s_clat
    cbw_j = (1 - ALPHA) * cbw_j + ALPHA * s_cbw
–ALPHA is a smoothing constant, set to 1/8 as in TCP's smoothed RTT estimation.
Replacement algorithm:
–Let ser(i) denote the server on which document i resides, and s_i denote the document size. LAT selects for replacement the document i with the smallest download time estimate d_i:
    d_i = clat_ser(i) + s_i / cbw_ser(i)
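A minimal sketch of the per-server estimation and the eviction choice (the per-server table, the measured samples, and the document list are assumed to be supplied by the proxy; initializing an unseen server with its first sample is an assumption, not from the source):

    ALPHA = 1 / 8

    def update_estimates(est, server, s_clat, s_cbw):
        # est: dict server -> (clat, cbw), exponentially smoothed as in TCP RTT estimation
        clat, cbw = est.get(server, (s_clat, s_cbw))
        est[server] = ((1 - ALPHA) * clat + ALPHA * s_clat,
                       (1 - ALPHA) * cbw + ALPHA * s_cbw)

    def lat_select_victim(docs, est):
        # docs: iterable of (url, server, size); evict the document with the smallest
        # estimated download time d_i = clat_ser(i) + s_i / cbw_ser(i)
        def d(doc):
            _, server, size = doc
            clat, cbw = est[server]
            return clat + size / cbw
        return min(docs, key=d)[0]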

24 Latency Estimation Algo. (LAT)
One detail remains:
–A proxy runs at the application layer of the network protocol stack, and therefore cannot obtain the connection latency samples s_clat directly.
–Therefore the following heuristic is used to estimate connection latency: a constant CONN is chosen (e.g., 2 Kbytes), and the download time of every document the proxy receives whose size is less than CONN is used as an estimate of the connection latency s_clat.
–Every document whose size exceeds CONN is used as a bandwidth sample, as follows: s_cbw = download time of the document minus the current value of clat_j.

25 Lowest Relative Value (LRV)
LRV estimates the value of a document from three parameters:
–Time from the last access t: chosen for its large influence on the probability of a new access; the probability of a new access conditioned on the time from the last access can be expressed as (1 - D(t)).
–Number of previous accesses i: this parameter allows the proxy to select a relatively small number of documents with a much higher probability of being accessed again.
–Document size s: this seems to be the most effective parameter for making a selection among documents with only one access.

26 Distribution of interaccess times, D(t)

27 Prob. Density function of interaccess times, d(t)

28 Lowest Relative Value (LRV)
We compute the probability that a document is accessed again, Pr(i, t, s), as follows:
    Pr(i, t, s) = P_1(s) * (1 - D(t))   if i = 1
    Pr(i, t, s) = P_i * (1 - D(t))      otherwise
–P_i: conditional probability that a document is referenced i+1 times given that it has been accessed i times
–P_1(s): percentage of documents of size s with at least 2 accesses
–D(t): distribution of times between consecutive requests to the same document, fitted as
    D(t) = 0.035 log(t+1) + (1 - e^(-t/2E6))
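A small sketch computing this value for a cached document (the D(t) expression follows the fit as reconstructed above, so treat its exact constants as tentative; the P_i table and P_1(s) estimator are assumed to come from trace analysis, as on the slide):

    import math

    def D(t):
        # interaccess-time distribution, constants as reconstructed from the slide's fit
        return min(1.0, 0.035 * math.log(t + 1) + (1 - math.exp(-t / 2e6)))

    def pr_access_again(i, t, s, P, P1_of_size):
        # Pr(i, t, s): probability the document is accessed again (evict the lowest value)
        # P: dict i -> conditional probability of an (i+1)-th access after i accesses
        # P1_of_size: function size -> fraction of size-s documents with >= 2 accesses
        if i == 1:
            return P1_of_size(s) * (1 - D(t))
        return P[i] * (1 - D(t))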

29 Lowest Relative Value (LRV)

30 Lowest Relative Value (LRV)

31 Performance from Pei Cao
Metrics: hit ratio, byte hit ratio, reduced latency, and reduced hops.
–Reduced latency = the sum of the downloading latencies for the pages that hit in the cache, as a percentage of the sum of all downloading latencies.
–Reduced hops = the sum of the network costs for the pages that hit in the cache, as a percentage of the sum of the network costs of all Web pages.
The network cost of each document is modeled as hops:
–Each Web server has a hop value of 1 or 32; 1/8 of the servers are assigned hop value 32 and 7/8 hop value 1.
–The hop value can be thought of either as the number of network hops traveled by a document or as the monetary cost associated with the document.

32 Performance from Pei Cao
GD-Size(1) sets the cost of each document to 1, thus trying to maximize hit ratio.
GD-Size(packets) sets the cost of each document to 2 + size/536, i.e., the estimated number of network packets sent and received if a miss for the document occurs.
–1 packet for the request, 1 packet for the reply, and size/536 for the extra data packets, assuming a 536-byte TCP segment size.
–It tries to maximize both hit ratio and byte hit ratio.
Finally, GD-Size(hops) sets the cost of each document to the hop value of the document, trying to minimize network cost.
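The three cost settings can be written as small cost functions and plugged into a GD-Size implementation such as the sketch given earlier (the per-server hop value is assumed to be looked up elsewhere):

    def cost_hit_ratio(size, hops):
        return 1                      # GD-Size(1): maximize hit ratio

    def cost_packets(size, hops):
        return 2 + size / 536         # GD-Size(packets): request + reply + data packets

    def cost_hops(size, hops):
        return hops                   # GD-Size(hops): minimize network cost

For example, an access could be recorded as cache.access(url, size, cost_packets(size, hops)).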

33 Performance from Pei Cao
See Cao's paper, page 4.

34 Weighted Hit Rate
Results on the best primary key are inconclusive.
Most references are to small files, but most bytes are from large files.
Why size?
–Most accesses are for smaller documents.
–A few large documents take the space of many small documents.
–Concentration of large inter-reference times.

35 Exp. 3: Partitioning Cache by Media
Idea:
–Do clients that listen to music degrade the performance of clients using text and graphics?
–Could a partitioned cache, with one portion dedicated to audio and the other to non-audio documents, increase the WHR experienced by either audio or non-audio documents?
Simulation:
–cache size = 10% of the maximum needed
–two partitions: audio and non-audio

36 Exp. 4: Partitioning Cache by Media
In Experiment 4:
–a one-level cache with SIZE as the primary key and random as the secondary key
–three partition sizes: dedicate 1/4, 1/2, or 3/4 of the cache to audio
–the rest is dedicated to non-audio documents

37 Exp. 4: Partitioning Cache by Media

38 Exp. 4: Partitioning Cache by Media

39 Problems to solve
Certain sorting keys have intuitive appeal.
–The first is document type. A sorting key that puts text documents at the front of the removal queue would ensure low latency for text in Web pages, at the expense of latency for other document types.
–The second is refetch latency. To a user of international documents, the most obvious caching criterion is one that caches documents so as to minimize overall latency. A European user of North American documents would preferentially cache those documents over ones from other European servers, to avoid heavily utilized transatlantic network links. Therefore, a means of estimating the latency for refetching the documents in a cache could be used as a primary sorting key.

40 Problems to solve
Caching dynamic documents.
–Caching is useless for a dynamic document only if its content completely changes; otherwise a portion, but not all, of the cached copy remains valid.
–One approach is to allow caches to request the differences between the cached version and the latest version of a document.

41 Problems to solve
For example, in response to a conditional GET a server could send the "diff" of the current version and the version matching the Last-Modified date sent by the client; or a specific tag could allow a server to "fill in" a previously cached static "query response form."
–Another approach for semi-static pages (i.e., pages that are HTML but replaced often) is to allow Web servers to preemptively update inconsistent document copies, at least for the most popular ones.

42 Randomized Strategies
These strategies use randomized decisions to find an object for replacement.

43 Randomized Strategies
1. RAND
–This strategy removes a random object.
2. HARMONIC [Hosseini-Khayat 1997]
–Where RAND uses equal probability for each object, HARMONIC removes one item at random with a probability inversely proportional to its specific cost, cost_i = c_i/s_i.

44 Randomized Strategies
3. LRU-C and LRU-S [Starobinski and Tse 2001]
–LRU-C is a randomized version of LRU. Let c_max = max{c_1, …, c_N} be the maximum of the access costs of all N objects of a request sequence, and let ĉ_i = c_i/c_max be the normalized cost of object i. When object i is requested, it is moved to the head of the cache with probability ĉ_i; otherwise, nothing is done.

45 Randomized Strategies
LRU-S uses the size instead of the cost. Let s_min = min{s_1, …, s_N} be the size of the smallest object among the N documents, and d_i = s_min/s_i be the normalized density of object i. LRU-S acts as LRU with probability d_i; otherwise the cache state is left unmodified.
–Furthermore, Starobinski and Tse [2001] proposed an algorithm that deals with objects of both varying size and varying cost: upon a request for object i, it performs the same operation as LRU with a probability defined from the corresponding normalized quantities, and otherwise leaves the cache state unmodified.

46 Randomized Strategies
4. Randomized replacement with general value functions [Psounis and Prabhakar 2001]
–This strategy draws N objects randomly from the cache and evicts the least useful object in the sample. The usefulness of a document can be determined by any utility function. After evicting the least useful object, the next M (M < N) least useful objects of the sample are retained in memory.
–At the next replacement, N - M new samples are drawn from the cache, and the least useful of these N - M plus the M previously retained objects is evicted. The M least useful of the remaining objects are again retained, and so on.
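A minimal sketch of this sample-based scheme (the utility function is arbitrary, with lower meaning "evict first"; N and M are tunable, and the default values here are illustrative only):

    import random

    def sampled_evict(cache, utility, retained, n=30, m=5):
        # cache: dict url -> metadata; retained: list of URLs carried over from the last round
        live = [u for u in retained if u in cache]
        pool = [u for u in cache if u not in live]
        live += random.sample(pool, min(n - len(live), len(pool)))
        if not live:
            return None
        live.sort(key=lambda u: utility(u, cache[u]))   # least useful first
        victim = live[0]
        del cache[victim]                               # evict the least useful sampled object
        retained[:] = live[1:m + 1]                     # keep the next M least useful for next round
        return victim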

47 Randomized Strategies: Summary
1. Randomization presents a different approach to cache replacement.
2. Randomized strategies try to reduce the complexity of the replacement process without sacrificing too much quality.

48 Admission control
Should the response be stored in the cache at all?
–Idea: do not save an object into the cache on its first access.

49 Admission control
Heuristic for making this decision: the objects accessed most frequently in the recent past are the most likely to be accessed again.
–The words "frequently" and "recently" imply that the access frequency of objects, and a decay function applied to that frequency, are needed.
–An extra space called the URL cache is introduced to store the URLs and the associated access frequencies of requested objects.

50 Admission control
If the requested object is cacheable, storing the object in the disk cache is delayed until the same object is accessed again. (In other words, cacheable objects are not stored in the disk cache unless they have been accessed before.)
Since the access stream is unbounded, the size of the URL cache must be limited, so a replacement policy is also needed for the URL cache.

51 Admission control: operations
Cache hits:
–The operations are similar to the original algorithm.
–In addition to unused non-cacheable objects and hot objects in the memory cache, cacheable objects without disk copies are also candidates for replacement in the memory cache.
–Consider the case where a copy of the requested object exists in the memory cache but not in the disk cache. The reference count associated with the object in the memory cache is incremented by one, and the data is then stored in the disk cache. If an object evicted from the memory cache is cacheable, its URL along with its reference count is stored in the URL cache.

52 Admission control: operations
Cache misses for cacheable objects:
–If the requested object is cacheable, the caching algorithm checks whether its URL is stored in the URL cache.
(1) If its URL is not in the URL cache: replacement operations are performed to allocate enough space for the requested object. The URL of the replaced object is then stored in the URL cache along with its reference count, and any necessary replacement in the URL cache is performed; the URLs evicted from the URL cache are released. The requested object itself is not stored in the disk cache at this moment, so no replacement in the disk cache is needed.

53 Admission control: operations
Cache misses for cacheable objects:
(2) If the URL of the requested object is stored in the URL cache, its record in the URL cache is removed, the requested object is stored in the disk cache, and its reference count is set to one. The necessary replacement operations in the disk cache are performed; the URLs of the objects evicted from the disk cache are stored in the URL cache, and again any replacement operations in the URL cache are performed.

54 Admission control: operations
Cache misses for non-cacheable objects:
–For a cache miss on a non-cacheable object, the operations are similar to the original algorithm. If an object evicted from the memory cache is cacheable and does not exist in the disk cache, its URL along with its reference count is stored in the URL cache.
–Notice that the proposed approach may lose some possible disk-cache hits when objects are accessed for the second time. However, it removes all the disk activity spent storing objects in the disk cache that will never be accessed again before being evicted.

55 Admission control
Efficient management of the URL cache:
–A separate hash table, similar to that used for the memory/disk cache, is used in the URL cache to support efficient search for the URL of a requested object.
–The MD5 of the URL is employed as the search key.
–We employ a replacement policy based on URL access frequency: the least frequently accessed entry in the URL cache is selected first for replacement.
–A priority queue with access frequency as the key is a suitable implementation for such a replacement policy.
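A minimal sketch of such a URL cache (MD5 keys and frequency-ordered eviction; a heap with lazily discarded stale entries stands in for the priority queue described above, and the interface is an assumption, not the paper's):

    import hashlib
    import heapq

    class URLCache:
        def __init__(self, max_entries):
            self.max_entries = max_entries
            self.freq = {}            # md5(url) -> access frequency
            self.heap = []            # (frequency, md5); one entry per update, some stale

        def record_access(self, url):
            key = hashlib.md5(url.encode()).hexdigest()
            self.freq[key] = self.freq.get(key, 0) + 1
            heapq.heappush(self.heap, (self.freq[key], key))
            while len(self.freq) > self.max_entries:
                f, victim = heapq.heappop(self.heap)
                if self.freq.get(victim) == f:        # skip stale heap entries
                    del self.freq[victim]             # evict the least frequently accessed URL
            return self.freq.get(key, 0)              # current frequency (0 if just evicted)

A proxy could then admit an object to the disk cache only when record_access(url) shows the URL has been seen before, matching the store-on-second-access rule above.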

56 Admission control
Efficient management of the URL cache:
–Each entry of the URL cache records the MD5 of the URL, the access frequency, and a few pointers for the priority queue and hash table data structures.
–The required memory space for each entry in the URL cache is constant.
–The size of the hash table and priority queue structures themselves is small and does not depend on the number of entries hashed, so it can be ignored.
–Based on the size of the UC trace studied in this paper, keeping all the URLs requested in a one-day period in the URL cache is reasonable. This amounts to about 400k URLs; assuming 80 bytes per entry, about 32 MB of memory is needed for the URL cache.

57 Hit ratio: h(S) and h_eff(S)

58 Removal frequency
On-demand: run the policy when the size of the requested document exceeds the free room in the cache (the removal itself takes time).
Periodically: run the policy every T time units, for some T.
–Useful if removal is time consuming.
Both on-demand and periodically: run the policy at the end of each day and on-demand (Pitkow/Recker [13]).

59 On-demand
Two arguments suggest that the overhead of simply using on-demand replacement will not be significant.
–First, this class of removal policies maintains a sorted list. If the list is kept sorted as the proxy operates, then the removal policy merely removes the head of the list, which should be a fast, constant-time operation.
–Second, a proxy server keeps read-only documents. Thus there is no overhead for "writing back" a document, as there is in a virtual memory system upon removal of a page that was modified after being loaded.

60 How many to remove
Stop the removal process when the free cache area equals or exceeds the requested document size, or
Replace documents until a certain threshold (Pitkow and Recker's "comfort level") is reached.