March 15, 2004. ACM Symposium on Applied Computing (ACM SAC) 2004. Dimitrios Katsaros, Yannis Manolopoulos. Data Engineering Lab, Department of Informatics.

Presentation transcript:

Caching in Web Memory Hierarchies
Dimitrios Katsaros, Yannis Manolopoulos
Data Engineering Lab, Department of Informatics, Aristotle Univ. of Thessaloniki, Greece
ACM Symposium on Applied Computing (ACM SAC), March 15, 2004

Web performance: the ubiquitous content cache
[Figure: origin server, reverse-proxy cache, and proxy caches (cooperating, hierarchical)]

Web caching benefits
Caching is important because, by reducing the number of requests:
– the network bandwidth consumption is reduced
– the user-perceived delay is reduced (popular objects are moved closer to clients)
– the load on the origin servers is reduced (servers handle fewer requests)

Content caching is still strategic
Is the optimization or fine-tuning of cache replacement a “moot point” given the ever-decreasing price of memory? Such a conclusion is ill-guided for several reasons:
– First, studies have shown that the cache hit ratio (HR) and byte hit ratio (BHR) grow in a log-like fashion as a function of cache size [3]. Thus, a better algorithm that increases HR by only several percentage points would be equivalent to a several-fold increase in cache size
– Second, the growth rate of Web content is much higher than the rate at which memory sizes for Web caches are likely to grow
– Finally, even a slight improvement in cache performance may have an appreciable effect on network traffic, especially when such gains are compounded through a hierarchy of caches
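The first argument can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the hit ratio follows HR(S) = a + b·ln(S); the slide only says "log-like", so this functional form, the function name, and the slope value are all hypothetical. Under that assumption, matching a gain of ΔHR by adding memory requires multiplying the cache size by exp(ΔHR / b).

```python
import math


def equivalent_size_factor(hr_gain: float, b: float) -> float:
    """Cache-size multiplier k such that b * ln(k) == hr_gain.

    Assumes HR(S) = a + b * ln(S); the intercept a cancels out.
    """
    if b <= 0:
        raise ValueError("b must be positive")
    return math.exp(hr_gain / b)


# Hypothetical slope: every doubling of the cache adds 2 points of hit ratio.
b = 0.02 / math.log(2)

# A policy that gains 4 points of hit ratio is then worth a 4x larger cache.
factor = equivalent_size_factor(0.04, b)
```

With a flatter curve (smaller b) the same gain corresponds to an even larger size multiplier, which is the point the slide makes.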

Web cache performance metrics
Replacement policies aim at improving cache effectiveness by optimising two performance measures:
– the hit ratio:
\[ HR = \frac{\sum_i h_i}{\sum_i r_i} \]
– the cost savings ratio:
\[ CSR = \frac{\sum_i c_i h_i}{\sum_i c_i r_i} \]
where $h_i$ is the number of references to object $i$ satisfied by the cache, $r_i$ is the total number of references to $i$, and $c_i$ is the cost of fetching object $i$ into the cache.
The cost can be defined as:
– the object size $s_i$. Then CSR coincides with BHR (byte hit ratio)
– the downloading latency $c_i$. Then CSR coincides with DSR (delay savings ratio)
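A minimal sketch of how these two metrics could be computed from per-object counters, assuming the usual definitions: HR is total hits over total references, and CSR is the same ratio with every term weighted by the fetch cost. The class and function names are illustrative and not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ObjectStats:
    hits: int    # h_i: references satisfied by the cache
    refs: int    # r_i: total references to the object
    cost: float  # c_i: fetch cost (size in bytes for BHR, latency for DSR)


def hit_ratio(stats: Iterable[ObjectStats]) -> float:
    stats = list(stats)
    total = sum(s.refs for s in stats)
    return sum(s.hits for s in stats) / total if total else 0.0


def cost_savings_ratio(stats: Iterable[ObjectStats]) -> float:
    stats = list(stats)
    total = sum(s.cost * s.refs for s in stats)
    return sum(s.cost * s.hits for s in stats) / total if total else 0.0


# Made-up workload: a small popular object and a large, rarely hit one.
workload = [
    ObjectStats(hits=9, refs=10, cost=1_000),
    ObjectStats(hits=1, refs=10, cost=9_000),
]
hr = hit_ratio(workload)            # 10 / 20
bhr = cost_savings_ratio(workload)  # 18_000 / 100_000
```

With cost set to object size the second function yields BHR. The example shows why the two metrics can diverge: the cache serves half of the requests but saves only 18% of the bytes.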

Challenges for a caching strategy
Several factors distinguish Web caching from caching in traditional computer architectures:
(a) the heterogeneity in objects' sizes,
(b) the heterogeneity in objects' fetching costs,
(c) the depth of the Web caching hierarchy, and
(d) the access patterns, which are not generated by a few programmed processes, but mainly originate from large human populations with diverse and varying interests

What has been done to address them? (1)
The majority of the replacement policies proposed so far fail to achieve a balance between (or optimize both) HR and CSR:
– The recency-based policies favour the HR, e.g., the family of GreedyDualSize algorithms [3, 7]
– The frequency-based policies favour the CSR (BHR or DSR), e.g., LFUDA [5]
Exceptions are LUV [2] and GD* [7], which combine recency and frequency.
– The drawback of LUV is the existence of a manually tunable parameter λ, used to “select” the recency-based or frequency-based behaviour of the algorithm
– GD* has a similar drawback, since it requires manual tuning of the parameter β

What has been done to address them? (2)
Regarding the depth of the caching hierarchy, Carey Williamson [15]:
– Demonstrated a change in the access pattern, which is characterized by weaker temporal locality
– Proposed the use of different replacement policies (LRU, LFU, GD-Size) at different levels of the caching hierarchy
This solution, though, is not feasible and/or acceptable:
– the caches are administratively independent
– the adoption of a replacement policy (e.g., LFU) at any level of the hierarchy favours one performance metric (CSR) over the other (HR)

What has been done to address them? (3)
The origin of the request streams has received little attention.
It is (in combination with the caching hierarchy depth) responsible for the large number of one-timers, i.e., objects requested only once.
Only SLRU [1] deals explicitly with this factor:
– It proposed the use of a small auxiliary cache to maintain metadata for past evicted objects
This approach:
– needs to heuristically determine the size of the auxiliary cache
– precludes some objects from entering the cache. Thus, it may result in slow adaptation of the cache to a changing request pattern

Why do we need a new caching policy?
– In a heterogeneous environment like the Web, optimizing only one of the two performance metrics is not enough. We would like a balance between HR and CSR (a balance between the average latency that the user sees and the traffic performance)
– Need to deal with the weak temporal locality in Web request streams
– Need to eliminate any “administratively” tunable parameters. The existence of parameters whose value is derived from statistical information extracted from Web traces (e.g., LNC-R-W3 [14] or LRV [12]) is not desirable due to the difficulty of tuning these parameters
Our contribution: CRF, a new caching policy dealing with all the particularities of the Web environment

CRF's design principles: BHR vs. DSR
The delay savings ratio is strongly affected by transient network and Web server conditions.
Two more reasons bring about significant variation in the connection time for identical connections:
– persistent HTTP connections, which avoid reconnection costs, and
– connection caching [4], which reduces connection costs
We favour the size (BHR) instead of the latency (DSR) of fetching an object as a measure of the cost.

CRF's design principles: One-timers
We partition the cache space:
– Cache partitioning has been followed by prior algorithms, e.g., FBR [13], but not for the purpose of isolating one-timers
– Only Segmented LRU [8] adopted partitioning for isolating one-timers. Experiments showed that (in the Web) it suffers from cache pollution
The cache has two segments, the R-segment and the I-segment:
– The cache segments are allowed to grow and shrink deliberately depending on the characteristics of the request stream
– The one-timers are accommodated in the R-segment. We do not further partition the I-segment, since doing so would make it very difficult to decide the segment from which the victim will be selected, and it would incur maintenance cost for moving objects from one segment to the other
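The two-segment layout can be sketched as a small data structure. This is only an illustration under stated assumptions: the slide says that one-timers live in the R-segment, while the promotion of an object to the I-segment on its second reference is an assumption made here, and the names `Entry`, `TwoSegmentCache`, and `access` are hypothetical. Eviction is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    size: int
    entry_time: float                 # time the object entered the cache
    last_ref: float                   # time of the most recent reference
    prev_ref: Optional[float] = None  # time of the penultimate reference


@dataclass
class TwoSegmentCache:
    r_segment: Dict[str, Entry] = field(default_factory=dict)  # seen once
    i_segment: Dict[str, Entry] = field(default_factory=dict)  # seen again

    def access(self, key: str, size: int, now: float) -> bool:
        """Record a reference to key; return True on a cache hit."""
        if key in self.i_segment:
            entry = self.i_segment[key]
            entry.prev_ref, entry.last_ref = entry.last_ref, now
            return True
        if key in self.r_segment:
            # Assumption: a second reference moves the object to the I-segment.
            entry = self.r_segment.pop(key)
            entry.prev_ref, entry.last_ref = entry.last_ref, now
            self.i_segment[key] = entry
            return True
        # First reference: the object is a potential one-timer.
        self.r_segment[key] = Entry(size=size, entry_time=now, last_ref=now)
        return False


cache = TwoSegmentCache()
first = cache.access("/index.html", size=2_048, now=1.0)
second = cache.access("/index.html", size=2_048, now=5.0)
```

Because neither segment has a fixed capacity in this sketch, each can grow or shrink with the request stream; only their combined size would be bounded once an eviction step is added.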

CRF's design principles: Ranking (1)
Two decisions must be made:
– the ranking of objects within each segment, and
– the selection of replacement victims
These decisions must satisfy three constraints/targets:
(a) balance between hit ratio and byte hit ratio,
(b) protect the cache from one-timers, but without preventing the cache from adapting to changing access patterns, and
(c) because of the weak temporal locality, exploit frequency-based replacement criteria

CRF's design principles: Ranking (2)
Aim for the R-segment (one-timers):
– accommodate as many objects as possible
– exploit any short-term temporal locality of the request stream
The ranking function for the R-segment: the ratio of the object's entry time over its size
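A minimal sketch of the R-segment ranking, computing entry time divided by size. The slide does not state which end of the ranking is evicted; the code assumes the object with the lowest rank (old and large) is the candidate victim, since that matches both stated aims. Function names and the sample objects are hypothetical.

```python
from typing import Dict, Tuple


def r_rank(entry_time: float, size: int) -> float:
    """Rank of an R-segment object: entry time divided by size."""
    if size <= 0:
        raise ValueError("size must be positive")
    return entry_time / size


def pick_r_victim(objects: Dict[str, Tuple[float, int]]) -> str:
    """Return the key with the lowest rank (assumed eviction order)."""
    return min(objects, key=lambda key: r_rank(*objects[key]))


# (entry_time, size_in_bytes) for three made-up objects
r_segment = {
    "old_big": (10.0, 5_000),
    "old_small": (10.0, 500),
    "new_big": (90.0, 5_000),
}
r_victim = pick_r_victim(r_segment)
```

Evicting old objects first keeps recently admitted ones around for a possible second reference, and evicting large objects first leaves room for more small ones.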

CRF's design principles: Ranking (3)
Aim for the I-segment (heart of the cache):
– provide a balance between HR and BHR
– deal with the weak temporal locality
The ranking function for the I-segment: the product of the last inter-reference time of an object and the recency of the object
– the inter-reference time stands for the steady-state popularity (frequency of reference) of an object
– the recency stands for a transient preference for an object
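The I-segment ranking can be sketched in the same way. The code multiplies the last inter-reference time by the recency, as described; treating the object with the largest product as the candidate victim is an assumption, chosen because a long gap between references and a long time since the last one both suggest fading popularity. Names and timestamps are hypothetical.

```python
from typing import Dict, Tuple


def i_rank(prev_ref: float, last_ref: float, now: float) -> float:
    """Last inter-reference time multiplied by recency."""
    inter_reference = last_ref - prev_ref  # steady-state popularity
    recency = now - last_ref               # transient preference
    return inter_reference * recency


def pick_i_victim(objects: Dict[str, Tuple[float, float]], now: float) -> str:
    """Return the key with the largest product (assumed eviction order)."""
    return max(objects, key=lambda key: i_rank(*objects[key], now))


# (penultimate_reference_time, last_reference_time) for made-up objects
i_segment = {
    "steady": (80.0, 90.0),  # 10 * 10 = 100
    "fading": (20.0, 40.0),  # 20 * 60 = 1200
    "bursty": (97.0, 98.0),  #  1 *  2 = 2
}
i_victim = pick_i_victim(i_segment, now=100.0)
```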

CRF's design principles: Replacement victim (1)
– R-victim: the candidate victim from the R-segment
– I-victim: the candidate victim from the I-segment
– $t_c$: the current time
– $R_1$: the reference time of the R-victim
– $I_1$: the time of the penultimate reference to the I-victim
– $I_2$: the time of the last reference to the I-victim
– $\delta_1 = t_c - I_2$: the reference recency of the I-victim
– $\delta_2 = t_c - R_1$: the reference recency of the R-victim
– $\delta_3 = I_2 - I_1$: the last inter-reference time of the I-victim
Estimate whether or not the I-victim is losing its popularity, and also the potential of the R-victim to get a second reference
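The three quantities follow directly from the four timestamps, as the short sketch below shows. It stops at computing them: the rule that compares the values to choose between the two candidates is not given in the text, so none is implemented here. The type and function names are hypothetical.

```python
from typing import NamedTuple


class VictimDeltas(NamedTuple):
    delta1: float  # t_c - I_2: reference recency of the I-victim
    delta2: float  # t_c - R_1: reference recency of the R-victim
    delta3: float  # I_2 - I_1: last inter-reference time of the I-victim


def victim_deltas(t_c: float, r1: float, i1: float, i2: float) -> VictimDeltas:
    """Compute the three deltas from the current time and reference times."""
    if not (i1 <= i2 <= t_c and r1 <= t_c):
        raise ValueError("expected I1 <= I2 <= t_c and R1 <= t_c")
    return VictimDeltas(delta1=t_c - i2, delta2=t_c - r1, delta3=i2 - i1)


# Made-up timestamps: the I-victim was referenced at 40 and 60, the
# R-victim entered at 95, and the current time is 100.
deltas = victim_deltas(t_c=100.0, r1=95.0, i1=40.0, i2=60.0)
```

In this example the recency of the I-victim (40) is twice its last inter-reference time (20), the kind of signal that could indicate it is losing popularity.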

CRF's design principles: Replacement victim (2)
[Figure labels: R-victim, I-victim]

CRF's pseudocode (1)

CRF's pseudocode (2)

CRF's performance evaluation
Examined CRF against LRU, LFU, Size, LFUDA, GDS, SLRU, LUV, HLRU, LNCRW3:
– GDS is the representative of the family that includes GDS and GDSF
– HLRU(6) is the representative of the HLRU family
– LNCRW3 was implemented so as to optimise the BHR instead of the DSR
– LUV tuning: we tried several values for the λ parameter and selected the value 0.01, because it gave the best performance for small caches and the best performance in most cases
Generated synthetic Web request streams with the ProWGen tool [15]

CRF's performance evaluation
Input parameters to the ProWGen tool

Sensitivity to one-timers: recency-based
Left: Hit Rate. Right: Byte Hit Rate.

Sensitivity to one-timers: frequency-based
Left: Hit Rate. Right: Byte Hit Rate.

Sensitivity to one-timers (aggregate)
CRF's gain-loss with respect to one-timers

Sensitivity to Zipfian slope: recency-based
Left: Hit Rate. Right: Byte Hit Rate.

Sensitivity to Zipfian slope: frequency-based
Left: Hit Rate. Right: Byte Hit Rate.

Sensitivity to Zipfian slope (aggregate)
CRF's gain-loss with respect to Zipfian slope

Conclusions
– We proposed a new replacement policy for Web caches, the CRF policy
– CRF was designed to address all the particularities of the Web environment
– The performance evaluation confirmed that CRF is a hybrid between recency-based and frequency-based policies
– CRF exhibits stable and overall improved performance

Thank you for your attention

References (1)
C. Aggarwal, J. Wolf and P.S. Yu. Caching on the World Wide Web. IEEE Transactions on Knowledge and Data Engineering, 11(1):94–107.
H. Bahn, K. Koh, S.H. Noh and S.L. Min. Efficient replacement of nonuniform objects in Web caches. IEEE Computer, 35(6):65–73.
L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker. Web caching and Zipf-like distributions: Evidence and implications. Proceedings IEEE INFOCOM Conf.
P. Cao and S. Irani. Cost-aware WWW proxy caching algorithms. Proceedings USITS Conf, pp.193–206.
E. Cohen, H. Kaplan and U. Zwick. Connection caching: model and algorithms. Journal of Computer and System Sciences, 67(1):92–126.
J. Dilley and M. Arlitt. Improving proxy cache performance: analysis of three replacement policies. IEEE Internet Computing, 3(6):44–50.
S. Jiang and X. Zhang. LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance. Proceedings ACM SIGMETRICS Conf, pp.31–42.
S. Jin and A. Bestavros. GreedyDual* Web caching algorithm: exploiting the two sources of temporal locality in Web request streams. Computer Communications, 24(2):174–183.

References (2)
R. Karedla, J.S. Love and B.G. Wherry. Caching strategies to improve disk system performance. IEEE Computer, 27(3):38–46.
N. Megiddo and D. S. Modha. ARC: a self-tuning low overhead replacement cache. Proceedings USENIX FAST Conf.
A. Nanopoulos, D. Katsaros and Y. Manolopoulos. A data mining algorithm for generalized Web prefetching. IEEE Transactions on Knowledge and Data Engineering, 15(5):1155–1169.
L. Rizzo and L. Vicisano. Replacement policies for a proxy cache. IEEE/ACM Transactions on Networking, 8(2):158–170.
J. Shim, P. Scheuermann and R. Vingralek. Proxy cache algorithms: design, implementation and performance. IEEE Transactions on Knowledge and Data Engineering, 11(4):549–562.
A. Vakali. Proxy cache replacement algorithms: a history-based approach. World Wide Web Journal, 4(4):277–297.
C. Williamson. On filter effects in Web caching hierarchies. ACM Transactions on Internet Technology, 2(1):47–77.