Multicache-Based Content Management for Web Caching
Kai Cheng and Yahiko Kambayashi
Graduate School of Informatics, Kyoto University, Kyoto, Japan
Outline of the Presentation
Introduction
–Localizing Web Contents
–Why Content Management
–Contributions of Our Work
Multicache-Based Content Management
Content Management Scheme for LRU-SP
Experimental Evaluation
Concluding Remarks
Web Caching For Localizing Web Contents
World Wide Content Access/Delivery
–Bandwidth Constraints
–"Hot-Spot" Servers
–Inherent Latency (200–300 ms)
Web Caching For Localizing Web Contents
–Reduce Network Traffic
–Distribute Server Load
–Reduce Response Times
–Can We Expect More?
Characteristics and Implications
Traditional Caching | Web Caching         | Implications
Process Oriented    | Human-User Oriented | User Preferences
System-Level        | Application-Level   | Semantic Information
Data Block Based    | Document-Based      | Varying Sizes, Types
Memory-Based        | Disk-Based          | Persistent Storage, Large Size
Limitations of Current Caching Schemes
Documents Managed As Physical Units, Not Semantic Units
Only Physical Properties Are Used
Less Organized, Less Structured
Only Simple Control Logic Is Supported
Beyond Simple Priority Queues, Towards Sophisticated Content Management
Content Management
Basic Features
–Larger Cache Space
–Sophisticated Control Logic
More Challenging
–Sophisticated Replacement Policies With User-Oriented Performance Metrics
Document Managed as Semantic Unit
Contributions of This Work
A Multicache Architecture for Implementing Sophisticated Content Management
A Study of Content Management for LRU-SP
Simulations to Compare LRU-SP Against Others
Previous Work
Classifications (Cache Data)
–LRV, LNC-W3-U, etc.
Segmentation (Cache Space)
–Segmented FIFO, FBR, 2Q, etc.
Features
–Differentiating Data With Different Properties
Shortcomings
–No Sophisticated Categories
–No Semantic-Based Classification
Managing LFU Contents in Multiple Priority Queues
[Figure: objects labeled with their reference counts, e.g. A(10), B(8), C(6), D(3), E(2), F(2), G(1), H(1), grouped into priority queues for reference counts 1, 2, and >2; within each queue, references are kept in first-in-first-out order; arrows mark hits and outs.]
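A minimal sketch of this queue structure in Python, assuming hypothetical names (MultiQueueLFU and its methods); for simplicity it keeps one FIFO queue per reference count rather than the capped 1 / 2 / >2 grouping shown in the figure:

```python
from collections import OrderedDict, defaultdict

class MultiQueueLFU:
    """Sketch: cached objects grouped into FIFO queues keyed by reference count."""

    def __init__(self):
        # queues[count] preserves insertion (FIFO) order within each count level
        self.queues = defaultdict(OrderedDict)
        self.counts = {}  # object id -> current reference count

    def insert(self, obj_id):
        """Admit a new object into the count-1 queue."""
        self.counts[obj_id] = 1
        self.queues[1][obj_id] = True

    def hit(self, obj_id):
        """On a hit, promote the object to the queue for its new reference count."""
        count = self.counts[obj_id]
        del self.queues[count][obj_id]
        self.counts[obj_id] = count + 1
        self.queues[count + 1][obj_id] = True

    def evict(self):
        """Evict the oldest object from the lowest non-empty count queue."""
        for count in sorted(self.queues):
            if self.queues[count]:
                victim, _ = self.queues[count].popitem(last=False)
                del self.counts[victim]
                return victim
        return None
```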
Basics of Cache
Space
–Limited Storage Space
Contents
–Objects Selected for Caching
Policies
–Replacement Policies
Constraints
–Special Conditions
Constraints for Cache
Admission Constraints
–Define Conditions for Objects Eligible For Caching
–e.g. (size < 2MB) && !(Source = local)
Freshness Constraints
–Define Conditions for Objects Fresh Enough For Re-Use
–e.g. (Type = news) && (Last-Modified < 1 week)
Miscellaneous Constraints
–e.g. (Time = end-of-day), (Total-Size < 95% * Cache-Size)
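Read as predicates over object metadata, the admission and freshness examples above might look roughly like this (a sketch; the field names size, source, type, and last_modified are assumptions, not the paper's actual schema):

```python
from datetime import datetime, timedelta

def admission_ok(obj):
    # e.g. (size < 2MB) && !(Source = local)
    return obj["size"] < 2 * 1024 * 1024 and obj["source"] != "local"

def fresh_enough(obj, now=None):
    # e.g. (Type = news) && (Last-Modified < 1 week)
    now = now or datetime.now()
    return obj["type"] == "news" and (now - obj["last_modified"]) < timedelta(weeks=1)
```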
Multicache Architecture
[Figure: a Web cache with multiple subcaches. A Central Router mediates request/response traffic between clients and the WWW, consulting the Constraints, the In-Cache index, the Cache Knowledge Base (CKB), the Judge, and the subcaches.]
Components of the Architecture
Central Router
–Controls and Mediates the Cache
Cache Knowledge Base (CKB)
–A Set of Rules To Allocate Objects
–R1. Allocate(X, 1) :- url(X, U), match(U, *.jp), content(X, baseball)
Subcaches
–Keep Objects With Special Characteristics
Cache Judge
–Makes Final Decisions From A Set of Eviction Candidates
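One way to encode a CKB rule such as R1 is as a predicate that returns a subcache ID when it matches (a sketch; the fnmatch-based host matching and the field names are assumptions, not the paper's rule engine):

```python
import fnmatch

def rule_r1(obj):
    """Sketch of R1: allocate to subcache 1 if the URL's host is in *.jp
    and the content category is baseball."""
    if fnmatch.fnmatch(obj["host"], "*.jp") and obj["content"] == "baseball":
        return 1
    return None

def fire_rules(rules, obj, default_subcache=0):
    """Return the subcache ID chosen by the first matching CKB rule."""
    for rule in rules:
        subcache_id = rule(obj)
        if subcache_id is not None:
            return subcache_id
    return default_subcache
```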
The Procedural Description
The Central Router services each request. Suppose the current request is for document p:
1. Locate p via the In-Cache Index
2. If p is not in the cache, download p, then:
 i. Validate the constraints; if they do not hold, skip caching and continue
 ii. Fire the rules in the CKB; let the chosen subcache ID be K
 iii. While there is not enough space in subcache K for p:
  –Subcache K selects an eviction candidate
  –If space sharing is enabled, the other subcaches do the same
  –The Judge assesses the eviction candidates
  –Purge the victim
 iv. Cache p in subcache K
3. If p is already in a subcache, do i)–iv) to re-cache p
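A sketch of this procedure in code (all object and method names here, e.g. cache.index, cache.ckb, cache.judge, are illustrative, not the paper's implementation):

```python
def handle_request(cache, url):
    """Sketch of the Central Router's handling of one request."""
    p = cache.index.lookup(url)                      # 1. locate p via the In-Cache Index
    if p is None:
        p = cache.download(url)                      # 2. miss: fetch p from the origin server
    if not cache.constraints_hold(p):                # i.  validate admission constraints
        return p                                     #     not cacheable: serve without caching
    k = cache.ckb.fire_rules(p)                      # ii. CKB rules pick subcache K
    while not cache.subcaches[k].has_space_for(p):   # iii. make room in subcache K
        candidates = [cache.subcaches[k].eviction_candidate()]
        if cache.space_sharing:                      # other subcaches may offer candidates too
            candidates += [s.eviction_candidate()
                           for s in cache.subcaches if s is not cache.subcaches[k]]
        victim = cache.judge.pick(candidates)        # the Judge assesses the candidates
        cache.purge(victim)
    cache.subcaches[k].store(p)                      # iv. cache (or re-cache) p in subcache K
    return p
```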
Content Management for LRU-SP
LRU (Least Recently Used)
–Primarily Designed for Equal-Sized Objects, With Only Recency of Reference Used
Extended LRUs
–Size-Adjusted LRU (SzLRU)
–Segmented LRU (SgLRU)
LRU-SP (Size-Adjusted and Popularity-Aware LRU)
–Makes SzLRU Aware of Popularity Degree
Probability of Re-Reference As a Function of Current Reference Times
Cost-To-Size Ratio Model
An Object A In Cache Saves Cost nref * (1/atime)
–nref is the frequency of reference
–atime is the time since the last access; (1/atime) is the dynamic frequency of A
When Put In Cache, A Takes Up Space size
–Cost-to-size ratio = nref / (size * atime)
The Object With the Least Ratio Is the Least Beneficial One
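A small worked example of the ratio (the numbers are purely illustrative):

```python
def cost_to_size_ratio(nref, size, atime):
    """Benefit per unit of space for keeping an object: nref / (size * atime)."""
    return nref / (size * atime)

# Object A: small, popular, recently used.  Object B: large, cold, idle longer.
ratio_a = cost_to_size_ratio(nref=10, size=4_000,   atime=60)    # ~4.2e-5
ratio_b = cost_to_size_ratio(nref=2,  size=200_000, atime=600)   # ~1.7e-8
assert ratio_b < ratio_a   # B has the least ratio, so B is the least beneficial object
```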
Content Management of LRU-SP
CKB Rule
–Allocate(X, log(size/nref)) :- Size(X, size), Freq(X, nref)
Subcaches
–Least Recently Used (LRU)
Judge
–Finds the One With the Largest (size*atime)/nref
–The Larger, Older, and Colder an Object Is, the Faster It Will Be Purged
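Putting the rule and the Judge together (a sketch; the base-2 logarithm, the clamping to the number of subcaches, and the field names are assumptions):

```python
import math
import time

def allocate_subcache(size, nref, num_subcaches):
    """CKB rule: route an object to a subcache indexed by log(size / nref)."""
    k = int(math.log2(max(size / max(nref, 1), 1)))
    return min(k, num_subcaches - 1)

def pick_victim(candidates, now=None):
    """Judge: evict the candidate with the largest (size * atime) / nref,
    i.e. the largest, oldest, and coldest object."""
    now = now or time.time()
    return max(candidates,
               key=lambda c: c["size"] * (now - c["last_access"]) / c["nref"])
```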
Multicache Architecture for LRU-SP
[Figure: the CKB routes objects to three LRU subcaches; hits are served from the subcaches, and the Judge selects among their eviction candidates. Computational complexity: O(1).]
Predicted Results
A higher Hit Rate is expected for LRU-SP, because it uses three indicators of document popularity. However, a higher Hit Rate usually comes at the cost of a lower Byte Hit Rate, because, at similar popularity, smaller documents contribute fewer bytes of hit data.
Experiment Results: Better Than Expected
Results & Explanations
LRU-SP obtained a much higher Hit Rate than SzLRU, SgLRU, and LRV.
LRU-SP also obtained a high Byte Hit Rate, especially when the cache space exceeds 3% of the total required space.
Truly popular objects are retained, so both the Hit Rate and the Byte Hit Rate are improved.
LRU-SP incurs only O(1) time complexity in content management.
Concluding Remarks
The Multicache-Based Architecture Has Proved to Perform Well in Balancing High Performance and Low Overhead
It Is Possible To Incorporate Semantic Information as Well as User Preferences In Caching
It Can Work With General Database Systems to Support Web Information Integration (Future Work)
Thank You! And Welcome To