Hjortur Bjornsson (University of Iceland), Ymir Vigfusson (Emory University / Reykjavik University), Trausti Saemundsson (Reykjavik University), Gregory Chockler (University of London, Royal Holloway)
[Figure: clients send queries to Memcache servers, which are backed by a database tier; results flow back to the clients]
Too many cache servers are a waste of resources.
Too few cache servers overload the database. (Dramatization.)
How do we optimize cache resources?
The key parameter is the cache size. [Figure: hit rate curve, marking the hit rate for the current allocation]
How do we optimize cache resources? Efficiency: time overhead, space overhead. Accuracy: high fidelity to true hit rate curves, provable guarantees. Usability: simple interface, modularity.
MIMIR architecture. Cache server: the replacement algorithm serves Get/Set(e) and reports Hit(e), Miss(e), Set(e) and Evict(e) to the HRC estimator. HRC estimator: ghost list/ghost filter, aging policy, Export-HRC().
Inclusion property: given the same input, the contents of a smaller cache are contained in a bigger cache. Holds for LRU, LFU, OPT, ... Idea: produce LRU hit rate curves by tracking each item's distance from the head of the LRU list (its stack distance).
Mattson et al. (1970): on every hit, determine the stack distance and accumulate it in a hit rate curve PDF (x-axis: stack distance; y-axis: # of hits). [Figure: LRU list g,h,u,f,e,r,t,b; a hit on e gives e,g,h,u,f,r,t,b; a hit on t then gives t,e,g,h,u,f,r,b] Walking a linked list on every access is inefficient.
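The linear-walk approach can be sketched as follows. This is a minimal illustration rather than the original 1970 formulation: the function names are hypothetical, a Python list stands in for the linked list, and stack distances are counted from 1 at the head.

```python
from collections import Counter

def lru_stack_distances(trace):
    """Histogram {stack_distance: hits} for an LRU stack (distance 1 = head)."""
    stack = []                 # index 0 holds the most recently used item
    hist = Counter()
    for item in trace:
        if item in stack:
            depth = stack.index(item) + 1   # linear walk: O(N) per access
            hist[depth] += 1
            stack.pop(depth - 1)
        stack.insert(0, item)               # move (or add) the item to the head
    return hist

def hit_rate_curve(hist, total_requests, max_size):
    """Cumulative hit rate for every cache size 1..max_size."""
    curve, hits = [], 0
    for size in range(1, max_size + 1):
        hits += hist.get(size, 0)
        curve.append(hits / total_requests)
    return curve
```

By the inclusion property, a hit at stack distance d is a hit in every LRU cache of size at least d, which is why the cumulative sum of the histogram yields the hit rate curve.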
Bennett & Kruskal (1975), Almási et al. (2002): could use self-balancing binary search trees (AVL trees, red-black trees, ...) to find the stack distance in O(log N). But trees accentuate lock contention, hurting performance.
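To illustrate how a tree structure brings the per-access cost down to O(log N), here is a sketch that swaps in a Fenwick (binary indexed) tree over access times instead of the AVL or red-black trees named above. It is an offline illustration under that assumption, not the cited authors' algorithm, and the function name is hypothetical.

```python
def stack_distances_fenwick(trace):
    """Stack distance of each access (None for a first-time access)."""
    n = len(trace)
    tree = [0] * (n + 1)       # 1-indexed Fenwick tree over access times

    def add(i, delta):
        while i <= n:
            tree[i] += delta
            i += i & -i

    def prefix(i):
        total = 0
        while i > 0:
            total += tree[i]
            i -= i & -i
        return total

    last_seen = {}             # item -> time of its most recent access
    distances = []
    for t, item in enumerate(trace, start=1):
        p = last_seen.get(item)
        if p is None:
            distances.append(None)
        else:
            # Distinct items accessed after time p, plus the item itself.
            distances.append(prefix(t - 1) - prefix(p) + 1)
            add(p, -1)         # time p is no longer the item's latest access
        add(t, 1)
        last_seen[item] = t
    return distances
```

Each access performs a constant number of O(log N) tree operations, but every access also writes to the shared tree, which hints at why such structures are prone to lock contention in a concurrent cache server.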
Prior focus on exact results. Prior focus on very high stack distances. Can we trade off accuracy for efficiency?
Extend the LRU list with dataless "ghost" entries. [Figure labels: N, 2N]
LRU list before a hit on item e, divided into B buckets: [g,h,u] [e,r,t] [f] [b,c,d,a]. Each item tracks which bucket it is in.
PDF update: update the statistics for e's bucket by adding a unit area, uniformly distributed over the bucket. Buckets: [g,h,u] [e,r,t] [f] [b,c,d,a]. [Figure: hit rate curve (PDF); x-axis: stack distance; y-axis: estimated # of hits]
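The PDF update can be sketched as below, assuming the bucket layout is read as [g,h,u] [e,r,t] [f] [b,c,d,a], that is, bucket sizes 3, 3, 1 and 4. The function name and the 0-based indexing of buckets and stack distances are assumptions made for illustration.

```python
def record_hit(pdf, bucket_sizes, bucket):
    """Spread one hit uniformly over the stack distances covered by `bucket`."""
    start = sum(bucket_sizes[:bucket])    # items sitting in younger buckets
    width = bucket_sizes[bucket]
    for d in range(start, start + width):
        pdf[d] += 1.0 / width             # unit area split evenly over the bucket

# Hit on e, which sits in the second bucket (index 1) of [3, 3, 1, 4].
pdf = [0.0] * 11
record_hit(pdf, [3, 3, 1, 4], 1)
```

The true stack distance of e lies somewhere inside its bucket, so the estimate trades exactness for not having to track each item's precise position.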
Move e to the front and tag it with the first bucket: [e,g,h,u] [r,t] [f] [b,c,d,a]. Overflow of the first bucket is looming.
When the first bucket is full, perform aging. Stacker: walk the list and bump items below the average reuse distance to an older bucket on the right. Rounder: decrement bucket identifiers, shifting the frame of reference, and coalesce the two oldest buckets. [Cost labels on slide: O(B); amortized O(1)]
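One way to picture Rounder-style aging is the sketch below. It is an interpretation of the slide, not the authors' code: the class and method names are hypothetical, tags are absolute bucket identifiers that only ever decrease, and tags that fall outside the window of B buckets are clamped into the oldest bucket.

```python
from collections import deque

class RounderBuckets:
    """Bucket bookkeeping where aging never touches individual items."""

    def __init__(self, num_buckets):
        if num_buckets < 2:
            raise ValueError("need at least two buckets to coalesce")
        self.B = num_buckets
        self.head = 0                          # absolute id of the newest bucket
        self.sizes = deque([0] * num_buckets)  # sizes[0] is the newest bucket

    def tag_for_new_item(self):
        """Count an item into the newest bucket and return the tag it stores."""
        self.sizes[0] += 1
        return self.head

    def bucket_of(self, tag):
        """Relative bucket index; stale tags are clamped to the oldest bucket."""
        return min(tag - self.head, self.B - 1)

    def remove(self, tag):
        self.sizes[self.bucket_of(tag)] -= 1

    def age(self):
        oldest = self.sizes.pop()
        self.sizes[-1] += oldest       # coalesce the two oldest buckets
        self.sizes.appendleft(0)       # open a fresh, empty newest bucket
        self.head -= 1                 # shift the frame of reference
```

In this sketch a hit would call `remove(tag)` on the item's old tag and then `tag_for_new_item()`, and `age()` would run whenever `sizes[0]` reaches a chosen capacity. Each `age()` call does a constant amount of work because item tags are reinterpreted relative to `head` rather than rewritten.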
Periodically calculate and export the hit rate curve: exponentially average the PDF, then accumulate it into the CDF. [Figures: hit rate curve (PDF), stack distance vs. estimated # of hits; hit rate curve (CDF), stack distance vs. cumulative hits]
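The export step might look like the following sketch. The function name and the smoothing weight `alpha` are assumptions; the slide does not say which weight is used.

```python
from itertools import accumulate

def export_hrc(current_pdf, previous_avg, alpha=0.5):
    """Blend the newest PDF into the running average and return (avg_pdf, cdf)."""
    # Exponential averaging: recent periods weigh more than older ones.
    avg = [alpha * c + (1 - alpha) * p
           for c, p in zip(current_pdf, previous_avg)]
    # The CDF entry at index d is the cumulative number of hits up to d.
    cdf = list(accumulate(avg))
    return avg, cdf

avg, cdf = export_hrc([2.0, 0.0, 2.0], [0.0, 2.0, 0.0])
```

Dividing the cumulative hits by the number of requests in the period would turn the CDF into a hit rate per cache size.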
Summary. Maintain a ghost list of length N. On a hit on item e: update the PDF statistics for e's bucket; move e to the front of the list and tag it with the first bucket; when the front bucket is full, perform aging. Periodically calculate and export the HRC. [Cost labels on slide: O(B) or O(1); O(1); O(log B)]
[Figure: accuracy of ROUNDER and STACKER; surviving labels: "90%", "Accuracy: %"]
[Figure: hit rate curve (PDF), stack distance vs. estimated # of hits. OPT: unit area at the true stack distance. ALG: unit area spread over the possible locations.]
Memcached + YCSB
#Buckets | MAE
8 | 1.2%
16 | 0.6%
32 | 0.4%
64 | 0.3%
Testbed (x 10): each node has 6 Intel Xeon 2.4GHz; 48GB DRAM; 40Gbps QDR Infiniband interconnect; shared storage
Memcached + YCSB: 2-5% throughput and latency degradation
Paper | Area | Key idea | Online | Precision | Parallel
Mattson et al. 1970 | Storage | Stack distance, LRU linear search | No | Exact | No
Kim et al. OSDI'00 | V-Memory | | No | Approx. | No
Almási et al. MSP'02 | Storage? | LRU AVL-Tree | No | Exact | No
Ding & Zhong PLDI'03 | Compilers | Compressed trees | No | Approx. | No
Zhou et al. ASPLOS'04 | V-Memory | | No | |
Geiger ASPLOS'06 | V-Memory | Infer page fault from I/O, ghost lists | No | Approx. | No
RapidMRC ASPLOS'09 | V-Memory | Fixed LRU buckets | No | Approx. | No
Hwang & Wood ICAC'13 | Network caches | Hash rebalancing | Yes | N/A |
Wires et al. OSDI'14 | Storage | LRU counter stacks | No* | Approx. | No
MIMIR SOCC'14 | Network caches | Variable LRU buckets | Yes | Approx. | Yes
Optimizing cache resources by profiling hit rates online. Efficiency: 2-5% performance degradation. Accuracy: % on traces. Usability: MIMIR is modular; simple algorithms.
Optimizing cache resources by profiling hit rates online. Efficiency: time O(log B), space O(1), 0-2% throughput degradation. Scalability: online profiling, composable estimate. Accuracy: % on traces; error = O(1/B).
Open questions: Can we do cost-benefit analysis of other cloud services? What should be evicted from a distributed cache? How should variable miss penalties be treated? Where should we place data in a cache hierarchy? Can we relieve cache hot spots?
Divide cache resources between services: progressively allocate space to the service with the highest marginal hit rate. Optimal for concave HRCs. [IBM J. of R&D '11, LADIS '11]
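The progressive allocation can be sketched as a simple greedy loop. The function name, the unit granularity, and the representation of each HRC as a list of cumulative hits indexed by allocated units are assumptions for illustration.

```python
def allocate(hrcs, total_units):
    """Greedy split of cache units; hrcs[s][k] = hits of service s with k units."""
    alloc = [0] * len(hrcs)
    for _ in range(total_units):
        best, best_gain = None, 0.0
        for s, curve in enumerate(hrcs):
            k = alloc[s]
            if k + 1 < len(curve):                # curve still has a next point
                gain = curve[k + 1] - curve[k]    # marginal hits of one more unit
                if best is None or gain > best_gain:
                    best, best_gain = s, gain
        if best is None:                          # every curve is exhausted
            break
        alloc[best] += 1
    return alloc

# Two hypothetical services with concave hit curves (made-up numbers).
shares = allocate([[0, 50, 80, 90], [0, 40, 60, 70]], 4)
```

When every curve is concave, its marginal gains are non-increasing, so taking the largest available gain at each step never forecloses a better choice later.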