TinyLFU: A Highly Efficient Cache Admission Policy


TinyLFU: A Highly Efficient Cache Admission Policy. Gil Einziger and Roy Friedman, Technion. Speaker: Gil Einziger.

Caching Internet Content: The access distribution of most content is skewed, often modeled using Zipf-like functions, power laws, etc. A small number of very popular items carries much of the weight (for example, ~50%), followed by a long heavy tail. [Figure: frequency vs. rank]

Caching Internet Content: Unpopular items can suddenly become popular and vice versa. ("Blackmail is such an ugly word. I prefer 'extortion'. The 'X' makes it sound cool.") [Figure: frequency vs. rank]

Caching: Any cache mechanism has to give some answer to two questions: eviction (which item to remove when the cache is full) and admission (should a new item enter the cache at all?). However, many works describing caching strategies, across many domains, completely neglect the admission question.

Eviction and Admission Policies: The eviction policy picks a victim from the cache; the admission policy then decides between the victim and the new item ("one of you guys should leave…"): is the new item any better than the victim? What is the common answer?

Frequency-based admission policy: the item that was more frequent recently should enter the cache. (The common answer, by contrast, is often "I'll just increase the cache size…")

Larger vs. Smarter: A frequency-based admission policy achieves better cache management than simply adding more memory to a cache without an admission policy. But what about the metadata size? [Figure: hit rate vs. cache size, with and without the admission policy]

Window LFU (WLFU): a frequency histogram over a sliding window of recent requests. A new item is admitted only if it is more frequent than the victim.
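
To make this concrete, here is a minimal WLFU bookkeeping sketch in Python, assuming a simple deque-based window; the class and method names are illustrative, not from the paper:

    from collections import Counter, deque

    class WindowLFU:
        """Exact frequency histogram over the last W requests."""
        def __init__(self, window_size):
            self.window = deque()          # the last W keys, in arrival order
            self.freq = Counter()          # key -> count within the window
            self.window_size = window_size

        def record(self, key):
            self.window.append(key)
            self.freq[key] += 1
            if len(self.window) > self.window_size:
                old = self.window.popleft()
                self.freq[old] -= 1
                if self.freq[old] == 0:
                    del self.freq[old]

        def admit(self, new_key, victim_key):
            # Admit the new item only if it was more frequent in the window.
            return self.freq[new_key] > self.freq[victim_key]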

Eliminate the Sliding Window: Keep inserting new items into the histogram until #items = W. Once #items reaches W, divide all counters by 2.
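
A sketch of this aging scheme with exact counters in a plain dict (an assumption for illustration; TinyLFU itself replaces the dict with the approximate structures described next):

    class AgingHistogram:
        """Frequency histogram without a sliding window: periodic halving."""
        def __init__(self, window_size):
            self.counts = {}
            self.size = 0
            self.window_size = window_size

        def add(self, key):
            self.counts[key] = self.counts.get(key, 0) + 1
            self.size += 1
            if self.size == self.window_size:
                self.reset()

        def reset(self):
            # Divide all counters (and the sample size) by 2.
            for key in list(self.counts):
                self.counts[key] //= 2
                if self.counts[key] == 0:
                    del self.counts[key]
            self.size //= 2

        def estimate(self, key):
            return self.counts.get(key, 0)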

Eliminating the Sliding Window: Correct: if the frequency of an item is constant over time, the estimate converges to the correct frequency regardless of its initial value. Not enough: we still need to store the keys, which can take much more space than the counters.

What are we doing? Using an approximate view of the past to predict the future. It is much cheaper to maintain an approximate view of the past.

Inspiration: Set Membership. A simpler problem: representing set membership efficiently. One option is a hash table; the problem is false positives (collisions), with a tradeoff between the size of the hash table and the false positive rate. Bloom filters generalize hash tables and provide a better space to false positive ratio.

Inspiration: Bloom Filters. An array BF of m bits and k hash functions {h1,…,hk} over the domain [0,…,m-1]. Adding an object obj to the Bloom filter is done by computing h1(obj),…,hk(obj) and setting the corresponding bits in BF. Checking set membership for an object cand is done by computing h1(cand),…,hk(cand) and verifying that all corresponding bits are set. Example (m=11, k=3): h1(o1)=0, h2(o1)=7, h3(o1)=5 — all three bits are set, so o1 is reported present (√); h1(o2)=0, h2(o2)=7, h3(o2)=4 — not all bits are set, so o2 is reported absent (×).
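
A minimal Bloom filter sketch; deriving the k indices from two halves of a SHA-256 digest (double hashing) is a common implementation choice assumed here, not something the slide specifies:

    import hashlib

    class BloomFilter:
        def __init__(self, m, k):
            self.bits = [0] * m
            self.m, self.k = m, k

        def _hashes(self, obj):
            # Derive k indices from two base hashes (standard double hashing).
            digest = hashlib.sha256(str(obj).encode()).digest()
            h1 = int.from_bytes(digest[:8], "big")
            h2 = int.from_bytes(digest[8:16], "big") | 1
            return [(h1 + i * h2) % self.m for i in range(self.k)]

        def add(self, obj):
            for idx in self._hashes(obj):
                self.bits[idx] = 1

        def contains(self, obj):
            return all(self.bits[idx] for idx in self._hashes(obj))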

Counting with Bloom Filters: a vector of counters instead of bits. A counting Bloom filter supports: Increment — add 1 to all entries that correspond to the results of the k hash functions; Decrement — subtract 1 from those entries; Estimate (instead of get) — return the minimal value of all corresponding entries. Example (m=11, k=3): with h1(o1)=0, h2(o1)=7, h3(o1)=5, Estimate(o1) returns the minimum of the three corresponding entries, here 4.
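
The same sketch with counters instead of bits, reusing the hashing of the BloomFilter above:

    class CountingBloomFilter(BloomFilter):
        """Counters instead of bits; inherits the hashing of the sketch above."""
        def increment(self, obj):
            for idx in self._hashes(obj):
                self.bits[idx] += 1

        def decrement(self, obj):
            for idx in self._hashes(obj):
                self.bits[idx] -= 1

        def estimate(self, obj):
            # The true count is at most the minimum over the k entries.
            return min(self.bits[idx] for idx in self._hashes(obj))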

Bloom Filters with Minimal Increment: sacrifices the ability to Decrement in favor of accuracy/space efficiency. During an Increment operation, only the lowest of the corresponding counters are updated. Example: with h1(o1)=0, h2(o1)=7, h3(o1)=5, Increment(o1) only adds to the smallest corresponding entry (3→4).
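
The minimal-increment variant changes only the increment path, bumping just the currently minimal entries:

    class MinimalIncrementCBF(CountingBloomFilter):
        """No decrement; an increment only bumps the currently minimal entries."""
        def increment(self, obj):
            idxs = self._hashes(obj)
            low = min(self.bits[i] for i in idxs)
            for i in idxs:
                if self.bits[i] == low:
                    self.bits[i] += 1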

Small Counters: A naive implementation would require counters of size log2(W). Can we do better? Assume the cache size is bounded by C (< W). An item belongs in the cache if its access frequency is at least 1/C; hence, the counters can be capped at W/C (log2(W/C) bits). Example: suppose the cache holds 2K items and the window size is 16K, so W/C = 8; each counter is only 3 bits long instead of 14 bits.
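
The slide's arithmetic, spelled out (values from the example above):

    import math
    W, C = 16_384, 2_048           # window of 16K requests, cache of 2K items
    cap = W // C                   # counters are capped at W/C = 8
    print(int(math.log2(cap)))     # -> 3 bits per capped counter
    print(int(math.log2(W)))       # -> 14 bits for a naive log2(W) counter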

Even Smaller Counters: Observation: in skewed distributions, the vast majority of items appear at most once in each window. Doorkeeper: divide the histogram into two MI-CBFs. The first level is a unary MI-CBF (each counter is only 1 bit, i.e., a plain Bloom filter). During an increment, only if all corresponding bits in the first-level MI-CBF are already set do we increment the corresponding counters of the second level.

TinyLFU operation (a Bloom filter as doorkeeper, plus an MI-CBF):
Estimate(item):
  Return BF.contains(item) + MI-CBF.estimate(item)
Add(item):
  W++
  If (W == WindowSize) Reset()
  If (BF.contains(item)) Return MI-CBF.add(item)
  BF.add(item)
Reset():
  Divide W by 2, erase the Bloom filter, and divide all counters by 2 (in the MI-CBF).
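
A Python rendering of this pseudocode, wiring together the BloomFilter (as the 1-bit doorkeeper) and MinimalIncrementCBF sketches above; parameter names are illustrative, and for brevity this sketch omits capping counters at W/C:

    class TinyLFU:
        def __init__(self, window_size, m, k):
            self.doorkeeper = BloomFilter(m, k)      # 1-bit first level
            self.main = MinimalIncrementCBF(m, k)    # small counters
            self.w = 0
            self.window_size = window_size

        def estimate(self, item):
            return int(self.doorkeeper.contains(item)) + self.main.estimate(item)

        def add(self, item):
            self.w += 1
            if self.w == self.window_size:
                self.reset()
            if self.doorkeeper.contains(item):
                self.main.increment(item)
            else:
                self.doorkeeper.add(item)

        def reset(self):
            self.w //= 2
            self.doorkeeper.bits = [0] * self.doorkeeper.m      # erase the Bloom filter
            self.main.bits = [c // 2 for c in self.main.bits]   # halve all counters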

TinyLFU example: The eviction policy picks a victim from the cache; TinyLFU then picks the winner between the victim and the new item. TinyLFU algorithm: estimate both the new item and the victim, and declare as winner the one with the higher estimate.
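
Plugged into the eviction policy, the admission decision is one comparison (using the sketch above):

    def admit(sketch, new_item, victim):
        # The winner is the item with the higher frequency estimate.
        return sketch.estimate(new_item) > sketch.estimate(victim)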

Example: the doorkeeper Bloom filter provides many 1-bit counters, and the MI-CBF provides a few small 3-bit counters. Numeric example for a 1,000-item cache: statistics over 9,000 items — ~7,200 items in 1-bit counters and ~500 items in 3-bit counters; on average 1.22 bits per counter, about 1 byte per statistics item, i.e., 9 bytes of metadata per cache line.

Simulation Results: Traces. Wikipedia trace (Baaren & Pierre 2009): "10% of all user requests issued to Wikipedia during the period from September 19th 2007 to October 31st." YouTube trace (Cheng et al., QoS 2008): weekly measurements of ~160K newly created videos over a period of 21 weeks; we directly created a synthetic distribution for each week.

Simulation Results: Zipf(0.9). [Figure: hit rate vs. cache size]

Simulation Results: Zipf(0.7). [Figure: hit rate vs. cache size]

Simulation Results: Wikipedia. [Figure: hit rate vs. cache size]

Simulation Results: YouTube. [Figure: hit rate vs. cache size]

Comparison with (Accurate) WLFU: comparable performance, but ~95% less metadata. [Figure: hit rate vs. cache size]

Additional Work: A complete analysis of the accuracy of the minimal increment method; speculative routing and cache sharing for key/value stores; a smaller, better, faster TinyLFU (with a new sketching technique); applications in additional settings.

Thank you for your time!