
Hjortur Bjornsson (University of Iceland), Ymir Vigfusson (Emory University / Reykjavik University), Trausti Saemundsson (Reykjavik University), Gregory Chockler (Royal Holloway, University of London)

[Figure: clients send queries to the memcache servers and receive results; the memcache servers in turn query the database tier.]

[Figure: clients, memcache servers, database tier] Too many cache servers is a waste of resources.

[Figure, dramatization: clients, memcache servers, database tier, with queries spilling through to the database] Too few cache servers overload the database.

How do we optimize cache resources?

• The key parameter is the cache size. [Figure: hit rate curve, with the hit rate for the current allocation marked.]

How do we optimize cache resources?
• Efficiency: time overhead, space overhead
• Accuracy: high fidelity to true hit rate curves, provable guarantees
• Usability: simple interface, modularity

MIMIR architecture: [Figure: the cache server serves Get/Set(e) requests with its replacement algorithm; the replacement algorithm notifies the HRC estimator via Hit(e), Miss(e), Set(e), and Evict(e); the estimator maintains a ghost list/ghost filter governed by an aging policy and exposes Export-HRC().]
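A minimal sketch of how such an estimator interface could look, assuming a Python-style wrapper around the replacement algorithm's callbacks; the class and method names below are illustrative, not taken from the MIMIR code:

```python
# Illustrative sketch of the profiler interface described on this slide.
# Names (HRCEstimator, num_buckets, pdf) are assumptions, not MIMIR's code.

class HRCEstimator:
    """Receives callbacks from the replacement algorithm and exports an HRC."""

    def __init__(self, num_buckets=32):
        self.num_buckets = num_buckets
        self.pdf = {}        # stack distance -> estimated number of hits
        self.misses = 0

    def hit(self, e): ...      # update the PDF for e's bucket, move e to the front
    def miss(self, e): ...     # count an access that fell outside the ghost list
    def set(self, e): ...      # start tracking a newly inserted item
    def evict(self, e): ...    # stop tracking an item evicted from the cache
    def export_hrc(self): ...  # fold the PDF into a cumulative hit rate curve
```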

• Inclusion property: the contents of a smaller cache are contained in a bigger cache given the same input. Holds for LRU, LFU, OPT, ...
• Idea: produce LRU hit rate curves by tracking each hit item's distance from the head of the LRU list, i.e. its stack distance.

• On every hit, determine the stack distance (Mattson et al., 1970).
• Accumulate the counts in a hit rate curve PDF (# of hits per stack distance).
[Figure: LRU list g, h, u, f, e, r, t, b; each hit item is counted at its current depth and moved to the front.]
Walking a linked list on every access is inefficient.
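For reference, a small sketch of the linked-list baseline the slide is criticizing (Mattson-style exact profiling); the function name and trace below are made up for illustration:

```python
# Sketch of exact stack-distance profiling over an LRU list, O(N) per access.
from collections import Counter

def profile_stack_distances(trace):
    lru = []                  # front of the list = most recently used
    pdf = Counter()           # stack distance -> number of hits
    misses = 0
    for key in trace:
        if key in lru:
            depth = lru.index(key) + 1   # walk the list to find the depth
            pdf[depth] += 1
            lru.remove(key)
        else:
            misses += 1
        lru.insert(0, key)               # move (or insert) the key at the front
    return pdf, misses

# Example: after g,h,u,f,e,r,t,b are inserted, hits on e and t are counted
# at their current depths and both items move to the front.
pdf, misses = profile_stack_distances(list("ghufertb") + ["e", "t"])
```

The hit rate curve for an LRU cache of size C is then the fraction of accesses whose stack distance is at most C.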

• Could instead compute stack distances with self-balancing binary search trees in O(log N) time: AVL trees, red-black trees, ... (Bennett & Kruskal 1975; Almási et al. 2002)
• Trees accentuate lock contention, hurting performance.

• Prior focus on exact results
• Prior focus on very high stack distances
Can we trade off accuracy for efficiency?

• Extend the LRU list with dataless "ghost" entries, growing the tracked list from N cached items to 2N tracked items.

• Partition the LRU list into B buckets; each item tracks which bucket it is in.
[Figure: the LRU list before a hit on item e, split into buckets (g,h,u), (e,r,t), (f), (b,c,d,a), each labeled with its bucket number.]
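A small sketch of this bucketed representation, using plain Python lists and the example items from the figure; the helper name distance_range is illustrative:

```python
# The ghost list is kept as B buckets ordered newest-to-oldest, and each item
# only remembers which bucket it is in (a coarse stack distance).
buckets = [["g", "h", "u"], ["e", "r", "t"], ["f"], ["b", "c", "d", "a"]]
bucket_of = {item: i for i, bucket in enumerate(buckets) for item in bucket}

def distance_range(item):
    """The item's stack distance is only known up to its bucket: it lies between
    the total size of all newer buckets plus one, and that plus its bucket size."""
    i = bucket_of[item]
    newer = sum(len(b) for b in buckets[:i])
    return newer + 1, newer + len(buckets[i])

print(distance_range("e"))   # (4, 6): e sits somewhere at depth 4, 5, or 6
```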

• On a hit on e, update the statistics for e's bucket: one unit of area is distributed uniformly over the stack distances that bucket may cover and added to the hit rate curve PDF (estimated # of hits per stack distance).

• Move e to the front of the list and tag it with the first bucket: (g,h,u), (e,r,t), (f), (b,c,d,a) becomes (e,g,h,u), (r,t), (f), (b,c,d,a).
• The first bucket will eventually overflow.
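Continuing the bucket sketch above, a hedged version of the hit path: spread one unit of hit mass uniformly over the bucket's possible stack distances, then retag the item with the front bucket (the real implementation works on linked-list nodes rather than Python list.remove):

```python
def record_hit(item, buckets, bucket_of, pdf):
    """pdf maps stack distance -> estimated number of hits (illustrative)."""
    i = bucket_of[item]
    newer = sum(len(b) for b in buckets[:i])
    width = len(buckets[i])
    # Unit area distributed uniformly over the distances this bucket may cover.
    for d in range(newer + 1, newer + width + 1):
        pdf[d] = pdf.get(d, 0.0) + 1.0 / width
    # Move the item to the front (newest) bucket and retag it.
    buckets[i].remove(item)
    buckets[0].insert(0, item)
    bucket_of[item] = 0
```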

• When the first bucket is full, perform aging:
• Stacker: walk the list and bump items below the average reuse distance into an older bucket to the right; O(B) amortized.
• Rounder: decrement the bucket identifiers, shifting the frame of reference, and coalesce the two oldest buckets; O(1).
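A rough sketch of the Rounder idea, where items keep a fixed bucket id and aging simply advances the front id so the frame of reference shifts without touching any items; the class and field names are illustrative assumptions:

```python
class Rounder:
    def __init__(self, B=8):
        self.B = B             # number of logical buckets
        self.front_id = 0      # id assigned to items hit or inserted now

    def age(self):
        # Open a new front bucket; every other bucket becomes one step older.
        # No item is visited, so this step is O(1).
        self.front_id += 1

    def effective_bucket(self, item_bucket_id):
        # Ids that fell out of the window behave as the oldest bucket,
        # which is how the two oldest buckets get coalesced.
        return max(item_bucket_id, self.front_id - self.B + 1)
```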

• Periodically calculate and export the hit rate curve: exponentially average the PDF (estimated # of hits per stack distance) and accumulate it into a CDF (cumulative hits vs. stack distance).
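A possible shape for this export step, assuming a smoothing factor alpha for the exponential average; the names and normalization are illustrative:

```python
def export_hrc(current_pdf, averaged_pdf, total_accesses, alpha=0.2):
    """Exponentially average the latest PDF sample, then integrate it into a CDF
    giving the estimated hit rate as a function of cache size."""
    for d in set(current_pdf) | set(averaged_pdf):
        averaged_pdf[d] = (alpha * current_pdf.get(d, 0.0)
                           + (1 - alpha) * averaged_pdf.get(d, 0.0))
    hrc, running = {}, 0.0
    for d in sorted(averaged_pdf):
        running += averaged_pdf[d]
        hrc[d] = running / max(total_accesses, 1)
    return hrc
```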

• Maintain a ghost list of length N.
• On a hit on item e:
  • Update the PDF statistics for e's bucket: O(log B)
  • Move e to the front of the list and tag it with the first bucket: O(1)
• When the front bucket is full, perform aging: O(B) (Stacker) or O(1) (Rounder)
• Periodically calculate and export the HRC

[Figure: hit rate curve accuracy of the ROUNDER and STACKER aging policies; the 90% accuracy level is marked.]

[Figure: error analysis on the hit rate curve PDF; OPT marks the true stack distance of a hit, while ALG spreads a unit area over the possible locations within the bucket.]

• Memcached + YCSB (x 10). Each node has 6 Intel Xeon 2.4 GHz CPUs; 48 GB DRAM; 40 Gbps QDR InfiniBand interconnect; shared storage.

  #Buckets   MAE
     8       1.2%
    16       0.6%
    32       0.4%
    64       0.3%

• Memcached + YCSB

  #Buckets   MAE
     8       1.2%
    16       0.6%
    32       0.4%
    64       0.3%

• 2-5% throughput and latency degradation

Paper                  | Area           | Key idea                                 | Online | Precision | Parallel
Mattson et al. 1970    | Storage        | Stack distance, LRU linear search        | No     | Exact     | No
Kim et al. OSDI '00    | V-Memory       |                                          | No     | Approx.   | No
Almási et al. MSP '02  | Storage?       | LRU AVL-tree                             | No     | Exact     | No
Ding & Zhong PLDI '03  | Compilers      | Compressed trees                         | No     | Approx.   | No
Zhou et al. ASPLOS '04 | V-Memory       |                                          | No     |           |
Geiger ASPLOS '06      | V-Memory       | Infer page faults from I/O, ghost lists  | No     | Approx.   | No
RapidMRC ASPLOS '09    | V-Memory       | Fixed LRU buckets                        | No     | Approx.   | No
Hwang & Wood ICAC '13  | Network caches | Hash rebalancing                         | Yes    | N/A       |
Wires et al. OSDI '14  | Storage        | LRU counter stacks                       | No*    | Approx.   | No
MIMIR SoCC '14         | Network caches | Variable LRU buckets                     | Yes    | Approx.   | Yes

Optimizing cache resources by profiling hit rates online:
• Efficiency: 2-5% performance degradation
• Accuracy: % on traces
• Usability: MIMIR is modular; simple algorithms

Optimizing cache resources by profiling hit rates online:
• Efficiency: time O(log B), space O(1); 0-2% throughput degradation
• Scalability: online profiling; composable estimate
• Accuracy: % on traces; error = O(1/B)

Open questions:
• Can we do cost-benefit analysis of other cloud services?
• What should be evicted from a distributed cache?
• How should variable miss penalties be treated?
• Where should we place data in a cache hierarchy?
• Can we relieve cache hot spots?

• Divide cache resources between services.
• Progressively allocate space to the service with the highest marginal hit rate.
• Optimal for concave HRCs. [IBM J. of R&D '11, LADIS '11]
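A sketch of that greedy allocation under the stated concavity assumption; the curve representation (a list of hit rates indexed by allocated units) and the names are illustrative:

```python
import heapq

def allocate_cache(hrcs, total_units):
    """hrcs maps service -> hit rate curve, where hrcs[s][k] is the hit rate
    of service s when given k units of cache space."""
    alloc = {s: 0 for s in hrcs}

    def marginal_gain(s):
        size, curve = alloc[s], hrcs[s]
        if size + 1 >= len(curve):
            return 0.0
        return curve[size + 1] - curve[size]

    heap = [(-marginal_gain(s), s) for s in hrcs]
    heapq.heapify(heap)
    for _ in range(total_units):
        neg_gain, s = heapq.heappop(heap)
        if neg_gain == 0:          # no service benefits from more space
            break
        alloc[s] += 1              # give one unit to the best marginal service
        heapq.heappush(heap, (-marginal_gain(s), s))
    return alloc

# Toy example with two concave hit rate curves:
print(allocate_cache({"A": [0, .5, .7, .8], "B": [0, .3, .5, .6]}, 3))
```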