C-Hint: An Effective and Reliable Cache Management for RDMA-Accelerated Key-Value Stores Yandong Wang, Xiaoqiao Meng, Li Zhang, Jian Tan Presented by: Guangxiang Du


C-Hint: An Effective and Reliable Cache Management for RDMA-Accelerated Key-Value Stores Yandong Wang, Xiaoqiao Meng, Li Zhang, Jian Tan Presented by: Guangxiang Du

What's RDMA and why RDMA?
Definition: RDMA (Remote Direct Memory Access) is the ability to access memory on a remote machine without interrupting the CPU on that machine.
Characteristics:
- Kernel bypass: applications transfer data directly from user space, without data copies or processing in the network software layer (TCP/IP) of the OS kernel.
- No remote CPU involvement: applications access remote memory without consuming any CPU cycles on the remote machine.

Cont'd
Benefits:
- Low latency
- High throughput
- Low CPU footprint on the remote side

System Background
- Clients issue operations such as GET, SET, and REMOVE.
- SET and REMOVE use two-sided verb-based Send/Recv.
- The first access to an item uses two-sided Send/Recv, which also returns the item's remote address; later GET operations use one-sided RDMA reads.
[Figure labels: first-time read; later read; two-sided verb-based SET]
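The two GET paths described above can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: the `Server` and `Client` classes, the dict standing in for registered memory, and the integer "addresses" are all hypothetical and do not come from HydraDB or C-Hint.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy server: a dict stands in for RDMA-registered memory (assumption)."""
    memory: dict = field(default_factory=dict)  # address -> value
    index: dict = field(default_factory=dict)   # key -> address
    next_addr: int = 0

    def handle_set(self, key, value):
        # Two-sided Send/Recv path: the server CPU runs this handler.
        addr = self.index.get(key)
        if addr is None:
            addr = self.next_addr
            self.next_addr += 1
            self.index[key] = addr
        self.memory[addr] = value

    def handle_get(self, key):
        # Two-sided path for a first-time read: return the value and its address.
        addr = self.index[key]
        return self.memory[addr], addr


@dataclass
class Client:
    server: Server
    pointers: dict = field(default_factory=dict)  # key -> cached remote address
    rdma_reads: int = 0
    verb_reads: int = 0

    def get(self, key):
        addr = self.pointers.get(key)
        if addr is not None:
            # One-sided RDMA read: touches server memory, not the server handlers.
            self.rdma_reads += 1
            return self.server.memory[addr]
        # First access: two-sided request, then cache the remote pointer.
        value, addr = self.server.handle_get(key)
        self.pointers[key] = addr
        self.verb_reads += 1
        return value
```

In this sketch the second `get` for a key bypasses `handle_get` entirely, which is the property that later makes popularity tracking hard for the server.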

- Can a one-sided RDMA read race with a server-side write? Yes: a client may read an item while the server is still updating it.
- Several solutions have been proposed, such as checksums or cache-line versioning.
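A minimal sketch of the checksum approach, assuming the server stores a CRC32 in front of each value and the client verifies it after every one-sided read. The layout (4-byte big-endian CRC followed by the value) and the helper names are assumptions for illustration, not the format used by any particular system.

```python
import zlib


def pack(value: bytes) -> bytes:
    """Store a CRC32 of the value in front of it (4 bytes, big-endian)."""
    return zlib.crc32(value).to_bytes(4, "big") + value


def unpack(buf: bytes):
    """Return the value if the checksum matches, otherwise None (torn read)."""
    stored, value = int.from_bytes(buf[:4], "big"), buf[4:]
    return value if zlib.crc32(value) == stored else None


def read_with_retry(read_once, max_tries=3):
    """Repeat the one-sided read until a consistent copy is observed."""
    for _ in range(max_tries):
        value = unpack(read_once())
        if value is not None:
            return value
    return None  # a caller would fall back to a two-sided GET here
```

A read that observes the new checksum together with part of the old value fails verification, so the client retries instead of returning corrupted data.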

Challenges for Cache Management Brought by RDMA
- The server is unaware of RDMA reads, so it is difficult to track the popularity of cached items. (An inefficient cache replacement scheme leads to severe performance degradation.)
- When the server evicts an item, it must invalidate the remote pointers cached on the client side. (Broadcasting on every eviction adds significant overhead.)

Overview of C-Hint
- Goals:
  1) Deliver a sustainably high hit rate with limited cache capacity.
  2) Do not compromise the benefit of RDMA reads.
  3) Provide a reliable resource reclamation scheme.
- Design decisions:
  1) Client-Assisted Cache Management
  2) Managing Key-Value Pairs via Leases
  3) Popularity Differentiated Lease Assignment
  4) Classifying Hot-Cold Items for Access History Report
  5) Delayed Memory Reclamation

Client-Assisted Cache Management
- As noted earlier, the server alone cannot track accesses to cached items.
- Therefore, clients periodically propagate a partial access history of key-value pairs to the server.
- The server aggregates this information to establish a global view of the access pattern.
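The aggregation step can be sketched as follows, assuming each client report is a mapping from key to the number of RDMA reads performed since the previous report. The `AccessAggregator` class and the report format are hypothetical; the slides do not describe the actual data structures.

```python
from collections import Counter


class AccessAggregator:
    """Server-side view of accesses the server could not observe itself."""

    def __init__(self):
        self.global_counts = Counter()

    def merge_report(self, report):
        # report: key -> number of RDMA reads since the client's last report
        self.global_counts.update(report)

    def coldest(self, n):
        """Eviction candidates: the n least-accessed keys reported so far."""
        ranked = sorted(self.global_counts.items(), key=lambda kv: kv[1])
        return [key for key, _ in ranked[:n]]
```

Keys that no client has reported are absent from `global_counts` in this sketch; a real server would have to treat them as cold as well.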

Managing Key-Value Pairs via Leases
- Each remote pointer carries a time range (lease) within which the client may perform GETs with RDMA reads. (The server guarantees that the item remains available and uncorrupted until the lease expires.)
- A lease is not extended unless the client sends a verb-based renewal message asking the server for an extension.
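A minimal sketch of how a client might use the lease carried in a remote pointer. The `RemotePointer` fields, the `choose_path` and `renew` helpers, and the use of plain floats for time are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RemotePointer:
    address: int
    lease_start: float
    lease_end: float

    def valid_at(self, now: float) -> bool:
        # The server guarantees the item only inside this time range.
        return self.lease_start <= now < self.lease_end


def choose_path(ptr, now):
    """Return which GET path a client would take at time `now`."""
    if ptr is not None and ptr.valid_at(now):
        return "rdma_read"
    return "verb_get"  # two-sided request; a lease can be renewed on this path


def renew(ptr, now, term):
    """Server-side renewal on an explicit client request (hypothetical helper)."""
    return RemotePointer(ptr.address, now, now + term)
```

Once the lease has expired the client must not issue an RDMA read, because the server may already have reclaimed or reused the memory.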

Popularity Differentiated Lease Assignment
- How long should a lease term be?
  - Too short: remote pointers cached by clients are invalidated frequently, so RDMA is underutilized.
  - Too long: unpopular items occupy much of the server cache.
- Should the lease term be statically determined or adjusted dynamically?
- C-Hint uses an approximation of the Multi-Queue algorithm to determine lease terms adaptively for different key-value pairs, according to their recency and popularity.

Cont’d
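One way to picture popularity-differentiated lease terms is sketched below. The Multi-Queue replacement algorithm places an item in queue floor(log2(access count)) and demotes items that stay idle; the mapping from queue level to lease term used here (the term doubles per level) is purely an assumption, as are the function names and default parameters. The slides do not give C-Hint's actual formula.

```python
def queue_level(access_count: int, num_queues: int = 8) -> int:
    """Multi-Queue style level: floor(log2(count)), capped at the top queue."""
    if access_count < 1:
        return 0
    return min(access_count.bit_length() - 1, num_queues - 1)


def demoted_level(level: int, idle_time: float, life_time: float) -> int:
    """Recency: drop one level for each full life_time the item sat idle."""
    return max(0, level - int(idle_time // life_time))


def lease_term(access_count, idle_time=0.0, base_term=1.0,
               life_time=10.0, num_queues=8):
    """Assumed mapping: the lease term doubles with each queue level."""
    level = queue_level(access_count, num_queues)
    level = demoted_level(level, idle_time, life_time)
    return base_term * (2 ** level)
```

With these defaults a frequently and recently read item receives a long lease, while an item that has been idle for several lifetimes falls back toward the base term.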

Classifying Hot-Cold Items for Access History Report
- The server must aggregate history reports from many clients; this heavy load harms the QoS of latency-sensitive operations.
- Clients use the Adaptive Replacement algorithm (2 LRU queues) to report the history of hot items only, which keeps reports small.
- Further optimization: clients use RDMA writes to place reports in dedicated memory locations.
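The client-side filtering can be sketched with two LRU queues: one for keys seen once and one for keys seen again, with only the second queue reported. This is a deliberate simplification and not the full Adaptive Replacement algorithm (it has no ghost lists and no adaptive sizing); the `HotColdTracker` class and its capacity parameter are hypothetical.

```python
from collections import OrderedDict


class HotColdTracker:
    """Simplified two-queue classifier; only re-accessed keys count as hot."""

    def __init__(self, capacity):
        self.capacity = capacity       # max entries per queue
        self.recent = OrderedDict()    # keys seen once: key -> count
        self.frequent = OrderedDict()  # keys seen again: key -> count

    def record(self, key):
        if key in self.frequent:
            self.frequent[key] += 1
            self.frequent.move_to_end(key)
        elif key in self.recent:
            # Second access: promote from the cold queue to the hot queue.
            self.frequent[key] = self.recent.pop(key) + 1
            if len(self.frequent) > self.capacity:
                self.frequent.popitem(last=False)
        else:
            self.recent[key] = 1
            if len(self.recent) > self.capacity:
                self.recent.popitem(last=False)

    def build_report(self):
        """Return cumulative access counts for hot keys only."""
        return dict(self.frequent)
```

Keys that were read only once never appear in the report, which is what keeps the report small in this sketch.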

Delayed Memory Reclamation
- After a REMOVE, C-Hint marks the item as deleted but reclaims its memory only after the lease expires.
- This avoids an expensive broadcast to invalidate the remote pointers cached by clients.
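Delayed reclamation can be sketched as a store that keeps removed items in a pending set until their leases run out. The `LeasedStore` class, its method names, and the use of plain floats for time are assumptions for illustration only.

```python
class LeasedStore:
    """Toy store: removed items keep their memory until the lease ends."""

    def __init__(self):
        self.items = {}    # key -> (value, lease_end)
        self.deleted = {}  # key -> (value, lease_end), awaiting reclamation

    def set(self, key, value, lease_end):
        self.items[key] = (value, lease_end)

    def remove(self, key):
        # Mark as deleted; the memory stays intact so that clients holding
        # an unexpired lease can still finish their RDMA reads safely.
        self.deleted[key] = self.items.pop(key)

    def reclaim(self, now):
        """Free every deleted item whose lease has expired; return their keys."""
        expired = [k for k, (_, end) in self.deleted.items() if end <= now]
        for k in expired:
            del self.deleted[k]
        return expired
```

Because every client-held pointer expires on its own, the server never needs to contact clients when it frees memory.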

Performance Evaluation
- Hit ratio
- Comparison between:
  - Original HydraDB (random replacement scheme)
  - Faithful LRU (RDMA disabled, as baseline)
  - Approxi-LRU (C-Hint design, but with a single LRU queue instead of Multi-Queue)
  - Faithful Multi-Queue (RDMA disabled, as baseline)
  - C-Hint design

Experimental Environment
- 5 x86_64 machines connected through InfiniBand
- 4 server instances on 2 different machines; clients on the rest
- Benchmark: Yahoo! Cloud Serving Benchmark (YCSB 0.14)
- 3 generated workloads: (100% GET + 0% SET), (95% GET + 5% SET), (90% GET + 10% SET)
- Each workload: 300 million operations, 80 million key-value pairs (23 B key + 1 KB value)
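For a sense of scale, the raw size of the data set follows from the numbers above. This is a back-of-the-envelope estimate that assumes 1 KB = 1024 B and ignores per-item metadata; the slides do not state the server cache size.

```latex
\begin{align*}
\text{pair size} &= 23\,\text{B} + 1024\,\text{B} = 1047\,\text{B} \\
\text{data set size} &= 8 \times 10^{7} \times 1047\,\text{B}
  = 8.376 \times 10^{10}\,\text{B} \approx 83.8\,\text{GB} \;(\approx 78\,\text{GiB})
\end{align*}
```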

Hit Ratio Analysis

Latency Analysis
[Figure labels: RDMA read; fast verb-based read; slow verb-based read; fast verb-based write; slow verb-based write]
- The C-Hint curve has more requests with latencies below 20 µs, which correspond to RDMA reads.
- This means that PDLA (Popularity Differentiated Lease Assignment) achieves better RDMA utilization than a statically determined lease term.

Impact of Access History Report

Cons/Criticism/Thoughts
- 1) Vulnerable to malicious clients in the wild (deferred to future research).
- 2) The scale of the experiments may be insufficient: only 5 machines and 4 server instances.
- 3) Unsynchronized clocks between clients and server can lead to inconsistent views of a lease term.

Summary
- C-Hint addresses 3 challenges:
  1) How to track the popularity of cached items even though the server is unaware of RDMA reads
  2) How to reclaim server memory effectively and reliably
  3) How to improve the hit ratio without compromising the performance benefit of RDMA