Investigating Distributed Caching Mechanisms for Hadoop
Gurmeet Singh, Puneet Chandra, Rashid Tahir
Goal: Explore the feasibility of a distributed caching mechanism inside Hadoop
Presentation Overview
- Motivation
- Design
- Experimental Results
- Future Work
Motivation
- Disk access times are a bottleneck in cluster computing
- Large amounts of data are read from disk
- Related work: DARE, RAMClouds, PACMan (coordinated cache replacement)
- We want to strike a balance between RAM and disk storage
Our Approach
- Integrate Memcached with Hadoop (using Quickcached and Spymemcached)
- Reserve a portion of the main memory at each node to serve as a local cache
- The local caches aggregate into a distributed caching layer governed by Memcached
- Greedy caching strategy
- Least Recently Used (LRU) cache eviction policy
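The greedy strategy on a reserved slice of node memory can be sketched roughly as follows. All names here (`BlockCache`, `readFromDisk`) are illustrative stand-ins, not Hadoop or Memcached APIs, and eviction is handled separately (the deck uses LRU); this sketch simply stops inserting when the reserved capacity is full.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-node cache layer: a fixed-size slice of RAM holds
// recently read blocks, and every block read is inserted greedily
// ("cache everything you read").
class BlockCache {
    private final int capacityBlocks;
    private final Map<String, byte[]> cache = new HashMap<>();

    BlockCache(int capacityBlocks) {
        this.capacityBlocks = capacityBlocks;
    }

    // Greedy strategy: on a miss, read the block and cache it while
    // capacity remains (LRU eviction omitted for brevity).
    byte[] read(String blockId) {
        byte[] data = cache.get(blockId);
        if (data != null) {
            return data;                  // cache hit: served from RAM
        }
        data = readFromDisk(blockId);     // cache miss: go to local disk
        if (cache.size() < capacityBlocks) {
            cache.put(blockId, data);     // greedy insert
        }
        return data;
    }

    boolean isCached(String blockId) {
        return cache.containsKey(blockId);
    }

    // Stand-in for an actual HDFS block read.
    private byte[] readFromDisk(String blockId) {
        return ("disk:" + blockId).getBytes();
    }
}
```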
Design Overview
Memcached
Design Choice 1: Send simultaneous requests to the Namenode and Memcached. This minimizes access latency at the cost of additional network overhead.
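The "race both lookups, take the first answer" pattern of this design choice can be sketched with `CompletableFuture`. The `queryNamenode` and `queryMemcached` methods below are simulated stand-ins with artificial latencies, not real Hadoop or Memcached calls.

```java
import java.util.concurrent.CompletableFuture;

// Sketch of Design Choice 1: issue block-location lookups to the
// Namenode and to Memcached concurrently and use whichever response
// arrives first.
class ParallelLookup {
    static String locate(String blockId) {
        CompletableFuture<String> namenode =
            CompletableFuture.supplyAsync(() -> queryNamenode(blockId));
        CompletableFuture<String> memcached =
            CompletableFuture.supplyAsync(() -> queryMemcached(blockId));
        // First response wins; the duplicated request is exactly the
        // extra network overhead this design accepts for lower latency.
        return (String) CompletableFuture.anyOf(namenode, memcached).join();
    }

    static String queryNamenode(String blockId) {
        sleep(50);                        // RPC to the Namenode is slower
        return "namenode:" + blockId;
    }

    static String queryMemcached(String blockId) {
        sleep(5);                         // cache lookup is fast
        return "memcached:" + blockId;
    }

    private static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { }
    }
}
```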
Design Choice 2: Send a request to the Namenode only in the case of a cache miss. This minimizes network overhead at the cost of increased latency.
Design Choice 3: Datanodes send requests only to Memcached, which checks for cached blocks. If a cache miss occurs, Memcached contacts the Namenode and returns the replica addresses to the Datanodes.
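In this third design, the cache layer itself resolves misses, so Datanodes have a single entry point. A minimal sketch, assuming hypothetical names (`LookupService`, `resolveViaNamenode`) rather than actual Hadoop or Memcached APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of Design Choice 3: Datanodes talk only to the cache layer,
// which resolves misses by asking the Namenode and remembering the
// answer for subsequent lookups.
class LookupService {
    private final Map<String, String> cachedLocations = new HashMap<>();
    int namenodeRequests = 0;             // counts the miss traffic

    // Single entry point used by Datanodes.
    String locate(String blockId) {
        String loc = cachedLocations.get(blockId);
        if (loc == null) {                // cache miss: fall back to Namenode
            loc = resolveViaNamenode(blockId);
            cachedLocations.put(blockId, loc);
        }
        return loc;                       // replica address returned to caller
    }

    private String resolveViaNamenode(String blockId) {
        namenodeRequests++;
        return "replica-for-" + blockId;  // stand-in for the real RPC
    }
}
```

Repeated lookups for the same block then touch the Namenode only once, which is the point of routing everything through the cache layer.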
Global Cache Replacement: an LRU-based global cache eviction scheme
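The LRU policy itself is standard; a minimal sketch using `java.util.LinkedHashMap` in access order (capacity and key/value types are illustrative, not the deck's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: LinkedHashMap with accessOrder = true keeps
// entries ordered by most recent access, and removeEldestEntry
// evicts the least recently used entry once capacity is exceeded.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);           // accessOrder = true -> LRU order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;         // evict when over capacity
    }
}
```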
Prefetching
Simulation Results: test data ranging from 2 GB to 24 GB, using the Word Count and Grep workloads
Word Count
Grep
Future Work
- Implement a prefetching mechanism
- Customized caching policies based on access patterns
- Compare and contrast caching with locality-aware scheduling
Conclusion: Caching can improve the performance of cluster-based systems, depending on the access patterns of the workload being executed.