BD-Cache: Big Data Caching for Datacenters

Presentation transcript:

BD-Cache: Big Data Caching for Datacenters
Boston University*  Northeastern University†

Problem
In a multi-tenant datacenter, the network (a) between the compute clusters and the shared storage and (b) among the racks of a compute cluster can become a bottleneck. Big data frameworks such as Hadoop and Spark are common residents of these datacenters, and most of their jobs are I/O bound, so they can be impacted by these network bottlenecks. Previous studies show that big data frameworks exhibit high input-data reuse, uneven data popularity, and sequential data access.

Our Architecture
[Architecture diagram: compute-cluster racks 1..N, each with a rack-local cache node (L1 CACHE); a shared L2 CACHE layer sits between the compute cluster and the STORAGE CLUSTER.]
Cache nodes are placed per rack, and each holds high-performance Intel NVMe SSDs.
L1 cache: rack-local, so it reduces inter-rack traffic among the cluster racks.
L2 cache: distributed and shared among racks, so it reduces traffic between the compute clusters and the back-end storage.
An anycast network solution allows nodes to access the nearest cache node; if a cache node fails, nodes are transparently redirected to a redundant cache node.

Implementation
The two-level caching mechanism is implemented by modifying the original Ceph RADOS Gateway (RGW). The L1 and L2 caches are logically separate but share the same physical cache infrastructure. BD-Cache supports read and write traffic but caches only on read operations, stores data on SSDs running ext4, reads and writes data asynchronously for better performance, understands the Swift and S3 protocols, and uses random replacement. (A minimal sketch of this lookup path appears after this transcript.)

Methodology
Experimental configurations: Unmodified-RGW and Cache-RGW.
Ceph cluster: 10 Lenovo storage nodes, each with 9 HDDs and 128 GB of DRAM.
Cache node: 2x 1.5 TB Intel SSDs and 128 GB of DRAM.
Requests: 4 GB files requested in parallel with the curl command (a driver sketch appears after this transcript).

Initial Results
[Figures: CACHE MISS PERFORMANCE (caching, prefetching panels) and CACHE HIT PERFORMANCE for Unmodified-RGW vs. Cache-RGW.]
Cache miss: Cache-RGW imposes no overhead.
Cache hit: caching improves the read performance significantly; Cache-RGW saturates the SSD.

Future Work
Evaluate the caching architecture by benchmarking real-world workloads.
Prefetching.
Cache replacement algorithms.
Enable caching on write operations.

Project webpage: http://info.massopencloud.org/blog/bigdata-research-at-moc
GitHub repo for the Cache-RGW code: https://github.com/maniaabdi/engage1
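
The poster itself contains no code; the following is a minimal, hypothetical Python sketch of the lookup path it describes, using simplified in-memory stand-ins for the L1 (rack-local) and L2 (shared) tiers: fill the caches on read misses, pass writes straight through uncached, and evict a random victim when a tier is full. Names such as `TwoLevelCache` and `backend_get` are illustrative and are not taken from the Cache-RGW code.

```python
import random

class CacheTier:
    """A simplified cache tier with random replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}  # object name -> data

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        if key in self.store:
            self.store[key] = value
            return
        if len(self.store) >= self.capacity:
            # Random replacement: evict an arbitrary resident object.
            victim = random.choice(list(self.store))
            del self.store[victim]
        self.store[key] = value

class TwoLevelCache:
    """L1 is rack-local; L2 is shared among racks; both front the back-end storage."""
    def __init__(self, l1_capacity, l2_capacity, backend_get, backend_put):
        self.l1 = CacheTier(l1_capacity)
        self.l2 = CacheTier(l2_capacity)
        self.backend_get = backend_get   # fetch from the back-end storage cluster
        self.backend_put = backend_put   # write to the back-end storage cluster

    def read(self, key):
        data = self.l1.get(key)
        if data is not None:
            return data                   # L1 hit: the request stays within the rack
        data = self.l2.get(key)
        if data is None:
            data = self.backend_get(key)  # miss in both tiers: go to back-end storage
            self.l2.put(key, data)        # cache on read only
        self.l1.put(key, data)
        return data

    def write(self, key, data):
        # Writes are supported but pass through; nothing is cached on write.
        self.backend_put(key, data)
```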
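Because BD-Cache speaks the S3 (and Swift) protocol, an existing S3 client should be able to read through the cache simply by pointing its endpoint at a cache node instead of the back-end RGW. A hedged example using boto3; the endpoint URL, bucket, object name, and credentials are placeholders, not values from the project:

```python
import boto3

# Point a standard S3 client at the rack-local cache node instead of the
# back-end RADOS Gateway; endpoint and credentials are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="http://cache-node.rack1.example:8080",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Reads go through the cache (filled on miss); writes pass straight through.
s3.download_file("mybucket", "input/part-00000", "/tmp/part-00000")
```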
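The methodology issues parallel curl requests for 4 GB files. A minimal sketch of such a request driver, assuming hypothetical object URLs served by the gateway:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical object URLs; in the experiments each object is a 4 GB file.
urls = [f"http://cache-node.rack1.example:8080/mybucket/file{i}" for i in range(8)]

def fetch(url):
    # Discard the body; only the transfer throughput is of interest.
    subprocess.run(["curl", "-s", "-o", "/dev/null", url], check=True)

# Request all files in parallel, one thread per in-flight curl process.
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    list(pool.map(fetch, urls))
```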