Tradeoffs in Scalable Data Routing for Deduplication Clusters (FAST '11) Wei Dong (Princeton University); Fred Douglis, Kai Li, Hugo Patterson, Sazzala Reddy, Philip Shilane (EMC) Presented by HoSeok Seo, System Software Lab., Kwangwoon Univ. 1
Introduction This paper proposes a deduplication cluster storage system with a master node. Cluster storage is a well-known technique for increasing capacity, but it has two problems: -less deduplication than a single-node system -performance that does not scale linearly 2
Introduction Goals Scalable throughput -use super-chunks for data transfer -maximize disk I/O parallelism by routing data to nodes in a balanced way -reduce the disk I/O bottleneck by exploiting cache locality Scalable capacity -use a cluster storage system -route repeated data to the same node -maintain balanced utilization across nodes High deduplication, close to a single-node system -use a super-chunk that consists of consecutive chunks 3
Introduction Chunk Definition -a segment of a data stream Tradeoffs -small chunks yield high deduplication -large chunks yield high throughput 4
Introduction Super-chunk Definition -consists of consecutive chunks Merits -maintains high cache locality -reduces system overhead -achieves a deduplication rate similar to per-chunk routing Demerits -risk of creating duplicates -can cause imbalanced utilization across nodes Issues with super-chunks -how they are formed -how they are assigned to nodes -how they are routed to nodes for balance 5
Dataflow of Deduplication Cluster 1. Divide the data stream into chunks 2. Compute a fingerprint for each chunk 3. Group consecutive chunks into a super-chunk 4. Select a representative fingerprint for the super-chunk 5. Route the super-chunk to one of the nodes 6
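The five steps above can be sketched in Python. This is a minimal sketch under stated assumptions: fixed-size chunking (the paper uses content-defined chunking), the minimum-fingerprint representative policy, and the constants (chunk size, bins, mod routing) are illustrative choices, not the paper's exact configuration.

```python
import hashlib

CHUNK_SIZE = 8 * 1024      # illustrative average chunk size
CHUNKS_PER_SUPER = 128     # 128 x 8 KB ~= 1 MB super-chunks (slide 10)
NUM_BINS = 64              # M bins; hypothetical value

def fingerprint(chunk):
    """SHA-1 fingerprint of a chunk (SHA-1 per slide 13)."""
    return hashlib.sha1(chunk).digest()

def route_stream(data, bin_to_node):
    """Steps 1-5: chunk, fingerprint, group, pick representative, route."""
    # 1. Divide the stream into chunks (fixed-size here for simplicity).
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    routed = []
    for i in range(0, len(chunks), CHUNKS_PER_SUPER):
        super_chunk = chunks[i:i + CHUNKS_PER_SUPER]
        fps = [fingerprint(c) for c in super_chunk]   # 2. fingerprint each chunk
        rep = min(fps)                                # 3-4. representative (min policy)
        # 5. representative -> bin -> node, via the bin table
        node = bin_to_node[int.from_bytes(rep, "big") % NUM_BINS]
        routed.append((node, super_chunk))
    return routed
```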
Deduplication flow at a node (cont.) 7
Deduplication flow at a node For each incoming chunk: 1. Is the fingerprint in the cache? If yes, it is a duplicate; deduplication is done. 2. Otherwise, is the fingerprint in the on-disk index? If yes, load the fingerprints that were written at the same time (the same container) into the cache; deduplication is done. 3. Otherwise, write the fingerprint and chunk to a container; when the container is full, write the container to disk. (In the original diagram, colored boxes mark steps that require disk access.) 8
What is a Container? Definition -a fixed-size, large unit of disk storage -consists of two parts: a fingerprint section and a chunk-data section Usage -stores the fingerprints and chunk data of non-duplicate chunks on disk 9
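The per-node flow and container layout described on the last two slides can be sketched as follows. The class name, tiny container capacity, and the set/dict stand-ins for the cache, on-disk index, and containers are illustrative assumptions, not the paper's implementation.

```python
import hashlib

CONTAINER_CAPACITY = 4   # chunks per container (tiny, for illustration)

class DedupNode:
    """Per-node state: fingerprint cache, on-disk index, and containers."""
    def __init__(self):
        self.cache = set()        # in-memory fingerprint cache
        self.index = {}           # on-disk index: fingerprint -> container id
        self.containers = []      # sealed on-disk containers: (fps, chunks)
        self.open_fps, self.open_chunks = [], []  # container being filled

    def write(self, chunk):
        """Process one chunk; returns True if it was deduplicated."""
        fp = hashlib.sha1(chunk).digest()
        if fp in self.cache:               # cache hit: no disk access needed
            return True
        if fp in self.index:               # index hit: requires disk access
            cid = self.index[fp]
            # Load the fingerprints that were written at the same time
            # (the whole container's fingerprint section) into the cache.
            self.cache.update(self.containers[cid][0])
            return True
        # New chunk: append fingerprint and data to the open container.
        self.open_fps.append(fp)
        self.open_chunks.append(chunk)
        self.cache.add(fp)
        if len(self.open_fps) == CONTAINER_CAPACITY:   # container full
            cid = len(self.containers)
            self.containers.append((self.open_fps, self.open_chunks))
            for f in self.open_fps:
                self.index[f] = cid
            self.open_fps, self.open_chunks = [], []   # start a new container
        return False
```

Loading a whole container's fingerprints on one index hit is what exploits the cache locality of backup streams: chunks written together tend to be read (and re-deduplicated) together.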
Issue 1 : How are super-chunks formed? Determine an average super-chunk size -experimented with sizes from 8KB to 4MB -in general, 1MB is a good choice 10
Issue 2 : How are super-chunks assigned to nodes? A Bin Manager runs on the master node and maintains a table of M bins mapped to N nodes (M > N); it rebalances by migrating bins between nodes (for stateless routing). 1. Assign a bin number to the super-chunk 2. Look up the node that owns that bin 3. Route the super-chunk to that node 11
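The bin-table indirection can be sketched as below. The migration policy shown (move one bin from the fullest node to the emptiest node) is an illustrative assumption, not the paper's exact rebalancing algorithm; the key point is that only the table changes, never the routing function.

```python
class BinManager:
    """Master-node bin table: M bins spread over N nodes (M > N)."""
    def __init__(self, num_bins, num_nodes):
        assert num_bins > num_nodes
        self.bin_to_node = [b % num_nodes for b in range(num_bins)]

    def route(self, representative):
        bin_id = representative % len(self.bin_to_node)  # 1. super-chunk -> bin
        return self.bin_to_node[bin_id]                  # 2-3. bin -> node

    def migrate_one_bin(self, node_usage):
        """Rebalance by reassigning one bin from the fullest node to the
        emptiest node (illustrative policy)."""
        full = max(node_usage, key=node_usage.get)
        empty = min(node_usage, key=node_usage.get)
        for bin_id, node in enumerate(self.bin_to_node):
            if node == full:
                self.bin_to_node[bin_id] = empty
                break
```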
Issue 3 : How are super-chunks routed to nodes for balance? Two data-routing techniques overcome the demerits of super-chunks: Stateless technique with bin migration -lightweight and well suited for most balanced workloads Stateful technique -improves deduplication while avoiding data skew 12
Stateless Technique Basics 1. Compute a fingerprint for each chunk 2. Select a representative fingerprint from among them 3. Assign a bin to the super-chunk (e.g., representative mod #bins) How to compute a fingerprint -hash the whole chunk (a.k.a. hash(*)) -hash the first N bytes of the chunk (a.k.a. hash(N)) ※ SHA-1 is used as the hash function How to select the representative fingerprint -first, maximum, or minimum 13
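The fingerprinting and representative-selection choices above can be sketched directly (SHA-1 as stated on the slide; function names are mine):

```python
import hashlib

def fingerprint(chunk, n=None):
    """hash(*) when n is None; hash(N) hashes only the first n bytes."""
    data = chunk if n is None else chunk[:n]
    return hashlib.sha1(data).digest()

def representative(chunks, policy="min", n=None):
    """Pick the representative fingerprint: 'first', 'max', or 'min'."""
    fps = [fingerprint(c, n) for c in chunks]
    if policy == "first":
        return fps[0]
    return max(fps) if policy == "max" else min(fps)
```

hash(N) trades a little fingerprint quality for speed: hashing only the first N bytes of each chunk is cheaper, and the paper's hash(64) variant turns out to work well in practice.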
Stateful Technique (cont.) Merits compared to stateless Higher deduplication, close to a single-node backup system Balanced load Bin migration no longer needed Demerits More operations Higher memory or communication cost 14
Stateful Technique Process 1. Calculate a "weighted vote" for each node: (number of matching fingerprints) × (overload weight) 2. Select the node with the highest weighted vote 15
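A hypothetical rendering of the weighted vote. The exact overload weighting here (scaling a node's vote down in proportion to how far its usage exceeds the average) is an assumption for illustration; the paper defines its own weighting function.

```python
def choose_node(superchunk_fps, node_fp_sets, node_usage):
    """Stateful routing sketch: each node votes with its count of matching
    fingerprints, and an overload weight scales the vote down for nodes
    whose usage exceeds the average (illustrative weighting)."""
    avg = sum(node_usage) / len(node_usage)
    best_node, best_score = 0, -1.0
    for node, fps in enumerate(node_fp_sets):
        matches = len(superchunk_fps & fps)   # fingerprints already on this node
        weight = avg / node_usage[node] if node_usage[node] > avg else 1.0
        score = matches * weight              # weighted vote
        if score > best_score:
            best_node, best_score = node, score
    return best_node
```

This is where the extra memory and communication cost on slide 14 comes from: every routing decision needs (an approximation of) each node's fingerprint set and current usage.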
Datasets 16
Evaluation Metrics Capacity Total Deduplication (TD) -original dataset size ÷ deduplicated size Data Skew -max node utilization ÷ avg node utilization Effective Deduplication (ED) -TD ÷ Data Skew Normalized ED -shows how close the deduplication is to a single-node system Throughput -measured by the number of on-disk fingerprint index lookups 17
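The capacity metrics above are simple ratios and can be written directly (a sketch; function names are mine):

```python
def total_deduplication(original_size, dedup_size):
    """TD: original dataset size divided by its deduplicated size."""
    return original_size / dedup_size

def data_skew(node_sizes):
    """Max node utilization divided by average node utilization."""
    return max(node_sizes) / (sum(node_sizes) / len(node_sizes))

def effective_deduplication(original_size, node_sizes):
    """ED: TD divided by data skew."""
    td = total_deduplication(original_size, sum(node_sizes))
    return td / data_skew(node_sizes)

def normalized_ed(ed, single_node_td):
    """How close the cluster's ED is to a single node's deduplication."""
    return ed / single_node_td
```

Dividing TD by skew captures the intuition that a cluster is only as good as its fullest node: a skewed cluster runs out of space on one node even if its total deduplication is high.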
Experimental Results : Overall Effectiveness Using trace-driven simulation 18
Experimental Results : Overall Effectiveness (with migration) 19
Experimental Results : Feature Selection HYDRAstor - Routing chunks to nodes according to content - Good performance - Worse deduplication rate due to 64KB chunks 20
Experimental Results : Cache Locality and Throughput Logical Skew : max(size before dedup) ÷ avg(size before dedup) Max lookup : maximum normalized total number of fingerprint index lookups ED : Effective Deduplication (32 nodes) 21
Experimental Results : Effect of Bin Migration The ED drops between migration points due to increasing skew. 22
Summary
                Stateless           Stateless           Stateful
                (small clusters)    (large clusters)    (all)
Deduplication   Good                Bad                 Good
Data Skew       Good                Bad                 Good
Overhead        Good                Good                Bad
23
Conclusion 1. Using super-chunks for data routing is superior to using individual chunks: it achieves scalable throughput while maximizing deduplication. 2. The stateless routing method (hash(64)) with bin migration is a simple and efficient approach. 3. The effective deduplication of a stateless-routed cluster may drop quickly as the number of nodes increases; to solve this, a stateful data-routing approach is proposed. Simulations show good performance with up to 64 nodes in a cluster. 24