DISCLAIMER: This material is based on work supported by the National Science Foundation and the Department of Defense under grant No. CNS-1263183. Any.

Slides:

Advertisements

Similar presentations

Tuning of Loop Cache Architectures to Programs in Embedded System Design Susan Cotterell and Frank Vahid Department of Computer Science and Engineering.

Advertisements

Integrating Cybersecurity Log Data Analysis in Hadoop Bryan Stearns, Susan Urban, Sindhuri Juturu Texas Tech University 2014 NSF Research Experience for.

1 A GPU Accelerated Storage System NetSysLab The University of British Columbia Abdullah Gharaibeh with: Samer Al-Kiswany Sathish Gopalakrishnan Matei.

Difference Engine: Harnessing Memory Redundancy in Virtual Machines by Diwaker Gupta et al. presented by Jonathan Berkhahn.

Exploiting Graphics Processors for High- performance IP Lookup in Software Routers Author: Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu.

Introduction  Data movement is a major bottleneck in data-intensive high performance computing  We propose a Fusion Active Storage System (FASS) to address.

Power Efficient IP Lookup with Supernode Caching Lu Peng, Wencheng Lu*, and Lide Duan Dept. of Electrical & Computer Engineering Louisiana State University.

CSCI 5708: Query Processing I Pusheng Zhang University of Minnesota Feb 3, 2004.

1 Regular expression matching with input compression ： a hardware design for use within network intrusion detection systems Department of Computer Science.

CSCI 5708: Query Processing I Pusheng Zhang University of Minnesota Feb 3, 2004.

Crossroads: A Practical Data Sketching Solution for Mining Intersection of Streams Jun Xu, Zhenglin Yu (Georgia Tech) Jia Wang, Zihui Ge, He Yan (AT&T.

A Virtual Environment for Investigating Counter Measures for MITM Attacks on Home Area Networks Lionel Morgan 1, Sindhuri Juturu 2, Justin Talavera 3,

Yongtao Zhou, Yuhui Deng, Junjie Xie

Cloud based Dynamic workflow with QOS for Mass Spectrometry Data Analysis Thesis Defense: Ashish Nagavaram Graduate student Computer Science and Engineering.

MetaSync File Synchronization Across Multiple Untrusted Storage Services Seungyeop Han Haichen Shen, Taesoo Kim*, Arvind Krishnamurthy,

A Critical Infrastructure Testbed for Cybersecurity Research and Education Ai Onda, Kalana Pothuvila, Joseph Urban, and Jordan Berg Abstract Awareness.

An Evaluation of Using Deduplication in Swappers Weiyan Wang, Chen Zeng.

1 Route Table Partitioning and Load Balancing for Parallel Searching with TCAMs Department of Computer Science and Information Engineering National Cheng.

Simple Interface for Polite Computing (SIPC) Travis Finch St. Edward’s University Department of Computer Science, School of Natural Sciences Austin, TX.

Understanding the Benefits and Costs of Deduplication Mahmoud Abaza, and Joel Gibson School of Computing and Information Systems, Athabasca University.

Venkatram Ramanathan 1. Motivation Evolution of Multi-Core Machines and the challenges Background: MapReduce and FREERIDE Co-clustering on FREERIDE Experimental.

Microsoft ® SQL Server ® 2008 and SQL Server 2008 R2 Infrastructure Planning and Design Published: February 2009 Updated: January 2012.

Abstract A software development life cycle can be divided into requirements elicitation, specification, design, implementation, testing, and maintenance.

Modularizing B+-trees: Three-Level B+-trees Work Fine Shigero Sasaki* and Takuya Araki NEC Corporation * currently with 1st Nexpire Inc.

Chapter 13 File Structures. Understand the file access methods. Describe the characteristics of a sequential file. After reading this chapter, the reader.

University of Maryland Compiler-Assisted Binary Parsing Tugrul Ince PD Week – 27 March 2012.

Abstract Plant phenotyping involves the assessment of plant traits such as growth, tolerance, resistance, and yield. The Texas Tech Phenotyping Project.

Database System Concepts, 5th Ed. ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 13: Query Processing.

Chao “Bill” Xie, Victor Bolet, Art Vandenberg Georgia State University, Atlanta, GA 30303, USA February 22/23, 2006 SURA, Washington DC Memory Efficient.

Breaking the Memory Wall in MonetDB

Yongzhi Wang, Jinpeng Wei VIAF: Verification-based Integrity Assurance Framework for MapReduce.

1. Department of Arts and Sciences, Georgia State University 2. Department of Electrical and Computer Engineering, Texas Tech University 3. Department.

DATA DEDUPLICATION By: Lily Contreras April 15, 2010.

Parallelizing Security Checks on Commodity Hardware E.B. Nightingale, D. Peek, P.M. Chen and J. Flinn U Michigan.

CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.

Improving Content Addressable Storage For Databases Conference on Reliable Awesome Projects (no acronyms please) Advanced Operating Systems (CS736) Brandon.

Integrated Maximum Flow Algorithm for Optimal Response Time Retrieval of Replicated Data Nihat Altiparmak, Ali Saman Tosun The University of Texas at San.

LATA: A Latency and Throughput- Aware Packet Processing System Author: Jilong Kuang and Laxmi Bhuyan Publisher: DAC 2010 Presenter: Chun-Sheng Hsueh Date:

Towards a Billion Routing Lookups per Second in Software  Author: Marko Zec, Luigi, Rizzo Miljenko Mikuc  Publisher: SIGCOMM Computer Communication Review,

Integrating and Optimizing Transactional Memory in a Data Mining Middleware Vignesh Ravi and Gagan Agrawal Department of ComputerScience and Engg. The.

Parallelization and Characterization of Pattern Matching using GPUs Author: Giorgos Vasiliadis 、 Michalis Polychronakis 、 Sotiris Ioannidis Publisher:

EQC16: An Optimized Packet Classification Algorithm For Large Rule-Sets Author: Uday Trivedi, Mohan Lal Jangir Publisher: 2014 International Conference.

Lecture 1- Query Processing Advanced Databases Masood Niazi Torshiz Islamic Azad university- Mashhad Branch

Parallel Event Processing for Content-Based Publish/Subscribe Systems Amer Farroukh Department of Electrical and Computer Engineering University of Toronto.

RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups Chun-Ho Ng, Patrick P. C. Lee The Chinese University of Hong Kong.

Computing Scientometrics in Large-Scale Academic Search Engines with MapReduce Leonidas Akritidis Panayiotis Bozanis Department of Computer & Communication.

Optimizing MapReduce for GPUs with Effective Shared Memory Usage Department of Computer Science and Engineering The Ohio State University Linchuan Chen.

Layali Rashid, Wessam M. Hassanein, and Moustafa A. Hammad*

OpenMP for Networks of SMPs Y. Charlie Hu, Honghui Lu, Alan L. Cox, Willy Zwaenepoel ECE1747 – Parallel Programming Vicky Tsang.

Hyperion :High Volume Stream Archival Divya Muthukumaran.

Cuckoo Filter: Practically Better Than Bloom Author: Bin Fan, David G. Andersen, Michael Kaminsky, Michael D. Mitzenmacher Publisher: ACM CoNEXT 2014 Presenter:

1 Thierry Titcheu Chekam 1,2, Ennan Zhai 3, Zhenhua Li 1, Yong Cui 4, Kui Ren 5 1 School of Software, TNLIST, and KLISS MoE, Tsinghua University 2 Interdisciplinary.

Analyzing Memory Access Intensity in Parallel Programs on Multicore Lixia Liu, Zhiyuan Li, Ahmed Sameh Department of Computer Science, Purdue University,

Fast Data Analysis with Integrated Statistical Metadata in Scientific Datasets By Yong Chen (with Jialin Liu) Data-Intensive Scalable Computing Laboratory.

Achieving the Ultimate Efficiency for Seismic Analysis

Efficient Multi-User Indexing for Secure Keyword Search

Database Management System

Chapter 12: Query Processing

Unistore: Project Updates

Yu Su, Yi Wang, Gagan Agrawal The Ohio State University

Table 3: P-values for 2016 Data Table 1: Mean Percentages

Distributed P2P File System

Optimizing MapReduce for GPUs with Effective Shared Memory Usage

Indexing and Hashing Basic Concepts Ordered Indices

Bin Ren, Gagan Agrawal, Brad Chamberlain, Steve Deitz

Database Design and Programming

Comparison to existing state of security experimentation

Author: Xianghui Hu, Xinan Tang, Bei Hua Lecturer: Bo Xu

Wavelet Compression for In Situ Data Reduction

Fan Ni Xing Lin Song Jiang

Presentation transcript:

DISCLAIMER: This material is based on work supported by the National Science Foundation and the Department of Defense under grant No. CNS Any opinions, findings, and conclusions expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Department of Defense.  Objectives  To design a data deduplication system that provides 100% data integrity without a possibility of losing data, and  To introduce a multi-level data deduplication method to reduce costly byte-by-byte comparisons for 100% integrity  Background  Reaching high ratios of data deduplication in High Performance Computing is highly achievable  Prior art demonstrates magnitudes of reduction possible and 15 to 30 percent of redundant data reduction on average  Challenges  Existing deduplication systems can lose data due to hash collisions even though the probability is low  Byte-by-byte comparisons for all blocks can eliminate the potential data loss but are very costly  Contributions  Designed a multi-level dedup method to reduce byte-by-byte comparisons while providing 100% data integrity  Implemented multi-level dedup while talking advantage of XeonPhi to compute cryptographic fingerprints concurrently  Proof-of-concept evaluations confirm promising results Abstract B FNA832A PO23I912 FNA832A NH49HJ2 MN239A7DFA6723J CAS98L2 KNADU82 C D A  Xeon Phi  Intel, Many Integrated Core Architecture (MIC)  Compute cryptographic fingerprints concurrently for multi-level data deduplication for rapid computations of hashes  OpenMP  Xeon Phi uses OpenMP to compile parallel programs  With appropriate additions of “pragma” lines, the compiler interprets the program as parallel  Sequential Vs. Parallel  Coded a sequential and parallel program in C that performs multi level hashing and dedup for large climate data sets  Compare the speed performance on Xeon Phi Nodes  Goal:  Ensure no unique data is discarded in the process of data deduplication while maintaining exceptional performance Goal and Approach [1] Meister, D., Kaiser, J., Brinkmann, A., Cortes, T., Kuhn, M., & Kunkel, J. (2012, November). A study on data deduplication in HPC storage systems. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (p. 7). IEEE Computer Society Press. [2] Broder, A., & Mitzenmacher, M. (2001). Using multiple hash functions to improve IP lookups. In INFOCOM Twentieth Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE(Vol. 3, pp ). IEEE. [3] Mark Ruijter. lessfs - open source data de-duplication. February [Online; Consulted on June 23, 2014]. [4] Koutoupis, P. (2011). Data deduplication with Linux. Linux Journal, 2011(207), 7. References  If several cryptographic fingerprints map to the same hash index, multiple hash functions are used as needed in data deduplication until we ensure the current data blocks being analyzed are in fact redundant  Multi-Level Hashing  When two data blocks produce the same cryptographic fingerprint, we cannot immediately tell if both blocks are redundant or if a collision has occurred  Existing systems ignore collisions citing the probability is low and can lose data, intolerable especially for mission critical systems  We conduct byte-by-byte comparisons for 100% data integrity  We also introduce multi-level dedup to reduce costly comparisons and leverage parallel architectures to speed up hash calculations Although using low bit hash functions increases the chances for hash collisions, compared to secure hashing algorithms (2 32 vs ), it has the potential to save memory when storing cryptographic fingerprints in the hash table Binary StreamData Chunks1 st Hash Function2 nd Hash Function  Although our research performs an additional check for handling hash collisions while most data deduplication file systems do not, we highly emphasize the importance of not replying on probability and not losing data in data deduplication  We provide this assurance by providing byte-by-byte comparisons and multi-level dedup technique  We conduct proof-of-concept verification with open source data deduplication file system Lessfs and Xeon Phi Conclusion and Future Work  Proof-of-concept prototyping with Lessfs  Open source inline data deduplication file system written for Linux  Lessfs, as most data deduplication file systems, do not implement byte-by-byte comparisons  Statistically, the hash functions provided are robust enough  Our motivation is to ensure no unique data is discarded as redundant data especially in reliable systems such as:  Aircrafts, space flight, military, and medicine Performance ResultsMethodology Two methods in Lessfs that are responsible for hashing the data blocks and determining whether the data blocks already exists  Tests with Climate Data Sets  We performed four test runs by gathering three unique climate data sets from open source databases  Our results demonstrate that when using more threads in the parallel program, the elapsed time for hashing significantly drops  For instance, some sets results demonstrate:  Sequential = 27m34.689s vs Xeon Phi = 7m11.970s  However, different threads may be more appropriate at different times depending on the data size  Three different buffer sizes were used: 1GB, 512MB, 256MB  A 90% improvement was achieved while parallelizing the program  Hashing 1 GB buffer on average:16.83s - Sequential  Lowest average for hashing 1 GB buffer: 2.35s – OpenMP w/ Xeon Phi Eric Valenzuela 1, Yin Lu 1, Susan Urban 2, Yong Chen 1 1 Department of Computer Science, 2 Department of Industrial Engineering, Texas Tech University Texas Tech University 2014 NSF Research Experiences for Undergraduates Site Project Multi-Level Hashing Dedup in HPC Storage Systems The OpenMP code that was written can be compile and ran as a parallel program without taking advantage of the Xeon Phi’s MIC’s. However, an even greater performance was achieved with the combination of parallelism and Xeon Phi characteristics