IPDPS 2013 - Boston: Integrating Online Compression to Accelerate Large-Scale Data Analytics Applications. Tekin Bicer, Jian Yin, David Chiu, Gagan Agrawal.

Similar presentations
Dynamic Power Redistribution in Failure-Prone CMPs Paula Petrica, Jonathan A. Winter * and David H. Albonesi Cornell University *Google, Inc.
Evaluating Caching and Storage Options on the Amazon Web Services Cloud Gagan Agrawal, Ohio State University - Columbus, OH David Chiu, Washington State.
Towards Automating the Configuration of a Distributed Storage System Lauro B. Costa Matei Ripeanu {lauroc, NetSysLab University of British.
Designing Services for Grid-based Knowledge Discovery A. Congiusta, A. Pugliese, Domenico Talia, P. Trunfio DEIS University of Calabria ITALY
1 Processes and Threads Creation and Termination States Usage Implementations.
Availability in Globally Distributed Storage Systems
Shredder GPU-Accelerated Incremental Storage and Computation
Gate Sizing for Cell Library Based Designs Shiyan Hu*, Mahesh Ketkar**, Jiang Hu* *Dept of ECE, Texas A&M University **Intel Corporation.
Online Algorithm Huaping Wang Apr.21
Capacity-Approaching Codes for Reversible Data Hiding Weiming Zhang, Biao Chen, and Nenghai Yu Department of Electrical Engineering & Information Science.
Making Time-stepped Applications Tick in the Cloud Tao Zou, Guozhang Wang, Marcos Vaz Salles*, David Bindel, Alan Demers, Johannes Gehrke, Walker White.
HJ-Hadoop An Optimized MapReduce Runtime for Multi-core Systems Yunming Zhang Advised by: Prof. Alan Cox and Vivek Sarkar Rice University 1.
Floating-Point Data Compression at 75 Gb/s on a GPU Molly A. O’Neil and Martin Burtscher Department of Computer Science.
CoMPI: Enhancing MPI based applications performance and scalability using run-time compression. Rosa Filgueira, David E.Singh, Alejandro Calderón and Jesús.
Parallel K-Means Clustering Based on MapReduce The Key Laboratory of Intelligent Information Processing, Chinese Academy of Sciences Weizhong Zhao, Huifang.
Adnan Ozsoy & Martin Swany DAMSL - Distributed and MetaSystems Lab Department of Computer Information and Science University of Delaware September 2011.
Computer Science and Engineering A Middleware for Developing and Deploying Scalable Remote Mining Services P. 1DataGrid Lab A Middleware for Developing.
MATE-EC2: A Middleware for Processing Data with Amazon Web Services Tekin Bicer, David Chiu* and Gagan Agrawal Department of Computer Science and Engineering.
ANL Chicago Elastic and Efficient Execution of Data- Intensive Applications on Hybrid Cloud Tekin Bicer Computer Science and Engineering Ohio State.
IPDPS, Supporting Fault Tolerance in a Data-Intensive Computing Middleware Tekin Bicer, Wei Jiang and Gagan Agrawal Department of Computer Science.
Venkatram Ramanathan 1. Motivation Evolution of Multi-Core Machines and the challenges Background: MapReduce and FREERIDE Co-clustering on FREERIDE Experimental.
Tanzima Z. Islam, Saurabh Bagchi, Rudolf Eigenmann – Purdue University Kathryn Mohror, Adam Moody, Bronis R. de Supinski – Lawrence Livermore National.
High Throughput Compression of Double-Precision Floating-Point Data Martin Burtscher and Paruj Ratanaworabhan School of Electrical and Computer Engineering.
Venkatram Ramanathan 1. Motivation Evolution of Multi-Core Machines and the challenges Summary of Contributions Background: MapReduce and FREERIDE Wavelet.
An Approach for Processing Large and Non-uniform Media Objects on MapReduce-Based Clusters Rainer Schmidt and Matthias Rella Speaker: Lin-You Wu.
A Metadata Based Approach For Supporting Subsetting Queries Over Parallel HDF5 Datasets Vignesh Santhanagopalan Graduate Student Department Of CSE.
Performance Issues in Parallelizing Data-Intensive applications on a Multi-core Cluster Vignesh Ravi and Gagan Agrawal
1 Time & Cost Sensitive Data-Intensive Computing on Hybrid Clouds Tekin Bicer, David Chiu, Gagan Agrawal Department of Computer Science and Engineering The.
A Framework for Elastic Execution of Existing MPI Programs Aarthi Raveendran Tekin Bicer Gagan Agrawal 1.
HPDC 2014 Supporting Correlation Analysis on Scientific Datasets in Parallel and Distributed Settings Yu Su*, Gagan Agrawal*, Jonathan Woodring # Ayan.
CCGrid 2014 Improving I/O Throughput of Scientific Applications using Transparent Parallel Compression Tekin Bicer, Jian Yin and Gagan Agrawal Ohio State.
1 A Framework for Data-Intensive Computing with Cloud Bursting Tekin Bicer, David Chiu, Gagan Agrawal Department of Computer Science and Engineering The Ohio.
Porting Irregular Reductions on Heterogeneous CPU-GPU Configurations Xin Huo, Vignesh T. Ravi, Gagan Agrawal Department of Computer Science and Engineering.
Evaluating FERMI features for Data Mining Applications Masters Thesis Presentation Sinduja Muralidharan Advised by: Dr. Gagan Agrawal.
ICPP 2012 Indexing and Parallel Query Processing Support for Visualizing Climate Datasets Yu Su*, Gagan Agrawal*, Jonathan Woodring † *The Ohio State University.
PFPC: A Parallel Compressor for Floating-Point Data Martin Burtscher 1 and Paruj Ratanaworabhan 2 1 The University of Texas at Austin 2 Cornell University.
Integrating and Optimizing Transactional Memory in a Data Mining Middleware Vignesh Ravi and Gagan Agrawal Department of Computer Science and Engg. The.
Computer Science and Engineering Predicting Performance for Grid-Based P. 1 IPDPS’07 A Performance Prediction Framework.
HPDC 2013 Taming Massive Distributed Datasets: Data Sampling Using Bitmap Indices Yu Su*, Gagan Agrawal*, Jonathan Woodring # Kary Myers #, Joanne Wendelberger.
Computer Science and Engineering Parallelizing Defect Detection and Categorization Using FREERIDE Leonid Glimcher P. 1 ipdps’05 Scaling and Parallelizing.
SC 2013 SDQuery DSI: Integrating Data Management Support with a Wide Area Data Transfer Protocol Yu Su*, Yi Wang*, Gagan Agrawal*, Rajkumar Kettimuthu.
CCGrid, 2012 Supporting User Defined Subsetting and Aggregation over Parallel NetCDF Datasets Yu Su and Gagan Agrawal Department of Computer Science and.
Compiler and Runtime Support for Enabling Generalized Reduction Computations on Heterogeneous Parallel Configurations Vignesh Ravi, Wenjing Ma, David Chiu.
Implementing Data Cube Construction Using a Cluster Middleware: Algorithms, Implementation Experience, and Performance Ge Yang Ruoming Jin Gagan Agrawal.
Elastic Cloud Caches for Accelerating Service-Oriented Computations Gagan Agrawal Ohio State University Columbus, OH David Chiu Washington State University.
Supporting Load Balancing for Distributed Data-Intensive Applications Leonid Glimcher, Vignesh Ravi, and Gagan Agrawal Department of Computer Science and.
PDAC-10 Middleware Solutions for Data- Intensive (Scientific) Computing on Clouds Gagan Agrawal Ohio State University (Joint Work with Tekin Bicer, David.
OSU – CSE 2014 Supporting Data-Intensive Scientific Computing on Bandwidth and Space Constrained Environments Tekin Bicer Department of Computer Science.
Ohio State University Department of Computer Science and Engineering Servicing Range Queries on Multidimensional Datasets with Partial Replicas Li Weng,
AUTO-GC: Automatic Translation of Data Mining Applications to GPU Clusters Wenjing Ma Gagan Agrawal The Ohio State University.
LIOProf: Exposing Lustre File System Behavior for I/O Middleware
Accelerating K-Means Clustering with Parallel Implementations and GPU Computing Janki Bhimani Miriam Leeser Ningfang Mi
Computer Science and Engineering Parallelizing Feature Mining Using FREERIDE Leonid Glimcher P. 1 ipdps’04 Scaling and Parallelizing a Scientific Feature.
Employing compression solutions under openacc
UMIACS and Department of Computer Science
Yu Su, Yi Wang, Gagan Agrawal The Ohio State University
Optimizing MapReduce for GPUs with Effective Shared Memory Usage
Data-Intensive Computing: From Clouds to GPU Clusters
Bin Ren, Gagan Agrawal, Brad Chamberlain, Steve Deitz
Resource Allocation for Distributed Streaming Applications
FREERIDE: A Framework for Rapid Implementation of Datamining Engines
L. Glimcher, R. Jin, G. Agrawal Presented by: Leo Glimcher
Presentation transcript:

IPDPS 2013, Boston
Integrating Online Compression to Accelerate Large-Scale Data Analytics Applications
Tekin Bicer, Jian Yin, David Chiu, Gagan Agrawal, and Karen Schuchardt
Ohio State University, Washington State University, Pacific Northwest National Laboratory

Introduction
Scientific simulations and instruments can generate large amounts of data
– E.g., the Global Cloud Resolving Model produces 1 PB of data for a 4 km grid-cell resolution
– Higher resolutions produce more and more data
– I/O operations become the bottleneck
Problems: storage capacity and I/O performance
Proposed remedy: compression

Motivation
Generic compression algorithms
– Are good for low-entropy byte sequences
– Scientific datasets are hard to compress: floating-point numbers consist of an exponent and a mantissa, and the mantissa can be highly entropic
Using compression in applications is challenging
– Choosing suitable compression algorithms
– Utilizing the available resources
– Integrating compression algorithms into applications

Outline
– Introduction
– Motivation
– Compression Methodology
– Online Compression Framework
– Experimental Results
– Related Work
– Conclusion

Compression Methodology
Common properties of scientific datasets
– Multidimensional arrays
– Consist of floating-point numbers
– Neighboring values are related
Domain-specific solutions can help
Approach: prediction-based differential compression
– Predict the values of neighboring cells
– Store only the difference

Example: GCRM Temperature Variable Compression
– The values of neighboring cells in a temperature record are highly related
– After prediction, only the compressed values are stored: 5 bits for the prediction plus the difference
– Supports both lossless and lossy compression; fast, with good compression ratios
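The prediction-plus-difference idea can be sketched as follows. This is a minimal illustrative example, not the paper's exact encoder: it XORs the 64-bit patterns of the predicted and actual values and stores only the non-zero low-order bytes of the residual, so a good prediction leaves almost nothing to store. All function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double's bit pattern as a 64-bit integer, and back. */
static uint64_t bits_of(double v) {
    uint64_t b;
    memcpy(&b, &v, sizeof b);
    return b;
}

static double double_of(uint64_t b) {
    double v;
    memcpy(&v, &b, sizeof v);
    return v;
}

/* Encode one value against its predicted neighbor: XOR the bit patterns
 * and keep only the bytes up to the highest non-zero one. Returns the
 * number of residual bytes written to out (0 for a perfect prediction). */
static int encode_residual(double predicted, double actual,
                           unsigned char out[8]) {
    uint64_t r = bits_of(predicted) ^ bits_of(actual);
    int n = 8;
    while (n > 0 && ((r >> ((n - 1) * 8)) & 0xFF) == 0)
        n--;                                /* strip leading zero bytes */
    for (int i = 0; i < n; i++)
        out[i] = (unsigned char)((r >> (i * 8)) & 0xFF);
    return n;
}

/* Decode: rebuild the residual and XOR it back onto the prediction. */
static double decode_residual(double predicted,
                              const unsigned char *in, int n) {
    uint64_t r = 0;
    for (int i = 0; i < n; i++)
        r |= (uint64_t)in[i] << (i * 8);
    return double_of(bits_of(predicted) ^ r);
}
```

With a reasonable predictor (e.g., a neighboring cell of the temperature field), most residual bytes are zero, so each value reduces to a small length tag plus a few bytes — the same shape as the slide's "5 bits for prediction + difference."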

Compression Framework
– Improve end-to-end application performance
– Minimize the application's I/O time by pipelining I/O and (de)compression operations
– Hide computational overhead by overlapping application computation with the compression framework
– Allow easy implementation of different compression algorithms
– Allow easy integration with applications (API similar to POSIX I/O)

A Compression Framework for Data-Intensive Applications
Chunk Resource Allocation (CRA) Layer
– Initializes the system
– Generates chunk requests and enqueues them for processing
– Converts original offset and data-size requests into compressed ones
Parallel Compression Engine (PCE)
– Applies the encode() and decode() functions to chunks
– Manages an in-memory cache with informed prefetching
– Creates I/O requests
Parallel I/O Layer (PIOL)
– Creates parallel chunk requests to the storage medium
– Each chunk request is handled by a group of threads
– Provides an abstraction over different data-transfer protocols

Compression Framework API
User-defined functions:
– encode_t(…): (required) code for compression
– decode_t(…): (required) code for decompression
– prefetch_t(…): (optional) informed-prefetching function
Applications can use the functions below:
– comp_read: applies decode_t to a compressed chunk
– comp_write: applies encode_t to an original chunk
– comp_seek: mimics fseek, and also utilizes prefetch_t
– comp_init: initializes the system (thread pools, cache, etc.)
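A sketch of how the callback-based API above might fit together. The slide gives only the function names, so the signatures here are assumptions, and the framework side is mocked as a single in-memory read:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type mirroring the slide's user-defined
 * decode_t; the framework's real signature likely differs. */
typedef size_t (*decode_t)(const char *in, size_t in_len,
                           char *out, size_t out_cap);

/* Mock of comp_read: fetch a compressed chunk (here: already in memory)
 * and run the user-supplied decoder on it, as the PCE layer would. */
static size_t comp_read_mock(const char *chunk, size_t len,
                             decode_t dec, char *out, size_t cap) {
    return dec(chunk, len, out, cap);
}

/* Trivial identity decoder standing in for a real decompressor. */
static size_t identity_decode(const char *in, size_t n,
                              char *out, size_t cap) {
    size_t m = n < cap ? n : cap;
    memcpy(out, in, m);
    return m;
}
```

The point of the design is that the framework owns chunking, caching, and threading, while the user plugs in only the codec pair — which is what makes swapping compression algorithms "plug-and-play."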

Prefetching and In-Memory Cache
– Overlaps application-layer computation with I/O
– Reusability of already-accessed data is small, so prospective chunks are prefetched and cached instead
– The default replacement policy is LRU; the user can analyze the access history and provide a prospective chunk list via prefetch(…) (informed prefetching)
– The cache uses a row-based locking scheme for efficient consecutive chunk requests
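The default LRU policy can be sketched as a tiny fixed-size chunk cache. This is an illustrative simplification with hypothetical names — the framework's real cache adds informed prefetching and row-based locking on top:

```c
/* Minimal LRU chunk-cache sketch: a fixed array of slots with a logical
 * clock; on a miss, the slot with the smallest timestamp is evicted. */
#define SLOTS 4

typedef struct { long chunk_id; long stamp; int used; } slot_t;
typedef struct { slot_t s[SLOTS]; long clock; } lru_t;

/* Look up a chunk; returns 1 on a hit, 0 on a miss (inserting it). */
static int lru_get(lru_t *c, long chunk_id) {
    for (int i = 0; i < SLOTS; i++) {
        if (c->s[i].used && c->s[i].chunk_id == chunk_id) {
            c->s[i].stamp = ++c->clock;    /* refresh recency on a hit */
            return 1;
        }
    }
    int victim = 0;                        /* miss: prefer an empty slot, */
    for (int i = 1; i < SLOTS; i++)        /* else evict the oldest stamp */
        if (!c->s[i].used ||
            (c->s[victim].used && c->s[i].stamp < c->s[victim].stamp))
            victim = i;
    c->s[victim].chunk_id = chunk_id;
    c->s[victim].used = 1;
    c->s[victim].stamp = ++c->clock;
    return 0;
}
```

Informed prefetching then amounts to calling the insertion path ahead of time for the user-predicted chunk list, so the application's next comp_read hits in memory.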

Integration with a Data-Intensive Computing System
– MapReduce-style API
– Remote data processing; sensitive to I/O bandwidth
– Processes data in a local cluster, in the cloud, or in both (hybrid cloud)

Outline
– Introduction
– Motivation
– Compression Methodology
– Online Compression Framework
– Experimental Results
– Related Work
– Conclusion

Experimental Setup
– Two datasets: GCRM, 375 GB (L: 270 + R: 105); NPB, 237 GB (L: 166 + R: 71)
– 16 nodes × 8 cores (Intel Xeon, 2.53 GHz)
– Dataset storage: Lustre FS (14 storage nodes) and Amazon S3 (Northern Virginia)
– Compression algorithms: CC, FPC, LZO, bzip, gzip, lzma
– Applications: AT, MMAT, KMeans

Performance of MMAT (breakdown of performance)
– Overhead (local): 15.41%
– Read speedup: 1.96×

Lossy Compression (MMAT)
– "Lossy #e": e is the number of dropped bits
– Error bound: 5 × 10^-5
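The "dropped bits" knob can be illustrated by zeroing the e low-order mantissa bits of each double before encoding, which leaves trailing zero runs that compress well at the cost of a bounded error. This is an illustrative sketch under that assumption, not necessarily the paper's exact truncation scheme:

```c
#include <stdint.h>
#include <string.h>

/* Zero the e lowest-order mantissa bits of a double (0 <= e <= 52).
 * Truncated values compress better because their prediction residuals
 * end in runs of zero bits; the cost is a bounded relative error of
 * roughly 2^(e-52). */
static double drop_bits(double v, int e) {
    uint64_t b;
    memcpy(&b, &v, sizeof b);
    b &= ~((UINT64_C(1) << e) - 1);   /* clear the e low mantissa bits */
    memcpy(&v, &b, sizeof v);
    return v;
}
```

Dropping 16 bits, for example, perturbs a value near 3 by less than about 3e-11, well inside the slide's 5 × 10^-5 error bound.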

Performance of KMeans
– NPB dataset; compression ratio: 24.01% (180 GB)
– More computation means more opportunity to overlap fetching and decompression

Conclusion
– Management and analysis of scientific datasets are challenging; generic compression algorithms are inefficient for scientific data
– We proposed a compression framework and methodology: domain-specific compression algorithms are fast and space-efficient (51.68% compression ratio, 53.27% improvement in execution time)
– Easy plug-and-play of compression algorithms
– The proposed framework and methodology were integrated with a data-analysis middleware

Thanks!

Multithreading and Prefetching
– Varying numbers of PCE and I/O threads; e.g., "2P – 4IO" means 2 PCE threads and 4 I/O threads
– One core is assigned to the compression framework

Related Work
(Scientific) data management
– NetCDF, PNetCDF, HDF5
– Nicolae et al. (BlobSeer): a distributed data-management service for efficient reading, writing, and appending operations
Compression
– Generic: LZO, bzip, gzip, szip, LZMA, etc.
– Scientific:
  – Schendel and Jin et al. (ISOBAR): organizes highly entropic data into compressible data chunks
  – Burtscher et al. (FPC): efficient double-precision floating-point compression
  – Lakshminarasimhan et al. (ISABELA)