Wavelet Compression for In Situ Data Reduction

Presentation transcript:

Wavelet Compression for In Situ Data Reduction

Scientific Achievement
Wavelet compression is now available in the VTK-m library. The algorithm was re-thought within the VTK-m framework and was shown to be efficient on both CPUs and GPUs.

Significance and Impact
The growing divide between compute and I/O will require data reduction to occur in situ on the next generation of supercomputers. Wavelet compression is an important reduction technique because it balances storage costs against data integrity. With our efforts, any stakeholder using VTK-m can now apply wavelet compression.

Citation
S. Li, N. Marsaglia, V. Chen, C. Sewell, J. Clyne, and H. Childs. "Achieving Portable Performance For Wavelet Compression Using Data Parallel Primitives." In Proceedings of the EuroGraphics Symposium on Parallel Graphics and Visualization (EGPGV), pages 73–81, Barcelona, Spain, June 2017.

Figure captions:
Relative performance of different hash functions on an NVIDIA Tesla P100.
Performance of wavelet compression (in seconds) for VTK-m and for a CPU-specific implementation (VAPOR) on data sets ranging from 256³ to 2048³. Overall, our hardware-agnostic approach had performance comparable to hardware-specific approaches; results in the GPU comparisons were similarly even.

Scientific visualization algorithms often produce meshes comprising many small cells of data that in aggregate describe a full volume or surface. These many small cells allow the algorithms to exploit the high degree of parallelism required to make full use of modern accelerators such as GPUs and Xeon Phis. However, when these elements are created independently and in parallel, their shared features are duplicated, which leads to redundant data and a loss of connectivity information. A known parallel technique for finding duplicate entities is to generate identifiers for elements such that two identifiers are equal if and only if the two elements are duplicates. One such identifier is the spatial coordinates, although an index-based approach is generally faster and more robust. In either case, a parallel sort readily finds duplicate identifiers. However, the sort is slow because the identifiers are large (192 bits is common). In this work we experiment with 32-bit hash values, which are not guaranteed to be unique but sort much faster; hash collisions are resolved in a later step. Resolving the collisions takes time, but much less time than is saved in the sort operation. Smaller hash values also make it possible to replace the sort with a hash table, which sometimes increases performance further. Because different hash functions change how the algorithm performs, we experimented with three distinct hash functions. In our experiments, using hash values rather than full identifiers gave speedups of up to 7.8x in the external faces algorithm.
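
To make the duplicate-detection scheme described above concrete, here is a minimal, CPU-only sketch (not the data-parallel VTK-m implementation): each face is keyed by its three point indices, the key is reduced to a 32-bit hash, the candidates are sorted by hash, and collisions are resolved by comparing the full keys within runs of equal hashes. The FaceKey type, the FNV-1a hash, and the FindDuplicates function are illustrative names chosen for this sketch, not identifiers from the paper or from VTK-m.

// Minimal CPU sketch of the hash-then-resolve idea. Three 64-bit point
// indices correspond to the large (192-bit) identifiers mentioned above;
// sorting their 32-bit hashes is much cheaper than sorting the full keys.
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Three point indices, stored in sorted order so that the same face
// produces the same key regardless of winding.
using FaceKey = std::array<std::uint64_t, 3>;

// One possible 32-bit hash (FNV-1a); the study compares several alternatives.
std::uint32_t HashKey(const FaceKey& key)
{
  std::uint32_t h = 2166136261u;
  for (std::uint64_t v : key)
  {
    for (int byte = 0; byte < 8; ++byte)
    {
      h ^= static_cast<std::uint32_t>((v >> (8 * byte)) & 0xFFu);
      h *= 16777619u;
    }
  }
  return h;
}

// Returns the keys that occur more than once, i.e., faces shared by two
// cells. The comparison of full keys inside equal-hash runs handles
// hash collisions.
std::vector<FaceKey> FindDuplicates(const std::vector<FaceKey>& faces)
{
  std::vector<std::pair<std::uint32_t, FaceKey>> tagged;
  tagged.reserve(faces.size());
  for (const FaceKey& f : faces)
  {
    tagged.emplace_back(HashKey(f), f);
  }

  std::sort(tagged.begin(), tagged.end());  // orders by hash, then by full key

  std::vector<FaceKey> duplicates;
  for (std::size_t i = 1; i < tagged.size(); ++i)
  {
    if (tagged[i].first == tagged[i - 1].first &&   // same hash ...
        tagged[i].second == tagged[i - 1].second)   // ... and same full key
    {
      duplicates.push_back(tagged[i].second);
    }
  }
  return duplicates;
}

In the external faces algorithm, a face that appears twice is shared by two cells and is therefore interior; the faces left unmatched are the external ones that get rendered.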
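
The text also notes that the smaller hash values make it possible to replace the sort with a hash table. The sketch below shows that variant under the same assumptions; std::unordered_map stands in for the data-parallel table used in the actual study, and the multiplicative hash is simply a second example in the spirit of the several hash functions that were compared. Again, none of these names come from the paper or from VTK-m.

// Hash-table alternative: instead of sorting, insert each face into a table
// keyed by the full face key so a duplicate is detected at insertion time.
#include <array>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using FaceKey = std::array<std::uint64_t, 3>;  // three point indices, pre-sorted

struct FaceKeyHash
{
  std::size_t operator()(const FaceKey& key) const
  {
    std::uint64_t h = 0;
    for (std::uint64_t v : key)
    {
      h = h * 1099511628211ull + v;  // simple multiplicative mixing
    }
    return static_cast<std::size_t>(h ^ (h >> 32));
  }
};

// Counts occurrences of each key; a key reaching a count of two is a face
// shared by two cells and is reported as a duplicate.
std::vector<FaceKey> FindDuplicatesWithTable(const std::vector<FaceKey>& faces)
{
  std::unordered_map<FaceKey, int, FaceKeyHash> counts;
  counts.reserve(faces.size());

  std::vector<FaceKey> duplicates;
  for (const FaceKey& f : faces)
  {
    if (++counts[f] == 2)
    {
      duplicates.push_back(f);
    }
  }
  return duplicates;
}

Whether the table beats the sort depends on the collision rate and the target architecture, which matches the observation above that the hash-table variant only sometimes increases performance further.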