Evaluating Impact of Data Distribution, Level of Parallelism and Communication Performance on Data Cube Construction
Ge Yang, Ruoming Jin, Gagan Agrawal
The Ohio State University

Motivation
- Datasets for off-line processing are becoming larger; a system that stores such datasets and supports analysis on them is a data warehouse.
- Frequent queries on data warehouses require aggregation along one or more dimensions.
- Data cube construction performs all these aggregations in advance to facilitate fast responses to such queries.
- Data cube construction is a compute- and data-intensive problem; memory requirements become the bottleneck for sequential algorithms.
- Our approach: construct data cubes in parallel in cluster environments!

Outline
- Issues in sequential and parallel data cube construction
- Aggregation tree
- Parallel algorithms for data cube construction
  - Varying level of parallelism
  - Varying frequency of communication
- Trade-offs in parallel data cube construction
  - Impact of data distribution
  - Impact of parallelism
  - Impact of communication frequency

Data Cube – Definition
- Data cube construction involves computing aggregates for all values across all possible subsets of dimensions.
- If the original dataset is n-dimensional, data cube construction computes and stores C(n, m) m-dimensional arrays for each m, i.e. 2^n aggregates in total.
- Example: three-dimensional data cube construction involves computing the arrays AB, AC, BC, A, B, C, and a scalar value "all".
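
As a concrete sequential illustration of this definition, the sketch below enumerates all 2^3 group-bys of a small dense three-dimensional array by summing out each subset of dimensions. The array sizes and the brute-force recomputation from the raw input are illustrative assumptions; the algorithms in these slides instead compute each group-by from a parent aggregate.

```cpp
#include <cstdio>
#include <vector>

// Brute-force data cube over a dense |A| x |B| x |C| array: for each of the
// 2^3 subsets of dimensions we keep, sum out the remaining dimensions.
// Sizes are illustrative; real datacube code computes children from parents.
int main() {
    const int NA = 2, NB = 3, NC = 4;
    std::vector<double> in(NA * NB * NC);
    for (int i = 0; i < NA * NB * NC; ++i) in[i] = i;  // toy input

    for (int mask = 0; mask < 8; ++mask) {             // bit k set => keep dim k
        int keepA = mask & 1, keepB = mask & 2, keepC = mask & 4;
        int oa = keepA ? NA : 1, ob = keepB ? NB : 1, oc = keepC ? NC : 1;
        std::vector<double> out(oa * ob * oc, 0.0);
        for (int a = 0; a < NA; ++a)
            for (int b = 0; b < NB; ++b)
                for (int c = 0; c < NC; ++c) {
                    int ia = keepA ? a : 0, ib = keepB ? b : 0, ic = keepC ? c : 0;
                    out[(ia * ob + ib) * oc + ic] += in[(a * NB + b) * NC + c];
                }
        std::printf("group-by %c%c%c has %zu cells\n",
                    keepA ? 'A' : '-', keepB ? 'B' : '-', keepC ? 'C' : '-',
                    out.size());
    }
}
```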

Main Issues
- Cache and memory reuse: each portion of the parent array is read only once to compute its children, so the corresponding portions of each child should be updated simultaneously.
- Using minimal parents: if a child has more than one parent, it is computed from its minimal parent, the one requiring the least computation; i.e., choose a spanning tree of the lattice with minimal parents.
- Memory management: write the output array back to disk once no remaining child is computed from it.
- Communication volume: appropriately partition along one or more dimensions to guarantee minimal communication volume.
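
A minimal sketch of minimal-parent selection, under the usual assumption that the cheapest parent is the one whose array is smallest: for each group-by (encoded as a bitmask of kept dimensions), pick the parent with exactly one extra dimension whose total size is least. The dimension sizes and the bitmask encoding are illustrative.

```cpp
#include <cstdio>

int main() {
    const int n = 3;
    const long dimSize[n] = {16, 8, 4};   // |D1| >= |D2| >= |D3| (illustrative)

    // Size of the group-by encoded by `mask` (bit k set => dimension k kept).
    auto size = [&](int mask) {
        long s = 1;
        for (int k = 0; k < n; ++k)
            if (mask & (1 << k)) s *= dimSize[k];
        return s;
    };

    // For every proper group-by, the minimal parent is the smallest array
    // among those containing exactly one extra dimension.
    for (int mask = 0; mask < (1 << n) - 1; ++mask) {
        int best = -1;
        for (int k = 0; k < n; ++k) {
            if (mask & (1 << k)) continue;          // dimension already kept
            int parent = mask | (1 << k);
            if (best < 0 || size(parent) < size(best)) best = parent;
        }
        std::printf("group-by mask %d: minimal parent mask %d (size %ld)\n",
                    mask, best, size(best));
    }
}
```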

Aggregation Tree
- Given a set X = {1, 2, …, n} and a prefix tree P(n), the corresponding aggregation tree A(n) is constructed by complementing every node in P(n) with respect to X.
(Figures: power-set lattice, prefix tree, and the resulting aggregation tree.)
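
A minimal sketch of this construction: enumerate the prefix tree of subsets of {1, …, n} (each node's children extend it with elements larger than its maximum) and complement every node with respect to X. The root ∅ complements to X itself, matching the raw dataset at the root of the aggregation tree. The bitmask encoding of sets is an illustrative assumption.

```cpp
#include <cstdio>

const int n = 3;
const int FULL = (1 << n) - 1;   // X = {1, ..., n} as a bitmask

// Prefix tree: children of a node extend it with elements larger than any
// it contains. Complementing each node w.r.t. X yields the aggregation tree.
void visit(int prefixNode, int maxElem, int depth) {
    std::printf("%*sprefix node %d -> aggregation node %d\n",
                2 * depth, "", prefixNode, FULL & ~prefixNode);
    for (int e = maxElem + 1; e <= n; ++e)
        visit(prefixNode | (1 << (e - 1)), e, depth + 1);
}

int main() { visit(0, 0, 0); }   // root {} maps to X itself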

Related Theoretical Results
Data cube construction using the aggregation tree, with a right-to-left depth-first traversal, has a number of provable properties:
- The total memory requirement for holding the results is minimally bounded.
- The total communication volume is bounded.
- All arrays are computed from their minimal parents.
- A procedure exists for partitioning the input dataset so that interprocessor communication is minimized.
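
A sketch of the right-to-left depth-first traversal over the aggregation tree above: a node's children are generated left to right, so the recursion visits them in reverse. This reuses the illustrative bitmask encoding from the previous sketch and is only meant to show the evaluation order (for n = 3 it computes D1D2 first, as in the example that follows).

```cpp
#include <cstdio>

const int n = 3;
const int FULL = (1 << n) - 1;

// Right-to-left depth-first traversal of the aggregation tree: a parent is
// computed before its children, and children (prefix-tree extensions by
// larger elements) are visited from the largest extension downward.
void traverse(int prefixNode, int maxElem) {
    std::printf("compute group-by %d\n", FULL & ~prefixNode);
    for (int e = n; e > maxElem; --e)
        traverse(prefixNode | (1 << (e - 1)), e);
}

int main() { traverse(0, 0); }
```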

Level One Parallel Algorithm
Main ideas:
- Each processor computes a portion of each child at the first level of the aggregation tree.
- After interprocessor communication, the lead processors hold the final results.
- If an output array is not used to compute any other child, write it back to disk; otherwise, compute its children on the lead processors.
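
A hedged MPI sketch of the reduction step, assuming a 2x2x2 virtual processor grid: processors sharing the same (l1, l2) coordinates combine their partial D1D2 slices so that the processor with l3 = 0 (the lead) holds the final D1D2 portion. The grid encoding, slice size, and choice of lead are illustrative assumptions, not the authors' exact code; MPI_Comm_split and MPI_Reduce are standard MPI calls.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Illustrative 2x2x2 grid: rank = (l1, l2, l3) packed into 3 bits.
    int l1 = (rank >> 2) & 1, l2 = (rank >> 1) & 1, l3 = rank & 1;

    // Each processor's partial D1D2 slice (toy size and contents).
    const int SLICE = 4;
    std::vector<double> partial(SLICE, rank), result(SLICE, 0.0);

    // Processors sharing (l1, l2) form one group; within it, l3 = 0 leads.
    MPI_Comm group;
    MPI_Comm_split(MPI_COMM_WORLD, l1 * 2 + l2, l3, &group);
    MPI_Reduce(partial.data(), result.data(), SLICE, MPI_DOUBLE,
               MPI_SUM, /*root=*/0, group);

    if (l3 == 0)   // lead processor: owns the finished D1D2 portion
        std::printf("lead (%d,%d,0): D1D2 portion ready\n", l1, l2);

    MPI_Comm_free(&group);
    MPI_Finalize();
}
```

With the illustrative grid above, this would be launched on eight processes, e.g. `mpirun -np 8 ./level_one` (binary name assumed).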

Example
- Assumptions: 8 processors; each of the three dimensions is partitioned in half; the input is a three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|.
- Initially, each processor computes partial results for each of D1D2, D1D3, and D2D3.
(Figure: aggregation tree with nodes D1D2D3, D2D3, D1D3, D1D2, D3, D2, D1, all.)

Example (cont.)
- Lead processors for D1D2 are (l1, l2, 0): each processor (l1, l2, 1) sends its partial D1D2 to (l1, l2, 0), pairing (0,0,1) with (0,0,0), (0,1,1) with (0,1,0), (1,0,1) with (1,0,0), and (1,1,1) with (1,1,0).
- Write back D1D2 on the lead processors.
(Figure: aggregation tree, as before.)

Example (cont.)
- Lead processors for D1D3 are (l1, 0, l3): processors (l1, 1, l3) send their partial D1D3 to (l1, 0, l3). Compute D1 from D1D3 on the lead processors, then write back D1D3.
- Lead processors for D1 are (l1, 0, 0): processor (l1, 0, 1) sends its partial D1 to (l1, 0, 0). Write back D1 on the lead processors.
(Figure: aggregation tree, as before.)
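
A small sketch of the lead-processor rule this example applies: for a given group-by, the lead processors are exactly those whose grid coordinate is 0 along every dimension that has been aggregated away. The 2x2x2 grid and bit encodings are illustrative.

```cpp
#include <cstdio>

int main() {
    const int n = 3;
    // Processor p packs (l1, l2, l3) as bits 2, 1, 0; a group-by is a
    // keepMask with bit 2 = D1, bit 1 = D2, bit 0 = D3 kept.
    // Rule: p leads iff its coordinate is 0 on every aggregated dimension.
    for (int keepMask = 0; keepMask < (1 << n); ++keepMask) {
        std::printf("group-by %c%c%c leads:",
                    (keepMask & 4) ? '1' : '-', (keepMask & 2) ? '2' : '-',
                    (keepMask & 1) ? '3' : '-');
        for (int p = 0; p < (1 << n); ++p)
            if ((p & ~keepMask & ((1 << n) - 1)) == 0)
                std::printf(" (%d,%d,%d)", (p >> 2) & 1, (p >> 1) & 1, p & 1);
        std::printf("\n");
    }
}
```

For D1D2 (keepMask for D1 and D2), this prints exactly the (l1, l2, 0) processors from the slide above.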

Performance Factors
(Diagram relating the four algorithm variants:)
- Level One Parallel → Opt. Level One Parallel, by reducing communication frequency.
- Level One Parallel → All Levels Parallel, by increasing parallelism, at the cost of more memory and more communication volume.
- All Levels Parallel → Opt. All Levels Parallel, by reducing communication frequency.

Impact of Communication Frequency
Optimized versions with lower communication frequency are better!

Impact of Parallelism
More parallelism is preferred even though it increases communication volume!

Impact of Data Distribution
Partitioning along multiple dimensions does better!

Related Work
- Goil et al. did the initial work on parallelizing data cube construction.
- Dehne et al. focused on a shared-disk model in which all processors access data from a common set of disks; they did not consider the memory requirement issue either.
- Our work provides concrete results on minimizing memory requirements and communication volume, and targets the shared-nothing model, which is more commonly used.