
1 Evaluating Impact of Data Distribution, Level of Parallelism and Communication Performance on Data Cube Construction
Ge Yang, Ruoming Jin, Gagan Agrawal
The Ohio State University

2 Motivation
Datasets for off-line processing are becoming larger.
A system storing and allowing analysis on such datasets is a data warehouse.
Frequent queries on data warehouses require aggregation along one or more dimensions.
Data cube construction performs all aggregations in advance to facilitate fast responses to all queries.
Data cube construction is a compute- and data-intensive problem.
Memory requirements become the bottleneck for sequential algorithms.
Construct data cubes in parallel in cluster environments!

3 Outline
Issues in sequential / parallel data cube construction
Aggregation tree
Parallel algorithms for data cube construction
  Varying level of parallelism
  Varying frequency of communication
Trade-offs in parallel data cube construction
  Impact of data distribution
  Impact of parallelism
  Impact of communication frequency

4 Data Cube – Definition
Data cube construction involves computing aggregates for all values across all possible subsets of dimensions.
If the original dataset is n-dimensional, data cube construction includes computing and storing, for each m, all C(n, m) m-dimensional arrays.
Three-dimensional data cube construction involves computing the arrays AB, AC, BC, A, B, C and a scalar value all.
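To make the definition concrete, here is a minimal sequential sketch, assuming NumPy, a dense array, and sum as the aggregate; build_data_cube and cube are illustrative names, not from the paper.

import itertools
import numpy as np

def build_data_cube(data):
    # Aggregate (sum) over every subset of dimensions of `data`.
    # The result maps the tuple of retained axes to the aggregated
    # array; the empty tuple holds the scalar 'all'.
    n = data.ndim
    cube = {}
    for m in range(n + 1):
        for kept in itertools.combinations(range(n), m):
            dropped = tuple(ax for ax in range(n) if ax not in kept)
            cube[kept] = data.sum(axis=dropped) if dropped else data
    return cube

# For a 3-D array ABC this yields AB, AC, BC, A, B, C and 'all'.
cube = build_data_cube(np.arange(24).reshape(2, 3, 4))
print(cube[(0, 1)].shape)  # AB: (2, 3)
print(cube[()])            # all: 276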

5 Main Issues
Cache and memory reuse: each portion of the parent array is read only once to compute its children, so the corresponding portions of each child should be updated simultaneously.
Using minimal parents: if a child has more than one parent, it should be computed from its minimal parent, the one requiring the least computation; choose a spanning tree with minimal parents (see the sketch below).
Memory management: write the output array back to disk if no child is computed from it.
Communication volume: appropriately partition along one or more dimensions to guarantee minimal communication volume.
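A small sketch of the minimal-parent rule, under the assumption that the cost of computing a child is proportional to the number of cells in its parent; dim_sizes and the function names are illustrative.

import itertools

def minimal_parent_tree(dim_sizes):
    # Link every node (a tuple of retained dimensions) to the parent
    # with the fewest cells, i.e. the one cheapest to aggregate from.
    dims = sorted(dim_sizes)
    def cells(node):
        prod = 1
        for d in node:
            prod *= dim_sizes[d]
        return prod
    tree = {}
    for m in range(len(dims)):
        for node in itertools.combinations(dims, m):
            parents = [tuple(sorted(set(node) | {d}))
                       for d in dims if d not in node]
            tree[node] = min(parents, key=cells)
    return tree

# With |D1| >= |D2| >= |D3|, the minimal parent of D1 is D1D3.
print(minimal_parent_tree({'D1': 8, 'D2': 4, 'D3': 2})[('D1',)])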

6 Aggregation Tree
Given a set X = {1, 2, ..., n} and a prefix tree P(n), the corresponding aggregation tree A(n) is constructed by complementing every node in P(n) with respect to X.
[Figures: power set lattice, prefix tree, aggregation tree]
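A sketch of the construction, assuming the usual prefix tree on subsets of X = {1, ..., n} in which a node S has one child S ∪ {j} for every j larger than every element of S; the function names are illustrative.

def prefix_tree_edges(n):
    # Parent/child edges of the prefix tree P(n) on subsets of
    # X = {1, ..., n}: node S has children S | {j} for j > max(S).
    edges = []
    def expand(node):
        for j in range(max(node, default=0) + 1, n + 1):
            child = node | {j}
            edges.append((node, child))
            expand(child)
    expand(frozenset())
    return edges

def aggregation_tree_edges(n):
    # A(n): complement every node of P(n) with respect to X.
    X = frozenset(range(1, n + 1))
    return [(X - u, X - v) for (u, v) in prefix_tree_edges(n)]

for parent, child in aggregation_tree_edges(3):
    print(sorted(parent), '->', sorted(child))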

7 Related Theoretical Results
Data cube construction using the aggregation tree uses a right-to-left depth-first traversal (sketched below) and has a number of theoretical results:
The total memory requirement for holding the results is minimally bounded.
The total communication volume is bounded.
All arrays are computed from their minimal parents.
A procedure for partitioning input datasets exists that minimizes interprocessor communication.
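A sketch of the right-to-left depth-first traversal, reusing aggregation_tree_edges() from the sketch under slide 6. Visiting children right to left is what allows an array to be written back as soon as all of its children have been computed, which is how the memory bound is obtained.

from collections import defaultdict

def right_to_left_dfs(root, edges):
    # Visit each node, then its children from right to left.
    children = defaultdict(list)
    for u, v in edges:
        children[u].append(v)
    order = []
    def visit(node):
        order.append(node)
        for child in reversed(children[node]):
            visit(child)
    visit(root)
    return order

X = frozenset(range(1, 4))
for node in right_to_left_dfs(X, aggregation_tree_edges(3)):
    print(''.join(f'D{i}' for i in sorted(node)) or 'all')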

8 Level One Parallel Algorithm
Main ideas:
Each processor computes a portion of each child at the first level.
Lead processors have the final results after interprocessor communication.
If the output is not used to compute other children, write it back; otherwise compute the children on the lead processors.
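A hedged sketch of the first-level reduction using mpi4py; the paper does not specify an implementation, and the dummy data, the 2x2x2 processor layout, and the rank-to-coordinate mapping are all illustrative assumptions.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # assume 8 ranks, rank = 4*p1 + 2*p2 + p3

# Each rank holds one block of D1D2D3 (dummy data here).
local_block = np.full((2, 2, 2), float(rank))

# Partial result for the first-level child D1D2: sum out local D3.
partial_d1d2 = local_block.sum(axis=2)

# The two ranks that differ only in p3 reduce together; the one with
# p3 == 0 (the lower rank) acts as the lead processor for D1D2.
subcomm = comm.Split(color=rank // 2, key=rank)
result = np.empty_like(partial_d1d2) if subcomm.Get_rank() == 0 else None
subcomm.Reduce(partial_d1d2, result, op=MPI.SUM, root=0)

if result is not None:
    pass  # lead processor: write D1D2 back, or compute its children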

9 Example
Assumption: 8 processors, and each of the three dimensions is partitioned in half.
Initially, each processor computes partial results for each of D1D2, D1D3 and D2D3.
[Figure: aggregation lattice D1D2D3 → D1D2, D1D3, D2D3 → D1, D2, D3 → all, for a three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|]

10 Example (cont.)
Lead processors for D1D2: (l1, l2, 0). Each processor (l1, l2, 1) sends its partial D1D2 result to (l1, l2, 0): (0, 0, 1) → (0, 0, 0), (0, 1, 1) → (0, 1, 0), (1, 0, 1) → (1, 0, 0), (1, 1, 1) → (1, 1, 0).
Write back D1D2 on the lead processors.
[Figure: aggregation lattice for the three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|]

11 Example (cont.)
Lead processors for D1D3: (l1, 0, l3). Each processor (l1, 1, l3) sends its partial D1D3 result to (l1, 0, l3): (0, 1, 0) → (0, 0, 0), (0, 1, 1) → (0, 0, 1), (1, 1, 0) → (1, 0, 0), (1, 1, 1) → (1, 0, 1).
Compute D1 from D1D3 on the lead processors; write back D1D3 on the lead processors.
Lead processors for D1: (l1, 0, 0). Each processor (l1, 0, 1) sends its partial D1 result to (l1, 0, 0): (0, 0, 1) → (0, 0, 0), (1, 0, 1) → (1, 0, 0).
Write back D1 on the lead processors.
[Figure: aggregation lattice for the three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|]
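The pattern in the example generalizes; a small illustrative helper (names assumed, not from the paper) enumerates the lead processors for any child on an n-dimensional processor grid.

import itertools

def lead_processors(kept_dims, procs_per_dim):
    # Lead processors keep their own coordinate along each retained
    # dimension and coordinate 0 along each aggregated dimension.
    ranges = [range(p) if d in kept_dims else (0,)
              for d, p in enumerate(procs_per_dim)]
    return list(itertools.product(*ranges))

print(lead_processors({0, 1}, (2, 2, 2)))  # D1D2: (l1, l2, 0)
print(lead_processors({0, 2}, (2, 2, 2)))  # D1D3: (l1, 0, l3)
print(lead_processors({0},    (2, 2, 2)))  # D1:   (l1, 0, 0)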

12 Performance Factors
[Diagram: four algorithm variants and the trade-offs between them]
Level One Parallel → Opt. Level One Parallel: reducing communication frequency.
Level One Parallel → All Levels Parallel: increasing parallelism, more memory.
All Levels Parallel → Opt. All Levels Parallel: reducing communication frequency, more communication volume.

13 Impact of Communication Frequency
Optimized versions with lower communication frequency perform better!

14 Impact of Parallelism
More parallelism is preferred, even though it increases communication volume!

15 Impact of Data Distribution
Partitioning along multiple dimensions performs better!

16 Related Work
Goil et al. did the initial work on parallelizing data cube construction.
Dehne et al. focused on a shared-disk model where all processors access data from a common set of disks. Neither considered the memory requirement issue.
Our work includes concrete results on minimizing memory requirements and communication volume, and focuses on a shared-nothing model, which is more commonly used.

