Efficient Join Query Evaluation in a Parallel Database System

Distributed Multi-Join Query Evaluation
- n relations, k join attributes
- p servers connected by a network
- data is partitioned uniformly across the servers
- the problem can be divided into two sub-problems:
  - how to efficiently join the local data items on each server?
  - how to deliver all data items between servers?

Distributed Shuffle Algorithms
- Broadcast Shuffle
- Hash-Based Shuffle
- HyperCube Shuffle

Broadcast Shuffle
(diagram: servers P1–P4 each broadcast their local fragments Sij of the input relations to all other servers, so every server ends up holding full copies of the broadcast relations)
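
A minimal sketch of the broadcast idea, modeling a toy cluster as Python lists (the function name, data layout, and per-worker "inboxes" are illustrative assumptions, not the system's implementation):

```python
# Toy model of a broadcast shuffle: every worker sends its local fragment
# of a relation S to all p workers, so each worker ends up with all of S.

def broadcast_shuffle(fragments):
    """fragments[i] = list of tuples of S stored on worker i."""
    p = len(fragments)
    inboxes = [[] for _ in range(p)]          # what each worker receives
    for sender, fragment in enumerate(fragments):
        for receiver in range(p):
            inboxes[receiver].extend(fragment)  # network cost: p copies of S
    return inboxes

# Example: 3 workers, each holding one fragment of S.
fragments = [[(1, 'a')], [(2, 'b')], [(3, 'c')]]
print(broadcast_shuffle(fragments))  # every worker now holds all of S
```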

Hash-Based Shuffle
(diagram: servers P1–P4 repartition their data by hashing on a join attribute and join locally, "continuing for another round…" after each binary join; the fragments evolve from S1*, S2*, S3* to the intermediate results S12*, S123* and finally S1234)
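
A minimal sketch of one hash-based repartitioning round, assuming tuples are routed by hashing the join attribute (the function name, key selection, and worker count are illustrative assumptions):

```python
# Toy hash-based shuffle: route each tuple to worker hash(join_key) % p,
# so tuples that can join on that attribute end up on the same worker.

def hash_shuffle(fragments, key_index, p):
    """fragments[i] = tuples currently on worker i; key_index selects the join attribute."""
    inboxes = [[] for _ in range(p)]
    for fragment in fragments:
        for tup in fragment:
            dest = hash(tup[key_index]) % p   # destination worker
            inboxes[dest].append(tup)
    return inboxes

# Example: repartition fragments of R(x1, x2) on x1 across 4 workers.
R_fragments = [[(2, 3), (5, 1)], [(4, 6), (2, 7)]]
print(hash_shuffle(R_fragments, key_index=0, p=4))
```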

HyperCube Shuffle
(diagram: a tuple S1(x1 = 2, x2 = 4) is hashed on its attribute values to determine its cell in the hypercube grid of servers)

HyperCube Shuffle – contd.
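
A minimal sketch of the HyperCube (Shares-style) routing rule for a query with two join attributes x1 and x2: each worker is a cell in a p1 × p2 grid; a tuple is hashed on the attributes it binds and replicated along the dimensions it does not bind. The function name, dictionary layout, and hash choices below are illustrative assumptions:

```python
from itertools import product

def hypercube_destinations(tup, bound, shares, hashes):
    """
    tup:    dict attr -> value for the attributes this tuple binds
    bound:  set of join attributes the tuple's relation binds
    shares: dict attr -> number of buckets along that dimension (product = p)
    hashes: dict attr -> hash function for that attribute
    Returns the grid coordinates of every destination worker.
    """
    coords_per_dim = []
    for attr, p_i in shares.items():
        if attr in bound:
            # bound attribute: exactly one coordinate along this dimension
            coords_per_dim.append([hashes[attr](tup[attr]) % p_i])
        else:
            # unbound attribute: replicate along the whole dimension
            coords_per_dim.append(list(range(p_i)))
    return list(product(*coords_per_dim))

# Example: 4 workers arranged as a 2 x 2 grid over join attributes x1, x2.
shares = {'x1': 2, 'x2': 2}
hashes = {'x1': hash, 'x2': hash}
# A tuple binding only x1 is replicated along the x2 dimension (2 workers):
print(hypercube_destinations({'x1': 2}, bound={'x1'}, shares=shares, hashes=hashes))
# A tuple binding both x1 and x2 goes to exactly one worker:
print(hypercube_destinations({'x1': 2, 'x2': 4}, bound={'x1', 'x2'},
                             shares=shares, hashes=hashes))
```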

HyperCube – Optimal Shares Factorization
- Recall that the dimensions of the hypercube are determined by the shares
- How do we find the optimal factorization?
- If the optimal shares are already integers, the solution is optimal
- What about the other cases? Rounding works poorly

Optimal Shares Factorization – The Intuition
- check all possible combinations of shares whose product equals the number of servers p
- select the combination with the smallest maximum workload
  - defined as the maximum amount of data assigned to a single worker
  - can easily be computed for each configuration
- break ties by choosing a hypercube with more evenly sized edges

Optimal Shares Factorization – The Algorithm
    best_config = none; curr_min = infinity
    for each configuration (p1, …, pk) such that p1 · … · pk = p:
        workload = maximum workload of this configuration
        if workload < curr_min:
            curr_min = workload; best_config = this configuration
        else if workload == curr_min:
            best_config = the configuration with more even dimensions
    return best_config
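
A runnable sketch of this search, assuming the caller supplies a workload estimator; the estimator, the variance-based tie-break, and all names here (factorizations, optimal_shares, max_workload) are assumptions made for illustration, not the paper's exact cost computation:

```python
from statistics import pvariance

def factorizations(p, k):
    """Yield all tuples (p1, ..., pk) of positive integers with p1 * ... * pk = p."""
    if k == 1:
        yield (p,)
        return
    for d in range(1, p + 1):
        if p % d == 0:
            for rest in factorizations(p // d, k - 1):
                yield (d,) + rest

def optimal_shares(p, k, max_workload):
    """
    max_workload(config) -> estimated maximum data assigned to one worker.
    Ties are broken by preferring more evenly sized hypercube edges.
    """
    best_config, curr_min = None, float('inf')
    for config in factorizations(p, k):
        w = max_workload(config)
        if (best_config is None or w < curr_min
                or (w == curr_min and pvariance(config) < pvariance(best_config))):
            best_config, curr_min = config, w
    return best_config

# Example with a toy workload estimator: each relation's size divided by the
# share of the one attribute it is hashed on (purely illustrative).
sizes = {'R': 1_000_000, 'S': 1_000_000}          # R hashed on x1, S on x2
estimator = lambda c: max(sizes['R'] / c[0], sizes['S'] / c[1])
print(optimal_shares(p=64, k=2, max_workload=estimator))   # -> (8, 8)
```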

Sequential Join Algorithms
- Centralized algorithms, used within a single node
- We examine two algorithms of this type:
  - Binary Symmetric Hash Join
  - Tributary Multi-Way Join

Binary Symmetric Hash Join
(diagram: the relations S1(x1, x2), S2(x2, x3) and S3(x1, x3) are joined two at a time through in-memory hash tables, producing the intermediate result S12(x1, x2, x3) and finally S123(x1, x2, x3))
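
A minimal sketch of a symmetric hash join on a single attribute, with in-memory Python dictionaries standing in for the two hash tables (tuple layout and names are illustrative assumptions):

```python
from collections import defaultdict

def symmetric_hash_join(left, right, left_key, right_key):
    """
    Streaming symmetric hash join: each arriving tuple is inserted into its
    own side's hash table and probed against the other side's table, so
    results are produced as soon as both matching tuples have arrived.
    """
    left_table, right_table = defaultdict(list), defaultdict(list)
    results = []
    # Interleave the two inputs to mimic streaming arrival.
    events = [('L', t) for t in left] + [('R', t) for t in right]
    for side, tup in events:
        if side == 'L':
            key = tup[left_key]
            left_table[key].append(tup)
            for match in right_table[key]:
                results.append(tup + match)
        else:
            key = tup[right_key]
            right_table[key].append(tup)
            for match in left_table[key]:
                results.append(match + tup)
    return results

# Example: join S1(x1, x2) with S2(x2, x3) on x2.
S1 = [(2, 3), (5, 1), (4, 6)]
S2 = [(3, 4), (1, 2), (6, 5)]
print(symmetric_hash_join(S1, S2, left_key=1, right_key=0))
```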

Tributary Multi-Way Join
(diagram: the sorted relations S1(x1, x2), S2(x2, x3) and S3(x1, x3) are traversed simultaneously; for each candidate binding, matching values of the next attribute are located by binary search, e.g. "binary search (x2 = 0)…", "binary search (x3 = 1)…", and the result S123(x1, x2, x3) is assembled tuple by tuple)
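
The full Tributary Join walks all relations at once in a chosen attribute order. Purely as a simplified illustration of its core binary-search step (not the paper's complete algorithm), the sketch below intersects the sorted value lists of one join attribute across several relations in a leapfrog style; the function name and inputs are assumptions:

```python
from bisect import bisect_left

def intersect_sorted(columns):
    """
    Leapfrog-style intersection of sorted, duplicate-free value lists using
    binary search: repeatedly seek every list forward to the current maximum
    candidate until all lists agree on a value.
    """
    positions = [0] * len(columns)
    out = []
    while all(pos < len(col) for pos, col in zip(positions, columns)):
        candidate = max(col[pos] for pos, col in zip(positions, columns))
        if all(col[pos] == candidate for pos, col in zip(positions, columns)):
            out.append(candidate)
            positions = [pos + 1 for pos in positions]
        else:
            # binary search each lagging list forward to the first value >= candidate
            positions = [bisect_left(col, candidate, pos)
                         for pos, col in zip(positions, columns)]
    return out

# Example: sorted values of x2 appearing in three relations.
print(intersect_sorted([[1, 2, 3, 5], [2, 3, 4, 6], [0, 2, 3, 7]]))  # -> [2, 3]
```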

Attribute Order Optimization
- Tributary Join requires the optimizer to choose a global order over all attributes that participate in the join
- A bad ordering can lead to extremely poor performance (similar to the classic join-ordering problem)
- A cost model is required to allow the optimizer to compare different orders

Tributary Join Cost Model
- Intuitively, the most expensive step is the binary search
- We therefore estimate the number of binary searches performed during the join
- A cost function is given for each step of the attribute order, based on the number of unique values of each attribute in the current relation

Tributary Join Cost Model – Definitions
Let:
- O – an order on the join attributes
- C_i(O) – the cost of the i-th step of O
- proj(R, O) – the projection of R's attributes on O, i.e., the subset of O which appears in R
- u(x, R) – the number of unique values of attribute x in R
- u(X, R) – the number of unique combinations of the attribute set X in R

Tributary Join Cost Function
Then the cost of an order is calculated as follows: (formula shown on the original slide)
With the following definition of a query cost: (formula shown on the original slide)
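
The slide formulas themselves are not preserved in this transcript. Purely as a hedged illustration of the idea described above (estimating cost from per-attribute distinct-value counts), the sketch below computes a rough cost for an attribute order; the function order_cost, its inputs, and the way combinations are counted are assumptions made for this example, not the paper's actual cost function:

```python
from math import prod

def order_cost(order, relations, distinct):
    """
    order:     global attribute order, e.g. ['x1', 'x2', 'x3']
    relations: dict name -> set of attributes the relation contains
    distinct:  dict (relation, attr) -> number of distinct values of attr in it
    Rough idea: at step i the attributes order[:i] are fixed; every relation
    containing order[i] is binary-searched once per distinct combination of
    the fixed attributes it shares with order[:i].
    """
    total = 0
    for i, attr in enumerate(order):
        fixed = set(order[:i])
        for rel, attrs in relations.items():
            if attr in attrs:
                shared = fixed & attrs
                combos = prod(distinct[(rel, a)] for a in shared) if shared else 1
                total += combos
    return total

# Example: starting the order with a low-cardinality attribute is cheaper.
relations = {'S1': {'x1', 'x2'}, 'S2': {'x2', 'x3'}, 'S3': {'x1', 'x3'}}
distinct = {('S1', 'x1'): 1000, ('S1', 'x2'): 10,
            ('S2', 'x2'): 10,   ('S2', 'x3'): 500,
            ('S3', 'x1'): 1000, ('S3', 'x3'): 500}
print(order_cost(['x1', 'x2', 'x3'], relations, distinct))  # -> 2013
print(order_cost(['x2', 'x3', 'x1'], relations, distinct))  # -> 523
```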

Empirical Evaluation
- We compare the different combinations of the above algorithms: 3 shuffle algorithms × 2 sequential algorithms = 6 combinations
- Different queries with various query-graph topologies (concrete examples below):
  - cycle – each relation is joined with two other relations, so the query graph forms a circle
  - clique – each relation is joined with the other n-1 relations
  - acyclic, etc.
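
As a concrete illustration of these topologies (generic example queries, not necessarily the exact instances evaluated in the paper):

```latex
% Cycle: each relation shares one attribute with each of its two neighbours,
% so the query graph forms a circle.
Q_{\mathrm{cycle}} \;=\; R_1(x_1,x_2) \bowtie R_2(x_2,x_3) \bowtie \cdots \bowtie R_n(x_n,x_1)

% Clique: every pair of relations shares an attribute; for n = 3 this is the
% triangle query.
Q_{\triangle} \;=\; R_1(x_1,x_2) \bowtie R_2(x_2,x_3) \bowtie R_3(x_1,x_3)

% Acyclic: the query graph is a tree, e.g. a simple chain.
Q_{\mathrm{chain}} \;=\; R_1(x_1,x_2) \bowtie R_2(x_2,x_3) \bowtie R_3(x_3,x_4)
```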

Empirical Evaluation - Queries
- The paper presents an evaluation of 8 queries
- We only discuss the three most interesting of them:
  - a cycle query with large intermediate results
  - a clique query with large intermediate results
  - an acyclic query with small intermediate results

Cycle Query with Large Intermediate Results
- RS_HJ – Regular Shuffle with Hash Join
- BR_TJ – Broadcast Shuffle with Tributary Join
- RS_TJ – Regular Shuffle with Tributary Join
- HC_HJ – HyperCube Shuffle with Hash Join
- BR_HJ – Broadcast Shuffle with Hash Join
- HC_TJ – HyperCube Shuffle with Tributary Join

Cycle Query - Analysis
- HyperCube shuffles less data, since it doesn't send intermediate join results
- Additionally, HyperCube achieves good load balance and much less skew in the data distribution
- Tributary Join outperforms a tree of binary joins because it avoids generating a huge number of intermediate results
- However, it requires HyperCube to fully exploit its potential, since:
  1) it requires all the input relations
  2) the sorting doesn't scale well

Clique Query with Large Intermediate Results
- RS_HJ – Regular Shuffle with Hash Join
- BR_TJ – Broadcast Shuffle with Tributary Join
- RS_TJ – Regular Shuffle with Tributary Join
- HC_HJ – HyperCube Shuffle with Hash Join
- BR_HJ – Broadcast Shuffle with Hash Join
- HC_TJ – HyperCube Shuffle with Tributary Join

Clique Query - Analysis
- As with the previous query, the HC_TJ combination is the best in terms of query runtime, total CPU time, and total data shuffled
- Broadcast Shuffle performs poorly, since every join involves at least one full relation due to the large number of join attributes in each relation; Regular Shuffle has no such problem

Acyclic Query with Small Intermediate Results
- RS_HJ – Regular Shuffle with Hash Join
- BR_TJ – Broadcast Shuffle with Tributary Join
- RS_TJ – Regular Shuffle with Tributary Join
- HC_HJ – HyperCube Shuffle with Hash Join
- BR_HJ – Broadcast Shuffle with Hash Join
- HC_TJ – HyperCube Shuffle with Tributary Join

Acyclic Query - Analysis
- Since the intermediate results are small, Regular Shuffle sends significantly less data than HyperCube and Broadcast do (both of which send only base data)
- For the same reason, with HyperCube and Broadcast each worker has to process much more data locally
- Because of its data-sorting step, Tributary Join performs poorly on large inputs compared to Hash Join

Experimental Results Summary
- There is no single best query plan overall
- The HC_TJ combination outperforms the others in the presence of large intermediate results or significant data skew
- When the intermediate results are small and there is no significant skew, the traditional join techniques lead to the best performance