6.830 Lecture 10 Query Optimization 10/6/2014. Selinger Optimizer Algorithm algorithm: compute optimal way to generate every sub-join: size 1, size 2,...

Slides:



Advertisements
Similar presentations
© Project Maths Development Team
Advertisements

Salvatore Ruggieri SIGKDD2010 Frequent Regular Itemset Mining 2010/9/2 1.
Data Mining Techniques Association Rule
COE 202: Digital Logic Design Combinational Circuits Part 1
Selinger Optimizer Lecture 10 October 15, 2009 Sam Madden.
Figure 7 Fractional Factorials This first example (C=AB) is in the book c1 c2 a1b1 b2 a2b1 b2 ConditionGMABCABACBCABC a 1 b 1 c a 1 b.
Lecture 10 Query Optimization II Automatic Database Design.
6.830 Lecture 11 Query Optimization & Automatic Database Design 10/8/2014.
Properties of Armstrong’s Axioms Soundness All dependencies generated by the Axioms are correct Completeness Repeatedly applying these rules can generate.
Quit Permutations Combinations Pascal’s triangle Binomial Theorem.
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar © Tan,Steinbach,
Classroom Exercise: Normalization
CPSC-608 Database Systems Fall 2010 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #9.
Association Rule Mining - MaxMiner. Mining Association Rules in Large Databases  Association rule mining  Algorithms Apriori and FP-Growth  Max and.
CONGRUENT AND SIMILAR FIGURES
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #8.
Lecture 07 Prof. Dr. M. Junaid Mughal
Lecture 11 Main Memory Databases Midterm Review. Time breakdown for Shore DBMS Source: “OLTP Under the Looking Glass”, SIGMOD 2008 Systematically removed.
P ERMUTATIONS AND C OMBINATIONS Homework: Permutation and Combinations WS.
Jessie Zhao Course page: 1.
Lesson 1.9 Probability Objective: Solve probability problems.
Introduction to Geometry
Modul 7: Association Analysis. 2 Association Rule Mining  Given a set of transactions, find rules that will predict the occurrence of an item based on.
A small country has 4 provinces – A, B, C and D. Each province contains 30%, 20%, 10% and 40% of the population in the country, respectively. The.
Recursion. Review  Recursive solutions, by definition, are built off solutions to sub-problems.  Many times, this will mean simply to compute f(n) by.
Aim: Properties of Square & Rhombus Course: Applied Geo. Do Now: Aim: What are the properties of a rhombus and a square? Find the length of AD in rectangle.
Lecture 9 Query Optimization.
0 0 Correction: Problem 18 Alice and Bob have 2n+1 FAIR coins, each with probability of a head equal to 1/2.
The Binomial Theorem Lecture 29 Section 6.7 Mon, Apr 3, 2006.
Membership problem CYK Algorithm Project presentation CS 5800 Spring 2013 Professor : Dr. Elise de Doncker Presented by : Savitha parur venkitachalam.
Examples. Examples (1/11)  Example #1: f(A,B,C,D) =  m(2,3,4,5,7,8,10,13,15) Fill in the 1’s. 1 1 C A B CD AB D 1 1.
Ch. 6: Permutations!.
Counting CSC-2259 Discrete Structures Konstantin Busch - LSU1.
Phrase-structure grammar A phrase-structure grammar is a quadruple G = (V, T, P, S) where V is a finite set of symbols called nonterminals, T is a set.
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
Arrangements, Permutations & Combinations
Minimum Cover of F. Minimal Cover for a Set of FDs Minimal cover G for a set of FDs F: –Closure of F = closure of G. –Right hand side of each FD in G.
Multiplying Binomials
CONSENSUS THEOREM Choopan Rattanapoka.
Geometry: Plane Figures Chapter. point A point marks a location. A A B B line segment the part of the line between 2 points endpoints.
Counting Techniques Tree Diagram Multiplication Rule Permutations Combinations.
March 11, 2005 Recursion (Implementation) Satish Dethe
Data Mining Association Rules Mining Frequent Itemset Mining Support and Confidence Apriori Approach.
1 Data Mining Lecture 6: Association Analysis. 2 Association Rule Mining l Given a set of transactions, find rules that will predict the occurrence of.
1 Mining the Smallest Association Rule Set for Predictions Jiuyong Li, Hong Shen, and Rodney Topor Proceedings of the 2001 IEEE International Conference.
南亚和印度.
Lecture 34 Section 6.7 Wed, Mar 28, 2007
Permutations and Combinations
4.5 Proving Triangles Congruent - ASA and AAS
Reducing Number of Candidates
Additional Example 3 Additional Example 4
Data Mining and Its Applications to Image Processing
Frequent Pattern Mining
Data Mining Association Analysis: Basic Concepts and Algorithms
تصنيف التفاعلات الكيميائية
Data Mining Association Analysis: Basic Concepts and Algorithms
Pascal's Triangle This is the start of Pascal's triangle. It follows a pattern, do you know.
External Joins Query Optimization 10/4/2017
Binomial Expansion 2 L.O. All pupils can follow where the binomial expansion comes from All pupils understand binomial notation All pupils can use the.
Design and Analysis of Multi-Factored Experiments
Permutations and Combinations
Fractional Factorial Design
6.3 Proving Quadrilaterals are Parallelograms
Recall Brute Force as a Problem Solving Technique
9.2 Proving Quadrilaterals are Parallelograms
Lecture 29 Section 6.7 Thu, Mar 3, 2005
Design matrix Run A B C D E
Lines, rays and line segments
Association Analysis: Basic Concepts
The Shapley-Shubik Power Index
Presentation transcript:

6.830 Lecture 10 Query Optimization 10/6/2014

Selinger Optimizer Algorithm algorithm: compute optimal way to generate every sub-join: size 1, size 2,... n (in that order) e.g. {A}, {B}, {C}, {AB}, {AC}, {BC}, {ABC} R  set of relations to join For i in {1...|R|}: for S in {all length i subsets of R}: optjoin(S) = a join (S-a), where a is the relation that minimizes: cost(optjoin(S-a)) + min. cost to join (S-a) to a + min. access cost for a Precomputed in previous iteration!

Selinger, as code R  set of relations to join For i in {1...|R|}: for S in {all length i subsets of R}: optcost s = ∞ optjoin S = ø for a in S: //a is a relation c sa = optcost s-a + min. cost to join (S-a) to a + min. access cost for a if c sa < optcost s optcost s = c sa optjoin s = optjoin(S-a) joined optimally w/ a This is the same algorithm as on the board, written differently Pre-computed in previous iteration!

Example 4 Relations: ABCD (only consider NL join) Optjoin: A = best way to access A (e.g., sequential scan, or predicate pushdown into index...) B = " " " " B C = " " " " C D = " " " " D {A,B} = AB or BA {A,C} = AC or CA {B,C} = BC or CB {A,D} {B,D} {C,D} R  set of relations to join For i in {1...|R|}: for S in {all length i subsets of R}: optjoin(S) = a join (S-a), where a is the relation that minimizes: cost(optjoin(S-a)) + min. cost to join (S-a) to a + min. access cost for a Optjoin

Example (con’t) Optjoin {A,B,C} = remove A: compare A({B,C}) to ({B,C})A remove B: compare ({A,C})B to B({A,C}) remove C: compare C({A,B}) to ({A,B})C {A,C,D} = … {A,B,D} = … {B,C,D} = … … {A,B,C,D} = remove A: compare A({B,C,D}) to ({B,C,D})A remove B: compare B({A,C,D}) to ({A,C,D})B remove C: compare C({A,B,D}) to ({A,B,D})C remove D: compare D({A,C,C}) to ({A,B,C})D R  set of relations to join For i in {1...|R|}: for S in {all length i subsets of R}: optjoin(S) = a join (S-a), where a is the relation that minimizes: cost(optjoin(S-a)) + min. cost to join (S-a) to a + min. access cost for a Optjoin

Complexity Number of subsets of set of size n = |power set of n| = 2 n (here, n is number of relations) How much work per subset? Have to iterate through each element of each subset, so this at most n n2 n complexity (vs n!) n=12  49K vs 479M R  set of relations to join For i in {1...|R|}: for S in {all length i subsets of R}: optjoin(S) = a join (S-a), where a is the relation that minimizes: cost(optjoin(S-a)) + min. cost to join (S-a) to a + min. access cost for a Optjoin