CS157B Query Optimization.


CS157B Query Optimization

2.4 An Algebraic Query Language
2.4.1 Why Do We Need a Special Query Language?
2.4.2 What is an Algebra?
2.4.3 Overview of Relational Algebra
2.4.4 Set Operations on Relations
2.4.5 Projection
2.4.6 Selection
2.4.7 Cartesian Product
2.4.8 Natural Joins
2.4.9 Theta-Joins
2.4.10 Combining Operations to Form Queries
2.4.11 Naming and Renaming
2.4.12 Relationships Among Operations
2.4.13 A Linear Notation for Algebraic Expressions
2.4.14 Exercises for Section 2.4

2.4.1 Why Do We Need a Special Query Language?

2.4.2 What is an Algebra? Numerical algebra operators: +, -, *, /. Relational algebra operators: Projection, Selection, Natural Join, Cartesian Product, Union, Intersect, Minus.

2.4.3 Overview of Relational Algebra Both numerical algebra and relational algebra are defined by (respective sets of) Algebraic Laws.

2.4.3 Overview of Relational Algebra The algebraic laws of numbers are the ones we have been learning in algebra, calculus, and other courses since high school.

2.4.3 Overview of Relational Algebra Algebraic laws of relational algebra are “New” to most of us

2.4.3 Overview of Relational Algebra
Special Relational Operators: 1. Projection 2. Selection 3. Natural Join
Traditional Set Operators: 4. Cartesian Product 5. Union 6. Intersect 7. Minus
SOME MORE

2.4.3 Overview of Relational Algebra SOME MORE

QUERY OPTIMIZATION In query optimization, the compiler transforms the query into an equivalent form that solves the problem fastest.

QUERY OPTIMIZATION Illustration: 2 * 3 + 5 * 3. By the laws of numerical algebra, it can be computed as follows. Method I: 2 * 3 + 5 * 3 = 6 (first) + 15 (second) = 21 (third). This takes three operations.

QUERY OPTIMIZATION Illustration: 2 * 3 + 5 * 3. By the laws of numerical algebra, it can also be computed as follows. Method II: transform it into an equivalent expression, (2 + 5) * 3 = 7 (first) * 3 = 21 (second). This takes two operations, so the compiler chooses Method II.
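The operation counts for the two methods can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: expressions are written as nested tuples, and the helper name `evaluate` is an assumption made for illustration.

```python
# Expressions as nested tuples: (operator, left, right); leaves are plain numbers.
method_1 = ("+", ("*", 2, 3), ("*", 5, 3))   # 2 * 3 + 5 * 3
method_2 = ("*", ("+", 2, 5), 3)             # (2 + 5) * 3

def evaluate(expr):
    """Return (value, number_of_operations) for an expression tuple."""
    if not isinstance(expr, tuple):
        return expr, 0                       # a leaf costs no operation
    op, left, right = expr
    left_value, left_ops = evaluate(left)
    right_value, right_ops = evaluate(right)
    value = left_value + right_value if op == "+" else left_value * right_value
    return value, left_ops + right_ops + 1   # one operation for this node

print(evaluate(method_1))  # (21, 3)
print(evaluate(method_2))  # (21, 2)
```

Both forms give the same value, but the factored form needs one operation fewer, which is the kind of saving an optimizer looks for.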

Roles of Relational Algebra QUERY OPTIMIZATION A similar idea is used in relational algebra, so we need to know the algebraic laws of relational algebra well. That is the main goal of Ch5 and part of Ch16.

Query Algebra In a DBMS, a query is turned into a sequence of operators on U. These operators form the query algebra (extended RA).

Query Algebra Bag operations: R ∪ S: SUM(# of times in R, # of times in S). R ∩ S: MIN(# of times in R, # of times in S).

Examples Two bags: R = {A, B, B}, S = {C, A, B, C}. R ∪ S = {A, A, B, B, B, C, C}. R ∩ S = {A, B}. R − S = {B}. (Obvious?)
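The bag example can be reproduced with Python's `collections.Counter`, which stores a multiplicity for each element. This is only an illustrative sketch; the slides do not prescribe any implementation.

```python
from collections import Counter

R = Counter(["A", "B", "B"])
S = Counter(["C", "A", "B", "C"])

bag_union = R + S          # multiplicities are added
bag_intersection = R & S   # minimum of the two multiplicities
bag_difference = R - S     # multiplicities are subtracted; results <= 0 are dropped

print(sorted(bag_union.elements()))         # ['A', 'A', 'B', 'B', 'B', 'C', 'C']
print(sorted(bag_intersection.elements()))  # ['A', 'B']
print(sorted(bag_difference.elements()))    # ['B']
```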

Selection: σ_C(R). SQL: Select * from R where C. C can involve computable formulas.

Projection: π_L(R). SQL: Select L from R

Projection: π_L(R). L can contain:
1. A single attribute of R.
2. An expression x → y: take the attribute x of R and rename it y.
3. An expression E → z, where E is an expression involving attributes of R and z is a new name for the attribute that results from the calculation. Ex: a+b → x

Theta Join: R ⋈_c S = σ_c(R × S). Natural Join (special case of theta join): R ⋈ S = π_L(σ_c(R × S)), where c is the condition of equality on the attributes common to R and S, and L is the attribute list with the redundant attributes dropped.
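Both joins can be sketched directly from their definitions as a selection over a Cartesian product. The Python below is a minimal illustration under assumptions: relations are lists of dicts, the sample data is invented, and the function names `theta_join` and `natural_join` are hypothetical.

```python
from itertools import product

# Hypothetical sample relations (not from the slides)
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
S = [{"b": 10, "c": "x"}, {"b": 30, "c": "y"}]

def theta_join(r, s, cond):
    """Selection over the product; attributes are prefixed to avoid name clashes."""
    rows = []
    for t, u in product(r, s):
        row = {f"R.{k}": v for k, v in t.items()}
        row.update({f"S.{k}": v for k, v in u.items()})
        if cond(row):
            rows.append(row)
    return rows

def natural_join(r, s):
    """Equality on all common attributes; each common attribute is kept once."""
    rows = []
    for t, u in product(r, s):
        common = set(t) & set(u)
        if all(t[k] == u[k] for k in common):
            rows.append({**t, **u})   # merging the dicts drops the redundant copy
    return rows

print(theta_join(R, S, lambda row: row["R.b"] < row["S.b"]))
print(natural_join(R, S))  # [{'a': 1, 'b': 10, 'c': 'x'}]
```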

SORTING OPERATOR (τ): τ_L(R). This corresponds to the SQL ORDER BY clause. In τ_L(R), R is a relation and L is a list of some of R's attributes; the result is the relation R with its tuples sorted in the order indicated by L. If L is a1, a2, …, an, then tuples are sorted first by a1, then by a2, and so on up to an. By default sorting is in ascending order.
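A small Python sketch of the sorting operator, assuming relations are lists of dicts; the function name `tau` and the sample data are invented for illustration.

```python
# Hypothetical relation R with attributes a1 and a2
R = [
    {"a1": 2, "a2": "x"},
    {"a1": 1, "a2": "z"},
    {"a1": 1, "a2": "y"},
]

def tau(attrs, relation):
    """Sort tuples by attrs[0], then attrs[1], and so on, in ascending order."""
    return sorted(relation, key=lambda t: tuple(t[a] for a in attrs))

for row in tau(["a1", "a2"], R):
    print(row)
# {'a1': 1, 'a2': 'y'}
# {'a1': 1, 'a2': 'z'}
# {'a1': 2, 'a2': 'x'}
```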

Grouping and Aggregation: γ_L(R). SQL: Select L (= A, B) from R group by A. A: grouping attributes. B: aggregated attributes.

Grouping and Aggregation: γ_L(R). It returns a relation built by partitioning the tuples of R into groups. Each group consists of all tuples that have one particular assignment of values to the grouping attributes A in L. L also contains B, the aggregated attributes, in the form: Aggregation operator → Name

Grouping and Aggregation Aggregation operators: AVG, SUM, COUNT, MIN, MAX. Grouping: the GROUP BY clause in SQL. A HAVING clause must follow a GROUP BY clause.

Grouping and Aggregation Grouping and aggregation are generally implemented together, so a single operator defines both. It is a generalized projection operator. The duplicate-elimination operator δ is a special case of the aggregation operator.

γ_{pnum, sum(qty) → sum}(SP). SQL: Select pnum, sum(qty) as sum from SP group by pnum;
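The grouping query over SP can be sketched in Python as follows. The rows in `SP` and the helper name `gamma_sum` are assumptions made for illustration; only the attribute names pnum and qty come from the slide.

```python
# Hypothetical contents of SP
SP = [
    {"snum": "S1", "pnum": "P1", "qty": 300},
    {"snum": "S2", "pnum": "P1", "qty": 100},
    {"snum": "S1", "pnum": "P2", "qty": 150},
]

def gamma_sum(relation, group_attr, sum_attr, out_name):
    """Group by group_attr and sum sum_attr, naming the result out_name."""
    totals = {}
    for t in relation:
        totals[t[group_attr]] = totals.get(t[group_attr], 0) + t[sum_attr]
    # One output tuple per group, in order of first appearance
    return [{group_attr: g, out_name: total} for g, total in totals.items()]

result = gamma_sum(SP, "pnum", "qty", "sum")
print(result)  # [{'pnum': 'P1', 'sum': 400}, {'pnum': 'P2', 'sum': 150}]
```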

Some rules on selection: 1. σ_{c1 and c2}(R) = σ_{c1}(σ_{c2}(R)) 2. σ_{c1 or c2}(R) = (σ_{c1}R) ∪_S (σ_{c2}R), where ∪_S is set union
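Rules 1 and 2 can be checked on a small relation. The sketch below uses Python sets, so the union in rule 2 is a set union; the sample relation, the conditions `c1` and `c2`, and the helper `select` are invented for illustration.

```python
# Hypothetical relation of (id, colour) tuples, held as a set
R = {(1, "red"), (2, "blue"), (3, "red"), (4, "green")}

def select(cond, relation):
    """Selection: keep the tuples that satisfy cond."""
    return {t for t in relation if cond(t)}

c1 = lambda t: t[0] > 1
c2 = lambda t: t[1] == "red"

# Rule 1: a conjunction can be split into a cascade of selections
rule1_lhs = select(lambda t: c1(t) and c2(t), R)
rule1_rhs = select(c1, select(c2, R))

# Rule 2: a disjunction can be split into a set union of selections
rule2_lhs = select(lambda t: c1(t) or c2(t), R)
rule2_rhs = select(c1, R) | select(c2, R)

print(rule1_lhs == rule1_rhs, rule2_lhs == rule2_rhs)  # True True
```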

Some Rules about Selection: 3. σ_c(R ∪ S) = (σ_c R) ∪ (σ_c S) 4. σ_c(R × S) = (σ_c R) × (σ_c S) 5. σ_c(R ⋈ S) = (σ_c R) ⋈ (σ_c S), whenever c applies to R or S

Rules about (generalized) Projection: 6. We may introduce a (generalized) projection anywhere in an expression tree, as long as it eliminates only attributes that are never used by any operator above

Some Rules about Projection: 7. π_L(R ⋈_c S) = π_L(π_M(R) ⋈_c π_N(S)) 8. π_L(R × S) = π_L(R) × π_L(S)

Query Optimization Example Select p.pname, p.pnum, sum(sp.qty) as sum from Parts p, Shipments sp where p.pnum = sp.pnum and p.weight > 10 group by p.pname, p.pnum having sum(sp.qty) >= 200;

Translating to Query Algebra
Step 1: P ⋈_{P.PNUM = SP.PNUM} SP = π_L(σ_{P.PNUM = SP.PNUM}(P × SP))
Step 2: σ_{P.WEIGHT > 10}(P ⋈_{P.PNUM = SP.PNUM} SP)
Step 3: σ_{SUM >= 200}(γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}(result of step 2))
Step 4: π_{P.PNAME, P.PNUM, SUM}(result of step 3)

Query Optimization Example
Select S.pname, S.pnum, sum(S.qty) as sum
From (Select *                                -- step 2
      From (Select p.*, sp.snum, sp.qty       -- step 1
            from Parts p, Shipments sp
            where p.pnum = sp.pnum) N
      where N.weight > 10) S
group by S.pname, S.pnum
having sum(S.qty) >= 200;

Query Optimization Example
Select SG.pname, SG.pnum, SG.sum                             -- step 4
From (Select *                                               -- step 3
      From (Select S.pname, S.pnum, sum(S.qty) as sum
            From (Select *
                  From (Select p.*, sp.snum, sp.qty
                        from Parts p, Shipments sp
                        where p.pnum = sp.pnum) N
                  where N.weight > 10) S
            group by S.pname, S.pnum) G
      where G.sum >= 200) SG

Expression tree (root at top; each operator is applied to the node(s) indented below it):
π_{P.PNAME, P.PNUM, SUM} (SKIP)
  σ_{SUM >= 200}
    γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}
      σ_{P.WEIGHT > 10}
        ⋈_{P.PNUM = SP.PNUM}
          P
          SP

Computing in QA
π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P ⋈_{P.PNUM = SP.PNUM} SP) ) ) )
=(5) π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} (SP) ) ) )


Computing in QA
π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P ⋈_{P.PNUM = SP.PNUM} SP) ) ) )
=(5) π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} σ_{P.WEIGHT > 10}(SP) ) ) )

Computing in QA
π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} σ_{P.WEIGHT > 10}(SP) ) ) )
=D π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} (σ_{P.WEIGHT > 10} = 1)(SP) ) ) )

Final Expression = π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} (SP) ) ) )

Expression tree of the final expression:
σ_{SUM >= 200}
  γ_{P.PNAME, P.PNUM, SUM(SP.QTY) → SUM}
    ⋈_{P.PNUM = SP.PNUM}
      σ_{p.weight}
        P
      SP

The final expression is computationally cheaper than the initial expression, because σ_{P.WEIGHT > 10}(P) is smaller than P.
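The saving from applying the weight selection before the join can be seen on a toy instance. The Python sketch below is only an illustration under stated assumptions: the rows of `parts` and `shipments` are invented, the join is a naive nested loop, and cost is measured simply as the number of tuple pairs the join compares.

```python
# Hypothetical sample data for Parts and Shipments
parts = [
    {"pnum": "P1", "pname": "Nut", "weight": 12},
    {"pnum": "P2", "pname": "Bolt", "weight": 17},
    {"pnum": "P3", "pname": "Screw", "weight": 5},
    {"pnum": "P4", "pname": "Cam", "weight": 8},
]
shipments = [
    {"snum": "S1", "pnum": "P1", "qty": 300},
    {"snum": "S2", "pnum": "P1", "qty": 100},
    {"snum": "S1", "pnum": "P2", "qty": 150},
    {"snum": "S1", "pnum": "P3", "qty": 400},
    {"snum": "S2", "pnum": "P4", "qty": 250},
]

def join(p_rel, sp_rel):
    """Nested-loop join on pnum; returns (joined rows, tuple pairs compared)."""
    rows = [{**p, **sp} for p in p_rel for sp in sp_rel if p["pnum"] == sp["pnum"]]
    return rows, len(p_rel) * len(sp_rel)

def group_and_filter(rows):
    """Sum qty per (pname, pnum) and keep groups whose sum is at least 200."""
    totals = {}
    for r in rows:
        key = (r["pname"], r["pnum"])
        totals[key] = totals.get(key, 0) + r["qty"]
    return {k: v for k, v in totals.items() if v >= 200}

# Initial plan: join first, then select on weight
joined, cost_initial = join(parts, shipments)
result_initial = group_and_filter([r for r in joined if r["weight"] > 10])

# Final plan: select on weight first, then join
heavy_parts = [p for p in parts if p["weight"] > 10]
joined, cost_final = join(heavy_parts, shipments)
result_final = group_and_filter(joined)

print(result_initial == result_final)  # True
print(cost_initial, cost_final)        # 20 10
```

On this data both plans return the same answer, while the join in the final plan compares half as many tuple pairs. The actual ratio depends on how selective the weight condition is.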

Final Expression = π_{P.PNAME, P.PNUM, SUM}( σ_{SUM >= 200}( σ_{P.WEIGHT > 10}(P) ⋈_{P.PNUM = SP.PNUM} γ_{SP.PNUM, SUM(SP.QTY) → SUM}(SP) ) )

Expression tree with the grouping moved below the join:
π_{P.PNAME, P.PNUM, SUM}
  σ_{SUM >= 200}
    ⋈_{P.PNUM = SP.PNUM}
      σ_{p.weight}
        P
      γ_{SP.PNUM, SUM(SP.QTY) → SUM}
        SP

Expression tree with σ_{SUM >= 200} also moved below the join:
π_{P.PNAME, P.PNUM, SUM}
  ⋈_{P.PNUM = SP.PNUM}
    σ_{p.weight}
      P
    σ_{SUM >= 200}
      γ_{SP.PNUM, SUM(SP.QTY) → SUM}
        SP

Computing in QA (HW) select employee.ssn, employee.lname from department, employee where department.dname = 'administration' and employee.dno = department.dnumber;

Computing in QA select S.ssn, S.lname From (Select * From (Select E.*, D.dname, D.mgrprikey, D.mgrssn, D.mgrstartdate from department D, employee E where E.dno = D.dnumber) N where N.dname = 'administration') S

Computing in QA select E.ssn, E.lname, E.bdate from (Select * From employee X where X.address like '%bellaire%') E


EXPRESSION TREES An expression tree is generated by combining several query algebra operators into one expression, applying one operator to the result(s) of one or more other operators. The leaves of the tree are names of relations. Interior nodes are operators, each applied to the relation(s) represented by its child or children.
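One possible way to represent and evaluate such a tree is sketched below in Python. The class names `Leaf` and `Node`, the `evaluate` function, and the toy relations are all assumptions made for illustration; a real query processor would attach far more information to each node.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Leaf:
    name: str            # name of a stored relation

@dataclass
class Node:
    op: Callable         # operator applied to the results of the children
    children: Tuple      # one child for unary operators, two for binary ones

def evaluate(tree, database):
    """Evaluate bottom-up: leaves look up relations, interior nodes apply operators."""
    if isinstance(tree, Leaf):
        return database[tree.name]
    args = [evaluate(child, database) for child in tree.children]
    return tree.op(*args)

# Toy relations holding single integers, and two toy operators
database = {"R": [1, 2, 3, 4], "S": [3, 4, 5]}
select_even = lambda r: [x for x in r if x % 2 == 0]   # a unary operator
intersect = lambda r, s: [x for x in r if x in s]      # a binary operator

tree = Node(select_even, (Node(intersect, (Leaf("R"), Leaf("S"))),))
print(evaluate(tree, database))  # [4]
```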