Optimization of Relational Algebra Expressions Database I.

Moore's law Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years. The capabilities of many digital electronic devices are strongly linked to Moore's law: processing speed, memory capacity, sensors and even the number and size of pixels in digital cameras.

Our objective -- 1 In contrast, other hardware characteristics improve only linearly. One example is the speed at which the disk moves in hard disk drives. For large data sets, the amount of data moved between the primary and the secondary storage devices should therefore be minimized.

Our objective -- 2 Data is moved between the main memory and the hard disk in blocks. In RDBMS computations, the number of blocks involved in I/O operations should therefore be kept as low as possible. This can be achieved by keeping the transitory (intermediate) relations as small as possible.

Equivalence of relational algebra expressions To reduce the size of the transitory relations, the relational algebra expressions are rewritten into equivalent ones. Two relational algebra queries q and q' are equivalent if q(I) = q'(I) for every database instance I. Here: – a database instance consists of relations – q(I) denotes the result of applying q to I.

Example Consider the relations – Likes(drinker, beer) – Frequents(drinker, bar), abbreviated as L and F. The following two expressions are equivalent: – Π_{bar}(σ_{F.drinker=L.drinker ∧ beer=Bud}(F × L)) – Π_{bar}(σ_{F.drinker=L.drinker}(F × σ_{beer=Bud}(L))). In most cases the second expression can be evaluated faster, because the selection on beer shrinks L before the cross product is formed.
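The example above can be tried on a small instance. The following is a minimal sketch, assuming relations are stored as Python lists of tuples; the sample tuples and the function names plan_late_selection and plan_early_selection are invented for illustration. It evaluates both expressions and reports the size of the cross product each of them builds.

```python
from itertools import product

# Hypothetical sample instance; the tuples are made up for illustration.
frequents = [("ann", "Joe's"), ("bob", "Joe's"), ("bob", "Sue's"), ("cat", "Sue's")]  # (drinker, bar)
likes = [("ann", "Bud"), ("bob", "Miller"), ("cat", "Bud"), ("cat", "Miller")]        # (drinker, beer)

def plan_late_selection(F, L):
    """Pi_bar(sigma_{F.drinker=L.drinker AND beer=Bud}(F x L))."""
    cross = list(product(F, L))  # full cross product is materialized first
    bars = {f[1] for f, l in cross if f[0] == l[0] and l[1] == "Bud"}
    return bars, len(cross)

def plan_early_selection(F, L):
    """Pi_bar(sigma_{F.drinker=L.drinker}(F x sigma_{beer=Bud}(L)))."""
    bud = [l for l in L if l[1] == "Bud"]  # selection on beer is applied before the product
    cross = list(product(F, bud))
    bars = {f[1] for f, l in cross if f[0] == l[0]}
    return bars, len(cross)

late_result, late_size = plan_late_selection(frequents, likes)
early_result, early_size = plan_early_selection(frequents, likes)

print("same result:", late_result == early_result)
print("cross product sizes:", late_size, "vs", early_size)
```

On this instance both plans return the same set of bars, while the second plan builds a cross product half the size of the first one. Agreement on one instance only illustrates the equivalence; it does not prove it.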

Optimization algorithm -- sketch The original relational algebra expression is rewritten into an equivalent one in which – the selection operations are performed as early as possible – the unnecessary columns are then removed. Next, each selection and the cross product directly below it are replaced with the appropriate join operator.

Our running example Likes(drinker, beer), Bar(name, city), Frequents(drinker, bar), abbreviated as L, B and F. Π_{L.drinker}(σ_{L.drinker=F.drinker ∧ name=bar ∧ beer=Bud ∧ city=N.Y.}(L × B × F))

Splitting the conditions σ_{C1 ∧ C2}(E) is equivalent to σ_{C1}(σ_{C2}(E)). Hence Π_{L.drinker}(σ_{L.drinker=F.drinker ∧ name=bar ∧ beer=Bud ∧ city=N.Y.}(L × B × F)) is equivalent to Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}(σ_{beer=Bud}(σ_{city=N.Y.}(L × B × F)))))
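The splitting rule can be checked on sample data. This is a minimal sketch, assuming the tuples of L × B × F are represented as Python dictionaries; the helper select and the sample rows are invented for illustration.

```python
def select(cond, rows):
    """sigma_cond(rows): keep the rows that satisfy cond."""
    return [r for r in rows if cond(r)]

# Hypothetical tuples of L x B x F, invented for illustration.
rows = [
    {"L.drinker": "ann", "beer": "Bud", "name": "Joe's", "city": "N.Y.", "F.drinker": "ann", "bar": "Joe's"},
    {"L.drinker": "ann", "beer": "Bud", "name": "Joe's", "city": "N.Y.", "F.drinker": "bob", "bar": "Joe's"},
    {"L.drinker": "bob", "beer": "Miller", "name": "Joe's", "city": "N.Y.", "F.drinker": "bob", "bar": "Joe's"},
    {"L.drinker": "cat", "beer": "Bud", "name": "Sue's", "city": "Boston", "F.drinker": "cat", "bar": "Sue's"},
]

# The four conjuncts of the running example, outermost first.
conditions = [
    lambda r: r["L.drinker"] == r["F.drinker"],
    lambda r: r["name"] == r["bar"],
    lambda r: r["beer"] == "Bud",
    lambda r: r["city"] == "N.Y.",
]

# One selection with the whole conjunction.
conjunctive = select(lambda r: all(c(r) for c in conditions), rows)

# Cascade of single-condition selections; the innermost one (city) runs first.
cascade = rows
for cond in reversed(conditions):
    cascade = select(cond, cascade)

print(conjunctive == cascade)
```

Both forms keep the same tuples here. The benefit of the cascade is that each single condition can later be moved independently of the others.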

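Expression trees [Figure: two expression trees for the running example, each with root Π_{L.drinker} and leaves L, B, F combined by × operators. One tree has the single selection σ_{L.drinker=F.drinker ∧ name=bar ∧ beer=Bud ∧ city=N.Y.}; the other has the cascade σ_{L.drinker=F.drinker}, σ_{name=bar}, σ_{beer=Bud}, σ_{city=N.Y.}.]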

Pulling down the conditions σ_C(E1 Θ E2) ≡ (σ_C(E1)) Θ E2, where attr(C) ⊆ attr(E1) and Θ ∈ {×, ⋈}. Here – attr(C) and attr(E1) denote the attributes appearing in condition C and in relational algebra expression E1, respectively – ≡ denotes the equivalence relation. Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}(σ_{beer=Bud}(σ_{city=N.Y.}(L × B × F))))) is equivalent to Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}((σ_{beer=Bud}(L)) × (σ_{city=N.Y.}(B)) × F)))
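The effect of pulling a condition down can be sketched as follows, assuming relations are lists of dictionaries keyed by attribute name. The helpers cross and select and the sample tuples are invented for illustration.

```python
from itertools import product

def cross(r1, r2):
    """r1 x r2: every pair of tuples merged into one tuple."""
    return [{**a, **b} for a, b in product(r1, r2)]

def select(cond, rows):
    """sigma_cond(rows)."""
    return [r for r in rows if cond(r)]

# Hypothetical instances of L and B, invented for illustration.
L = [
    {"L.drinker": "ann", "beer": "Bud"},
    {"L.drinker": "bob", "beer": "Miller"},
    {"L.drinker": "cat", "beer": "Bud"},
]
B = [
    {"name": "Joe's", "city": "N.Y."},
    {"name": "Sue's", "city": "Boston"},
]

# attr(C) = {beer} is a subset of attr(L), so the rule applies with E1 = L.
is_bud = lambda r: r["beer"] == "Bud"

late = select(is_bud, cross(L, B))    # sigma_C(L x B)
early = cross(select(is_bud, L), B)   # (sigma_C(L)) x B

print("same result:", late == early)
print("tuples fed into the selection:", len(cross(L, B)), "vs", len(L))
```

In this sketch the late selection has to filter all six tuples of L × B, while the early one filters only the three tuples of L and then builds a smaller product.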

[Figure: expression trees before and after pulling down the conditions. In the second tree σ_{beer=Bud} sits directly above L and σ_{city=N.Y.} directly above B, while σ_{L.drinker=F.drinker} and σ_{name=bar} remain above the cross products.]

Removal of the unnecessary columns Π_X(E1 Θ E2) ≡ Π_Y(E1) Θ Π_Z(E2), where X = Y ∪ Z, Y ⊆ attr(E1), Z ⊆ attr(E2) and Θ ∈ {×, ⋈}; for ⋈, both Y and Z must contain the common attributes of E1 and E2. Π_X(σ_C(E)) ≡ Π_X(σ_C(Π_Y(E))), where Y = attr(C) ∪ X. Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}((σ_{beer=Bud}(L)) × (σ_{city=N.Y.}(B)) × F))) is equivalent to Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}((Π_{drinker}(σ_{beer=Bud}(L))) × (Π_{name}(σ_{city=N.Y.}(B))) × F)))
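Both projection rules can be checked on a small instance. This is a minimal sketch under set semantics, assuming relations are lists of dictionaries; the helpers cross, select, project and as_set and the sample tuples are invented for illustration.

```python
from itertools import product

def cross(r1, r2):
    """r1 x r2."""
    return [{**a, **b} for a, b in product(r1, r2)]

def select(cond, rows):
    """sigma_cond(rows)."""
    return [r for r in rows if cond(r)]

def project(attrs, rows):
    """Pi_attrs(rows) under set semantics: duplicate tuples are removed."""
    distinct = {tuple(r[a] for a in attrs) for r in rows}
    return [dict(zip(attrs, values)) for values in sorted(distinct)]

def as_set(rows):
    """Order-independent view of a relation, used only for comparison."""
    return {frozenset(r.items()) for r in rows}

# Hypothetical instances, invented for illustration.
L = [
    {"drinker": "ann", "beer": "Bud"},
    {"drinker": "ann", "beer": "Miller"},
    {"drinker": "bob", "beer": "Bud"},
]
B = [{"name": "Joe's", "city": "N.Y."}, {"name": "Sue's", "city": "N.Y."}]

# Rule 1 with X = {drinker, name}, Y = {drinker}, Z = {name}, Theta = x.
lhs1 = project(["drinker", "name"], cross(L, B))
rhs1 = cross(project(["drinker"], L), project(["name"], B))
rule1_holds = as_set(lhs1) == as_set(rhs1)

# Rule 2 with X = {drinker}, C: beer = Bud, hence Y = {beer, drinker}.
is_bud = lambda r: r["beer"] == "Bud"
E = cross(L, B)
narrow = project(["beer", "drinker"], E)   # Pi_Y(E) drops name and city
lhs2 = project(["drinker"], select(is_bud, E))
rhs2 = project(["drinker"], select(is_bud, narrow))
rule2_holds = as_set(lhs2) == as_set(rhs2)

print(rule1_holds, rule2_holds, len(E), len(narrow))
```

Here the inner projection Π_Y(E) both narrows the tuples and, under set semantics, reduces their number from six to three.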

[Figure: expression trees before and after inserting the projections Π_{drinker} above σ_{beer=Bud}(L) and Π_{name} above σ_{city=N.Y.}(B).] Note: applying the extra projections adds to the evaluation time of the query, hence this rewriting step can be omitted.

Substitution with joins By definition, – E1 ⋈_C E2 ≡ σ_C(E1 × E2) – E1 ⋈ E2 ≡ Π_L(σ_C(E1 × E2)), where condition C equates the common attributes of E1 and E2, and each of these common attributes occurs only once in the attribute list L. Π_{L.drinker}(σ_{L.drinker=F.drinker}(σ_{name=bar}((Π_{drinker}(σ_{beer=Bud}(L))) × (Π_{name}(σ_{city=N.Y.}(B))) × F))) is equivalent to Π_{L.drinker}((Π_{drinker}(σ_{beer=Bud}(L))) ⋈ ((Π_{name}(σ_{city=N.Y.}(B))) ⋈_{name=bar} F))
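The two join definitions, and the final form of the running example, can be sketched in Python as follows. All helper names and the sample tuples are assumptions made for illustration. The sketch evaluates the original expression and the join-based expression on the same instance and compares the results.

```python
from itertools import product

# Hypothetical instance, invented for illustration.
likes = [("ann", "Bud"), ("bob", "Bud"), ("cat", "Miller")]         # (drinker, beer)
bars = [("Joe's", "N.Y."), ("Sue's", "Boston")]                     # (name, city)
frequents = [("ann", "Joe's"), ("bob", "Sue's"), ("cat", "Joe's")]  # (drinker, bar)

def rows(schema, tuples):
    return [dict(zip(schema, t)) for t in tuples]

def cross(r1, r2):
    return [{**a, **b} for a, b in product(r1, r2)]

def select(cond, r):
    return [t for t in r if cond(t)]

def project(attrs, r):
    distinct = {tuple(t[a] for a in attrs) for t in r}
    return [dict(zip(attrs, v)) for v in sorted(distinct)]

def theta_join(cond, r1, r2):
    """E1 join_C E2, defined as sigma_C(E1 x E2)."""
    return select(cond, cross(r1, r2))

def natural_join(r1, r2):
    """E1 join E2: equate the common attributes and keep each of them once."""
    return [{**a, **b} for a, b in product(r1, r2)
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def values(r):
    return sorted(tuple(t.values()) for t in r)

# Original expression: attributes are qualified so that L x B x F has no name clash.
L = rows(("L.drinker", "beer"), likes)
B = rows(("name", "city"), bars)
F = rows(("F.drinker", "bar"), frequents)
original = project(["L.drinker"], select(
    lambda t: t["L.drinker"] == t["F.drinker"] and t["name"] == t["bar"]
    and t["beer"] == "Bud" and t["city"] == "N.Y.",
    cross(cross(L, B), F)))

# Join-based expression: L and F share the attribute name drinker.
L2 = rows(("drinker", "beer"), likes)
F2 = rows(("drinker", "bar"), frequents)
bud_drinkers = project(["drinker"], select(lambda t: t["beer"] == "Bud", L2))
ny_bars = project(["name"], select(lambda t: t["city"] == "N.Y.", B))
rewritten = project(["drinker"], natural_join(
    bud_drinkers, theta_join(lambda t: t["name"] == t["bar"], ny_bars, F2)))

print(values(original) == values(rewritten))
```

The theta_join helper still builds the cross product internally, so it only mirrors the definition. A real system would use a dedicated join algorithm, which is the point of the substitution.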

[Figure: expression trees before and after the substitution with joins. In the second tree the root Π_{L.drinker} is above a natural join ⋈ whose operands are Π_{drinker}(σ_{beer=Bud}(L)) and the join ⋈_{name=bar} of Π_{name}(σ_{city=N.Y.}(B)) and F.]

Commutativity and associativity Commutativity: E1 Θ E2 ≡ E2 Θ E1, where Θ ∈ {×, ⋈, ⋈_C}. Associativity: (E1 Θ E2) Θ E3 ≡ E1 Θ (E2 Θ E3), where Θ ∈ {×, ⋈}. Note: in general, (E1 ⋈_{C1} E2) ⋈_{C2} E3 is not equivalent to E1 ⋈_{C1} (E2 ⋈_{C2} E3). Why?
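One way to explore these rules is to try them on a tiny instance. The sketch below assumes relations are lists of dictionaries; the relations E1, E2, E3, T1, T2, T3 and all helper names are invented for illustration. It checks commutativity and associativity of the natural join, and then shows one situation that bears on the note: a condition C2 that mentions an attribute of E1 cannot be evaluated inside E2 ⋈_{C2} E3.

```python
from itertools import product

def natural_join(r1, r2):
    return [{**a, **b} for a, b in product(r1, r2)
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

def theta_join(cond, r1, r2):
    return [{**a, **b} for a, b in product(r1, r2) if cond({**a, **b})]

def as_set(r):
    """Order-independent view of a relation; tuples are sets of (attribute, value) pairs."""
    return {frozenset(t.items()) for t in r}

# Hypothetical relations, invented for illustration.
E1 = [{"a": 1, "b": 1}, {"a": 2, "b": 2}]
E2 = [{"b": 1, "c": 5}, {"b": 2, "c": 6}]
E3 = [{"c": 5, "d": 1}, {"c": 6, "d": 9}]

commutes = as_set(natural_join(E1, E2)) == as_set(natural_join(E2, E1))
associates = (as_set(natural_join(natural_join(E1, E2), E3))
              == as_set(natural_join(E1, natural_join(E2, E3))))

# Theta joins: C2 refers to attribute a, which belongs to T1 only.
T1, T2, T3 = [{"a": 1}], [{"b": 1}], [{"c": 1}]
c1 = lambda t: t["a"] == t["b"]
c2 = lambda t: t["a"] == t["c"]

left_nested = theta_join(c2, theta_join(c1, T1, T2), T3)   # well defined
try:
    theta_join(c1, T1, theta_join(c2, T2, T3))             # C2 sees only b and c
    right_nesting_defined = True
except KeyError:
    right_nesting_defined = False

print(commutes, associates, len(left_nested), right_nesting_defined)
```

In this sketch the right-nested form fails because attribute a is not available when C2 is evaluated on tuples of T2 × T3. The reordering is safe when C2 mentions attributes of E2 and E3 only.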

Disjunctions in the conditions Disjunctions in the conditions of selection operators may complicate the situation. As a first attempt, one may use the equivalence rule σ_{C1 ∨ C2}(E) ≡ σ_{C1}(E) ∪ σ_{C2}(E) and then apply the previous algorithm to σ_{C1}(E) and σ_{C2}(E). However, in this case the relations appearing in E may be scanned twice, which is costly.
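The cost remark can be illustrated with a small sketch. It assumes a relation is a list of tuples and counts the tuples read by each scan; the helper scan_select, the two conditions and the sample tuples are invented for illustration.

```python
# Hypothetical instance of Likes(drinker, beer), invented for illustration.
likes = [("ann", "Bud"), ("bob", "Miller"), ("cat", "Bud"), ("dan", "Coors")]

def scan_select(cond, relation):
    """Return the selected tuples as a set and the number of tuples read by the scan."""
    return {t for t in relation if cond(t)}, len(relation)

c1 = lambda t: t[1] == "Bud"   # C1: beer = Bud
c2 = lambda t: t[0] == "bob"   # C2: drinker = bob

# sigma_{C1 or C2}(E): one scan of the relation.
single, single_reads = scan_select(lambda t: c1(t) or c2(t), likes)

# sigma_{C1}(E) union sigma_{C2}(E): two scans of the same relation.
part1, reads1 = scan_select(c1, likes)
part2, reads2 = scan_select(c2, likes)
union = part1 | part2
union_reads = reads1 + reads2

print(single == union, single_reads, union_reads)
```

Both forms return the same tuples, but the union form reads the relation twice, which matches the concern stated above.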