Example. Bulk Nested-Loop Joins Using Buffers: e.g. 22 blocks.


Sort Merge Joins Notes From Winter 2015

Visualizations
Sailors: Page 1 (Record 1, Record 2, Record 3, Record 4, Record 5), Page 2, Page 3, Page 4
Reserves

Simple Nested Loops Join
Key idea: Take each record of S and match it with each record of R.
Steps:
1. Get tuple of S.
2. Iterate through each tuple in R.
Sailors: (name = Bob, sid = 1)
Reserves: (sid = 3, bid = 6) (sid = 1, bid = 4) (sid = 1, bid = 7)
Output: (name = Bob, sid = 1, bid = 4) (name = Bob, sid = 1, bid = 7)
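
The two steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: tuples are modeled as plain dicts, the function name `simple_nested_loops_join` is made up for this example, and everything is held in memory, so page I/O is not modeled.

```python
# Toy data mirroring the slide; representing tuples as dicts is an assumption.
sailors = [
    {"name": "Bob", "sid": 1},
]
reserves = [
    {"sid": 3, "bid": 6},
    {"sid": 1, "bid": 4},
    {"sid": 1, "bid": 7},
]


def simple_nested_loops_join(outer, inner, key):
    """Compare every outer tuple with every inner tuple on `key`."""
    output = []
    for s in outer:          # step 1: get a tuple of S
        for r in inner:      # step 2: iterate through each tuple in R
            if s[key] == r[key]:
                # Merge the two tuples; the join column appears once.
                output.append({**s, **r})
    return output


result = simple_nested_loops_join(sailors, reserves, "sid")
for row in result:
    print(row)
```

The inner relation is scanned once per outer tuple, which is why this join is the baseline that the other algorithms try to beat.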

Sort-Merge Join
Key idea: Sort S and R on join column, then merge them!
Steps:
1. Sort S and R.
2. “Zip” or merge.
I/Os:
Sailors (unsorted): (name = Bob, sid = 1) (name = Sam, sid = 3) (name = Sue, sid = 7) (name = Jill, sid = 2) (name = Joe, sid = 12) (name = Sue, sid = 8) (name = Yue, sid = 4)
Sailors (sorted on sid): (name = Bob, sid = 1) (name = Jill, sid = 2) (name = Sam, sid = 3) (name = Yue, sid = 4) (name = Sue, sid = 7) (name = Sue, sid = 8) (name = Joe, sid = 12)...
Reserves (sorted on sid): (sid = 1, bid = 4) (sid = 1, bid = 7) (sid = 3, bid = 6) (sid = 4, bid = 3) (sid = 8, bid = 1) (sid = 8, bid = 13) (sid = 8, bid = 15) (sid = 12, bid = 1)...
Output: (name = Bob, sid = 1, bid = 4) (name = Bob, sid = 1, bid = 7) (name = Sam, sid = 3, bid = 6)...
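
A minimal in-memory sketch of the sort and "zip" steps, assuming tuples are dicts and using only the tuples shown on the slide (the elided "..." rows are left out). The function name `sort_merge_join` is hypothetical, and Python's built-in `sorted` stands in for the external sort a real system would use.

```python
def sort_merge_join(left, right, key):
    """Sort both inputs on `key`, then merge them with two cursors."""
    left = sorted(left, key=lambda t: t[key])     # step 1: sort S
    right = sorted(right, key=lambda t: t[key])   # step 1: sort R
    output = []
    i = j = 0
    while i < len(left) and j < len(right):       # step 2: "zip" or merge
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1                                # left key has no partner yet
        elif lk > rk:
            j += 1                                # right key has no partner yet
        else:
            # Find the end of the group of right tuples sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left tuple with this key with the whole right group.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    output.append({**left[i], **r})
                i += 1
            j = j_end
    return output


sailors = [
    {"name": "Bob", "sid": 1}, {"name": "Sam", "sid": 3},
    {"name": "Sue", "sid": 7}, {"name": "Jill", "sid": 2},
    {"name": "Joe", "sid": 12}, {"name": "Sue", "sid": 8},
    {"name": "Yue", "sid": 4},
]
reserves = [
    {"sid": 1, "bid": 4}, {"sid": 1, "bid": 7}, {"sid": 3, "bid": 6},
    {"sid": 4, "bid": 3}, {"sid": 8, "bid": 1}, {"sid": 8, "bid": 13},
    {"sid": 8, "bid": 15}, {"sid": 12, "bid": 1},
]

joined = sort_merge_join(sailors, reserves, "sid")
for row in joined:
    print(row)
```

The group handling matters when both sides repeat a key: each left tuple in the group has to be paired with every right tuple in the matching group, not just the one under the cursor.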

Optimizing Sort-Merge Join
Key idea: Internal Sort on both. Perform merge on all runs!
Steps:
1. Internal sort S and R. (Pass 0)
2. Merge all runs.
Sailors, sorted runs after Pass 0:
Run 1: (name = Bob, sid = 1) (name = Jill, sid = 2) (name = Yue, sid = 4) (name = Sue, sid = 8) (name = Jack, sid = 18) (name = Cat, sid = 22)...
Run 2: (name = Sam, sid = 3) (name = Sue, sid = 7) (name = Joe, sid = 12)...
Reserves, sorted runs after Pass 0:
Run 1: (sid = 1, bid = 4) (sid = 1, bid = 7) (sid = 4, bid = 3) (sid = 8, bid = 1) (sid = 8, bid = 13) (sid = 12, bid = 1)...
Run 2: (sid = 3, bid = 6) (sid = 8, bid = 15)...
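
One way to picture "merge all runs" is to merge the Pass 0 runs of each relation lazily and join the two sorted streams as they are produced, without first writing out fully sorted copies. The sketch below assumes in-memory lists stand in for runs on disk; `merge_runs` and `join_sorted_runs` are names invented for this example, while `heapq.merge` and `itertools.groupby` are Python standard library functions.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_runs(runs, key):
    """Lazily merge already-sorted runs into one sorted stream."""
    return heapq.merge(*runs, key=itemgetter(key))


def join_sorted_runs(left_runs, right_runs, key):
    """Join two relations, each given as a list of sorted runs."""
    # groupby yields (key value, iterator over tuples with that key).
    left_groups = groupby(merge_runs(left_runs, key), key=itemgetter(key))
    right_groups = groupby(merge_runs(right_runs, key), key=itemgetter(key))
    output = []
    lg = next(left_groups, None)
    rg = next(right_groups, None)
    while lg is not None and rg is not None:
        (lk, ltuples), (rk, rtuples) = lg, rg
        if lk < rk:
            lg = next(left_groups, None)
        elif lk > rk:
            rg = next(right_groups, None)
        else:
            # Materialize the right group so it can be reused per left tuple.
            rlist = list(rtuples)
            for l in ltuples:
                for r in rlist:
                    output.append({**l, **r})
            lg = next(left_groups, None)
            rg = next(right_groups, None)
    return output


# Runs as shown on the slide (elided "..." rows left out).
sailor_runs = [
    [{"name": "Bob", "sid": 1}, {"name": "Jill", "sid": 2},
     {"name": "Yue", "sid": 4}, {"name": "Sue", "sid": 8},
     {"name": "Jack", "sid": 18}, {"name": "Cat", "sid": 22}],
    [{"name": "Sam", "sid": 3}, {"name": "Sue", "sid": 7},
     {"name": "Joe", "sid": 12}],
]
reserve_runs = [
    [{"sid": 1, "bid": 4}, {"sid": 1, "bid": 7}, {"sid": 4, "bid": 3},
     {"sid": 8, "bid": 1}, {"sid": 8, "bid": 13}, {"sid": 12, "bid": 1}],
    [{"sid": 3, "bid": 6}, {"sid": 8, "bid": 15}],
]

joined_runs = join_sorted_runs(sailor_runs, reserve_runs, "sid")
for row in joined_runs:
    print(row)
```

In a real system this only works if there is one buffer page available per run of both relations at the same time; that memory requirement is not modeled here.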

External Sorting
Classic interview question: how do you sort data that doesn’t fit in memory?

External Sorting: 2-Way Merge Sort

What is a sorted run?
Run 1:
Page 1: (name = Bob, sid = 1) (name = Jill, sid = 2) (name = Sam, sid = 3)
Page 2: (name = Sue, sid = 6) (name = Kev, sid = 8) (name = Jack, sid = 9)
Page 3: (name = Joe, sid = 10) (name = Sid, sid = 12) (name = Sal, sid = 15)
Run 2:
Page 1: (name = Bit, sid = 1) (name = Bat, sid = 2) (name = Tam, sid = 3)
Page 2: (name = Foo, sid = 6) (name = Bar, sid = 8) (name = Bam, sid = 9)
Page 3: (name = Ke, sid = 10) (name = Kay, sid = 12) (name = Al, sid = 15)
A sorted run is a sorted subset of a table. Above, each page holds 3 tuples, and there are two sorted runs, each 3 pages long. (We measure the length of a sorted run by the number of pages it spans.)

2-Way Merge Sort
Given N pages of tuples to sort, B = 3 case.
Pass 0: Quicksort a single page.
2-way merge sort: how many passes?
– 1 + ceil(log_2(N))
– 1 for pass 0, plus ceil(log_2(N)) because pass 0 produces N runs and each later pass merges runs 2 at a time
– ceil because the number of runs after pass 0 might not be a power of two, but we still need to keep merging until we're down to 1 run
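
As a quick sanity check of the pass count, the sketch below compares the formula with a direct simulation that halves the number of runs (rounding up) on every merge pass. Both function names are made up for this example, and only run counts are tracked, not actual pages.

```python
import math


def two_way_passes_formula(n_pages):
    """Pass count from the slide: 1 + ceil(log_2(N))."""
    return 1 + math.ceil(math.log2(n_pages))


def two_way_passes_simulated(n_pages):
    """Count passes by simulating the merges."""
    runs = n_pages          # pass 0 sorts each page on its own: N runs
    passes = 1
    while runs > 1:
        # Each merge pass combines runs two at a time; an odd run carries over.
        runs = (runs + 1) // 2
        passes += 1
    return passes


for n in (1, 4, 7, 8):
    print(n, two_way_passes_formula(n), two_way_passes_simulated(n))
```

For N = 7 the run counts go 7, 4, 2, 1, so three merge passes are needed after pass 0, the same as for N = 8; that is the effect of the ceiling.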

Generalized Sort Merge