CS505: Intermediate Topics in Database Systems


CS505: Intermediate Topics in Database Systems (Instructor: Jinze Liu)

Undo log, Redo log, Undo & Redo log

Checkpointing
Where does recovery start?
- Naïve approach: stop accepting new transactions (lame!), finish all active transactions, then take a database dump
- Fuzzy checkpointing:
  - Determine S, the set of currently active transactions, and log <begin-checkpoint S>
  - Flush all blocks (dirty at the time of the checkpoint) at your leisure
  - Log <end-checkpoint begin-checkpoint_location>
  - Between begin and end, continue processing old and new transactions

An Example of CKPT
Log records, in order:
<Start T1>
<T1, A, 4, 5>
<Commit T1>
<T2, B, 9, 10>
<Start CKPT(T2)>
<T2, C, 14, 15>
<Start T3>
<T3, D, 19, 20>
<End CKPT>
<Commit T2>
<Commit T3>

Recovery: analysis and redo phase
Need to determine U, the set of transactions active at the time of the crash.
- Scan the log backward to find the last end-checkpoint record, and follow its pointer to find the corresponding <begin-checkpoint S>
- Initially, let U be S
- Scan forward from that begin-checkpoint to the end of the log:
  - For a log record <T, start>, add T to U
  - For a log record <T, commit | abort>, remove T from U
  - For a log record <T, X, old, new>, issue write(X, new)
Basically repeats history!
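
A minimal Python sketch of the analysis-and-redo scan described above. The record shapes and the write helper are hypothetical stand-ins for whatever log and buffer interfaces the system actually provides.

```python
# Hypothetical sketch of the analysis + redo phase (repeating history).
# `log` is assumed to be a list of already-parsed records such as
# ("start", T), ("commit", T), ("abort", T), ("update", T, X, old, new),
# ("begin-checkpoint", S), ("end-checkpoint", begin_pos).

def analysis_and_redo(log, write):
    # Analysis: find the last end-checkpoint and its matching begin-checkpoint.
    begin_pos = 0
    active = set()                            # U, transactions active at crash time
    for pos in range(len(log) - 1, -1, -1):
        if log[pos][0] == "end-checkpoint":
            begin_pos = log[pos][1]           # pointer back to <begin-checkpoint S>
            active = set(log[begin_pos][1])   # initially U = S
            break

    # Redo: scan forward from the begin-checkpoint, repeating history.
    for rec in log[begin_pos:]:
        kind = rec[0]
        if kind == "start":
            active.add(rec[1])
        elif kind in ("commit", "abort"):
            active.discard(rec[1])
        elif kind == "update":
            _, T, X, old, new = rec
            write(X, new)                     # reapply the new value
    return active                             # U: handed to the undo phase
```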

Recovery: undo phase
Scan the log backward and undo the effects of transactions in U:
- For each log record <T, X, old, new> where T is in U, issue write(X, old), and log this operation too (part of the repeating-history paradigm)
- Log <T, abort> when all effects of T have been undone
An optimization: each log record stores a pointer to the previous log record for the same transaction; follow the pointer chain during undo.
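
A hypothetical continuation of the sketch above covering the undo phase; `log`, `write`, and the record shapes are the same assumptions as before, and `append_log` stands in for appending a compensation or abort record.

```python
def undo(log, active, write, append_log):
    # Count how many updates each loser transaction still has to undo.
    remaining = {}
    for rec in log:
        if rec[0] == "update" and rec[1] in active:
            remaining[rec[1]] = remaining.get(rec[1], 0) + 1

    # Scan backward, restoring old values for transactions in U.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in active:
            _, T, X, old, new = rec
            write(X, old)                            # restore the old value
            append_log(("update", T, X, new, old))   # log the undo itself
            remaining[T] -= 1
            if remaining[T] == 0:                    # all effects of T undone
                append_log(("abort", T))
```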

Overview
- Many different ways of processing the same query: scan? sort? hash? use an index?
- All have different performance characteristics and/or make different assumptions about the data
- Best choice depends on the situation
- Implement all alternatives and let the query optimizer choose at run time

Notation
- Relations: R, S
- Tuples: r, s
- Number of tuples: |R|, |S|
- Number of disk blocks: B(R), B(S)
- Number of memory blocks available: M
Cost metrics: number of I/O's, and memory requirement

Table Scan
Scan table R and process the query:
- Selection over R
- Projection of R without duplicate elimination
[Diagram: pages of R are read from disk into a main-memory input buffer; results leave through an output buffer back to disk]

Table scan
- I/O's: B(R)
- Memory requirement: 2 (+1 for double buffering)
- Trick for selection: stop early if it is a lookup by key
- Not counting the cost of writing the result out
  - Same for any algorithm!
  - Maybe not needed; results may be pipelined into another operator

Nested Loop Join: R ⋈_p S
For each block of R, and for each r in the block:
  For each block of S, and for each s in the block:
    Output rs if p evaluates to true over r and s
R is called the outer table; S is called the inner table.
[Diagram: one input buffer for R, one input buffer for S, and one output buffer in main memory, with both tables on disk]

Nested-loop join
- I/O's: B(R) + |R| × B(S)
- Memory requirement: 3 (+1 for double buffering)
Improvement: block-based nested-loop join
- For each block of R, and for each block of S: for each r in the R block, and for each s in the S block: …
- I/O's: B(R) + B(R) × B(S)
- Memory requirement: same as before
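
A minimal sketch of the block-based variant just described. `scan_blocks` and the predicate `p` are hypothetical stand-ins for the storage layer's block iterator and the join condition.

```python
# Block-based nested-loop join sketch.
# `scan_blocks(table)` yields one block (a list of tuples) of the table at a time.

def block_nested_loop_join(scan_blocks, R, S, p):
    for r_block in scan_blocks(R):          # read each block of the outer table once
        for s_block in scan_blocks(S):      # scan the inner table once per outer block
            for r in r_block:
                for s in s_block:
                    if p(r, s):
                        yield r + s          # concatenate the matching tuples

# Example with tiny in-memory "blocks":
# R_blocks = [[(1, 'a'), (2, 'b')], [(3, 'c')]]
# S_blocks = [[(1, 'x')], [(3, 'y'), (4, 'z')]]
# list(block_nested_loop_join(lambda t: iter(t), R_blocks, S_blocks,
#                             lambda r, s: r[0] == s[0]))
```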

More improvements of nested-loop join
- Stop early if the key of the inner table is being matched
- Make use of available memory: stuff memory with as much of R as possible, stream S by, and join every S tuple with all R tuples in memory
  - I/O's: B(R) + ⌈B(R) / (M – 2)⌉ × B(S), or roughly B(R) × B(S) / M
  - Memory requirement: M (as much as possible)
Which table would you pick as the outer?

Recap: Why Sort?
- A classic problem in computer science!
- Data requested in sorted order, e.g., find students in increasing GPA order
- Sorting is the first step in bulk loading a B+-tree index
- Sorting is useful for eliminating duplicate copies in a collection of records (why?)
- Sorting is useful for summarizing related groups of tuples
- The sort-merge join algorithm involves sorting
Problem: sort 100 GB of data with 1 GB of RAM. Why not virtual memory?

2-Way Sort: Requires 3 Buffers
- Pass 0: read a page, sort it, write it; only one buffer page is used
- Passes 1, 2, 3, …: merge pairs of runs into runs twice as long; three buffer pages are used (two for input, one for output)
[Diagram: two input buffers and one output buffer in main memory, with runs read from and written back to disk]

Two-Way External Merge Sort
- In each pass we read and write every page in the file
- With N pages in the file, the number of passes is ⌈log2 N⌉ + 1
- So the total cost is 2N × (⌈log2 N⌉ + 1) I/O's
- Idea: divide and conquer; sort subfiles and merge
[Diagram: a 7-page input file is sorted into 1-page runs in pass 0, then merged into 2-page runs (pass 1), 4-page runs (pass 2), and one fully sorted run (pass 3)]

External merge sort
Remember (internal-memory) merge sort? Problem: sort R, but R does not fit in memory.
- Pass 0: read M blocks of R at a time, sort them, and write out a level-0 run; there are ⌈B(R) / M⌉ level-0 sorted runs
- Pass i: merge (M – 1) level-(i–1) runs at a time, and write out a level-i run; (M – 1) memory blocks for input, 1 to buffer output
  - # of level-i runs = ⌈# of level-(i–1) runs / (M – 1)⌉
- The final pass produces one sorted run
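
A compact Python sketch of the two-phase algorithm just described; disk runs are simulated with in-memory lists, and M plays the role of the number of available memory blocks.

```python
import heapq

# External merge sort sketch over a list of "blocks" (each block a list of keys).
def external_merge_sort(blocks, M):
    # Pass 0: sort M blocks at a time, producing ceil(B/M) level-0 runs.
    runs = []
    for i in range(0, len(blocks), M):
        chunk = [x for blk in blocks[i:i + M] for x in blk]
        runs.append(sorted(chunk))

    # Passes 1, 2, ...: merge (M - 1) runs at a time until one run remains.
    while len(runs) > 1:
        next_level = []
        for i in range(0, len(runs), M - 1):
            group = runs[i:i + M - 1]
            next_level.append(list(heapq.merge(*group)))  # (M-1)-way merge
        runs = next_level
    return runs[0] if runs else []

# Example: 4 blocks, M = 3 memory blocks.
# external_merge_sort([[9, 4], [1, 7], [5, 2], [8, 3]], 3)
# -> [1, 2, 3, 4, 5, 7, 8, 9]
```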

Example of external merge sort
Input: 1, 7, 4, 5, 2, 8, 3, 6, 9
Pass 0: 1, 7, 4 → 1, 4, 7;   5, 2, 8 → 2, 5, 8;   3, 6, 9 → 3, 6, 9
Pass 1: merge (1, 4, 7) and (2, 5, 8) → 1, 2, 4, 5, 7, 8;   (3, 6, 9) carries over
Pass 2 (final): merge (1, 2, 4, 5, 7, 8) and (3, 6, 9) → 1, 2, 3, 4, 5, 6, 7, 8, 9

Performance of external merge sort
- Number of passes: ⌈log_{M–1} ⌈B(R) / M⌉⌉ + 1
- I/O's: multiply by 2 × B(R), since each pass reads the entire relation once and writes it once; subtract B(R) for the final pass (its output need not be written out)
- Roughly, this is O(B(R) × log_M B(R))
- Memory requirement: M (as much as possible)
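
A worked instance of the formula above; the numbers B(R) = 1000 blocks and M = 11 memory blocks are illustrative assumptions, not values from the slides:

\[
\text{passes} \;=\; \left\lceil \log_{M-1} \left\lceil \tfrac{B(R)}{M} \right\rceil \right\rceil + 1
\;=\; \left\lceil \log_{10} \lceil 1000/11 \rceil \right\rceil + 1
\;=\; \lceil \log_{10} 91 \rceil + 1 \;=\; 3
\]
\[
\text{I/O's} \;=\; 2 \times B(R) \times \text{passes} \;-\; B(R)
\;=\; 2 \times 1000 \times 3 - 1000 \;=\; 5000
\]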

Some tricks for sorting
- Double buffering: allocate an additional block for each run; trade-off: smaller fan-in (more passes)
- Blocked I/O: instead of reading/writing one disk block at a time, read/write a bunch (a "cluster"); more sequential I/O's; trade-off: larger cluster → smaller fan-in (more passes)

Sort-merge join: R ⋈_{R.A = S.B} S
Sort R and S by their join attributes, and then merge:
  r, s = the first tuples in sorted R and S
  Repeat until one of R and S is exhausted:
    If r.A > s.B, then s = next tuple in S
    else if r.A < s.B, then r = next tuple in R
    else output all matching tuples, and set r, s = next in R and S
I/O's: sorting + 2 B(R) + 2 B(S) in most cases (e.g., join of key and foreign key)
Worst case is B(R) × B(S): everything joins
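
A sketch of the merge phase, assuming both inputs are already sorted lists of tuples and that `key_r` / `key_s` extract the join attributes R.A and S.B; equal-key groups are joined pairwise.

```python
def merge_join(R_sorted, S_sorted, key_r, key_s):
    i = j = 0
    while i < len(R_sorted) and j < len(S_sorted):
        a, b = key_r(R_sorted[i]), key_s(S_sorted[j])
        if a < b:
            i += 1
        elif a > b:
            j += 1
        else:
            # Collect the group of equal keys on each side, then output all pairs.
            i_end = i
            while i_end < len(R_sorted) and key_r(R_sorted[i_end]) == a:
                i_end += 1
            j_end = j
            while j_end < len(S_sorted) and key_s(S_sorted[j_end]) == b:
                j_end += 1
            for r in R_sorted[i:i_end]:
                for s in S_sorted[j:j_end]:
                    yield r + s
            i, j = i_end, j_end

# On the data of the example slide below:
# R = [(1,), (3,), (3,), (5,), (7,), (7,), (8,)]; S = [(1,), (2,), (3,), (3,), (8,)]
# list(merge_join(R, S, lambda r: r[0], lambda s: s[0])) yields 6 result tuples.
```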

Example: R ⋈_{R.A = S.B} S
R: r1.A = 1, r2.A = 3, r3.A = 3, r4.A = 5, r5.A = 7, r6.A = 7, r7.A = 8
S: s1.B = 1, s2.B = 2, s3.B = 3, s4.B = 3, s5.B = 8
Result: r1 s1, r2 s3, r2 s4, r3 s3, r3 s4, r7 s5

Optimization of SMJ
Idea: combine the join with the merge phase of merge sort
- Sort: produce sorted runs of size M for R and S
- Merge and join: merge the runs of R, merge the runs of S, and merge-join the result streams as they are generated!
[Diagram: sorted runs of R and S on disk are merged in memory, and the two merged streams are joined on the fly]

Performance of two-pass SMJ
- I/O's: 3 × (B(R) + B(S))
- Memory requirement: to merge in one pass, we should have enough memory to accommodate one block from each run: M > B(R) / M + B(S) / M, i.e., M > sqrt(B(R) + B(S))
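
The memory bound follows by multiplying the inequality through by M:

\[
M \;>\; \frac{B(R)}{M} + \frac{B(S)}{M}
\;\;\Longrightarrow\;\;
M^2 \;>\; B(R) + B(S)
\;\;\Longrightarrow\;\;
M \;>\; \sqrt{B(R) + B(S)}
\]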

Other sort-based algorithms
- Union (set), difference, intersection: more or less like SMJ
- Duplicate elimination: external merge sort, eliminating duplicates during sort and merge
- GROUP BY and aggregation: produce partial aggregate values in each run, and combine partial aggregate values during merge
  - Partial aggregate values don't always work, though; examples: SUM(DISTINCT …), MEDIAN(…)

Hash join: R ⋈_{R.A = S.B} S
Main idea: partition R and S by hashing their join attributes, and then consider corresponding partitions of R and S
- If r.A and s.B get hashed to different partitions, they don't join
- Nested-loop join considers all slots (every pair of partitions); hash join considers only those along the diagonal

Partitioning phase
Partition R and S according to the same hash function on their join attributes, producing M – 1 partitions of R (and likewise for S)
[Diagram: R is read from disk through memory and written out into M – 1 partitions; same for S]

Probing phase
Read in each partition of R, stream in the corresponding partition of S, and join
- Typically build a hash table for the partition of R (not the same hash function used for partitioning, of course!)
- For each S tuple, probe the hash table and join
[Diagram: one R partition is loaded into memory while the corresponding S partition is streamed past it]
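
A sketch of both phases together, with in-memory lists standing in for disk partitions; the partitioning hash `h1` and the tuple shapes are illustrative assumptions, and a Python dict plays the role of the in-memory hash table built on each R partition.

```python
from collections import defaultdict

def hash_join(R, S, key_r, key_s, num_partitions):
    h1 = lambda k: hash(k) % num_partitions      # partitioning hash function

    # Partitioning phase: split R and S into corresponding partitions.
    R_parts, S_parts = defaultdict(list), defaultdict(list)
    for r in R:
        R_parts[h1(key_r(r))].append(r)
    for s in S:
        S_parts[h1(key_s(s))].append(s)

    # Probing phase: for each partition, build a table on R and probe with S.
    for p in range(num_partitions):
        table = defaultdict(list)                # "second" hash: plain dict lookup
        for r in R_parts[p]:
            table[key_r(r)].append(r)
        for s in S_parts[p]:
            for r in table.get(key_s(s), []):
                yield r + s

# Example:
# list(hash_join([(1, 'a'), (3, 'b')], [(3, 'x'), (4, 'y')],
#                lambda r: r[0], lambda s: s[0], num_partitions=4))
# -> [(3, 'b', 3, 'x')]
```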

Performance of hash join
- I/O's: 3 × (B(R) + B(S))
- Memory requirement: in the probing phase, we should have enough memory to fit one partition of R: M – 1 ≥ B(R) / (M – 1), i.e., M > sqrt(B(R))
- We can always pick R to be the smaller relation, so: M > sqrt(min(B(R), B(S)))

Hash join tricks What if a partition is too large for memory? Read it back in and partition it again! See the duality in multi-pass merge sort here? Multi-pass merge sort is like going from right to left.

Hash join versus SMJ
(Assuming two-pass)
- I/O's: same
- Memory requirement: hash join is lower; sqrt(min(B(R), B(S))) < sqrt(B(R) + B(S))
  - Hash join wins when the two relations have very different sizes
Other factors:
- Hash join performance depends on the quality of the hash; we might not get evenly sized buckets
- SMJ can be adapted for inequality join predicates
- SMJ wins if R and/or S are already sorted
- SMJ wins if the result needs to be in sorted order

What about nested-loop join? May be best if many tuples join Example: non-equality joins that are not very selective Necessary for black-box predicates Example: … WHERE user_defined_pred(R.A, S.B)

Other hash-based algorithms
- Union (set), difference, intersection: more or less like hash join
- Duplicate elimination: check for duplicates within each partition/bucket
- GROUP BY and aggregation: apply the hash functions to the GROUP BY attributes; tuples in the same group must end up in the same partition/bucket; keep a running aggregate value for each group

Duality of sort and hash Divide-and-conquer paradigm Sorting: physical division, logical combination Hashing: logical division, physical combination Handling very large inputs Sorting: multi-level merge Hashing: recursive partitioning I/O patterns Sorting: sequential write, random read (merge) Hashing: random write, sequential read (partition)

Selection using index
- Equality predicate: σ_{A=v}(R); use an ISAM, B+-tree, or hash index on R(A)
- Range predicate: σ_{A>v}(R); use an ordered index (e.g., ISAM or B+-tree) on R(A); a hash index is not applicable
- Indexes other than those on R(A) may be useful; example: a B+-tree index on R(A, B); how about a B+-tree index on R(B, A)?

Index versus table scan
Situations where the index clearly wins:
- Index-only queries, which do not require retrieving actual tuples; example: π_A(σ_{A>v}(R))
- Primary index clustered according to the search key: one lookup leads to all result tuples in their entirety

Index versus table scan (cont'd)
BUT(!): consider σ_{A>v}(R) and a secondary, non-clustered index on R(A)
- Need to follow pointers to get the actual result tuples
- Say that 20% of R satisfies A > v (could happen even for equality predicates)
- I/O's for index-based selection: lookup + 20% × |R|
- I/O's for scan-based selection: B(R)
- Table scan wins if a block contains more than 5 tuples
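
The break-even point follows from comparing the two costs: with t tuples per block, B(R) = |R| / t, so (ignoring the small lookup cost)

\[
\underbrace{0.2\,|R|}_{\text{index-based}} \;>\; \underbrace{\frac{|R|}{t}}_{\text{table scan, } B(R)}
\;\;\Longleftrightarrow\;\;
t \;>\; 5
\]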

Index nested-loop join: R ⋈_{R.A = S.B} S
Idea: use the value of R.A to probe the index on S(B)
- For each block of R, and for each r in the block: use the index on S(B) to retrieve s with s.B = r.A, and output rs
- I/O's: B(R) + |R| × (cost of an index lookup); typically an index lookup costs 2-4 I/O's
- Beats other join methods if |R| is not too big; better to pick R to be the smaller relation
- Memory requirement: 2
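
A sketch of the probing loop above, with the index on S(B) abstracted as a lookup callable; `scan_blocks`, `s_index`, and `key_r` are hypothetical stand-ins for the storage and index interfaces.

```python
def index_nested_loop_join(scan_blocks, R, s_index, key_r):
    # `s_index(key)` returns the S tuples with s.B == key (e.g., a B+-tree lookup).
    for r_block in scan_blocks(R):       # one pass over the outer table
        for r in r_block:
            for s in s_index(key_r(r)):  # one index lookup per outer tuple
                yield r + s

# Example, using a dict as a stand-in for the index on S(B):
# s_by_B = {1: [(1, 'x')], 3: [(3, 'y'), (3, 'z')]}
# R_blocks = [[(3, 'a')], [(2, 'b')]]
# list(index_nested_loop_join(lambda t: iter(t), R_blocks,
#                             lambda k: s_by_B.get(k, []), lambda r: r[0]))
# -> [(3, 'a', 3, 'y'), (3, 'a', 3, 'z')]
```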

Zig-zag join using ordered indexes: R ⋈_{R.A = S.B} S
Idea: use the ordering provided by the indexes on R(A) and S(B) to eliminate the sorting step of sort-merge join
Trick: use the larger key to probe the other index, possibly skipping many keys that don't match
Example leaf keys: B+-tree on R(A): 1, 2, 3, 4, 7, 9, 18; B+-tree on S(B): 1, 7, 9, 11, 12, 17, 19

Summary of tricks
- Scan: selection, duplicate-preserving projection, nested-loop join
- Sort: external merge sort, sort-merge join, union (set), difference, intersection, duplicate elimination, GROUP BY and aggregation
- Hash: hash join, union (set), difference, intersection, duplicate elimination, GROUP BY and aggregation
- Index: selection, index nested-loop join, zig-zag join