1
April 20th – RDBMS Internals
CSE 344 April 20th – RDBMS Internals
2
Administrivia
OQ5 out
  Datalog, due next Wednesday
HW4 due next Wednesday
  Written portion (.pdf)
  Coding portion (one .dl file)
3
Today
Back to RDBMS
  “Query plans” and DBMS planning
  Management between SQL and execution
  Optimization techniques
  Indexing and data arrangement
4
Query Evaluation Steps
SQL query
→ Parse & Rewrite Query
→ Logical plan (RA)
→ Query optimization: Select Logical Plan, then Select Physical Plan
→ Physical plan
→ Query Execution
→ Disk
Note: Analogy with sorting: once the ordering of operations has been decided, we still need to choose their implementation. For instance, there are many ways to implement sort.
5
Logical vs Physical Plans
Logical plans:
  Created by the parser from the input SQL text
  Expressed as a relational algebra tree
  Each SQL query has many possible logical plans
Physical plans:
  Goal is to choose an efficient implementation for each operator in the RA tree
  Each logical plan has many possible physical plans
6
Review: Relational Algebra
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

SELECT sname
FROM Supplier x, Supply y
WHERE x.sid = y.sid and y.pno = 2
  and x.scity = 'Seattle' and x.sstate = 'WA'

πsname
  σscity=‘Seattle’ and sstate=‘WA’ and pno=2
    ⨝ sid = sid
      Supplier
      Supply

The relational algebra expression is also called the “logical query plan”.
7
Physical Query Plan 1
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

SELECT sname
FROM Supplier x, Supply y
WHERE x.sid = y.sid and y.pno = 2
  and x.scity = 'Seattle' and x.sstate = 'WA'

A physical query plan is a logical query plan annotated with physical implementation details.

(On the fly) πsname
  (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
    (Nested loop) ⨝ sid = sid
      (File scan) Supplier
      (File scan) Supply

Note: Describe the “nested loop join”. We have not discussed physical operators in class yet, but it should be clear what a nested loop means.
8
Physical Query Plan 2
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

SELECT sname
FROM Supplier x, Supply y
WHERE x.sid = y.sid and y.pno = 2
  and x.scity = 'Seattle' and x.sstate = 'WA'

Same logical query plan, different physical plan.

(On the fly) πsname
  (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
    (Hash join) ⨝ sid = sid
      (File scan) Supplier
      (File scan) Supply

Note: Briefly mention that a hash join is a different implementation. Most students should have no problem imagining what it might be; there is no need to describe it in detail, since we will cover it in two lectures.
9
Physical Query Plan 3
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

SELECT sname
FROM Supplier x, Supply y
WHERE x.sid = y.sid and y.pno = 2
  and x.scity = 'Seattle' and x.sstate = 'WA'

Different but equivalent logical query plan; different physical plan.

(d) (On the fly) πsname
  (c) (Sort-merge join) ⨝ sid = sid
    (a) (Scan & write to T1) σscity=‘Seattle’ and sstate=‘WA’
      (File scan) Supplier
    (b) (Scan & write to T2) σpno=2
      (File scan) Supply

Note: Same here; briefly mention that sort-merge is yet another way to implement a join.
10
Query Optimization Problem
For each SQL query… many logical plans
For each logical plan… many physical plans
Next: we will discuss physical operators; how exactly are queries executed?
11
Physical Operators
Each of the logical operators may have one or more implementations = physical operators
Will discuss several basic physical operators, with a focus on join
14
Main Memory Algorithms
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)
Logical operator: Supplier ⨝sid=sid Supply
Propose three physical operators for the join, assuming the tables are in main memory:
  Nested loop join: O(n^2)
  Merge join: O(n log n)
  Hash join: O(n) … O(n^2)
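The three in-memory operators above can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `MemoryJoins`, the representation of a tuple as an `int[]` whose element 0 is `sid`, and the helper `concat` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; every row is an int[] with sid at index 0.
class MemoryJoins {
    // Nested loop join: compares every pair of rows, O(n^2).
    static List<int[]> nestedLoopJoin(List<int[]> r, List<int[]> s) {
        List<int[]> out = new ArrayList<>();
        for (int[] x : r)
            for (int[] y : s)
                if (x[0] == y[0]) out.add(concat(x, y));
        return out;
    }

    // Hash join: build a table on r, then probe it once per row of s.
    // Expected O(n); degrades toward O(n^2) when many rows share a key.
    static List<int[]> hashJoin(List<int[]> r, List<int[]> s) {
        Map<Integer, List<int[]>> table = new HashMap<>();
        for (int[] x : r)
            table.computeIfAbsent(x[0], k -> new ArrayList<>()).add(x);
        List<int[]> out = new ArrayList<>();
        for (int[] y : s)
            for (int[] x : table.getOrDefault(y[0], new ArrayList<>()))
                out.add(concat(x, y));
        return out;
    }

    // Merge join: sort both inputs on sid (O(n log n)), then merge.
    static List<int[]> mergeJoin(List<int[]> r, List<int[]> s) {
        List<int[]> a = new ArrayList<>(r);
        List<int[]> b = new ArrayList<>(s);
        a.sort((p, q) -> Integer.compare(p[0], q[0]));
        b.sort((p, q) -> Integer.compare(p[0], q[0]));
        List<int[]> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int ka = a.get(i)[0], kb = b.get(j)[0];
            if (ka < kb) {
                i++;
            } else if (ka > kb) {
                j++;
            } else {
                // Emit the cross product of the two runs that share this key.
                int jStart = j;
                while (i < a.size() && a.get(i)[0] == ka) {
                    for (j = jStart; j < b.size() && b.get(j)[0] == ka; j++)
                        out.add(concat(a.get(i), b.get(j)));
                    i++;
                }
            }
        }
        return out;
    }

    // Concatenates two rows into one output row.
    static int[] concat(int[] x, int[] y) {
        int[] z = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, z, x.length, y.length);
        return z;
    }
}
```

All three methods return the same set of joined rows; only the order of the output and the running time differ, which is exactly the choice the optimizer makes when it picks a physical operator.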
15
BRIEF Review of Hash Tables
Separate chaining, with a (naïve) hash function: h(x) = x mod 10

Bucket  Chain
3       503, 103, 503
6       76, 666
8       48
(all other buckets 0 through 9 are empty)

Duplicates OK. WHY??
Operations:
  find(103) = ??
  insert(488) = ??
16
BRIEF Review of Hash Tables
insert(k, v) = inserts a key k with value v
  Many values for one key
  Hence, duplicate k’s are OK
find(k) = returns the list of all values v associated to the key k
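A separate-chaining table with these two operations might look like the sketch below. The class name `ChainedHashTable`, the fixed bucket count of 10 (chosen to mirror h(x) = x mod 10), and the use of `int` keys and values are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class; 10 buckets to mirror the naive h(x) = x mod 10.
class ChainedHashTable {
    private static final int BUCKETS = 10;
    // Each bucket is a chain of {key, value} entries.
    private final List<List<int[]>> buckets = new ArrayList<>();

    ChainedHashTable() {
        for (int i = 0; i < BUCKETS; i++) buckets.add(new ArrayList<>());
    }

    // floorMod keeps the bucket index non-negative for negative keys.
    static int h(int k) {
        return Math.floorMod(k, BUCKETS);
    }

    // Appends to the chain without checking for an existing key,
    // so duplicate keys are kept.
    void insert(int k, int v) {
        buckets.get(h(k)).add(new int[]{k, v});
    }

    // Returns every value stored under key k, in insertion order.
    List<Integer> find(int k) {
        List<Integer> values = new ArrayList<>();
        for (int[] e : buckets.get(h(k)))
            if (e[0] == k) values.add(e[1]);
        return values;
    }
}
```

Keeping duplicates matters for joins: several Supply tuples can share one sid, and find must return all of them so that no matching pair is lost.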
17
Iterator Interface
Each operator implements three methods:
  open()
  next()
  close()
21
Iterator Interface
Example “on the fly” selection operator

interface Operator {
  // initializes operator state and sets parameters
  void open(...);

  // calls next() on its inputs, processes an input tuple,
  // produces output tuple(s); returns null when done
  Tuple next();

  // cleans up (if any)
  void close();
}

class Select implements Operator {
  ...
  void open(Predicate p, Operator child) {
    this.p = p;
    this.child = child;
  }

  Tuple next() {
    boolean found = false;
    Tuple r = null;
    while (!found) {
      r = child.next();
      if (r == null) break;   // child is exhausted
      found = p(r);           // stop at the first tuple that satisfies p
    }
    return r;                 // a matching tuple, or null when done
  }

  void close() {
    child.close();
  }
}
22
Iterator Interface
Query plan execution

interface Operator {
  // initializes operator state and sets parameters
  void open(...);

  // calls next() on its inputs, processes an input tuple,
  // produces output tuple(s); returns null when done
  Tuple next();

  // cleans up (if any)
  void close();
}

Operator q = parse("SELECT ...");
q = optimize(q);
q.open();
while (true) {
  Tuple t = q.next();
  if (t == null) break;
  else printOnScreen(t);
}
q.close();
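To see the interface run end to end, here is a small self-contained sketch. It departs from the slides in a few labeled ways: tuples are plain `String[]` rows, parameters are passed to constructors rather than to `open()`, and `ListScan` and `Driver` are hypothetical stand-ins for a file scan and for the execution loop. The sample supplier rows in the usage note are made up.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Simplified iterator interface: a tuple is a String[] (an assumption).
interface Operator {
    void open();
    String[] next();   // returns null when done
    void close();
}

// Hypothetical leaf operator: scans an in-memory list instead of a file.
class ListScan implements Operator {
    private final List<String[]> rows;
    private Iterator<String[]> it;

    ListScan(List<String[]> rows) { this.rows = rows; }

    public void open() { it = rows.iterator(); }
    public String[] next() { return it.hasNext() ? it.next() : null; }
    public void close() { it = null; }
}

// "On the fly" selection: pulls from its child until a tuple passes p.
class Select implements Operator {
    private final Predicate<String[]> p;
    private final Operator child;

    Select(Predicate<String[]> p, Operator child) {
        this.p = p;
        this.child = child;
    }

    public void open() { child.open(); }

    public String[] next() {
        String[] r;
        while ((r = child.next()) != null) {
            if (p.test(r)) return r;
        }
        return null;   // child exhausted
    }

    public void close() { child.close(); }
}

// Hypothetical driver: the open / next-until-null / close loop.
class Driver {
    static List<String[]> run(Operator q) {
        List<String[]> out = new ArrayList<>();
        q.open();
        for (String[] t = q.next(); t != null; t = q.next()) out.add(t);
        q.close();
        return out;
    }
}
```

For example, wrapping a `ListScan` over rows of the form {sid, sname, scity, sstate} in `new Select(r -> r[2].equals("Seattle") && r[3].equals("WA"), scan)` and passing it to `Driver.run` yields only the Seattle, WA suppliers, one tuple per call to `next()`, without materializing the scan's output.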
28
Pipelining
Discuss: open/next/close for nested loop join
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

The open() calls propagate from the root of the plan down to the scans:

open()  (On the fly) πsname
open()    (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
open()      (Nested loop) ⨝ sid = sid
open()        (File scan) Supplier
open()        (File scan) Supply

Note: Recall that all operators implement the same iterator interface: open, next, close (SimpleDB also has hasNext and rewind).
34
Pipelining
Discuss: open/next/close for nested loop join
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

The next() calls propagate the same way, from the root down to the scans:

next()          (On the fly) πsname
next()            (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
next()              (Nested loop) ⨝ sid = sid
next()                (File scan) Supplier
next() next()         (File scan) Supply

Note: The slide assumes that the first call to next() on the right-hand side does not result in a join, so the right-hand scan is asked for a second tuple.
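One possible answer to the discussion question is sketched below. It is an illustration under stated assumptions, not SimpleDB's code: tuples are `int[]` rows with `sid` at index 0, `ListScan` is a hypothetical in-memory stand-in for a file scan, and the inner input is rewound with `close()` followed by `open()` because this simplified interface has no `rewind`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

interface Operator {
    void open();
    int[] next();   // returns null when done
    void close();
}

// Hypothetical leaf operator over an in-memory list of rows.
class ListScan implements Operator {
    private final List<int[]> rows;
    private Iterator<int[]> it;

    ListScan(List<int[]> rows) { this.rows = rows; }

    public void open() { it = rows.iterator(); }
    public int[] next() { return it.hasNext() ? it.next() : null; }
    public void close() { it = null; }
}

class NestedLoopJoin implements Operator {
    private final Operator left, right;
    private int[] current;   // current outer tuple; null before the first fetch

    NestedLoopJoin(Operator left, Operator right) {
        this.left = left;
        this.right = right;
    }

    // open() only propagates to the children; no tuples are read yet.
    public void open() {
        left.open();
        right.open();
        current = null;
    }

    // Resumes the double loop where the previous call stopped.
    public int[] next() {
        if (current == null) {
            current = left.next();
            if (current == null) return null;   // outer input is empty or done
        }
        while (true) {
            int[] inner = right.next();
            if (inner == null) {
                // Inner input exhausted: advance the outer tuple, rewind the inner.
                current = left.next();
                if (current == null) return null;
                right.close();
                right.open();
                continue;
            }
            if (current[0] == inner[0]) return concat(current, inner);
        }
    }

    public void close() {
        left.close();
        right.close();
    }

    private static int[] concat(int[] x, int[] y) {
        int[] z = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, z, x.length, y.length);
        return z;
    }
}
```

The operator keeps its position between calls in `current` and in the state of the inner child, which is what lets it hand back one joined tuple per `next()` instead of computing the whole join up front.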
36
Pipelining
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

(On the fly) πsname
  (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
    (Hash join) ⨝ sid = sid
      (File scan) Supplier
      (File scan) Supply

Tuples from one input of the hash join are “blocked”.
Tuples from the other input are pipelined.
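The difference between the blocked and the pipelined input can be made concrete with a hash join written against the iterator interface. In this sketch the whole build input is consumed inside `open()`, while the probe input is read one tuple at a time inside `next()`. Which relation plays which role is an assumption here (Supplier as build, Supply as probe in the test data), as are the `int[]` tuples, the `ListScan` stand-in, and its `pulled` counter, which exists only to make the behavior observable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

interface Operator {
    void open();
    int[] next();   // returns null when done
    void close();
}

// Hypothetical leaf operator; `pulled` counts the tuples handed to the parent.
class ListScan implements Operator {
    private final List<int[]> rows;
    private Iterator<int[]> it;
    int pulled = 0;

    ListScan(List<int[]> rows) { this.rows = rows; }

    public void open() { it = rows.iterator(); }
    public int[] next() {
        if (!it.hasNext()) return null;
        pulled++;
        return it.next();
    }
    public void close() { it = null; }
}

class HashJoin implements Operator {
    private final Operator build, probe;
    private final Map<Integer, List<int[]>> table = new HashMap<>();
    private final Deque<int[]> pending = new ArrayDeque<>();   // matches not yet returned

    HashJoin(Operator build, Operator probe) {
        this.build = build;
        this.probe = probe;
    }

    // Blocked side: the entire build input is read before any output exists.
    public void open() {
        build.open();
        for (int[] b = build.next(); b != null; b = build.next())
            table.computeIfAbsent(b[0], k -> new ArrayList<>()).add(b);
        build.close();
        probe.open();
    }

    // Pipelined side: each call reads only as many probe tuples as it needs.
    public int[] next() {
        while (pending.isEmpty()) {
            int[] p = probe.next();
            if (p == null) return null;
            for (int[] b : table.getOrDefault(p[0], new ArrayList<>()))
                pending.add(concat(b, p));
        }
        return pending.poll();
    }

    public void close() {
        probe.close();
        table.clear();
        pending.clear();
    }

    private static int[] concat(int[] x, int[] y) {
        int[] z = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, z, x.length, y.length);
        return z;
    }
}
```

After `open()` returns, the build scan has been fully drained and the probe scan has not produced a single tuple; each later `next()` advances the probe scan only far enough to find the next match.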
38
Blocked Execution
Supplier(sid, sname, scity, sstate)
Supply(sid, pno, quantity)

(On the fly) πsname
  (On the fly) σscity=‘Seattle’ and sstate=‘WA’ and pno=2
    (Merge join) ⨝ sid = sid
      (File scan) Supplier   Blocked
      (File scan) Supply     Blocked
39
Pipelined Execution
Tuples generated by an operator are immediately sent to the parent
Benefits:
  No operator synchronization issues
  No need to buffer tuples between operators
  Saves cost of writing intermediate data to disk
  Saves cost of reading intermediate data from disk
This approach is used whenever possible
Note: Some engines process tuples in batches; others process one tuple at a time.
40
Query Execution Bottom Line
SQL query transformed into physical plan
  Access path selection for each relation
    Scan the relation or use an index (next lecture)
  Implementation choice for each operator
    Nested loop join, hash join, etc.
  Scheduling decisions for operators
    Pipelined execution or intermediate materialization
Pipelined execution of physical plan
41
Recall: Physical Data Independence
Applications are insulated from changes in physical storage details
SQL and relational algebra facilitate physical data independence
  Both languages input and output relations
  Can choose different implementations for operators
42
Query Performance
My database application is too slow… why?
One of the queries is very slow… why?
To understand performance, we need to understand:
  How is data organized on disk
  How to estimate query costs
In this course we will focus on disk-based DBMSs