CPSC-608 Database Systems


1 CPSC-608 Database Systems
Fall 2017
Instructor: Jianer Chen
Office: HRBB 315C
Phone:
Notes #19

2 Algorithms Implementing
Relational Algebraic Operations

3 Algorithms Implementing Relational Algebraic Operations
Projection and selection: π, σ
Set/bag operations: ∪S, ∩S, −S, ∪B, ∩B, −B
Join operations: ×, ⋈, ⋈C
Extended operations: γ, δ, τ, table-scan

4 Algorithms Implementing Relational Algebraic Operations
All operations: π, σ, ∪S, ∩S, −S, ∪B, ∩B, −B, γ, δ, τ, table-scan, ×, ⋈, ⋈C

5 Algorithms Implementing Relational Algebraic Operations
Operations based on tuples: π, σ, ∪B, table-scan
Operations based on entire relation: ∪S, ∩S, −S, ∩B, −B, γ, δ, τ, ×, ⋈, ⋈C
Unary operations: π, σ, γ, δ, τ, table-scan
Binary operations: ∪S, ∩S, −S, ∪B, ∩B, −B, ×, ⋈, ⋈C

7 Important Facts
-- Data (relations) are stored on disks.
-- Disk I/Os are time-consuming.
-- Relations are too large to fit into main memory.
-- Different algorithms are needed for different (assigned/available) main memory buffer sizes.

8 DBMS
[Figure: DBMS architecture. Labeled components: administrator, DDL language, DDL compiler, database programmer, DML (query) language, DML compiler, query execution engine, transaction manager, logging & recovery, concurrency control, lock table, index/file manager, file manager, buffer manager, main memory buffers, secondary storage (disks), graduate database in tables (relations).]

9 Disks: slow (read/write: 1~40 milliseconds); large capacity (100's of gigabytes); non-volatile
Main memory: fast (read/write: nanoseconds); small capacity (gigabytes); volatile
Disks are about 10^5 ~ 10^6 times slower than main memory.

11 Disk I/O Model of Computation
Dominance of I/O cost: if a block needs to be moved between disk and main memory, then the time taken to perform the read/write is much larger than the time likely to be used to manipulate that data in main memory.
The number of disk block reads/writes is a good approximation of the cost of the entire computation.
Assume:
-- inputs are on disk (so they must be read in);
-- output is not written back to disk (it may not have to be; the output size is hard to estimate and does not depend on the adopted algorithm).

13 Parameters for algorithm complexity
R: a relation
B(R): # of blocks containing tuples of R
T(R): # of tuples in R
V(R, A): # of distinct values on attribute A of R
M: # of usable main memory blocks
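As a small illustration of these parameters, the sketch below computes T(R), B(R), and V(R, A) for a toy relation. The function name `stats`, the sample data, and the assumption that tuples are packed densely at a fixed number of tuples per block are all illustrative and not part of the notes.

```python
import math

def stats(tuples, attr_index, tuples_per_block):
    """Return (T, B, V) for a relation given as a list of tuples.

    Assumes tuples are packed densely, tuples_per_block to a block.
    """
    t = len(tuples)                                   # T(R)
    b = math.ceil(t / tuples_per_block)               # B(R)
    v = len({row[attr_index] for row in tuples})      # V(R, A)
    return t, b, v

# Hypothetical relation R(id, A) with two tuples per block.
R = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "a")]
T, B, V = stats(R, attr_index=1, tuples_per_block=2)
print(T, B, V)
```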

14 A Remark on Main Memory Size M
[Figure: a query plan tree with operator nodes ×, π, σ over the relations A, B, C, D, E, F, G.]

15 A Remark on Main Memory Size M
[Figure: the same query plan tree over A, B, C, D, E, F, G, annotated with physical operators: scan(A), scan(B), scan(C), scan(D), scan(E), scan(F), scan(G), index-scan, J2P, J1P, CJ, I1P.]

20 Operations requiring (almost) no M
Tuple-based operations: π, σ, ∪B, table-scan
General framework: read in a block; process; send to the output.
[Figure: a block is read from disk into main memory, processed, and sent to the output.]
Memory: M = 2
Cost:
-- π(R), σ(R), table-scan(R): B(R)
-- ∪B(R, S): B(R) + B(S)
σA=c(R) can be done with cost B(R)/V(R, A) if R has an index on A.
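The block-at-a-time framework above can be sketched as a generator pipeline. This is a minimal illustration, assuming a relation is modeled as a list of blocks and each block as a list of tuples; `read_blocks`, `select`, and the `io` counter are hypothetical names used only to count simulated disk reads.

```python
def read_blocks(relation, io_counter):
    """Yield the blocks of a relation, counting one disk read per block."""
    for block in relation:
        io_counter["reads"] += 1
        yield block

def select(relation, predicate, io_counter):
    """Selection: only one input block is held at a time; results stream out."""
    for block in read_blocks(relation, io_counter):
        for t in block:
            if predicate(t):
                yield t

# Toy relation with B(R) = 3 blocks.
R = [[(1, "x"), (2, "y")], [(3, "x"), (4, "z")], [(5, "x")]]
io = {"reads": 0}
result = list(select(R, lambda t: t[1] == "x", io))
print(result, io["reads"])
```

The read count equals B(R), matching the cost stated for σ(R).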

25 One-pass algorithms
Condition: the main memory M is sufficiently large.
General framework:
1. Read in an entire relation R;
2. Process R;
3. Read in the other relation S block by block (if the operation is binary);
4. Send the results to an output block.

30 One-pass algorithms
Unary operations: γ(R), δ(R), τ(R)
[Figure: the entire relation R is read from disk into main memory and processed there.]
Apply efficient main memory algorithms (e.g., sort R).
Summary:
-- Memory: M ≥ B(R)
-- Cost: B(R)
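A minimal sketch of the one-pass unary case for δ and τ, assuming a relation is a list of blocks (lists of tuples). The helper names and the explicit check of M ≥ B(R) are illustrative choices, not something the slides prescribe.

```python
def load_relation(relation, memory_blocks):
    """Read the whole relation into memory; requires M >= B(R)."""
    if len(relation) > memory_blocks:
        raise ValueError("one-pass processing needs M >= B(R)")
    tuples = [t for block in relation for t in block]
    return tuples, len(relation)  # the tuples and the number of block reads

def one_pass_distinct(relation, memory_blocks):
    """delta(R): duplicate elimination, keeping first occurrences in order."""
    tuples, reads = load_relation(relation, memory_blocks)
    return list(dict.fromkeys(tuples)), reads

def one_pass_sort(relation, memory_blocks):
    """tau(R): sort the relation with an in-memory sort."""
    tuples, reads = load_relation(relation, memory_blocks)
    return sorted(tuples), reads

R = [[(2,), (1,)], [(2,), (3,)]]
print(one_pass_distinct(R, memory_blocks=2))
print(one_pass_sort(R, memory_blocks=2))
```

Both operations read each block of R exactly once, so the simulated cost is B(R).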

36 One-pass algorithms
Condition: the main memory M is sufficiently large.
Binary operations on two relations R and S: ∪S, ∩S, −S, ∩B, −B, ×, ⋈, ⋈C
General framework:
1. Read in an entire relation Rsmall;
2. Process Rsmall (build an efficient data structure for Rsmall, e.g., sort Rsmall);
3. Read in the other relation Rlarge block by block;
4. Send the results to an output block.
[Figure: Rsmall is held entirely in main memory; Rlarge is read from disk block by block and processed against Rsmall.]

37 One-pass algorithms
Rlarge ∪S Rsmall:
1. Sort Rsmall;
2. FOR each tuple t in Rlarge DO
      IF t is not in Rsmall THEN put t to the output;
3. Send Rsmall to the output.

38 One-pass algorithms
Rlarge ∩S Rsmall:
1. Sort Rsmall;
2. FOR each tuple t in Rlarge DO
      IF t is in Rsmall THEN put t to the output;
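The one-pass set union and set intersection can be sketched as follows. This is an illustration under the assumptions that each relation is a list of blocks of tuples and that the inputs are sets (no duplicates); the function names and the binary-search helper `contains` are hypothetical.

```python
from bisect import bisect_left

def contains(sorted_tuples, t):
    """Binary search for t in a sorted list of tuples."""
    i = bisect_left(sorted_tuples, t)
    return i < len(sorted_tuples) and sorted_tuples[i] == t

def one_pass_union(r_small, r_large):
    small = sorted(t for block in r_small for t in block)      # 1. sort Rsmall
    out = [t for block in r_large for t in block
           if not contains(small, t)]                          # 2. stream Rlarge
    out.extend(small)                                          # 3. send Rsmall
    return out

def one_pass_intersection(r_small, r_large):
    small = sorted(t for block in r_small for t in block)
    return [t for block in r_large for t in block if contains(small, t)]

r_small = [[(3,), (1,)]]
r_large = [[(1,), (2,)], [(4,), (3,)]]
print(one_pass_union(r_small, r_large))
print(one_pass_intersection(r_small, r_large))
```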

41 One-pass algorithms
−S is not commutative.
Rlarge −S Rsmall:
1. Sort Rsmall;
2. FOR each tuple t in Rlarge DO
      IF t is not in Rsmall THEN put t to the output.

42 One-pass algorithms
−S is not commutative.
Rsmall −S Rlarge:
1. Sort Rsmall;
2. FOR each tuple t in Rlarge DO
      IF t is in Rsmall THEN remove t from Rsmall;
3. Send Rsmall to the output.
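Because −S is not commutative, the two directions need different code even though both keep Rsmall in memory. The sketch below swaps the sorted list of the slides for a Python `set` (a hash table), which is an assumption made for brevity; the function names are illustrative.

```python
def large_minus_small(r_small, r_large):
    """Rlarge -S Rsmall: output tuples of Rlarge that are not in Rsmall."""
    small = {t for block in r_small for t in block}
    return [t for block in r_large for t in block if t not in small]

def small_minus_large(r_small, r_large):
    """Rsmall -S Rlarge: delete matched tuples from Rsmall, then output the rest."""
    remaining = {t for block in r_small for t in block}
    for block in r_large:
        for t in block:
            remaining.discard(t)   # no effect if t is not in Rsmall
    return sorted(remaining)

r_small = [[(3,), (1,), (5,)]]
r_large = [[(1,), (2,)], [(4,), (3,)]]
print(large_minus_small(r_small, r_large))
print(small_minus_large(r_small, r_large))
```

In the second direction nothing can be output until all of Rlarge has been scanned, since any remaining tuple of Rsmall might still be cancelled.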

43 One-pass algorithms
Rlarge ∩B Rsmall:
1. Make Rsmall a balanced tree;
2. FOR each tuple t in Rlarge DO
      IF t is in Rsmall THEN output t and remove a copy of t from Rsmall.

45 One-pass algorithms
−B is not commutative.
Rlarge −B Rsmall:
1. Make Rsmall a balanced tree;
2. FOR each tuple t in Rlarge DO
      IF t is not in Rsmall THEN output t
      ELSE remove a copy of t from Rsmall.

46 One-pass algorithms
−B is not commutative.
Rsmall −B Rlarge:
1. Make Rsmall a balanced tree;
2. FOR each tuple t in Rlarge DO
      IF t is in Rsmall THEN remove a copy of t from Rsmall;
3. Output Rsmall.
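The bag versions differ from the set versions only in that copies are counted. The sketch below uses `collections.Counter` as the in-memory structure for Rsmall in place of the balanced tree named on the slides; that substitution and the function names are assumptions for illustration.

```python
from collections import Counter

def bag_intersection(r_small, r_large):
    counts = Counter(t for block in r_small for t in block)
    out = []
    for block in r_large:
        for t in block:
            if counts[t] > 0:          # t is in Rsmall: output it, remove a copy
                out.append(t)
                counts[t] -= 1
    return out

def large_minus_small_bag(r_small, r_large):
    counts = Counter(t for block in r_small for t in block)
    out = []
    for block in r_large:
        for t in block:
            if counts[t] > 0:          # a copy in Rsmall cancels this copy of t
                counts[t] -= 1
            else:
                out.append(t)
    return out

def small_minus_large_bag(r_small, r_large):
    counts = Counter(t for block in r_small for t in block)
    for block in r_large:
        for t in block:
            if counts[t] > 0:
                counts[t] -= 1
    return sorted(counts.elements())   # whatever is left of Rsmall

small = [[(1,), (1,), (2,)]]
large = [[(1,), (3,)], [(1,), (1,), (2,)]]
print(bag_intersection(small, large))
print(large_minus_small_bag(small, large))
print(small_minus_large_bag([[(1,), (1,), (1,), (2,)]], [[(1,)]]))
```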

47 One-pass algorithms
Rlarge × Rsmall:
1. FOR each tuple t in Rlarge DO
      cross join t and each tuple in Rsmall and send to the output.

48 One-pass algorithms
Rlarge ⋈C Rsmall:
1. FOR each tuple t in Rlarge DO
      cross join t and each tuple in Rsmall;
      IF the join satisfies C THEN send to the output.

49 One-pass algorithms
Rlarge ⋈ Rsmall:
1. Sort Rsmall by join attributes A;
2. FOR each tuple t in Rlarge DO
      find the tuples in Rsmall with the same A-value;
      join them with t and put in the output block.
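A minimal sketch of the one-pass natural join on a single join attribute. It indexes Rsmall with a dictionary keyed on the join value instead of sorting it, which is a substitution made here for brevity; the function name, the positional `small_key` and `large_key` parameters, and the sample data are hypothetical.

```python
from collections import defaultdict

def one_pass_join(r_small, r_large, small_key, large_key):
    """Join on r_small[small_key] == r_large[large_key]; Rsmall stays in memory."""
    index = defaultdict(list)
    for block in r_small:                       # read in and index Rsmall
        for s in block:
            index[s[small_key]].append(s)
    out = []
    for block in r_large:                       # stream Rlarge block by block
        for t in block:
            for s in index.get(t[large_key], []):
                # keep the join attribute once, as in a natural join
                out.append(t + s[:small_key] + s[small_key + 1:])
    return out

r_small = [[(10, "x"), (20, "y")]]              # Rsmall(A, name)
r_large = [[(1, 10), (2, 30)], [(3, 10), (4, 20)]]   # Rlarge(id, A)
print(one_pass_join(r_small, r_large, small_key=0, large_key=1))
```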

50 One-pass algorithms
Summary (binary operations):
-- Memory: M ≥ B(Rsmall)
-- Cost: B(Rsmall) + B(Rlarge)

53 For larger relations
When relations cannot fit in main memory, one-pass algorithms cannot be used.
A generic algorithm for binary operations:
Nested-loop (R □ S):
   FOR each tuple tR in R DO
      FOR each tuple tS in S DO
         Apply the operation □ on tR and tS
Binary operations on two relations R and S: ∪S, ∩S, −S, ∩B, −B, ×, ⋈, ⋈C

54 For larger relations
R ∪S S (body of the nested loop):
1. \\ in the first execution of the tR-loop
   output tS;
2. \\ in an execution of the tR-loop
   IF tR = tS THEN mark tR;
3. \\ at the end of the tR-loop
   IF tR is unmarked THEN output tR.

55 For larger relations
R ∩S S (body of the nested loop):
1. \\ in an execution of the tR-loop
   IF tR = tS THEN mark tR;
2. \\ at the end of the tR-loop
   IF tR is marked THEN output tR.

57 For larger relations
R −S S (body of the nested loop):
1. \\ in an execution of the tR-loop
   IF tR = tS THEN mark tR;
2. \\ at the end of the tR-loop
   IF tR is unmarked THEN output tR.
Not working for S −S R because −S is not commutative.
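The marking scheme for the nested-loop set operations can be sketched in one function. This is an illustration that treats R and S as flat lists of tuples and ignores blocking; the function name and the string-valued `op` parameter are hypothetical.

```python
def nested_loop_set_op(r, s, op):
    """Compute R op S for op in {"union", "intersection", "difference"}."""
    out = []
    first_pass = True
    for t_r in r:
        marked = False
        for t_s in s:
            if first_pass and op == "union":
                out.append(t_s)        # output S during the first tR iteration
            if t_r == t_s:
                marked = True          # tR also occurs in S
        first_pass = False
        if op == "intersection" and marked:
            out.append(t_r)
        elif op in ("union", "difference") and not marked:
            out.append(t_r)
    # Note: as on the slides, the union case emits S only if R is non-empty.
    return out

R = [(1,), (2,), (3,)]
S = [(3,), (4,)]
print(nested_loop_set_op(R, S, "union"))
print(nested_loop_set_op(R, S, "intersection"))
print(nested_loop_set_op(R, S, "difference"))
```

Only the outer tuple tR is ever marked, which is why the same loop computes R −S S but not S −S R.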

59 For larger relations
R ∩B S and R −B S:
Nested-loop does not seem to be effective for R ∩B S and R −B S.
Remark: we cannot mark S.

63 For larger relations
Nested-loop is particularly simple for join operations.
R join S (body of the nested loop):
   IF tR and tS are joinable THEN
      join tR and tS;
      IF the join is × or ⋈ THEN output the join;
      ELSE \\ the join is ⋈C
         output the join if it satisfies C
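For joins, the nested loop needs no marking at all. The sketch below covers the product and the theta-join, again over flat lists of tuples; the function name and the optional `condition` callback are illustrative assumptions, and a natural join would additionally drop the duplicated join attribute.

```python
def nested_loop_join(r, s, condition=None):
    """Return R x S if condition is None, else the pairs that satisfy condition."""
    out = []
    for t_r in r:
        for t_s in s:
            if condition is None or condition(t_r, t_s):
                out.append(t_r + t_s)   # concatenate the two tuples
    return out

R = [(1,), (2,)]
S = [(2,), (3,)]
print(nested_loop_join(R, S))                                  # product
print(nested_loop_join(R, S, lambda a, b: a[0] < b[0]))        # theta-join
```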

65 For larger relations
Nested-loop (R □ S):
   FOR each tuple tR in R DO
      FOR each tuple tS in S DO
         Apply the operation □ on tR and tS
Summary:
-- Memory: M ≥ 2
-- Cost: T(R)*T(S) + T(R)
Very bad.

67 For larger relations
Nested-loop (R □ S):
   FOR each tuple tR in R DO
      FOR each block bS in S DO
         Apply the operation □ on tR and the tuples in bS
Summary:
-- Memory: M ≥ 2
-- Cost: T(R)*B(S) + T(R)
Still large.

69 For larger relations
Nested-loop (R □ S):
   FOR each block bR in R DO
      FOR each block bS in S DO
         Apply the operation □ on the tuples in bR and the tuples in bS
Summary:
-- Memory: M ≥ 2
-- Cost: B(R)*B(S) + B(R)
Can it be further improved?
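To see how much each refinement saves, the three cost formulas can be evaluated on one example. The sizes below are hypothetical numbers chosen only for illustration, and the function names are not from the notes.

```python
def tuple_tuple_cost(t_r, t_s):
    """Tuple-at-a-time in both loops: T(R)*T(S) + T(R)."""
    return t_r * t_s + t_r

def tuple_block_cost(t_r, b_s):
    """Tuple-at-a-time outer loop, block-at-a-time inner loop: T(R)*B(S) + T(R)."""
    return t_r * b_s + t_r

def block_block_cost(b_r, b_s):
    """Block-at-a-time in both loops: B(R)*B(S) + B(R)."""
    return b_r * b_s + b_r

# Hypothetical sizes: 10 tuples per block in both relations.
T_R, B_R = 10_000, 1_000
T_S, B_S = 5_000, 500

print(tuple_tuple_cost(T_R, T_S))
print(tuple_block_cost(T_R, B_S))
print(block_block_cost(B_R, B_S))
```

With these sizes each step reduces the number of disk I/Os by roughly a factor of ten, the assumed number of tuples per block.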

71 For larger relations
Nested-loop (R □ S):
   FOR each group of blocks in R (max # blocks fitting in M) DO
      FOR each block bS in S DO
         Apply the operation □ on the tuples in the group of R and the tuples in bS
Summary:
-- Memory: M ≥ 2
-- Cost: B(R)*B(S)/M + B(R)
Very good if B(R) or B(S) is only slightly larger than M.
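The chunked nested loop can be simulated with a read counter. This sketch assumes relations are lists of blocks and reserves one of the M buffers for the current block of S, so each chunk of R holds M − 1 blocks; that reservation, the function name, and the `combine` callback are assumptions of this example. The exact count is therefore B(R) + ⌈B(R)/(M − 1)⌉·B(S), which the slide's B(R)*B(S)/M + B(R) approximates.

```python
def chunked_nested_loop(r_blocks, s_blocks, m, combine):
    """Return (results, block_reads) for a nested loop with M memory buffers."""
    if m < 2:
        raise ValueError("need at least two buffers")
    chunk_size = m - 1                  # one buffer is kept for a block of S
    reads = 0
    out = []
    for start in range(0, len(r_blocks), chunk_size):
        chunk = r_blocks[start:start + chunk_size]
        reads += len(chunk)             # read the next chunk of R
        chunk_tuples = [t for block in chunk for t in block]
        for s_block in s_blocks:
            reads += 1                  # S is re-read once per chunk of R
            for t_r in chunk_tuples:
                for t_s in s_block:
                    out.extend(combine(t_r, t_s))
    return out, reads

def equi_join(t_r, t_s):
    """Hypothetical operation: join on equality of the first attribute."""
    return [t_r + t_s] if t_r[0] == t_s[0] else []

R = [[(i,)] for i in range(6)]          # B(R) = 6
S = [[(j,)] for j in range(2, 6)]       # B(S) = 4
joined, reads = chunked_nested_loop(R, S, m=3, combine=equi_join)
print(joined, reads)
```

Here R is split into three chunks of two blocks, so S is scanned three times and the simulation counts 6 + 3·4 block reads.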

72 For larger relations
Nested-loop (R □ S) over groups of blocks of R:
-- Memory: M ≥ 2
-- Cost: B(R)*B(S)/M + B(R)
Should pick the smaller relation for the outer loop (not working for −S).

73 Algorithms Implementing Relational Algebraic Operations
Quick review: what we did
Operations requiring almost no space: π, σ, ∪B, table-scan
-- Memory: M = 2
-- Cost: π(R), σ(R), table-scan(R): B(R); ∪B(R, S): B(R) + B(S)
One-pass algorithms: γ, δ, τ, ∪S, ∩S, −S, ∩B, −B, ×, ⋈, ⋈C
-- Unary: Memory: M ≥ B(R); Cost: B(R)
-- Binary: Memory: M ≥ B(Rsmall); Cost: B(Rsmall) + B(Rlarge)
Nested-loop algorithms for binary operations: ∪S, ∩S, −S, ×, ⋈, ⋈C
-- Memory: M ≥ 2
-- Cost: B(R)*B(S)/M + B(R)

