Next: Data Items Records Blocks Files Memory CS 4432 lecture #5.

1 Next: Data Items, Records, Blocks, Files, Memory (CS 4432 lecture #5)

2 Goal: placing records into blocks
The records of a file are placed into blocks.
Assume fixed-length blocks.
Assume a single file (for now).

3 Options for storing records in blocks:
(1) separating records
(2) spanned vs. unspanned
(3) mixed record types – clustering
(4) split records
(5) sequencing
(6) indirection

4 (1) Separating records
Block: R1 | R2 | R3
(a) no need to separate if the records are fixed size
(b) or, use a special marker
(c) or, give record lengths (or offsets)
    - within each record
    - in the block header
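
Option (c), with the length stored within each record, can be sketched as follows. This is a minimal illustration, not something taken from the lecture: the function names `pack_records` and `unpack_records` and the 2-byte big-endian length field are assumptions, and block padding is ignored.

```python
import struct

def pack_records(records):
    """Concatenate records, each preceded by a 2-byte length (assumed format)."""
    out = bytearray()
    for rec in records:
        out += struct.pack(">H", len(rec))  # length prefix stored with the record
        out += rec
    return bytes(out)

def unpack_records(data):
    """Walk the bytes, using each length prefix to find where the next record starts."""
    records = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        records.append(data[pos:pos + length])
        pos += length
    return records

block = pack_records([b"R1-data", b"R2", b"R3-longer-data"])
print(unpack_records(block))
```

With fixed-size records, as in option (a), the prefix is unnecessary because record i simply starts at i times the record size.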

5 (2) Spanned vs. Unspanned
Unspanned: each record lies within one block
    block 1: R1 R2
    block 2: R3 R4 R5
    ...
Spanned: a record may wrap across 2 blocks
    block 1: R1 R2 R3 (a)
    block 2: R3 (b) R4 R5 R6 R7 (a)
    ...

6 Spanned vs. unspanned
Unspanned is much simpler, but may waste space.
Spanned is essential if record size > block size.

7 Example
106 records, each of size 2,050 bytes (fixed)
block size = 4096 bytes
Only one whole record fits in each block:
    block 1: R1 (2050 bytes), 2046 bytes wasted
    block 2: R2 (2050 bytes), 2046 bytes wasted
Utilization = 2050 / 4096 ≈ 50% -> ½ of the space is wasted
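
The arithmetic in this example can be checked with a short sketch. The function names are hypothetical, the record count of 1000 is an arbitrary illustration, and the spanned case assumes no per-fragment header overhead, which a real system would have to pay.

```python
def unspanned_utilization(record_size, block_size):
    """Fraction of each block holding record data when records may not span blocks."""
    per_block = block_size // record_size
    if per_block == 0:
        raise ValueError("record larger than block: spanning is required")
    return per_block * record_size / block_size

def blocks_needed(n_records, record_size, block_size, spanned):
    """Number of blocks for n_records (spanned case ignores fragment headers)."""
    if spanned:
        total_bytes = n_records * record_size
        return -(-total_bytes // block_size)  # ceiling division
    per_block = block_size // record_size
    if per_block == 0:
        raise ValueError("record larger than block: spanning is required")
    return -(-n_records // per_block)

print(unspanned_utilization(2050, 4096))             # about 0.5
print(blocks_needed(1000, 2050, 4096, spanned=False))
print(blocks_needed(1000, 2050, 4096, spanned=True))
```

Under these assumptions the spanned layout needs roughly half as many blocks, which is the space that the unspanned layout wastes.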

8 (3) Mixed versus uniform record types
Mixed: records of different types (e.g., EMPLOYEE, DEPT) are allowed in the same block.
e.g., a block: EMP e1 | DEPT d1 | DEPT d2

9 Why do we want to mix?
Answer: CLUSTERING
Records that are frequently accessed together should be placed into the same block.
Problems:
    Creates variable-length records in a block
    Aim to avoid duplicates (how to cluster?)
    Inserts/deletes are harder

10 Example: Clustering
Q1: SELECT C_NAME, C_CITY, AMOUNT, …
    FROM DEPOSIT, CUSTOMER
    WHERE DEPOSIT.C_NAME = CUSTOMER.C_NAME
A block layout:
    CUSTOMER, NAME=SMITH
    DEPOSIT, NAME=SMITH
    DEPOSIT, NAME=SMITH
    CUSTOMER, NAME=JONES
Question: good idea or bad idea?

11 If Q1 is frequent (a join on the CUSTOMER and DEPOSIT relations), then clustering is good.
But if instead Q2 is frequent, with
Q2: SELECT * FROM CUSTOMER
then clustering is counter-productive.
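
The tradeoff can be made concrete by counting how many blocks each query touches under two layouts. This is a toy sketch: the layouts, the three-records-per-block capacity, the extra JONES deposits, and the helper `blocks_touched` are all made up for illustration, and the count assumes the system already knows which blocks hold matching records.

```python
def blocks_touched(layout, wanted):
    """Count the blocks that contain at least one record satisfying `wanted`."""
    return sum(1 for block in layout if any(wanted(rec) for rec in block))

# Clustered: each customer is stored next to that customer's deposits.
clustered = [
    [("CUSTOMER", "SMITH"), ("DEPOSIT", "SMITH"), ("DEPOSIT", "SMITH")],
    [("CUSTOMER", "JONES"), ("DEPOSIT", "JONES"), ("DEPOSIT", "JONES")],
]
# Uniform: customers and deposits are kept in separate blocks.
uniform = [
    [("CUSTOMER", "SMITH"), ("CUSTOMER", "JONES")],
    [("DEPOSIT", "SMITH"), ("DEPOSIT", "SMITH"), ("DEPOSIT", "JONES")],
    [("DEPOSIT", "JONES")],
]

def q1_smith(rec):      # Q1 restricted to one customer: the join needs both record types
    return rec[1] == "SMITH"

def q2(rec):            # Q2: all CUSTOMER records
    return rec[0] == "CUSTOMER"

for name, layout in (("clustered", clustered), ("uniform", uniform)):
    print(name, "Q1:", blocks_touched(layout, q1_smith), "Q2:", blocks_touched(layout, q2))
```

In this toy data the clustered layout touches fewer blocks for Q1 and more for Q2, matching the slide's conclusion.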

12 Compromise: no mixing, but keep related records in the same cylinder ...

13 Recap: Storing records in blocks
(1) Separating records
(2) Spanned vs. Unspanned
(3) Mixed record types - Clustering
(4) Split records
(5) Sequencing
(6) Indirection

14 Options for storing records in blocks:
(1) separating records
(2) spanned vs. unspanned
(3) mixed record types – clustering
(4) split records
(5) sequencing
(6) indirection

15 (4) Split records
Typically for hybrid format:
    Fixed part in one block
    Variable part in another block

16 Split records (figure)
Figure: the parts R1 (a), R1 (b), R2 (a), R2 (b), R2 (c) are divided between a block with fixed recs. and a block with variable recs.

17 (5) Sequencing
Ordering records in a file (and block) by some key value.
Sequential file (sequenced file)
Why sequencing? Typically to make it possible to efficiently read records in order.

18 Sequencing Options
(a) Next record physically contiguous: R1 | Next (R1) | ...
(b) Linked: R1 -> Next (R1)
What about INSERT/DELETE?

19 Sequencing Options
(c) Overflow area
Figure: a header and the records in sequence (R1, ...).
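
One common reading of the overflow-area option is sketched below. The slide only names the idea, so the details are assumptions: the class `SequencedFile`, its method names, and the choice to merge at scan time are all hypothetical.

```python
import heapq

class SequencedFile:
    """Main area kept in key order; new records go to an unordered overflow area."""

    def __init__(self, keys):
        self.main = sorted(keys)   # records in sequence
        self.overflow = []         # inserts land here, so the main area is not shifted

    def insert(self, key):
        self.overflow.append(key)

    def scan_in_order(self):
        # Merge the ordered main area with the (sorted) overflow records.
        return list(heapq.merge(self.main, sorted(self.overflow)))

    def reorganize(self):
        # Periodically fold the overflow records back into the main area.
        self.main = self.scan_in_order()
        self.overflow = []

f = SequencedFile([10, 20, 30])
f.insert(25)
f.insert(5)
print(f.scan_in_order())
```

The design choice is that inserts stay cheap, while ordered scans pay extra until the file is reorganized.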

20 (6) Indirection: Addressing
How does one refer to a record (Rx)?
Problem: Records can be on disk or in (virtual) memory. We need a common address, but the physical locations differ.
Many options, ranging from physical to indirect.

21 Purely Physical Addressing
E.g., Record Address (or ID) =
    Block ID: Device ID, Cylinder #, Track #, Block #
    Offset in block

22 Fully Indirect Addressing
Solution: use a Record ID (Oracle: ROWID) as the global address, and maintain a map table.
Map table: Rec ID -> Physical addr. (e.g., rec ID r -> address a)
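
A map table can be sketched as a dictionary from record ID to physical address. The class `MapTable`, its methods, and the (block ID, offset) address format are hypothetical names chosen for illustration, not any particular system's API.

```python
class MapTable:
    """Maps a stable record ID to the record's current physical address."""

    def __init__(self):
        self._next_id = 0
        self._table = {}  # rec ID -> (block ID, offset); address format is assumed

    def add(self, physical_addr):
        rec_id = self._next_id
        self._next_id += 1
        self._table[rec_id] = physical_addr
        return rec_id

    def lookup(self, rec_id):
        # The extra lookup is the cost of indirection.
        return self._table[rec_id]

    def move(self, rec_id, new_addr):
        # Moving a record changes only the map entry, so existing references stay valid.
        self._table[rec_id] = new_addr

table = MapTable()
r = table.add(("block-7", 128))
table.move(r, ("block-9", 0))
print(r, table.lookup(r))
```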

23 Tradeoff
Physical vs. indirect:
    Flexibility to move records (for deletions, insertions)
    Cost of indirection (lookup)
What to do: options in between?

24 Ex. #1: Indirection in block
A block: Header | Free space | R3 | R4 | R1 | R2
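
A minimal sketch of this layout, assuming the header holds one (offset, length) entry per record and records are packed from the end of the block. The class `SlottedBlock`, the 4-bytes-per-entry header estimate, and keeping the header as a Python list rather than serialized bytes are simplifying assumptions.

```python
class SlottedBlock:
    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []        # header: slot number -> (offset, length), or None if deleted
        self.free_end = size   # records grow downward from the end of the block

    def insert(self, record):
        start = self.free_end - len(record)
        header_bytes = 4 * (len(self.slots) + 1)  # assumed size of the header entries
        if start < header_bytes:
            raise ValueError("block full")
        self.data[start:self.free_end] = record
        self.free_end = start
        self.slots.append((start, len(record)))
        return len(self.slots) - 1  # the slot number is the record's address in the block

    def read(self, slot):
        entry = self.slots[slot]
        if entry is None:
            raise KeyError(slot)
        offset, length = entry
        return bytes(self.data[offset:offset + length])

    def delete(self, slot):
        self.slots[slot] = None

    def compact(self):
        """Slide live records to the end of the block; slot numbers do not change."""
        end = len(self.data)
        live = [i for i, e in enumerate(self.slots) if e is not None]
        # Move the highest-offset record first so no unread record is overwritten.
        for i in sorted(live, key=lambda i: self.slots[i][0], reverse=True):
            offset, length = self.slots[i]
            record = bytes(self.data[offset:offset + length])
            end -= length
            self.data[end:end + length] = record
            self.slots[i] = (end, length)
        self.free_end = end
```

Because outside references use the slot number rather than a byte offset, records can be moved within the block by updating only the header.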

25 Ex. #2: Logical block #'s
Use logical block #'s understood by the file system instead of direct disk access.
REC ID = File ID, Block #, Record # or Offset
File System Map: (File ID, Block #) -> Physical Block ID

26 Recap: Storing records in blocks
(1) Separating records
(2) Spanned vs. Unspanned
(3) Mixed record types - Clustering
(4) Split records
(5) Sequencing
(6) Indirection

