1
CS 4432 lecture #4
CS4432: Database Systems II, Lecture #5
Professor Elke A. Rundensteiner
2
TODAY
- Lecture on chapter 3 for ~30 min
- Then introduction to project 1, ~20 min
- Project-1 team formation later today (watch for it on my.wpi)
3
Storage Layout: how to lay out data on disk (chapter 3)
4
Overview: Data Items, Records, Blocks, Files, Memory
5
Record: a collection of related data items (called FIELDS)
E.g., Employee record: name field, salary field, date-of-hire field, ...
6
Types of records. Main choices:
- FIXED vs VARIABLE FORMAT
- FIXED vs VARIABLE LENGTH
7
Fixed format
A SCHEMA (not associated with each record) contains information such as:
- # fields (attributes)
- type of each field (length)
- order of attributes in record
- meaning of each field (domain)
- constraints (primary key, etc.)
8
Example: fixed format and length
Employee record schema:
(1) E#, 2 byte integer
(2) E.name, 10 char.
(3) Dept, 2 byte code
We can simply concatenate fields.
Records:
55 | s m i t h | 02
83 | j o n e s | 01
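The concatenation idea can be sketched with Python's struct module. This is only an illustration, not the course's implementation: the byte order, the space padding of the name, and storing the Dept code as two ASCII characters are assumptions, and the helper names are hypothetical.

```python
import struct

# Assumed layout for the Employee schema on the slide:
#   E#     : 2-byte unsigned integer (big-endian is an arbitrary choice)
#   E.name : 10 characters, padded with spaces (assumption)
#   Dept   : 2-byte code, stored here as two ASCII characters (assumption)
EMPLOYEE = struct.Struct(">H10s2s")


def pack_employee(eno, name, dept):
    """Concatenate the three fields into one fixed-length record."""
    return EMPLOYEE.pack(eno, name.encode("ascii").ljust(10), dept.encode("ascii"))


def unpack_employee(raw):
    """Split a fixed-length record back into its fields using the schema."""
    eno, name, dept = EMPLOYEE.unpack(raw)
    return eno, name.decode("ascii").rstrip(), dept.decode("ascii")


# Two records stored back to back; no separators are needed because
# every record has the same length (EMPLOYEE.size bytes).
records = pack_employee(55, "smith", "02") + pack_employee(83, "jones", "01")
second = records[EMPLOYEE.size:2 * EMPLOYEE.size]
```

Because the schema is kept outside the records, the i-th record is found by plain arithmetic on the record size.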
9
Variable format
What: Not all fields are included in the record, and fields may appear in different orders.
Then: The record itself must contain its format, i.e., it is “Self Describing”.
10
Why variable format?
- “sparse” records
- repeating fields
- evolving formats
11
Example: variable format and length
Record: 2 | 5 I 46 | 4 S 4 FORD
- 2: # fields
- 5: code identifying field as E#
- I: integer type
- 4: code for Ename
- S: string type
- 4: length of str.
Field name codes could also be strings, i.e., TAGS.
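A self-describing record of this kind could be encoded roughly as follows. This is a minimal sketch under stated assumptions: one byte each for the field count, field code, and string length, a 4-byte integer value, and the function names are all choices made for illustration, not something the slide specifies.

```python
import struct


def encode_record(fields):
    """Encode (field_code, value) pairs; each value is an int or a str.

    Assumed layout: 1 byte field count, then for each field
    1 byte code, 1 byte type tag ('I' or 'S'), and the value
    (4-byte integer, or 1 byte length followed by the characters).
    """
    out = bytearray([len(fields)])
    for code, value in fields:
        if isinstance(value, int):
            out += bytes([code]) + b"I" + struct.pack(">i", value)
        else:
            data = value.encode("ascii")
            out += bytes([code]) + b"S" + bytes([len(data)]) + data
    return bytes(out)


def decode_record(raw):
    """Read the record back without consulting any external schema."""
    count, pos, fields = raw[0], 1, []
    for _ in range(count):
        code, type_tag = raw[pos], raw[pos + 1:pos + 2]
        pos += 2
        if type_tag == b"I":
            (value,) = struct.unpack_from(">i", raw, pos)
            pos += 4
        elif type_tag == b"S":
            length = raw[pos]
            value = raw[pos + 1:pos + 1 + length].decode("ascii")
            pos += 1 + length
        else:
            raise ValueError(f"unknown type tag {type_tag!r}")
        fields.append((code, value))
    return fields


# Field code 5 stands for E# and code 4 for Ename; the values are sample data.
raw = encode_record([(5, 46), (4, "FORD")])
```

Since every field carries its own code and type, fields can be omitted or reordered and the decoder still works, at the cost of extra bytes in each record.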
12
EXAMPLE: variable format record with repeating fields
E.g., Employee has one or more children:
3 | E_name: Fred | Child: Sally | Child: Tom
Do repeating fields always require variable format and length?
13
Repeating fields in a fixed format, e.g., a person and her hobbies:
Mary | Sailing | Chess | --
Then must allocate maximum number of repeating fields (if not used, set to null)
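One way to picture the fixed-format alternative is to reserve a fixed number of slots and null out the unused ones. The maximum of 4 hobbies, the 10-byte slot width, and the use of NUL bytes as the null value are assumptions made for this sketch only.

```python
import struct

MAX_HOBBIES = 4   # assumed maximum number of repeating fields
SLOT = 10         # assumed width of each field in bytes

# One name slot followed by MAX_HOBBIES hobby slots, all the same width.
PERSON = struct.Struct("<" + f"{SLOT}s" * (1 + MAX_HOBBIES))


def pack_person(name, hobbies):
    """Store a person in a fixed-length record; unused hobby slots stay null."""
    if len(hobbies) > MAX_HOBBIES:
        raise ValueError("more hobbies than the record format allows")
    slots = [name] + list(hobbies) + [""] * (MAX_HOBBIES - len(hobbies))
    # struct pads short byte strings with NUL bytes, which act as "null" here.
    return PERSON.pack(*(s.encode("ascii") for s in slots))


def unpack_person(raw):
    """Return (name, hobbies), skipping the null slots."""
    values = [v.rstrip(b"\x00").decode("ascii") for v in PERSON.unpack(raw)]
    return values[0], [h for h in values[1:] if h]


raw = pack_person("Mary", ["Sailing", "Chess"])
```

Every record then has the same length regardless of how many hobbies are present, which is exactly the space-for-simplicity trade the slide describes.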
14
Many variants between fixed and variable format.
Example 1: Include record type in record
5 | 27 | ....
- 5: record type, tells me what to expect (i.e., points to schema)
- 27: record length
15
Record header: data at beginning that describes record. May contain:
- pointer to schema (record type)
- length of record
- time stamp (create time, mod. time)
- other stuff (e.g., ROW-ID in Oracle)
16
Example 2: Variant between FIXED/VAR format
Hybrid format: one part is fixed, other is variable.
E.g.: All employees have E#, name, dept; and other fields vary.
25 | Smith | Toy | 2 | retired | Hobby: chess
(2 = # of var fields)
17
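Also, many variations in internal organization of record. Just to show one example, laid out two ways:
(a) length of field stored before each field:
3 | 10 F1 | 5 F2 | 12 F3 | * * *
(b) total size and offsets stored in a header:
3 | 32 | 5 | 15 | 20 | F1 | F2 | F3
Here 32 is the total size and 5, 15, 20 are the offsets of F1, F2, F3; the header occupies positions 0 to 4.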
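The offset-based organization can be sketched as below. The one-byte header entries (which cap the record at 255 bytes) and the function names are assumptions for illustration; the field lengths 10, 5, and 12 are chosen so the header comes out as total size 32 with offsets 5, 15, 20.

```python
def build_record(fields):
    """Build a record whose header holds: # fields, total size, one offset per field.

    Every header entry is assumed to be a single byte, so the whole
    record must fit in 255 bytes.
    """
    header_len = 2 + len(fields)
    offsets, pos = [], header_len
    for field in fields:
        offsets.append(pos)      # where this field starts inside the record
        pos += len(field)
    total = pos
    if total > 255:
        raise ValueError("record too large for one-byte header entries")
    return bytes([len(fields), total] + offsets) + b"".join(fields)


def get_field(record, i):
    """Fetch field i directly from its offset, without scanning earlier fields."""
    count, total = record[0], record[1]
    start = record[2 + i]
    end = record[3 + i] if i + 1 < count else total
    return record[start:end]


# Three fields of length 10, 5 and 12 bytes.
rec = build_record([b"A" * 10, b"B" * 5, b"C" * 12])
```

Compared with storing a length in front of each field, the offset header lets any field be reached in one step, which matters when records have many variable-length fields.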
18
Question: We have seen examples for:
* Fixed format and length records
* Variable format and length records
(a) Does fixed format and variable length make sense?
(b) Does variable format and fixed length make sense?
19
Data Items, Records, Blocks, Files, Memory
Next: blocks
20
Next: placing records into blocks
[Figure: a file drawn as a sequence of blocks]
- assume fixed length blocks
- assume a single file (for now)
21
Options for storing records in blocks:
(1) separating records
(2) spanned vs. unspanned
(3) mixed record types - clustering
(4) split records
(5) sequencing
(6) indirection
22
(1) Separating records
Block: R1 | R2 | R3
(a) no need to separate - fixed size recs.
(b) special marker
(c) give record lengths (or offsets)
  - within each record
  - in block header
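Option (c) with the length stored within each record might look like the sketch below. The 64-byte block size, the 2-byte length prefix, and the use of a zero length to mark the end of the used space are assumptions for illustration, not details given on the slide.

```python
import struct

BLOCK_SIZE = 64  # assumed, kept small so the example is easy to inspect


def pack_block(records, block_size=BLOCK_SIZE):
    """Place variable-length records in one block, each prefixed by its length."""
    out = bytearray()
    for rec in records:
        if not rec:
            raise ValueError("empty records not supported: length 0 marks the end")
        out += struct.pack(">H", len(rec)) + rec
    if len(out) > block_size:
        raise ValueError("records do not fit in one block (unspanned)")
    # Pad the unused tail of the block with zero bytes.
    return bytes(out.ljust(block_size, b"\x00"))


def unpack_block(block):
    """Walk the block from the start, using each length to find the next record."""
    records, pos = [], 0
    while pos + 2 <= len(block):
        (length,) = struct.unpack_from(">H", block, pos)
        if length == 0:          # reached the padding
            break
        records.append(block[pos + 2:pos + 2 + length])
        pos += 2 + length
    return records


block = pack_block([b"R1-data", b"R2", b"R3-longer"])
```

Keeping the lengths inside the records means a reader must scan from the start of the block; putting offsets in the block header instead would allow jumping straight to a given record.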
23
(2) Spanned vs. Unspanned
Unspanned: records must be within one block
block 1: R1 | R2    block 2: R3 | R4 | R5 ...
Spanned: a record may be split across two blocks
block 1: R1 | R2 | R3 (a)    block 2: R3 (b) | R4 | R5 | R6 | R7 (a) ...
24
Spanned vs. unspanned:
- Unspanned is much simpler, but may waste space…
- Spanned essential if record size > block size
25
Example
- 10^6 records, each of size 2,050 bytes (fixed)
- block size = 4096 bytes
block 1: R1 (2050 bytes), 2046 bytes wasted
block 2: R2 (2050 bytes), 2046 bytes wasted
Utilization is about 50% -> ½ of space is wasted
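The arithmetic behind this example can be checked with a short calculation. The sketch ignores block and record headers (an assumption), and the spanned figure is added only for comparison; the function names are hypothetical.

```python
def unspanned_layout(num_records, record_size, block_size):
    """Blocks needed and space utilization when records may not cross blocks."""
    per_block = block_size // record_size
    if per_block == 0:
        raise ValueError("record larger than block: spanning is required")
    blocks = (num_records + per_block - 1) // per_block   # ceiling division
    return blocks, num_records * record_size / (blocks * block_size)


def spanned_layout(num_records, record_size, block_size):
    """Blocks needed and utilization when records are packed back to back."""
    total = num_records * record_size
    blocks = (total + block_size - 1) // block_size        # ceiling division
    return blocks, total / (blocks * block_size)


# Numbers from the slide: 10^6 records of 2,050 bytes, 4,096-byte blocks.
u_blocks, u_util = unspanned_layout(10**6, 2050, 4096)
s_blocks, s_util = spanned_layout(10**6, 2050, 4096)
```

With these numbers only one record fits per block in the unspanned case, so utilization is 2050/4096, while the spanned layout needs roughly half as many blocks.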
26
(3) Mixed record types
Mixed: records of different types (e.g., EMPLOYEE, DEPT) allowed in same block
e.g., a block: EMP e1 | DEPT d1 | DEPT d2
27
Why do we want to mix?
Answer: CLUSTERING. Records that are frequently accessed together should be in the same block.
Problems:
- Creates variable length records in block
- Must avoid duplicates (how to cluster?)
- Insert/deletes are harder
28
Example: Clustering
Q1: select A#, C_NAME, C_CITY, …
    from DEPOSIT, CUSTOMER
    where DEPOSIT.C_NAME = CUSTOMER.C_NAME
a block: CUSTOMER, NAME=SMITH | DEPOSIT, NAME=SMITH
29
If Q1 is frequent, clustering is good.
But if Q2 is frequent:
Q2: SELECT * FROM CUSTOMER
then CLUSTERING IS COUNTERPRODUCTIVE
30
Compromise: No mixing, but keep related records in same cylinder ...
31
Recap: Storing records in blocks
(1) Separating records
(2) Spanned vs. Unspanned
(3) Mixed record types - Clustering
(4) Split records
(5) Sequencing
(6) Indirection
32
Now on to Project 1 …