Fall 2004 ECE569 Lecture 03-2.1 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang www.ece.rutgers.edu/~yyzhang Course.


1 Fall 2004 ECE569 Lecture 03-2.1 ECE 569 Database System Engineering Fall 2004 Yanyong Zhang www.ece.rutgers.edu/~yyzhang Course URL www.ece.rutgers.edu/~yyzhang/fall04

2 Topic IV: Physical Data Organization
- Chapter 14 of Gray and Reuter
- Goal: efficiently implement a tuple-oriented file system on top of a block-oriented file system.
  + Storage allocation
  + Tuple addressing
  + Enumeration
  + Content addressing
  + Maintenance
  + Protection

3 Design Questions
- Fixed vs. variable size tuples
- Small vs. large tuples
- Pinned vs. unpinned tuples
- Homogeneous vs. heterogeneous blocks
- Reorganization?

4 Basic Assumptions
- Blocks are mapped into memory by the buffer manager.
- To avoid unnecessary data movement, tuples are accessed in the page itself rather than copied out of it.
- Fields must be aligned properly.

5 Fixed size tuples
- All fields are stored in the sequence in which they are declared in the CREATE statement, at the maximum length that was specified.

  create table foo (A char(1), B integer, C varchar(5), D integer);

  Layout: header | A | B | C | D
  Byte offsets marked in the figure: 0, 1, 4, 8, 13, 16, 20
- Pros
  + Field offset is easy to calculate
  + Updating a tuple is simple
- Cons
  + Waste of space due to VARCHAR
  + Tuples cannot cross page boundaries
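The offsets marked on this slide can be reproduced with a short calculation. The sketch below is only an illustration: the `field_spec` type and both function names are hypothetical, and it assumes a 1-byte tuple header and 4-byte integers aligned on 4-byte boundaries, which is what the marked offsets suggest but the slide does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical field descriptor: declared maximum size and required alignment. */
struct field_spec {
    size_t size;
    size_t align;
};

/* Round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

/*
 * Compute the byte offset of each field of a fixed-size tuple.
 * header_size is the assumed size of the tuple header.
 * Writes n offsets to out and returns the offset just past the last field.
 */
static size_t layout_fixed_tuple(const struct field_spec *fields, size_t n,
                                 size_t header_size, size_t *out)
{
    size_t pos = header_size;
    for (size_t i = 0; i < n; i++) {
        pos = align_up(pos, fields[i].align);  /* insert padding if needed */
        out[i] = pos;
        pos += fields[i].size;                 /* reserve the maximum length */
    }
    return pos;
}
```

For table foo, passing the fields as (1,1), (4,4), (5,1), (4,4) with a 1-byte header places A at 1, B at 4, C at 8 and D at 16, and the tuple ends at 20. The three bytes between C and D are alignment padding, and C always occupies its full 5 bytes, which is the VARCHAR waste listed under Cons.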

6 Variable Size Tuples
Variable size fields:
- All fields are stored.
- Access to fixed size fields is direct.
- Access to variable sized fields relies on one level of indirection.
- Note: the stored order is not the same as the declared order.

7 Optional Fields
- ft_i indicates the field its corresponding pointer refers to.
- The first part of the tuple (descriptor) describes the rest.

8 Large Objects
- Modify the previous representation: pointers point to components in another page if necessary.
- The descriptor must fit in one page.
- Cross-page pointers take significantly more space, so store them only if necessary (offset in page = 2 bytes, page# = 4 bytes, file# ≥ 1 byte).

9 Long Tuples
- Historically, long tuples (which span more than one page) are treated differently.
- Tuples are long because one or more attributes are large.
- Solutions:
  + Long pages (DB2 uses page sizes as big as 32K or 64K)
  + Separate pages for long fields
  + Overflow pages

10 Separate Pages for Long Field
(Figure labels: header, tuple header, long field; Page P, Page K)
- The tuple body is stored on a regular page while the long field (e.g., an image) is stored in a large page.
- Rationale: long fields are accessed less often and have a different access pattern.

11 Overflow Pages
(Figure labels: Page P, Page K, Page M)
- The whole tuple is an object.

12 Page Layouts
- Efficient and simple way to allocate storage within a page
- Efficient way to locate tuples based on tuple-id
  + Allow tuples to move within the page (reorganization)
  + Allow tuples to move to other pages (overflow)

13 Storage Allocation I
- Bitmap (fixed size tuples)
  + Partition the page into tuple-sized portions
  + Each tuple slot has a “free” bit
  + To avoid dangling references, a “deleted” bit is also needed

  Free:    00000001111111111
  Deleted: 00001000000000000
  (Figure: slots Tuple 0 through Tuple 6)
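A minimal sketch of the bitmap scheme follows. The type and function names are hypothetical, the 17-slot page size is taken from the width of the bit strings on the slide, and the policy of never reusing a deleted slot is an assumption made here so that an old reference to the slot cannot silently resolve to a different tuple.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS_PER_PAGE 17  /* assumed: matches the 17-bit maps on the slide */

struct bitmap_page {
    uint32_t free_bits;     /* bit i set: slot i has never held a tuple */
    uint32_t deleted_bits;  /* bit i set: slot i held a tuple that was deleted */
};

/* Mark every slot free and none deleted. */
static void page_init(struct bitmap_page *p)
{
    p->free_bits = (1u << SLOTS_PER_PAGE) - 1u;
    p->deleted_bits = 0;
}

/* Claim the lowest-numbered free slot; returns its index or -1 if the page is full. */
static int alloc_slot(struct bitmap_page *p)
{
    for (int i = 0; i < SLOTS_PER_PAGE; i++) {
        if (p->free_bits & (1u << i)) {
            p->free_bits &= ~(1u << i);
            return i;
        }
    }
    return -1;
}

/* Mark an allocated slot deleted; it stays non-free so stale references are detectable. */
static void delete_slot(struct bitmap_page *p, int slot)
{
    assert(slot >= 0 && slot < SLOTS_PER_PAGE);
    assert(!(p->free_bits & (1u << slot)));
    p->deleted_bits |= 1u << slot;
}

/* A slot holds a live tuple when it is neither free nor deleted. */
static int slot_is_live(const struct bitmap_page *p, int slot)
{
    uint32_t mask = 1u << slot;
    return !(p->free_bits & mask) && !(p->deleted_bits & mask);
}
```

Allocating seven slots and then deleting slot 4 gives the state drawn on the slide: slots 0 through 6 are no longer free, and only slot 4 carries the deleted bit.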

14 Storage Allocation II
- Stack allocation
  + Allocated tuples are stored contiguously at the beginning of the page.
  + A “free” pointer marks the beginning of free space.
  + When space is freed up in the page, slide tuples “up” to fill the space.
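Stack allocation can be sketched in a few lines. This is an illustration under assumptions, not the lecture's code: the names and the 256-byte data area are hypothetical, and the caller is assumed to know each tuple's offset and length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_DATA_SIZE 256  /* assumed size of the tuple area */

struct stack_page {
    size_t free_off;                    /* the "free" pointer: first unused byte */
    unsigned char data[PAGE_DATA_SIZE];
};

/* Append a tuple at the free pointer; returns its offset, or -1 if it does not fit. */
static long stack_alloc(struct stack_page *p, const void *tuple, size_t len)
{
    if (len > PAGE_DATA_SIZE - p->free_off)
        return -1;
    memcpy(p->data + p->free_off, tuple, len);
    p->free_off += len;
    return (long)(p->free_off - len);
}

/* Free the len bytes at off and slide all later tuples up to close the hole. */
static void stack_free(struct stack_page *p, size_t off, size_t len)
{
    assert(off + len <= p->free_off);
    memmove(p->data + off, p->data + off + len, p->free_off - (off + len));
    p->free_off -= len;
}
```

Sliding changes the offset of every tuple stored after the freed one, so any reference that records a raw byte offset becomes stale after a delete.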

15 Storage Allocation III
- Unpinning tuples
  + Page directory containing tuple pointers (indirection)
  + Tuple-id (TID) is (fileID, Block#, directory entry)
  + Tuples and directory grow toward each other.
  + Tuples can be easily moved within the page.
  + If a tuple “overflows,” leave a TID indicating where it moved to.
  + If a tuple is deleted, reclaim its space but leave the directory entry intact (tombstone).
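The page directory can be sketched as follows. All names and the 512-byte page size are hypothetical, a zero length is used here as the tombstone marker (an assumption), and forwarding TIDs for overflowed tuples are left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 512  /* assumed page payload size */

struct slot { uint16_t off; uint16_t len; };  /* len == 0 marks a tombstone */

struct slotted_page {
    uint16_t nslots;    /* directory entries in use */
    uint16_t data_end;  /* first free byte after the tuple area */
    unsigned char bytes[PAGE_BYTES];  /* tuples grow up from 0, directory down from the end */
};

/* Directory entry i is stored at the end of the page, growing downward. */
static struct slot get_slot(const struct slotted_page *p, unsigned i)
{
    struct slot s;
    memcpy(&s, p->bytes + PAGE_BYTES - (i + 1) * sizeof s, sizeof s);
    return s;
}

static void set_slot(struct slotted_page *p, unsigned i, struct slot s)
{
    memcpy(p->bytes + PAGE_BYTES - (i + 1) * sizeof s, &s, sizeof s);
}

/* Store a tuple; returns its directory entry number, or -1 if the page is full. */
static int page_insert(struct slotted_page *p, const void *tuple, uint16_t len)
{
    size_t dir_bytes = (size_t)(p->nslots + 1) * sizeof(struct slot);
    if (len == 0 || (size_t)p->data_end + len + dir_bytes > PAGE_BYTES)
        return -1;  /* tuple area and directory would collide */
    memcpy(p->bytes + p->data_end, tuple, len);
    struct slot s = { p->data_end, len };
    set_slot(p, p->nslots, s);
    p->data_end += len;
    return p->nslots++;
}

/* Resolve a directory entry to the tuple bytes; NULL for tombstones or bad entries. */
static const void *page_lookup(const struct slotted_page *p, unsigned slot_no,
                               uint16_t *len)
{
    if (slot_no >= p->nslots)
        return NULL;
    struct slot s = get_slot(p, slot_no);
    if (s.len == 0)
        return NULL;
    *len = s.len;
    return p->bytes + s.off;
}

/* Reclaim the tuple's space, fix up moved tuples, and leave a tombstone entry. */
static void page_delete(struct slotted_page *p, unsigned slot_no)
{
    assert(slot_no < p->nslots);
    struct slot victim = get_slot(p, slot_no);
    if (victim.len == 0)
        return;
    memmove(p->bytes + victim.off, p->bytes + victim.off + victim.len,
            p->data_end - (victim.off + victim.len));
    p->data_end -= victim.len;
    for (unsigned i = 0; i < p->nslots; i++) {
        struct slot s = get_slot(p, i);
        if (s.len != 0 && s.off > victim.off) {
            s.off -= victim.len;  /* tuple slid up; only the directory changes */
            set_slot(p, i, s);
        }
    }
    victim.len = 0;
    set_slot(p, slot_no, victim);
}
```

Because callers hold only the directory entry number, deleting one tuple can move its neighbours without invalidating any TID, which is what makes the tuples unpinned.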

16 Unpinning Tuples

17 Tuple Access
- Information from the data dictionary is needed to access tuples, e.g., attribute names, data types, and offsets within the tuple.
- A pointer to the first/last block of the relation is also needed to allow insertion and searches.
- To use an index, the system must know where the bucket/root directory is (i.e., its page number) and which attributes constitute the search key.
- To improve performance, information in the data dictionary can be “compiled” into internal C structures.
- When a relation is created or modified, the in-memory structure must be modified accordingly.

18 Example

  struct tables {
      char *table_name;
      int num_attributes;
      int num_key_attributes;
      page_id_t hash_directory;
      struct attribute *attribute_desc;
  } table_catalog[MAX_RELATIONS];

  struct attribute {
      char *attribute_name;
      int position;
      data_type_t data_type;
      int field_offset;
  };
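A sketch of how such compiled structures might be consulted at run time follows. It repeats the slide's two structs so that it compiles on its own. The definitions of `page_id_t`, `data_type_t` and `MAX_RELATIONS` are assumptions, since the slide does not show them, and `find_attribute` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RELATIONS 64                 /* assumed catalog capacity */
typedef unsigned int page_id_t;          /* assumed representation */
typedef enum { TYPE_CHAR, TYPE_INTEGER, TYPE_VARCHAR } data_type_t;  /* assumed */

struct attribute {
    char *attribute_name;
    int position;
    data_type_t data_type;
    int field_offset;
};

struct tables {
    char *table_name;
    int num_attributes;
    int num_key_attributes;
    page_id_t hash_directory;
    struct attribute *attribute_desc;
} table_catalog[MAX_RELATIONS];

/* Look up an attribute descriptor by name; returns NULL if the table has no such attribute. */
static const struct attribute *find_attribute(const struct tables *t,
                                              const char *name)
{
    assert(t != NULL && name != NULL);
    for (int i = 0; i < t->num_attributes; i++) {
        if (strcmp(t->attribute_desc[i].attribute_name, name) == 0)
            return &t->attribute_desc[i];
    }
    return NULL;
}
```

With the descriptor in hand, a field can be read directly at `field_offset` bytes from the start of the tuple in the buffered page, without consulting the on-disk data dictionary.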

19 Tuple Access

