Chapter 3: Representing Data Elements

Chapter 3, slide 2: Fields, Records, Blocks, Files
Fields (attributes) need to be represented by fixed- or variable-length sequences of bytes
Fields are put together in fixed- or variable-length collections called “records” (tuples, or objects)
A collection of records that forms a relation or the extent of a class is stored as a collection of blocks, called a file

Chapter 3, slide 3: Technical Issues
e.g.,
    CREATE TABLE MovieStar(
        name CHAR(30) PRIMARY KEY,
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );
How do we represent SQL datatypes as fields?
Tuples as records?
Collections of records or tuples in blocks of memory?
Relations as collections of blocks?
How do we cope with record sizes that may be different for different tuples?
What happens if the size of a record changes?

Chapter 3, slide 4: Representing Objects
OODBMS and ORDBMS
Approximations
–An object is a tuple
–Its fields or “instance variables” are attributes
Differences
–Objects can have methods
–Objects can have an object identifier (OID), which is an address

Chapter 3, slide 5: Representing Objects
e.g.,
    interface Star {
        attribute string name;
        attribute Struct Addr {string street, string city} address;
        relationship Set<Movie> starredIn inverse Movie::stars;
    };
–A Star object can be represented by a record
–Problems of representing Star objects
  address is a structure
  starredIn is a set of references to Movie objects

Chapter 3, slide 6: Representing Data Elements
Let us begin by considering how the principal SQL datatypes are represented as fields of a record
All data is represented as a sequence of bytes

Chapter 3, slide 7: How to Represent SQL Datatypes (1)
Integer (short): 2 bytes
–e.g., 35 is 0000000000100011
Real, floating point
–n bits for mantissa, m bits for exponent…
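
The two-byte pattern for 35 can be checked with a short sketch. It assumes a signed, big-endian encoding (the slide does not state byte order), and `short_to_bits` is a name made up for this illustration.

```python
import struct

def short_to_bits(value: int) -> str:
    """Return the 16-bit pattern of a signed short (big-endian assumed)."""
    raw = struct.pack(">h", value)  # ">h" = big-endian signed 2-byte integer
    return "".join(f"{byte:08b}" for byte in raw)

print(short_to_bits(35))  # 0000000000100011
```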

Chapter 3, slide 8: How to Represent SQL Datatypes (2)
Fixed-length character strings, CHAR(n)
–The field is an array of n bytes
–If the value is shorter than n, the remaining bytes are filled with a special pad character
–The pad character’s 8-bit code is not one of the legal characters for SQL strings
–For example, if attribute A is declared to have type CHAR(5) and has value ‘cat’, the field holds c, a, t followed by two pad characters
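
A minimal sketch of CHAR(n) padding follows. The pad byte `0x00` is an assumption chosen for illustration (the slide only requires that it not be a legal SQL string character), and `encode_char` / `decode_char` are hypothetical helper names.

```python
PAD = b"\x00"  # assumed pad byte; any code outside the legal SQL characters would do

def encode_char(value: str, n: int) -> bytes:
    """Store value in exactly n bytes, padding on the right."""
    data = value.encode("ascii")
    if len(data) > n:
        raise ValueError(f"value does not fit in CHAR({n})")
    return data.ljust(n, PAD)

def decode_char(field: bytes) -> str:
    """Strip the trailing pad bytes to recover the value."""
    return field.rstrip(PAD).decode("ascii")

field = encode_char("cat", 5)
print(field)               # b'cat\x00\x00'
print(decode_char(field))  # cat
```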

Chapter 3, slide 9: How to Represent SQL Datatypes (3)
Variable-length character strings
–SQL type, VARCHAR(n)
–(n+1) bytes are dedicated to the value of the string, no matter how long it is
Common representations for VARCHAR
–Length plus content
  n itself cannot exceed 255 (why? the length must fit in the single extra byte)
  Any unused bytes are ignored
–Null-terminated string

Chapter 3, slide 10: How to Represent SQL Datatypes (4)
Example
–Assumptions
  The type for an attribute A: VARCHAR(10)
  The value for an attribute A: ‘cat’
–Length plus content: 3, c, a, t, followed by 7 unused bytes
–Null terminated: c, a, t, null, followed by 7 unused bytes
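
Both VARCHAR representations can be sketched as below for VARCHAR(10) and the value 'cat'. The function names are hypothetical, and filling the unused bytes with `0x00` is an assumption, since the slides only say unused bytes are ignored.

```python
def encode_length_prefixed(value: str, n: int) -> bytes:
    """One length byte followed by n content bytes (n + 1 bytes in total)."""
    data = value.encode("ascii")
    if n > 255 or len(data) > n:
        raise ValueError("length must fit in one byte and value in n bytes")
    return bytes([len(data)]) + data.ljust(n, b"\x00")

def decode_length_prefixed(field: bytes) -> str:
    length = field[0]  # the first byte says how many content bytes are in use
    return field[1:1 + length].decode("ascii")

def encode_null_terminated(value: str, n: int) -> bytes:
    """Content followed by a null byte, in a field of n + 1 bytes."""
    data = value.encode("ascii")
    if len(data) > n or b"\x00" in data:
        raise ValueError("value too long or contains the terminator")
    return (data + b"\x00").ljust(n + 1, b"\x00")

def decode_null_terminated(field: bytes) -> str:
    return field.split(b"\x00", 1)[0].decode("ascii")

a = encode_length_prefixed("cat", 10)
b = encode_null_terminated("cat", 10)
print(len(a), a[0], decode_length_prefixed(a))  # 11 3 cat
print(len(b), decode_null_terminated(b))        # 11 cat
```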

Chapter 3, slide 11: How to Represent SQL Datatypes (5)
Dates
–Represented as a fixed-length character string
–“YYYY-MM-DD” (10-character string)

Chapter 3, slide 12: How to Represent SQL Datatypes (6)
Times
–“HH:MM:SS” (8-character string)
–HH is represented on a 24-hour clock
–SQL2 also allows fractions of a second, e.g., ’20:19:02.25’
Two ways to store times with fractions
–Limit the precision and store times as if they were type VARCHAR(n), where n is the greatest length a time can have (maybe 9 + 2: eight characters and a period, plus two fractional digits)
–Store times as true variable-length values

Chapter 3, slide 13: How to Represent SQL Datatypes (7)
Bits
–SQL type, BIT(n)
–A sequence of bits can be packed eight to a byte
–Example: 010111110011 -> 01011111 and 00110000 (the last byte is padded with zeros)
Boolean
–One implementation may be TRUE -> 1111 and FALSE -> 0000
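
The BIT(n) example can be reproduced with a small sketch that packs bits eight to a byte and pads the final byte with zeros on the right. `pack_bits` is a hypothetical helper name.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters eight to a byte, zero-padded on the right."""
    padded_length = -(-len(bits) // 8) * 8  # round up to a multiple of 8
    padded = bits.ljust(padded_length, "0")
    return bytes(int(padded[i:i + 8], 2) for i in range(0, padded_length, 8))

packed = pack_bits("010111110011")
print([f"{byte:08b}" for byte in packed])  # ['01011111', '00110000']
```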

Chapter 3, slide 14: How to Represent SQL Datatypes (9)
Enumerated Types
–These values are given symbolic names
–We can represent the values of an enumerated type by integer codes
Example
–A set of colors: {RED, GREEN, BLUE, YELLOW}
–RED -> 1, GREEN -> 2, BLUE -> 3, YELLOW -> 4
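
As a sketch, the color codes above map naturally onto an integer enumeration. Storing each code in one byte is an assumption made here for illustration, and the helper names are hypothetical.

```python
from enum import IntEnum

class Color(IntEnum):
    RED = 1
    GREEN = 2
    BLUE = 3
    YELLOW = 4

def encode_color(color: Color) -> bytes:
    """Store the integer code in a single byte (assumed field width)."""
    return bytes([color])

def decode_color(field: bytes) -> Color:
    return Color(field[0])

print(encode_color(Color.BLUE))   # b'\x03'
print(decode_color(b"\x04").name) # YELLOW
```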

Chapter 3, slide 15: Records (1)
We shall now begin the discussion of how fields are grouped together into records
Record
–Collection of related data items (called FIELDS)
–E.g.: Employee record with name field, salary field, date-of-hire field, …

Chapter 3, slide 16: Records (2)
Records have a schema, which includes
–Names of fields
–Data types of fields
–Field offsets within the record

Chapter 3, slide 17: Building Fixed-Length Records
Example (MovieStar relation)
–name, a 30-byte string
–address, of type VARCHAR(255)
–gender, a single byte, holding either ‘F’ or ‘M’
–birthdate, of type DATE
Layout (byte offsets): name at 0, address at 30, gender at 286, birthdate at 287, end of record at 297

Chapter 3, slide 18: Efficiency Issues
–Some machines allow more efficient reading and writing of data that begins at a byte of main memory whose address is a multiple of 4 (or 8 if the machine has a 64-bit processor)
–With a 32-bit processor machine, we round all field and record lengths up to the next multiple of 4
Layout (byte offsets): name at 0, address at 32, gender at 288, birthdate at 292, end of record at 304

Chapter 3, slide 19: Record Headers
–There is information that can be kept in the record but that is not the value of any field
  The record schema (a pointer)
  The length of the record
  Timestamps (last modified or read)
Layout (byte offsets): header (pointer to schema, length, timestamp) at 0, name at 12, address at 44, gender at 300, birthdate at 304, end of record at 316
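
The offsets on the last three slides (297, 304, and 316 bytes in total) follow from adding up field lengths, optionally rounded up to a multiple of 4 and preceded by a 12-byte header. The sketch below recomputes them; `layout` is a hypothetical helper, and the field widths (256 bytes for VARCHAR(255), 10 bytes for DATE) follow the representations described earlier in the deck.

```python
def layout(fields, align=1, header_bytes=0):
    """Return ({field name: byte offset}, total record length)."""
    def round_up(n):
        return -(-n // align) * align  # next multiple of align

    offsets = {}
    position = round_up(header_bytes)
    for name, length in fields:
        offsets[name] = position
        position += round_up(length)
    return offsets, position

# name CHAR(30), address VARCHAR(255) -> 256 bytes, gender 1 byte, birthdate 10 bytes
MOVIE_STAR = [("name", 30), ("address", 256), ("gender", 1), ("birthdate", 10)]

print(layout(MOVIE_STAR))                            # unaligned, ends at 297
print(layout(MOVIE_STAR, align=4))                   # aligned, ends at 304
print(layout(MOVIE_STAR, align=4, header_bytes=12))  # with header, ends at 316
```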

Chapter 3, slide 20: Packing Fixed-Length Records into Blocks (1)
Records are stored in blocks of the disk
Example (a typical block holding records): header | record 1 | record 2 | …
Does that scheme work well for variable-length records?

Chapter 3, slide 21: Packing Fixed-Length Records into Blocks (2)
–Block header holds information such as:
  Links to one or more other blocks (Chapter 4)
  The role played by this block
  Relation the tuples of this block belong to
  A “directory” giving the offset of each record in the block
  A “block ID”
  Timestamp(s)
  And more (For example, …)

Chapter 3, slide 22: Representing Block and Record Addresses
Here, we consider how addresses, pointers, or references to records and blocks can be represented
Server processes vs. client processes
Database addresses vs. memory addresses

Chapter 3, slide 23: Database Address vs. Memory Address
Database address
–An address in the server’s database address space
–An address in secondary storage
–Typically eight or so bytes
Memory address
–An address in virtual memory
–Typically 4 bytes

Chapter 3, slide 24: How to Represent Database Addresses
Physical Addresses
–Host id, disk id
–The cylinder number, track number, block number
–The offset of the beginning of the record
–8-16 bytes long
Logical Addresses
–An arbitrary string of bytes of some fixed length

Chapter 3, slide 25: A Map Table
–The map table has one entry per record, pairing its logical address with its physical address
–The level of indirection involved in the map table allows us flexibility
–When we move or delete a record, all we have to do is to change the entry for the record in the table
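
A map table can be sketched as a dictionary from logical to physical addresses. The class below is a hypothetical illustration: the logical address is an arbitrary string and the physical address is a made-up `(block number, offset)` pair.

```python
class MapTable:
    """Logical address -> physical address, with one entry per record."""

    def __init__(self):
        self._physical = {}

    def register(self, logical, physical):
        self._physical[logical] = physical

    def lookup(self, logical):
        return self._physical[logical]

    def move(self, logical, new_physical):
        """Relocate a record: only the table entry changes."""
        if logical not in self._physical:
            raise KeyError(logical)
        self._physical[logical] = new_physical

table = MapTable()
table.register("rec-17", (4, 128))  # hypothetical block 4, offset 128
table.move("rec-17", (9, 0))        # record moved to block 9
print(table.lookup("rec-17"))       # (9, 0)
```

Pointers elsewhere in the database keep holding the logical address `"rec-17"`, so none of them need to be rewritten when the record moves.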

Chapter 3, slide 26: Structured Addresses
Structured address schemes
–Many combinations of logical and physical addresses are possible
Example
–Record address = physical address of block + slot number in the offset table
–The table grows from the front, and records are placed starting at the end of the block
Block layout: header | offset table | unused | record 3 | record 2 | record 1
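
The offset-table scheme can be sketched as follows. This is a simplified model rather than a real page format: the offset table is kept as a Python list instead of being serialized into the block, and the header size and the 2 bytes charged per table entry are assumptions.

```python
class Block:
    SLOT_BYTES = 2  # assumed size of one offset-table entry

    def __init__(self, size=64, header_bytes=8):
        self.data = bytearray(size)
        self.header_bytes = header_bytes
        self.offsets = []     # offset table: slot number -> (start, length)
        self.free_end = size  # records are packed downward from the end

    def free_space(self):
        table_end = self.header_bytes + self.SLOT_BYTES * len(self.offsets)
        return self.free_end - table_end

    def insert(self, record: bytes) -> int:
        """Place the record at the end of the free area and return its slot number."""
        if len(record) + self.SLOT_BYTES > self.free_space():
            raise ValueError("block is full")
        start = self.free_end - len(record)
        self.data[start:self.free_end] = record
        self.free_end = start
        self.offsets.append((start, len(record)))
        return len(self.offsets) - 1

    def get(self, slot: int) -> bytes:
        start, length = self.offsets[slot]
        return bytes(self.data[start:start + length])

block = Block()
slot = block.insert(b"record 1")
block.insert(b"record 2")
print(slot, block.get(slot))  # 0 b'record 1'
```

A record's address is then the block's physical address plus the slot number, so records can be slid around inside the block by updating only the offset table.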

Chapter 3, slide 27: Addresses are now part of records in OODBs and ORDBs!
Pointers (addresses) are part of records
–Not common in relational databases
–Common in OODBs or ORDBs
Every block, record, object, or referenceable data item has two forms of address
–Database address
–Memory address

Chapter 3, slide 28: The Translation Table
–Following a database address to an item in memory is time-consuming
–The translation table turns database addresses into their equivalents in memory
–Only those items currently in memory are mentioned in the translation table
Each entry pairs a database address (DBaddr) with a memory address (mem-addr)

Chapter 3, slide 29: Pointer Swizzling (1)
To avoid the cost of translating repeatedly from database addresses to memory addresses, several techniques have been developed (pointer swizzling)
–When we move a block from secondary to main memory, pointers within the block may be swizzled (i.e., translated from the database address space to the virtual address space)

Chapter 3, slide 30: Pointer Swizzling (2)
[Figure: Block 1 is read from disk into memory while Block 2 stays on disk; pointers in the in-memory copy are labeled Swizzled or Unswizzled]

Chapter 3, slide 31: Pointer Swizzling (3)
Automatic swizzling
–As soon as a block is brought into memory, we locate all its pointers and addresses and enter them into the translation table
–These include both
  The pointers from records in the block to elsewhere
  The addresses of the block itself and/or its records

Chapter 3, slide 32: Pointer Swizzling (4)
Swizzling on demand
–Leave all pointers unswizzled when the block is first brought into memory
–If and when we follow a pointer P that is inside some block of memory, we swizzle it
Automatic swizzling or swizzling on demand
–Which do you want to use?
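
Swizzling on demand can be sketched with a translation table held in a dictionary. Everything here is a simplified, hypothetical model: `disk` stands in for secondary storage, Python object references stand in for memory addresses, and `lookups` merely counts how often the translation table is consulted.

```python
class Pointer:
    def __init__(self, db_addr):
        self.db_addr = db_addr  # database address, always kept
        self.mem_ref = None     # in-memory reference, set once swizzled

    @property
    def swizzled(self):
        return self.mem_ref is not None

class BufferPool:
    def __init__(self, disk):
        self.disk = disk       # database address -> stored value
        self.translation = {}  # database address -> in-memory object
        self.lookups = 0

    def follow(self, pointer):
        if pointer.swizzled:   # fast path: no table lookup needed
            return pointer.mem_ref
        self.lookups += 1
        obj = self.translation.get(pointer.db_addr)
        if obj is None:        # item not in memory yet: bring it in
            obj = {"value": self.disk[pointer.db_addr]}
            self.translation[pointer.db_addr] = obj
        pointer.mem_ref = obj  # swizzle on first use
        return obj

    def unswizzle(self, pointer):
        """Restore the database-address form, e.g. before writing a block back."""
        pointer.mem_ref = None

pool = BufferPool({"db:7": "record seven"})
p = Pointer("db:7")
pool.follow(p)
pool.follow(p)
print(p.swizzled, pool.lookups)  # True 1
```

Automatic swizzling would instead call the equivalent of `follow` on every pointer as soon as its block is loaded, trading load-time work for cheaper dereferences later.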

Chapter 3, slide 33: Pointer Swizzling (5)
No swizzling
–Never swizzle pointers
–Advantage: records cannot be pinned in memory
Programmer control of swizzling
–Programmers have full control over swizzling
When a block is moved from memory back to disk, memory addresses must be replaced by database addresses (unswizzled)

Chapter 3, slide 34: Pinned Records and Blocks (1)
Pinned records
–A block in memory is said to be pinned if it cannot at the moment be safely written back to disk
–For instance, suppose a block B1 has a swizzled pointer to some data item in block B2
–A block like B2, which is referred to by a swizzled pointer from somewhere else, is therefore pinned (why? if B2 were written back, the pointer in B1 would lead to a buffer that no longer holds B2)
–To unpin a block, any pointers to it must be unswizzled

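Chapter 3, slide 35: Pinned Records and Blocks (2)
–A linked list of occurrences of a swizzled pointer
[Figure: the translation table entry that maps database address x to memory address y heads a linked list of the swizzled pointers holding y]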

Chapter 3, slide 36: Variable-Length Data and Records
How to represent
–Data items whose size varies
–Repeating fields
–Variable-format records
–Enormous fields (such as BLOBs)

Chapter 3, slide 37: Records with Variable-Length Fields (1)
One solution
–Put all fixed-length fields ahead of the variable-length fields
The record header includes:
–The length of the record
–Pointers to the beginnings of all the variable-length fields (possibly except the first variable-length field)

Chapter 3, slide 38: Records with Variable-Length Fields (2)
–A MovieStar record with name and address implemented as variable-length character strings
Record layout: other header information | record length | to address | birthdate | gender | name | address
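
A sketch of such a record follows, with the fixed-length fields first and a header that stores the record length and a pointer (byte offset) to `address`. The 2-byte header fields, the omission of other header information, and the sample values are assumptions made for illustration.

```python
import struct

HEADER = struct.Struct(">HH")    # record length, offset of address
FIXED = struct.Struct(">10s1s")  # birthdate (10 bytes), gender (1 byte)

def encode(name, address, gender, birthdate):
    name_b, addr_b = name.encode("ascii"), address.encode("ascii")
    name_start = HEADER.size + FIXED.size  # name follows the fixed fields directly
    addr_start = name_start + len(name_b)
    total = addr_start + len(addr_b)
    return (HEADER.pack(total, addr_start)
            + FIXED.pack(birthdate.encode("ascii"), gender.encode("ascii"))
            + name_b + addr_b)

def decode(record):
    total, addr_start = HEADER.unpack_from(record, 0)
    birthdate, gender = FIXED.unpack_from(record, HEADER.size)
    name_start = HEADER.size + FIXED.size
    name = record[name_start:addr_start].decode("ascii")  # name ends where address begins
    address = record[addr_start:total].decode("ascii")
    return name, address, gender.decode("ascii"), birthdate.decode("ascii")

# Made-up sample values, not taken from the slides
rec = encode("Example Star", "1 Example Road", "F", "1970-01-01")
print(decode(rec))
```

No pointer to `name` is needed because it is the first variable-length field and always starts right after the fixed-length fields.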

Chapter 3, slide 39: Records with Repeating Fields (1)
Solution 1
–Group all occurrences of field F together
–Then put in the record header a pointer to the first occurrence

Chapter 3, slide 40: Records with Repeating Fields (2)
–A record with a repeating group of references to movies
Record layout: other header information | record length | to address | to movie pointers | name | address | pointers to movies

Chapter 3, slide 41: Records with Repeating Fields (3)
Solution 2
–Keep the record of fixed length
–Put the variable-length portion on a separate block
What to keep in the record
–Pointers to the place where each repeating field begins
–Either how many repetitions there are, or where the repetitions end

Chapter 3, slide 42: Records with Repeating Fields (4)
Record: record header information | to name | length of name | to address | length of address | to movie references | number of references
Additional space: name | address | movie references

Chapter 3, slide 43: Variable-Format Records (1)
Solution
–Use tagged fields
A tagged field consists of:
–Information about the role of this field
  The attribute or field name
  The type of the field (if needed)
  The length of the field (if needed)
–The value of the field

Chapter 3, slide 44: Variable-Format Records (2)
A record with tagged fields
… | N | S | 14 | Clint Eastwood | R | S | 16 | Hog’s Breath Inn | …
–N: code for name; S: code for string type; 14: length
–R: code for restaurant owned; S: code for string type; 16: length
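
Tagged fields can be sketched as a sequence of (field code, type code, length, value) entries. The one-byte width of each code and of the length is an assumption, and the sample values are chosen so that their lengths match the slide's length codes 14 and 16.

```python
def encode_tagged(fields):
    """fields: iterable of (field code, type code, string value)."""
    out = bytearray()
    for field_code, type_code, value in fields:
        data = value.encode("ascii")
        out += field_code.encode("ascii") + type_code.encode("ascii")
        out.append(len(data))  # one-byte length, so values are at most 255 bytes
        out += data
    return bytes(out)

def decode_tagged(record):
    fields, pos = [], 0
    while pos < len(record):
        field_code = chr(record[pos])
        type_code = chr(record[pos + 1])
        length = record[pos + 2]
        value = record[pos + 3:pos + 3 + length].decode("ascii")
        fields.append((field_code, type_code, value))
        pos += 3 + length      # skip to the next tag
    return fields

record = encode_tagged([("N", "S", "Clint Eastwood"), ("R", "S", "Hog's Breath Inn")])
print(record[2])               # 14
print(decode_tagged(record))
```

Because every field carries its own tag, fields may appear in any order and may be omitted entirely, which is what makes the record format variable.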

Chapter 3, slide 45: Records that Do Not Fit in a Block (1)
When does this occur?
–Often, values do not fit in one block
A record with two or more fragments is called spanned
A record that does not cross a block boundary is unspanned

Chapter 3, slide 46: Records that Do Not Fit in a Block (2)
Storing spanned records across blocks
Block 1: block header | record header, record 1 | record header, record 2-a
Block 2: block header | record header, record 2-b | record header, record 3
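
Splitting a record into fragments can be sketched as below. The helper `split_record` is hypothetical and deals only with the payload bytes; a real fragment would also carry header information (for example, whether it is a fragment and where the next fragment lives).

```python
def split_record(record: bytes, space_left: int, block_capacity: int) -> list:
    """Split record into fragments: the first fills the space left in the
    current block, the rest fill fresh blocks of block_capacity bytes."""
    if block_capacity <= 0:
        raise ValueError("block_capacity must be positive")
    fragments = []
    room = space_left if space_left > 0 else block_capacity
    pos = 0
    while pos < len(record):
        fragments.append(record[pos:pos + room])
        pos += room
        room = block_capacity  # later fragments start in an empty block
    return fragments

print(split_record(b"ABCDEFGHIJ", space_left=4, block_capacity=8))  # [b'ABCD', b'EFGHIJ']
print(split_record(b"AB", space_left=4, block_capacity=8))          # [b'AB']
```

The first call yields a spanned record with two fragments, like record 2-a and 2-b above; the second record fits in the remaining space and stays unspanned.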

Chapter 3, slide 47: BLOBs (1)
BLOB: binary large object
Examples
–Images: GIF, JPEG
–Movies: MPEG
–Audio, radar, …

Chapter 3, slide 48: BLOBs (2)
Storage and retrieval of BLOBs
–Better to store a BLOB on consecutively allocated blocks (say, on one cylinder)
–May be necessary to stripe the BLOB across several disks to allow parallel access

Chapter 3, slide 49: Record Modifications: Insertion (1)
Unordered relation
–Find a block with some empty space
–Get a new block if there is none
Ordered relation
–First, locate the appropriate block for that record
–If there is space in the block, we can easily slide records to make room
–If there is no space in the block, we have to find room outside the block

Chapter 3, slide 50: Record Modifications: Insertion (2)
Finding room outside the block
–Find space on a “nearby” block
–Create an overflow block
–Combination of the above is also possible
–Leave “a forward address” if necessary
[Figure: block B linked to the overflow block for B]

Chapter 3, slide 51: Record Modifications: Deletion (1)
To save space
–Slide records if necessary
–Or possibly maintain an available-space list in the block header
Overflow chain
–Consider the total amount of used space on that chain
–Reorganize (if needed)

Chapter 3, slide 52: Record Modifications: Deletion (2)
Pointers to the deleted record
–Put a “tombstone” in place of the record
Where the tombstone is placed
–Using an offset-table scheme: a null pointer in the offset table
–Using a map table: a null pointer in place of the physical address in the map table
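
The offset-table variant of a tombstone can be sketched as follows. `OffsetTable` is a hypothetical class; `None` plays the role of the null pointer, and slots are never reused, so a stale pointer to a deleted record cannot silently reach a different record.

```python
class OffsetTable:
    def __init__(self):
        self._slots = []  # slot number -> record offset, or None (tombstone)

    def add(self, offset: int) -> int:
        """Register a record stored at the given offset; return its slot number."""
        self._slots.append(offset)
        return len(self._slots) - 1

    def delete(self, slot: int) -> None:
        self._slots[slot] = None  # leave a tombstone instead of freeing the slot

    def resolve(self, slot: int) -> int:
        offset = self._slots[slot]
        if offset is None:
            raise LookupError("record was deleted")
        return offset

table = OffsetTable()
first = table.add(100)
second = table.add(60)
table.delete(first)
print(table.resolve(second))  # 60
print(table.add(20))          # 2 (slot 0 is not reused)
```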

Chapter 3, slide 53: Record Modifications: Update
Fixed-length record update
Variable-length record update
–If the record grows: the same problems as insertion
–If the record shrinks: the same opportunities as deletion, but no tombstone is needed
