Published by Merry Pierce. Modified over 9 years ago.
1
IT 20303 The Relational DBMS Section 06
2
Relational Database Theory Physical Database Design
3
Physical Database Design
  –Goals
    Improve performance
      –By minimizing disk I/O
    Improve management of the data
      –By grouping tables that can be managed as a group
4
Physical design decisions are based on:
  –Use of the data (volume, frequency)
  –Features supported by the specific RDBMS
  –Disk storage configuration
5
DBA initially sets up the physical database
  –Tunes physical parameters on an ongoing basis
    As usage patterns change
    As new hardware/software options become available
6
Steps in Physical Design Process
  –Determine which tables can be managed as a group
    Many RDBMSs support the concept of a container (Oracle tablespace, dbspace; Access uses the .mdb file)
      –A collection of tables and indexes
7
  –Develop a plan for allocating tables to disk devices
    Consider parallel disk controllers
    Group tables together that are frequently joined
    Distribute heavily accessed tables to different disk devices
      –To avoid excessive head movement on one disk
8
  –Build indexes on table columns, based on frequency of use
  –Restructure tables if necessary
    Fragment large tables into multiple smaller ones
    De-normalize tables if appropriate
9
Example of a Container
[Figure labels: Table 1, Table 2, ..., Table N; Tablespace; OS File]
10
Managing a collection of Tables, Indexes
  –Purpose of container concept
    Relate tables, indexes to physical disk files
    Aid in the management of the database
      –Example: A tablespace can be taken offline, backed up, and restored while the remainder of the database is online
11
  –Support clustering data from related tables in the same file
    So that related data is read with the same I/O request
12
How the RDBMS processes a user request
  –RDBMS parses, validates, and optimizes the SQL request
  –Determines the disk file in which the table is stored
    Specific to each RDBMS & OS
13
  –Initiates I/O request to operating system, if necessary
    I/O is requested only if the required data is not already in the buffers
  –Processes execution plan using data in its buffers
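The "I/O only if the data is not in the buffers" step can be illustrated with a toy buffer pool. This is a minimal sketch, not how any particular RDBMS is implemented: the class name BufferPool, the least-recently-used eviction policy, and the block ids are all assumptions made for illustration.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool (hypothetical): counts simulated disk reads."""

    def __init__(self, capacity):
        self.capacity = capacity      # number of blocks that fit in memory
        self.blocks = OrderedDict()   # block_id -> data, oldest use first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Buffer hit: no I/O request is issued.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Buffer miss: simulate an I/O request to the operating system.
        self.disk_reads += 1
        data = f"contents of block {block_id}"
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            # Assumed policy: evict the least recently used block.
            self.blocks.popitem(last=False)
        return data


pool = BufferPool(capacity=2)
for block_id in [1, 2, 1, 3, 2]:
    pool.get(block_id)

print(pool.disk_reads)  # five requests, four of which needed a disk read
```

Only the second request for block 1 is served from memory; the others miss because the pool holds just two blocks.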
14
Indexes
  –Index is a separate structure (table)
    Points into the data table
    Built on one or more columns in the data table
15
Comments on Indexing
  –An index can be built on any column or combination of columns
  –An index can be unique or non-unique
  –An index on the primary key is called the primary index
  –Most RDBMSs use an internal row id as the pointer to the row
  –Use of the index is transparent to the user
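The transparency point can be seen in a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in RDBMS; the table emp, the index idx_emp_dept, and the sample data are hypothetical. The SQL text of the query never changes, only the optimizer's plan does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp (name, dept) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 10}") for i in range(1000)],
)

query = "SELECT name FROM emp WHERE dept = ?"


def plan(sql, params):
    """Return the optimizer's plan steps as a list of strings."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the description


plan_before = plan(query, ("dept3",))
rows_before = conn.execute(query, ("dept3",)).fetchall()

# Non-unique index on a single column; the query text is not touched.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept)")

plan_after = plan(query, ("dept3",))
rows_after = conn.execute(query, ("dept3",)).fetchall()

print(plan_before)  # a full table scan
print(plan_after)   # a search that names idx_emp_dept
```

The exact wording of the plan text varies between SQLite versions, so the example prints it rather than relying on a fixed string.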
16
Use of an index
  –Provides access to a row based on data value(s)
  –Prevents duplicate values (a unique index is the only way to enforce this)
  –Supports sequential processing on the indexed field
  –Improves performance
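Duplicate prevention through a unique index might look like the following sketch, again using SQLite via Python's sqlite3 module. The table student and the index idx_student_no are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_no TEXT, name TEXT)")
# The unique index is what rejects a second row with the same student_no.
conn.execute("CREATE UNIQUE INDEX idx_student_no ON student (student_no)")
conn.execute("INSERT INTO student VALUES ('S100', 'Ada')")

try:
    conn.execute("INSERT INTO student VALUES ('S100', 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # Raised by sqlite3 when the unique index is violated.
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, row_count)
```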
17
Use of an index improves performance on Retrieval
  –Processing an index is more efficient than processing a table (for reads)
    Index is usually small, relative to the table
      –Can be held entirely in memory
    The smaller the index value, the more entries fit per block, and the more likely the index will be in memory
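A rough calculation shows why key size matters. The figures below are assumptions chosen for illustration (8 KB blocks, an 8-byte row id per entry, no block header or free space), not values from any specific RDBMS, and only the lowest level of the index is counted.

```python
BLOCK_SIZE = 8192   # bytes per block (assumed)
ROW_ID_SIZE = 8     # bytes for the row pointer in each entry (assumed)


def entries_per_block(key_bytes):
    """How many (key, row id) entries fit in one block."""
    return BLOCK_SIZE // (key_bytes + ROW_ID_SIZE)


def blocks_needed(rows, key_bytes):
    """Blocks required to hold one entry per row (ceiling division)."""
    per_block = entries_per_block(key_bytes)
    return -(-rows // per_block)


small_key_blocks = blocks_needed(1_000_000, 4)    # e.g. a 4-byte integer key
large_key_blocks = blocks_needed(1_000_000, 56)   # e.g. a 56-byte text key

print(entries_per_block(4), small_key_blocks)    # 682 entries per block, 1467 blocks
print(entries_per_block(56), large_key_blocks)   # 128 entries per block, 7813 blocks
```

Under these assumptions the index on the small key needs roughly one fifth of the memory to stay fully cached.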
18
Most RDBMSs use a type of B-Tree Index
  –B-tree indexes were designed for efficient search of a sorted list
  –Algorithms exist for managing and maintaining B-trees
19
B-trees were introduced by Bayer and McCreight (1972).
  –They are a special m-ary balanced tree used in databases because their structure allows records to be inserted, deleted, and retrieved with guaranteed worst-case performance
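The worst-case guarantee comes from the tree staying balanced, so the number of levels grows only logarithmically with the number of entries. The sketch below is a simplified model, not a full B-tree: it assumes every node has the same number of children, and the fanout values 100 (full nodes) and 50 (half-full nodes) are illustrative.

```python
def levels_needed(entries, fanout):
    """Levels a balanced tree needs so that fanout**levels >= entries."""
    levels = 1
    capacity = fanout
    while capacity < entries:
        capacity *= fanout
        levels += 1
    return levels


best_case = levels_needed(1_000_000, 100)   # every node completely full
worst_case = levels_needed(1_000_000, 50)   # every node only half full

print(best_case, worst_case)  # 3 4
```

Even with half-full nodes, a lookup among a million entries touches only four levels in this model.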
20
B-Tree
21
Use of an index degrades performance on Updates
  –Inserting a row is the source of much disk I/O (overhead)
    Every index on the table must also be searched and updated
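One way to make the insert overhead visible without relying on timing is to count the pages the database has to maintain. This is a sketch using SQLite via Python's sqlite3 module; the table t, its columns, and the row count are hypothetical, and the page count is used only as a proxy for the extra work each insert causes.

```python
import sqlite3


def pages_after_load(index_count, rows=5000):
    """Load the same rows with 0..3 secondary indexes; return pages used."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT)")
    for col in ["a", "b", "c"][:index_count]:
        conn.execute(f"CREATE INDEX idx_t_{col} ON t ({col})")
    # Every insert below must also add an entry to each index on t.
    conn.executemany(
        "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
        [(f"a{i:06d}", f"b{i:06d}", f"c{i:06d}") for i in range(rows)],
    )
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return pages


pages_no_index = pages_after_load(0)
pages_three_indexes = pages_after_load(3)
print(pages_no_index, pages_three_indexes)
```

The indexed load ends up with noticeably more pages, each of which had to be found and written during the inserts.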
22
Frequently inserting rows leads to index block overflow
  –Causes much disk I/O as overflow condition is processed
23
Techniques for managing volatile tables (many inserts, deletes)
  –Partially fill index blocks when creating the index
  –Periodically restructure (Drop, Create) the indexes
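The drop-and-create technique can be sketched as follows with SQLite via Python's sqlite3 module; the table event_log and the index idx_event_code are hypothetical. Partial filling of index blocks is vendor-specific (for example, Oracle's PCTFREE or PostgreSQL's fillfactor storage parameter) and SQLite has no equivalent option, so only the rebuild is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event_log (id INTEGER PRIMARY KEY, code TEXT)")
conn.execute("CREATE INDEX idx_event_code ON event_log (code)")


def index_names():
    """List the indexes currently defined on event_log."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'event_log'"
    ).fetchall()
    return [row[0] for row in rows]


# Drop the index before a heavy batch of inserts ...
conn.execute("DROP INDEX idx_event_code")
names_during_load = index_names()

conn.executemany(
    "INSERT INTO event_log (code) VALUES (?)",
    [(f"E{i % 50}",) for i in range(2000)],
)

# ... then recreate it so it is built once over the final data.
conn.execute("CREATE INDEX idx_event_code ON event_log (code)")
conn.commit()
names_after_rebuild = index_names()
print(names_during_load, names_after_rebuild)
```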
24
Indexing: Strengths and Weaknesses
  –Strengths
    Improves performance on retrieval of data
    Can be built or dropped at any time
    Usage is transparent to the user
  –Weaknesses
    Degrades update performance
25
De-normalization
  –De-normalization means combining two (or more) tables into one
    Usually done when tables are frequently joined
  –Whether to de-normalize depends on usage
    Depends on how applications and users access the data
26
De-normalization is done to improve performance
  –Tailors data structures for one specific application’s use
  –Improves performance of one type of access at expense of others
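The trade-off can be sketched with a small example in SQLite via Python's sqlite3 module. The tables customer, orders, and order_report and their data are hypothetical. The combined table serves the reporting query without a join, but a single change of city now has to touch every copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized design: each customer's city is stored once.
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme', 'Tulsa'), (2, 'Birch', 'Omaha');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);

    -- De-normalized design: the join result is stored as one table.
    CREATE TABLE order_report AS
        SELECT o.order_id, o.amount, c.cust_id, c.name, c.city
        FROM orders o JOIN customer c ON c.cust_id = o.cust_id;
""")

# The reporting application reads one table, with no join.
report_rows = conn.execute(
    "SELECT order_id, name, city FROM order_report ORDER BY order_id"
).fetchall()

# The same logical change touches one row in the normalized design
# and one row per order in the de-normalized design.
normalized_updates = conn.execute(
    "UPDATE customer SET city = 'Dallas' WHERE cust_id = 1"
).rowcount
denormalized_updates = conn.execute(
    "UPDATE order_report SET city = 'Dallas' WHERE cust_id = 1"
).rowcount

print(report_rows)
print(normalized_updates, denormalized_updates)  # 1 2
```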
27
De-normalization Trade-Offs
  Normalization
    Eliminates update anomalies
    Minimizes data redundancy
    Supports simpler logic
    Provides application-independent database design
    Encourages sharing of data
  De-normalization
    Improves performance for specific application(s)
28
When to De-Normalize
  –This is EVIL, Do Not Do…
  –When does de-normalization have minimal impact?
    Data is accessed primarily on a read-only basis
    Data is accessed primarily by one application
29
When to de-normalize
  –After database design is done and tables are normalized to 3NF
  –After clustering related tables in the same logical container
  –After considering trade-offs and usage of data
30
Alternatives to de-normalization
  –Physical placement of data
    Use of container
    Can improve performance without impacting logical design
  –Selective hardware upgrades
    More main memory, expanded storage, cache storage devices
31
Fragmentation: a better alternative to de-normalization
  –Means breaking one table into two (or more) tables
    Usually done when one table is very large
    Or when groups of users almost exclusively access a subset of data in a table
32
Fragmentation can be based on selection (horizontal fragments, by rows) or projection (vertical fragments, by columns)
  –Must be able to reconstruct the original table, by union (horizontal fragments) or join (vertical fragments)
  –Primary key column(s) must be included in all vertical fragments
    Disadvantage is that the DBA must be aware of all the fragmented tables
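Both kinds of fragmentation, and the reconstruction of the original table, can be sketched in SQLite via Python's sqlite3 module. The table employee, the fragment names, and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                           region TEXT, salary REAL);
    INSERT INTO employee VALUES
        (1, 'Ann', 'EAST', 50000), (2, 'Bob', 'WEST', 60000),
        (3, 'Cy', 'EAST', 55000);

    -- Horizontal fragments: selection on region.
    CREATE TABLE employee_east AS SELECT * FROM employee WHERE region = 'EAST';
    CREATE TABLE employee_west AS SELECT * FROM employee WHERE region = 'WEST';

    -- Vertical fragments: projection, primary key kept in both.
    CREATE TABLE employee_public AS SELECT emp_id, name, region FROM employee;
    CREATE TABLE employee_pay AS SELECT emp_id, salary FROM employee;
""")

original = conn.execute("SELECT * FROM employee ORDER BY emp_id").fetchall()

# Horizontal fragments are recombined with a union.
from_union = conn.execute(
    "SELECT * FROM employee_east UNION ALL SELECT * FROM employee_west "
    "ORDER BY emp_id"
).fetchall()

# Vertical fragments are recombined with a join on the primary key.
from_join = conn.execute(
    "SELECT p.emp_id, p.name, p.region, s.salary "
    "FROM employee_public p JOIN employee_pay s ON s.emp_id = p.emp_id "
    "ORDER BY p.emp_id"
).fetchall()

print(from_union == original, from_join == original)
```

Without emp_id in employee_pay there would be no column to join on, which is why every vertical fragment must carry the primary key.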
33
Physical Design Review
34
Physical Database Design
  –Goals
    Improve performance
      –By minimizing disk I/O
    Improve management of the data
      –By grouping tables that can be managed as a group
35
Indexing: Strengths and Weaknesses
  –Strengths
    Improves performance on retrieval of data
    Can be built or dropped at any time
    Usage is transparent to the user
  –Weaknesses
    Degrades update performance
36
Questions?