
Presentation transcript:

1 Agenda – 04/18/2006 and 04/20/2006
- Identify tasks in physical database design.
- Define the design goals for physical database design.
- Discuss relevant tasks in physical database design.
- Discuss considerations for database performance.

2 What is physical database design?
- The process of translating a logical description of data into technical specifications for storing and retrieving data.
- Preparing documentation for actual implementation of tables in a database.

3 Physical vs. logical design
- A physical design can look exactly like a logical design.
  - Small database: the logical design is usually the same as the physical design.
- Or a physical design can look different from a logical design.
  - Large database: the physical design will probably change the entity structure to ensure good performance.
- Differences between physical and logical design stem from:
  - Goals.
  - Constraints.

4 Design goals for physical database design
- Provide adequate performance.
- Ensure database integrity.
- Provide database security.
- Anticipate recoverability.

5 Tasks in physical design
- Convert entities into tables.
  - Identify all necessary data attributes.
  - Determine correct size and data type for each data attribute.
  - Choose an appropriate primary key.
  - Identify foreign keys necessary to sustain relationships.
  - Define necessary constraints.
- Enhance performance.
  - Identify size and access methods of data.
  - Choose appropriate hardware.
  - Create indices.
  - De-normalize the design as necessary.
  - Create design and procedures for archiving data.

7 Questions to answer during physical design for the sample database
- How should the super-type of EMPLOYEE be related to the required sub-types?
  - Separate tables or the same table?
- How do you relate a sub-type of a generalization relationship (FACULTY) with a weak entity (COURSE OFFERING)?
- How will the super-type of COURSE be related to the potential sub-types of the course?
  - Separate tables or the same table?
- What should you do with the concatenated key in COURSE OFFERING?

Name           Type          Primary Key  Foreign Key  Other Constraints
SSN            Char(9)       Yes          No           Not null
Name           Varchar2(30)  No                        Not null
Address1       Varchar2(30)  No
Address2       Varchar2(30)  No
City           Varchar2(20)  No
State          Char(2)       No
Zip            Char(9)       No
Birth_date     Date          No
Emp_type       Char(2)       No                        Must be f or c
Ed_level       Char(6)       No
Grant_type     Char(8)       No                        Must be ‘A1’ through ‘A7’
Fund_Category  Number(4)     No
Emp_level      Char(5)       No
Contract_type  Char(4)       No

Name         Type           Primary Key  Foreign Key       Constraints
Course_id    Char(6)        Yes          No                Not null
Name         Varchar2(25)   No                             Not null
Description  Varchar2(75)   No
Min_credits  Number(1)      No                             Must be >= 1
Max_credits  Number(1)      No                             Must be <= 6

Name         Type           Primary Key  Foreign Key       Constraints
Course_id    Char(6)        Yes          Yes – ref course  Not null
Course_type  Char(6)        Yes          No                Must be ‘d’ or ‘cap’
Start_date   Date           No                             Not null
Approval     Char(15)       No                             Not null
Qual         Varchar2(100)  No                             Not null
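
The two course tables above could be declared roughly as follows. This is a minimal sketch using Python's built-in sqlite3 module rather than the Oracle dialect that the Varchar2 types suggest. The table names course and course_subtype and the sample rows are assumptions, since the slides do not name the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE course (
    course_id   CHAR(6)     NOT NULL PRIMARY KEY,
    name        VARCHAR(25) NOT NULL,
    description VARCHAR(75),
    min_credits NUMERIC(1)  CHECK (min_credits >= 1),
    max_credits NUMERIC(1)  CHECK (max_credits <= 6)
);
-- Hypothetical name for the sub-type table; it carries a concatenated primary key.
CREATE TABLE course_subtype (
    course_id   CHAR(6)      NOT NULL REFERENCES course (course_id),
    course_type CHAR(6)      NOT NULL CHECK (course_type IN ('d', 'cap')),
    start_date  DATE         NOT NULL,
    approval    CHAR(15)     NOT NULL,
    qual        VARCHAR(100) NOT NULL,
    PRIMARY KEY (course_id, course_type)
);
""")
conn.execute("INSERT INTO course VALUES ('DB0101', 'Intro to Databases', NULL, 3, 3)")


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = try_insert("INSERT INTO course_subtype VALUES ('DB0101', 'cap', '2006-04-18', 'dean', 'none')")
bad_type = try_insert("INSERT INTO course_subtype VALUES ('DB0101', 'xyz', '2006-04-18', 'dean', 'none')")
bad_fk = try_insert("INSERT INTO course_subtype VALUES ('NOPE01', 'd', '2006-04-18', 'dean', 'none')")
bad_credits = try_insert("INSERT INTO course VALUES ('DB0001', 'Zero credit', NULL, 0, 3)")
print(ok_row, bad_type, bad_fk, bad_credits)
```

Only the first insert is accepted; the others violate the check constraint on Course_type, the foreign key to course, and the minimum-credits rule respectively.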

10 Choosing datatypes for attributes
- A datatype is a name or label for a set of values and some operations which one can perform on that set of values.
- Examples in SQL: varchar, date, number, integer.
- Concept of “strongly data typed.”
- Objectives for choosing an appropriate data type:
  - Minimize storage space.
  - Represent all possible values.
  - Improve data integrity.
  - Support all data manipulations.

11 Choosing an appropriate primary key
- General rules:
  - Must be a unique value for each row in the table.
  - Cannot be null.
  - Should be static over the life of the row.
- Physical primary key design heuristics:
  - Should be a single attribute.
  - Should be numeric.
  - Should not be “intelligent.”
  - Should be able to be an “enterprise key.”
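
One way to read these heuristics is as an argument for a surrogate key. The sketch below, written with Python's sqlite3 module, gives an employee table a single numeric key that carries no meaning and keeps the SSN as an ordinary unique column. The column name emp_id and the sample rows are assumptions for illustration; the slides' own employee table uses SSN as the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,      -- single, numeric, non-"intelligent" key
        ssn    CHAR(9) NOT NULL UNIQUE,  -- natural identifier kept as a unique attribute
        name   VARCHAR(30) NOT NULL
    )
""")
first_id = conn.execute(
    "INSERT INTO employee (ssn, name) VALUES ('123456789', 'Pat Example')"
).lastrowid
second_id = conn.execute(
    "INSERT INTO employee (ssn, name) VALUES ('987654321', 'Lee Example')"
).lastrowid

# Descriptive attributes can change without touching the key, so the key stays static.
conn.execute("UPDATE employee SET name = 'Pat Sample' WHERE emp_id = ?", (first_id,))
ids = [row[0] for row in conn.execute("SELECT emp_id FROM employee ORDER BY emp_id")]
print(ids)
```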

12 Overview of Database Performance
- Key metrics for database performance:
  - Minimize response time to access data in a database.
  - Minimize response time to change contents in a database.
- Most concerned with balancing disk access and memory capacity.

13 Input data relevant to performance
- Table profile
  - Number of tables
  - Number of rows in a table
  - Number of attributes in a table
- Application profile
  - Number of screens
  - Number of reports
  - Frequency of screen/reports
  - Number of intended joins
  - Types of queries
  - Expected response time

14 Improving performance
- By optimizing use of existing resources.
- With better or more resources.
- With indexes.
- With denormalization.
- With procedures to archive data.

Cluster files to better use memory and disk access time

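    -- Oracle-style cluster: rows from both tables that share a customer_id are stored together.
    CREATE CLUSTER ordering (clusterkey CHAR(6));

    CREATE TABLE tbl_customer (
        customer_id CHAR(6) NOT NULL,
        address     VARCHAR2(25)
    ) CLUSTER ordering (customer_id);

    CREATE TABLE tbl_order (
        order_id    CHAR(6) NOT NULL,
        customer_id CHAR(6) NOT NULL,
        order_date  DATE
    ) CLUSTER ordering (customer_id);

    -- In Oracle, an index on the cluster (CREATE INDEX ... ON CLUSTER ordering)
    -- must exist before rows can be inserted into the clustered tables.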

Add or change resources to improve performance.
- Will help a little: more processor power.
- Will help more: more memory.
- Will really help: faster, more efficient disk.
- RAID: Redundant arrays of inexpensive (or independent) disks.
  - A set of multiple physical disk drives that appear to the designer and user as a single storage unit.
  - Segments of data, called stripes, cut across all of the disk drives.
  - Access can occur concurrently.
  - Different types of RAID are available: RAID-0 through RAID-7, RAID-10, 53, 0+1.
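
The striping idea can be sketched in a few lines of Python. This is a simplified RAID-0 style layout that assumes one block per stripe unit and no redundancy; locate_block is a hypothetical helper, not part of any RAID tool.

```python
def locate_block(block_number, disk_count):
    """Map a logical block to (disk index, stripe index) under round-robin striping."""
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return block_number % disk_count, block_number // disk_count


# Eight consecutive logical blocks spread over four disks.
layout = [locate_block(block, 4) for block in range(8)]
print(layout)
```

Consecutive blocks land on different disks, which is why a large sequential read can be served by several drives at once.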

RAID Example

19 Improving performance with indexes
- Indexes are probably the single most important tool for improving the performance of a database.
- An index can be added to a database with a simple SQL command:
    CREATE INDEX index_name ON table_name (column_name);
- Understanding what happens when an index is created requires a basic understanding of indexing and file organization.
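
The effect of CREATE INDEX can be observed by asking the DBMS for its query plan before and after the index exists. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table contents and the index name idx_order_customer are assumptions for illustration, and other DBMSs expose plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_order ("
    "order_id INTEGER PRIMARY KEY, customer_id CHAR(6) NOT NULL, order_date DATE)"
)
conn.executemany(
    "INSERT INTO tbl_order (customer_id, order_date) VALUES (?, ?)",
    [(f"C{n % 100:05d}", "2006-04-18") for n in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT * FROM tbl_order WHERE customer_id = 'C00042'"
before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_order_customer ON tbl_order (customer_id)")
after = plan(query)   # the planner can now search the index instead
print(before)
print(after)
```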

20 File organization and access concepts
- File organization.
  - The physical arrangement of data in a file into records and pages on secondary storage.
  - File organization dictates the physical placement of records.
- File access methods.
  - The steps involved in retrieving records from a file.
  - File access methods dictate how data can be retrieved from secondary storage.
  - Options include:
    - Sequential access from beginning.
    - Sequential access from pre-defined point.
    - Backwards from end.
    - Backwards from pre-defined point.
    - Direct. (Not really direct: it has to go through a series of indices.)

21 General file organization options
- Sequential file organization.
  - Records are stored one after another.
  - Referred to as a “heap” or “pile.”
- Indexed file organization.
  - Records are stored, ordered or unordered, as in sequential organization.
  - An additional structure, the index, is built on pre-determined keys for the records.

22 What is an index?
- An additional physical file.
- An index is a sorted list of pointers stored along with the actual data.
- Benefit: indexes provide faster direct data access.
- Drawbacks:
  - Indexes slow down data updates.
  - Indexes require periodic reorganization.
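
The definition of an index as a sorted list of pointers can be made concrete with a small in-memory model. In this Python sketch a list stands in for the heap file and a list position stands in for a disk address; all names and sample records are assumptions for illustration.

```python
import bisect

# Heap file: records in arrival order; the list position plays the role of a disk address.
heap_file = [("C00300", "Ng"), ("C00100", "Ortiz"), ("C00200", "Adams")]

# Index: sorted (key, position) pairs kept separately from the data.
index = sorted((record[0], pos) for pos, record in enumerate(heap_file))
index_keys = [key for key, _ in index]


def sequential_lookup(key):
    """Scan the heap from the beginning until the key is found."""
    for record in heap_file:
        if record[0] == key:
            return record
    return None


def indexed_lookup(key):
    """Binary-search the index, then follow the pointer into the heap."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index) and index[i][0] == key:
        return heap_file[index[i][1]]
    return None


def insert(record):
    """Appending to the heap is cheap, but the index must be updated as well."""
    heap_file.append(record)
    i = bisect.bisect_left(index_keys, record[0])
    index_keys.insert(i, record[0])
    index.insert(i, (record[0], len(heap_file) - 1))


insert(("C00150", "Baker"))
print(indexed_lookup("C00150"), sequential_lookup("C00150"))
```

The insert function shows the drawback listed above: every change to the data also costs a change to the index.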

23 What types of indices are used?
- Indexes are frequently stored in a structure called a B+-tree.
- Other types of indices are:
  - Bitmap index. Identifies the value of a given column in a given row as being “true/on” or “false/off”.
  - Join index. Creates an index for multiple tables that are commonly joined together for pre-defined queries.
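
A bitmap index can be sketched with plain Python integers used as bit strings. The column values below reuse the f and c codes from the Emp_type attribute; the rest is an assumption for illustration and says nothing about how a particular DBMS stores its bitmaps.

```python
from collections import defaultdict

rows = ["f", "c", "f", "f", "c"]  # one Emp_type value per row

# One bitmap per distinct value; bit i is on when row i holds that value.
bitmaps = defaultdict(int)
for row_number, value in enumerate(rows):
    bitmaps[value] |= 1 << row_number


def matching_rows(bitmap):
    """Expand a bitmap back into the row numbers whose bit is on."""
    return [i for i in range(len(rows)) if (bitmap >> i) & 1]


faculty_rows = matching_rows(bitmaps["f"])
either_rows = matching_rows(bitmaps["f"] | bitmaps["c"])  # OR of two predicates is a bitwise OR
print(faculty_rows, either_rows)
```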

24 Clustered vs. non-clustered indices
- Clustered index.
  - Declaration means actual table data will be ordered by the clustered index.
  - Can only have one clustered index per table.
  - Greatly improves access time for tables frequently accessed by the clustered index.
  - Decreases update performance if data is volatile.
  - Not available on all DBMSs.
- Non-clustered index.
  - Usually the default indexing structure.
  - Does not change the order of the table data.
  - Functions as a “secondary” index.

25 Rules of thumb for applying indexes
- Use on larger tables.
- Use when a relatively small percentage of the table will be accessed.
- Index the primary key of each table.
- Index frequently used search attributes.
- Index attributes in SQL “ORDER BY” and “GROUP BY” clauses.
- Use indexes heavily for non-volatile databases; limit the use of indexes for volatile databases.
- Avoid indexing attributes that consist of long character strings.

26 Issues in indexing
- Indexes affect table maintenance performance.
  - Each time a row is added or deleted, the index must be updated along with the data.
  - Depending on the size of the database, these index updates can be extremely time-consuming.
  - Imagine the problems with having an index declared for every attribute.
- Solutions:
  - Remove indexes prior to batch updates.
  - Recreate indexes after the batch update is finished.
  - Consider using a batch procedure to create indexes after a table has been updated, and before queries are run.
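
The drop-then-recreate approach to batch updates might look like the following. This is a minimal sketch with Python's sqlite3 module; the table, the index name idx_order_customer, and the helper batch_load are assumptions for illustration, and whether the approach pays off depends on the DBMS and the batch size.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_order (order_id INTEGER PRIMARY KEY, customer_id CHAR(6) NOT NULL)")
conn.execute("CREATE INDEX idx_order_customer ON tbl_order (customer_id)")


def batch_load(rows):
    """Drop the secondary index, load the batch, then rebuild the index once."""
    conn.execute("DROP INDEX IF EXISTS idx_order_customer")
    conn.executemany("INSERT INTO tbl_order (customer_id) VALUES (?)", rows)
    conn.execute("CREATE INDEX idx_order_customer ON tbl_order (customer_id)")
    conn.commit()


batch_load([(f"C{n:05d}",) for n in range(5000)])
row_count = conn.execute("SELECT COUNT(*) FROM tbl_order").fetchone()[0]
index_names = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")]
print(row_count, index_names)
```

The index is maintained once, at rebuild time, instead of once per inserted row.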

27 Improving performance with denormalization
- Modify the degree of normalization.
- Recognize that joins require much time when used in queries.
  - More joins = more time.
- Combine entities with 1:1 relationship into a single entity.
- Combine entities with 1:m relationship into a single entity.
  - Usually done with brief repeating groups.

28 Example for denormalization
- Example: A patient can have up to 4 insurance companies.
  - Patient is a strong entity.
  - Insurance company is a strong entity.
- Normally, the repeating group of insurance companies would be in a separate intersection entity relating a patient to one or more insurance companies.
- Diagram on next page

Insurance example - Denormalized
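
The diagrams are not part of this transcript, so the two designs are sketched here with Python's sqlite3 module instead. Every table name, column name, and sample row is an assumption; only the rule "a patient can have up to 4 insurance companies" comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: an intersection table relates patients to insurance companies.
CREATE TABLE insurer (insurer_id INTEGER PRIMARY KEY, name VARCHAR(30) NOT NULL);
CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name VARCHAR(30) NOT NULL);
CREATE TABLE patient_insurer (
    patient_id INTEGER NOT NULL REFERENCES patient (patient_id),
    insurer_id INTEGER NOT NULL REFERENCES insurer (insurer_id),
    PRIMARY KEY (patient_id, insurer_id)
);
-- Denormalized: the brief repeating group (at most 4) is folded into the patient row.
CREATE TABLE patient_denorm (
    patient_id INTEGER PRIMARY KEY,
    name       VARCHAR(30) NOT NULL,
    insurer1   VARCHAR(30),
    insurer2   VARCHAR(30),
    insurer3   VARCHAR(30),
    insurer4   VARCHAR(30)
);
INSERT INTO insurer VALUES (1, 'Acme Health'), (2, 'Zenith Mutual');
INSERT INTO patient VALUES (10, 'Pat Example');
INSERT INTO patient_insurer VALUES (10, 1), (10, 2);
INSERT INTO patient_denorm VALUES (10, 'Pat Example', 'Acme Health', 'Zenith Mutual', NULL, NULL);
""")

# Normalized read: two joins.
normalized = [r[0] for r in conn.execute("""
    SELECT i.name
    FROM patient p
    JOIN patient_insurer pi ON pi.patient_id = p.patient_id
    JOIN insurer i ON i.insurer_id = pi.insurer_id
    WHERE p.patient_id = 10
    ORDER BY i.name""")]

# Denormalized read: a single-row lookup with no join.
row = conn.execute(
    "SELECT insurer1, insurer2, insurer3, insurer4 FROM patient_denorm WHERE patient_id = 10"
).fetchone()
denormalized = [name for name in row if name is not None]
print(normalized, denormalized)
```

Both reads return the same insurers, but the denormalized table repeats each insurer's name in every patient row that uses it, which is the redundancy risk that comes with this design.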

31 Issues in denormalization
- Can be risky.
  - Introduces potential for data redundancy.
  - Can result in data anomalies.
- Should be documented. This documentation must be maintained as an “audit path” to the actual implementation of the database.
  - Logical data model details fully normalized database with an ERD.
  - Physical data model will show denormalized database with an ERD.
  - Include in the documentation the reasons for denormalization.

32 Improving performance with derived data
- Derived or calculated data is usually not included in a database.
  - It is never included in a logical data model.
- Examples of derived data include: extended price, total amount, total pay, etc.
- Problems with including derived data in a database:
  - What happens when the underlying data is changed?
  - How do you ensure that the derived data will also be changed?
- For example, suppose the total of an order is kept in the database. What happens when an item quantity changes, or an item price changes? The order total, if stored, must also be changed to reflect those changes in the underlying data.

33 When to include derived data
- Sometimes it is a good idea to include derived data in the physical database design:
  - Use when aggregate values are regularly retrieved.
  - Use when aggregate values are costly to calculate.
  - Permit updating only of source data.
  - Do not put derived rows in the same table as the source data.
- Examples of derived data frequently stored in databases:
  - Student class standing.
  - Order and invoice total.
  - Credit card balance.
  - Checking account balance.
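
One way to keep a stored order total in step with its source rows is to let triggers recompute it, so that applications update only the source data. The sketch below uses Python's sqlite3 module. The table names, trigger names, and integer cent amounts are assumptions for illustration, and the triggers do not handle a line item being moved to a different order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_item (
    order_id    INTEGER NOT NULL,
    item_no     INTEGER NOT NULL,
    quantity    INTEGER NOT NULL,
    price_cents INTEGER NOT NULL,
    PRIMARY KEY (order_id, item_no)
);
-- The derived total lives in its own table, separate from the source rows.
CREATE TABLE order_total (order_id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL);

CREATE TRIGGER order_item_ai AFTER INSERT ON order_item BEGIN
    INSERT OR REPLACE INTO order_total (order_id, total_cents)
    SELECT NEW.order_id, COALESCE(SUM(quantity * price_cents), 0)
    FROM order_item WHERE order_id = NEW.order_id;
END;
CREATE TRIGGER order_item_au AFTER UPDATE OF quantity, price_cents ON order_item BEGIN
    INSERT OR REPLACE INTO order_total (order_id, total_cents)
    SELECT NEW.order_id, COALESCE(SUM(quantity * price_cents), 0)
    FROM order_item WHERE order_id = NEW.order_id;
END;
CREATE TRIGGER order_item_ad AFTER DELETE ON order_item BEGIN
    INSERT OR REPLACE INTO order_total (order_id, total_cents)
    SELECT OLD.order_id, COALESCE(SUM(quantity * price_cents), 0)
    FROM order_item WHERE order_id = OLD.order_id;
END;
""")


def stored_total(order_id):
    """Read the derived total without recalculating it from the line items."""
    return conn.execute(
        "SELECT total_cents FROM order_total WHERE order_id = ?", (order_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO order_item VALUES (1, 1, 2, 500)")  # 2 x 5.00
conn.execute("INSERT INTO order_item VALUES (1, 2, 1, 250)")  # 1 x 2.50
total_after_insert = stored_total(1)
conn.execute("UPDATE order_item SET quantity = 3 WHERE order_id = 1 AND item_no = 1")
total_after_update = stored_total(1)
conn.execute("DELETE FROM order_item WHERE order_id = 1 AND item_no = 2")
total_after_delete = stored_total(1)
print(total_after_insert, total_after_update, total_after_delete)
```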

34 Organization must manage data resources
- Types of data used by an organization:
  - Current transaction data.
  - Historical data for decision making.
  - Audit data for accounting and/or governmental regulations.
- Data differentiation: external vs. internal.
- All must be designed, implemented and maintained.
- Must have procedures for extracting, transforming and loading (ETL) data as necessary.

35 Archive data for audit purposes
- Not all data must be stored on a directly accessible data storage device (disk).
- Examples of archived data:
  - Checking transactions.
  - Tax data.
  - Accounting audit trail.
- Can store data on tape or other cheaper, less accessible media.
- Must have procedures for extracting, transforming and loading (ETL) data as necessary.
- Archive database design is usually a copy of the transaction database design.
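
An archiving procedure can be as simple as moving old rows into a table with the same design. The sketch below uses Python's sqlite3 module and keeps both tables in one in-memory database for brevity, whereas the slides envisage the archive on cheaper media. The table names, the helper archive_before, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE txn (
    txn_id INTEGER PRIMARY KEY, txn_date DATE NOT NULL, amount_cents INTEGER NOT NULL
);
-- The archive table copies the design of the transaction table.
CREATE TABLE txn_archive (
    txn_id INTEGER PRIMARY KEY, txn_date DATE NOT NULL, amount_cents INTEGER NOT NULL
);
INSERT INTO txn VALUES (1, '2004-03-01', 1000), (2, '2005-11-15', 2500), (3, '2006-04-18', 4000);
""")


def archive_before(cutoff):
    """Move rows dated before the cutoff into the archive as one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute("INSERT INTO txn_archive SELECT * FROM txn WHERE txn_date < ?", (cutoff,))
        conn.execute("DELETE FROM txn WHERE txn_date < ?", (cutoff,))


archive_before("2006-01-01")  # ISO-format dates compare correctly as text
live = [r[0] for r in conn.execute("SELECT txn_id FROM txn ORDER BY txn_id")]
archived = [r[0] for r in conn.execute("SELECT txn_id FROM txn_archive ORDER BY txn_id")]
print(live, archived)
```

Running the copy and the delete in a single transaction means a failure part-way through leaves neither a duplicate nor a lost row.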

36 Use a data warehouse
- A data warehouse differs from a transaction database.
  - Used to support decision making.
  - Contains aggregated data.
  - Is frequently denormalized to improve performance.
  - Contains data in a format specific to answering queries.
- A data warehouse is separate from the transaction database.
  - A data warehouse is built from data stored in the transaction database.
  - Different design.
- May use a data warehouse and a transaction database concurrently to answer queries.