Data normalization. Integrity and Robustness.


Presentation on theme: "Data normalization. Integrity and Robustness."— Presentation transcript:

1 Data normalization. Integrity and Robustness.
Creating Databases. Data normalization. Integrity and Robustness. Work session. Homework: Prepare a short presentation on enhancement projects. Continue working on implementation.

2 What is normalization? Data analysis is a process that prepares a data model for implementation as a simple, non-redundant, flexible, and adaptable database. The specific technique is called normalization. Normalization is a data analysis technique that organizes data attributes such that they are grouped to form non-redundant, stable, flexible, and adaptive entities. It assumes that you already have the reports and input forms that capture all the information. This and the following charts are from an information systems analysis course. Don't be overcome by the language. This is to show you that there is a process: it isn't guesswork.

3 Goals of normalization
Have well-defined tables: at most one value for each field. Store each item of information in exactly one place, so if/when it changes, you only have to change it in one place. Relationships are valid: foreign keys refer to existing primary keys. Don't store items that can be calculated, so that making changes is simplified.
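
The goal that foreign keys refer to existing primary keys can be checked by the database itself. The following is a minimal sketch, assuming Python's built-in sqlite3 module and two hypothetical tables, member and member_order, with names loosely based on the example model later in these slides. SQLite enforces foreign keys only when the foreign_keys pragma is turned on.

```python
import sqlite3


def try_insert_order(conn, order_number, member_number):
    """Return True if the order row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO member_order (order_number, member_number) VALUES (?, ?)",
                (order_number, member_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
# Hypothetical tables for illustration only.
conn.execute("CREATE TABLE member (member_number INTEGER PRIMARY KEY, member_name TEXT)")
conn.execute(
    "CREATE TABLE member_order ("
    " order_number INTEGER PRIMARY KEY,"
    " member_number INTEGER NOT NULL REFERENCES member(member_number))"
)
conn.execute("INSERT INTO member VALUES (1, 'Pat')")
conn.commit()

accepted = try_insert_order(conn, 100, 1)   # member 1 exists
rejected = try_insert_order(conn, 101, 99)  # member 99 does not exist
```

Here the second insert is refused because no member row has the number 99, so an order can never point at a member that is not in the database.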

4 Process of defining database
Generally starts with the desired end products (sometimes called artifacts): reports and forms. These may be from the original, possibly even non-automated, version of the application, or from a combination of applications. The goal is to produce a single database that serves multiple uses.

5 But… Problems are common.
Example: an address is changed in one place, but the old information persists somewhere else. Your experience?

6 Normalization process
The first step is to do what is necessary to get each entity into 1st normal form. An entity is in first normal form (1NF) if there are no attributes that can have more than one value for a single instance of the entity. Any attributes that can have multiple values actually describe a separate entity, possibly an entity and a relationship. The common situation is so-called multiple values, such as distinct items in an order (or distinct beneficiaries, or game machines). The action is to create a new entity. In the typical product order case, there can be multiple items ordered in each order, so these are extracted.
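
The extraction step can be sketched in a few lines. This is only an illustration, assuming Python and a hypothetical in-memory record layout in which each order carries a list of (product number, quantity) pairs. The function name and field names are made up for this example.

```python
def to_first_normal_form(orders):
    """Split each order's multi-valued 'items' attribute into its own rows.

    Returns (order_rows, ordered_item_rows). Each ordered item row holds
    exactly one product for one order, so no field has more than one value.
    """
    order_rows = []
    ordered_item_rows = []
    for order in orders:
        order_rows.append(
            {"order_number": order["order_number"], "member_number": order["member_number"]}
        )
        for product_number, quantity in order["items"]:
            ordered_item_rows.append(
                {
                    "order_number": order["order_number"],
                    "product_number": product_number,
                    "quantity_ordered": quantity,
                }
            )
    return order_rows, ordered_item_rows


# Hypothetical unnormalized data: the 'items' field holds several values.
unnormalized = [
    {"order_number": 100, "member_number": 1, "items": [(7, 2), (9, 1)]},
    {"order_number": 101, "member_number": 2, "items": [(7, 1)]},
]
order_rows, ordered_item_rows = to_first_normal_form(unnormalized)
```

The two orders stay as two order rows, and the three ordered items become three rows of a new entity.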

7 Modifying model to 1st NF
The primary key for the new table is a so-called concatenated key. Here it is the promotion and the title. So notice that the primary key for the title promotion records consists of two foreign keys, that is, keys pointing to other tables. Diagram callouts: Many items (titles). Associative entity: use the combination of keys for the new (concatenated) key.
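
A concatenated key can be declared directly in a table definition. The sketch below assumes Python's built-in sqlite3 module and hypothetical promotion, title, and title_promotion tables. The column names are illustrative and are not taken from the original slides' schema.

```python
import sqlite3


def add_title_to_promotion(conn, promotion_number, product_number):
    """Return True if the pair was stored, False if a key constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO title_promotion VALUES (?, ?)",
                (promotion_number, product_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE promotion (promotion_number INTEGER PRIMARY KEY);
    CREATE TABLE title (product_number INTEGER PRIMARY KEY, title_of_work TEXT);
    -- Associative entity: the primary key is the combination of two foreign keys.
    CREATE TABLE title_promotion (
        promotion_number INTEGER NOT NULL REFERENCES promotion(promotion_number),
        product_number   INTEGER NOT NULL REFERENCES title(product_number),
        PRIMARY KEY (promotion_number, product_number)
    );
    INSERT INTO promotion VALUES (1);
    INSERT INTO title VALUES (7, 'Example Title A');
    INSERT INTO title VALUES (9, 'Example Title B');
    """
)

first = add_title_to_promotion(conn, 1, 7)      # new pair
second = add_title_to_promotion(conn, 1, 9)     # same promotion, different title
duplicate = add_title_to_promotion(conn, 1, 7)  # same pair again
```

One promotion can feature many titles, but the same (promotion, title) pair cannot be stored twice.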

8 Moving to 2nd NF If you do not have any concatenated keys, no work is needed. Model is already in 2nd NF. If you do have any concatenated (combination) keys, you need to examine these entities. An entity is in second normal form (2NF) if it is already in 1NF and if the values of all nonprimary key attributes are dependent on the full primary key—not just part of it. Any nonkey attributes that are dependent on only part of the primary key should be moved to any entity where that partial key is actually the full key. This may require creating a new entity and relationship on the model. The process of getting something in 1NF may have carried over data that appears somewhere else. You don't want data twice. (You may choose to make whole databases redundant, for efficiency of access or for backup, but that is done separately.)

9 Moving to 2nd NF
Some attributes relate to the product itself, not to the fact that the product is part of this order. Remove these attributes. To say the same thing in a slightly different way: you don't need this product-only data in the ordered product record.

10 So… In my store example, the ordered items records would NOT have information just relating to the order or just relating to the product.
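
As a rough illustration of the move to 2NF, the sketch below assumes Python and hypothetical ordered item rows keyed by (order_number, product_number). The product_name field depends only on product_number, which is part of the key, so it is moved to a separate product table.

```python
def to_second_normal_form(ordered_items):
    """Move attributes that depend only on product_number out of the ordered item rows.

    Returns (product_rows, slim_item_rows).
    """
    products = {}
    slim_item_rows = []
    for row in ordered_items:
        # product_name depends on product_number alone, so store it once per product.
        products[row["product_number"]] = {
            "product_number": row["product_number"],
            "product_name": row["product_name"],
        }
        # quantity_ordered depends on the full key (order_number, product_number).
        slim_item_rows.append(
            {
                "order_number": row["order_number"],
                "product_number": row["product_number"],
                "quantity_ordered": row["quantity_ordered"],
            }
        )
    return list(products.values()), slim_item_rows


# Hypothetical 1NF rows: the product name is repeated for every order of product 7.
ordered_items = [
    {"order_number": 100, "product_number": 7, "product_name": "Widget", "quantity_ordered": 2},
    {"order_number": 100, "product_number": 9, "product_name": "Gadget", "quantity_ordered": 1},
    {"order_number": 101, "product_number": 7, "product_name": "Widget", "quantity_ordered": 1},
]
product_rows, slim_item_rows = to_second_normal_form(ordered_items)
```

After the split, renaming a product means changing one product row instead of every ordered item row that mentions it.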

11 Moving to 3rd NF Make sure that all non-primary attributes depend just on the key, not, for example, on another attribute. An entity is in third normal form (3NF) if it is already in 2NF and if the values of its nonprimary key attributes are not dependent on any other non-primary key attributes. Any nonkey attributes that are dependent on other nonkey attributes must be moved or deleted. Again, new entities and relationships may have to be added to the data model. Typical example is something that can be calculated.
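
For the calculated-value case, one option is to drop the stored value and compute it when needed. The sketch below assumes Python and hypothetical ordered item rows. The function names and the sample prices are made up for illustration.

```python
from decimal import Decimal


def extended_price(row):
    """Price for one ordered item: derived from two stored attributes, so not stored itself."""
    return row["quantity_ordered"] * row["purchase_unit_price"]


def order_subtotal(rows, order_number):
    """Sum the extended prices of all items belonging to one order."""
    return sum(
        (extended_price(r) for r in rows if r["order_number"] == order_number),
        Decimal("0"),
    )


# Hypothetical rows: no extended price or subtotal column is stored.
ordered_items = [
    {"order_number": 100, "product_number": 7, "quantity_ordered": 2, "purchase_unit_price": Decimal("9.99")},
    {"order_number": 100, "product_number": 9, "quantity_ordered": 1, "purchase_unit_price": Decimal("5.99")},
    {"order_number": 101, "product_number": 7, "quantity_ordered": 1, "purchase_unit_price": Decimal("9.99")},
]
subtotal_100 = order_subtotal(ordered_items, 100)
```

Because the subtotal is computed from the quantity and unit price, changing a quantity cannot leave a stale total behind.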

12 Example of move to 3rd NF First of two examples

13 Example of move to 3rd NF Second of two examples.

14 The whole ER diagram for this particular example.
[Figure: ER diagram. Label: 3NF Member Services (Entity Relation Subject Area) SA/2001 Tue May 02, :41 Comment Sandra Shepherd. Entities and attributes recovered from the diagram are listed below.]
MEMBER: Member-Number [PK1], Member-Name, Member-Status, Member-Street-Address, Member-Post-Office-Box, Member-City, Member-State, Member-Zip-Code, Member-Daytime-Phone-Number, Member-Date-of-Last-Order, Member-Balance-Due, Member-Credit-Card-Type, Member-Credit-Card-Number, Member-Credit-Card-Expire-Date, Member-Bonus-Balance-Available, Audio-Category-Preference, Audio-Media-Preference, Date-Enrolled, Expiration-Date, Game-Category-Preference, Game-Media-Preference, Number-of-Credits-Earned, Video-Category-Preference, Video-Media-Preference, Agreement-Number [FK], Privacy-Code, -Address
MEMBER ORDER: Order-Number [PK1], Order-Creation-Date, Order-Fill-Date, Shipping-Address-Name, Shipping-Street-Address, Shipping-City, Shipping-State, Shipping-Zip, Shipping-Instructions, Order-Sub-Total, Order-Sales-Tax, Order-Shipping-Method, Order-Shipping-&-Handling-Cost, Order-Status, Order-Prepaid-Amount, Order-Prepayment-Method, Promotion-Number [FK], Member-Number [FK]
MEMBER ORDERED PRODUCT: Order-Number [PK1] [FK], Product-Number [PK2] [FK], Quantity-Ordered, Quantity-Shipped, Quantity-Backordered, Purchase-Unit-Price, Credits-Earned
PRODUCT: Product-Number [PK1], Universal-Product-Code (Alternate Key), Quantity-in-Stock, Product-Type, Suggested-Retail-Price, Default-Unit-Price, Current-Special-Unit-Price, Current-Month-Units-Sold, Current-Year-Units-Sold, Total-Lifetime-Units-Sold
MERCHANDISE: Merchandise-Name, Merchandise-Description, Merchandise-Type, Unit-of-Measure
TITLE: Title-of-Work, Title-Cover, Catalog-Description, Copyright-Date, Entertainment-Category, Credit-Value
VIDEO TITLE: Product-Number [PK1] [FK], Producer, Director, Video-Category, Video-Sub-Category, Closed-Captioned, Language, Running-Time, Video-Media-Type, Video-Encoding, Screen-Aspect, MPA-Rating-Code
AUDIO TITLE: Artist, Audio-Category, Audio-Sub-Category, Number-of-Units-in-Package, Audio-Media-Code, Content-Advisory-Code
GAME TITLE: Manufacturer, Game-Category, Game-Sub-Category, Game-Platform, Game-Media-Type, Number-of-Players, Parent-Advisory-Code
TRANSACTION: Transaction-Reference-Number [PK1], Transaction-Date, Transaction-Type, Transaction-Description, Transaction-Amount, Order-Number [FK]
AGREEMENT: Agreement-Number [PK1], Agreement-Expire-Date, Agreement-Active-Date, Fulfillment-Period, Required-Number-of-Credits
PROMOTION: Promotion-Number [PK1], Promotion-Release-Date, Promotion-Status, Promotion-Type
TITLE PROMOTION: Promotion-Number [PK2] [FK]
Relationship labels: places, binds, features, is featured as, is a, has conducted, responds to, generates, sold as, sells

15 Normalization … is a process. It is [somewhat] mechanical. There is a chance that your model may already be in 1st, 2nd, or even 3rd normal form without any action, or without much action, on your part, but it can be good to go through the process. Note: Some may argue for certain redundancies, for example, storing a calculated value. Why or why not? Answers to the question?

16 About assignment You can specify conditions that simplify your application. For example, for a course catalog application, allow 0 or 1 pre-req, not an unlimited number. Upload files using Filezilla. I will demonstrate file upload code and related issues. Do some checking of user input. I will expect more for your final project.
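
Checking user input for the simplified course catalog case might look like the sketch below. It assumes Python and a hypothetical form with course_id, title, and prereq fields. These names are not from the assignment, and the actual course code may use a different language.

```python
def validate_course(form, existing_course_ids):
    """Return a list of error messages; an empty list means the input passed the checks.

    The form allows 0 or 1 pre-req: an empty prereq field means the course has none.
    """
    errors = []
    course_id = form.get("course_id", "").strip()
    title = form.get("title", "").strip()
    prereq = form.get("prereq", "").strip()

    if not course_id:
        errors.append("course_id is required")
    if not title:
        errors.append("title is required")
    if prereq and prereq not in existing_course_ids:
        errors.append("prereq must be an existing course")
    if prereq and prereq == course_id:
        errors.append("a course cannot be its own prereq")
    return errors


existing = {"MAT1500", "NME1450"}  # hypothetical course ids already in the catalog
ok_errors = validate_course({"course_id": "MAT3530", "title": "Creating Databases", "prereq": "MAT1500"}, existing)
bad_errors = validate_course({"course_id": "", "title": "Untitled", "prereq": "XYZ9999"}, existing)
```

Checking that the pre-req names an existing course is the form-level counterpart of the rule that foreign keys refer to existing primary keys.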

17 Classwork/Homework Prepare a short presentation on your enhancement project, including Entity-Relationship diagrams (which may not have changed from mine) and a Data Flow Diagram (which probably did change). Work on enhancement projects.

