1 CSIS 254 Oracle Normalization

2 Relational Databases (Review)
In relational databases, all data is stored in tables, which correspond roughly to entities
Each table is two-dimensional, consisting of rows and columns
Each row in a table, called a tuple, corresponds to an occurrence of the entity
Columns in each table contain similar data across all rows in the table

3 Relational Database Example
The following table is an example of a relational table describing classes that students have taken at a mythical college used in the rest of this lesson

Student Id  Student Name  Course Id  Course Name   Grade  Term    Teacher
0194327     Joe Adams     CSIS-840   VB Concepts   C      Spr-02  Wilkins
0194327     Joe Adams     CSIS-824   Intro to C++  B      Fal-02  Smythe
1850243     Joe Adams     CSIS-740   Oracle Admin  A      Spr-03  Wallace
1850243     Jane Smith    CSIS-941   Systems Des.  B      Fal-02  Evans
1850243     Jane Smith    CSIS-840   VB Concepts   B      Spr-02  Wolkins
8502432     Ida Know      CSIS-184   Networks      A      Sum-03  Farmer
7402943     Eunice Eye    CSIS-824   PowerPoint    W      Spr-02  Simpson

4 Relational Database Example
Each row (or tuple) in the table describes a Class taken by a Student in a term at our college
The data in each column is consistent throughout the table
However, there are three inconsistencies in the table itself. Can you find them?

5 Primary Keys (Review)
Each row in a table has a primary key, which is the column or set of columns identified to our DBMS that uniquely distinguishes it from every other row in the table
No attribute value in a primary key can be NULL
A table can have only one primary key
If a primary key is not specified, Oracle does not supply one; every row still has a ROWID pseudocolumn, but that is a physical locator rather than a primary key
What would be the primary key for our sample database?

6 Foreign Keys (Review)
An attribute (or group of attributes) in a table can also be a foreign key, meaning that it references the primary key (or at least a unique attribute) of another table
An example would be a Customer Id attribute on an invoice header, which would reference the customer account information for that invoice
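The invoice example can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Oracle, and the table and column names (customer, invoice_header, customer_id, and so on) are made up for illustration. It shows a primary key rejecting a duplicate row and a foreign key rejecting an invoice whose customer does not exist.

```python
import sqlite3


def build_invoice_db():
    """Create an in-memory database with a parent and a child table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("""
        CREATE TABLE customer (
            customer_id   INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        )""")
    conn.execute("""
        CREATE TABLE invoice_header (
            invoice_id  INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
        )""")
    return conn


def try_insert(conn, sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_invoice_db()
ok_customer = try_insert(conn, "INSERT INTO customer VALUES (?, ?)", (1, "Acme"))
ok_invoice = try_insert(conn, "INSERT INTO invoice_header VALUES (?, ?)", (100, 1))
# Same customer_id again: the primary key refuses it.
dup_key = try_insert(conn, "INSERT INTO customer VALUES (?, ?)", (1, "Other"))
# Customer 99 does not exist: the foreign key refuses it.
orphan = try_insert(conn, "INSERT INTO invoice_header VALUES (?, ?)", (101, 99))
```

The same constraints would be declared in Oracle with PRIMARY KEY and REFERENCES clauses; only the connection handling and the PRAGMA line are specific to SQLite.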

7 Normalization
Let’s begin our discussion of normalization by using an example: we want to expand the sample relational table for our mythical college by tracking data for:
  –students
  –courses
  –departments
  –teachers
  –classes (courses offered during a term)
  –teachers assigned to each class
  –students enrolled in each class

8 Database Normalization Example
We might start off with an entity for each Class that looks something like this:

CLASS
  Course Id
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Name
  Student #1 Data
  Student #2 Data
  …
  Student #30 Data

9 Database Normalization Example
The information stored for each student would be:

CLASS (exploded)
  …
  Student #1 Data
    Id
    Full Name
    e-mail Addresses
    Grade for Class
    GPA
  Student #2 Data
    Id
    Full Name
    e-mail Addresses
    Grade for Class
    GPA

What problems can you see with this scheme?

10 Problems with Our Example
We can’t have more than 30 students in a class
There’s lots of duplicate information in our tables
  –This design would require many updates whenever a change was made to data about a department, a teacher, a student, etc.
Does it make sense for us to have to know, for example, a course number in order to look up a teacher’s name?

11 Problems with Our Example (continued)
Removing a class entity occurrence might remove valuable information from our database
We don’t have any data verification checks
  –We might wind up with inconsistent data across two or more records (is this necessarily bad if we are trying to take snapshots?)

12 Normalization Goal #1
Remove redundant data
Duplicated data wastes disk space
Duplicated data may not necessarily be consistent, that is, stored in exactly the same way
Redundant data creates problems for our coders
  –Duplicated data has to be stored (and changed) in exactly the same way in every location, which is not only time consuming for the system’s programmers, but also takes computer resources once the system is implemented

13 Normalization Goal #2
Remove dependency issues
It is not intuitive for a user of our new system to look in the CLASS entity to find, for example, a student’s e-mail address.
It would probably make more sense to move this information into a separate entity (i.e., a database table that defines students).

14 Normalization: The Bottom Line
“In summary, normal forms insure that we do not compromise the integrity of our data by either creating false data or destroying true data.” (Ensor & Stevenson)

15 Forms of Normalization
To accomplish these goals, we have created a set of rules which define normal forms or levels.
There are five normal forms, each progressively more restrictive, which are called first normal form (1NF), second normal form (2NF), …
Most database designers only consider the first three forms in their work, as we will
As we shall see, there might be good reasons to deviate from these normal forms

16 First Normal Form (1NF)
A database is in first normal form (1NF) if each attribute of the database is simple, single-valued (atomic), and does not repeat
  –Let’s assume column definitions are consistent across rows
Method:
  –Reduce all attributes into atomic components
  –Eliminate duplicative columns (repeating groups) and multi-valued attributes from the same table
  –Create a separate table for each group of related data
  –Identify each row with a unique column or set of columns (a primary key)

17 Our Sample Database
Here’s what our database entity for classes at our college currently looks like:

CLASS
  Course Id
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Name
  Student #1 Data
  Student #2 Data
  …
  Student #30 Data

18 Our Sample Database in 1NF
We should divide the Course Id into a Department Id and Course Number (e.g., Course Id “CSIS-254” would be divided into Department Id “CSIS”, Course Number “254”)
(Won’t this make the Department Name redundant?)

CLASS
  Department Id (added)
  Course Number (added)
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Last Name
  Student #1 Data
  Student #2 Data
  …
  Student #30 Data

19 Our Sample Database in 1NF
Next, break out Student Ids, Names, e-mail Addresses, and Grades into a separate entity, eliminating the repeating Student groups.

CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
  Student e-mail Addresses
  Student Grade for Class
  Student GPA

20 Our Sample Database in 1NF
We need to break down the Students’ Names into their simpler components

CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student e-mail Addresses
  Student Grade for Class
  Student GPA

21 Our Sample Database in 1NF
Finally, we need to break out Student e-mail Addresses into another entity, where each occurrence represents a single e-mail address

CLASS / STUDENT E-MAIL ADDRESS
  Department Id
  Course Number
  Term Offered
  Student Id
  Address Number or Id
  Student e-mail Address

22 Our Sample Database in 1NF

CLASS
  Department Id
  Course Number
  Term Offered
  Department Name
  Course Description
  Classroom (or “Internet”)
  Credits / Hours
  Teacher Id
  Teacher Last Name

CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student Grade for Class
  Student GPA

23 Our Sample Database in 1NF

CLASS / STUDENT
  Department Id
  Course Number
  Term Offered
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student Grade for Class
  Student GPA

CLASS / STUDENT E-MAIL ADDRESS
  Department Id
  Course Number
  Term Offered
  Student Id
  Address Number or Id
  Student e-mail Address
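The 1NF steps for the student data can be sketched in plain Python. The record layout below (a dictionary with a nested "students" list, each student holding a list of e-mail addresses) is a hypothetical stand-in for the original wide CLASS record, and the sample values are invented. The function turns the repeating student groups into one CLASS / STUDENT row per student and the multi-valued e-mail attribute into one row per address; names and GPA are left out to keep the sketch short.

```python
def to_first_normal_form(class_record):
    """Split one wide CLASS record into CLASS / STUDENT rows and e-mail rows."""
    key = (
        class_record["department_id"],
        class_record["course_number"],
        class_record["term_offered"],
    )
    class_student_rows = []
    email_rows = []
    for student in class_record["students"]:
        # One row per student replaces the Student #1 .. Student #30 groups.
        class_student_rows.append(key + (student["student_id"], student["grade"]))
        # One row per address replaces the multi-valued e-mail attribute.
        for address_number, address in enumerate(student["emails"], start=1):
            email_rows.append(key + (student["student_id"], address_number, address))
    return class_student_rows, email_rows


# Invented sample data for illustration only.
sample = {
    "department_id": "CSIS",
    "course_number": "254",
    "term_offered": "Spr-03",
    "students": [
        {"student_id": "S1", "grade": "A",
         "emails": ["s1@example.edu", "s1@example.com"]},
        {"student_id": "S2", "grade": "B", "emails": ["s2@example.edu"]},
    ],
}

class_student_rows, email_rows = to_first_normal_form(sample)
```

Because each student is now a row rather than a numbered column group, nothing in the row layout limits a class to 30 students.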

24 1NF Advantages
Removes limits artificially introduced into a database design by using repeating groups
Ensures that attributes are broken into their most basic units and are not multi-valued

25 Exercise
Put the following table in 1NF, then draw an ERD for your new system

FAVORITE TV SHOWS
  TV Show Name
  Category
  Main Star Name #1
  Main Star Name #2
  Main Star Name #3
  Day and Time Shown
  Network / Channel
  My Rating (1-10)

26 One Possible Answer

FAVORITE TV SHOWS
  TV Show Name
  Category
  My Rating (1-10)

SHOW / STARS
  TV Show Name
  Star Number
  Star Name

SHOW TIMES
  TV Show Name
  Slot Number
  Date and Time
  Network / Channel

27 Second Normal Form (2NF)
2NF implies 1NF by definition
All non-key attributes must be fully dependent on every key attribute in the primary key
  –In other words, a non-key attribute cannot depend on only part of the primary key
  –This restriction applies only to tables with composite keys
2NF reduces redundant data in a table by extracting it, placing it in new table(s), then creating relationships between those tables.

28 Second Normal Form (2NF)
Method:
  –Remove subsets of data that appear in multiple rows of a table, and place into separate tables
  –Create relationships between these new tables and their predecessors through the use of foreign keys.
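One way to spot candidates for removal is to check sample rows for attributes that are determined by only part of a composite key. The sketch below is an illustration under assumptions: the rows, the department name "Computer Studies", the teacher ids, and the helper names are all invented. Sample data can only refute a dependency, never prove it, so any hit is a hint to be confirmed against the business rules.

```python
from itertools import combinations


def determines(rows, determinant, dependent):
    """True if equal `determinant` values always carry the same `dependent` value."""
    seen = {}
    for row in rows:
        lhs = tuple(row[column] for column in determinant)
        rhs = row[dependent]
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (attribute, key_subset) pairs where a proper subset of the key suffices."""
    found = []
    for attribute in non_key:
        for size in range(1, len(key)):
            hits = [subset for subset in combinations(key, size)
                    if determines(rows, subset, attribute)]
            if hits:
                # Report only the smallest subsets that work for this attribute.
                found.extend((attribute, subset) for subset in hits)
                break
    return found


columns = ("department_id", "course_number", "term_offered",
           "department_name", "course_description", "teacher_id")
data = [
    ("CSIS", "254", "Spr-03", "Computer Studies", "Oracle", "T1"),
    ("CSIS", "254", "Fal-03", "Computer Studies", "Oracle", "T2"),
    ("CSIS", "840", "Spr-03", "Computer Studies", "VB Concepts", "T4"),
    ("MATH", "254", "Spr-03", "Mathematics", "Calculus", "T3"),
]
rows = [dict(zip(columns, values)) for values in data]

violations = partial_dependencies(
    rows,
    key=("department_id", "course_number", "term_offered"),
    non_key=("department_name", "course_description", "teacher_id"),
)
```

In this sample, the department name is determined by the Department Id alone and the course description by Department Id plus Course Number, so both are flagged; the teacher depends on the whole key and is not.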

29 Our Sample Database in 2NF
We can break out the Department Name from the CLASS entity, as it will be the same for each Class having the same Department

DEPARTMENT
  Department Id
  Department Name

30 Our Sample Database in 2NF
We also can break out the Course Description from this entity, as it also will be the same for each Class referencing the same Course

COURSE
  Department Id
  Course Number
  Course Description
  Credits / Hours

Note that we’ve kept the Department Id in this entity. Why?

31 Our Sample Database in 2NF
We can also break out the information about each Teacher, since it also will be the same for each Class that a Teacher conducts, irrespective of the Class

TEACHER
  Teacher Id
  Teacher Last Name

32 Our Sample Database in 2NF
Our new CLASS / STUDENT entity can also have its student-related attributes (names and GPA) broken out, that is, attributes that do not change with the class number

STUDENT
  Student Id
  Student Full Name
    First Name
    Middle Name
    Last Name
  Student GPA

33 Our Sample Database in 2NF
Student e-mail Addresses are not dependent upon Department Id, Course Number, or Term, so remove them from the e-mail entity

STUDENT E-MAIL ADDRESS
  Department Id (deleted)
  Course Number (deleted)
  Term Offered (deleted)
  Student Id
  Address Number or Id
  Student e-mail Address

34 Our Sample Database in 2NF
Our final CLASS / STUDENT entity, minus all of the attributes that have been moved to other entities, looks like:

CLASS / STUDENT
  Department Id
  Course Id
  Student Id
  Term
  Student Grade for Class

35 2NF and Foreign Keys
To ensure data integrity, we would implement four foreign keys in our CLASS and CLASS / STUDENT entities
  –Department Id must reference an occurrence in the DEPARTMENT entity
  –Course Id must reference a row in COURSE
  –Student Id must reference a row in STUDENT
  –Teacher Id must reference a row in TEACHER
Would we implement a similar restriction on our student e-mail address entity?

36 2NF Advantages
All advantages of 1NF
Common data is forced to be consistent, since it is stored in only one place in the database
We can store data about separate entities without implying the existence of others
  –In our original database design, we can’t store information about Students, Teachers, or Departments if we don’t have any classes in which they are involved.
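Both advantages can be seen in a small sketch. It again uses Python's sqlite3 module in place of Oracle, covers only the DEPARTMENT, COURSE, and CLASS entities, and uses invented department names and snake_case column names. A department is stored before it has any classes, and renaming a department is a single-row update that every class picks up through the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE department (
    department_id   TEXT PRIMARY KEY,
    department_name TEXT NOT NULL
);
CREATE TABLE course (
    department_id      TEXT NOT NULL REFERENCES department (department_id),
    course_number      TEXT NOT NULL,
    course_description TEXT,
    PRIMARY KEY (department_id, course_number)
);
CREATE TABLE class (
    department_id TEXT NOT NULL,
    course_number TEXT NOT NULL,
    term_offered  TEXT NOT NULL,
    PRIMARY KEY (department_id, course_number, term_offered),
    FOREIGN KEY (department_id, course_number)
        REFERENCES course (department_id, course_number)
);
""")

with conn:
    conn.execute("INSERT INTO department VALUES ('CSIS', 'Computer Studies')")
    # A department with no classes yet can still be recorded.
    conn.execute("INSERT INTO department VALUES ('MATH', 'Mathematics')")
    conn.execute("INSERT INTO course VALUES ('CSIS', '254', 'Oracle')")
    conn.executemany(
        "INSERT INTO class VALUES ('CSIS', '254', ?)",
        [("Spr-03",), ("Fal-03",)],
    )
    # The name lives in one row, so one update fixes it everywhere.
    conn.execute(
        "UPDATE department SET department_name = 'Computer Science' "
        "WHERE department_id = 'CSIS'"
    )

names = conn.execute("""
    SELECT DISTINCT d.department_name
    FROM class c JOIN department d ON d.department_id = c.department_id
""").fetchall()

departments_without_classes = conn.execute("""
    SELECT department_id FROM department
    WHERE department_id NOT IN (SELECT department_id FROM class)
""").fetchall()
```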

37 Exercise
Convert the following table into 2NF, and draw a new ERD

SALES ORDER
  Order Number
  Customer Account Number
  Customer Account Name
  Customer Address
  Date of Entry of Order
  Date of Requested Shipment
  Item Numbers
  Item Descriptions
  Quantities Ordered
  Unit Prices
  Extended Prices
  Total Order Price

38 Third Normal Form (3NF)
3NF implies 2NF (which implies 1NF)
A database is in third normal form (3NF) if the data in every column of each row (occurrence) in a table (entity) is dependent ONLY upon each column in the key
  –In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.
  –This means that derived attributes are not allowed in 3NF

39 Third Normal Form (3NF)
All attributes depend upon the key, the whole key, and nothing but the key
Method:
  –Remove all derived columns
  –Move all remaining columns not dependent on the key into a new table

40 Our Sample Database in 3NF
Our STUDENT entity cannot contain a GPA, since that is a derived attribute (the average of all of the Grades received)

STUDENT
  Student Id
  Student Names
    First Name
    Middle Name
    Last Name
  Student GPA (deleted)
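Once the GPA column is gone, the value is computed from the grades whenever it is needed. A minimal sketch, with several assumptions that are not in the slides: a 4-point scale for letter grades, a simple unweighted average (no credit hours), and a W grade that does not count toward the average. The rows reuse student ids and grades from the sample table earlier in the lesson.

```python
# Assumed grade scale; the lesson does not define one.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def gpa(class_student_rows, student_id):
    """Average the grade points for one student; None if nothing countable."""
    points = [
        GRADE_POINTS[grade]
        for row_student_id, grade in class_student_rows
        # Grades outside the scale (such as W) are skipped by assumption.
        if row_student_id == student_id and grade in GRADE_POINTS
    ]
    return sum(points) / len(points) if points else None


# (Student Id, Student Grade for Class) pairs taken from CLASS / STUDENT.
grades = [
    ("0194327", "C"),
    ("0194327", "B"),
    ("8502432", "A"),
    ("7402943", "W"),
]

joe_gpa = gpa(grades, "0194327")
ida_gpa = gpa(grades, "8502432")
eunice_gpa = gpa(grades, "7402943")
```

Because the figure is recalculated from the stored grades, it cannot drift out of step with them the way a stored GPA column could.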

41 Advantages of 3NF
All advantages of 1NF and 2NF
Information is stored in one and only one place in the database
All entities are now 2-dimensional, non-redundant, and can be implemented in relational tables

42 Disadvantages of Normalization
Proliferation of tables, resulting in increased system complexity
  –Can be overcome with views for end-users
Performance hits through added tables and lack of derived attributes
  –May be partially offset by reduced computing needs of maintaining data only once
We will discuss these in detail next week...
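A view can hide the extra tables from end-users. The sketch below assumes Python's sqlite3 module rather than Oracle, a cut-down version of the STUDENT, COURSE, and CLASS / STUDENT entities, and a hypothetical view name, transcript. The view joins the normalized tables back into the flat, one-row-per-class shape of the original sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id TEXT PRIMARY KEY,
    last_name  TEXT NOT NULL
);
CREATE TABLE course (
    department_id      TEXT NOT NULL,
    course_number      TEXT NOT NULL,
    course_description TEXT,
    PRIMARY KEY (department_id, course_number)
);
CREATE TABLE class_student (
    department_id TEXT NOT NULL,
    course_number TEXT NOT NULL,
    term_offered  TEXT NOT NULL,
    student_id    TEXT NOT NULL,
    grade         TEXT,
    PRIMARY KEY (department_id, course_number, term_offered, student_id)
);

-- End-users query this view instead of joining three tables themselves.
CREATE VIEW transcript AS
    SELECT s.student_id,
           s.last_name,
           cs.department_id || '-' || cs.course_number AS course_id,
           c.course_description,
           cs.term_offered,
           cs.grade
    FROM class_student cs
    JOIN student s ON s.student_id = cs.student_id
    JOIN course c  ON c.department_id = cs.department_id
                  AND c.course_number = cs.course_number;

INSERT INTO student VALUES ('8502432', 'Know');
INSERT INTO course VALUES ('CSIS', '184', 'Networks');
INSERT INTO class_student VALUES ('CSIS', '184', 'Sum-03', '8502432', 'A');
""")

transcript_rows = conn.execute("SELECT * FROM transcript").fetchall()
```

The view stores no data of its own, so it adds convenience without reintroducing redundancy; the join cost is still paid each time it is queried.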

43 Last Slide: Next Week’s Assignment
Draw a complete ERD for our normalized 3NF mythical college database. Does it make sense to you?
Normalize the two organizations / systems that you used in last week’s homework by updating their ERDs (Engineering Method only).
Introduce at least two derived attributes that you might include in your design, and explain why.
Prepare for a quiz next week on what we have covered so far in class:
  –Stages of SDLC, Entities, Attributes, Relationships, Diagramming, and Normalization

