Database: Review Sept. 2004Yangjun Chen 91.39021 Database Introduction system architecture, Basic concepts, ER-model, Data modeling, B+-tree Hashing Relational.

Presentation transcript:

Database
- Introduction, system architecture, basic concepts
- ER-model, data modeling
- B+-tree, hashing
- Relational algebra, relational data model
- SQL: DDL, DML
- Normalization
- Lossless join
- Hierarchical databases

Introduction to database systems
- What is a database?
- The main characteristics of a database
- The basic database design method
- The entity-relationship data model for application modeling

The main characteristics of the database approach:
- single repository of data
- sharable by multiple users
- concurrency control and transaction concept
- security and integrity constraints
- self-describing: the system catalogue contains metadata
- program-data independence: some changes to the database are transparent to programs/users
- multiple views of data, to support individual needs of programs/users

Concepts and Architecture
- Database schema, schema evolution, database state
- Working process with a database system
- Database system architecture
- Data independence concept

Database schema, relation schema, schema evolution, database state
Student
    Name   StNo  Class  Major
    Smith  17    1      CS
    Brown  8     2      CS
Course
    CName  CNo  CrHrs  Dept
    Database CS C CS
Section
    SId  CNo  Semester  Yr  Instructor
    Spring 2000 Smith
    Winter 2000 Smith
    Spring 2000 Jones
Grades
    StNo  Sid  Grade
    A
    B

Working process with a database system:
- Definition: record structure, data elements, names, data types, constraints, etc.
- Construction: create database files, populate the database with records
- Manipulation: querying, updating

Database Management System (DBMS): a collection of software facilitating the definition, construction and manipulation of databases.
[Figure: users/actors; DBMS (request manager; storage manager, query evaluation); meta data; stored database]

Three-schema architecture
- External view: a specific user's or group's view of the database
- Conceptual schema: describes the whole database for all users
- Internal schema: physical storage structures and details

Data modeling using the ER-model
Entity-relationship model
- Entity types: strong entities, weak entities
- Relationships among entities
- Attributes: attribute classification
- Constraints: cardinality constraints, participation constraints
ER-to-relation mapping

ER-model:
[Figure: ER diagram. Entity types: employee, department, project, dependent. Relationship types: works for, manages, works on, dependents of, controls, supervision (supervisor, supervisee). Attributes shown: bdate, ssn, name (fname, minit, lname), sex, address, salary; birthdate, name, sex, relationship; name, number, location; number of employees; startdate; hours. Cardinalities are marked 1, N and M.]

Hashing technique
- external hashing
- static hashing & dynamic hashing
- hash function: a mathematical function that maps a key to a bucket address
- collisions
- collision resolution schemes: open addressing, chaining, multiple hashing
- linear hashing

External hashing: the data are on the disk.
Static hashing:
- uses a hash function to map keys to bucket addresses
- the primary area cannot be changed
- collision resolution schemes: open addressing, chaining, multiple hashing
Dynamic hashing:
- the primary area can be changed
- linear hashing

Linear hashing:
1. What is a phase?
2. How to split a bucket?
3. When to split a bucket?
4. Which bucket will be chosen to split next?

Linear hashing:
- initially the hash file contains M buckets
- hash functions: $h_i = \text{key} \bmod (2^i \cdot M)$, for $i = 0, 1, 2, \ldots$
- the insertion process can be divided into several phases
- phase 1:
  - insertion using $h_0 = \text{key} \bmod M$
  - splitting using $h_1 = \text{key} \bmod 2M$
  - splitting rule: overflow of a bucket, or load factor > constant (e.g., 0.70)
  - an overflow record is put in the overflow area or redistributed by splitting a bucket
  - buckets are split in order from n = 0 to n = M - 1 (after each split, n is increased by 1)
  - phase 1 finishes when n = M (in this case, the primary area becomes 2M buckets long)

- phase 2:
  - insertion using $h_1 = \text{key} \bmod 2M$
  - splitting using $h_2 = \text{key} \bmod 4M$
  - splitting rule: overflow of a bucket, or load factor > constant (e.g., 0.70)
  - an overflow record is put in the overflow area or redistributed by splitting a bucket
  - buckets are split in order from n = 0 to n = 2M - 1 (after each split, n is increased by 1)
  - phase 2 finishes when n = 2M (in this case, the primary area will contain 4M buckets)
- phase 3: ... … h 2 = …, h 3 = …, ...
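The phases described for linear hashing can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the slides' implementation: the class name `LinearHashFile`, the bucket `capacity`, and the choice to split only on the load-factor rule are assumptions, and each Python list stands for a primary bucket together with its overflow chain.

```python
class LinearHashFile:
    """Toy in-memory linear hash file (illustrative assumptions throughout)."""

    def __init__(self, m=4, capacity=2, max_load=0.70):
        self.m = m                # M: initial number of primary buckets
        self.capacity = capacity  # assumed records per primary bucket
        self.max_load = max_load  # load-factor threshold, e.g. 0.70
        self.level = 0            # i of the insertion function h_i (phase 1 uses h_0)
        self.n = 0                # next bucket to split
        self.buckets = [[] for _ in range(m)]
        self.count = 0

    def _h(self, i, key):
        # h_i = key mod (2^i * M)
        return key % (2 ** i * self.m)

    def _address(self, key):
        b = self._h(self.level, key)
        if b < self.n:            # bucket b was already split in this phase
            b = self._h(self.level + 1, key)
        return b

    def insert(self, key):
        self.buckets[self._address(key)].append(key)
        self.count += 1
        if self.count / (len(self.buckets) * self.capacity) > self.max_load:
            self._split()

    def _split(self):
        old = self.buckets[self.n]
        self.buckets[self.n] = []
        self.buckets.append([])   # new bucket at address n + 2^level * M
        for key in old:           # redistribute with the next hash function
            self.buckets[self._h(self.level + 1, key)].append(key)
        self.n += 1
        if self.n == 2 ** self.level * self.m:   # the phase is finished
            self.level += 1
            self.n = 0

    def lookup(self, key):
        return key in self.buckets[self._address(key)]


f = LinearHashFile()
for k in range(50):
    f.insert(k)
```

Splitting always proceeds in bucket order, regardless of which bucket actually overflowed; that is what lets a lookup decide between the two hash functions by comparing the address with n.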

Multi-level index
- tree: root, internal, leaf, subtree; parent, child, sibling
- balanced, unbalanced
- B+-tree: splits on overflow, merges on underflow; in practice it is usually 3 or 4 levels deep
- search, insert, delete algorithms

B+-tree insertion: leaf node splitting, internal node splitting
Leaf splitting. When a leaf splits:
- a new leaf is allocated
- the original leaf is the left sibling, the new one is the right sibling
- key and pointer pairs are redistributed: the left sibling will have smaller keys than the right sibling
- a 'copy' of the largest key in the left sibling is promoted to the parent
[Figure: insert 31]
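A leaf split as described here might look like the following sketch. The function name `split_leaf`, the representation of a leaf as a sorted list of (key, pointer) pairs, and the decision to give the left sibling the extra entry are assumptions made for illustration; the sample keys are made up.

```python
def split_leaf(entries, new_entry):
    """Split an overflowing leaf.

    entries: sorted list of (key, pointer) pairs in the full leaf.
    Returns (left_sibling, right_sibling, promoted_key).
    """
    merged = sorted(entries + [new_entry], key=lambda e: e[0])
    mid = (len(merged) + 1) // 2      # assumed: left sibling keeps the extra entry
    left, right = merged[:mid], merged[mid:]
    promoted = left[-1][0]            # copy of the largest key in the left sibling
    return left, right, promoted


# Hypothetical leaf holding two entries (p_leaf = 2), then inserting key 31.
left, right, promoted = split_leaf([(20, "r20"), (30, "r30")], (31, "r31"))
```

The promoted key stays in the leaf as well, which is why the text calls it a copy.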

Internal node splitting
If an internal node splits and it is not the root:
- insert the key and pointer, and then determine the middle key
- a new 'right' sibling is allocated
- everything to the left of the middle key stays in the left sibling
- everything to the right of the middle key goes into the right sibling
- the middle key value, along with the pointer to the new right sibling, is promoted to the parent (the middle key value 'moves' to the parent to become the discriminator between the left and right siblings)
[Figure: Insert 26 33]
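For comparison with leaf splitting, an internal split can be sketched as below. The function name `split_internal` and the list-based node layout (keys plus one more child pointer than keys) are assumptions; the point shown is that the middle key moves up rather than being copied.

```python
def split_internal(keys, children):
    """Split an overflowing internal node.

    keys: sorted keys after the new key has been inserted.
    children: child pointers, len(children) == len(keys) + 1.
    Returns (left_node, middle_key, right_node), each node as (keys, children).
    """
    mid = len(keys) // 2
    middle = keys[mid]                             # moves up to the parent
    left = (keys[:mid], children[:mid + 1])        # everything left of the middle key
    right = (keys[mid + 1:], children[mid + 1:])   # everything right of it
    return left, middle, right


# Hypothetical node with p = 3 that has temporarily grown to three keys.
left, middle, right = split_internal([5, 8, 12], ["a", "b", "c", "d"])
```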

Internal node splitting
When a new root is formed, a key value and two pointers must be placed into it.
[Figure: insertion example]

Deleting nodes from a B+-tree:
1. When deleting a key from a node A, check whether the number of remaining keys (or pointers) is $\ge \lceil p/2 \rceil$.
2. If this is not the case, redistribute keys with the left sibling B or the right sibling C, if possible. Otherwise, merge A and B, or merge A and C.
3. When redistributing or merging, change the key values in the parent node so that the following conditions are satisfied:
   - $K_1 < K_2 < \ldots < K_{q-1}$ (i.e., the keys form an ordered set)
   - for every key value $X$ in the subtree pointed to by $P_i$:
     $K_{i-1} < X \le K_i$ for $1 < i < q$
     $X \le K_1$ for $i = 1$
     $K_{q-1} < X$ for $i = q$
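The two checks used in these steps, the minimum-occupancy test and the ordering condition on the parent's keys, can be expressed compactly. The helper names `underflows` and `child_index` are hypothetical, and the sketch assumes the threshold is the ceiling of p/2 pointers.

```python
import math
from bisect import bisect_left


def underflows(num_pointers, p):
    """True if a node with this many pointers is below the assumed minimum ceil(p/2)."""
    return num_pointers < math.ceil(p / 2)


def child_index(keys, x):
    """0-based index of the subtree that must contain key value x.

    Index j corresponds to pointer P_(j+1): every key before position j is
    smaller than x, and x <= keys[j] when j < len(keys).
    """
    return bisect_left(keys, x)


# Hypothetical parent node with keys K1 = 5 and K2 = 9 (so q = 3 pointers).
parent_keys = [5, 9]
```

With these keys, the values 5, 6, 9 and 10 fall under pointers P1, P2, P2 and P3 respectively, matching the three cases of the condition.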

A B+-tree with p = 3 and p_leaf = 2.
[Figure: the example tree and its records]

Entry deletion - deletion sequence: 8, 12, 9, 7
Deleting 8 causes the node to redistribute.

Entry deletion - deletion sequence: 8, 12, 9, 7
12 is removed.

Entry deletion - deletion sequence: 8, 12, 9, 7
9 is removed.

Entry deletion - deletion sequence: 8, 12, 9, 7
Deleting 7 makes this pointer useless. Therefore, a merge at the level above the leaf level occurs.

Entry deletion - deletion sequence: 8, 12, 9, 7
For this merge, 5 will be taken as a key value in A, since any key value in B is less than or equal to 5 but any key value in C is larger than 5.
This pointer becomes useless. The corresponding node should also be removed.
[Figure: nodes A, B and C]

Entry deletion - deletion sequence: 8, 12, 9, 7
[Figure]

Data modeling using the relational model; relational algebra
Relational data model
- relational schema, relations
- database schema, database state
- integrity constraints and updating
Relational algebra
- select, project, join, cartesian product
- division
- set operations: union, intersection, difference

Integrity Constraints
Any database will have some number of constraints that must be applied to ensure correct data (valid states).
1. domain constraints
   - a domain is a restriction on the set of valid values
   - domain constraints specify that the value of each attribute A must be an atomic value from the domain dom(A)
2. key constraints
   - a superkey is any combination of attributes that uniquely identifies a tuple: t1[superkey] ≠ t2[superkey] for any two distinct tuples t1 and t2
     Example: (in Employee)
   - a key is a superkey that has a minimal set of attributes
     Example: (in Employee)

Integrity Constraints
- If a relation schema has more than one key, each of them is called a candidate key.
- One candidate key is chosen as the primary key (PK).
- A foreign key (FK) is defined as follows:
  i) Consider two relation schemas R1 and R2;
  ii) The attributes in FK in R1 have the same domain(s) as the primary key attributes PK in R2; the attributes FK are said to reference or refer to the relation R2;
  iii) A value of FK in a tuple t1 of the current state r(R1) either occurs as a value of PK for some tuple t2 in the current state r(R2) or is null. In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers to the tuple t2.
Example: Employee(SSN, …, Dno), Dept(Dno, …), where Dno in Employee is an FK.

Integrity Constraints
3. entity integrity
   - no part of a PK can be null
4. referential integrity
   - the domain of an FK must be the same as the domain of the PK
   - an FK must be null or have a value that appears as a PK value
5. semantic integrity
   - other rules that the application domain requires
   - state constraint: gross salary > net income
   - transition constraint: Widowed can only follow Married; salary of an employee cannot decrease
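Entity and referential integrity can be observed directly in a small SQLite session driven from Python's standard `sqlite3` module. This is a sketch under assumptions: the table layouts are reduced to the Employee(SSN, Dno) and Dept(Dno) columns of the foreign key example, the sample values are invented, and `violates` is a hypothetical helper. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.execute("CREATE TABLE Dept (Dno INTEGER NOT NULL PRIMARY KEY)")
conn.execute("""
    CREATE TABLE Employee (
        SSN TEXT NOT NULL PRIMARY KEY,         -- entity integrity: PK is never null
        Dno INTEGER REFERENCES Dept(Dno)       -- referential integrity
    )""")
conn.execute("INSERT INTO Dept VALUES (5)")
conn.execute("INSERT INTO Employee VALUES ('123456789', 5)")     # FK matches a PK value
conn.execute("INSERT INTO Employee VALUES ('987654321', NULL)")  # a null FK is allowed


def violates(sql):
    """Return True if executing the statement raises an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


fk_rejected = violates("INSERT INTO Employee VALUES ('111111111', 99)")  # no Dept 99
pk_null_rejected = violates("INSERT INTO Employee VALUES (NULL, 5)")     # null PK
row_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```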

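Relational algebra
Retrieve for each female employee a list of the names of her dependents:
$\mathrm{FEMALE\_EMPS} \leftarrow \sigma_{\mathrm{SEX} = \text{'F'}}(\mathrm{EMPLOYEE})$
$\mathrm{EMPNAMES} \leftarrow \pi_{\mathrm{FNAME}, \mathrm{LNAME}, \mathrm{SSN}}(\mathrm{FEMALE\_EMPS})$
$\mathrm{ACTUAL\_DEPENDENTS} \leftarrow \mathrm{EMPNAMES} \bowtie_{\mathrm{SSN} = \mathrm{ESSN}} \mathrm{DEPENDENT}$
$\mathrm{RESULT} \leftarrow \pi_{\mathrm{FNAME}, \mathrm{LNAME}, \mathrm{DEPENDENT\_NAME}}(\mathrm{ACTUAL\_DEPENDENTS})$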
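To make the sequence of operations concrete, the same query can be traced with relations held as lists of dictionaries. The helper functions `select`, `project` and `join` and the sample tuples are illustrative assumptions, not part of the slides; `project` removes duplicates, as the algebra requires.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep the listed attributes and eliminate duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


def join(left, right, condition):
    """Join: combine every pair of tuples that satisfies the condition."""
    return [{**l, **r} for l in left for r in right if condition(l, r)]


# Made-up sample data for illustration only.
EMPLOYEE = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987654321", "SEX": "F"},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999887777", "SEX": "F"},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "333445555", "SEX": "M"},
]
DEPENDENT = [
    {"ESSN": "987654321", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "333445555", "DEPENDENT_NAME": "Alice"},
]

female_emps = select(EMPLOYEE, lambda t: t["SEX"] == "F")
empnames = project(female_emps, ["FNAME", "LNAME", "SSN"])
actual_dependents = join(empnames, DEPENDENT, lambda e, d: e["SSN"] == d["ESSN"])
result = project(actual_dependents, ["FNAME", "LNAME", "DEPENDENT_NAME"])
```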

SQL
DDL
- creating schemas
- modifying schemas
DML
- select-from-where clause
- group by, having, order by
- update
- view

DDL - Examples:
Create schema:
    CREATE SCHEMA COMPANY AUTHORIZATION JSMITH;
Create table:
    CREATE TABLE EMPLOYEE (
        FNAME     VARCHAR(15)     NOT NULL,
        MINIT     CHAR,
        LNAME     VARCHAR(15)     NOT NULL,
        SSN       CHAR(9)         NOT NULL,
        BDATE     DATE,
        ADDRESS   VARCHAR(30),
        SEX       CHAR,
        SALARY    DECIMAL(10, 2),
        SUPERSSN  CHAR(9),
        DNO       INT             NOT NULL,
        PRIMARY KEY (SSN),
        FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE(SSN),
        FOREIGN KEY (DNO) REFERENCES DEPARTMENT(DNUMBER));

DDL - Examples:
Drop schema:
    DROP SCHEMA COMPANY CASCADE;
    DROP SCHEMA COMPANY RESTRICT;
Drop table:
    DROP TABLE DEPENDENT CASCADE;
    DROP TABLE DEPENDENT RESTRICT;
Alter table:
    ALTER TABLE COMPANY.EMPLOYEE ADD JOB VARCHAR(12);
    ALTER TABLE COMPANY.EMPLOYEE DROP ADDRESS CASCADE;

DML - select-from-where clause
Retrieve a list of employees and the projects they are working on, ordered by department and, within each department, alphabetically by last name, first name:
    SELECT   DNAME, LNAME, FNAME, PNAME
    FROM     DEPARTMENT, EMPLOYEE, WORKS_ON, PROJECT
    WHERE    DNUMBER = DNO AND SSN = ESSN AND PNO = PNUMBER
    ORDER BY DNAME, LNAME, FNAME
- order by clause
- group by clause
- having clause
- aggregation functions: max, min, average, count, sum

DML - Insert, Update, Delete
    INSERT INTO employee ( fname, lname, ssn, dno ) VALUES ( "Joe", "Smith", 909, 1);
    UPDATE employee SET salary = … WHERE ssn=909;
    DELETE FROM employee WHERE ssn=909;
Note that Access changes the above INSERT to read:
    INSERT INTO employee ( fname, lname, ssn, dno ) SELECT "Joe", "Smith", 909, 1;

View definition
- Use a Create View command
- essentially a select specifying the data that makes up the view
    Create View Enames as select lname, fname from employee
    CREATE VIEW Enames (lname, fname) AS
        SELECT LNAME, FNAME
        FROM   EMPLOYEE

    CREATE VIEW DEPT_INFO (DEPT_NAME, NO_OF_EMPS, TOTAL_SAL) AS
        SELECT   DNAME, COUNT(*), SUM(SALARY)
        FROM     DEPARTMENT, EMPLOYEE
        WHERE    DNUMBER = DNO
        GROUP BY DNAME;
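The DEPT_INFO view can be tried out with Python's standard `sqlite3` module. The sketch below is only an approximation of the COMPANY schema: the two tables are cut down to the columns the view needs, the rows are invented, and it assumes an SQLite version recent enough to accept a column list in CREATE VIEW (3.9 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, hypothetical versions of the COMPANY tables.
    CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT NOT NULL);
    CREATE TABLE EMPLOYEE (
        SSN    TEXT PRIMARY KEY,
        LNAME  TEXT,
        SALARY NUMERIC,
        DNO    INTEGER REFERENCES DEPARTMENT(DNUMBER)
    );
    INSERT INTO DEPARTMENT VALUES (5, 'Research'), (4, 'Administration');
    INSERT INTO EMPLOYEE VALUES
        ('1', 'Smith', 30000, 5),
        ('2', 'Wong', 40000, 5),
        ('3', 'Zelaya', 25000, 4);

    CREATE VIEW DEPT_INFO (DEPT_NAME, NO_OF_EMPS, TOTAL_SAL) AS
        SELECT   DNAME, COUNT(*), SUM(SALARY)
        FROM     DEPARTMENT, EMPLOYEE
        WHERE    DNUMBER = DNO
        GROUP BY DNAME;
""")

# The view is queried like a table; it is recomputed from the base tables.
rows = conn.execute("SELECT * FROM DEPT_INFO ORDER BY DEPT_NAME").fetchall()
```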

Normalization
functional dependencies
- data redundancy, update anomalies
- what is a functional dependency?
- inference rules, minimal set of FDs
normal forms
- first normal form
- second normal form
- third normal form
- Boyce Codd normal form

Data redundancy and update anomalies:
    EmployeeDepartment(ename, ssn, bdate, address, dnumber, dname)
This is similar to Employee, but we have included dname.

In the two prior cases with EmployeeDepartment and EmployeeProject, we have redundant information in the database:
- if two employees work in the same department, then that department name is replicated
- if more than one employee works on a project, then the project location is replicated
- if an employee works on more than one project, his/her name is replicated
Redundant data leads to
- additional space requirements
- update anomalies

Suppose EmployeeDepartment is the only relation where the department name is recorded.
- insert anomalies: adding a new department is complicated unless there is also an employee for that department
- deletion anomalies: if we delete all employees for some department, what should happen to the department information?
- modification anomalies: if we change the name of a department, then we must change it in all tuples referring to that department

Functional dependencies:
Suppose we have a relation R comprising attributes X, Y, … We say a functional dependency exists between the attributes X and Y if, whenever a tuple exists with the value x for X, it will always have the same value y for Y.
    X → Y    (X is the left-hand side, LHS; Y is the right-hand side, RHS)

Student(student_no, student_name, course_no, gender)
Given a specific student number, there is only one value for student name and only one value for gender found with it.
    student_no → student_name
    student_no → gender
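Whether a given relation state is consistent with a functional dependency can be checked mechanically. The function `fd_holds` and the sample rows below are assumptions made for illustration. Note that a state can only refute a dependency: an FD is a statement about all legal states, so passing this check on one state does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        if seen.setdefault(x, y) != y:   # same X value seen before with another Y
            return False
    return True


# Made-up Student state for illustration.
student = [
    {"student_no": 1, "student_name": "Ann", "course_no": "C1", "gender": "F"},
    {"student_no": 1, "student_name": "Ann", "course_no": "C2", "gender": "F"},
    {"student_no": 2, "student_name": "Bob", "course_no": "C1", "gender": "M"},
]

name_and_gender_ok = fd_holds(student, ["student_no"], ["student_name", "gender"])
course_ok = fd_holds(student, ["student_no"], ["course_no"])
```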

Inference Rules for Functional Dependencies
From a set of FDs, we can derive some other FDs.
Example:
    F = {ssn → {ename, bdate, address, dnumber}, dnumber → {dname, dmgrssn}}
    inferred: ssn → {dname, dmgrssn}, ssn → dnumber, dnumber → dname
F+ (closure of F): the set of all FDs that can be deduced from F (together with F itself) is called the closure of F.

Inference Rules for Functional Dependencies
Inference rules:
- IR1 (reflexive rule): If X ⊇ Y, then X → Y. (X → X.)
- IR2 (augmentation rule): {X → Y} |= ZX → ZY.
- IR3 (transitive rule): {X → Y, Y → Z} |= X → Z.
- IR4 (decomposition, or projective, rule): {X → YZ} |= X → Y, X → Z.
- IR5 (union, or additive, rule): {X → Y, X → Z} |= X → YZ.
- IR6 (pseudotransitive rule): {X → Y, WY → Z} |= WX → Z.
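In practice, one rarely applies the inference rules by hand; a common shortcut is to compute the closure of an attribute set under F and test whether the right-hand side is contained in it. The sketch below uses that standard attribute-closure technique; the function name `closure` and the encoding of FDs as pairs of Python sets are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


F = [
    ({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]

ssn_closure = closure({"ssn"}, F)
dnumber_closure = closure({"dnumber"}, F)
# X -> Y follows from F exactly when Y is a subset of the closure of X.
ssn_determines_dname = {"dname", "dmgrssn"} <= ssn_closure
```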

Equivalence of Sets of FDs
- E and F are equivalent if E+ = F+.
Minimal sets of FDs
- every dependency has a single attribute on the RHS
- the attributes on the LHS of a dependency are minimal
- we cannot remove any dependency from F and still have a set of dependencies that is equivalent to F
Example: for the attributes (ssn, pnumber, hours, ename, plocation):
    {ssn, pnumber} → hours, ssn → ename, pnumber → plocation.
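The three conditions for a minimal set suggest a three-step procedure, sketched below. This is one common textbook formulation, not necessarily the one used in the course; the function names and the non-minimal input set `F` are assumptions chosen so that the result matches the example dependencies.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: a single attribute on every right-hand side.
    g = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    # Step 2: remove left-hand-side attributes that are not needed.
    reduced = []
    for lhs, rhs in g:
        keep = set(lhs)
        for b in sorted(lhs):
            if len(keep) > 1 and rhs <= closure(keep - {b}, g):
                keep.discard(b)
        reduced.append((frozenset(keep), rhs))
    # Step 3: remove dependencies implied by the remaining ones.
    result = list(dict.fromkeys(reduced))   # drop exact duplicates, keep order
    for fd in list(result):
        rest = [f for f in result if f != fd]
        if fd[1] <= closure(fd[0], rest):
            result = rest
    return result


# Hypothetical non-minimal input over (ssn, pnumber, hours, ename, plocation).
F = [
    ({"ssn", "pnumber"}, {"hours", "ename", "plocation"}),
    ({"ssn"}, {"ename"}),
    ({"pnumber"}, {"plocation"}),
]
cover = set(minimal_cover(F))
```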

Normal Forms
A series of normal forms is known, each with successively better update characteristics. We'll consider 1NF, 2NF, 3NF, and BCNF.
A technique used to improve a relation is decomposition, where one relation is replaced by two or more relations. When we do so, we want to eliminate update anomalies without losing any information.

1NF - First Normal Form
The domain of an attribute must contain only atomic values. This disallows repeating values, sets of values, relations within relations, nested relations, …
In the example database we have a department located in possibly several locations: department 5 is located in Bellaire, Sugarland, and Houston.
If we had the relation
    Department(dnumber, dname, dmgrssn, dlocations)
    dnumber = 5, dname = Research, dlocations = Bellaire, Sugarland, Houston
then it would not be in 1NF, because there are multiple values to be kept in dlocations.

1NF - First Normal Form
If we have a non-1NF relation, we can decompose it, or modify it appropriately, to generate 1NF relations. There are 3 options:
option 1: split off the problem attribute into a new relation (create a DepartmentLocation relation). Generally considered the best solution.
    Department(dnumber, dname, dmgrssn)
        dnumber = 5, dname = Research
    DepartmentLocation(dnumber, dlocation)
        (5, Bellaire), (5, Sugarland), (5, Houston)
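Option 1 amounts to a simple transformation of the data, sketched here with relations held as lists of dictionaries. The function name `to_1nf` is hypothetical, and the dmgrssn value is a placeholder because the transcript does not preserve it.

```python
def to_1nf(departments, multi_attr="dlocations", key="dnumber", new_attr="dlocation"):
    """Split a multivalued attribute off into a separate relation.

    Returns (base_relation, location_relation); the second relation repeats
    the key once for every value of the multivalued attribute.
    """
    base, locations = [], []
    for d in departments:
        base.append({k: v for k, v in d.items() if k != multi_attr})
        for value in d[multi_attr]:
            locations.append({key: d[key], new_attr: value})
    return base, locations


# Non-1NF input: dlocations holds a set of values ("m1" is a placeholder).
department = [
    {"dnumber": 5, "dname": "Research", "dmgrssn": "m1",
     "dlocations": ["Bellaire", "Sugarland", "Houston"]},
]
department_1nf, department_location = to_1nf(department)
```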

2NF - Second Normal Form
Full functional dependency: X → Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more.
    EmployeeProject(ssn, pnumber, hours, ename, plocation)
{ssn, pnumber} → hours is a full dependency (neither ssn → hours nor pnumber → hours holds).

2NF - Second Normal Form
Partial functional dependency: X → Y is a partial functional dependency if removal of some attribute A from X does not affect the dependency.
    EmployeeProject(ssn, pnumber, hours, ename, plocation)
{ssn, pnumber} → ename is a partial dependency, because ssn → ename holds.

2NF - Second Normal Form
A relation schema is in 2NF if (1) it is in 1NF and (2) every non-key attribute is fully functionally dependent on the primary key.
If we had the relation
    EmployeeProject(ssn, pnumber, hours, ename, plocation)
then this relation would not be in 2NF because of two separate violations of the 2NF definition: ename depends on ssn alone, and plocation depends on pnumber alone.

2NF - Second Normal Form
We correct this by decomposing the relation into three relations, splitting off the offending attributes (the partial dependencies on the key).
    EmployeeProject(ssn, pnumber, hours, ename, plocation)
is replaced by the 2NF relations
    (ssn, pnumber, hours)
    (ssn, ename)
    (pnumber, plocation)
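The 2NF violations can be found automatically by checking every proper subset of the primary key. The sketch below assumes a single primary key, as in the definition given here; the function names are hypothetical, and the FDs are those of the EmployeeProject example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """List (part_of_key, attribute) pairs that violate 2NF."""
    nonkey = set(attributes) - set(key)
    found = []
    for size in range(1, len(key)):                  # proper subsets of the key
        for part in combinations(sorted(key), size):
            for a in sorted(closure(part, fds) & nonkey):
                found.append((part, a))
    return found


attributes = {"ssn", "pnumber", "hours", "ename", "plocation"}
fds = [
    ({"ssn", "pnumber"}, {"hours"}),
    ({"ssn"}, {"ename"}),
    ({"pnumber"}, {"plocation"}),
]
violations_2nf = partial_dependencies({"ssn", "pnumber"}, attributes, fds)
```

Each reported pair names the relation to split off: the part of the key becomes the key of a new relation holding the dependent attribute.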

3NF - Third Normal Form
Transitive dependency: a functional dependency X → Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is not a subset of any key of R, and both X → Z and Z → Y hold.
    EmployeeDept(ename, ssn, bdate, address, dnumber, dname)
    ssn → dnumber and dnumber → dname

3NF - Third Normal Form
A relation schema is in 3NF if (1) it is in 2NF and (2) no non-key attribute is functionally dependent on another non-key attribute (there must be no transitive dependency of a non-key attribute on the PK).
If we had the relation
    (ename, ssn, bdate, address, dnumber, dname)
then this relation would not be in 3NF, because dname is functionally dependent on dnumber and neither is a key attribute.

3NF - Third Normal Form
We correct this by decomposing, i.e., splitting off the transitive dependencies.
    EmployeeDept(ename, ssn, bdate, address, dnumber, dname)
is replaced by the 3NF relations
    (ename, ssn, bdate, address, dnumber)
    (dnumber, dname)

Boyce Codd Normal Form, BCNF
Consider a different definition of 3NF, which is equivalent to the previous one.
A relation schema R is in 3NF if, whenever a functional dependency X → A holds in R, either
(a) X is a superkey of R, or
(b) A is a prime attribute of R.
A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two tuples t1 and t2 in any legal state r of R will have t1[S] = t2[S].
An attribute is called a prime attribute if it is a member of any key.

Boyce Codd Normal Form, BCNF
If we remove (b) from the previous definition of 3NF, we have the definition of BCNF.
A relation schema is in BCNF if every determinant is a superkey.
Stronger than 3NF:
- no partial dependencies
- no transitive dependencies where a non-key attribute is dependent on another non-key attribute
- no non-key attributes appear in the LHS of a functional dependency

Boyce Codd Normal Form, BCNF
Consider: (student_no, course_no, instr_no)
- An instructor teaches one course only.
- A student takes a course and has one instructor.
    {student_no, course_no} → instr_no
    instr_no → course_no
In 3NF! But not in BCNF, because instr_no is not a superkey.
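The superkey-or-prime-attribute test can be run mechanically on this example. The sketch below is illustrative only: the function names are hypothetical, candidate keys are found by brute force (fine for a handful of attributes), and only the dependencies listed in F are checked, which is sufficient when testing the relation as a whole.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attrs, keys = sorted(attributes), []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            s = set(combo)
            if closure(s, fds) == set(attrs) and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations(attributes, fds, allow_prime_rhs):
    """FDs X -> A that break 3NF (allow_prime_rhs=True) or BCNF (False)."""
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue                                  # (a) X is a superkey
        for a in sorted(rhs - lhs):                   # ignore trivial parts
            if allow_prime_rhs and a in prime:
                continue                              # (b) A is a prime attribute
            bad.append((set(lhs), a))
    return bad


R = {"student_no", "course_no", "instr_no"}
F = [({"student_no", "course_no"}, {"instr_no"}), ({"instr_no"}, {"course_no"})]
keys = candidate_keys(R, F)
violations_3nf = violations(R, F, allow_prime_rhs=True)
violations_bcnf = violations(R, F, allow_prime_rhs=False)
```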

Boyce Codd Normal Form, BCNF
This decomposition of (S#, C#, I#) preserves all the information:
    (course_no, instr_no)
    (student_no, instr_no)
The only FD is instr_no → course_no, but the join preserves {student_no, course_no} → instr_no.

Lossless join
Definition of lossless join property
- relation decomposition
- lossless join property
Testing algorithm
- matrix construction
- matrix initialization
- matrix modification

Basic definition of lossless join
A decomposition D = {R1, R2, ..., Rm} of R has the lossless join property with respect to the set of dependencies F on R if, for every relation r of R that satisfies F, the following holds:
$\bowtie(\pi_{R_1}(r), \ldots, \pi_{R_m}(r)) = r$,
where $\bowtie$ is the natural join of all the relations in D.
The word loss in lossless refers to loss of information, not to loss of tuples.

Example: decomposition-1
    Emp_PROJ(SSN, PNUM, hours, ENAME, PNAME, PLOCATION)
    F = {SSN → ENAME, PNUM → {PNAME, PLOCATION}, {SSN, PNUM} → hours}
    R1(SSN, ENAME)
    R2(PNUM, PNAME, PLOCATION)
    R3(SSN, PNUM, hours)
Lossless join

decomposition-1
Matrix construction:
          A1      A2      A3      A4      A5          A6
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    b11     b12     b13     b14     b15         b16
    R2    b21     b22     b23     b24     b25         b26
    R3    b31     b32     b33     b34     b35         b36
Matrix initialization:
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    a1      a2      b13     b14     b15         b16
    R2    b21     b22     a3      a4      a5          b26
    R3    a1      b32     a3      b34     b35         a6

Matrix modification:
After applying SSN → ENAME:
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    a1      a2      b13     b14     b15         b16
    R2    b21     b22     a3      a4      a5          b26
    R3    a1      a2      a3      b34     b35         a6
After applying PNUM → {PNAME, PLOCATION}:
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    a1      a2      b13     b14     b15         b16
    R2    b21     b22     a3      a4      a5          b26
    R3    a1      a2      a3      a4      a5          a6

Example: decomposition-2
    Emp_PROJ(SSN, PNUM, hours, ENAME, PNAME, PLOCATION)
    F = {SSN → ENAME, PNUM → {PNAME, PLOCATION}, {SSN, PNUM} → hours}
    R1(ENAME, PLOCATION)
    R2(SSN, PNUM, hours, PNAME, PLOCATION)
Not lossless join

decomposition-2
Matrix construction:
          A1      A2      A3      A4      A5          A6
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    b11     b12     b13     b14     b15         b16
    R2    b21     b22     b23     b24     b25         b26
Matrix initialization:
          SSN     ENAME   PNUM    PNAME   PLOCATION   hours
    R1    b11     a2      b13     b14     a5          b16
    R2    a1      b22     a3      a4      a5          a6
The matrix cannot be changed by any of SSN → ENAME, PNUM → {PNAME, PLOCATION}, {SSN, PNUM} → hours!
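The matrix test used for both decompositions can be sketched as follows. This is an illustrative version of the standard procedure: the function name `is_lossless` and the encoding of symbols ("a" for a_j, and a tuple ("b", i) for b_ij) are assumptions. The decomposition is lossless if, after no dependency can change the matrix any more, some row consists only of "a" symbols.

```python
def is_lossless(attributes, decomposition, fds):
    """Matrix test for the lossless join property.

    decomposition: list of attribute sets R1..Rm.
    fds: list of (lhs, rhs) pairs of attribute sets.
    """
    # Matrix initialization: "a" where Ri contains the attribute, else b_ij.
    matrix = [
        {a: "a" if a in r else ("b", i) for a in attributes}
        for i, r in enumerate(decomposition)
    ]
    changed = True
    while changed:                       # matrix modification until stable
        changed = False
        for lhs, rhs in fds:
            groups = {}
            for row in matrix:           # rows that agree on the left-hand side
                groups.setdefault(tuple(row[a] for a in sorted(lhs)), []).append(row)
            for rows in groups.values():
                for a in rhs:
                    values = {row[a] for row in rows}
                    if len(values) > 1:  # make the symbols in this column equal
                        target = "a" if "a" in values else min(values)
                        for row in matrix:
                            if row[a] in values:
                                row[a] = target
                        changed = True
    return any(all(v == "a" for v in row.values()) for row in matrix)


attrs = ["SSN", "ENAME", "PNUM", "PNAME", "PLOCATION", "hours"]
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUM"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUM"}, {"hours"}),
]
decomposition_1 = [{"SSN", "ENAME"}, {"PNUM", "PNAME", "PLOCATION"}, {"SSN", "PNUM", "hours"}]
decomposition_2 = [{"ENAME", "PLOCATION"}, {"SSN", "PNUM", "hours", "PNAME", "PLOCATION"}]
lossless_1 = is_lossless(attrs, decomposition_1, F)
lossless_2 = is_lossless(attrs, decomposition_2, F)
```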

Hierarchical databases
Hierarchical database schema
- hierarchical schema
- record type, PCR type
- virtual PCR: virtual child, virtual parent
Database languages
- HDDL
- HDML

ERD for Chapter 6 database example
[Figure: ER diagram with employee, department, project, dependent and Dept_locations, and the relationship Works on; cardinalities are marked n and m]

Virtual Parent-child Relationships
- Hierarchical schema using VPCR, for a Company database
[Figure: record types Department (Dname, Dnum), Project (Pname, …...), Dlocation (Location), Demployee (EPTR), Dmanager (MPTR), Pworker (Hours, WPTR), Employee (Ename, Minit, …...), Esupervisee (SPTR), Dependent (DEPname, Minit, ...); labels D, E, L, P, Y, M, W, S, T; StartDate]