RDBMS RELATIONAL DATABASE MANAGEMENT SYSTEM.


1 RDBMS RELATIONAL DATABASE MANAGEMENT SYSTEM

2 A Brief History of Data Models
1950s: file systems, punched cards
1960s: hierarchical (IMS)
1970s: network (CODASYL, IDMS)
1980s: relational (INGRES, ORACLE, DB2, Sybase, Paradox, dBase)
1990s: object-oriented and object-relational (O2, GemStone, Ontos)

3 Relational Model
Sets: collections of items of the same type; no order, no duplicates.
Mappings: from a domain to a range; 1:many, many:1, 1:1, or many:many.

4 Definitions
Database: a collection of interrelated data.
RDBMS: a relational database management system (RDBMS) is a database engine/system based on the relational model specified in 1970 by Edgar F. Codd, the father of modern relational database design. A relational database stores data in a structured format, using rows and columns.
Relation: a subset of the Cartesian product of its domains. Given a relation schema R with attributes A1, ..., An and a relation r on that schema, r(R) ⊆ (dom(A1) × dom(A2) × ... × dom(An)).
Relation schema: denoted by R(A1, A2, …, An); made up of the relation name R and the list of attributes A1, A2, …, An.

5 n-tuple: a set of n attribute-value pairs representing a single instance of a relation's mapping between its domains.
Degree: the number of attributes a relation has.
Cardinality: the number of tuples a relation has.
Attribute: a function on a domain for each instance of the mapping (each tuple).
Attribute value: the result of the attribute function. Each instance of the mapping is represented by one attribute value drawn from each domain, or by a special NULL value. Given a tuple t and an attribute A for a relation r, t[A] → a, where a is the attribute's value for that tuple.
Domain: the set of all possible values for an attribute; for attribute A, the domain is written dom(A). A domain has a format and a base data type.
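As a rough illustration of the definitions above, a relation can be modelled in Python as a set of tuples drawn from the Cartesian product of its domains. The attribute names (Name, Dept) and the domain values are hypothetical and do not come from the slides.

```python
from itertools import product

# Hypothetical domains for two attributes, Name and Dept (illustrative only).
dom_name = {"Ann", "Bob"}
dom_dept = {"Sales", "IT"}

# Cartesian product dom(Name) x dom(Dept): every possible (name, dept) pair.
all_pairs = set(product(dom_name, dom_dept))

# A relation instance r(R) is any subset of that product.
r = {("Ann", "Sales"), ("Bob", "IT")}

is_relation = r <= all_pairs      # subset test against the product
degree = len(("Name", "Dept"))    # number of attributes
cardinality = len(r)              # number of tuples
print(is_relation, degree, cardinality)  # True 2 2
```

Using a Python set mirrors the "no order, no duplicates" property of relations.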

6 Keys
Superkey: a set of attributes whose values together uniquely identify a tuple in a relation.
Candidate key: a superkey for which no proper subset is a superkey, that is, a key that is minimal. A relation can have more than one.
Primary key: a candidate key chosen to be the main key for the relation. There is one for each relation.
Keys can be composite.
Foreign key: a column or group of columns in a relational database table that provides a link between data in two tables. It acts as a cross-reference between tables because it references the primary key of another table, thereby establishing a link between them.
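The superkey and candidate-key definitions can be sketched as a brute-force check over one relation instance. This is only an illustration: keys are really constraints on the schema, so an instance can refute a key but never prove one. The function names and the sample rows are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Superkeys of this instance that have no proper subset that is a superkey."""
    keys = []
    # Smaller attribute sets are tried first, so minimal keys are found first.
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            has_smaller_key = any(set(k) < set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not has_smaller_key:
                keys.append(attrs)
    return keys

# Hypothetical sample data.
rows = [
    {"emp_no": 1, "email": "a@x", "dept": "IT"},
    {"emp_no": 2, "email": "b@x", "dept": "IT"},
    {"emp_no": 3, "email": "c@x", "dept": "HR"},
]
keys = candidate_keys(rows, ("emp_no", "email", "dept"))
print(keys)  # [('emp_no',), ('email',)]
```

Either candidate key could be chosen as the primary key; (emp_no, dept) is a superkey but not a candidate key because emp_no alone already identifies a row.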

7 Integrity Constraints
Integrity constraints guard against accidental damage to the database by ensuring that authorized changes to the database do not result in a loss of data consistency.
Types: domain constraints, referential integrity, assertions, triggers, functional dependencies.

8 Functional Dependencies
Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs.
FDs and keys are used to define normal forms for relations.
FDs are constraints that are derived from the meaning and interrelationships of the data attributes.

9 Functional Dependencies (2)
A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y.
X → Y holds if whenever two tuples have the same value for X, they must have the same value for Y: if t1[X] = t2[X], then t1[Y] = t2[Y] in any relation instance r(R).
X → Y in R specifies a constraint on all relation instances r(R).
FDs are derived from the real-world constraints on the attributes.
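The condition "if t1[X] = t2[X], then t1[Y] = t2[Y]" can be tested on a single relation instance with a short sketch like the one below. The function name fd_holds and the sample rows are hypothetical. Passing the test on one instance does not prove the FD for the schema; failing it does refute the FD.

```python
def fd_holds(rows, lhs, rhs):
    """Check X -> Y on one instance: equal X-values must imply equal Y-values."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X-value, different Y-value
    return True

# Hypothetical sample data.
rows = [
    {"emp": 1, "dept": "IT", "mgr": "Kim"},
    {"emp": 2, "dept": "IT", "mgr": "Kim"},
    {"emp": 3, "dept": "HR", "mgr": "Lee"},
]
print(fd_holds(rows, ["dept"], ["mgr"]))  # True
print(fd_holds(rows, ["mgr"], ["emp"]))   # False
```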

10 Inference Rules for FDs
Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.
Armstrong's inference rules:
A1. (Reflexive) If Y ⊆ X, then X → Y
A2. (Augmentation) If X → Y, then XZ → YZ (Notation: XZ stands for X ∪ Z)
A3. (Transitive) If X → Y and Y → Z, then X → Z
A1, A2, A3 form a sound and complete set of inference rules.
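In practice, inferred FDs are usually checked with the attribute-closure algorithm rather than by applying A1 to A3 by hand: X → Y follows from F exactly when Y is contained in the closure of X under F. The sketch below assumes FDs are given as (left-hand side, right-hand side) pairs of attribute sets; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)

# Hypothetical FDs: A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))   # ['A', 'B', 'C']
print(implies(fds, {"A"}, {"C"}))    # True, by transitivity
print(implies(fds, {"C"}, {"A"}))    # False
```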

11 Introduction to Normalization
Normalization: the process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations.
Normal form: a condition, using the keys and FDs of a relation, that certifies whether a relation schema is in a particular normal form.
2NF, 3NF, and BCNF are based on the keys and FDs of a relation schema.
4NF is based on keys and multi-valued dependencies.

12 First Normal Form
We say a relation is in 1NF if all values stored in the relation are single-valued and atomic.
1NF places restrictions on the structure of relations. Values must be simple.

13 First Normal Form
The following is not in 1NF:

EmpNum  EmpPhone  EmpDegrees
123     233-9876
333     233-1231  BA, BSc, PhD
679               BSc, MSc

EmpDegrees is a multi-valued field:
employee 679 has two degrees: BSc and MSc
employee 333 has three degrees: BA, BSc, PhD
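One common way to reach 1NF, sketched here as an assumption rather than as the slides' own method, is to move the multi-valued field into a separate relation with one (EmpNum, EmpDegree) row per degree. The degree lists follow the example above; the variable names are hypothetical.

```python
# Employees with a multi-valued EmpDegrees field (not in 1NF).
employees = [
    {"EmpNum": 123, "EmpDegrees": []},
    {"EmpNum": 333, "EmpDegrees": ["BA", "BSc", "PhD"]},
    {"EmpNum": 679, "EmpDegrees": ["BSc", "MSc"]},
]

# One atomic (EmpNum, EmpDegree) row per degree.
emp_degree = [(e["EmpNum"], d) for e in employees for d in e["EmpDegrees"]]

for row in emp_degree:
    print(row)
# (333, 'BA')
# (333, 'BSc')
# (333, 'PhD')
# (679, 'BSc')
# (679, 'MSc')
```

Employee 123 has no degrees and so contributes no rows to the new relation.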

14 Second Normal Form
A relation is in 2NF if it is in 1NF and every non-key attribute is fully dependent on each candidate key. (That is, we don't have any partial functional dependency.)
2NF (and 3NF) both involve the concepts of key and non-key attributes. A key attribute is any attribute that is part of a key; any attribute that is not a key attribute is a non-key attribute.
Relations that are not in BCNF have data redundancies.
A relation in 2NF will not have any partial dependencies.

15 Third Normal Form
A relation is in 3NF if the relation is in 2NF and all determinants of non-key attributes are candidate keys.
That is, for any functional dependency X → Y, where Y is a non-key attribute (or a set of non-key attributes), X is a candidate key.
This definition of 3NF differs from BCNF only in the specification of non-key attributes: 3NF is weaker than BCNF. (BCNF requires all determinants to be candidate keys.)
A relation in 3NF will not have any transitive dependencies of a non-key attribute on a candidate key through another non-key attribute.

16 Boyce-Codd Normal Form
BCNF is defined very simply: a relation is in BCNF if it is in 1NF and if every determinant is a candidate key.
If our database will be used for OLTP (online transaction processing), then BCNF is our target. Usually, we meet this objective. However, we might denormalize (to 3NF, 2NF, or 1NF) for performance reasons.

17 Example: in 3NF, but not in BCNF
Relation: (student_no, course_no, instr_no)
An instructor teaches one course only. A student takes a course and has one instructor.
{student_no, course_no} → instr_no
instr_no → course_no
The relation is in 3NF but not in BCNF, since we have instr_no → course_no, but instr_no is not a candidate key.
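The BCNF violation in this example can be confirmed mechanically. The sketch below assumes the usual formulation that a nontrivial FD violates BCNF when its left-hand side is not a superkey, and tests "is a superkey" with attribute closure. The function names are hypothetical; the schema and FDs are the ones from the example.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left-hand side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(schema):
            bad.append((lhs, rhs))
    return bad

schema = ("student_no", "course_no", "instr_no")
fds = [
    (("student_no", "course_no"), ("instr_no",)),
    (("instr_no",), ("course_no",)),
]
violations = bcnf_violations(schema, fds)
print(violations)  # [(('instr_no',), ('course_no',))]
```

The closure of {instr_no} is only {instr_no, course_no}, so instr_no does not determine the whole relation and the second FD is reported.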

18 Relational Algebra
Formalism for creating new relations from existing ones.
Its place in the big picture:
Declarative query language (SQL, relational calculus) → Algebra (relational algebra, relational bag algebra) → Implementation

19 Relational Algebra
Five operators:
Union: ∪
Difference: -
Selection: σ
Projection: Π
Cartesian product: ×
Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ

20 1. Union and 2. Difference
R1 ∪ R2
Example: ActiveEmployees ∪ RetiredEmployees
R1 – R2
Example: AllEmployees – RetiredEmployees

21 What about Intersection?
It is a derived operator:
R1 ∩ R2 = R1 – (R1 – R2)
Also expressed as a join (will see later).
Example: UnionizedEmployees ∩ RetiredEmployees
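Because relations are sets of tuples, union, difference, and the derived intersection map directly onto Python set operations. The sketch below checks the identity R1 ∩ R2 = R1 – (R1 – R2) on hypothetical one-attribute relations.

```python
# Hypothetical relations with a single Name attribute.
r1 = {("Ann",), ("Bob",), ("Cy",)}
r2 = {("Bob",), ("Cy",), ("Dee",)}

union = r1 | r2
difference = r1 - r2
derived_intersection = r1 - (r1 - r2)   # intersection built from difference only

print(sorted(union))                    # [('Ann',), ('Bob',), ('Cy',), ('Dee',)]
print(sorted(difference))               # [('Ann',)]
print(derived_intersection == r1 & r2)  # True
```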

22 3. Selection
Returns all tuples which satisfy a condition.
Notation: σc(R)
Examples: σ Salary > … (Employee), σ name = "Smith" (Employee)
The condition c can use =, <, ≤, >, ≥, <>

23 Selection (example)
Employee
SSN  Name   Salary
…    John   200000
…    Smith  600000
…    Fred   500000

σ Salary > … (Employee)
SSN  Name   Salary
…    Smith  600000
…    Fred   500000
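Selection amounts to filtering tuples with a predicate. In the sketch below the names and salaries follow the example table, while the SSN column is omitted and the threshold of 300000 is an assumption chosen only so that the same two rows survive; the function name select is hypothetical.

```python
def select(rows, condition):
    """Return the tuples of rows that satisfy condition (relational selection)."""
    return [row for row in rows if condition(row)]

employee = [
    {"Name": "John", "Salary": 200000},
    {"Name": "Smith", "Salary": 600000},
    {"Name": "Fred", "Salary": 500000},
]

# Assumed threshold; the value used on the slide is not available.
high_paid = select(employee, lambda row: row["Salary"] > 300000)
print([row["Name"] for row in high_paid])  # ['Smith', 'Fred']
```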

24 4. Projection
Eliminates columns, then removes duplicates.
Notation: Π A1,…,An (R)
Example: project social-security number and names: Π SSN, Name (Employee)
Output schema: Answer(SSN, Name)

25 Projection (example)
Employee
SSN      Name  Salary
1234545  John  200000
5423341  …     600000
4352342  …     …

Π Name, Salary (Employee)
Name  Salary
John  200000
…     600000
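Projection keeps only the named columns and then removes duplicates, which a Python set does automatically. The rows below are hypothetical (they are not the values from the example table); two employees share the same name and salary so that the duplicate removal is visible.

```python
def project(rows, attrs):
    """Keep only attrs from each row; the set removes duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}

# Hypothetical data: the first and third rows agree on Name and Salary.
employee = [
    {"SSN": 1, "Name": "Ann", "Salary": 100},
    {"SSN": 2, "Name": "Bob", "Salary": 200},
    {"SSN": 3, "Name": "Ann", "Salary": 100},
]

answer = project(employee, ("Name", "Salary"))
print(sorted(answer))  # [('Ann', 100), ('Bob', 200)]
```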

26 5. Cartesian Product
Each tuple in R1 with each tuple in R2.
Notation: R1 × R2
Example: Employee × Dependents
Very rare in practice; mainly used to express joins.

27 Natural Join
Notation: R1 ⋈ R2
Meaning: R1 ⋈ R2 = ΠA(σC(R1 × R2))
Where:
The selection σC checks equality of all common attributes.
The projection ΠA eliminates the duplicate common attributes.

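28 Natural Join (example)
Example tables: R with attributes (A, B), S with attributes (B, C), and R ⋈ S with attributes (A, B, C). The cell values are drawn from X, Y, Z, U, V, W.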
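The natural join can be sketched as "pair up tuples, keep the pairs that agree on all common attributes, and store each common attribute once". The rows for R(A, B) and S(B, C) below are hypothetical and are not the rows of the example tables; the function name natural_join is also an assumption.

```python
def natural_join(r, s):
    """Join two lists of dict-rows on every attribute name they share."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    joined = []
    for t in r:
        for u in s:
            # Selection step: equality on all common attributes.
            if all(t[a] == u[a] for a in common):
                # Merging the dicts keeps each common attribute only once.
                joined.append({**t, **u})
    return joined

# Hypothetical data for R(A, B) and S(B, C).
R = [{"A": "x", "B": "z"}, {"A": "y", "B": "z"}, {"A": "z", "B": "v"}]
S = [{"B": "z", "C": "u"}, {"B": "v", "C": "w"}]

result = natural_join(R, S)
for row in result:
    print(row)
# {'A': 'x', 'B': 'z', 'C': 'u'}
# {'A': 'y', 'B': 'z', 'C': 'u'}
# {'A': 'z', 'B': 'v', 'C': 'w'}
```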

29 Theta Join
A join that involves a predicate:
R1 ⋈θ R2 = σθ(R1 × R2)
Here θ can be any condition.
Equi-join: a theta join where θ is an equality:
R1 ⋈A=B R2 = σA=B(R1 × R2)
Example: Employee ⋈SSN=SSN Dependents
Most useful join in practice.

30 Semijoin
R ⋉ S = Π A1,…,An (R ⋈ S)
Where A1, …, An are the attributes in R.
Example: Employee ⋉ Dependents
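A semijoin keeps the tuples of R that have at least one matching tuple in S, without adding any of S's attributes. The sketch below matches on the attribute names the two relations share; the data and the function name semijoin are hypothetical.

```python
def semijoin(r, s):
    """Return the rows of r that join with at least one row of s."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    # Values of the common attributes that occur somewhere in s.
    s_keys = {tuple(u[a] for a in common) for u in s}
    return [t for t in r if tuple(t[a] for a in common) in s_keys]

# Hypothetical data.
employee = [{"SSN": 1, "Name": "John"}, {"SSN": 2, "Name": "Smith"}]
dependents = [{"SSN": 1, "DepName": "Amy"}, {"SSN": 1, "DepName": "Ben"}]

with_dependents = semijoin(employee, dependents)
print(with_dependents)  # [{'SSN': 1, 'Name': 'John'}]
```

John appears once even though he has two dependents, which corresponds to the duplicate elimination of the projection in the definition.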

31 Finally: RA has Limitations!
Cannot compute "transitive closure".

Name1  Name2  Relationship
Fred   Mary   Father
…      Joe    Cousin
…      Bill   Spouse
Nancy  Lou    Sister

Find all direct and indirect relatives of Fred.
Cannot express in RA! Need to write a C program.
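What a general-purpose program adds over relational algebra here is a loop that repeats a join-like step until no new tuples appear. A minimal sketch in Python (used instead of C to keep the examples in one language) follows. The pairs (Fred, Mary) and (Nancy, Lou) appear in the table above; the pair (Mary, Joe) is an assumption added so that an indirect relative exists, and the pairs are followed in one direction only.

```python
def reachable(pairs, start):
    """Return everyone linked to start through one or more (Name1, Name2) pairs."""
    found = set()
    frontier = {start}
    while frontier:
        # One step: follow every pair whose first name is in the frontier.
        step = {b for a, b in pairs if a in frontier}
        frontier = step - found - {start}   # keep only newly discovered names
        found |= frontier
    return found

# (Mary, Joe) is an assumed pair; the others are taken from the table.
pairs = [("Fred", "Mary"), ("Mary", "Joe"), ("Nancy", "Lou")]

relatives = reachable(pairs, "Fred")
print(sorted(relatives))  # ['Joe', 'Mary']
```

The number of iterations depends on the data, which is why no fixed relational algebra expression can compute this result for every input.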

