IT Relational Database Theory
The Relational Theory
– Ways of working with data
    Types of “Models”
      – File database model
      – Hierarchical database model
      – Network database model
      – Relational database model
The Relational Theory
– Meaning of database model
    The way data is organized and stored
    The way data is manipulated
Relational Model of Data
– Published in 1970 by Dr. Edgar (Ted) Codd of IBM
    “A Relational Model of Data for Large Shared Data Banks”
Relational Model of Data
– Purpose
    Achieve program/data structure independence
    Treat data in a disciplined way
      – Apply the rigor of mathematics
      – Uses set theory: sets of related data
    Improve programmer productivity
The Relational Model
– Relational uses familiar concepts
    The data is perceived as organized in tables
– Relational also incorporates the rigor of mathematics
    Rows of the table are treated as elements in a set
    Manipulation of rows is based on set operations (Venn diagrams)
– The user works with a set of rows at a time
Relational also impacts Data Design
– Files were often constructed to support an application
– Tables are designed to describe one thing, or Entity, in the database
Example of a Relation
– ANIMAL: Entity (Relation)

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200
Definition of a Relation
– Data is organized and stored in structures called relations
– A relation is a table that adheres to certain rules
    A relation can be called a table
– A relation is a table containing all the data about some entity
    An entity is a thing or object that is important in this application area
    Data items in the table are related
Relational Data Structure

    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

    Relation: the whole table
    Attributes: the columns (ANAME, AFAMILY, WEIGHT)
    Tuples: the rows
    Primary Key: ANAME
    Domains: Name, Species, Weight
Relational Data Structure Definitions
– Relation
    The table
– Tuple
    A row
– Attribute
    A column
– Primary Key
    A unique identifier for the table
– Domain
    A pool of legal values from which an attribute value is selected
      – Related to meaning
      – Has a data type
– Degree
    The number of attributes
– Cardinality
    The number of tuples
Relational Table Rules
– A Relation is a table that adheres to the following rules:
    There are no duplicate tuples in the table
      – The tuples in the table are treated as a mathematical set
      – By definition, a set is a collection of unique elements
    There must be a primary key (unique identifier) for each tuple
    There is no order to the tuples (top to bottom)
    There is no order to the attributes (left to right)
      – By convention, the primary key attribute is usually the first one on the left side of the table
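The definitions and rules above can be sketched in Python by treating a relation as a set of tuples. This is only an illustration, not how a DBMS stores data; the `Animal` type and the variable names are assumptions made for the example, and the rows come from the ANIMAL table shown earlier.

```python
from typing import NamedTuple


class Animal(NamedTuple):
    """One tuple of the ANIMAL relation (hypothetical type for this sketch)."""
    aname: str     # primary key in this example
    afamily: str
    weight: int


# A relation is a set of tuples, so a repeated row is absorbed by the set.
animal = {
    Animal("Candice", "Camel", 1800),
    Animal("Zona", "Zebra", 900),
    Animal("Sam", "Snake", 5),
    Animal("Elmer", "Elephant", 5000),
    Animal("Leonard", "Lion", 1200),
    Animal("Sam", "Snake", 5),  # duplicate tuple, not stored twice
}

degree = len(Animal._fields)  # number of attributes
cardinality = len(animal)     # number of tuples

# The primary key must identify each tuple uniquely.
primary_key_is_unique = len({row.aname for row in animal}) == cardinality
```

Because a Python set has no ordering, the sketch also reflects the rule that tuples have no top-to-bottom order.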
Attributes
– Each attribute has a data type
    Examples: integer, character, date, user-defined
– The data value of an attribute can be null
– Each attribute value is atomic
    There is one and only one data value in each cell of the table
    There are no lists or arrays
    One fact per field, one field per fact
– An attribute can be called a Field (MS Access)
Relational Data Structure: Design
– Each relation contains data about only one entity
    Each row corresponds to one unique occurrence of the entity
– A relation does not contain arrays, lists or repeating groups
    No multi-valued attributes
– Tables are designed according to the Rules of Normalization
    Each data item in the table is determined
      – By the Primary Key
      – By the Whole Primary Key
      – Only by the Primary Key
– Normalization avoids well-known update problems
    Optimizes the design to minimize redundancy and storage requirements
Example: Table with repeating group
– Animal

    ANAME     AFAMILY    WEIGHT   FOOD
    Candice   Camel      1800     Hay, Buns
    Zona      Zebra      900      Brush
    Sam       Snake      5        Mice, People
    Elmer     Elephant   5000     Leaves
    Leonard   Lion       1200     People, Meat
Example: Table with no repeating group

    Animal
    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200

    Animal-Food
    ANAME     FOOD
    Candice   Hay
    Candice   Buns
    Zona      Brush
    Sam       Mice
    Sam       People
    Elmer     Leaves
    Leonard   People
    Leonard   Meat
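Removing the repeating group can be sketched as a small transformation: each list-valued FOOD cell becomes one Animal-Food row per food. The variable names are assumptions for this illustration; the data is taken from the two examples above.

```python
# Unnormalized rows: the last field holds a list (the repeating group).
animal_unnormalized = [
    ("Candice", "Camel", 1800, ["Hay", "Buns"]),
    ("Zona", "Zebra", 900, ["Brush"]),
    ("Sam", "Snake", 5, ["Mice", "People"]),
    ("Elmer", "Elephant", 5000, ["Leaves"]),
    ("Leonard", "Lion", 1200, ["People", "Meat"]),
]

# Animal keeps only the atomic, single-valued attributes.
animal = {(aname, afamily, weight)
          for aname, afamily, weight, _ in animal_unnormalized}

# Animal-Food gets one row per (animal, food) pair.
animal_food = {(aname, food)
               for aname, _, _, foods in animal_unnormalized
               for food in foods}
```

After the split every cell holds exactly one value, and ANAME in Animal-Food links each food row back to its animal.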
A Database Models the Real World
– A database represents reality
– The database is a collection of relations
    A relation represents an entity type
    Each tuple represents one occurrence of that entity type
    Each occurrence of an entity is unique
– A database contains information about
    Entities
    Relationships between entities
    Rules about the entities’ data and the relationships
Relational Databases Support Relationships
– Relational databases support relationships between entities
    A relationship is established by a Foreign Key
    The Primary Key of one table is repeated in the related table(s)
Example: The Zoo has an “Adopt-an-Animal” program
– A zoo member can adopt an animal

    Zoo-Member (ANAME is the Foreign Key)
    MID   MNAME          MADDR             ***   ANAME
    171   N. Harrison    1400 Blush Rd           Zona
    144   J. Montagano   th Ave                  Leonard
    194   J. Spence      1244 Lark Ln            Candice
    303   E. Wingate     5222 Gains Dr           Candice
    101   H. Yarchun     177 Beach Rd            (null)
    270   K. Steeg       140 Crystal Dr          Zona
    291   S. Ackerman    1172 Park Dr            Sam
    301   K. Snyder      th Ave                  (null)

    Animal
    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200
Example: Another Relationship

    Animal-Food (Composite Primary Key: ANAME + FOOD; Foreign Key: ANAME)
    ANAME     FOOD
    Candice   Hay
    Candice   Buns
    Zona      Brush
    Sam       Mice
    Sam       People
    Elmer     Leaves
    Leonard   People
    Leonard   Meat

    Animal
    ANAME     AFAMILY    WEIGHT
    Candice   Camel      1800
    Zona      Zebra      900
    Sam       Snake      5
    Elmer     Elephant   5000
    Leonard   Lion       1200
Relational Integrity Rules
– Entity Integrity
    No part of the Primary Key (PK) may be Null
– Referential Integrity
    The value of a Foreign Key (FK) must either
      – Be Null, or
      – Be one of the values of the PK in the related table
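The two integrity rules can be expressed as simple checks. The following is a minimal sketch, assuming `None` stands for Null; both function names are hypothetical and not part of any DBMS API.

```python
def satisfies_entity_integrity(primary_keys):
    """No part of any primary key may be Null (None).

    Each key is a tuple so that composite keys are covered too.
    """
    return all(part is not None for key in primary_keys for part in key)


def satisfies_referential_integrity(foreign_keys, referenced_pk_values):
    """Every FK value is either Null or an existing PK value."""
    return all(fk is None or fk in referenced_pk_values
               for fk in foreign_keys)


# PK values of Animal and FK values of Zoo-Member from the adoption example.
animal_pk = {"Candice", "Zona", "Sam", "Elmer", "Leonard"}
member_fk = ["Zona", "Leonard", "Candice", "Candice", None, "Zona", "Sam", None]

fk_ok = satisfies_referential_integrity(member_fk, animal_pk)
# "Gerald" is an invented name used only to show a violation.
fk_bad = satisfies_referential_integrity(member_fk + ["Gerald"], animal_pk)
pk_ok = satisfies_entity_integrity([("Candice", "Hay"), ("Sam", "Mice")])
pk_bad = satisfies_entity_integrity([("Candice", None)])
```

In practice the DBMS enforces both rules itself once the primary and foreign keys are declared; the functions only spell out what is being checked.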
Keys, Keys, and More Keys
– Characteristics of a Primary Key (PK)
    Unique
    Mandatory
    Unchanging
    Under the control of the IT organization
Keys, Keys, and More Keys
– Names or Types of Keys
    Candidate Key
      – A minimal set of attributes that can be used as the unique identifier for a table
    Primary Key
      – One of the candidate keys
    Alternate Key
      – A candidate key that is not the primary key
    Foreign Key
      – An attribute (or set of attributes) whose values match the primary key of a related table
      – Indicates relationships
    Composite Key
      – A key composed of more than one attribute
    Search Key
      – One or more attributes on which a retrieval is based
        » Indexes
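The "minimal and unique" part of the candidate key definition can be explored with a small search over attribute combinations. This is a sketch under a strong assumption: it inspects sample rows only, whereas a real key is a rule about all possible data, so sample data can refute a key but never prove one. The function names are hypothetical.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique in the sample rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_unique(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys


pairs = [("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"),
         ("0003", "Mice"), ("0003", "People"), ("0004", "Leaves"),
         ("0005", "People"), ("0005", "Meat")]
animal_food = [{"ANID": anid, "FOOD": food} for anid, food in pairs]

keys = candidate_keys(animal_food, ("ANID", "FOOD"))
```

For ANIMAL-FOOD neither ANID nor FOOD is unique alone, so the only candidate key found is the composite key (ANID, FOOD).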
Characteristics of Relationships
– Referential integrity applies to the relationship between entities
    Also known as an existence constraint or an enterprise rule
    For every relationship, referential integrity must be defined
Relationships have Cardinality
– One-To-One
– One-To-Many
– Many-To-Many
Relationships have Optionality
– Each entity’s participation is either Mandatory or Optional
Cardinality reflects Business Rules
– One-To-One Relationship
    One animal is cared for by one zoo worker
    One zoo worker cares for one animal
– One-To-Many Relationship
    One animal is cared for by many zoo workers
    One zoo worker cares for only one animal
– Many-To-Many Relationship
    One animal is cared for by many zoo workers
    One zoo worker cares for many animals
Mandatory Relationship
– The Foreign Key cannot be Null
– Every purchase order must have a supplier
– In the example below the FK, SNO, cannot be Null

Example:

    PORDER
    ONO   SNO   ODATE   ***
    [row values not recoverable; surviving ODATE fragments: /09/, /10/, /12/02]
    ***

    SUPPLIER
    SNO    SNAME             SADDR
    1234   Farm & Feed       7000 Booth Rd
    2079   The Grain House   2001 Larkin Dr
    ***
Example: FK can be Null

    ANIMAL
    ANID   ANAME     AFAMILY    WEIGHT
    0001   Candice   Camel      1800
    0002   Zona      Zebra      900
    0003   Sam       Snake      5
    0004   Elmer     Elephant   5000
    0005   Leonard   Lion       1200

    ZOO-MEMBER (ANID is the Foreign Key)
    MID   MNAME          MADDR             ***   ANID
    171   N. Harrison    1400 Blush Rd           0002
    144   J. Montagano   th Ave                  0005
    194   J. Spence      1244 Lark Ln            0001
    303   E. Wingate     5222 Gains Dr           0001
    101   H. Yarchun     177 Beach Rd            (null)
    270   K. Steeg       140 Crystal Dr          0002
    291   S. Ackerman    1172 Park Dr            0003
    301   K. Snyder      th Ave                  (null)
What happens when a Tuple is deleted?
– For every relationship, there are three possible delete options
    Cascades
      – Delete the target tuple and
      – Delete the related tuples
    Restricted
      – Delete restricted to cases for which there are no related tuples
    Nullifies
      – Delete the target tuple and
      – Set the FK to null in the related tuples
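The three delete options map onto the SQL clauses ON DELETE CASCADE, ON DELETE RESTRICT and ON DELETE SET NULL. The sketch below tries each one with Python's built-in `sqlite3` module on a two-row cut-down of the zoo tables; the helper name `outcome_after_delete` and the lower-case table names are assumptions for the example.

```python
import sqlite3


def outcome_after_delete(on_delete):
    """Delete animal 0002 under the given rule; return what remains."""
    con = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE animal (anid TEXT PRIMARY KEY, aname TEXT)")
    con.execute(
        "CREATE TABLE zoo_member (mid INTEGER PRIMARY KEY, mname TEXT, "
        f"anid TEXT REFERENCES animal(anid) ON DELETE {on_delete})"
    )
    con.executemany("INSERT INTO animal VALUES (?, ?)",
                    [("0001", "Candice"), ("0002", "Zona")])
    con.executemany("INSERT INTO zoo_member VALUES (?, ?, ?)",
                    [(171, "N. Harrison", "0002"), (194, "J. Spence", "0001")])
    try:
        con.execute("DELETE FROM animal WHERE anid = '0002'")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the delete is refused because a member refers to it
    animals = [row[0] for row in
               con.execute("SELECT anid FROM animal ORDER BY anid")]
    members = con.execute(
        "SELECT mid, anid FROM zoo_member ORDER BY mid").fetchall()
    con.close()
    return animals, members


cascade = outcome_after_delete("CASCADE")
restrict = outcome_after_delete("RESTRICT")
nullify = outcome_after_delete("SET NULL")
```

With CASCADE member 171 disappears along with the animal, with RESTRICT nothing is deleted, and with SET NULL member 171 stays but its ANID becomes Null.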
Relational Algebra Operations
– Select
– Project
– Join
– Union
– Intersect
– Difference
Our Zoo Database Tables

    ANIMAL
    ANID   ANAME     AFAMILY    WEIGHT
    0001   Candice   Camel      1800
    0002   Zona      Zebra      900
    0003   Sam       Snake      5
    0004   Elmer     Elephant   5000
    0005   Leonard   Lion       1200

    ZOO-MEMBER
    MID   MNAME          MADDR             ***   ANID
    171   N. Harrison    1400 Blush Rd           0002
    144   J. Montagano   th Ave                  0005
    194   J. Spence      1244 Lark Ln            0001
    303   E. Wingate     5222 Gains Dr           0001
    101   H. Yarchun     177 Beach Rd            (null)
    270   K. Steeg       140 Crystal Dr          0002
    291   S. Ackerman    1172 Park Dr            0003
    301   K. Snyder      th Ave                  (null)

    ANIMAL-FOOD
    ANID   FOOD
    0001   Hay
    0001   Buns
    0002   Brush
    0003   Mice
    0003   People
    0004   Leaves
    0005   People
    0005   Meat
Relational Algebra: SELECT
– Extracts specified tuples from a relation (or gets rows from a table)

Example: SELECT from the ANIMAL-FOOD table (display) the rows where FOOD=PEOPLE

    ANIMAL-FOOD        RESULTS
    ANID   FOOD        ANID   FOOD
    0001   Hay         0003   People
    0001   Buns        0005   People
    0002   Brush
    0003   Mice
    0003   People
    0004   Leaves
    0005   People
    0005   Meat
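The SELECT example can be written as a filter over rows. This is a minimal sketch; the `select` function is a hypothetical helper, not a library call.

```python
animal_food = [
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
    ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
]


def select(relation, predicate):
    """Keep only the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]


# FOOD is the second field of each (ANID, FOOD) tuple.
results = select(animal_food, lambda row: row[1] == "People")
```

All columns are kept; only the number of rows shrinks.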
Relational Algebra: PROJECT
– Extracts specified attributes (columns) from a relation (or gets columns from a table)

Example: PROJECT from the ZOO-MEMBER table the columns (MID, MNAME)

    ZOO-MEMBER
    MID   MNAME          MADDR             ***   ANID
    171   N. Harrison    1400 Blush Rd           0002
    144   J. Montagano   th Ave                  0005
    194   J. Spence      1244 Lark Ln            0001
    303   E. Wingate     5222 Gains Dr           0001
    101   H. Yarchun     177 Beach Rd            (null)
    270   K. Steeg       140 Crystal Dr          0002
    291   S. Ackerman    1172 Park Dr            0003
    301   K. Snyder      th Ave                  (null)

    RESULTS
    MID   MNAME
    171   N. Harrison
    144   J. Montagano
    194   J. Spence
    303   E. Wingate
    101   H. Yarchun
    270   K. Steeg
    291   S. Ackerman
    301   K. Snyder
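PROJECT can be sketched as picking named columns from each row and then dropping any duplicate rows, since the result is again a set. The `project` helper is hypothetical, and only three ZOO-MEMBER rows are used to keep the example short.

```python
zoo_member = [
    {"MID": 171, "MNAME": "N. Harrison", "MADDR": "1400 Blush Rd", "ANAME": "Zona"},
    {"MID": 194, "MNAME": "J. Spence", "MADDR": "1244 Lark Ln", "ANAME": "Candice"},
    {"MID": 303, "MNAME": "E. Wingate", "MADDR": "5222 Gains Dr", "ANAME": "Candice"},
]


def project(relation, attrs):
    """Keep only the named attributes, removing duplicate result rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(values)
    return result


members = project(zoo_member, ["MID", "MNAME"])
adopted = project(zoo_member, ["ANAME"])
```

Projecting on ANAME alone returns two rows rather than three, because the two Candice rows collapse into one.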
Relational Algebra: JOIN
– Join the data in two tables
    Concatenate one row from Table 1 with one row from Table 2
      – Usually based on a common column; the comparison on that column is called the join condition

Example: JOIN T1 and T2 based on the AFAMILY column

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID   AFAMILY   AREA
    0001   Camel       Camel     01       0001   Camel     01
    0002   Zebra       Zebra     03       0002   Zebra     03
Different types of Joins
– Equijoin: a row in T1 is joined with a row in T2 where the values in the common column(s) are equal
– This is the most common type of join
– Join T1 and T2 where T1.AFAMILY=T2.AFAMILY

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID   AFAMILY   AREA
    0001   Camel       Camel     01       0001   Camel     01
    0002   Zebra       Zebra     03       0002   Zebra     03
Natural Join
– The rows of T1 are joined with the rows of T2 where the values in the identically named columns are equal
    Typically the PK in one table and the FK in the other, where the column names are the same
    Don’t use this in a Production Database: renaming a column changes the join condition
– T1 NATURAL JOIN T2

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID   AFAMILY   AREA
    0001   Camel       Camel     01       0001   Camel     01
    0002   Zebra       Zebra     03       0002   Zebra     03
Inner Join
– The rows of T1 are joined with the rows of T2 based on the join condition specified
    Only rows from T1 with a matching row in T2 are in the result
    Often an Inner Join is both a Natural Join and an Equijoin

Example: Inner Join
– T1 INNER JOIN T2 on T1.AFAMILY=T2.AFAMILY

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID   AFAMILY   AREA
    0001   Camel       Camel     01       0001   Camel     01
    0002   Zebra       Zebra     03       0002   Zebra     03
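The inner equijoin on AFAMILY can be sketched as a nested loop that pairs rows with equal values in the common column. The `inner_join` helper and its column-position parameters are assumptions for the illustration; real systems choose among several join algorithms.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]   # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]       # (AFAMILY, AREA)


def inner_join(left, right, left_col, right_col):
    """Nested-loop equijoin; the right-hand copy of the join column is dropped."""
    return [
        l + r[:right_col] + r[right_col + 1:]
        for l in left
        for r in right
        if l[left_col] == r[right_col]
    ]


# Join condition: T1.AFAMILY = T2.AFAMILY
result = inner_join(t1, t2, left_col=1, right_col=0)
```

Dropping the duplicated AFAMILY column gives the three-column RESULT shown in the example.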
Outer Join
– The rows of T1 are joined with the rows of T2
    All rows from one of the tables are included in the result even if there is no matching row in the other table

Example: Outer Join
– T1 RIGHT OUTER JOIN T2 on T1.AFAMILY=T2.AFAMILY

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID     AFAMILY   AREA
    0001   Camel       Camel     01       0001     Camel     01
    0002   Zebra       Zebra     03       0002     Zebra     03
                       Snake     05       (null)   Snake     05
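A right outer join keeps every row of the right-hand table and pads the missing left-hand values with Null. The sketch below is written for this two-column example only, with `None` standing for Null; `right_outer_join` is a hypothetical helper.

```python
t1 = [("0001", "Camel"), ("0002", "Zebra")]                  # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03"), ("Snake", "05")]     # (AFAMILY, AREA)


def right_outer_join(left, right):
    """Keep every right-hand row; unmatched rows get None for ANID."""
    result = []
    for afamily, area in right:
        anids = [anid for anid, family in left if family == afamily]
        # An empty match list is replaced by a single Null placeholder.
        for anid in anids or [None]:
            result.append((anid, afamily, area))
    return result


result = right_outer_join(t1, t2)
```

The Snake row survives even though no animal in T1 belongs to that family.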
Cross Join
– Every row in T1 is joined with every row in T2
    All possible combinations of rows in the two tables
    Also called a Cartesian Product

Example: Cross Join
– T1 CROSS JOIN T2

    T1                 T2                 RESULT
    ANID   AFAMILY     AFAMILY   AREA     ANID   AFAMILY   AFAMILY   AREA
    0001   Camel       Camel     01       0001   Camel     Camel     01
    0002   Zebra       Zebra     03       0001   Camel     Zebra     03
                                          0002   Zebra     Camel     01
                                          0002   Zebra     Zebra     03
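The Cartesian product is available directly in Python's standard library, so the cross join can be sketched in one line with `itertools.product`. The variable names are assumptions for the example.

```python
from itertools import product

t1 = [("0001", "Camel"), ("0002", "Zebra")]   # (ANID, AFAMILY)
t2 = [("Camel", "01"), ("Zebra", "03")]       # (AFAMILY, AREA)

# Every row of T1 paired with every row of T2; no join condition at all.
result = [left + right for left, right in product(t1, t2)]
```

The result has len(T1) × len(T2) rows, which is why an unintended cross join on large tables is expensive.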
An RDBMS manipulates Data using Relational Algebra Operations
– There are (usually) several sequences of operations that answer a query
    One sequence may be more efficient than another
– A relational DBMS internally has routines that perform the relational algebra operations
– A relational DBMS generates a sequence, or plan, of relational algebra operations to accomplish the request
– A relational DBMS has a query optimizer to develop an efficient query plan
    A least-cost optimizer generates several execution plans and chooses the least-cost one, e.g., the one with the least amount of I/O
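To illustrate why one sequence of operations can be cheaper than another, the sketch below answers the question "which animals eat People?" with two plans and compares the size of the intermediate result. Counting intermediate rows is a stand-in chosen for this example; a real optimizer estimates I/O and other costs from statistics. All names are assumptions.

```python
animal = [("0001", "Candice"), ("0002", "Zona"), ("0003", "Sam"),
          ("0004", "Elmer"), ("0005", "Leonard")]
animal_food = [
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
    ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
]


def join(animals, foods):
    """Equijoin on ANID, returning (ANID, ANAME, FOOD) rows."""
    return [(anid, aname, food)
            for anid, aname in animals
            for food_anid, food in foods
            if anid == food_anid]


# Plan A: join first, then select FOOD = People.
joined = join(animal, animal_food)
plan_a = {aname for _, aname, food in joined if food == "People"}
plan_a_intermediate = len(joined)

# Plan B: select first, then join the much smaller table.
people_rows = [(anid, food) for anid, food in animal_food if food == "People"]
plan_b = {aname for _, aname, _ in join(animal, people_rows)}
plan_b_intermediate = len(people_rows)
```

Both plans return the same answer, but Plan B carries two intermediate rows into the join instead of producing eight and filtering afterwards.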
Union, Intersection, and Minus
– Union: combine the result tables from two queries, keeping each distinct row once
– Intersect: take only the rows that are identical in the result tables from two queries
– Difference: take only the rows in the first result table that have no identical rows in the second result table
Relational Algebra: UNION
– Union together the results of two queries
    Result contains every element in either one or both sets
– Query 1
    Select the rows from ANIMAL where WEIGHT > 2000 into T1
    Project from T1(ANID) into Result 1
– Query 2
    Select the rows from ANIMAL-FOOD where FOOD=PEOPLE into T2
    Project from T2(ANID) into Result 2
– Query 1 UNION Query 2

    RESULT 1   RESULT 2   RESULT (UNION)
    ANID       ANID       ANID
    0004       0003       0003
               0005       0004
                          0005
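The UNION example maps directly onto Python sets. In this sketch the ANID values of ANIMAL are taken to be 0001 to 0005 in the order the animals are listed, an assumption inferred from the ANIMAL-FOOD table; the variable names are also illustrative.

```python
animal = [  # (ANID, ANAME, AFAMILY, WEIGHT)
    ("0001", "Candice", "Camel", 1800),
    ("0002", "Zona", "Zebra", 900),
    ("0003", "Sam", "Snake", 5),
    ("0004", "Elmer", "Elephant", 5000),
    ("0005", "Leonard", "Lion", 1200),
]
animal_food = [
    ("0001", "Hay"), ("0001", "Buns"), ("0002", "Brush"), ("0003", "Mice"),
    ("0003", "People"), ("0004", "Leaves"), ("0005", "People"), ("0005", "Meat"),
]

# Query 1: select WEIGHT > 2000, then project ANID.
result_1 = {anid for anid, _, _, weight in animal if weight > 2000}

# Query 2: select FOOD = People, then project ANID.
result_2 = {anid for anid, food in animal_food if food == "People"}

# UNION: every ANID that appears in either result, each one only once.
union = result_1 | result_2
```

Both inputs must have the same columns (here a single ANID column) for the union to make sense.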
Relational Algebra: INTERSECTION
– Take only the rows (tuples) that are identical in the result tables of two queries
    Query 1
      – Select the rows from ANIMAL where WEIGHT > 1000 into T1
      – Project from T1(ANID) into Result 1
    Query 2
      – Project from ZOO-MEMBER(ANID) into Result 2
    Query 1 INTERSECT Query 2

    RESULT 1   RESULT 2   RESULT (INTERSECT)
    ANID       ANID       ANID
    0001       0001       0001
    0004       0002       0005
    0005       0003
               0005
Relational Algebra: Minus/Difference/Except
– Subtract the results of a second query from the results of the first query
    Query 1
      – Project from ANIMAL(ANID) into Result 1
    Query 2
      – Project from ZOO-MEMBER(ANID) into Result 2
    Query 1 EXCEPT Query 2

    RESULT 1   RESULT 2   RESULT (EXCEPT)
    ANID       ANID       ANID
    0001       0001       0004
    0002       0002
    0003       0003
    0004       0005
    0005
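The INTERSECT and EXCEPT examples can be sketched with Python's set operators. The ANID values for ZOO-MEMBER used here are an assumption inferred from the names in the adoption example (Zona, Leonard, Candice, Sam); Null entries are left out of the projection for simplicity.

```python
animal = [  # (ANID, WEIGHT)
    ("0001", 1800), ("0002", 900), ("0003", 5), ("0004", 5000), ("0005", 1200),
]
# FK column of ZOO-MEMBER; None marks members who have not adopted an animal.
zoo_member_anid = ["0002", "0005", "0001", "0001", None, "0002", "0003", None]

all_animals = {anid for anid, _ in animal}
heavy = {anid for anid, weight in animal if weight > 1000}
adopted = {anid for anid in zoo_member_anid if anid is not None}

# INTERSECT: heavy animals that have also been adopted.
heavy_and_adopted = heavy & adopted

# EXCEPT: animals that nobody has adopted.
not_adopted = all_animals - adopted
```

Unlike union and intersection, the difference depends on the order of its operands: `adopted - all_animals` would be empty here.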
Strengths of the Relational Approach
– Simple
    People are familiar with tables
    Few rules
    Few operations
– Easy to learn
    Relational algebra is straightforward
    Multiple high-level, non-procedural languages are available
      – SQL
– Well founded
    Basis is mathematics, set theory