Relational Model (CB Chapter 4) CPSC 356 Database Ellen Walker Hiram College
Relational Model Proposed by E.F. Codd, 1970 Goals –Data independence (physical vs. conceptual) –Normalization (avoiding redundancy) –Set-oriented data manipulation language (relational algebra) Examples of Relational DBMS: –DBMS System R (IBM), Ingres (UC Berkeley) –Oracle (all versions), Access, RDB, DB2 …
Terminology A relation (instance) is a table with columns and rows. –(don’t confuse with relationships in ER models) An attribute is a named column of a table. –(corresponds to simple attribute in ER) A domain is the set of allowable values for an attribute A tuple is a row of a table.
More Terminology The degree or arity of a relation is its number of attributes (columns) The cardinality of a relation is its number of tuples (rows) A relational database is a collection of normalized relations with distinct names. –We’ll get to normalization later!
Example Relation (Phone Book) Attributes: First, last, dept, , etc. Domain for Title: {Prof, Inst, Assoc, Asst} Tuple: (Obie,Slotterbeck,CS,Obie,5275,Prof) Degree = 6, Cardinality = 4 FirstLastDept PhoneTitle ObieSlotterbeckCSObie5275Prof IrinaLomonosovCSLomonosovI5002Inst EllenWalkerCSWalkerEL5250Prof AngelaGuercioCSGuercioA6048Asst
Order Doesn’t Matter! This is the same phone book relation. TitleFirstPhoneLastDept LomonosovIInstIrina5002LomonosovCS ObieProfObie5275SlotterbeckCS WalkerELProfEllen5250WalkerCS GuercioAAsstAngela5048GuercioCS
Sets and Subsets A set is an unordered collection of unique elements Set S = {1,2,3} “1 is an element of S” {a,b,c} = {a,c,b} A subset of a set is another set whose elements all come from the original set. {a,b} is a subset of {a,c,b} {1,2,3} is a subset of {1,2,3} {1,2,4} is not a subset of {1,2,3} {} (the empty set) is a subset of every set!
Mathematical Relations Cartesian product: a set of ordered pairs, where each contains one element from each original set {1,2,3} x {a, b} = {(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)} Relation: any subset of a Cartesian product –R1 = {(1,a),(2,b),(3,a)} –R2 = {(1,a), (1,b), (2,b), (3,b)} –R3 = { }
Bigger Relations First, take the cross product of n sets –If n = 3, get triples; if n=4, get quadruples –For arbitrary n, get “n-tuples” Now, take a subset of that big cross product –Several n-tuples Arrange them in rows, so the commas line up –Looks familiar… Domains of attributes are sets from the cross product.
Database Schemas A schema describes a database, but not the data Relation schema: set of attribute and domain name pairs (First: char-string)(Last: char-string) (Dept: {CS,Math,Comm,Econ,Mgmt, etc.}) (Phone: 4-digit-string)( char-string) (Title: {Inst,Asst,Assoc,Prof}) Relational Database Schema: set of relational schemas, each with a distinct name and integrity constraints (later)
Properties of (Database) Relations Every relation has a unique name Every attribute has a unique name Attribute’s values are all from its domain Because relations are sets… –No duplicate tuples –Order of tuples doesn’t matter (theoretically) Every cell contains a single value (single- valued attributes)
Single-valued Attributes “Oldest-child” is an ok attribute “Children” is not, because it is likely to be multi-valued Break up composite attributes into multiple attributes –Name --> First-name, Middle-init, Last-name Use “relationships” in the ER sense for multivalued attributes.
Keys Superkey: set of attributes that uniquely identifies a tuple. – (The whole tuple is always a superkey - why?) Candidate Key: a superkey that doesn’t have a proper subset that is also a superkey Primary Key: the candidate key that is selected by the designer (often generated) Foreign Key: Set of attributes in one relation that matches the Primary key of another.
Properties of a Candidate Key Uniqueness –The set of attribute values in the candidate key uniquely identifies a tuple Irreducibility –If one attribute is removed from the candidate key, then uniqueness no longer holds
NULL NULL is considered a member of every domain NULL means: –Value is currently unknown –Value is not applicable for this entry (e.g. “spouse” of a single person) Don’t confuse NULL with –Zero –Blank or empty string
Relational Schema “Shorthand” List the relation name first Attribute names in parentheses following the relation name Primary key attribute(s) underlined Example –Branch (BranchNo, street, city, postcode) –Staff (StaffNo, fName, position, sex, DOB, salary, BranchNo) Arrow from foreign key to its referent
Integrity Constraints Rules that apply to all instances of any database If these rules are violated, the instance is not a legal database (it doesn’t “make sense”) Some come from business rules of enterprise –“Students can take only 1 course in 3-week term” Others from the schema design –“InstructorID cannot be NULL and must match the ID of a listed faculty member”
Key Constraint Specifies which attributes compose a candidate key (or primary key) for the relation Example: –{first, last} and { } are both keys of the Phone Book relation Generally, only the primary key is specified In practice, an auto-generated id number is used as the key
Basic Integrity Constraints –Entity Integrity: No attribute of a primary key can be NULL for any tuple. This constraint is implicit in the KEY constraint in SQL –Referential Integrity: If a relation has a foreign key, either the foreign key is NULL, or the foreign key matches the primary key of an existing tuple in another relation. Attribute names do not have to be the same
Maintaining Referential Integrity Adding a new tuple –Add with NULL foreign keys –Add tuple in the other relation (with primary key matching the foreign key) first Deleting a tuple –Find all tuples in other relations whose foreign keys match this one –Either delete those tuples, or change their foreign keys to NULL Modifying a tuple –If a foreign key is changed, make sure that the new value points to an existing tuple or is NULL
Summary of Constraints Domain constraint –Each attribute’s values come from its domain Key constraint –Every relation has a primary key Entity integrity constraint –No NULL primary key attributes Referential integrity constraint –Foreign keys are NULL or refer to existing tuples’ primary keys Semantic constraints –Any additional constraints that must be satisfied based on the client’s needs
Views “Relations” that appear to exist, but actually combine attributes from multiple base (implemented) relations. When a base table changes, the view changes & vice versa Useful for –Security: limit which attributes a user sees –Customization –Simplification