Chap 2. The Relational Model of Data


1 Chap 2. The Relational Model of Data

2 Contents
- An Overview of Data Models
- Basics of the Relational Model
- Defining a Relation Schema in SQL (after Chapter 6 (SQL))
- An Algebraic Query Language
- Constraints on Relations

3 An Overview of Data Models
- Data model: an abstract description of data
  - the description generally consists of structure and operations, with certain constraints
- Data model (when focused on the structure): an abstract description of the logical structure of data
- Structure of the data
  - a high-level description of the structure of the data
  - sometimes referred to as a conceptual (data) model
  - higher level than data structures in C or Java, such as arrays and structures

4 An Overview of Data Models (cont’d)
- Operations on the data
  - usually a limited set of high-level operations in a DB data model
  - queries: operations that retrieve information
  - modifications: operations that change the database
- Constraints on the data
  - a way to describe limitations on what the data can be
  - (ex) "a movie has at most one title"
  - (ex) "a day of the week is an integer between 1 and 7"

5 An Overview of Data Models (cont’d)
- Various data models
  - relational model: widely used in all commercial database management systems
  - semistructured-data model: includes XML and related standards
  - other data models
    - object-oriented model: may be used for some special-purpose applications
    - object-relational model: O-O features are added to the relational model
    - hierarchical model, network model: used in earlier DBMSs

6 Basics of the Relational Model
- Relation: a two-dimensional table
  - a set of tuples whose components have atomic values
- The relation (or table) Movies:

  title              | year | length | genre
  Gone With the Wind | ...  | ...    | drama
  Star Wars          | 1977 | 124    | sciFi
  Wayne's World      | 1992 | 95     | comedy

- Each column is an attribute and represents a property of movies
- Each row is a tuple and represents a movie

7 Basics of the Relational Model (cont’d)
- Attributes: names for the columns of the relation
  - (ex) title, year, length, genre in relation Movies
- Tuples: rows of a relation
  - (ex) (Star Wars, 1977, 124, sciFi)
- Domains: an elementary type associated with each attribute of a relation
  - (ex) the value of attribute title must be a string of at most 30 characters
  - the relational model requires that each attribute be atomic, i.e., a record structure, set, list, etc. is not allowed as a value

8 Basics of the Relational Model (cont’d)
- Schema: a description of the data itself
  - relation schema: the name of a relation and the set of attributes for the relation
    - (ex) the schema for relation Movies: Movies(title, year, length, genre)
  - relational database schema (or simply, database schema): the set of schemas for the relations of a database
- Relation instance: the set of tuples of a given relation

9 Basics of the Relational Model (cont’d)
- Equivalent representations of a relation
  - the order of tuples in a relation is irrelevant: a relation is a set of tuples, not a list of tuples
  - the column order is also irrelevant
- Another presentation of the relation Movies:

  year | genre  | title              | length
  1977 | sciFi  | Star Wars          | 124
  1992 | comedy | Wayne's World      | 95
  ...  | drama  | Gone With the Wind | ...
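
A minimal Python sketch of why row and column order do not matter. It assumes a representation that is not in the slides: each tuple is a frozenset of (attribute, value) pairs and a relation is a set of such tuples. The helper name `relation` is hypothetical.

```python
def relation(rows):
    """Build a relation as a set of tuples; each tuple is a frozenset of
    (attribute, value) pairs, so neither row nor column order is stored."""
    return {frozenset(row.items()) for row in rows}

# The same two movies, written with different row and column orders.
movies_a = relation([
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "sciFi"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "genre": "comedy"},
])
movies_b = relation([
    {"genre": "comedy", "year": 1992, "title": "Wayne's World", "length": 95},
    {"genre": "sciFi", "year": 1977, "title": "Star Wars", "length": 124},
])

same = movies_a == movies_b
print(same)
```

Both presentations compare equal because only the set of attribute/value pairs is kept.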

10 Basics of the Relational Model (cont’d)
- Key of a relation
  - a fundamental constraint
  - an attribute (or a set of attributes) of a relation such that no two tuples are allowed to have the same values in all the attributes of the key
  - used for unique identification of a tuple
  - (ex) Declare that title and year form a key of Movies: a second tuple of the form (King Kong, 1980, ...) is then not allowed

11 Basics of the Relational Model (cont’d)
- Notation for the key attribute(s): underline them
  - e.g., Movies(title, year, length, genre), with title and year underlined
- A key constraint is about all possible instances of the relation, not about a single instance
- There can be several keys in a relation
  - (ex) in a relation Students, the social-security number, the student ID, etc. can each serve as a key
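
A small Python sketch of a key check on one instance. The function name `satisfies_key` and the conflicting row are hypothetical. Note that this only tests a single instance, whereas a key constraint is a statement about every possible instance.

```python
def satisfies_key(rows, key):
    """Return True if no two rows agree on every attribute in `key`."""
    seen = set()
    for row in rows:
        key_value = tuple(row[attr] for attr in key)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True

movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "sciFi"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "genre": "comedy"},
]
# Hypothetical extra row that repeats (title, year) with a different length.
conflicting = movies + [
    {"title": "Star Wars", "year": 1977, "length": 120, "genre": "sciFi"},
]

ok_instance = satisfies_key(movies, ["title", "year"])
bad_instance = satisfies_key(conflicting, ["title", "year"])
print(ok_instance, bad_instance)
```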

12 Basics of the Relational Model (cont’d)
- An example database schema

    Movies(title:string, year:integer, length:integer, genre:string, studioName:string, producerC#:integer)
    MovieStar(name:string, address:string, gender:char, birthdate:date)
    StarsIn(movieTitle:string, movieYear:integer, starName:string)
    MovieExec(name:string, address:string, cert#:integer, netWorth:integer)
    Studio(name:string, address:string, presC#:integer)

- MovieExec holds all movie executives, including producers in Movies and presidents in Studio

13 Basics of the Relational Model (cont’d)
    Movies(title, year, length, genre, studioName, producerC#)
    MovieStar(name, address, gender, birthdate)
    StarsIn(movieTitle, movieYear, starName)
    MovieExec(name, address, cert#, netWorth)
    Studio(name, address, presC#)

- The DBMS allows us to see the data in this way
- We do not need to know how the data are physically organized:
  - order of attributes
  - delimiters between values
  - length of strings
  - existence of indexes, etc.

14 Defining a Relation Schema in SQL
- SQL
  - sometimes pronounced "sequel"
  - the principal language used to describe and manipulate relational databases
  - Data Definition Language (DDL): for declaring database schemas
  - Data Manipulation Language (DML): for querying and modifying the database

15 Defining a Relation Schema in SQL (cont’d)
- Relations in SQL
  - stored relations (or tables): relations that exist in the database
  - views: relations defined by a computation; not stored, but constructed in whole or in part when needed
  - temporary tables: constructed by the SQL language processor during execution, then thrown away and not stored

16 Defining a Relation Schema in SQL (cont’d)
- Data types
  - INT (or INTEGER), SHORTINT (SMALLINT in standard SQL)
  - FLOAT (or REAL), DOUBLE PRECISION, DECIMAL
    - DECIMAL(n,d): n decimal digits, with the decimal point assumed to be d positions from the right
    - e.g., DECIMAL(6, 2): six digits in total, two of them after the decimal point (such as 1234.56)
    - NUMERIC: almost a synonym for DECIMAL
  - CHAR(n), VARCHAR(n): character strings of fixed or varying length

17 Defining a Relation Schema in SQL (cont’d)
  - BIT(n), BIT VARYING(n): bit strings of fixed or varying length
  - BOOLEAN: TRUE, FALSE, UNKNOWN
  - DATE and TIME: character strings of a special form

18 Defining a Relation Schema in SQL (cont’d)
- Table declarations: CREATE TABLE table-name
  - the keywords CREATE TABLE are followed by the relation name and a parenthesized, comma-separated list of the attribute names and their types

    CREATE TABLE MovieStar (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );

- Table deletions: DROP TABLE table-name

    DROP TABLE MovieStar;

19 Defining a Relation Schema in SQL (cont’d)
- Modifying relation schemas: ALTER TABLE table-name
  - ADD followed by an attribute name and its data type
  - DROP followed by an attribute name

    ALTER TABLE MovieStar ADD phone CHAR(16);
    ALTER TABLE MovieStar DROP birthdate;

- After ADD, existing tuples do not have a value for the new attribute; NULL is used when a specific value is not given
  - NULL: unknown value (or undefined value)

20 Defining a Relation Schema in SQL (cont’d)
- Default values: the keyword DEFAULT followed by an appropriate value

    gender CHAR(1) DEFAULT '?',
    birthdate DATE DEFAULT DATE '...'

    ALTER TABLE MovieStar ADD phone CHAR(16) DEFAULT 'unlisted';
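
A hedged sketch of NULL and DEFAULT behaviour, run through Python's built-in sqlite3 module. SQLite is an assumption here (the slides name no particular DBMS), the inserted star is made up, and the `ADD COLUMN` spelling is the one SQLite documents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MovieStar (
        name      CHAR(30),
        address   VARCHAR(255),
        gender    CHAR(1) DEFAULT '?',
        birthdate DATE
    )
""")
# Hypothetical row: gender and birthdate are not given.
conn.execute(
    "INSERT INTO MovieStar (name, address) VALUES ('Example Star', '1 Example St')"
)
# Add a new attribute with a default value after the row already exists.
conn.execute("ALTER TABLE MovieStar ADD COLUMN phone CHAR(16) DEFAULT 'unlisted'")

row = conn.execute("SELECT gender, birthdate, phone FROM MovieStar").fetchone()
print(row)
```

The omitted gender takes its declared default, the omitted birthdate is NULL (None in Python), and the existing tuple sees the default of the newly added attribute.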

21 Defining a Relation Schema in SQL (cont’d)
- Declaring keys: declared in the CREATE TABLE statement
  - PRIMARY KEY: NULL is not allowed in the attributes of the key
  - UNIQUE: NULL is permitted

22 Defining a Relation Schema in SQL (cont’d)
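- (Ex) Declaring keys
  - a single-attribute key declared together with the attribute:

    CREATE TABLE MovieStar (
        name CHAR(30) PRIMARY KEY,
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );

  - the same key declared as a separate element:

    CREATE TABLE MovieStar (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE,
        PRIMARY KEY(name)
    );

  - a key with more than one attribute:

    CREATE TABLE Movies (
        title CHAR(100),
        year INT,
        length INT,
        genre CHAR(10),
        studioName CHAR(30),
        producerC# INT,
        PRIMARY KEY(title, year)
    );

- When no PRIMARY KEY is declared, duplicate tuples are allowed, so the relation is a bag rather than a set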
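
A sketch of a two-attribute key being enforced, again assuming SQLite through Python's sqlite3 module. The table is trimmed to four attributes for brevity, and the second INSERT is a made-up tuple that repeats (title, year).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Movies (
        title  CHAR(100),
        year   INT,
        length INT,
        genre  CHAR(10),
        PRIMARY KEY (title, year)
    )
""")
conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 124, 'sciFi')")

try:
    # Same (title, year) as the existing tuple, different length.
    conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 120, 'sciFi')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
print(rejected, row_count)
```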

23 An Algebraic Query Language
- Relational algebra
  - a formal query language
  - constructs new relations from given relations
  - simple but powerful
  - not used directly in commercial DBMSs, but SQL incorporates relational algebra at its center
  - an SQL query is often translated into relational algebra

24 An Algebraic Query Language (cont’d)
- Advantages of relational algebra over conventional programming languages like C or Java
  - ease of programming, though it is less powerful than C or Java
  - queries can be optimized by the compiler
    - e.g., the compiler can choose the best available sorting algorithm for the relation to be sorted
- An algebra in general consists of operators and operands
  - (ex) (x + y) * z, ((x + 7) / (y - 3)) + x
  - operands in the relational algebra: relations

25 An Algebraic Query Language (cont’d)
- Operations of the relational algebra
  - the usual set operations: union, intersection, difference
  - operations that remove parts of a relation: selection, projection
  - operations that combine the tuples of two relations: Cartesian product, join
  - renaming operations: change the names of the attributes or the name of the relation itself

26 An Algebraic Query Language (cont’d)
- Set operations: ∪, ∩, -
  - R, S: relations
  - union: R ∪ S
  - intersection: R ∩ S
  - difference: R - S
- Conditions
  - R and S must have schemas with identical sets of attributes
  - the order of attributes in R and S must be the same
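
The three set operations map directly onto Python's built-in set type. This sketch assumes both relations share the schema (title, year) with the attributes in the same order; the choice of which movies go into R and S is only for illustration.

```python
# Two relations over the same schema (title, year), as sets of tuples.
R = {("Star Wars", 1977), ("Galaxy Quest", 1999)}
S = {("Galaxy Quest", 1999), ("Wayne's World", 1992)}

union = R | S          # tuples in R or S (or both)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```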

27 An Algebraic Query Language (cont’d)
- Projection: π
  - π_{A1, A2, ..., An}(R): produces a relation that has only the attributes A1, A2, ..., An of R
- (ex) Movies:

  title         | year | length | genre  | studioName | producerC#
  Star Wars     | 1977 | 124    | sciFi  | Fox        | 12345
  Galaxy Quest  | 1999 | 104    | comedy | DreamWorks | 67890
  Wayne's World | 1992 | 95     | comedy | Paramount  | 99999

- π_{title, year, length}(Movies):

  title         | year | length
  Star Wars     | 1977 | 124
  Galaxy Quest  | 1999 | 104
  Wayne's World | 1992 | 95

- π_{genre}(Movies):

  genre
  sciFi
  comedy
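
A minimal Python sketch of projection. The helper name `project` and the dict-per-row representation are assumptions. Because the result is a set, duplicate tuples such as the repeated genre value disappear.

```python
def project(rows, attrs):
    """Keep only the listed attributes; the set result removes duplicates."""
    return {tuple(row[a] for a in attrs) for row in rows}

movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "sciFi"},
    {"title": "Galaxy Quest", "year": 1999, "length": 104, "genre": "comedy"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "genre": "comedy"},
]

genres = project(movies, ["genre"])
title_year_length = project(movies, ["title", "year", "length"])
print(sorted(genres))
print(len(title_year_length))
```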

28 An Algebraic Query Language (cont’d)
- Selection: σ
  - produces a relation with a subset of the tuples of the operand relation
  - σ_C(R): the set of tuples of R that satisfy a condition C
  - C: a conditional expression whose operands are either constants or attributes of R
  - (ex) σ_{length > 100}(Movies)
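
A minimal Python sketch of selection, with the condition C passed as a function of one row. The helper name `select` is an assumption; the rows are the Movies values used in the projection example.

```python
def select(rows, condition):
    """Return the rows that satisfy `condition`, keeping all attributes."""
    return [row for row in rows if condition(row)]

movies = [
    {"title": "Star Wars", "year": 1977, "length": 124, "genre": "sciFi"},
    {"title": "Galaxy Quest", "year": 1999, "length": 104, "genre": "comedy"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "genre": "comedy"},
]

# sigma_{length > 100}(Movies)
long_movies = select(movies, lambda row: row["length"] > 100)
long_titles = [row["title"] for row in long_movies]
print(long_titles)
```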

29 An Algebraic Query Language (cont’d)
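- Cartesian product: ×
  - R × S: the set of pairs of tuples from R and S
    - first element of the pair: any tuple of R
    - second element of the pair: any tuple of S
  - an attribute B that appears in both relations is written R.B and S.B in the result
  - [Figure: relations R and S and their product R × S, whose columns are A, R.B, S.B, C, D]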
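
A Python sketch of the Cartesian product with the R.B / S.B naming for shared attributes. The helper name `cartesian_product` is hypothetical, and the values in R and S are illustrative assumptions, since the slide's figure did not survive extraction.

```python
from itertools import product

def cartesian_product(r, r_name, s, s_name):
    """Pair every row of r with every row of s; attributes that occur in
    both relations are qualified with the relation name."""
    result = []
    for t, u in product(r, s):
        shared = t.keys() & u.keys()
        row = {}
        for attr, value in t.items():
            row[f"{r_name}.{attr}" if attr in shared else attr] = value
        for attr, value in u.items():
            row[f"{s_name}.{attr}" if attr in shared else attr] = value
        result.append(row)
    return result

# Illustrative values (assumed, not taken from the slide).
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]

RxS = cartesian_product(R, "R", S, "S")
print(len(RxS))
print(RxS[0])
```

With 2 tuples in R and 3 in S, the product has 2 × 3 = 6 tuples.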

30 An Algebraic Query Language (cont’d)
- Natural join: ⋈
  - R ⋈ S: the set of pairs of tuples from R and S that agree on the common attributes of R and S
  - duplicate columns are removed: only one copy of each common attribute is kept
  - dangling tuple: a tuple that fails to be joined with any tuple of the other relation
  - [Figure: (Ex) natural join R ⋈ S, marking the common attribute and a dangling tuple]

31 An Algebraic Query Language (cont’d)
- (Ex) Natural join when there is more than one common attribute
  - [Figure: relations U and V and the result U ⋈ V; one copy of each duplicated column is removed]

32 Note: Natural join
- Definition of the natural join: R ⋈ S = π_L(σ_C(R × S)), where
  - L: the union of all the attributes of R and S
  - C: R.A1 = S.A1 ∧ R.A2 = S.A2 ∧ ... ∧ R.An = S.An
  - {A1, A2, ..., An}: the set of common attributes of R and S
- If R and S have no common attributes, R ⋈ S = R × S, because there is no selection condition
- σ, π: produce a subset of a single relation
- ⋈: produces a subset of the Cartesian product of two relations
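
A Python sketch that follows the definition: pair the tuples, keep the pairs that agree on every common attribute, and keep one copy of each common attribute. The helper name `natural_join` and the sample values are assumptions for illustration.

```python
def natural_join(r, s):
    """Join rows of r and s that agree on all common attributes,
    keeping a single copy of each common attribute."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})
    return result

# Illustrative values (assumed): B is the only common attribute.
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]

joined = natural_join(R, S)
print(joined)

# With no common attributes the condition is empty, so the result is the product.
no_common = natural_join([{"A": 1}], [{"C": 2}, {"C": 3}])
print(len(no_common))
```

The tuple of S with B = 9 matches nothing in R, so it is a dangling tuple and does not appear in the result.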

33 An Algebraic Query Language (cont’d)
- Theta-join: R ⋈_C S = σ_C(R × S)
  - pairs tuples from two relations on some condition C
    1. take the product of R and S
    2. select from the product only those tuples that satisfy the condition C
  - duplicated columns are not eliminated
  - [Figure: relations U and V and the result U ⋈_{A<D} V, whose columns are A, U.B, U.C, V.B, V.C, D]
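
A Python sketch of the two steps: build each product row, then keep it only if it satisfies C. The helper name `theta_join` is hypothetical, and the tuples in U and V are illustrative assumptions because the slide's figure values are not recoverable.

```python
def theta_join(r, r_name, s, s_name, condition):
    """sigma_C(r x s): form each product row, then keep it if C holds.
    Shared attribute names are qualified; no column is dropped."""
    result = []
    for t in r:
        for u in s:
            shared = t.keys() & u.keys()
            row = {}
            for attr, value in t.items():
                row[f"{r_name}.{attr}" if attr in shared else attr] = value
            for attr, value in u.items():
                row[f"{s_name}.{attr}" if attr in shared else attr] = value
            if condition(row):
                result.append(row)
    return result

# Illustrative values (assumed, not taken from the slide).
U = [{"A": 1, "B": 2, "C": 3}, {"A": 6, "B": 7, "C": 8}, {"A": 9, "B": 7, "C": 8}]
V = [{"B": 2, "C": 3, "D": 4}, {"B": 2, "C": 3, "D": 5}, {"B": 7, "C": 8, "D": 10}]

# U join_{A < D} V
joined = theta_join(U, "U", V, "V", lambda row: row["A"] < row["D"])
print(len(joined))
print(list(joined[0].keys()))
```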

34 An Algebraic Query Language (cont’d)
- Combining operations to form queries
  - construct complex expressions by applying operations to the results of other expressions
  - (ex) Find the titles and years of movies made by the "Fox" studio that are at least 100 minutes long:

    π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))

35 An Algebraic Query Language (cont’d)
- Expression tree for a relational algebra expression
  - leaf node: a relation
  - nonleaf node: an operator
  - evaluated bottom-up by applying the operator at a nonleaf node to its children
- (ex) tree for π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies)):

    π_{title, year}
      ∩
        σ_{length ≥ 100}
          Movies
        σ_{studioName = 'Fox'}
          Movies

36 Note: Equivalent expression
- Equivalent expressions
  - expressions that produce the same answer whenever they are given the same relations as operands
  - (ex) these two expressions are equivalent:

    π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))
    π_{title, year}(σ_{length ≥ 100 AND studioName = 'Fox'}(Movies))

  - the tree of the second expression is a single chain:

    π_{title, year}
      σ_{length ≥ 100 AND studioName = 'Fox'}
        Movies

- Query optimizer: replaces one expression by an equivalent expression that is more efficiently evaluated
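
A Python sketch that evaluates both forms of the Fox query on one instance. The helper names are hypothetical and the tuples carry only five attributes, (title, year, length, genre, studioName). Agreement on a single instance illustrates the equivalence; it does not prove it for all instances.

```python
# Tuples are (title, year, length, genre, studioName).
movies = {
    ("Star Wars", 1977, 124, "sciFi", "Fox"),
    ("Galaxy Quest", 1999, 104, "comedy", "DreamWorks"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount"),
}

def select(rel, condition):
    return {t for t in rel if condition(t)}

def project_title_year(rel):
    return {(t[0], t[1]) for t in rel}

# Intersection of two selections.
expr1 = project_title_year(
    select(movies, lambda t: t[2] >= 100) & select(movies, lambda t: t[4] == "Fox")
)
# One selection with a combined condition.
expr2 = project_title_year(
    select(movies, lambda t: t[2] >= 100 and t[4] == "Fox")
)
print(expr1, expr1 == expr2)
```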

37 An Algebraic Query Language (cont’d)
- Renaming: ρ_{S(A1, A2, ..., An)}(R)
  - only the names change: the resulting relation has exactly the same tuples as R
  - the resulting relation has name S and attributes A1, A2, ..., An
  - (ex) R × ρ_{S(X, C, D)}(S)

38 An Algebraic Query Language (cont’d)
- Relationships among operations
  - dependent operators
    - R ∩ S = R - (R - S)
    - R ⋈_C S = σ_C(R × S)
    - R ⋈ S = π_L(σ_C(R × S))
  - independent operators (or fundamental operators)
    - selection, projection, union, difference, Cartesian product, (renaming)
    - cannot be written in terms of the others
  - [Figure: Venn diagram of R and S showing R - S and R ∩ S]
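
The identity R ∩ S = R - (R - S) can be checked on a small instance with Python sets. The tuples are illustrative; one instance is a sanity check, not a proof.

```python
R = {("Star Wars", 1977), ("Galaxy Quest", 1999), ("Wayne's World", 1992)}
S = {("Galaxy Quest", 1999), ("Wayne's World", 1992), ("King Kong", 1980)}

only_in_r = R - S                 # tuples of R that are not in S
via_difference = R - only_in_r    # remove them from R: what is left is in both
direct = R & S

print(via_difference == direct)
```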

39 An Algebraic Query Language (cont’d)
- Linear notation for algebraic expressions
  - use temporary relations together with a sequence of assignments
  - (ex) π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))

    R(t, y, l, g, s, p) := σ_{length ≥ 100}(Movies)
    S(t, y, l, g, s, p) := σ_{studioName = 'Fox'}(Movies)
    T(t, y, l, g, s, p) := R ∩ S
    Answer(title, year) := π_{t, y}(T)

  - temporary relations: R, S, T, Answer
  - the last two steps can be combined: Answer(title, year) := π_{t, y}(R ∩ S)
- Three notations for the same query: a relational algebra expression, an expression tree, a sequence of assignments to temporary relations

40 Constraints on Relations
- Constraint: a restriction on the data, e.g., the possible values of attribute "gender"
- Relational algebra as a constraint language
  - relational algebra can be used to express constraints, e.g., a key constraint
  - two ways to express constraints (R, S: expressions of relational algebra)
    - R = ∅: "There are no tuples in the result of R"
    - R ⊆ S: "Every tuple in R must also be in S"
  - these two ways are actually equivalent
    - R ⊆ S can be written R - S = ∅
    - R = ∅ can be written R ⊆ ∅

41 Constraints on Relations (cont’d)
- Referential integrity constraints
  - if a value v appears in attribute A of relation R, then v must also appear in a particular attribute (say B) of relation S
  - in relational algebra: π_A(R) ⊆ π_B(S), or π_A(R) - π_B(S) = ∅
  - (ex) we expect that every department mentioned in Students is in the Departments table
  - [Figure: example Departments and Students tables]
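
A Python sketch of the check π_A(R) - π_B(S) = ∅. The function name `dangling_references`, the attribute name `dept`, and the pairing of students with departments are assumptions made for illustration.

```python
def dangling_references(r, a, s, b):
    """Return pi_a(r) - pi_b(s): values of r.a that have no match in s.b.
    The referential integrity constraint holds when the result is empty."""
    return {row[a] for row in r} - {row[b] for row in s}

# Hypothetical instance.
departments = [{"dept": "CS"}]
students = [
    {"name": "Smith", "dept": "CS"},
    {"name": "Stuart", "dept": "BioChem"},
]

missing = dangling_references(students, "dept", departments, "dept")
print(missing)
```

Here the result is not empty, so this instance violates the constraint.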

42 Constraints on Relations (cont’d)
- (Ex) Consider the following two relations:

    Movies(title, year, length, genre, studioName, producerC#)
    MovieExec(name, address, certificate#, netWorth)

  - the producer of every movie has to appear in MovieExec
  - π_{producerC#}(Movies) ⊆ π_{certificate#}(MovieExec), or
  - π_{producerC#}(Movies) - π_{certificate#}(MovieExec) = ∅

43 Constraints on Relations (cont’d)
- (Ex) A referential integrity constraint where the value involved is represented by more than one attribute

    StarsIn(movieTitle, movieYear, starName)
    Movies(title, year, length, genre, studioName, producerC#)

  - any movie mentioned in StarsIn also appears in Movies
  - π_{movieTitle, movieYear}(StarsIn) ⊆ π_{title, year}(Movies)

44 Constraints on Relations (cont’d)
- Key constraints
  - MovieStar(name, address, gender, birthdate), where the name attribute is a key
  - no two tuples agree on the name component
  - if two tuples agree on name, then they must also agree on address; in fact the two tuples must be the same tuple and agree on all attributes
- In relational algebra, with

    MS1 = ρ_{MS1(name, address, gender, birthdate)}(MovieStar)
    MS2 = ρ_{MS2(name, address, gender, birthdate)}(MovieStar)

  - σ_{MS1.name = MS2.name AND MS1.address ≠ MS2.address}(MS1 × MS2) = ∅
    - not exactly a key constraint, but a functional dependency
  - σ_{MS1.name = MS2.name AND (MS1.address ≠ MS2.address OR MS1.gender ≠ MS2.gender OR MS1.birthdate ≠ MS2.birthdate)}(MS1 × MS2) = ∅
    - a correct key constraint
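
A Python sketch contrasting the two conditions on MS1 × MS2. The function names and the two star tuples are hypothetical. The instance is chosen so that the tuples agree on name and address but differ on birthdate.

```python
from itertools import product

# Tuples are (name, address, gender, birthdate); both rows are made up.
movie_star = {
    ("Example Star", "1 Example St", "F", "1970-01-01"),
    ("Example Star", "1 Example St", "F", "1980-01-01"),
}

def fd_violations(rel):
    """Pairs that agree on name but differ on address (name -> address only)."""
    return [(t, u) for t, u in product(rel, rel)
            if t[0] == u[0] and t[1] != u[1]]

def key_violations(rel):
    """Pairs that agree on name but differ on any other attribute."""
    return [(t, u) for t, u in product(rel, rel)
            if t[0] == u[0] and (t[1] != u[1] or t[2] != u[2] or t[3] != u[3])]

fd_count = len(fd_violations(movie_star))
key_count = len(key_violations(movie_star))
print(fd_count, key_count)
```

The address-only condition finds nothing, while the full condition reports the pair in both orders, so only the second expression detects that name fails to be a key here.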

45 Constraints on Relations (cont’d)
- Additional constraints
  - (ex) values of the gender attribute of MovieStar must be 'F' or 'M'
  - σ_{gender ≠ 'F' AND gender ≠ 'M'}(MovieStar) = ∅
  - this is a domain constraint

46 Constraints on Relations (cont’d)
- (Ex) One must have a net worth of at least $10,000,000 to be the president of a movie studio

    MovieExec(name, address, cert#, netWorth)
    Studio(name, address, presC#)

  - we have assumed a referential integrity constraint from Studio.presC# to MovieExec.cert#
  - σ_{netWorth < 10000000}(Studio ⋈_{presC# = cert#} MovieExec) = ∅, or
  - π_{presC#}(Studio) ⊆ π_{cert#}(σ_{netWorth ≥ 10000000}(MovieExec))
- This is neither a domain constraint nor a referential integrity constraint
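
A Python sketch of both formulations on a made-up instance. All names and figures below are hypothetical, and the keys `cert` and `presC` stand in for cert# and presC#. The two forms agree here because every presC value appears in MovieExec, as the assumed referential integrity constraint requires.

```python
THRESHOLD = 10_000_000

# Hypothetical instance.
movie_exec = [
    {"name": "Exec A", "cert": 1, "netWorth": 50_000_000},
    {"name": "Exec B", "cert": 2, "netWorth": 2_000_000},
]
studio = [
    {"name": "Studio X", "presC": 1},
    {"name": "Studio Y", "presC": 2},
]

# Form 1: sigma_{netWorth < T}(Studio join_{presC = cert} MovieExec) must be empty.
joined = [
    {"studio": s["name"], "cert": e["cert"], "netWorth": e["netWorth"]}
    for s in studio for e in movie_exec if s["presC"] == e["cert"]
]
violations_join = [row for row in joined if row["netWorth"] < THRESHOLD]

# Form 2: pi_{presC}(Studio) must be a subset of
#         pi_{cert}(sigma_{netWorth >= T}(MovieExec)).
presidents = {s["presC"] for s in studio}
rich_execs = {e["cert"] for e in movie_exec if e["netWorth"] >= THRESHOLD}
violations_subset = presidents - rich_execs

print([row["studio"] for row in violations_join], violations_subset)
```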

