Chap 2. The Relational Model of Data


Contents
- An Overview of Data Models
- Basics of the Relational Model
- Defining a Relation Schema in SQL (covered after Chapter 6, SQL)
- An Algebraic Query Language
- Constraints on Relations

An Overview of Data Models
- Data model: an abstract description of data
  - the description generally consists of structure and operations, with certain constraints
  - when focused on the structure: an abstract description of the logical structure of data
- Structure of the data
  - a high-level description of the structure of the data
  - sometimes referred to as a conceptual (data) model
  - higher level than data structures in C or Java, such as arrays and structures

An Overview of Data Models (cont’d)
- Operations on the data
  - usually a limited set of high-level operations in a DB data model
  - queries: operations that retrieve information
  - modifications: operations that change the database
- Constraints on the data
  - a way to describe limitations on what the data can be
  - (ex) “a movie has at most one title”
  - (ex) “a day of the week is an integer between 1 and 7”

An Overview of Data Models (cont’d)
- Various data models
  - relational model: widely used in commercial database management systems
  - semistructured-data model: includes XML and related standards
  - other data models
    - object-oriented model: may be used for some special-purpose applications
    - object-relational model: O-O features are added to the relational model
    - hierarchical model, network model: used in earlier DBMSs

Basics of the Relational Model
- Relation: a two-dimensional table
  - a set of tuples whose components have atomic values
- The relation (or table) Movies:

    title                 year   length   genre
    Gone With the Wind    1939   231      drama
    Star Wars             1977   124      sciFi
    Wayne’s World         1992   95       comedy

  - the column headers (title, year, length, genre) are the attributes
  - each row is a tuple and represents a movie
  - each column represents a property of movies

Basics of the Relational Model (cont’d)
- Attributes
  - names for the columns of the relation
  - (ex) title, year, length, genre in relation Movies
- Tuples
  - rows of a relation
  - (ex) (Star Wars, 1977, 124, sciFi)
- Domains
  - an elementary type associated with each attribute of a relation
  - (ex) the value of attribute title must be a string of length at most 30
  - the relational model requires each attribute to be atomic, i.e., record structures, sets, lists, etc. are not allowed

Basics of the Relational Model (cont’d)
- Schema: a description of the data
  - relation schema: the name of a relation and the set of attributes for the relation
    - (ex) the schema for relation Movies: Movies(title, year, length, genre)
  - relational database schema (or simply, database schema): the set of schemas for the relations of a database
- Relation instance: a set of tuples for a given relation

Basics of the Relational Model (cont’d)
- Equivalent representations of a relation
  - the order of tuples in a relation is irrelevant: a relation is a set of tuples, not a list of tuples
  - the column order is also irrelevant
- Another presentation of the relation Movies:

    year   genre    title                 length
    1977   sciFi    Star Wars             124
    1992   comedy   Wayne’s World         95
    1939   drama    Gone With the Wind    231
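The two presentations of Movies can be compared mechanically. The following is a minimal sketch, assuming each tuple is modeled as a frozenset of (attribute, value) pairs; the helper name `relation` is illustrative and not part of the slides.

```python
# Each tuple is modeled as a frozenset of (attribute, value) pairs, so neither
# row order nor column order is part of the relation's identity.
def relation(attributes, rows):
    return {frozenset(zip(attributes, row)) for row in rows}

movies_a = relation(
    ("title", "year", "length", "genre"),
    [
        ("Gone With the Wind", 1939, 231, "drama"),
        ("Star Wars", 1977, 124, "sciFi"),
        ("Wayne's World", 1992, 95, "comedy"),
    ],
)
movies_b = relation(
    ("year", "genre", "title", "length"),
    [
        (1977, "sciFi", "Star Wars", 124),
        (1992, "comedy", "Wayne's World", 95),
        (1939, "drama", "Gone With the Wind", 231),
    ],
)

# Same set of tuples, so the two presentations denote the same relation.
same_relation = movies_a == movies_b
print(same_relation)
```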

Basics of the Relational Model (cont’d)
- Key of a relation
  - a fundamental constraint
  - an attribute (or a set of attributes) of a relation such that no two tuples are allowed to have the same values in all the attributes of the key
  - used for unique identification of a tuple
  - (ex) declare that title and year form a key of Movies: a second tuple (King Kong, 1980, . . . ) with the same title and year as an existing tuple is not allowed
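A key can be checked on a single instance as sketched below. This is only an illustration: the function name `satisfies_key` and the sample rows are assumptions, and passing the check on one instance does not establish that the attributes are a key, since a key is a statement about all possible instances.

```python
def satisfies_key(rows, key):
    """Return True if no two rows agree on every attribute in `key`.

    `rows` is a list of dicts and is assumed to contain no duplicate rows,
    as in a relation.
    """
    seen = set()
    for row in rows:
        key_value = tuple(row[attr] for attr in key)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True

# Illustrative rows: two different movies that share a title.
movies = [
    {"title": "King Kong", "year": 1933},
    {"title": "King Kong", "year": 2005},
    {"title": "Star Wars", "year": 1977},
]

title_year_ok = satisfies_key(movies, ("title", "year"))
title_only_ok = satisfies_key(movies, ("title",))
print(title_year_ok, title_only_ok)
```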

Basics of the Relational Model (cont’d)
- Notation for the key attribute(s)
  - use underlines
  - e.g., Movies(title, year, length, genre), with title and year underlined
- A key constraint is about all possible instances of the relation, not about a single instance
- There can be several keys in a relation
  - (ex) in a relation Students, the social-security number, the student ID, etc. can each serve as a key

Basics of the Relational Model (cont’d)
- An example database schema

    Movies (title:string, year:integer, length:integer, genre:string, studioName:string, producerC#:integer)
    MovieStar (name:string, address:string, gender:char, birthdate:date)
    StarsIn (movieTitle:string, movieYear:integer, starName:string)
    MovieExec (name:string, address:string, cert#:integer, netWorth:integer)
    Studio (name:string, address:string, presC#:integer)

- MovieExec holds all movie executives, including the producers in Movies and the presidents in Studio

Basics of the Relational Model (cont’d)

    Movies(title, year, length, genre, studioName, producerC#)
    MovieStar(name, address, gender, birthdate)
    StarsIn(movieTitle, movieYear, starName)
    MovieExec(name, address, cert#, netWorth)
    Studio(name, address, presC#)

- The DBMS allows us to see the data in this way.
- We do not need to know how the data are physically organized:
  - order of attributes
  - delimiters between values
  - length of strings
  - existence of indexes, etc.

Defining a Relation Schema in SQL
- SQL
  - sometimes pronounced “sequel”
  - the principal language used to describe and manipulate relational databases
- Data Definition Language (DDL): for declaring database schemas
- Data Manipulation Language (DML): for querying and modifying the database

Defining a Relation Schema in SQL (cont’d)
- Relations in SQL
  - stored relations (or tables): relations that exist in the database
  - views: relations defined by a computation; not stored, but constructed in whole or in part when needed
  - temporary tables: constructed by the SQL language processor during execution, then thrown away and not stored

Defining a Relation Schema in SQL (cont’d)
- Data types
  - INT (or INTEGER), SHORTINT (spelled SMALLINT in standard SQL)
  - FLOAT (or REAL), DOUBLE PRECISION, DECIMAL
    - DECIMAL(n,d): n decimal digits, with the decimal point assumed to be d positions from the right
    - e.g., DECIMAL(6, 2): 0123.45
    - NUMERIC: almost a synonym for DECIMAL
  - CHAR(n), VARCHAR(n): character strings of fixed or varying length

Defining a Relation Schema in SQL (cont’d)
- Data types (cont’d)
  - BIT(n), BIT VARYING(n): bit strings of fixed or varying length
  - BOOLEAN: TRUE, FALSE, UNKNOWN
  - DATE and TIME: character strings of a special form

Defining a Relation Schema in SQL (cont’d)
- Table declarations: CREATE TABLE table-name
  - CREATE TABLE is followed by the relation name and a parenthesized, comma-separated list of the attribute names and their types

    CREATE TABLE MovieStar (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );

- Table deletions: DROP TABLE table-name

    DROP TABLE MovieStar;

Defining a Relation Schema in SQL (cont’d)
- Modifying relation schemas: ALTER TABLE table-name
  - ADD followed by an attribute name and its data type
  - DROP followed by an attribute name

    ALTER TABLE MovieStar ADD phone CHAR(16);
    ALTER TABLE MovieStar DROP birthdate;

- Existing tuples have no value for a newly added attribute.
  - the NULL value is used when a specific value is not given
  - NULL: unknown value (or undefined value)

Defining a Relation Schema in SQL (cont’d)
- Default values
  - keyword DEFAULT followed by an appropriate value

    gender CHAR(1) DEFAULT '?',
    birthdate DATE DEFAULT DATE '0000-00-00'

    ALTER TABLE MovieStar ADD phone CHAR(16) DEFAULT 'unlisted';
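The effect of DEFAULT and of NULL on existing tuples can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite is only a convenient stand-in for a DBMS, it treats the declared types loosely, the statement is written with the optional COLUMN keyword, and the inserted star name is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MovieStar ("
    " name CHAR(30), address VARCHAR(255),"
    " gender CHAR(1) DEFAULT '?', birthdate DATE)"
)

# gender is omitted, so its declared default '?' is stored;
# address and birthdate have no default and become NULL.
conn.execute("INSERT INTO MovieStar (name) VALUES ('Carrie Fisher')")

# The existing tuple receives the default for the newly added attribute.
conn.execute("ALTER TABLE MovieStar ADD COLUMN phone CHAR(16) DEFAULT 'unlisted'")

row = conn.execute(
    "SELECT name, address, gender, phone FROM MovieStar"
).fetchone()
print(row)
conn.close()
```

NULL is returned to Python as None, so the address component of the fetched row is None while gender and phone carry their defaults.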

Defining a Relation Schema in SQL (cont’d)
- Declaring keys
  - declared in the CREATE TABLE statement
  - PRIMARY KEY: NULL is not allowed in the attributes of the key
  - UNIQUE: NULL is permitted

Defining a Relation Schema in SQL (cont’d)
- (Ex) Declaring keys
  - as part of the attribute declaration:

    CREATE TABLE MovieStar (
        name CHAR(30) PRIMARY KEY,
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE
    );

  - as a separate element of the declaration:

    CREATE TABLE MovieStar (
        name CHAR(30),
        address VARCHAR(255),
        gender CHAR(1),
        birthdate DATE,
        PRIMARY KEY (name)
    );

  - a key with more than one attribute must be declared as a separate element:

    CREATE TABLE Movies (
        title CHAR(100),
        year INT,
        length INT,
        genre CHAR(10),
        studioName CHAR(30),
        producerC# INT,
        PRIMARY KEY (title, year)
    );

- When no PRIMARY KEY is declared, the table may contain duplicate tuples, i.e., the relation is a bag.
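How a declared key is enforced can be observed with sqlite3. This is a hedged sketch: SQLite stands in for a DBMS, the Movies table is cut down to four attributes for brevity, and the second length value (121) is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movies ("
    " title CHAR(100), year INT, length INT, genre CHAR(10),"
    " PRIMARY KEY (title, year))"
)
conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 124, 'sciFi')")

# A second tuple with the same (title, year) violates the key and is rejected.
try:
    conn.execute("INSERT INTO Movies VALUES ('Star Wars', 1977, 121, 'sciFi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Movies").fetchone()[0]
print(duplicate_rejected, row_count)
conn.close()
```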

An Algebraic Query Language
- Relational algebra
  - a formal query language
  - constructs new relations from given relations
  - simple but powerful
  - not used directly in commercial DBMSs, but SQL incorporates the relational algebra at its center
  - an SQL query is often translated into relational algebra

An Algebraic Query Language (cont’d)
- Advantages of relational algebra over conventional programming languages like C or Java
  - ease of programming, though it is less powerful than C or Java
  - queries can be optimized by the compiler
    - e.g., the compiler can choose the best available sorting algorithm for the relation to be sorted
- An algebra in general consists of operators and operands
  - (ex) ordinary arithmetic: (x + y) * z, ((x + 7) / (y - 3)) + x
  - operands in the relational algebra: relations

An Algebraic Query Language (cont’d)
- Operations of the relational algebra
  - the usual set operations: union, intersection, difference
  - operations that remove parts of a relation: selection, projection
  - operations that combine the tuples of two relations: Cartesian product, join
  - renaming operations: change the names of the attributes or the name of the relation itself

An Algebraic Query Language (cont’d)
- Set operations: ∪, ∩, - (R, S: relations)
  - union: R ∪ S
  - intersection: R ∩ S
  - difference: R - S
- Conditions
  - R and S must have schemas with identical sets of attributes
  - the order of attributes in R and S must be the same
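The two conditions on set operations can be made explicit in code. The sketch below assumes a relation is a pair (attribute tuple, set of value tuples); the helper `set_op` and the sample values are illustrative only.

```python
def set_op(op, r, s):
    """Apply a set operation after checking that both schemas match exactly."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    if r_attrs != s_attrs:
        raise ValueError("schemas differ: reorder or rename attributes first")
    return r_attrs, op(r_rows, s_rows)

R = (("A", "B"), {(1, 2), (3, 4)})
S = (("A", "B"), {(3, 4), (5, 6)})

union = set_op(set.union, R, S)
intersection = set_op(set.intersection, R, S)
difference = set_op(set.difference, R, S)

# Same attributes in a different order: rejected until the columns are reordered.
try:
    set_op(set.union, R, (("B", "A"), {(2, 1)}))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```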

An Algebraic Query Language (cont’d)
- Projection: π
  - π_{A1, A2, ..., An}(R) produces a relation that has only the attributes A1, A2, ..., An of R
- The relation Movies:

    title           year   length   genre    studioName   producerC#
    Star Wars       1977   124      sciFi    Fox          12345
    Galaxy Quest    1999   104      comedy   DreamWorks   67890
    Wayne’s World   1992   95       comedy   Paramount    99999

- π_{title, year, length}(Movies):

    title           year   length
    Star Wars       1977   124
    Galaxy Quest    1999   104
    Wayne’s World   1992   95

- π_{genre}(Movies):

    genre
    sciFi
    comedy

An Algebraic Query Language (cont’d)
- Selection: σ
  - produces a relation with a subset of the tuples of the operand relation
  - σ_C(R): the set of tuples of R that satisfy a condition C
  - C: a conditional expression whose operands are either constants or attributes of R
  - (ex) σ_{length > 100}(Movies)
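Projection and selection can be sketched in a few lines. The rows below are the Movies tuples from the projection example; the function names `project` and `select` and the dict representation are assumptions made for illustration.

```python
MOVIES = [
    {"title": "Star Wars", "year": 1977, "length": 124,
     "genre": "sciFi", "studioName": "Fox", "producerC#": 12345},
    {"title": "Galaxy Quest", "year": 1999, "length": 104,
     "genre": "comedy", "studioName": "DreamWorks", "producerC#": 67890},
    {"title": "Wayne's World", "year": 1992, "length": 95,
     "genre": "comedy", "studioName": "Paramount", "producerC#": 99999},
]

def project(rows, attrs):
    """Keep only `attrs`; building a set removes duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}

def select(rows, condition):
    """Keep only the rows for which `condition` holds."""
    return [row for row in rows if condition(row)]

genres = project(MOVIES, ["genre"])
long_titles = project(select(MOVIES, lambda m: m["length"] > 100), ["title"])
print(genres, long_titles)
```

Three movies yield only two genre tuples, because the result of a projection is again a set.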

An Algebraic Query Language (cont’d)
- Cartesian product: ×
  - the set of pairs of tuples from R and S
  - first element of the pair: any tuple of R
  - second element of the pair: any tuple of S
  [Figure: relations R and S and their product R × S, whose attributes are A, R.B, S.B, C, D]

An Algebraic Query Language (cont’d)
- Natural join: ⋈
  - the set of pairs of tuples from R and S that agree on the common attributes of R and S
  - one copy of each duplicated column is removed
  - dangling tuple: a tuple that fails to be joined
- (Ex) Natural join R ⋈ S
  [Figure: relations R and S with one common attribute, and R ⋈ S; one of the duplicated columns is removed, and one tuple is marked as a dangling tuple]

An Algebraic Query Language (cont’d)
- (Ex) Natural join when there is more than one common attribute
  [Figure: relations U and V, and U ⋈ V; one of each pair of duplicated columns is removed]

Note: Natural join
- Definition of the natural join
  - R ⋈ S = π_L(σ_C(R × S)), where
    - L: the union of all the attributes of R and S
    - C: R.A1 = S.A1 ∧ R.A2 = S.A2 ∧ . . . ∧ R.An = S.An
    - {A1, A2, . . . , An}: the set of common attributes of R and S
- If R and S have no common attributes, R ⋈ S = R × S, because there is no selection condition
- σ, π: produce a subset of a single relation
- ⋈: produces a subset of a Cartesian product of two relations
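The definition can be followed step by step in code: pair every tuple of R with every tuple of S, keep the pairs that agree on all common attributes, and keep one copy of each common column. A minimal sketch, assuming rows are dicts; the relations R(A, B) and S(B, C, D) and their values are illustrative.

```python
def natural_join(r, s):
    """Join two lists of dict rows on all attributes the two schemas share."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            # With no common attributes the condition is vacuously true,
            # so the join degenerates to the Cartesian product.
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})  # one copy of each common column
    return result

R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]

joined = natural_join(R, S)

# No shared attribute names: every pair of tuples is kept.
product_like = natural_join(R, [{"X": 0}, {"X": 1}])
print(joined, len(product_like))
```

The S tuple with B = 9 matches no tuple of R, so it is a dangling tuple and does not appear in the result.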

An Algebraic Query Language (cont’d)
- Theta-join: R ⋈_C S = σ_C(R × S)
  - pairs tuples from two relations on some condition C
    1. take the product of R and S
    2. select from the product only those tuples that satisfy the condition C
- (Ex) U ⋈_{A < D} V
  [Figure: relations U and V, and U ⋈_{A < D} V with attributes A, U.B, U.C, V.B, V.C, D]
  - duplicated columns are not eliminated
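The two steps of the theta-join translate directly into code. In this sketch the relation representation (name, attributes, rows), the helper names, and the sample values for U(A, B, C) and V(B, C, D) are assumptions for illustration; attributes that occur in both schemas are qualified with the relation name, as in the product.

```python
from itertools import product

def qualified(name, attrs, other_attrs):
    """Prefix attributes that also occur in the other schema with `name`."""
    return [f"{name}.{a}" if a in other_attrs else a for a in attrs]

def theta_join(r, s, condition):
    r_name, r_attrs, r_rows = r
    s_name, s_attrs, s_rows = s
    attrs = qualified(r_name, r_attrs, s_attrs) + qualified(s_name, s_attrs, r_attrs)
    rows = []
    for t, u in product(r_rows, s_rows):   # step 1: Cartesian product
        row = dict(zip(attrs, t + u))
        if condition(row):                 # step 2: selection on condition C
            rows.append(row)
    return attrs, rows

U = ("U", ("A", "B", "C"), [(1, 2, 3), (6, 7, 8), (9, 7, 8)])
V = ("V", ("B", "C", "D"), [(2, 3, 4), (2, 3, 5), (7, 8, 10)])

attrs, rows = theta_join(U, V, lambda row: row["A"] < row["D"])
print(attrs, len(rows))
```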

An Algebraic Query Language (cont’d)
- Combining operations to form queries
  - construct complex expressions by applying operations to the results of other expressions
- (Ex) Find the titles and years of movies made by the “Fox” studio that are at least 100 minutes long.
  - π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))
  - the operands Movies are relations

An Algebraic Query Language (cont’d)
- π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))
- Expression tree for a relational algebra expression
  - leaf node: a relation
  - nonleaf node: an operator
  - evaluated bottom-up by applying the operator (at a nonleaf node) to its children

                    π_{title, year}
                          |
                          ∩
                   /             \
    σ_{length ≥ 100}          σ_{studioName = 'Fox'}
           |                           |
         Movies                      Movies

Note: Equivalent expressions
- Equivalent expressions: expressions that produce the same answer whenever they are given the same relations as operands
  - (ex) π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))
  - is equivalent to π_{title, year}(σ_{length ≥ 100 AND studioName = 'Fox'}(Movies))
  - whose expression tree is a single path: π_{title, year} over σ_{length ≥ 100 AND studioName = 'Fox'} over Movies
- Query optimizer: replaces one expression by an equivalent expression that is more efficiently evaluated
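The two expressions can be evaluated side by side on a sample instance. This is a sketch, assuming tuples are stored positionally and using the Movies rows from the projection example; agreement on one instance only illustrates the equivalence and does not prove it.

```python
TITLE, YEAR, LENGTH, GENRE, STUDIO, PRODUCER = range(6)

MOVIES = {
    ("Star Wars", 1977, 124, "sciFi", "Fox", 12345),
    ("Galaxy Quest", 1999, 104, "comedy", "DreamWorks", 67890),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", 99999),
}

def select(rows, condition):
    return {row for row in rows if condition(row)}

def project(rows, positions):
    return {tuple(row[i] for i in positions) for row in rows}

# Intersection of two selections, then projection.
via_intersection = project(
    select(MOVIES, lambda r: r[LENGTH] >= 100)
    & select(MOVIES, lambda r: r[STUDIO] == "Fox"),
    (TITLE, YEAR),
)

# One selection with the combined condition, then projection.
via_single_selection = project(
    select(MOVIES, lambda r: r[LENGTH] >= 100 and r[STUDIO] == "Fox"),
    (TITLE, YEAR),
)
print(via_intersection, via_single_selection)
```

The second form scans Movies once instead of twice, which is the kind of rewrite a query optimizer looks for.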

An Algebraic Query Language (cont’d)
- Renaming: ρ
  - ρ_{S(A1, A2, ..., An)}(R)
  - only the names change: the resulting relation has exactly the same tuples as R
  - the resulting relation has name S and attributes A1, A2, ..., An
  [Figure: relations R and S, and R × ρ_{S(X, C, D)}(S)]

An Algebraic Query Language (cont’d)
- Relationships among operations
  - dependent operators
    - R ∩ S = R - (R - S)
    - R ⋈_C S = σ_C(R × S)
    - R ⋈ S = π_L(σ_C(R × S))
  - independent operators (or fundamental operators)
    - selection, projection, union, difference, Cartesian product, (renaming)
    - cannot be written in terms of the others
  [Figure: Venn diagram of R and S showing R - S and R ∩ S]
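The identity for intersection is easy to check on a small instance. A minimal sketch with made-up tuples, using Python sets for relations over the same schema:

```python
R = {(1, 2), (3, 4), (5, 6)}
S = {(3, 4), (7, 8)}

# R - S removes from R everything that is also in S; removing that remainder
# from R leaves exactly the tuples R shares with S.
derived_intersection = R - (R - S)
direct_intersection = R & S
print(derived_intersection == direct_intersection)
```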

An Algebraic Query Language (cont’d)
- Linear notation for algebraic expressions
  - use temporary relations together with a sequence of assignments
  - (ex) π_{title, year}(σ_{length ≥ 100}(Movies) ∩ σ_{studioName = 'Fox'}(Movies))

    R(t, y, l, g, s, p) := σ_{length ≥ 100}(Movies)
    S(t, y, l, g, s, p) := σ_{studioName = 'Fox'}(Movies)
    T(t, y, l, g, s, p) := R ∩ S
    Answer(title, year) := π_{t, y}(T)

  - temporary relations: R, S, T, Answer
  - the last two steps can be combined: Answer(title, year) := π_{t, y}(R ∩ S)
- Three ways to write a query: relational algebra expression, expression tree, sequence of assignments to temporary relations

Constraints on Relations
- Constraint: a restriction on the data, e.g., the possible values of attribute “gender”
- Relational algebra as a constraint language
  - relational algebra can be used to express constraints, e.g., key constraints
- Two ways to express constraints (R, S: expressions of relational algebra)
  - R = ∅: “There are no tuples in the result of R”
  - R ⊆ S: “Every tuple in R must also be in S”
- These two ways are actually equivalent
  - R ⊆ S can be written R - S = ∅
  - R = ∅ can be written R ⊆ ∅

Constraints on Relations (cont’d)
- Referential integrity constraints
  - if a value v appears in attribute A of relation R, then v must appear in a particular attribute (say B) of relation S
- Referential integrity constraint in relational algebra
  - π_A(R) ⊆ π_B(S), or
  - π_A(R) - π_B(S) = ∅
- (Ex) We expect that every department is in the Departments table
  [Figure: Departments and Students tables; a student's department BioChem has no matching row (???) in Departments]

Constraints on Relations (cont’d)
- (Ex) Consider the following two relations:

    Movies(title, year, length, genre, studioName, producerC#)
    MovieExec(name, address, cert#, netWorth)

- The producer of every movie has to appear in MovieExec.
  - π_{producerC#}(Movies) ⊆ π_{cert#}(MovieExec), or
  - π_{producerC#}(Movies) - π_{cert#}(MovieExec) = ∅
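Both forms of the referential integrity constraint amount to the same set computation. The sketch below is illustrative: the tuples are trimmed to the attributes involved, and the MovieExec rows are hypothetical, chosen so that one producer is missing.

```python
# (title, year, producerC#)
MOVIES = [
    ("Star Wars", 1977, 12345),
    ("Galaxy Quest", 1999, 67890),
    ("Wayne's World", 1992, 99999),
]
# (name, cert#); hypothetical executives
MOVIE_EXEC = [("Exec A", 12345), ("Exec B", 67890)]

producer_certs = {producer for _, _, producer in MOVIES}   # π_{producerC#}(Movies)
exec_certs = {cert for _, cert in MOVIE_EXEC}              # π_{cert#}(MovieExec)

constraint_holds = producer_certs <= exec_certs            # subset form
dangling = producer_certs - exec_certs                     # difference form
print(constraint_holds, dangling)
```

The constraint holds exactly when the difference is empty; here producer 99999 has no MovieExec tuple, so it is violated.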

Constraints on Relations (cont’d)
- (Ex) A referential integrity constraint where the value involved is represented by more than one attribute

    StarsIn(movieTitle, movieYear, starName)
    Movies(title, year, length, genre, studioName, producerC#)

- Any movie mentioned in StarsIn also appears in Movies.
  - π_{movieTitle, movieYear}(StarsIn) ⊆ π_{title, year}(Movies)

Constraints on Relations (cont’d)
- Key constraints
  - MovieStar(name, address, gender, birthdate), where the name attribute is a key
  - no two tuples agree on the name component
  - if two tuples agree on name, then they must also agree on address
  - more precisely, these two tuples must be the same tuple and agree in all attributes
- In relational algebra, with
    MS1 = ρ_{MS1(name, address, gender, birthdate)}(MovieStar)
    MS2 = ρ_{MS2(name, address, gender, birthdate)}(MovieStar)
  - σ_{MS1.name = MS2.name AND MS1.address ≠ MS2.address}(MS1 × MS2) = ∅
    - not exactly a key constraint, but a functional dependency
  - σ_C(MS1 × MS2) = ∅, where C is MS1.name = MS2.name AND (MS1.address ≠ MS2.address OR MS1.gender ≠ MS2.gender OR MS1.birthdate ≠ MS2.birthdate)
    - a correct key constraint
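The difference between the functional dependency and the full key constraint shows up on an instance where two tuples share a name and an address but differ elsewhere. A sketch with hypothetical MovieStar rows; the function names are illustrative.

```python
from itertools import product

# (name, address, gender, birthdate); hypothetical rows
MOVIE_STAR = [
    ("Star A", "1 First St.", "F", "1970-01-01"),
    ("Star B", "2 Second St.", "M", "1980-02-02"),
    ("Star A", "1 First St.", "F", "1971-03-03"),  # same name, other birthdate
]

def fd_violations(rows):
    """Pairs from MS1 x MS2 that agree on name but differ in address."""
    return [(a, b) for a, b in product(rows, rows)
            if a[0] == b[0] and a[1] != b[1]]

def key_violations(rows):
    """Pairs that agree on name but differ in at least one other attribute."""
    return [(a, b) for a, b in product(rows, rows)
            if a[0] == b[0] and a[1:] != b[1:]]

fd_result = fd_violations(MOVIE_STAR)
key_result = key_violations(MOVIE_STAR)
print(len(fd_result), len(key_result))
```

The address-only condition finds nothing, while the full condition reports the offending pair in both orders, which is why only the second selection expresses the key constraint.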

Constraints on Relations (cont’d)
- Additional constraints
  - (Ex) Values of the gender attribute of MovieStar must be 'F' or 'M'
    - σ_{gender ≠ 'F' AND gender ≠ 'M'}(MovieStar) = ∅
    - this is a domain constraint

Constraints on Relations (cont’d)
- (Ex) One must have a net worth of at least $10,000,000 to be the president of a movie studio.
  - we assume a referential integrity constraint from Studio.presC# to MovieExec.cert#

    MovieExec(name, address, cert#, netWorth)
    Studio(name, address, presC#)

  - σ_{netWorth < 10000000}(Studio ⋈_{presC# = cert#} MovieExec) = ∅, or
  - π_{presC#}(Studio) ⊆ π_{cert#}(σ_{netWorth ≥ 10000000}(MovieExec))
- This is neither a domain constraint nor a referential integrity constraint
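Both formulations of the net-worth constraint can be evaluated on a small instance. This sketch uses hypothetical rows trimmed to the attributes involved, and every presC# has a matching cert#, in line with the assumed referential integrity constraint.

```python
# (name, cert#, netWorth); hypothetical executives
MOVIE_EXEC = [
    ("Exec A", 1, 50_000_000),
    ("Exec B", 2, 2_000_000),
]
# (name, presC#); hypothetical studios
STUDIO = [("Studio X", 1), ("Studio Y", 2)]

# Form 1: select netWorth < 10000000 from the theta-join of Studio and MovieExec
# on presC# = cert#; the constraint requires this result to be empty.
violations_join = [
    (studio, exec_name)
    for studio, pres in STUDIO
    for exec_name, cert, worth in MOVIE_EXEC
    if pres == cert and worth < 10_000_000
]

# Form 2: every presC# must be the cert# of an executive with
# netWorth >= 10000000; the constraint requires this difference to be empty.
rich_certs = {cert for _, cert, worth in MOVIE_EXEC if worth >= 10_000_000}
violations_subset = {pres for _, pres in STUDIO} - rich_certs

print(violations_join, violations_subset)
```

Both forms flag the same president, the executive with cert# 2.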