Ch 8: Foundations of Relational Implementation


1 Ch 8: Foundations of Relational Implementation
- Relational data definition and relational terminology
- How a design is defined to the DBMS
- Data Manipulation Language (DML)
- Basic operators of relational algebra

2 8.1 Defining Relational Data
Tasks in implementing a relational database:
1. The structure of the database must be defined to the DBMS. The developer uses a data definition language (DDL) or an equivalent means, such as a graphical display.
2. The database is allocated to physical media.
3. The database is filled with data.

3 8.1.1 Review of Terminology Properties of Relation
1. Each entry in the relation is single-valued.
2. All entries in any column are of the same kind.
- Each column (attribute) has a unique name, and the order of the columns is not important to the relation.
- Each attribute has a domain.
3. No two rows (tuples) in the relation are identical, and the order of the rows is not important.

4 Example of Relation PATIENT relation
Relation structure: PATIENT(Name, Age, Gender, AccountNumber, Physician)
If we add constraints on allowable data values to the relation structure, we have a relational schema.
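
The step from relation structure to relational schema can be sketched with Python's built-in sqlite3 module. This is only an illustration: the specific constraints (the age range, the gender codes, AccountNumber as primary key) and the sample rows are assumptions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation structure plus constraints on allowable values = relational schema.
# The particular constraints below are assumed for illustration.
conn.execute("""
    CREATE TABLE PATIENT (
        Name          TEXT    NOT NULL,
        Age           INTEGER CHECK (Age BETWEEN 0 AND 130),
        Gender        TEXT    CHECK (Gender IN ('F', 'M')),
        AccountNumber INTEGER PRIMARY KEY,
        Physician     TEXT
    )
""")

# A row that satisfies the constraints is accepted.
conn.execute("INSERT INTO PATIENT VALUES ('Riley', 56, 'F', 147, 'Lee')")

# A row that violates the Age constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO PATIENT VALUES ('Murphy', -3, 'M', 289, 'Levy')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM PATIENT").fetchone()[0]
print(rejected, row_count)
```

The DBMS, not the application, enforces the domain rules once they are part of the schema.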

5

6 Confusion Regarding the Term Key
(1) During design (logical key): a key is one or more columns that uniquely identify a row in a relation.
(2) During implementation (physical key): a key is a column on which the DBMS builds an index or other data structure that is used to access rows quickly. Physical keys need not be unique, and often they are not.
Ex.: ORDER(OrderNumber, OrderDate, CustNumber, Amount)
- In relational design: OrderNumber is the unique identifier.
- In relational implementation: any of the four columns could be a key.

7 Index: some people use "index" for a physical key and "key" for a logical key
Three reasons for defining indexes:
1. To allow rows to be accessed quickly by means of the indexed attribute's value.
2. To facilitate sorting rows by that attribute. E.g., in ORDER, OrderDate might be defined as a key so that a report showing orders by date can be generated more quickly.
3. To enforce uniqueness. Indexes do not have to be unique, but sometimes the DBMS creates a unique index to ensure that no duplicate values are accepted.
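
A minimal sketch of these index uses, again with Python's sqlite3 module. The table is named ORDERS here because ORDER is a reserved word in SQL; the index names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ORDERS (OrderNumber INTEGER, OrderDate TEXT, "
    "CustNumber INTEGER, Amount REAL)"
)

# Non-unique index (physical key): speeds up access and sorting by OrderDate.
conn.execute("CREATE INDEX idx_order_date ON ORDERS (OrderDate)")
# Unique index: the DBMS refuses duplicate OrderNumber values.
conn.execute("CREATE UNIQUE INDEX idx_order_number ON ORDERS (OrderNumber)")

conn.execute("INSERT INTO ORDERS VALUES (1, '1996-09-15', 10, 25.0)")
# Same OrderDate as the first row: allowed, because that index is not unique.
conn.execute("INSERT INTO ORDERS VALUES (2, '1996-09-15', 11, 40.0)")

# Same OrderNumber as an existing row: rejected by the unique index.
try:
    conn.execute("INSERT INTO ORDERS VALUES (2, '1996-09-16', 12, 5.0)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

dates = [row[0] for row in conn.execute("SELECT OrderDate FROM ORDERS ORDER BY OrderDate")]
print(duplicate_accepted, dates)
```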

8 8.1.2 Implementing a Relational Database
Implementation procedures:
(1) Defining the database structure to the DBMS
(2) Allocating media space
(3) Creating the database data

9 (1) Defining the Database Structure to the DBMS
DDL (Data Definition Language)
- Graphical definition facilities: DBMS products on PCs
- Textual DDL: DBMS products on servers and mainframes

10 (1) Defining the Database Structure to the DBMS

11 (2) Allocating Media Space
For a personal database: assign the database to a directory and give it a name (the DBMS allocates storage space automatically).
For servers and mainframes: to improve performance and control, the distribution of the database data across disks and channels must be carefully planned. E.g., it may be advantageous to locate certain tables on the same disk, or it may be important to ensure that certain tables are not located on the same disk.
Ex.: Consider an order object composed of data from the ORDER, LINE-ITEM, and ITEM tables. The application retrieves one row from ORDER, several rows from LINE-ITEM, and one row from ITEM for each LINE-ITEM row. LINE-ITEM rows for a given order tend to be clustered together, but ITEM rows are not clustered at all.

12 (2) Allocating Media Space
Suppose that an organization concurrently processes many orders and has one large, fast disk and one small, slower disk. The developer must determine the best place to locate the data. Options:
1. The ITEM table is stored on the larger, fast disk, and the ORDER and LINE-ITEM data on the smaller, slower disk.
2. ORDER and LINE-ITEM data for prior months' orders are placed on the slower disk, and all the data for this month's orders are placed on the faster disk.

13

14 (3) Creating the Database Data
Once the database has been defined and allocated to physical storage, it can be filled with data. How this is done depends on the application requirements and the features of the DBMS.
- Best case: all of the data are already in a computer-sensible format, and the DBMS has features and tools to facilitate importing the data from magnetic media.
- Worst case: all of the data must be entered via manual key entry, using application programs created from scratch by the developers.

15 8.2 Relational Data Manipulation
8.2.1 Categories of Relational Data Manipulation Language
1. Relational algebra: defines operators that work on relations (akin to the arithmetic operators + and -). Relational algebra is hard to use, partly because it is procedural: we must know not only what we want but also how to get it.
- Infrequently used in commercial DB processing
- Discussed as a foundation for learning SQL
2. Relational calculus: nonprocedural. We express what we want without expressing how to get it.
- Never used in commercial DB processing

16 3. Transform-oriented languages: nonprocedural languages that transform input data expressed as relations into results expressed as a single relation
- These languages provide easy-to-use structures for expressing what is desired in terms of the data supplied (SQUARE, SEQUEL, SQL)
4. Graphical languages: Query-by-Example and Query-by-Form
- Products: Approach, Access, and Cyberprise DBApp
- With a graphical interface, the user is presented with a materialization of one or more relations
- The materialization might be a data entry form, a spreadsheet, or some other structure
- The DBMS maps the materialization to the underlying relations and constructs queries (most likely in SQL) on behalf of the user

17 8.2.2 DML Interfaces to the DBMS
User interfaces to a database:
- Form and report capabilities supplied by the DBMS
- A query/update language
- Application programs (by means of DBMS commands)

18 (1) Data Manipulation by Means of Forms

19 (2) Query/Update Language Interface
Ex.: Consider the following SQL statement, which processes the relation PATIENT(Name, Age, Gender, AccountNumber, Physician):
SELECT Name, Age
FROM PATIENT
WHERE Physician = 'Levy'
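
To see what the statement returns, here is a small sketch that runs it through Python's sqlite3 module. The patient rows are invented for illustration, and an ORDER BY clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PATIENT (Name TEXT, Age INTEGER, Gender TEXT, "
    "AccountNumber INTEGER, Physician TEXT)"
)

# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO PATIENT VALUES (?, ?, ?, ?, ?)",
    [
        ("Riley", 56, "F", 147, "Lee"),
        ("Murphy", 17, "M", 289, "Levy"),
        ("Krajewski", 25, "F", 533, "Levy"),
    ],
)

# The query from the slide, with the physician name passed as a parameter.
levy_patients = conn.execute(
    "SELECT Name, Age FROM PATIENT WHERE Physician = ? ORDER BY Name",
    ("Levy",),
).fetchall()

print(levy_patients)  # [('Krajewski', 25), ('Murphy', 17)]
```

The result is itself a relation: two columns (Name, Age) and one row per matching patient.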

20 (3) Stored Procedure Interface
Query languages have generally proved to be too complicated for the average end user.
- Many end users have specialists write the query procedures, which are stored as files.
- Such procedures can be written to be parameter driven, enabling users to execute them when they change the data.
Ex.: DO BILLING FOR BDATE = '9/15/1996'

21 (4) Application Program Interface
Through application programs written in languages such as COBOL, BASIC, Perl, Pascal, and C++ (some application programs are written in languages provided by the DBMS vendors).
Two styles of application program interface to the DBMS:
1. The application program makes function calls to routines in a function library provided with the DBMS. E.g., to read a particular row of a table, the application program calls the DBMS read function and passes parameters that indicate the table to be accessed, the data to be retrieved, the criteria for row selection, and the like.
Object-oriented syntax in Access 2000:
Set db = CurrentDb()
Set rs = db.OpenRecordset("PATIENT")
Methods: rs.AllowDeletions, rs.MoveFirst

22 (4) Application Program Interface
2. Used with mainframe and server DBMS products
- A set of high-level data access commands is defined by the DBMS vendor
- These commands (which are peculiar to database processing and not part of any standard language) are embedded in the application program code

23

24 Mismatch
- There is a mismatch in the basic orientation of SQL and application programming languages. To correct for this mismatch, the results of SQL statements are treated, in the application program, as files.
SELECT Name, Age FROM PATIENT WHERE Physician = 'Levy'
- The result of this statement is a table with 2 columns and N rows
- Mismatch: SQL is relation oriented, whereas programming languages are row oriented
- The mismatch must be corrected when programs access a relational database via SQL

25 Mismatch
- To accept the results of this query, the application program is written to assume that the statement has produced a file with N records
- The application program opens the query and processes the rows one at a time (the same logic as for processing a sequential file)
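
The row-at-a-time pattern can be sketched with a sqlite3 cursor in Python. This illustrates the general idea only, not the embedded-SQL syntax of any particular mainframe product; the patient rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PATIENT (Name TEXT, Age INTEGER, Physician TEXT)")
# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO PATIENT VALUES (?, ?, ?)",
    [("Riley", 56, "Lee"), ("Murphy", 17, "Levy"), ("Krajewski", 25, "Levy")],
)

# "Open" the query: the cursor plays the role of a sequential file.
cursor = conn.execute(
    "SELECT Name, Age FROM PATIENT WHERE Physician = 'Levy' ORDER BY Name"
)

processed = []
row = cursor.fetchone()          # read the first record
while row is not None:           # None signals end of file
    name, age = row
    processed.append(f"{name} ({age})")
    row = cursor.fetchone()      # read the next record

print(processed)  # ['Krajewski (25)', 'Murphy (17)']
```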

26 8.3 Relational Algebra

27

28 8.3.1 Relational Operators (1) UNION (A + B)
The union of two relations is formed by adding the tuples from one relation to those of a second relation to produce a third relation.
- The order in which the tuples appear in the third relation is not important, but duplicate tuples must be eliminated.
- Union compatible: each relation must have the same number of attributes, and the attributes in corresponding columns must come from the same domain.

29

30 (2) DIFFERENCE
The difference of two relations is a third relation containing the tuples that occur in the first relation but not in the second. The relations must be union compatible.

31 (3) INTERSECTION
The intersection of two relations is a third relation containing the tuples that appear in both the first and second relations. The relations must be union compatible.
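
Union, difference, and intersection map directly onto Python set operations if a relation is modeled as a set of tuples. The sketch below makes that assumption; the sample tuples are hypothetical, and `union_compatible` is an illustrative helper that uses Python value types as a rough stand-in for domains.

```python
# Relations modeled as sets of tuples; a set cannot hold duplicate tuples.
# JUNIOR(Snum, Name, Major) and HONOR_STUDENT(Number, Name, Interest).
JUNIOR = {
    (123, "JONES", "HISTORY"),
    (158, "PARKS", "MATH"),
    (271, "SMITH", "HISTORY"),
}
HONOR_STUDENT = {
    (105, "ANDERSON", "MANAGEMENT"),
    (123, "JONES", "HISTORY"),
}


def union_compatible(a, b):
    """True if every tuple in both relations has the same column count and types."""
    signatures = {tuple(type(value) for value in row) for row in a | b}
    return len(signatures) <= 1


assert union_compatible(JUNIOR, HONOR_STUDENT)

union = JUNIOR | HONOR_STUDENT         # duplicates eliminated automatically
difference = JUNIOR - HONOR_STUDENT    # in JUNIOR but not in HONOR_STUDENT
intersection = JUNIOR & HONOR_STUDENT  # in both relations

print(len(union), sorted(difference), sorted(intersection))
```

JONES appears in both inputs but only once in the union, which is the duplicate elimination the UNION slide calls for.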

32 (4) PRODUCT (cartesian product)
The product is the concatenation of every tuple of one relation with every tuple of a second relation.
- The product of relation A (having m tuples) and relation B (having n tuples) has m times n tuples.
- Denoted: A × B or A TIMES B
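
A short sketch of the product using itertools.product from the Python standard library. The two small relations are hypothetical samples.

```python
from itertools import product

# Hypothetical samples: STUDENT(SID, Name) with m = 3 tuples,
# ENROLLMENT(StudentNumber, ClassName) with n = 2 tuples.
STUDENT = [(100, "JONES"), (150, "PARKS"), (200, "BAKER")]
ENROLLMENT = [(100, "BD445"), (150, "BA200")]

# Concatenate every STUDENT tuple with every ENROLLMENT tuple.
PRODUCT = [s + e for s, e in product(STUDENT, ENROLLMENT)]

print(len(PRODUCT))  # 6, i.e. m times n
print(PRODUCT[0])    # (100, 'JONES', 100, 'BD445')
```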

33

34 (5) PROJECTION
Projection is an operation that selects specified attributes from a relation.
- The result of the projection is a new relation with the selected attributes.
- Projection chooses columns from a relation.
- Denoted: STUDENT[Name, Major]

35 (6) SELECTION
Selection is an operation that takes a horizontal subset (rows) of a relation.
- Selection identifies the tuples to be included in the new relation.
- Denoted by specifying the relation name, followed by the keyword WHERE, followed by a condition involving attributes.
- Denoted: STUDENT WHERE Major = 'Math'
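
Projection and selection can be sketched as two small Python functions over a list of dictionaries. The function names `project` and `select` and the sample tuples are illustrative choices, not notation from the slides.

```python
# Hypothetical sample of STUDENT(SID, Name, Major).
STUDENT = [
    {"SID": 100, "Name": "JONES", "Major": "HISTORY"},
    {"SID": 150, "Name": "PARKS", "Major": "ACCOUNTING"},
    {"SID": 200, "Name": "BAKER", "Major": "MATH"},
    {"SID": 300, "Name": "BAKER", "Major": "ACCOUNTING"},
]


def project(relation, attributes):
    """Keep only the named columns and drop duplicate result tuples."""
    result = []
    for row in relation:
        projected = {name: row[name] for name in attributes}
        if projected not in result:
            result.append(projected)
    return result


def select(relation, predicate):
    """Keep only the rows for which the condition holds."""
    return [row for row in relation if predicate(row)]


names = project(STUDENT, ["Name"])                         # STUDENT[Name]
math_majors = select(STUDENT, lambda r: r["Major"] == "MATH")  # STUDENT WHERE Major = 'MATH'

print(names)        # BAKER appears once, although two students share the name
print(math_majors)
```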

36 (7) JOIN
The join operation is a combination of the product, selection, and (possibly) projection operations.
Procedure for joining A and B:
① Form the product A × B
② Use selection to eliminate some tuples
③ (Optionally) remove some attributes by means of projection

37 Example of Join
Ex.: STUDENT and ENROLLMENT relations in Figure 8-15 (we want to know the Name and PositionNumber of each student)
STUDENT JOIN (SID = StudentNumber) ENROLLMENT
"Join a STUDENT tuple to an ENROLLMENT tuple if SID of STUDENT equals StudentNumber of ENROLLMENT"
Procedure:
1. Form the product of STUDENT and ENROLLMENT (Figure 8-16)
2. Select those tuples from the product where SID of STUDENT equals StudentNumber of ENROLLMENT
3. Results: Figure 8-19a (two attributes are identical)
- Figure 8-19a: equijoin
- Figure 8-19b: natural join (default join)
- Figure 8-19c: outer join
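
The three-step procedure can be sketched in Python as follows. The tuples are hypothetical samples and do not reproduce the figures cited above.

```python
from itertools import product

# Hypothetical sample tuples.
STUDENT = [
    {"SID": 100, "Name": "JONES"},
    {"SID": 150, "Name": "PARKS"},
    {"SID": 250, "Name": "GLASS"},
]
ENROLLMENT = [
    {"StudentNumber": 100, "ClassName": "BD445", "PositionNumber": 1},
    {"StudentNumber": 150, "ClassName": "BA200", "PositionNumber": 1},
    {"StudentNumber": 100, "ClassName": "BF410", "PositionNumber": 2},
]

# Step 1: product, every STUDENT tuple concatenated with every ENROLLMENT tuple.
product_rows = [{**s, **e} for s, e in product(STUDENT, ENROLLMENT)]

# Step 2: selection keeps tuples where SID equals StudentNumber (the equijoin).
equijoin = [r for r in product_rows if r["SID"] == r["StudentNumber"]]

# Natural join: the equijoin with the redundant join attribute removed.
natural_join = [
    {k: v for k, v in r.items() if k != "StudentNumber"} for r in equijoin
]

# Step 3: projection onto the attributes we actually want.
answer = [(r["Name"], r["PositionNumber"]) for r in equijoin]

print(len(product_rows), answer)
```

GLASS has no ENROLLMENT tuple, so the selection step drops every product row that contains GLASS. This is the case an outer join handles differently.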

38

39 Joining on conditions other than equality also is possible
STUDENT JOIN (SID not = StudentNumber) ENROLLMENT
STUDENT JOIN (SID < FID) FACULTY
- The latter join would result in tuples in which the student numbers are lower than the faculty numbers.
Join condition: the attributes in the condition must arise from a common domain.
STUDENT JOIN (Age = ClassSize) ENROLLMENT
- This is an incorrect join, because Age and ClassSize do not arise from a common domain (unfortunately, many relational DBMS products permit such a join).

40 Outer Join
- Used when we want to include all rows from one side of the join
STUDENT LEFT OUTER JOIN (SID = StudentNumber) ENROLLMENT
STUDENT RIGHT OUTER JOIN (SID = StudentNumber) ENROLLMENT
- Useful when working with relationships in which the minimum cardinality is zero on one or both sides
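
A minimal Python sketch of a left outer join. The helper name `left_outer_join` and the sample tuples are hypothetical. Unmatched left-hand tuples are kept and padded with None, which stands in for null.

```python
# Hypothetical sample tuples.
STUDENT = [
    {"SID": 100, "Name": "JONES"},
    {"SID": 250, "Name": "GLASS"},
]
ENROLLMENT = [
    {"StudentNumber": 100, "ClassName": "BD445"},
]


def left_outer_join(left, right, left_attr, right_attr):
    """Equijoin that keeps every left tuple, padding with None when unmatched."""
    right_attrs = list(right[0].keys()) if right else []
    result = []
    for l_row in left:
        matches = [r for r in right if l_row[left_attr] == r[right_attr]]
        if matches:
            result.extend({**l_row, **r} for r in matches)
        else:
            result.append({**l_row, **{a: None for a in right_attrs}})
    return result


# STUDENT LEFT OUTER JOIN (SID = StudentNumber) ENROLLMENT
left_result = left_outer_join(STUDENT, ENROLLMENT, "SID", "StudentNumber")

# A right outer join keeps all tuples of the right-hand relation instead;
# it can be obtained by swapping the operands.
right_result = left_outer_join(ENROLLMENT, STUDENT, "StudentNumber", "SID")

print(left_result)
print(right_result)
```

GLASS, who has no enrollment, still appears in the left outer join, with null class data.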

41

42 8.3.2 Expressing Queries in Relational Algebra
Example relations:
1. JUNIOR (Snum, Name, Major)
2. HONOR-STUDENT (Number, Name, Interest)
3. STUDENT (SID, Name, Major, GradeLevel, Age)
4. CLASS (Name, Time, Room)
5. ENROLLMENT (StudentNumber, ClassName, PositionNumber)
6. FACULTY (FID, Name, Department)

43

44 ① What are the names of all students?
STUDENT[Name]
- Duplicate names have been omitted
- Result: JONES, PARKS, BAKER, GLASS, RUSSELL, RYE
② What are the student numbers of all students enrolled in a class?
ENROLLMENT[StudentNumber]
- Result: 100, 150, 200, 300, 400, 450

45 STUDENT[SID] - ENROLLMENT[StudentNumber]
③ What are the student numbers of all students not enrolled in a class?
STUDENT[SID] - ENROLLMENT[StudentNumber]
- Result: 250, 350
④ What are the student numbers of the students enrolled in the class 'BD445'?
ENROLLMENT WHERE ClassName = 'BD445' [StudentNumber]
- Result: 100, 200

46 ⑤What are the names of the students enrolled in class `BD445' ?
STUDENT JOIN (SID = StudentNumber) ENROLLMENT WHERE ClassName = 'BD445' [STUDENT.Name]
- To answer this query, data from both STUDENT and ENROLLMENT are needed.
- Specifically, student names must come from STUDENT, whereas the condition "enrolled in BD445" must be checked in ENROLLMENT.
- Join → Selection → Projection
- Result: JONES, BAKER

47 ⑥ What are the names and meeting times of PARKS's classes?
To answer this, we must bring together data from all three relations: STUDENT data to find PARKS's student number, ENROLLMENT data to learn which classes PARKS is in, and CLASS data to determine the class meeting times.
STUDENT WHERE Name = 'PARKS' JOIN (SID = StudentNumber) ENROLLMENT JOIN (ClassName = Name) CLASS [CLASS.Name, Time]
- Result: BA200, M-F9
- Specify CLASS.Name to avoid ambiguity (because both STUDENT and CLASS have an attribute called Name)

48 Equivalent ways of responding to query ⑥
STUDENT JOIN (SID = StudentNumber) ENROLLMENT JOIN (ClassName = Name) CLASS WHERE Name = 'PARKS' [CLASS.Name, Time]
- The selection on PARKS is not done until after all of the joins have been performed. Assuming that the computer performs the operations as stated, this expression will be slower than the earlier one because many more tuples will be joined.
- $1.17 vs. $4,356
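
The difference between selecting first and selecting last can be made concrete by counting the tuples each join produces. The sketch below uses hypothetical data (only the BA200 / M-F9 pairing comes from the slides) and a naive `equijoin` helper that performs the operations exactly in the order written.

```python
# Hypothetical sample relations, stored as tuples.
STUDENT = [(100, "JONES"), (150, "PARKS"), (200, "BAKER")]    # (SID, Name)
ENROLLMENT = [(100, "BD445"), (150, "BA200"),
              (200, "BD445"), (200, "CS250")]                 # (StudentNumber, ClassName)
CLASS = [("BA200", "M-F9"), ("BD445", "MWF3"), ("CS250", "MWF12")]  # (Name, Time)


def equijoin(left, right, left_index, right_index):
    """Concatenate tuple pairs whose values at the given positions are equal."""
    return [l + r for l in left for r in right if l[left_index] == r[right_index]]


# Select first, then join.
parks_only = [s for s in STUDENT if s[1] == "PARKS"]
early_step = equijoin(parks_only, ENROLLMENT, 0, 0)
early_joined = equijoin(early_step, CLASS, 3, 0)
early_result = {(row[4], row[5]) for row in early_joined}
early_tuples = len(early_step) + len(early_joined)

# Join everything, then select.
late_step = equijoin(STUDENT, ENROLLMENT, 0, 0)
late_joined = equijoin(late_step, CLASS, 3, 0)
late_result = {(row[4], row[5]) for row in late_joined if row[1] == "PARKS"}
late_tuples = len(late_step) + len(late_joined)

print(early_result == late_result)  # True: same answer either way
print(early_tuples, late_tuples)    # 2 8: far fewer tuples joined when selecting first
```

Even with a dozen sample tuples, the late selection joins four times as many tuples; the gap widens as the relations grow.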

49 ⑦ What are the grade levels and meeting rooms of all students, including students not enrolled in a class?
- Since all students are to be included, this query requires the use of an outer join
STUDENT LEFT OUTER JOIN (SID = StudentNumber) ENROLLMENT JOIN (ClassName = Name) CLASS [GradeLevel, Room]
- The result includes the GradeLevels of GLASS and RUSSELL, who are not enrolled in any class
- Result (GradeLevel and Room values; column alignment lost): GR SC213 SO SC110 EB210 SN Null EA304 JR FR

