Chapter 6 The Relational Data Model and the Relational Algebra.


1 Chapter 6 The Relational Data Model and the Relational Algebra

2 6.1. The Relational Data Model Concepts
– The relational model of data is based on the concept of a relation.
– A relation is a mathematical concept based on the ideas of sets.
– The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.
– The model was first proposed by Dr. E.F. Codd of IBM in 1970.

3 RELATION: A table of values
– A relation may be thought of as a set of rows.
– A relation may alternately be thought of as a set of columns.
– Each row represents a fact that corresponds to a real-world entity or relationship.
– Each row has a value of an item or set of items that uniquely identifies that row in the table.
– Sometimes row-ids or sequential numbers are assigned to identify the rows in the table.
– Each column is typically referred to by its column name, column header, or attribute name.

4 FORMAL DEFINITIONS
– A relation may be defined in multiple ways.
– The schema of a relation: R(A1, A2, ..., An). Relation schema R is defined over attributes A1, A2, ..., An.
– For example: CUSTOMER(Cust-id, Cust-name, Address, Phone#). Here, CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, and Phone#, each of which has a domain, or set of valid values. For example, the domain of Cust-id is 6-digit numbers.

5 A tuple is an ordered set of values. Each value is derived from an appropriate domain. Each row in the CUSTOMER table may be referred to as a tuple in the table and would consist of four values, one for each of the attributes Cust-id, Cust-name, Address, and Phone#, in that order. A relation may be regarded as a set of tuples (rows). Columns in a table are also called attributes of the relation. A domain has a logical definition: e.g., "USA_phone_numbers" is the set of 10-digit phone numbers valid in the U.S.

6 A domain may have a data type or a format defined for it. For example, USA_phone_numbers may have the format (ddd)-ddd-dddd, where each d is a decimal digit, and dates have various formats such as month name, date, year; yyyy-mm-dd; or dd mm, yyyy. An attribute designates the role played by the domain. E.g., the domain Date may be used to define the attributes "Invoice-date" and "Payment-date". The relation is formed over the Cartesian product of the sets; each set has values from a domain; that domain is used in a specific role, which is conveyed by the attribute name.

7 For example, attribute Cust-name is defined over the domain of strings of 25 characters. The role these strings play in the CUSTOMER relation is that of the name of customers.
Formally, given R(A1, A2, ..., An):
r(R) ⊆ dom(A1) × dom(A2) × ... × dom(An)
– R: the schema of the relation.
– r of R: a specific "value" or population of R.
– R is also called the intension of a relation.
– r is also called the extension of a relation.

8 Let S1 = {0, 1}. Let S2 = {a, b, c}. Let R ⊆ S1 × S2. Then, for example, r(R) = {<0,a>, <0,b>, <1,c>} is one possible "state" or "population" or "extension" r of the relation R, defined over domains S1 and S2. It has three tuples.
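The idea that a relation state is just a subset of the Cartesian product of its domains can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the helper name `is_valid_state` and the particular three-tuple state `r` are illustrative assumptions.

```python
from itertools import product

# The two domains from the slide.
S1 = {0, 1}
S2 = {"a", "b", "c"}

# S1 x S2: every possible tuple over the two domains (2 * 3 = 6 tuples).
cartesian = set(product(S1, S2))

# One possible state of R (illustrative choice): any subset of S1 x S2.
r = {(0, "a"), (0, "b"), (1, "c")}


def is_valid_state(state, domains):
    """True if the i-th value of every tuple comes from the i-th domain."""
    return all(
        len(t) == len(domains) and all(v in d for v, d in zip(t, domains))
        for t in state
    )


print(len(cartesian))                 # size of S1 x S2
print(r <= cartesian)                 # r is a subset of the product
print(is_valid_state(r, [S1, S2]))
```

Because a state is a set, its tuples have no order and no duplicates, which matches the characteristics of relations listed in the slides that follow.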

9 DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension

11 CHARACTERISTICS OF RELATIONS
– Ordering of tuples in a relation r(R): The tuples are not considered to be ordered, even though they appear to be in tabular form.
– Ordering of attributes in a relation schema R (and of values within each tuple): We will consider the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2, ..., vn> to be ordered. (However, a more general alternative definition of relation does not require this ordering.)
– Values in a tuple: All values are considered atomic (indivisible). A special null value is used to represent values that are unknown or inapplicable to certain tuples.

12 Notation: We refer to component values of a tuple t by t[Ai] = vi (the value of attribute Ai for tuple t). Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw, respectively.

13 6.2. The Relational Constraints and Relational Database Schemas
Constraints are conditions that must hold on all valid relation instances. There are three main types of constraints:
1. Key constraints
2. Entity integrity constraints
3. Referential integrity constraints
Key Constraints
– Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].

14 Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
Example: The CAR relation schema CAR(State, Reg#, SerialNo, Make, Model, Year) has two keys, Key1 = {State, Reg#} and Key2 = {SerialNo}, which are also superkeys. {SerialNo, Make} is a superkey but not a key.
If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
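The superkey and key definitions can be tested against a concrete instance. The sketch below is an illustration only: the three CAR rows are invented, `Reg` stands in for Reg#, and the function names are hypothetical. Note that a real key constraint must hold for every valid instance, whereas this code can only check the one instance it is given.

```python
def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """True if attrs is a superkey and dropping any one attribute breaks that."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != removed])
        for removed in attrs
    )


# Hypothetical CAR instance (values invented for illustration).
cars = [
    {"State": "TX", "Reg": "ABC-123", "SerialNo": "A1", "Make": "Ford"},
    {"State": "TX", "Reg": "XYZ-789", "SerialNo": "B2", "Make": "Ford"},
    {"State": "CA", "Reg": "ABC-123", "SerialNo": "C3", "Make": "Toyota"},
]

print(is_key(cars, ["State", "Reg"]))          # candidate key
print(is_key(cars, ["SerialNo"]))              # candidate key
print(is_superkey(cars, ["SerialNo", "Make"]))  # superkey ...
print(is_key(cars, ["SerialNo", "Make"]))       # ... but not minimal
```

Checking single-attribute removals is enough for minimality, because any superset of a superkey is itself a superkey.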

15 Entity Integrity
– Relational Database Schema: A set S of relation schemas that belong to the same database. S is the name of the database: S = {R1, R2, ..., Rn}.
– Entity Integrity: The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R), because primary key values are used to identify the individual tuples: t[PK] ≠ null for any tuple t in r(R).
– Note: Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.

16 Referential Integrity
– A constraint involving two relations (the previous constraints involve a single relation).
– Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation.
– Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R2. A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
– A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1.FK to R2.

17 Referential Integrity Constraint
Statement of the constraint: The value in the foreign key column (or columns) FK of the referencing relation R1 can be either:
(1) the value of an existing primary key PK in the referenced relation R2, or
(2) a null.
In case (2), the FK in R1 should not be a part of its own primary key.
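Both integrity rules can be expressed as simple checks over an instance. The following Python sketch is illustrative only: the function names and the small `employee` and `dept` tables are assumptions made here, and null is modelled as `None`. A foreign key that is only partially null is treated as a violation unless it matches, which is one possible reading and not something the slides specify.

```python
def entity_integrity_violations(rows, pk):
    """Rows whose primary key contains a null (None) value."""
    return [row for row in rows if any(row[a] is None for a in pk)]


def referential_integrity_violations(referencing, fk, referenced, pk):
    """Referencing rows whose FK is neither null nor an existing PK value."""
    pk_values = {tuple(row[a] for a in pk) for row in referenced}
    bad = []
    for row in referencing:
        value = tuple(row[a] for a in fk)
        if all(v is None for v in value):
            continue  # case (2): a null foreign key is allowed
        if value not in pk_values:
            bad.append(row)  # case (1) fails: no matching primary key
    return bad


# Hypothetical data: dept.MangID references employee.ID.
employee = [{"ID": 123}, {"ID": 567}]
dept = [
    {"DeptID": 2, "MangID": 567},   # references an existing employee
    {"DeptID": 3, "MangID": None},  # null foreign key, allowed
    {"DeptID": 4, "MangID": 999},   # dangling reference
]

print(entity_integrity_violations(employee, ["ID"]))
print(referential_integrity_violations(dept, ["MangID"], employee, ["ID"]))
```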

18 Other Types of Constraints
– These are based on application semantics and cannot be expressed by the model per se.
– E.g., "the maximum number of hours per employee for all projects he or she works on is 56 hours per week."
– A constraint specification language may have to be used to express these.
– SQL-99 provides triggers and assertions to express some of these.

19 6.3. The Relational Operations
– The basic set of operations for the relational model is known as the relational algebra.
– These operations enable a user to specify basic retrieval requests.
– The result of the retrieval is a new relation, which may have been formed from one or more relations.
– The algebra operations thus produce new relations, which can be further manipulated using operations of the same algebra.
– A sequence of relational algebra operations forms a relational algebra expression, whose result will also be a relation that represents the result of a database query (or retrieval request).

20 Relational algebra is a theoretical language with operations that work on one or more relations to define another relation without changing the original relations. The output from one operation can become the input to another operation (nesting is possible). Different basic operations can be applied to the relations in a database, depending on the requirement.

21 – Selection (σ): Selects a subset of rows from a relation.
– Projection (π): Deletes unwanted columns from a relation.
– Cross-Product (×): Allows us to combine two relations.
– Set-Difference (−): Tuples in relation1, but not in relation2.
– Union (∪): Tuples in relation1 or in relation2.
– Intersection (∩): Tuples in relation1 and in relation2.
– Join: Tuples joined from two relations.
Using these, we can build up sophisticated database queries.
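The three set operations in this list map directly onto Python's built-in set type when each tuple is stored as a Python tuple. This is a small sketch under assumptions: the relations `r1` and `r2` and their contents are invented for illustration, and both are given the same two attributes (ID, FName) so that the operations are meaningful.

```python
# Two hypothetical relations over the same attributes (ID, FName).
r1 = {(123, "Abebe"), (567, "Belay")}
r2 = {(567, "Belay"), (822, "Kefle")}

union = r1 | r2          # tuples in r1 or in r2
intersection = r1 & r2   # tuples in r1 and in r2
difference = r1 - r2     # tuples in r1 but not in r2

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))

# Intersection can be derived from set difference alone.
print(intersection == r1 - (r1 - r2))
```

The last line shows why intersection is often treated as a derived operation: it can be written using set difference only.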

22 The following sample table is used to illustrate the different kinds of relational operations. The relation contains information about employees, the IT skills they have, and the school where they acquired each skill. The primary key for this table is the combination of EmpID and SkillID, since a single employee can have multiple skills and a single skill can be acquired by many employees. SchoolAdd is the address of the school; where a single school has many branches at different locations, the address of the main office is used.

23 Employee
EmpID  FName   LName    SkillID  Skill   SkillType    School  SchoolAdd    SkillLevel
12     Abebe   Mekuria  2        SQL     Database     AAU     Sidist_Kilo  5
16     Lemma   Alemu    5        C++     Programming  Unity   Gerji        6
28     Chane   Kebede   2        SQL     Database     AAU     Sidist_Kilo  10
25     Abera   Taye     6        VB6     Programming  Helico  Piazza       8
65     Almaz   Belay    2        SQL     Database     Helico  Piazza       9
24     Dereje  Tamiru   8        Oracle  Database     Unity   Gerji        5
51     Selam   Belay    4        Prolog  Programming  Jimma   Jimma City   8
94     Alem    Kebede   3        Cisco   Networking   AAU     Sidist_Kilo  7
18     Girma   Dereje   1        IP      Programming  Jimma   Jimma City   4
13     Yared   Gizaw    7        Java    Programming  AAU     Sidist_Kilo  6

24 1. Selection
– Selects the subset of tuples/rows in a relation that satisfy a selection condition.
– The Selection operation is a unary operator (it is applied to a single relation).
– The Selection operation is applied to each tuple individually.
– The degree of the resulting relation is the same as that of the original relation, but the cardinality (number of tuples) is less than or equal to that of the original relation.
– The Selection operator is commutative: applying two selections in either order gives the same result.
– A set of conditions can be combined using the Boolean operations ∧ (AND), ∨ (OR), and ~ (NOT).

25 – No duplicates in result!
– Schema of result identical to schema of (only) input relation.
– Result relation can be the input for another relational algebra operation! (Operator composition.)
– It is a filter that keeps only those tuples that satisfy a qualifying condition (those satisfying the condition are selected while others are discarded).
Notation: σ<selection condition>(relation name)

26 Example: Find all employees with a skill type of Database.
σ SkillType = "Database" (Employee)
This query extracts every tuple, with all of its attributes, from the relation Employee where the SkillType attribute has the value "Database". The resulting relation will be the following:
EmpID  FName   LName    SkillID  Skill   SkillType  School  SchoolAdd    SkillLevel
12     Abebe   Mekuria  2        SQL     Database   AAU     Sidist_Kilo  5
28     Chane   Kebede   2        SQL     Database   AAU     Sidist_Kilo  10
65     Almaz   Belay    2        SQL     Database   Helico  Piazza       9
24     Dereje  Tamiru   8        Oracle  Database   Unity   Gerji        5

27 If the query asks for all employees with SkillType Database and School Unity, the relational algebra operation and the resulting relation will be as follows:
σ SkillType = "Database" ∧ School = "Unity" (Employee)
EmpID  FName   LName   SkillID  Skill   SkillType  School  SchoolAdd  SkillLevel
24     Dereje  Tamiru  8        Oracle  Database   Unity   Gerji      5
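The two selection queries above can be reproduced with a short Python sketch. It is an illustration, not a definitive implementation: the relation is modelled as a list of dictionaries, and the names `select`, `is_database`, and `at_unity` are chosen here for the example. The data is the sample Employee table.

```python
COLUMNS = ("EmpID", "FName", "LName", "SkillID", "Skill",
           "SkillType", "School", "SchoolAdd", "SkillLevel")
ROWS = [
    (12, "Abebe", "Mekuria", 2, "SQL", "Database", "AAU", "Sidist_Kilo", 5),
    (16, "Lemma", "Alemu", 5, "C++", "Programming", "Unity", "Gerji", 6),
    (28, "Chane", "Kebede", 2, "SQL", "Database", "AAU", "Sidist_Kilo", 10),
    (25, "Abera", "Taye", 6, "VB6", "Programming", "Helico", "Piazza", 8),
    (65, "Almaz", "Belay", 2, "SQL", "Database", "Helico", "Piazza", 9),
    (24, "Dereje", "Tamiru", 8, "Oracle", "Database", "Unity", "Gerji", 5),
    (51, "Selam", "Belay", 4, "Prolog", "Programming", "Jimma", "Jimma City", 8),
    (94, "Alem", "Kebede", 3, "Cisco", "Networking", "AAU", "Sidist_Kilo", 7),
    (18, "Girma", "Dereje", 1, "IP", "Programming", "Jimma", "Jimma City", 4),
    (13, "Yared", "Gizaw", 7, "Java", "Programming", "AAU", "Sidist_Kilo", 6),
]
employee = [dict(zip(COLUMNS, row)) for row in ROWS]


def select(relation, condition):
    """Selection: keep only the tuples for which condition(t) is true."""
    return [t for t in relation if condition(t)]


def is_database(t):
    return t["SkillType"] == "Database"


def at_unity(t):
    return t["School"] == "Unity"


# SkillType = "Database"
database = select(employee, is_database)

# SkillType = "Database" AND School = "Unity"
database_at_unity = select(employee, lambda t: is_database(t) and at_unity(t))

# Nesting two selections in either order gives the same tuples (commutativity).
nested = select(select(employee, at_unity), is_database)

for t in database:
    print(t["EmpID"], t["FName"], t["Skill"], t["School"])
```

Each result keeps all nine attributes, so the degree is unchanged while the cardinality drops from 10 to 4 and then to 1.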

28 2. Projection
– Selects certain attributes while discarding the others from the base relation.
– The PROJECT operation creates a vertical partitioning: one part with the needed columns (attributes), containing the results of the operation, and the other containing the discarded columns.
– Deletes attributes that are not in the projection list.
– The schema of the result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
– The Projection operator has to eliminate duplicates! Note: real systems typically don't do duplicate elimination unless the user explicitly asks for it.
– If the entire primary key is in the projection list, then duplication will not occur.
– Duplicate removal is necessary to ensure that the resulting table is also a relation.

29 Notation: π<attribute list>(Relation Name)
Example: To display the name, skill, and skill level of each employee, the query and the resulting relation will be:
π FName, LName, Skill, SkillLevel (Employee)
FName   LName    Skill   SkillLevel
Abebe   Mekuria  SQL     5
Lemma   Alemu    C++     6
Chane   Kebede   SQL     10
Abera   Taye     VB6     8
Almaz   Belay    SQL     9
Dereje  Tamiru   Oracle  5
Selam   Belay    Prolog  8
Alem    Kebede   Cisco   7
Girma   Dereje   IP      4
Yared   Gizaw    Java    6

30 If we want the name, skill, and skill level of the employees with Skill SQL and SkillLevel greater than 5, the query will be:
π FName, LName, Skill, SkillLevel (σ Skill = "SQL" ∧ SkillLevel > 5 (Employee))
FName  LName   Skill  SkillLevel
Chane  Kebede  SQL    10
Almaz  Belay   SQL    9
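Projection with duplicate elimination, and its composition with selection, can be sketched in a few lines of Python. The function names `project` and `select` are illustrative, and only the five columns needed here are copied from the sample Employee table.

```python
COLUMNS = ("FName", "LName", "Skill", "SkillType", "SkillLevel")
ROWS = [
    ("Abebe", "Mekuria", "SQL", "Database", 5),
    ("Lemma", "Alemu", "C++", "Programming", 6),
    ("Chane", "Kebede", "SQL", "Database", 10),
    ("Abera", "Taye", "VB6", "Programming", 8),
    ("Almaz", "Belay", "SQL", "Database", 9),
    ("Dereje", "Tamiru", "Oracle", "Database", 5),
    ("Selam", "Belay", "Prolog", "Programming", 8),
    ("Alem", "Kebede", "Cisco", "Networking", 7),
    ("Girma", "Dereje", "IP", "Programming", 4),
    ("Yared", "Gizaw", "Java", "Programming", 6),
]
employee = [dict(zip(COLUMNS, row)) for row in ROWS]


def select(relation, condition):
    """Selection: keep only the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    """Projection: keep only attrs and drop duplicate result tuples."""
    seen = set()
    result = []
    for t in relation:
        value = tuple(t[a] for a in attrs)
        if value not in seen:  # duplicate elimination keeps the result a relation
            seen.add(value)
            result.append(dict(zip(attrs, value)))
    return result


# Projecting on a non-key attribute collapses the ten rows to three.
skill_types = project(employee, ["SkillType"])

# Projection applied to the output of a selection (operator composition).
sql_experts = project(
    select(employee, lambda t: t["Skill"] == "SQL" and t["SkillLevel"] > 5),
    ["FName", "LName", "Skill", "SkillLevel"],
)

print(skill_types)
print(sql_experts)
```

Abebe Mekuria also has the SQL skill but is excluded, because a SkillLevel of 5 is not greater than 5.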

31 3. Cartesian Product (Cross Product)
– This operation is used to combine tuples from two relations in a combinatorial fashion.
– That means every tuple in Relation1 (R) will be paired with every tuple in Relation2 (S).
– In general, the result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is a relation Q with degree n + m and attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order, where R has n attributes and S has m attributes.
– The resulting relation Q has one tuple for each combination of tuples: one from R and one from S.

32 Hence, if R has n tuples and S has m tuples (here n and m count tuples, not attributes), then R × S will have n * m tuples.
Example:
Employee
ID   FName  LName
123  Abebe  Lemma
567  Belay  Taye
822  Kefle  Kebede
Dept
DeptID  DeptName   MangID
2       Finance    567
3       Personnel  123

33 Then the Cartesian product of the Employee and Dept relations will be of the form:
Employee × Dept:
ID   FName  LName   DeptID  DeptName   MangID
123  Abebe  Lemma   2       Finance    567
123  Abebe  Lemma   3       Personnel  123
567  Belay  Taye    2       Finance    567
567  Belay  Taye    3       Personnel  123
822  Kefle  Kebede  2       Finance    567
822  Kefle  Kebede  3       Personnel  123

34 Even though it is very important in query processing, the Cartesian Product is not useful by itself, since it relates every tuple in the first relation with every tuple in the second relation. Thus, to make use of the Cartesian Product, one has to combine it with the Selection operation, which filters the tuples of a relation by testing whether each one satisfies the selection condition. In our example, to extract employee information about the managers of each department, the algebra query will be:

35 π<attribute list>(σ ID = MangID (Employee × Dept))
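The manager query can be sketched in Python by composing the three operations. This is a minimal illustration under assumptions: the helper names are hypothetical, the selection condition ID = MangID is inferred from the description of the query, and the projection list (FName, LName, DeptName) is a choice made here, since the slide's own attribute list did not survive.

```python
employee = [
    {"ID": 123, "FName": "Abebe", "LName": "Lemma"},
    {"ID": 567, "FName": "Belay", "LName": "Taye"},
    {"ID": 822, "FName": "Kefle", "LName": "Kebede"},
]
dept = [
    {"DeptID": 2, "DeptName": "Finance", "MangID": 567},
    {"DeptID": 3, "DeptName": "Personnel", "MangID": 123},
]


def cross_product(r, s):
    """R x S: one combined tuple per pair (assumes no shared attribute names)."""
    return [{**t, **u} for t in r for u in s]


def select(relation, condition):
    """Selection: keep only the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    """Projection with duplicate elimination."""
    seen, result = set(), []
    for t in relation:
        value = tuple(t[a] for a in attrs)
        if value not in seen:
            seen.add(value)
            result.append(dict(zip(attrs, value)))
    return result


pairs = cross_product(employee, dept)                         # 3 * 2 = 6 tuples
managers = select(pairs, lambda t: t["ID"] == t["MangID"])    # keep matching pairs
result = project(managers, ["FName", "LName", "DeptName"])    # assumed attribute list

for row in result:
    print(row["FName"], row["LName"], row["DeptName"])
```

The selection discards the four combinations in which the employee is not the department's manager, which is what turns the otherwise uninformative product into a useful answer.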

