1. Explain the following concepts: (a) superkey (b) key (c) candidate key (d) primary key (e) foreign key key constraints a superkey is any combination of attributes that uniquely identify a tuple: t1[superkey] t2[superkey]. a key is superkey that has a minimal set of attributes Sept. 2004 91.3902 Yangjun Chen
one candidate key is chosen as the primary key (PK) If a relation schema has more than one key, each of them is called a candidate key. one candidate key is chosen as the primary key (PK) foreign key (FK) is defined as follows: i) Consider two relation schemas R1 and R2; ii) The attributes in FK in R1 have the same domain(s) as the primary key attributes PK in R2; the attributes FK are said to reference or refer to the relation R2; iii) A value of FK in a tuple t1 of the current state r(R1) either occurs as a value of PK for some tuple t2 in the current state r(R2) or is null. In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers to the tuple t2. Sept. 2004 91.3902 Yangjun Chen
2. Draw an ER-diagram to describe the following real world problem. (a) A university is organized into faculties. (b) Each faculty has a unique name, ID and number of professors and a specific professor is chosen as the faculty head. (c) Each faculty provides a number of courses. (d) Each course has a unique name and courseID. (e) Each professor has a name, SIN, address, salary, sex and courses taught by him/her. (f) Each professor belongs to a faculty and can teach several sections of a course. (g) Each student has a name, ID, SIN, address, GPA, sex, and major. (h) Each student can choose one faculty as his/her major faculty and take several courses with certain credit hours. Some of the courses are mandatory and some are optional. Sept. 2004 91.3902 Yangjun Chen
ER-model: professor faculty course student belong teach head provide salary name ID addr. SIN belong N 1 professor NoProf F-Id sex F-name M startdate 1 teach section-IDs faculty head 1 M N couresId 1 provide N course name choose N ID N addr. SIN take M student major mandatory-optional creditHours name sex birthdate Sept. 2004 91.3902 Yangjun Chen
- collision resolution strategy: chaining 3. Linear Hashing - collision resolution strategy: chaining - split rule: load factor > 0.7 - initially M = 4 (M: size of the primary area) - hash functions: hi(key) = key mod 2i M (i = 0, 1, 2, …) - bucket capacity = 2 Trace the insertion process of the following keys into a linear hashing file: 24, 7, 10, 5, 14, 4, 1, 8, 2, 3, 17, 13, 15. Sept. 2004 91.3902 Yangjun Chen
when inserting the sixth record we would have The first phase – phase0 when inserting the sixth record we would have but the load factor 6/8= 0.75 > 0.70 and so bucket 0 must be split (using h1 = Key mod 2M): 24 4 5 10 14 7 n=0 before the split (n is the point to the bucket to be split.) 0 1 2 3 24 5 10 14 7 4 n=1 after the split load factor: 6/10=0.6 no split 0 1 2 3 4 Sept. 2004 91.3902 Yangjun Chen
24 5 10 14 7 4 0 1 2 3 4 n=1 load factor: 7/10=0.7 no split 24 5 1 10 insert(1) 24 5 10 14 7 4 0 1 2 3 4 n=1 load factor: 7/10=0.7 no split 24 5 1 10 14 7 4 0 1 2 3 4 Sept. 2004 91.3902 Yangjun Chen
24 5 1 10 14 7 4 0 1 2 3 4 n=1 load factor: 8/10=0.8 split using h1. insert(8) 24 5 1 10 14 7 4 0 1 2 3 4 n=1 load factor: 8/10=0.8 split using h1. 24 8 5 1 10 14 7 4 Sept. 2004 91.3902 Yangjun Chen
24 8 1 10 14 7 4 5 0 1 2 3 4 5 n=2 load factor: 8/12=0.66 no split 24 0 1 2 3 4 5 n=2 load factor: 8/12=0.66 no split insert(2) 24 8 1 10 14 7 4 5 0 1 2 3 4 5 Sept. 2004 91.3902 Yangjun Chen
24 8 1 10 14 7 4 5 n=2 load factor: 9/12=0.75 split using h1. 2 overflow 2 0 1 2 3 4 5 24 8 1 10 2 7 4 5 14 Sept. 2004 91.3902 Yangjun Chen
insert(3) 24 8 1 10 2 7 4 5 14 24 8 1 10 2 7 3 4 5 14 n=3 load factor: 10/14=0.714 split.using h1. Sept. 2004 91.3902 Yangjun Chen
The second phase – phase1 n = 0; using h1 = Key mod 2M to insert and 24 8 1 10 2 3 4 5 14 7 n=4 The second phase – phase1 n = 0; using h1 = Key mod 2M to insert and h2 = Key mod 4M to split. insert(17) 8 24 1 10 2 3 4 5 14 7 Sept. 2004 91.3902 Yangjun Chen
8 24 1 17 10 2 3 4 5 14 7 n=0 load factor: 11/16=0.687 no split. 8 24 insert(13) 8 24 1 17 10 2 3 4 5 14 7 Sept. 2004 91.3902 Yangjun Chen
8 24 1 17 10 2 3 4 5 13 14 7 n=0 load factor: 12/16=0.75 split bucket 0, using h2. 1 17 10 2 3 4 5 13 14 7 8 24 Sept. 2004 91.3902 Yangjun Chen
insert(15) 1 17 10 2 3 4 5 13 14 7 8 24 1 17 10 2 3 4 5 13 14 7 15 8 24 n=1 load factor: 13/18=0.722 split bucket 1, using h2. 1 17 10 2 3 4 5 13 14 7 15 8 24 Sept. 2004 91.3902 Yangjun Chen
Here, we assume that each internal node can contain at most two 4. Given the following B+-tree, trace the deletion sequence: h, c, e, f. Here, we assume that each internal node can contain at most two keys and each leaf node can contain at most two value/point pairs. c b e a b c e f h Sept. 2004 91.3902 Yangjun Chen
f e a b c f e a c b Sept. 2004 91.3902 Yangjun Chen
c a e a b f a e a b f a b a Sept. 2004 91.3902 Yangjun Chen
5. Given the relation schemas shown in Fig. 1, construct expressions (using relational algebraic operations) to evaluate the following query: Find the names of employees who works on all the projects controlled by department Number ‘Business Computing’. EMPLOYEE fname, minit, lname, ssn, bdate, address, sex, salary, superssn, dno DEPARTMENT Dname, dnumber, mgrssn, mgrstartdate PROJECT WORKS_ON Pname, pnumber, plocation, dnum Essn pno, hours Sept. 2004 91.3902 Yangjun Chen
: DN dnumber(Dname = ‘Business Computing’(Department)) DEPT_P PNUMBER(DNDN.dnumber = PROJECT.dnumber(PROJECT)) EMP_PNOS ESSN,PNO(WORK_ON) SSNS EMP_PNOS : DEPT_P RESULT FNAME, LNAME(SSNS * EMPLOYEE) Sept. 2004 91.3902 Yangjun Chen
Pname, pnumber, plocation, dnum 6. Given the relation schemas as shown in Fig. 2, construct a SQL clause to evaluate the following query: For each department, retrieve the department number, the department name, the depatment manager, and the number of employees who work in that department. PROJECT WORKS ON Pname, pnumber, plocation, dnum Essn pno, hours Sept. 2004 91.3902 Yangjun Chen
SELECT x.Dname, x.dnumber, z.fname, z.lname, count(y.ssn) FROM department x, employee y, employee z WHERE x.Dnumber=y.dno and x.mgrssn=z.ssn Group by x.dnumber The following SQL statement is wrong. SELECT Dname, dnumber, fname, lname, count(ssn) FROM department , employee WHERE Dnumber=dno and mgrssn=ssn Group by dnumber Sept. 2004 91.3902 Yangjun Chen
4. Trace the insertion process of the following keys into an empty b+-tree: 7, 6, 5, 3, 1, 2. Here, we assume that each internal node can contain at most two keys and each leaf node can contain at most two value/point pairs. 7 6 7 Sept. 2004 91.3902 Yangjun Chen
5 6 7 5 5 6 3 5 7 6 Sept. 2004 91.3902 Yangjun Chen
5 3 6 1 3 5 6 7 3 6 2 3 3 5 1 2 6 7 Sept. 2004 91.3902 Yangjun Chen