1.(5) Describe the working process with a database system. a) database definition By the database definition, we create an entity-relationship diagram for a realistic world problem, which is then transformed into a database schema. b) populating tables with data The second phase is the database construction, by which a database is initially populated with data. c) database manipulation Once a database is constructed, one can conduct some operation on it, such as data insertion, data deletion and data updating, which changes the database from a state to another state. It is the responsibility of a database system to guarantee that the database goes from one correct state to another. Sept. 2012 ACS-3902 Yangjun Chen
2.(10) Describe the three-schema architecture of a database system and the main functionality of each level in this architecture. a) internal level The first level is the internal level, which possesses an internal schema. It describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database. b) conceptual level The second level is the conceptual level, which has a conceptual schema. It describes the structure of the whole database for a community of users. The conceptual level hides the details of physical structures and concentrates on entity types, data types, relationships, user operations, and constraints. A high level data model or an implementation data model can be used at this level. Sept. 2012 ACS-3902 Yangjun Chen
c) external level The third level is the so-called external level or the view level, which includes a number of external schemas or user views. It describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. A high level data model or the implementation data model can be used at this level. Sept. 2012 ACS-3902 Yangjun Chen
3.(10) Explain the following concepts: (a) superkey (b) key (c) candidate key (d) primary key (e) foreign key key constraints a superkey is any combination of attributes that uniquely identify a tuple: t1[superkey] t2[superkey]. a key is superkey that has a minimal set of attributes Sept. 2012 ACS-3902 Yangjun Chen
one candidate key is chosen as the primary key (PK) If a relation schema has more than one key, each of them is called a candidate key. one candidate key is chosen as the primary key (PK) foreign key (FK) is defined as follows: i) Consider two relation schemas R1 and R2; ii) The attributes in FK in R1 have the same domain(s) as the primary key attributes PK in R2; the attributes FK are said to reference or refer to the relation R2; iii) A value of FK in a tuple t1 of the current state r(R1) either occurs as a value of PK for some tuple t2 in the current state r(R2) or is null. In the former case, we have t1[FK] = t2[PK], and we say that the tuple t1 references or refers to the tuple t2. Sept. 2012 ACS-3902 Yangjun Chen
(a) A university is organized into faculties. 4.(15) Draw an ER-diagram to describe the following real world problem. (a) A university is organized into faculties. (b) Each faculty has a unique name, ID and number of professors and a specific professor is chosen as the faculty head. (c) Each faculty provides a number of courses. (d) Each course has a unique name and courseID. (e) Each professor has a name, SIN, address, salary, sex and courses taught by him/her. (f) Each professor belongs to a faculty and can teach several sections of a course. (g) Each student has a name, ID, SIN, address, GPA, sex, and major. (h) Each student can choose one faculty as his/her major faculty and take several courses with certain credit hours. Some of the courses are mandatory and some are optional. Sept. 2012 ACS-3902 Yangjun Chen
ER-model: professor faculty sections course student belong teach head salary name ID addr. SIN belong N 1 professor NoProf F-Id sex F-name M startdate 1 faculty head teach 1 N M 1 sections sectionId provide N M choose has 1 course N ID name addr. N SIN take M couresId student major mandatory-optional creditHours name sex birthdate Sept. 2012 ACS-3902 Yangjun Chen
- collision resolution strategy: chaining 5.(30) Linear Hashing - collision resolution strategy: chaining - split rule: load factor > 0.7 - initially M = 4 (M: size of the primary area) - hash functions: hi(key) = key mod 2i M (i = 0, 1, 2, …) - bucket capacity = 2 Trace the insertion process of the following keys into a linear hashing file: 24, 7, 10, 5, 14, 4, 1, 8, 2, 3, 17, 13, 15. Sept. 2012 ACS-3902 Yangjun Chen
when inserting the sixth record we would have The first phase – phase0 when inserting the sixth record we would have but the load factor 6/8= 0.75 > 0.70 and so bucket 0 must be split (using h1 = Key mod 2M): 24 4 5 10 14 7 n=0 before the split (n is the point to the bucket to be split.) 0 1 2 3 24 5 10 14 7 4 n=1 after the split load factor: 6/10=0.6 no split 0 1 2 3 4 Sept. 2012 ACS-3902 Yangjun Chen
24 5 10 14 7 4 0 1 2 3 4 n=1 load factor: 7/10=0.7 no split 24 5 1 10 insert(1) 24 5 10 14 7 4 0 1 2 3 4 n=1 load factor: 7/10=0.7 no split 24 5 1 10 14 7 4 0 1 2 3 4 Sept. 2012 ACS-3902 Yangjun Chen
24 5 1 10 14 7 4 0 1 2 3 4 n=1 load factor: 8/10=0.8 split using h1. insert(8) 24 5 1 10 14 7 4 0 1 2 3 4 n=1 load factor: 8/10=0.8 split using h1. 24 8 5 1 10 14 7 4 Sept. 2012 ACS-3902 Yangjun Chen
24 8 1 10 14 7 4 5 0 1 2 3 4 5 n=2 load factor: 8/12=0.66 no split 24 0 1 2 3 4 5 n=2 load factor: 8/12=0.66 no split insert(2) 24 8 1 10 14 7 4 5 0 1 2 3 4 5 Sept. 2012 ACS-3902 Yangjun Chen
24 8 1 10 14 7 4 5 n=2 load factor: 9/12=0.75 split using h1. 2 overflow 2 0 1 2 3 4 5 24 8 1 10 2 7 4 5 14 Sept. 2012 ACS-3902 Yangjun Chen
insert(3) 24 8 1 10 2 7 4 5 14 24 8 1 10 2 7 3 4 5 14 n=3 load factor: 10/14=0.714 split using h1. Sept. 2012 ACS-3902 Yangjun Chen
The second phase – phase1 n = 0; using h1 = Key mod 2M to insert and 24 8 1 10 2 3 4 5 14 7 n=4 The second phase – phase1 n = 0; using h1 = Key mod 2M to insert and h2 = Key mod 4M to split. insert(17) 8 24 1 10 2 3 4 5 14 7 Sept. 2012 ACS-3902 Yangjun Chen
8 24 1 17 10 2 3 4 5 14 7 n=0 load factor: 11/16=0.687 no split. 8 24 insert(13) 8 24 1 17 10 2 3 4 5 14 7 Sept. 2012 ACS-3902 Yangjun Chen
8 24 1 17 10 2 3 4 5 13 14 7 n=0 load factor: 12/16=0.75 split bucket 0, using h2. 1 17 10 2 3 4 5 13 14 7 8 24 Sept. 2012 ACS-3902 Yangjun Chen
insert(15) 1 17 10 2 3 4 5 13 14 7 8 24 1 17 10 2 3 4 5 13 14 7 15 8 24 n=1 load factor: 13/18=0.722 split bucket 1, using h2. 1 17 10 2 3 4 5 13 14 7 15 8 24 Sept. 2012 ACS-3902 Yangjun Chen
6.(15) Given the following B+-tree, trace the deletion sequence: h, c, e, f. Here, we assume that each internal node can contain at most two keys and each leaf node can contain at most two value/point pairs. c b f a b c e f h Fig. 1 Sept. 2012 ACS-3902 Yangjun Chen
remove h: f e a b c remove c: f e a c b Sept. 2012 ACS-3902 Yangjun Chen
remove e: c a e a b f remove f: a e a b f a b a Sept. 2012 ACS-3902 Yangjun Chen
Find the names of employees who works on all the projects 7.(15) Given the relation schemas shown in Fig. 2, construct expressions (using relational algebraic operations) to evaluate the following query: Find the names of employees who works on all the projects controlled by department ‘Business Computing’. EMPLOYEE fname, minit, lname, ssn, bdate, address, sex, salary, superssn, dno DEPARTMENT Dname, dnumber, mgrssn, mgrstartdate Fig. 2 PROJECT WORKS_ON Pname, pnumber, plocation, dnum Essn pno, hours Sept. 2012 ACS-3902 Yangjun Chen
: DN dnumber(Dname = ‘Business Computing’(Department)) DEPT_P PNUMBER(DNDN.dnumber = PROJECT.dnumber(PROJECT)) EMP_PNOS ESSN,PNO(WORK_ON) SSNS EMP_PNOS : DEPT_P RESULT FNAME, LNAME(SSNS * EMPLOYEE) Sept. 2012 ACS-3902 Yangjun Chen