Presentation is loading. Please wait.

Presentation is loading. Please wait.

University of Milano Bicocca, Italy Carlo Batini

Similar presentations


Presentation on theme: "University of Milano Bicocca, Italy Carlo Batini"— Presentation transcript:

1 University of Milano Bicocca, Italy Carlo Batini
Course in Data Base Design Conceptual Design: qualities and methods

2 Introduction to strategies and qualities Motivating examples

3 Requirements “not toy example”
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address.

4 Example of complex step that cannot be performed“one shot”
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. Database Design 1. Student (Student Id, Last Name, First Name, Type) 2. Course (Course Id, Year of Enrollment) 3. Enrolled in (Student Id, Course Id) 4. Passed (Student Id, Course Id, Grade, Date) 5. Professor (Last Name, First Name, Type, Department Name) 6. Department (Name, Address) 7. Full Professor (Last Name, First Name, City of Birth) 8. City (Name, Region) 9. Foreign Student (Student Id, Country) 10. Chinese Student (Student Id, City of Birth) 11. Country (Name, Continent)

5 First strategy: two phases instead of one phase (divide et impera)

6 Phase 1 Phase 2 Phase 1 - Conceptual Design Phase 2 - Logical Design
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. First Name Last Name Born in one many ID Teach Name Region Associate Professor Full Professor Belongs to Address Year of Enrollment Passed Enrolled in Grade Date Chinese St. Foreign St. Born Continent Department Country Student City Course First Name Last Name Born in one many ID Teach Name Region Associate Professor Full Professor Belongs to Address Year of Enrollment Passed Enrolled in Grade Date Chinese St. Foreign St. Born Continent Department Country Student City Course Phase 1 - Conceptual Design Phase 1 1. Student (Student Id, Last Name, First Name, Type) 2. Course (Course Id, Year of Enrollment) 3. Enrolled in (Student Id, Course Id) 4. Passed (Student Id, Course Id, Grade, Date) 5. Professor (Last Name, First Name, Type, Department Name) 6. Department (Name, Address) 7. Full Professor (Last Name, First Name, City of Birth) 8. City (Name, Region) 9. Foreign Student (Student Id, Country) 10. Chinese Student (Student Id, City of Birth) 11. Country (Name, Continent) Phase 2 - Logical Design Phase 2

7 Not so easy…. Phase 1 - Conceptual Design
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. Not so easy…. Phase 1 - Conceptual Design First Name Last Name Born in one many ID Teach Name Region Associate Professor Full Professor Belongs to Address Year of Enrollment Passed Enrolled in Grade Date Chinese St. Foreign St. Born Continent Department Country Student City Course

8 Second strategy: proceed step by step (manage complexity)

9 Need for strategies in conceptual design
Teach one many Associate Professor Full Professor Related Chinese St. Foreign St. Student Course Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. Step 1 Step 2 First Name Last Name Born in one many ID Teach Name Region Associate Professor Full Professor Belongs to Address Year of Enrollment Passed Enrolled in Grade Date Chinese St. Foreign St. Born Continent Department Country Student City Course Teach one many Associate Professor Full Professor Related Chinese St. Foreign St. Student Course

10 The issue of schema quality

11 Different schemas for the same requirements….
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. First Name Last Name Born in one many ID Teach Name Region Associate Professor Full Professor Belongs to Address Year of Enrollment Passed Enrolled in Grade Date Chinese St. Foreign St. Born Continent Department Country Student City Course Schema 2 Student ID Last Name First Name Country of Birth Continent of Birth City of Birth Course Id Course Name …… ……. Student Schema 1

12 The same in the relational model
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. 1. Student (Student Id, Last Name, First Name, Type) 2. Course (Course Id, Year of Enrollment) 3. Enrolled in (Student Id, Course Id) 4. Passed (Student Id, Course Id, Grade, Date) 5. Professor (Last Name, First Name, Type, Department Name) 6. Department (Name, Address) 7. Full Professor (Last Name, First Name, City of Birth) 8. City (Name, Region) 9. Foreign Student (Student Id, Country) 10. Chinese Student (Student Id, City of Birth) 11. Country (Name, Continent) Schema 2 1. Student (Student Id, Last Name, First Name, Type, Enrolled Course Id, Year of Enrollment, Passed course Id, Grade, Date, Professor Last Name, Professor First Name, Professor Type, Department Name, Department Address, Professor City of Birth, Region, Foreign Student Country of Birth, Foreign Student Continent, Chinese Student City of Birth, Chinese Student Region of Birth) Schema 1

13 Motivating example: find “errors” in schemas
Students of a University attend courses, and pass them. Courses have a last name, a first name and a year of enrollment. Students have an ID, a Last Name and First Name. For Chinese Students we want to represent City of Birth and Region of Birth. For Foreign Students we want to represent the Country of Birth and the Continent of Birth. A Course passed by a Student has a grade and a date. Courses are taught by professors. Professors have a last name and a first name. They may be Associate Professors or Full professors, only for Full Professors we are interested to the City of Birth, with name and region. Every professor belongs to a Department. Departments have a Name and an address. ? ? Schema 2 Schema 1 Last Name Last Name ID First Name Name Year of Enrollment ID Name Year of Enrollment many many one many Student Enrolled in Course Student Enrolled in Course many many one many Passed Passed Teach one Teach City of Birth one Country Grade Date Foreign St. Chinese St. Grade Date Address Name many First Name many Born many one one First Name Born many one Professor many one Professor many Department Belongs to Located at Department Belongs to Last Name one one Last Name Country Name Name Scientific Area Full Professor Associate Professor Name Continent Address Name Name many one many Region City Born in Region City Born in many

14 Contents Qualities of a conceptual schema
Strategies for conceptual design

15 Schema Quality Dimensions
@C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

16 Correctness with respect to the model
Concerns the correct use of the categories of the model in representing requirements. Example – In the Entity Relationship model we may represent the logical link between persons and their first names using the two entities Person and FirstName and a relationship between them. The schema is not correct wrt the model since an entity should be used only when the concept has a unique existence in the real world and has an identifier. Student (1,n) has (1,1) First Name @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

17 Correctness with respect to requirements
Concerns the correct representation of the requirements in terms of the model categories. Example - In an organization each department is headed by exactly one manager and each manager may head exactly one department. If we represent Manager and Department as entities, the Relationship between them should be one-to-one; in this case, the Schema is correct wrt requirements. If we Use a one-to-many relationship, the schema is incorrect. Manager (1,n) heads (1,1) Department @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

18 Minimality/Redundancy
A schema is minimal if every part of the requirements is represented only once in the schema. In other words, it is not possible to eliminate some element from the schema without compromising the information content. Student 1,n Attends 1,n Course Assigned to 1,? Teaches 1,n Instructor 1,n @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

19 @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008
Completeness Measures the extent to which a conceptual schema includes all the conceptual elements necessary to meet specified requirements. It is possible that the designer has not included certain characteristics present in the requirements in the schema, e.g., attributes related to an entity Person; in this case, the schema is incomplete. Students have a code, a name, a place of birth. @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

20 @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008
Completeness Measures the extent to which a conceptual schema includes all the conceptual elements necessary to meet specified requirements. It is possible that the designer has not included certain characteristics present in the requirements in the schema, e.g., attributes related to an entity Person; in this case, the schema is incomplete. Students have a code, a name, a place of birth. Student Id Student Student Name @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

21 @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008
Pertinence Measures how many unnecessary conceptual elements are included in the conceptual schema. In the case of a schema that is not pertinent, the designer has Gone too far in modeling the requirements, and has included too many concepts. Students have a code and a name. @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

22 Pertinence Measures how many unnecessary conceptual
Students have a code and a name. Student Id Student Name Student Place of Birth Measures how many unnecessary conceptual elements are included in the conceptual schema. In the case of a schema that is not pertinent, the designer has Gone too far in modeling the requirements, and has included too many concepts.

23 How to check and achieve Completeness and Pertinence

24 Exercise 4.1 - Completeness
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment, Number of hours. Student pass Exams, with Grade and Date. Name Student Student ID City Given Name Country born many one Surname many Exam Grade many Course Course ID Course Name Year of Enrollment Number of hours

25 Exercise 4.1 - Completeness
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment, Number of hours. Student pass Exams, with Grade and Date. Name Student Student ID City Country born Given Name many one Surname many Exam Grade many Course Course ID Course Name Year of Enrollment Number of hours

26 Exercise 4.1 - Completeness
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment, Number of hours. Student pass Exams, with Grade and Date. Name Student Student ID City Country born Given Name many one Surname many Exam Grade many Course Course ID Course Name Year of Enrollment Number of hours

27 Exercise 4.1 - Completeness
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment, Number of hours. Student pass Exams, with Grade and Date. Name Student Student ID City Country born Given Name many one Surname many Exam Grade many Course Course ID Course Name Year of Enrollment Number of hours

28 Exercise 4.1 - Completeness
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment, Number of hours. Student pass Exams, with Grade and Date. Name Student Student ID City Country born Given Name many one Surname many Exam Grade The schema is lacking: - Date of Birth of Students - Date of Exams many Course Course ID Course Name Year of Enrollment Number of hours

29 Exercise 4.2 – Pertinence – Do yourself
Students have Student Id, Given Name, Surname, Date of Birth, City of Birth, Country of Birth. Courses have Course Id, Name, Year of Enrollment. Student pass Exams, with grade. Name Student Student ID City Given Name Country born many one Surname many Exam Grade many Course Course ID Course Name Year of Enrollment Number of hours

30 Readability Intuitively, a schema is readable whenever it represents
the meaning of the reality represented by the schema in a clear way for its intended use. This simple, qualitative definition is not easy to translate in a more formal way, since the evaluation expressed by the word clearly conveys some elements of subjectivity. In models, such as the Entity Relationship model, that provide a graphical representation of the schema, readability concerns both the diagram and the schema itself.

31 Unreadable schema Employee Department Floor Purchase Vendor Warehouse
Of Order Worker Engineer City Born Warehouse Warranty Type In Employee Vendor Acquires Works Head Floor Located Department Produces Item Manages

32 Diagrammatic Readability as embedding in a grid
Floor Located Manages Department Head Employee Born City Works Produces Vendor Worker Engineer Item In Warranty Type Warehouse Order Purchase Acquires Of

33 A Readable schema

34 Diagrammatic readability
With regard to the diagrammatic representation, readability can be expressed objectively by a number of aesthetic criteria that human beings adopt in drawing diagrams: crossings between lines should be minimized, graphic symbols should be embedded in a grid, lines should be made of horizontal or vertical segments, The number of bends in lines should be minimized, the total area of the diagram should be minimized, and, finally, Parents in generalization hierarchies should be positioned at a higher level in the diagram in respect to children. The children entities in the generalization hierarchy should be symmetrical with respect to the parent entity.

35 Our example Embedded in a grid No crossings in lines
All lines are horizontal or vertical Childs in generalizations symmetric w.r.t. parent Parent upon childs Purchase Of Order Worker Engineer City Born Warehouse Warranty Type In Employee Vendor Acquires Works Head Floor Located Department Produces Item Manages Floor Located Manages Depart ment Head Employee Born City Works Produces Vendor Worker Engineer Item In Type Warehouse Warranty Acquires Order Of Purchase

36 A Readable schema Department Employee Item Order Purchase Worker
Of Worker Engineer City Born Warranty Type Employee Vendor Acquires Works Head Floor Located Department Produces Item Warehouse In Manages

37 @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008
Compactness The second issue addressed by readability is the compactness of schema representation. Among the different conceptual schemas that equivalently represent a certain reality, we prefer the one or the ones that are more compact, because compactness favors readability. @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

38 Transformation that preserves information content
and enhances compactness Employee Vendor Worker Engineer Born Born City Born @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

39 Transformation the preserves information content
and enhances compactness Worker Engineer City Born Employee Vendor @C.Batini, F. Cabitza, G. Pasi, R. Schettini, 2008

40 Normalization in the ER model
Employee-Project Employee Id Surname Given Name Project Id % of time of Employee in Project Project Name

41 Normalization Employee Project Employee Id Surname Given Name
Assigned to (1,n) Employee Id Surname Given Name Project Id % of time of Employee in Project Project Name

42 Unnormalized ER schema
Normalization Normalized ER schema Employee-Project Employee # Salary Project # Budget Role Employee Project Assigned to 1,n Unnormalized ER schema

43 Strategies in Conceptual Design

44 Bottom up Start from attributes, then choose entities, then relationships, finally is a and generalization hierarchies

45 Inside out Start from the most important entity, and assign attributes. Then move to adjacent entities, relationships and is-a and generalization hierachies, until all the schema is represented.

46 Top down Start from one entity, then refine the entity into a more articulated schema, then for each concept of the schema apply further refinements until all the schema is produced.

47 Top down Final refinement Requirements in natural language Initial
Second refinement Requirements in natural language Initial schema

48 We compare three strategies
Bottom up Inside Out Top down

49 Bottom up Course Student Foreign St. Chinese St. Country
We are in China. Students are of two types, Foreign Students and Chinese Students. For both of them we want to represent ID, First Name and Last Name. Furthermore, for Foreign Students we want to represent the Country where they are born with Name and Continent. Students are related to courses, Courses have Name and Year of Enrollment. Students are enrolled in courses; furthermore they pass Course with Grade and Date. Name Year of Enrollment Enrolled in many ID Last Name First Name Country Chinese St. Foreign St. Course Student Passed many Grade Date ID Last Name First Name ID Last Name First Name Born many one Name Continent

50 Inside Out Student Course Foreign St. Chinese St. Country
We are in China. Students are of two types, Foreign Students and Chinese Students. For both of them we want to represent ID, First Name and Last Name. Furthermore, for Foreign Students we want to represent the Country where they are born with Name and Continent. Students are related to courses, Courses have Name and Year of Enrollment. Students are enrolled in courses; furthermore they pass Course with Grade and Date. Name Year of Enrollment Enrolled in many ID Last Name First Name Student Course Passed many Grade Date Chinese St. Foreign St. Born many one Country Name Continent

51 Top down Student Career Student Career Course Student Course Student
Name Year of Enrollment Course Student Career Passed Enrolled in many Student Course Student ID Last Name First Name Chinese St. Foreign St. Student Student Course Related Grade Date Passed many one Country Foreign St. Born Name Continent Country Name Year of Enrollment Course Foreign St. many one Country Born Chinese St. Foreign St. Student Student Career Course Related Related Passed Enrolled in many Student ID Last Name First Name Grade Date Passed Name Continent Country

52 Strategies compared Strategies / Pros and cons Pros Cons Bottom up
Intuitive and «easy» Forces restructurings Oil stain Compromise between bottom up and top down Some restructuring Top down Need for an holistic view – difficult to achieve No restructuring

53 Lesson Exercise 4.2

54 Exercise 4.2 Study the following data requirements for a hospital database and produce a conceptual schema using these strategies: The bottom up strategy The inside out strategy The top-down strategy The hospital database stores data about patients, their admission and discharge from hospital’s departments, and their treatments. For each patient, we know the name, address, sex, social security number, and insurance code. For each department, we know the department’s name, its location, the name of the doctor who heads it, the number of beds available, and the number of beds occupied. Each patient goes through multiple treatments during hospitalization; for each treatment, we store its name, duration, and the possible reactions to it the patient may have.

55 Inside out Strategy

56 Patient schema – Inside Out Strategy
Name Address Sex SSN Insurance Code Location Name Name of head # of beds available # of beds occupied Patient Department Passed many Admission date Discharge receive many Duration Name Reactions (1,n) Treatment

57 Patient schema – Bottom up Strategy
The hospital database stores data about patients, their admission and discharge from hospital’s departments, and their treatments. For each patient, we know the name, address, sex, social security number, and insurance code. For each department, we know the department’s name, its location, the name of the doctor who heads it, the number of beds available, and the number of beds occupied. Each patient goes through multiple treatments during hospitalization; for each treatment, we store its name, duration, and the possible reactions to it the patient may have. Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Admission date Discharge date Name Reactions (1,n) Duration

58 Bottom up Strategy

59 Patient schema – Bottom up Strategy - 1
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Admission date Discharge Duration Name Reactions (1,n)

60 Patient schema – Bottom up Strategy - 2
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department Admission date Discharge Duration Name Reactions (1,n) Treatment

61 Patient schema – Bottom up Strategy - 3
Admitted many Patient Department Treatment Name Address Sex SSN Insurance Code Location Name of head # of beds available # of beds occupied Reactions (1,n) Duration Admission date Discharge Receives

62 Patient – Inside Out - 1 Patient Name Address Sex SSN Insurance Code

63 Patient – Inside Out - 2 Patient Department Admitted Name Address Sex
SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department many many Admitted Admission date Discharge

64 Patient – Inside Out - 3 Patient Department Treatment Admitted
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department many many Admitted Admission date Discharge many Receives Duration many Name Reactions (1,n) Treatment

65 Patient schema – Top down Strategy

66 Patient schema – Top down Strategy
Therapies

67 Patient schema – Top down Strategy
Patient and Treatments Department Admitted many

68 Patient schema – Top down Strategy
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient and Treatments Department Admitted many

69 Patient schema – Top down Strategy
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient and Treatments Department Admitted many Admission date Discharge

70 Patient schema – Top down Strategy
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department Admitted many Admission date Discharge Receives many Treatment

71 Patient schema – Top down Strategy
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department Admitted many Admission date Discharge Receives many Name Reactions (1,n) Treatment

72 Patient schema – Top down Strategy
Name Address Sex SSN Insurance Code Name Location Name of head # of beds available # of beds occupied Patient Department Admitted many Admission date Discharge Receives many Duration Name Reactions (1,n) Treatment


Download ppt "University of Milano Bicocca, Italy Carlo Batini"

Similar presentations


Ads by Google