1 Data modeling Process

2 Copyright © 2007 - CIST 2 Definition
What is data modeling?
– Identify the real-world data that must be stored in the database
– Design the way the data will be organized in the database
Different levels of data modeling:
– Conceptual data model
– Logical data model
– Physical data model

3 Conceptual data model
High-level model used at an early stage of the project
– Example: Moto rental software
[Diagram: entities Customer and Moto, linked by the relationship "rent"]

4 Logical data model
Middle-level model used to explore the attributes and relationships of the entities of the system
– Example: Moto rental software
[Diagram: the same entities with their attributes, linked by the relationship "rent"]

5 Physical data model
Used to design the internal schema of a database, depicting the data tables, the attributes of these tables and the types of these attributes
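
As a rough sketch of what a physical data model pins down, the snippet below declares tables, attributes and attribute types for the moto rental example with Python's built-in sqlite3 module. The slides only name the entities Customer and Moto and the relationship "rent"; every column name and type here (customer_id, plate_number, rent_date and so on) is an assumption made for illustration.

```python
import sqlite3

# Hypothetical physical schema for the moto rental example.
# All column names and types are illustrative assumptions.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL
);
CREATE TABLE moto (
    moto_id      INTEGER PRIMARY KEY,
    plate_number TEXT NOT NULL UNIQUE
);
CREATE TABLE rent (
    rent_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    moto_id     INTEGER NOT NULL REFERENCES moto(moto_id),
    rent_date   TEXT NOT NULL  -- ISO date string
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that the physical model created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The conceptual and logical models would stop at boxes and lines; only at this level do table names, column types and constraints become concrete.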

6 Data modeling process
1. Identify entity types
2. Identify attributes
3. Identify relationships
4. Apply design patterns and normalize
   1. Multi-valued attributes
   2. Repeated attributes
   3. Sub keys
   4. Many to many
5. Assign keys
6. Apply naming conventions
7. (Denormalize to improve performance)

7 Data modeling: 1. Identify entity types

8 Identify entities
An entity type is similar to the concept of a class.
– A class has both data and behavior
– An entity type has only data
Create UML use case diagrams and class diagrams to identify the various entity types (classes of objects) that must be stored in the database.
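
To make the class / entity type distinction concrete, here is a minimal Python sketch. The names StudentEntity and StudentClass and their two attributes are hypothetical; the point is only that the entity type keeps the data and drops the behavior.

```python
from dataclasses import dataclass, fields


@dataclass
class StudentEntity:
    """Entity type: data only (attribute names are illustrative)."""
    first_name: str
    last_name: str


class StudentClass:
    """Class: the same data plus behavior."""

    def __init__(self, first_name: str, last_name: str) -> None:
        self.first_name = first_name
        self.last_name = last_name

    def full_name(self) -> str:
        # Behavior lives in the class, not in the database.
        return f"{self.first_name} {self.last_name}"


# Only the data fields would become columns of a Student table.
columns = [f.name for f in fields(StudentEntity)]
print(columns)
```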

9 Example
System: CIST course management software
– There are about 150 students at CIST. Each student belongs to one (and only one) of the 7 CIST classes. Each class follows one curriculum (DOP or SNA). Each curriculum gathers a set of courses.
– Each course is taught by one (and only one) teacher. One teacher can teach one or more courses.
We want to design software that can display:
– The list of students (first name and last name) in each curriculum
– The average age of students by curriculum and by class
– The number of courses taught by each teacher
– An information page for each teacher with first name, last name, address and phone number
– The list of students who scored less than 50% on the final exam
Identify the entity types (objects) to store in the database.

10 Solution: CIST course management software

11 Data modeling: 2. Identify attributes

12 Attributes
Identify the data that are worth saving in the system:
– Data that will be displayed by the system
– Data that will be used by an actor of the system
– Data that will be used by the system itself

13 Example
For each entity previously identified:
– Student
– Teacher
– Class
– Course
– Curriculum
add the attributes needed by the software.

14 Solution
Each requirement tells us which attributes to store:
– The list of students (first name and last name) in each curriculum => first name, last name
– The average age of students => date of birth
– An information page by teacher with first name, last name, address and phone number => first name, last name, address, phone number
– The list of students that scored less than 50% at the final exam => score

15 Solution
For each entity previously identified (student, teacher, class, course, curriculum), add the attributes needed by the CIST course management software.

16 Data modeling: 3. Identify relationships and multiplicities

17 Relationships
Once the entities and attributes of your system are identified, you need to specify the relationships and multiplicities between the entities you defined. In the real world, entities have relationships with other entities. For example, a Student IS IN one Class, and a Teacher TEACHES a Course. The relationships between entities are conceptually identical to the relationships between objects.

18 Relationships
Logical data model: course management software

19 Relationships
Multiplicities
– One student is in one and only one class / in one class, there is at least one student and there can be many students
– One class follows one and only one curriculum (DOP or SNA) / one curriculum is followed by at least one class and can be followed by many classes
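
One possible way to carry these multiplicities into a database is sketched below with Python's sqlite3 module. The "one and only one" side becomes a NOT NULL foreign key; the "at least one" side cannot be expressed by a foreign key alone and is left out of this sketch. Table names, column names and the sample rows are assumptions, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE curriculum (
    curriculum_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE
);
CREATE TABLE school_class (
    class_id      INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- one class follows one and only one curriculum
    curriculum_id INTEGER NOT NULL REFERENCES curriculum(curriculum_id)
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    -- one student is in one and only one class
    class_id   INTEGER NOT NULL REFERENCES school_class(class_id)
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO curriculum VALUES (?, ?)", [(1, "DOP"), (2, "SNA")])
conn.executemany("INSERT INTO school_class VALUES (?, ?, ?)",
                 [(1, "Class A", 1), (2, "Class B", 2)])
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)",
                 [(1, "Dara", "Chan", 1), (2, "Sophea", "Kim", 2)])

# A student without a class violates "one and only one class".
try:
    conn.execute("INSERT INTO student VALUES (3, 'No', 'Class', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The list of students in each curriculum, following both relationships.
rows = conn.execute("""
    SELECT cu.name, s.first_name, s.last_name
    FROM student AS s
    JOIN school_class AS c ON c.class_id = s.class_id
    JOIN curriculum AS cu ON cu.curriculum_id = c.curriculum_id
    ORDER BY cu.name, s.last_name
""").fetchall()
print(rejected, rows)
```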

20 Data modeling: 4. Apply design patterns and normalize
– Multi-valued attributes
– Repeated attributes
– Sub keys
– Many to many

21 Design patterns
There are some modeling situations that you will find in many systems. We refer to these as design patterns. Understanding the concepts behind each one will enable you to solve many modeling problems when you design a system's database.

22 Data modeling: Multi-Valued attributes

23 Multi-Valued attributes
Example of a table in a database
– Give the list of people having “singing” as a hobby

24 Multi-Valued attributes
Problem: searching for the list of people having one particular hobby is possible but very difficult.
The table is not in 1st Normal Form because it has multi-valued attributes.

25 Multi-Valued attributes
Solution: change the multi-valued table to a table in which each attribute holds a single value
– Give the list of people having “singing” as a hobby
Our table is now in 1st NORMAL FORM
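
The difference can be sketched with Python's sqlite3 module. The slide tables themselves are not reproduced in this transcript, so the table names, column names and the second contact ('Chan Dara') are assumptions; only Sok Pisey, Phnom Penh and the hobbies swimming and singing come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not in 1NF: several hobbies packed into one value
CREATE TABLE contact_multi (name TEXT, address TEXT, hobbies TEXT);
-- 1NF: one hobby per row
CREATE TABLE contact_1nf (name TEXT, address TEXT, hobby TEXT);
""")
conn.executemany("INSERT INTO contact_multi VALUES (?, ?, ?)", [
    ("Sok Pisey", "Phnom Penh", "swimming, singing"),
    ("Chan Dara", "Battambang", "reading"),  # hypothetical row
])
conn.executemany("INSERT INTO contact_1nf VALUES (?, ?, ?)", [
    ("Sok Pisey", "Phnom Penh", "swimming"),
    ("Sok Pisey", "Phnom Penh", "singing"),
    ("Chan Dara", "Battambang", "reading"),  # hypothetical row
])

# Multi-valued column: we are forced into fragile substring matching.
fragile = conn.execute(
    "SELECT name FROM contact_multi WHERE hobbies LIKE '%singing%'").fetchall()

# 1NF: a plain equality test is enough.
singers = conn.execute(
    "SELECT name FROM contact_1nf WHERE hobby = 'singing'").fetchall()
print(fragile, singers)
```

Both queries find the same person here, but the LIKE pattern would also match any hobby whose text merely contains "singing", which is part of why the multi-valued layout is hard to search reliably.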

26 Multi-Valued attributes
It is much easier to search for a particular hobby.
– However, imagine that Sok Pisey moves from Phnom Penh to Battambang. How many values do you need to update in the table?

27 Multi-Valued attributes
Problem: each ‘name’ and ‘address’ value is duplicated. Updating the address is very costly because we need to update as many values as there are hobbies.
=> As a database designer, you must avoid redundancy of information.
The table is in 1NF but not in 2NF. Why?

28 Multi-Valued attributes

29 Multi-Valued attributes
2NF: all attributes are fully dependent on the primary key
– Name: primary key
– Address: if you know the name, you can give the value of the address for sure (e.g. Name=‘Sok Pisey’ => Address=‘Phnom Penh’)
  => Address is fully dependent on the primary key Name
– Hobbies: if you know the name, you cannot give the value of hobbies for sure (e.g. Name=‘Sok Pisey’ => Hobbies=‘swimming’ OR Hobbies=‘singing’ OR …)
  => Hobbies IS NOT fully dependent on the primary key Name
– Put more precisely: because one name appears on several rows, Name alone cannot identify a row. The key of the 1NF table is the pair (Name, Hobbies), and Address depends on only one part of that key (Name).
=> The table IS NOT in 2nd Normal Form

30 Multi-Valued attributes
Solution: remove the attribute(s) that are not fully dependent and move them to a new table, with a “foreign key” to link the two tables
[Tables: Contact, Hobbies]

31 Multi-Valued attributes
Two tables linked by a “foreign key”
– List of people having “singing” as a hobby: easy to find
– Update the address of Sok Pisey: one value to update
[Tables: Contact, Hobbies, linked by a FOREIGN KEY]

32 Multi-Valued attributes
All attributes are fully dependent on the primary key
[Tables: Contact, Hobbies; in each table the PRIMARY KEY functionally determines the other attributes]
=> The database is in 2nd NORMAL FORM
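
A minimal sketch of the two-table layout, again with Python's sqlite3 module. The table names follow the slide labels (Contact, Hobbies); the column names and the choice of a composite primary key on the hobbies table are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE contact (
    name    TEXT PRIMARY KEY,
    address TEXT NOT NULL
);
CREATE TABLE hobbies (
    name  TEXT NOT NULL REFERENCES contact(name),  -- FOREIGN KEY
    hobby TEXT NOT NULL,
    PRIMARY KEY (name, hobby)
);
""")
conn.execute("INSERT INTO contact VALUES ('Sok Pisey', 'Phnom Penh')")
conn.executemany("INSERT INTO hobbies VALUES (?, ?)",
                 [("Sok Pisey", "swimming"), ("Sok Pisey", "singing")])

# Sok Pisey moves: exactly one stored value changes.
cur = conn.execute(
    "UPDATE contact SET address = 'Battambang' WHERE name = 'Sok Pisey'")
updated_rows = cur.rowcount

# Searching by hobby is still a simple equality test plus a join.
singers = conn.execute("""
    SELECT c.name, c.address
    FROM contact AS c
    JOIN hobbies AS h ON h.name = c.name
    WHERE h.hobby = 'singing'
""").fetchall()
print(updated_rows, singers)
```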

33 Exercise 1
Convert the table below to 1NF and 2NF. Describe why we need to convert to 1NF and 2NF.

34 Exercise 2
Convert the table below to 1NF and 2NF. Describe why we need to convert to 1NF and 2NF.

35 Exercise 3
Convert the table below to 1NF and 2NF. Describe why we need to convert to 1NF and 2NF.

36 Data modeling: Repeated attributes

37 Repeated attributes
Example of a table in a database
– Imagine that you want to add one score for a new topic, “Web design”. What do you need to do?

38 Repeated attributes
Problem 1: if you want to add one topic (Web design, for example), you will have to add one attribute (column) to the table.
– Adding an attribute implies changing the table structure
– This often has many impacts on the project: the developers will have to update their code
As a database designer, you need to avoid this situation.

39 Repeated attributes
Problem 2: many NULL values in the table
– The scores of the DOP courses will always be NULL for SNA students
– The scores of the SNA courses will always be NULL for DOP students
As a database designer, you must avoid unnecessary NULL values because they waste storage space.

40 Repeated attributes
Solution: in fact, we do not have 4 separate attributes; we have one repeated attribute, “Score”.
The score has a “Topic” attribute that tells us what type of score it is.

41 Repeated attributes
Solution: move the repeated attribute from the initial table to a separate table, with a “foreign key” to link it to the initial table
[Diagram label: FOREIGN KEY]
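
The effect of this change can be sketched as follows with Python's sqlite3 module. The original score table is not reproduced in this transcript, so the table names, column names, the student row and the first two topics are placeholders; only the new topic “Web design” comes from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE score (
    student_id INTEGER NOT NULL REFERENCES student(student_id),  -- FOREIGN KEY
    topic      TEXT NOT NULL,
    score      REAL NOT NULL,
    PRIMARY KEY (student_id, topic)
);
""")
conn.execute("INSERT INTO student VALUES (1, 'Student A')")  # placeholder row
conn.executemany("INSERT INTO score VALUES (?, ?, ?)",
                 [(1, "Topic 1", 62.0), (1, "Topic 2", 48.5)])  # placeholder topics

columns_before = [r[1] for r in conn.execute("PRAGMA table_info(score)")]

# A new topic is just a new row: no ALTER TABLE and no NULL padding.
conn.execute("INSERT INTO score VALUES (1, 'Web design', 75.0)")

columns_after = [r[1] for r in conn.execute("PRAGMA table_info(score)")]
topics = [r[0] for r in conn.execute(
    "SELECT topic FROM score WHERE student_id = 1 ORDER BY topic")]
print(columns_before == columns_after, topics)
```

A student who does not take a topic simply has no row for it, so the NULL values from Problem 2 disappear as well.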

42 Repeated attributes - Exercise 1
Remove the repeated attribute from the table below:

43 Repeated attributes - Exercise 1 (correction)

44 Repeated attributes - Exercise 2
Remove the repeated attribute from the table below:

45 Repeated attributes - Exercise 3
Remove the repeated attribute from the table below:

46 Data modeling: Sub keys

47 Sub keys
Example of a table in a database
– Imagine that Sangkhim changes his phone number from 987654321 to 876543219. How many values do you need to update?

48 Sub keys
Problem: each ‘teacher address’ and ‘teacher phone’ value is duplicated. Updating the address or phone is very costly because we need to update as many values as there are courses taught by the teacher.
=> As a database designer, you must avoid redundancy of information.
The table is in 1NF. Is it in 2NF?

49 Sub keys
In 2NF?
– Are all attributes fully dependent on the primary key?
  – Course ID: PRIMARY KEY
  – Course name: fully dependent on ‘Course ID’
  – Teacher: fully dependent on ‘Course ID’
  – Teacher address: fully dependent on ‘Course ID’
  – Teacher phone: fully dependent on ‘Course ID’
The table is in 2nd NORMAL FORM

50 Sub keys
The table IS NOT in 3rd Normal Form.
Why? Some attributes are functionally dependent on an attribute that IS NOT the PRIMARY KEY.

51 Sub keys
– All attributes are dependent on the PRIMARY KEY (2NF): the PRIMARY KEY ‘Course ID’ functionally determines the other attributes
– ‘Teacher address’ and ‘Teacher phone’ are also dependent on ‘Teacher’: the SUBKEY ‘Teacher’ functionally determines them

52 Sub keys
Solution:
1. Move all the attributes that are dependent on the SUBKEY to a new table. In our example, the dependent attributes are ‘teacher address’ and ‘teacher phone’.
2. Copy the SUBKEY attribute into the new table, where it becomes the PRIMARY KEY. In our example, the subkey is ‘teacher’.
3. Leave the subkey attribute in the original table, where it is now a FOREIGN KEY.

53 Sub keys
Solution:
[Diagram: the new table with its PRIMARY KEY, referenced by a FOREIGN KEY in the original table]
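
The three steps can be sketched with Python's sqlite3 module as follows. The attribute names follow the slides (course ID, course name, teacher, teacher address, teacher phone), and Sangkhim's phone numbers come from the earlier slide; the course names and the address value are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- New table: the former SUBKEY is its PRIMARY KEY
CREATE TABLE teacher (
    teacher         TEXT PRIMARY KEY,
    teacher_address TEXT,
    teacher_phone   TEXT
);
-- Original table: the subkey stays behind as a FOREIGN KEY
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL,
    teacher     TEXT NOT NULL REFERENCES teacher(teacher)
);
""")
conn.execute("INSERT INTO teacher VALUES ('Sangkhim', 'Phnom Penh', '987654321')")
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [(1, "Course A", "Sangkhim"), (2, "Course B", "Sangkhim")])

# The phone number is stored once, however many courses the teacher has.
cur = conn.execute(
    "UPDATE teacher SET teacher_phone = '876543219' WHERE teacher = 'Sangkhim'")
updated_rows = cur.rowcount

rows = conn.execute("""
    SELECT c.course_name, t.teacher_phone
    FROM course AS c
    JOIN teacher AS t ON t.teacher = c.teacher
    ORDER BY c.course_id
""").fetchall()
print(updated_rows, rows)
```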

54 Sub keys - Exercise 1
Remove the sub key and transform this table into a 3NF database:

55 Sub keys - Exercise 2
Remove the sub key and transform this table into a 3NF database:

56 Sub keys - Exercise 3
Remove the sub key and transform this table into a 3NF database:

57 Sub keys - Exercise 4
Remove the sub key and transform this table into a 3NF database:

58 Data modeling: Many to Many

59 Many to Many
Design pattern: many-to-many
– Example:
  – Each Curriculum is composed of one or more Courses
  – Each Course is part of one or more Curricula

60 Many to Many
Design pattern: many-to-many
– Example: list of courses in the DOP and SNA curricula

61 Many to Many
Design pattern: many-to-many
– Since the maximum multiplicity in each direction is “many”, this is called a “many-to-many” association between Course and Curriculum

62 Many to Many
Each time you have a many-to-many relationship between two tables, you need a JUNCTION TABLE to store the relationship between the two entities in the database. The many-to-many relationship is transformed into a “one-to-many” relationship and a “many-to-one” relationship.
Many to many = One to many + Many to one
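
A junction table for the Curriculum / Course example might look like the sketch below, written with Python's sqlite3 module. The table and column names, the helper courses_in and the course names are assumptions; only the curricula DOP and SNA come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE curriculum (
    curriculum_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
-- JUNCTION TABLE: one row per (curriculum, course) pair
CREATE TABLE curriculum_course (
    curriculum_id INTEGER NOT NULL REFERENCES curriculum(curriculum_id),
    course_id     INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (curriculum_id, course_id)
);
""")
conn.executemany("INSERT INTO curriculum VALUES (?, ?)", [(1, "DOP"), (2, "SNA")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(1, "Course A"), (2, "Course B"), (3, "Course C")])  # placeholder names
conn.executemany("INSERT INTO curriculum_course VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])  # Course B belongs to both curricula


def courses_in(curriculum_name):
    """Return the course names of one curriculum, via the junction table."""
    rows = conn.execute("""
        SELECT co.name
        FROM curriculum AS cu
        JOIN curriculum_course AS cc ON cc.curriculum_id = cu.curriculum_id
        JOIN course AS co ON co.course_id = cc.course_id
        WHERE cu.name = ?
        ORDER BY co.name
    """, (curriculum_name,))
    return [r[0] for r in rows]


print(courses_in("DOP"), courses_in("SNA"))
```

Each foreign key in curriculum_course is the "many" end of a one-to-many relationship, which is how the single many-to-many association gets split in two.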

63 Many to Many
Example: list of courses in CIST curricula

64 Many to Many
Data representation: Curriculum and Course

65 Many to Many
Data representation: junction table

66 Many to Many
Data representation: one to many, many to one

67 Many to Many
Example: list of courses in CIST curricula

68 Many to Many - Exercise 1
Create the tables needed to store these entities and relations in a database

69 Many to Many - Exercise 2
Create the tables needed to store these entities and relations in a database

70 Many to Many - Exercise 3
Create the tables needed to store these entities and relations in a database

71 Data modeling: 5. Assign keys

72 Data modeling: 6. Apply naming conventions

73 Data modeling: 7. Denormalize (improve performance)

74 Exercise: auto repair
You are designing a database for an automobile repair shop. When a customer brings in a vehicle, a service advisor will write up a repair order. This order will identify the customer and the vehicle, along with the date of service and the name of the advisor. A vehicle might need several different types of service in a single visit. These could include oil change, lubrication, tire rotation, and so on. Each type of service is billed at a predetermined number of hours of work, regardless of the actual time spent by the technician. Each type of service also has a flat “book rate” of dollars per hour that is charged.
– Describe each class in English.
– Draw the class diagram, including association classes if required.
– Describe each association in English (both directions).
– Draw the relation scheme.
The solution to this exercise will be discussed in class or posted online at a later date.

75 Any questions?

76 The End

