Database Design Sections 9 & 10: Modeling Historical Data, conditional nontransferability, time-related constraints, Database conventions, generic modeling.


1 Database Design Sections 9 & 10: Modeling Historical Data, conditional nontransferability, time-related constraints, Database conventions, generic modeling. Previously 10-11. Used Fall 2011.

2 Modeling Over Time
Any relationship that changes through time needs special consideration. An attribute such as status may also change over time. Examples of time-sensitive attributes are rental status, fee payment, country name, and price.
Modeling data that changes over time can be a tricky subject. In this course, we will cover this subject twice: here and again in Section 7. There are several implications to a model that incorporates the time element, so we are spreading out the material. In this lesson, we will discuss the factors that influence the decision to model historical data, create a new entity to track it, and define a UID for such an entity.
A pharmaceutical company may want to track complaints filed about its drugs. This could help it detect problems early and issue recalls if necessary. Or it may want to conduct a study that follows patients on a new drug for an extended period of time. This could help scientific research and lead to further improvements in its products.
A commercial processing plant may want to keep a record of rejected units per batch, the machine each unit came from, or the farm that was the source of the cows or products. If there is a pattern over time, it could suggest ways to improve the processing methods to minimize waste and reduce costs.

3 Change and Time Every attribute update means loss of information.
Time in your model makes the model more complex. There are often complex join conditions. Users can work in advance.
If a date/time is modeled as an attribute, each update overwrites the previous value, so the history is lost. Modeling time as separate entities keeps the history but is more complex, and joins on time conditions are often difficult. Modeling time also helps with planning, because data can be entered for future times.
Point out that in the real world, most data changes over time, but the business may not need to track all of it. The requirements for storing historical data need to be validated with the user. Storing unnecessary historical data can be costly because it takes up space.
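The point that every attribute update loses information can be shown with a minimal sketch. The rental record, its status values, and the helper `status_on` are hypothetical names chosen for illustration; they do not come from the slides.

```python
from datetime import date

# Hypothetical rental record: status kept as a plain attribute.
rental = {"id": 1, "status": "reserved"}
rental["status"] = "rented"  # the old value "reserved" is overwritten and lost

# Alternative: keep each status change as its own dated row.
status_history = [
    (date(2011, 9, 1), "reserved"),
    (date(2011, 9, 3), "rented"),
]


def status_on(history, day):
    """Return the status in effect on `day`, or None if none had started yet."""
    current = None
    for start, status in sorted(history):
        if start <= day:
            current = status
    return current
```

With the history rows, a question such as "what was the status on 2 September?" can still be answered; with the single attribute it cannot.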

4 Why Model Time: Examples
Track trends in products.
A city police department tracks crimes in each neighborhood and the frequency of crimes during certain times of the year, such as holidays or really hot or cold weather.
Slide 4: Why Learn It? Historical data is often used by businesses …
Ask students to think of examples in their school where time is something of importance about which the school needs to keep information. Possible answers include:
- Monthly reports on student attendance that determine funding in some schools
- Late arrivals at school can be recorded to determine absences
- Lunch schedules
- Types/quantities of food
Other examples:
- Library: needs to track book checkouts and the history of borrowing
- Florist: orders are greater around significant times
- Restaurant: tracks busy times to adjust work schedules
- Other businesses may need to track time for efficiency

5 Entity DAY or Attribute Date
Diagram: PURCHASE and DAY (#date), with relationship labels "on" and "for". A single-attribute entity without M:1 relationships is usually replaced by an attribute: PURCHASE (* date).
Example: billable hours to different customers.
There are two ways to model a date: as a separate entity or as an attribute of another entity. If date is the only attribute of DAY, it may be simpler to model the date as an attribute of PURCHASE. Before you do that, check that DAY has no other relationships. Sometimes you want to model date as an entity, other times as an attribute. An entity helps when you want to identify trends, like when customers buy coats vs. swim suits vs. sneakers; temperature may be important.

6 What is wrong with revision?
Slide 5: Tell Me / Show Me – Entity DAY vs Attribute Date. Consider the entity PURCHASE. Ask the class: Why can’t we just add the attributes for high temperature and low temperature to the PURCHASE entity? Answer: Because those attributes depend on the attribute date, which is not part of the UID of PURCHASE. Remember the rule of Third Normal Form: attributes can’t have attributes of their own.
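As a rough illustration of the answer above, the sketch below stores the temperatures once per day in a DAY table and lets PURCHASE refer to the day. Entity-to-table mapping is covered later in the course, so the table and column names (`day_date`, `high_temperature`, `low_temperature`) and the sample values are assumptions made only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE day (
    day_date         TEXT PRIMARY KEY,  -- UID of DAY
    high_temperature REAL,              -- depends on the date, so it lives here
    low_temperature  REAL
);
CREATE TABLE purchase (
    id       INTEGER PRIMARY KEY,
    day_date TEXT NOT NULL REFERENCES day(day_date)
);
INSERT INTO day VALUES ('2011-10-01', 18.0, 7.5);
INSERT INTO purchase VALUES (1, '2011-10-01');
INSERT INTO purchase VALUES (2, '2011-10-01');
""")

# Each purchase finds the temperature through its day; the value is stored once.
rows = conn.execute("""
    SELECT p.id, d.high_temperature
    FROM purchase p JOIN day d ON d.day_date = p.day_date
    ORDER BY p.id
""").fetchall()
```

If the temperatures were columns of PURCHASE instead, both purchases would repeat the same values and could disagree after an update.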

7 Entity Day vs. Attribute Date
Slide 6: Tell Me / Show Me – Entity DAY vs Attribute Date. Remember the Third Normal Form. Ask students: in other business scenarios, what other information may be of interest about a particular day, other than just the date? Possible answers include: Is it a holiday? Is it a school day or a weekend? Is it a leap year?

8 Examples: when there is an interest in a particular day, not just a date
Is it a holiday? Is it a work day or weekend? Is it a leap year? Month end? These answer the question from the previous slide: what other information about a particular date might one need to know?

9 Entity DAY
Diagram: TASK ASSIGNMENT (*duration in hours), TASK (#id), DAY (#date, *public holiday indicator), EMPLOYEE (#name). Relationship labels in the diagram: first day of, starts on, for, of, in, with.
Study the ERD. Note that DAY is modeled as a separate entity with date as the UID and public holiday indicator as a yes/no attribute.
Slide 7: Tell Me / Show Me – Time-Related Constraints. Time-related constraint: a constraint or data restriction that results from the time dimension. Be aware of constraints that can result from the need to track dates and times. Example: booths at a fair, where people are scheduled to work different shifts at different booths and some can work longer than others. Another example: a restaurant.
Slide 8: Tell Me / Show Me – Here is a selection of time-related… Mention that although the time-related constraints seem obvious in real life, they must be enforced by programming logic in the database. Therefore, they must be documented.

10 Time-Related Constraints
When you are modeling a shift schedule, the time-related constraints should reflect business rules, for example that the shift end time must be later than the shift start time. Another rule might be to avoid double-booking volunteers at a single booth, so one volunteer's shift may not overlap with another volunteer's shift at the same booth.
Nontransferability: property of a relationship where an instance of A is related to an instance of B, and the association cannot be moved to another instance of B.
Conditional nontransferability: refers to a relationship that may or may not be transferable, depending on time. Example: a shift can be transferred to another booth or person only if it has not yet begun. The “start time” can be updated to an earlier time if the shift has not started.
This cannot be represented in the diagram, because the relationship is nontransferable most of the time but not always. You NEED to document this case.
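Because these rules have to be enforced by program logic, it can help to see what such logic might look like. The following is a minimal sketch for the two shift rules (end after start, no overlap at a booth); the function names and the representation of a shift as a (start, end) pair are assumptions for illustration.

```python
from datetime import datetime


def valid_shift(start, end):
    """The end time may not be less than or equal to the start time."""
    return end > start


def overlaps(a_start, a_end, b_start, b_end):
    """True if the two periods share any time; touching end-to-start is allowed."""
    return a_start < b_end and b_start < a_end


def can_add_shift(existing_shifts, start, end):
    """Check a new shift against the shifts already booked for one booth."""
    if not valid_shift(start, end):
        return False
    return not any(overlaps(start, end, s, e) for s, e in existing_shifts)
```

A shift that starts exactly when another ends is treated as acceptable here; a business could just as well decide otherwise, which is one more reason to document the rule.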

11 Modeling time constraint example 1
Be aware of constraints that can result from the time dimension. Here is an example: Consider a school fair that features several booths. The manager signs up volunteers to work different shifts at different booths. Some volunteers can work for several hours; others can work fewer hours depending on their free time. The schedule has to be determined in advance, so that the manager knows which times are not covered by any volunteers.
Normalization: The rule of Third Normal Form states that no non-UID attribute can be dependent on another non-UID attribute. Third Normal Form prohibits transitive dependencies. A transitive dependency exists when any attribute in an entity is dependent on any other non-UID attribute in that entity.
Slide 7: Tell Me / Show Me – Time-Related Constraints. Time-related constraint: a constraint or data restriction that results from the time dimension.

12 Example 2 (10.1.8)
(1) Why must start time be part of the UID?
Answer: Because other exams can be scheduled in the same classroom on the same day, just not at the same time.
(2) Other time-related constraints:
- Assignment periods may not overlap. (On a given date, the start time of an assignment for an exam may not be between the start time and end time of another exam for the same classroom. The end time of an assignment for an exam may not be between the start time and end time of another exam for the same classroom.)
- You cannot move the assignment of an exam from one classroom to another unless the start date and time of the assignment is still in the future (conditional nontransferability).
- Start date and time of an assignment may be updated to a later time, if that date and time is still in the future.
- Start date and time of an assignment may be updated to an earlier time, if that date and time is still in the future.
Slide 9: Tell Me / Show Me – The "start time" for a shift may be … Conditional nontransferability: refers to a relationship that may or may not be transferable, depending on time.
Slide 10: Tell Me / Show Me – Conditional Nontransferability. Nontransferability: property of a relationship where an instance of A is related to an instance of B, and the association cannot be moved to another instance of B.
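Conditional nontransferability cannot be drawn, so it ends up as a check in code. Here is a minimal sketch of such a check for the exam-to-classroom assignment. The function names are hypothetical, and the reading that both the current and the new start must still be in the future is an assumption; the slide wording leaves this open.

```python
from datetime import datetime


def can_transfer(assignment_start, now):
    """An assignment may move to another classroom only before it has started."""
    return assignment_start > now


def can_reschedule(current_start, new_start, now):
    """Assumed reading: the assignment has not started and the new start
    date and time is also still in the future."""
    return current_start > now and new_start > now
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the rule easy to test.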

13 What are some other rules?
Review example 1: What are some other rules?
Slide 8: Tell Me / Show Me – Here is a selection of time-related … Mention that although the time-related constraints seem obvious in real life, they must be enforced by programming logic in the database. Therefore, they must be documented.
A shift can be transferred if and only if it has not started!
Other rules:
- No one may work more than 8 hours. This is a procedural rule. Procedural rules must be enforced through a program.
- Another procedural rule: no shifts may overlap.
- A structural rule uses the diamond (nontransferable); refer to the example above. Example: every pilot has one and only one plane to fly.
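A procedural rule such as "no one may work more than 8 hours" might be enforced along these lines. This is only a sketch: the names `MAX_HOURS`, `total_time`, and `may_take_shift` are made up for the example, and it assumes the limit applies to the sum of all of a volunteer's shifts at the fair.

```python
from datetime import datetime, timedelta

MAX_HOURS = timedelta(hours=8)  # assumed limit per volunteer


def total_time(shifts):
    """Sum the durations of a volunteer's (start, end) shifts."""
    return sum((end - start for start, end in shifts), timedelta())


def may_take_shift(shifts, start, end):
    """True if adding the new shift keeps the volunteer at or under the limit."""
    return total_time(shifts) + (end - start) <= MAX_HOURS
```

Exactly 8 hours is allowed in this sketch, since the rule only forbids working more than 8.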

14 Modeling Change
Diagram: ASSIGNMENT (#start date, o end date), EMPLOYEE (#id), COUNTRY (#name). Relationship labels in the diagram: of, for, in, as.
Like a substitute teacher. The dates are attributes of the ASSIGNMENT entity. Note that start date is mandatory while end date is optional, which allows assignments that are not yet completed to be modeled.
Point out that “End Date” may seem redundant because when a new assignment starts, the old one would automatically end. However, employees may take a leave and return after a couple of weeks, months, or years. In other words, if one does not model the attribute End Date, one ignores the possibility that the assignment periods of a person are not contiguous.
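The role of the optional end date can be sketched with a small data structure. The class name, field names, and the sample employee and countries below are illustrative assumptions; only the mandatory start date and optional end date come from the slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Assignment:
    employee_id: int
    country: str
    start_date: date                 # mandatory (#)
    end_date: Optional[date] = None  # optional (o): None means not yet completed


def assigned_on(assignments, employee_id, day):
    """Return the assignment covering `day`, or None if the employee had none."""
    for a in assignments:
        if (a.employee_id == employee_id and a.start_date <= day
                and (a.end_date is None or day <= a.end_date)):
            return a
    return None


# Hypothetical history with a gap (a leave) between the two assignments.
history = [
    Assignment(7, "France", date(2009, 1, 1), date(2009, 12, 31)),
    Assignment(7, "Spain", date(2011, 3, 1)),
]
```

Without the end date, the first assignment would appear to run until the second one starts, and the leave in between would be invisible.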

15 Even a Country has a life cycle
Modeling Change
Diagram: COUNTRY (#name, #start time, *end time; these are the life cycle attributes), EMPLOYEE (#id), ASSIGNMENT (#start date, o end date). Relationship labels in the diagram: of, in, as, for.
Note the change: start and end attributes now appear on both the ASSIGNMENT and COUNTRY entities, and these two entities also have a relationship. Some countries change names, which is reflected here. The addition of these attributes may bring other constraints that are hard to implement. Even a country has a life cycle.

16 Modeling change: Price
When do we need to track price?
- Determine best price for item
- Sale price vs. original price
- Grade change
- Value of home
- Asking price vs. sale price
- Stock, bond, or investment
- Others
You may want to track “price-in-time”. For example, what was the price of a car at the time it was purchased, for a refund?
Slide 3: Why Learn It? Appreciation: a rise in value or price, especially over time. Depreciation: a decrease or loss in value, because of age, wear, or market conditions.

17 Mapping Historical Price
PRICED PRODUCT = HISTORICAL PRICE
Diagram: PRODUCT (#id, *name) and PRICE (#start date, *price $, o end date), with relationship labels "have" and "for".
A product may have different prices at different times but only one price at any one time. Break price out into a separate entity to capture historical data. The #start date alone is not enough to be a UID, because multiple products can have the same start date, so use a barred relationship to make the composite UID include the product ID. Be careful to make sure that a product can have one and only one price at any one time.
Do you really need an end date? It depends. If the various periods of a product price are contiguous, so that there is always a price, then you don’t need an end date: the start date of the new price is the end date of the previous price. However, if the products are not always available, as in the seasonal fruit and vegetable market, there can be gaps between prices, and then you need an end date.
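A minimal sketch of looking up a historical price, assuming each PRICE instance is held as a (product id, start date, end date, price) tuple. The products, dates, and amounts are invented for illustration, and `price_on` is a hypothetical helper.

```python
from datetime import date

# (product_id, start_date, end_date or None, price)
prices = [
    (1, date(2011, 1, 1), date(2011, 6, 30), 10.00),
    (1, date(2011, 7, 1), None, 12.00),               # current price, still open
    (2, date(2011, 5, 1), date(2011, 8, 31), 3.50),   # seasonal product
    (2, date(2012, 5, 1), date(2012, 8, 31), 3.75),   # gap between the seasons
]


def price_on(price_rows, product_id, day):
    """Return the price of a product on `day`, or None if it had no price then."""
    for pid, start, end, price in price_rows:
        if pid == product_id and start <= day and (end is None or day <= end):
            return price
    return None
```

Product 2 shows why the end date matters for seasonal goods: in the off season the lookup finds no price at all, instead of wrongly carrying the previous price forward.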

18 Slide 11: Tell Me / Show Me – Businesses often need to keep a record of price… Note that you cannot model the derived “between” relationship. It needs to be documented and implemented with additional code in the system. You may also want to point out that PURCHASE may also contain other attributes or relationships to the CUSTOMER and the EMPLOYEE who processes the transaction. For simplicity’s sake, we don’t include them here.

19 Revised previous ERD
Note that ORDER HEADER and ORDER ITEM are added here. The order header holds the header information, such as order id, date, and server. The order item holds what is actually ordered. “Between” is not a normal ERD item but is used here for clarity: the order date falls between a “start date” and an “end date”, which dictates the price. Other code may be needed to handle the between condition.
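One way the "between" condition might be handled in code is a join on the date range. The sketch below is an assumption about how the entities could map to tables; the table and column names and the sample rows are invented, and an open end date is treated as "far in the future" with a placeholder value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price (
    product_id INTEGER NOT NULL,
    start_date TEXT NOT NULL,
    end_date   TEXT,              -- NULL: this price is still current
    price      REAL NOT NULL,
    PRIMARY KEY (product_id, start_date)
);
CREATE TABLE order_header (id INTEGER PRIMARY KEY, order_date TEXT NOT NULL);
CREATE TABLE order_item (
    order_id   INTEGER NOT NULL REFERENCES order_header(id),
    product_id INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO price VALUES (1, '2011-01-01', '2011-06-30', 10.0);
INSERT INTO price VALUES (1, '2011-07-01', NULL, 12.0);
INSERT INTO order_header VALUES (100, '2011-03-15');
INSERT INTO order_header VALUES (101, '2011-08-02');
INSERT INTO order_item VALUES (100, 1, 2);
INSERT INTO order_item VALUES (101, 1, 3);
""")

# The order date picks the price row whose period contains it.
rows = conn.execute("""
    SELECT h.id, i.quantity * p.price
    FROM order_header h
    JOIN order_item i ON i.order_id = h.id
    JOIN price p ON p.product_id = i.product_id
     AND h.order_date BETWEEN p.start_date
                          AND COALESCE(p.end_date, '9999-12-31')
    ORDER BY h.id
""").fetchall()
```

Dates are stored as ISO text (YYYY-MM-DD) so that plain string comparison orders them correctly.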

20 Price Changes
Stock market example: price changes with time (seconds).
- Need to know when to buy/sell
- Trends in price affect decisions
US stock prices usually take a sharp dive or rise during times of crisis or just before an announcement by the US Federal Reserve. If a stock price has been rising rapidly for a long time, you may wonder how much longer the rise will continue. You would need to know the history of the stock price. These are some of the things to take into consideration when deciding when to buy or sell.
Price of fuel changes / building supply price changes: If fuel prices look as though they are on an upward trend, you may lean toward a more fuel-efficient car or a hybrid model. If you have an opportunity to “lock in” to a monthly heating rate with your energy company, you may want to do so. A contractor of a five-year bridge-construction project has to plan how many materials (cement, steel, equipment, etc.) to buy. If the price of cement is on the rise, the contractor may want to stockpile a lot of it.
What other things change with time? Price of goods, gold/silver, real estate, currency.

21 Price with time
Price may fluctuate with time.
Examples: value of gold or silver, real estate, value of currency, gasoline.
The Good Old Days: You can have students bring up other examples of things that are constantly going up in price. What kinds of things do their parents complain about all the time? (food, school tuition, clothing, etc.)
Can You Believe What I Paid for It? Think about video-gaming systems. When the Sega Genesis was brand new, how much was it? How much can you buy it for now? You may want to track the "price-in-time" so that you can answer questions such as: How much was Sega Genesis in December 1999?
What's the Price Today? Have students name other items for which the price fluctuates. What about gold or silver? What about real estate? What about currency?

22 Journaling
Slide 13: Journaling. Journaling: keeping an on-going record of transactions. You may want to explain that PAYMENT could have additional attributes or relationships to PAYOR and PAYEE, but we have left them out of the diagram to keep things simple. Apart from the consequences for the conceptual data model, the system needs special journaling functionality: any business function that allows an update of PAYMENT amount should result in the requested update, plus the creation of a new instance of AMOUNT MODIFICATION with the proper values. This functionality has to be implemented by programming.
Examples: bank accounts, credit cards, credit reports.
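The journaling functionality described above could look roughly like this. The attributes recorded for AMOUNT MODIFICATION are not listed in the transcript, so the keys used here (`old_amount`, `new_amount`, `date_modified`) are assumptions, as is the function name.

```python
from datetime import datetime


def update_payment_amount(payment, modifications, new_amount, when):
    """Apply the requested update and journal it as one AMOUNT MODIFICATION."""
    modifications.append({
        "payment_id": payment["id"],
        "old_amount": payment["amount"],
        "new_amount": new_amount,
        "date_modified": when,
    })
    payment["amount"] = new_amount


# Hypothetical sample data.
payment = {"id": 1, "amount": 100.0}
amount_modifications = []
update_payment_amount(payment, amount_modifications, 120.0,
                      datetime(2011, 10, 1, 9, 0))
```

Routing every update through one function is the point: if some other code changed `payment["amount"]` directly, the journal would silently miss that change.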

23 Slide 14: Tell Me / Show Me – A journal usually consists of both the …
Again, the system needs special journaling functionality: any business function that allows an update of EMPLOYEE salary amount should result in the requested update, plus the creation of an entity instance SALARY CHANGE with the proper values. You may want to point out that an alternative way to model salary changes (and many other examples of tracking changes over time) would be to include both old and new salary amounts as attributes of SALARY CHANGE, and remove the salary amount attribute from EMPLOYEE. The M:1 relationship would then be mandatory at both ends. How would we find an employee’s current salary? It's the new salary amount of the SALARY CHANGE instance with the most recent date modified.
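For the alternative model, where SALARY CHANGE carries both the old and the new amount, finding the current salary means picking the change with the most recent date modified. A minimal sketch with invented sample data and a hypothetical helper name:

```python
from datetime import date

# Hypothetical SALARY CHANGE instances.
salary_changes = [
    {"employee_id": 7, "old_salary": 3000, "new_salary": 3200,
     "date_modified": date(2010, 1, 1)},
    {"employee_id": 7, "old_salary": 3200, "new_salary": 3500,
     "date_modified": date(2011, 6, 1)},
    {"employee_id": 8, "old_salary": 2800, "new_salary": 2900,
     "date_modified": date(2011, 9, 1)},
]


def current_salary(changes, employee_id):
    """Return the new salary of the employee's most recent change, or None."""
    own = [c for c in changes if c["employee_id"] == employee_id]
    if not own:
        return None
    return max(own, key=lambda c: c["date_modified"])["new_salary"]
```

This also shows the extra work the alternative model asks of every query that only wants the current value.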

24 Example
When a student’s grade is changed, we need to record information on the teacher who changed the grade and the reason for the change. Start with the ENROLLMENT entity, which is the resolution of the M:M between STUDENT and CLASS. See the next slide for the ERD.

25 Solution 10.2.13
Diagram: STUDENT and CLASS (M:M), resolved by the intersection entity ENROLLMENT (#date, *grade); GRADE CHANGE (*old grade, *changed by, #date changed, o reason for change).
Some things are left out for simplicity.

26 Adding the Time element to ERD
When does a school need to keep time-related information? Could these be similar to issues in a business?
In this lesson, we have students use DAY as an entity to capture data about adoption dates. They will also start thinking of reports, statistics, and documentation that can result from their data model.
Connections: Ask students to think of examples in their school where time is something of importance about which the school needs to keep information. Possible answers include:
- Monthly reports on student attendance that determine funding in some schools
- Late arrivals at school can be recorded to determine absences
- Lunch schedules
- Kinds/quantities of foods ordered based on demand determined over time (around holidays, schools may order less?)
- Utilities use around people's schedules
Business example (possible example: IBM Rochester plant):
- Employees' pay is reduced if they are late
- Work/shift schedules
- Meal schedules and food orders based on work shifts

27 Drawing Conventions Improve readability
Examples from “DJs on Demand” scenario Dividing a big ERD up into functional areas Section 10 – Lesson 1 Brainstorm a list of conventions that are understood by most people that seem to be “learned” along the way. For example, the fork goes on the left, people shake hands when meeting others, there are three numbers on each row on a telephone, the cold water faucet is on the right, etc. Of course these conventions vary between different countries and cultures. How do these conventions make things easier for us? Answer: They tell us what to expect and help us behave appropriately.

28 Conventions Review
- Crow's feet: crows fly East and South
- Divide complex ERDs into functional areas
- Place highest-volume entities in the upper left corner
- Improve readability: avoid criss-crossing lines, increase white space so relationships don’t overlap, and be consistent with font type, size, and styles
Section 10: Simplify ERD of DJs. Possible answers: Avoid criss-crossing lines. Increase white space so that the relationships don’t overlap. Use consistent font types, sizes, and styles.
Following the conventions of "crows flying south … This is a review of a convention discussed previously. We say the crows are “flying south” because the “crow’s feet” are pointing north, as in JOB ASSIGNMENT and EVENT. We say the crows are “flying east” toward PARTNER and CLIENT, because the “crow’s feet” are pointing west.
High-volume entities are those that would … Discuss why there would be more instances of JOB ASSIGNMENT than PARTNER or EVENT. This is because each event may have many partners working on it, and each one would be a job assignment. Similarly, there would be more instances of EVENT than CLIENT or SPACE (PUBLIC or PRIVATE).
Marge Hohly

29 Following the conventions of "crows flying north …
The “South and East” convention makes sense in countries whose written language is read from left to right and top-down (for example, English) because the human eye will naturally read from left to right (East) and from top to bottom (South). The “North and West” convention makes more sense in countries whose written language is read from right to left (for example, Arabic) or from bottom to top.
High-volume entities are often the "central" … High-volume entity: an entity that will have a large number of instances. Why is JOB ASSIGNMENT an important entity? It is important because it stores data on which partner is working on an event, what the status is, etc. This is information that is important in running the day-to-day business for DJs on Demand. Similarly, EVENT is an important entity: it is the “main thing” on which the DJ business is built.

30 Often you will have a mix, depending on the amount …
Ask the class to find the crows that are flying north (to CLIENT), south (to PUBLIC and PRIVATE SPACE), east (to PARTNER), and west (to EVENT). Point out that this is still a readable diagram. Which are the high-volume entities in the model? Answer: EVENT and JOB ASSIGNMENT. They have the greatest number of relationships to other entities.

31 The high-volume entities are not always the most important
ENROLLMENT would be the highest volume and has therefore been drawn top-left in this “crows fly south and east” ERD. But the school exists only to educate the students, therefore STUDENT is the most important entity. Stress again that the “crows” conventions are exactly that: conventions, not rigid rules.

32 Readability takes space and is subject to taste.
White space: space on a page or poster not covered by print or graphic matter. These diagrams convey the same information, but the one on the right is more readable. Do not sacrifice readability for space.

33 When you have a very large diagram, it …
It is still important to have a big diagram that shows the whole picture (even if it has to be printed on a plotter or taped together from smaller pieces of paper). For example, there may be relationships between entities in two different sub-models, and these relationships must be represented somewhere.

34 Point out the diamond for nontransferability

35 Often multiple developers build the applications …
“Multiple developers” makes the point that smaller sub-diagrams can be useful during physical modeling and application development as well as during conceptual-modeling discussions with the client.

36 Generic Modeling
- Can reduce number of entities in diagram
- Can provide more flexibility in unstable situations (where business requirements change often)
- Use a more distant perspective
See next slide. What would happen to the generic model if we had to add 10 new ARTICLE types, each with their own attributes?
Why Learn It? Generic: relating to or descriptive of an entire group or class. Make sure that students understand the meaning of the word “generic.” You may want to use other examples, such as brand-name jeans as opposed to jeans with no label. There may be some negative associations with the word “generic”, as in cheap, not as good, etc. Explain that in the data-modeling world, generic models are neither cheap nor a poor substitute. In fact, they can be more complex, but more flexible, as the lesson will illustrate.
Example: single entity ARTICLE

37 Generic modeling looks at the same context from …
Perspective: point of view. The graphic shows the generic model with a single entity ARTICLE. This can also be modeled with subtypes of SHIRT, SKIRT, DRESS, PANTS, etc. It would still reduce the number of entities. Point out that with the generic model, the common attributes can keep their mandatory nature (length, material), but other attributes specific to one type of ARTICLE must become optional. For example: neck size and sleeve length are mandatory for a SHIRT, but they need to be optional in the generic model because other articles (SKIRT, DRESS, PANTS) don’t have those attributes.
Ask students: What would happen to the generic model if we had to add 10 new ARTICLE types, each with their own attributes? Answer: New attributes would have to be added to the entity ARTICLE. If the new types did not include the same mandatory attributes (length, material), those attributes would now have to be changed to optional. The problem is compounded if subtypes were modeled, because each new type would mean the creation of a new subtype. These are significant changes to the model.

38 Generic Modeling
- Have more attributes in fewer entities
- Many mandatory requirements/attributes become optional
- Structural rules become procedural rules
Example: for PANTS, waist size was mandatory; with ARTICLE, waist size becomes optional.
What other businesses would be good candidates for generic modeling? Examples:
- Big Lots
- Trader Joes

39 Recycling of Attributes
Walk students through the data to help them understand the model. DNM105 is the ID for a shirt with length of 40, made of denim material, with a neck size of 16, and a sleeve length of 33. LIN200 is the ID for a skirt with a length of 22, made of linen material, with a waistband type of elastic, and a hem circumference of 60. To understand the data in ARTICLE, you have to look at the data in ARTICLE TYPE. We have not discussed entity-to-table mapping yet, so some students may be confused as to why ARTICLE TYPE name is listed as a column under ARTICLE. Remind them that this is sample data, and that the relationship from ARTICLE to ARTICLE TYPE suggests that the ARTICLE TYPE name will be part of the data that is stored in ARTICLE. Data is part of the physical implementation of the conceptual model, which we will learn about later. This is a more flexible model than the first one because the addition of types of articles simply involves adding instances to ARTICLE TYPE and ARTICLE. The problem occurs if the new types have more attributes than the maximum number defined in the model. This will mean modifying both ARTICLE TYPE and ARTICLE entities to add attributes. Therefore, this model is good if the number of attributes is known and fairly fixed. You may also want to point out that this model can make the data look more complex. The value of each attribute can be different depending on the type of ARTICLE instance. For example, Property 4 can be a number for one type of article and a character string for another type of article. This means that the data type of each attribute in ARTICLE has to be fairly generic as well to hold all kinds of data.
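The recycled-attribute idea can be sketched with plain dictionaries: ARTICLE TYPE says what each numbered property means, and ARTICLE stores the values in those numbered slots. The sample values are the ones from the walkthrough above; the order of the slots and the choice to store every value as a string are assumptions for this illustration.

```python
# ARTICLE TYPE: the meaning of property 1..4 for each type.
article_types = {
    "SHIRT": ["length", "material", "neck size", "sleeve length"],
    "SKIRT": ["length", "material", "waistband type", "hem circumference"],
}

# ARTICLE: id -> (article type name, values of property 1..4).
# Values are strings because the same slot holds numbers for one type
# and text for another.
articles = {
    "DNM105": ("SHIRT", ["40", "denim", "16", "33"]),
    "LIN200": ("SKIRT", ["22", "linen", "elastic", "60"]),
}


def describe(article_id):
    """Pair each stored value with the property name defined by its type."""
    type_name, values = articles[article_id]
    return dict(zip(article_types[type_name], values))
```

Note that property 4 is a sleeve length for the shirt and a hem circumference for the skirt, which is exactly why the data cannot be read without ARTICLE TYPE.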

40

41 Note the addition of values in this model

42 Attributes Modeled as Property Instance
Instructions: Again, you will need to walk the students through the data to help clarify the model. ARTICLE TYPE stores the different kinds of articles (shirt, skirt, pants, etc.). ARTICLE stores the IDs of each article (DNM105 is the ID for a shirt, LIN200 is the ID for a skirt, etc.). PROPERTY stores the descriptors for each article type (SHIRT has a length, material, neck size, etc. SKIRT has a length, etc.). ARTICLE PROPERTY VALUE stores the actual value for each property of each article (DNM105, which is a shirt, has a length of 40, material of denim, neck size of 16; LIN200, which is a skirt, has a length of 22). We have not discussed entity-to-table mapping yet, so some students may be confused as to why ARTICLE TYPE Name is listed as a column under PROPERTY. Remind them that this is sample data, and that the relationship from ARTICLE to PROPERTY suggests that the ARTICLE TYPE name will be part of the data that is stored in PROPERTY. Data is part of the physical implementation of the conceptual model, which we will learn about later. Point out that this is the most flexible model so far. We can add any number of article types with different types and any number of attributes. Again, while the structure gives us a lot of flexibility, it makes the data more complex. To understand the ARTICLE PROPERTY VALUE, you must know the ARTICLE identifier, the PROPERTY number, and the ARTICLE type.
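The property-instance model can be sketched the same way. The four structures below stand in for ARTICLE TYPE, ARTICLE, PROPERTY, and ARTICLE PROPERTY VALUE, using the sample values mentioned above; the property numbering and the Python names are assumptions for illustration.

```python
# ARTICLE: article id -> article type name.
articles = {"DNM105": "SHIRT", "LIN200": "SKIRT"}

# PROPERTY: (article type name, property number) -> descriptor.
properties = {
    ("SHIRT", 1): "length",
    ("SHIRT", 2): "material",
    ("SHIRT", 3): "neck size",
    ("SKIRT", 1): "length",
}

# ARTICLE PROPERTY VALUE: (article id, property number, value).
article_property_values = [
    ("DNM105", 1, "40"),
    ("DNM105", 2, "denim"),
    ("DNM105", 3, "16"),
    ("LIN200", 1, "22"),
]


def describe(article_id):
    """Resolve each stored value through the article's type and property number."""
    type_name = articles[article_id]
    return {
        properties[(type_name, number)]: value
        for art, number, value in article_property_values
        if art == article_id
    }
```

Adding a new article type with any number of properties only adds entries to these structures; nothing about their shape changes, which is the flexibility the notes describe. The cost is that every read needs the article identifier, the property number, and the article type.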

43 Review these Property Instances with previous model.

44 Generic Model - Benefits
Benefits:
- Flexibility: can prevent the need to change the data structure
- Can reduce the number of entities dramatically
Disadvantages:
- Increased complexity in both the data model and application programs
Generic Models: The benefits of generic models are most apparent when the business requirements change often. There will be a need to add new entities and attributes in the future, which cannot be specified at this time. A generic solution is more flexible and will accommodate change without too much impact to the model. However, it does make looking at the data more complex. This means that end users who want to access the database will have to write more complex SQL queries, and developers who write applications for the database will need to add more complexity to their code as well.

45 Relational Database Concepts
Conceptual model transforms into a relational database. A relational database is a database that is perceived by the user as a collection of relations or two-dimensional tables. In a table, each row is an instance and each column is an attribute.
Employee table:
- Row: describes an employee (instance)
- Column: an attribute of each employee

46 Conventions Review Crows feet Crows fly East and South
- Divide complex ERDs into functional areas
- Place highest-volume entities in the upper left corner
- Improve readability: avoid criss-crossing lines, increase white space so relationships don’t overlap, and be consistent with font type, size, and styles

