Fundamentals/ICY: Databases 2010/11 WEEK 6 John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK
Multivalued Attributes in ERMs and ERDs (now with more explanation)
A Multivalued Attribute in an Entity: CAR_COLOR gives multiple colours
Multivalued Attributes “You should not implement them in the relational DBMS” [rather, you should re-represent them in a special way – J.A.B.] One possibility: Use a variable-length string for the attribute, and list all the values within the string. Disadvantage: little support supplied by the DBMS – insertions and deletions require special extra programming, as do any calculations needed on the individual values.
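A minimal sketch of the "special extra programming" that the string approach requires. The comma separator and the helper names `add_colour` and `remove_colour` are assumptions made for illustration; they are not from the slides.

```python
# Hypothetical illustration: all colours packed into one string attribute.
# The DBMS sees only a single string, so the application has to parse it.

def add_colour(packed: str, colour: str) -> str:
    """Append a colour to a comma-separated string of colours."""
    values = [v for v in packed.split(",") if v]
    values.append(colour)
    return ",".join(values)


def remove_colour(packed: str, colour: str) -> str:
    """Remove the first occurrence of a colour, if it is present."""
    values = [v for v in packed.split(",") if v]
    if colour in values:
        values.remove(colour)
    return ",".join(values)


car_color = "red,white"
car_color = add_colour(car_color, "blue")
car_color = remove_colour(car_color, "white")
print(car_color)
```

Every insertion, deletion or calculation on an individual colour has to go through code like this rather than through ordinary DBMS operations.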
Multivalued Attributes, contd Another possibility: Within original entity type, split the attribute into several different attributes corresponding to different natural components of the entity. (See next slide.)
Splitting the Multivalued Attribute into New Naturally Namable Attributes
Multivalued Attributes, contd Disadvantages: The attribute may need to be split differently for different entities. The attribute may not have naturally namable aspects at all. E.g., imagine blotches of colour in random places on a car.
Multivalued Attribute Problems, contd Another possibility: Within original entity type, split the attribute into several different attributes not corresponding to specific components of the entity. E.g., have attributes called Colour1, Colour2, … , Colour6. Advantage: copes with the no-identifiable-components problem and the different-split problems. NB: also allows repetition of colours. Disadvantages: Have to set aside enough columns to accommodate the conceivable max, but if this max is not often approached then a lot of space is wasted. Searching for a colour, or doing insertions and deletions, can be very cumbersome.
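A small sketch of why searching is cumbersome with numbered columns, using Python's built-in `sqlite3` module. The table name `CAR`, the key `CAR_ID` and the sample data are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = [f"COLOUR{i}" for i in range(1, 7)]  # COLOUR1 .. COLOUR6

# Hypothetical CAR table with six colour columns.
conn.execute(
    "CREATE TABLE CAR (CAR_ID INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in cols)
    + ")"
)
conn.execute("INSERT INTO CAR (CAR_ID, COLOUR1, COLOUR2) VALUES (1, 'red', 'white')")
# Repetition of a colour is allowed with this layout.
conn.execute(
    "INSERT INTO CAR (CAR_ID, COLOUR1, COLOUR2, COLOUR3) "
    "VALUES (2, 'blue', 'blue', 'red')"
)
conn.execute("INSERT INTO CAR (CAR_ID, COLOUR1) VALUES (3, 'green')")

# Searching for one colour means testing every one of the six columns.
where = " OR ".join(f"{c} = ?" for c in cols)
red_cars = [
    row[0]
    for row in conn.execute(
        f"SELECT CAR_ID FROM CAR WHERE {where} ORDER BY CAR_ID",
        ["red"] * len(cols),
    )
]

# Count the colour slots that were set aside but are unused (NULL).
null_sum = " + ".join(f"({c} IS NULL)" for c in cols)
unused_slots = conn.execute(f"SELECT SUM({null_sum}) FROM CAR").fetchone()[0]
print(red_cars, unused_slots)
```

With only three cars, twelve of the eighteen colour slots are already empty, and the search condition grows with every extra column.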
Multivalued Attributes, contd Often Better: Replace the attribute by a new 1:M relationship to a new entity type holding the original attribute’s data. If the components of the original attribute are conceptually distinguishable in a natural way, the new entity can have an attribute whose values identify those components.
Multivalued Attributes, contd If the original multivalued attribute does not have namable components, leave out the component-naming attribute (COL_SECTION in diagram). But NB: the PK would then need to include, in our example, the colour, so we can’t have repetitions of colours. Alternatively, include an integer-valued attribute to allow the values to be distinguished. The PK then includes that attribute instead of the colour, so we can have repetitions of colours.
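The two primary-key choices for the new entity type can be compared in a short `sqlite3` sketch. All table and column names here (`CAR_COLOUR_BY_VALUE`, `CAR_COLOUR_NUMBERED`, `COL_NUM`, and so on) are hypothetical; the diagram's actual names are not reproduced in the slide text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CAR (CAR_ID INTEGER PRIMARY KEY);

-- Variant 1: the colour itself is part of the PK.
CREATE TABLE CAR_COLOUR_BY_VALUE (
    CAR_ID INTEGER NOT NULL REFERENCES CAR(CAR_ID),
    COLOUR TEXT NOT NULL,
    PRIMARY KEY (CAR_ID, COLOUR)
);

-- Variant 2: an integer-valued attribute distinguishes the values.
CREATE TABLE CAR_COLOUR_NUMBERED (
    CAR_ID  INTEGER NOT NULL REFERENCES CAR(CAR_ID),
    COL_NUM INTEGER NOT NULL,
    COLOUR  TEXT NOT NULL,
    PRIMARY KEY (CAR_ID, COL_NUM)
);

INSERT INTO CAR VALUES (1);
""")

# Variant 1 rejects a repeated colour for the same car.
conn.execute("INSERT INTO CAR_COLOUR_BY_VALUE VALUES (1, 'red')")
try:
    conn.execute("INSERT INTO CAR_COLOUR_BY_VALUE VALUES (1, 'red')")
    repeat_allowed_by_value = True
except sqlite3.IntegrityError:
    repeat_allowed_by_value = False

# Variant 2 accepts repetitions because COL_NUM keeps the rows distinct.
conn.executemany(
    "INSERT INTO CAR_COLOUR_NUMBERED VALUES (?, ?, ?)",
    [(1, 1, "red"), (1, 2, "red"), (1, 3, "white")],
)
numbered = conn.execute(
    "SELECT COLOUR FROM CAR_COLOUR_NUMBERED WHERE CAR_ID = 1 ORDER BY COL_NUM"
).fetchall()
print(repeat_allowed_by_value, numbered)
```

In both variants a car can have any number of colours without setting aside unused columns, and a search for a colour tests a single column.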
New for Week 6
Generalization Hierarchies in ERMs and ERDs
Entity Supertypes and Subtypes Generalization (or: specialization) hierarchy: a group of relationships, each of which is between a higher-level “supertype” entity (e.g. EMPLOYEE) and a lower-level “subtype” entity (e.g., PROFESSOR). Supertype: contains attributes shared by all its subtypes. Subtype: contains special attributes, i.e. ones that not all sister subtypes have. Primary key of a subtype = that of the supertype.
Disjoint (or: Non-Overlapping) Subtypes: each entity in the supertype can appear in at most one of the subtypes. Exhaustive Subtypes: each entity in the supertype must appear in at least one of the subtypes. Other terminology: exhaustiveness = total completeness (!!) = mandatoriness [of subtype]; non-exhaustiveness = partial completeness (!!) = optionality [of subtype].
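The two definitions can be read as simple checks on sets of primary-key values. This is only a sketch: the function names and the sample key sets are invented, and the subtype names MECHANIC and ADMINISTRATOR are hypothetical.

```python
def is_disjoint(subtypes):
    """True if no supertype key appears in more than one subtype."""
    seen = set()
    for keys in subtypes.values():
        if seen & keys:
            return False
        seen |= keys
    return True


def is_exhaustive(supertype_keys, subtypes):
    """True if every supertype key appears in at least one subtype."""
    covered = set().union(*subtypes.values())
    return supertype_keys <= covered


employee_keys = {1, 2, 3, 4}

# Disjoint but not exhaustive: employee 4 is in no subtype.
by_job = {"PILOT": {1, 2}, "MECHANIC": {3}}

# Exhaustive but overlapping: employee 2 is in both subtypes.
by_role = {"PROFESSOR": {1, 2}, "ADMINISTRATOR": {2, 3, 4}}

print(is_disjoint(by_job), is_exhaustive(employee_keys, by_job))
print(is_disjoint(by_role), is_exhaustive(employee_keys, by_role))
```

The two properties are independent of each other: a hierarchy can have either one, both, or neither.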
Why Consider Supertypes and Subtypes? We would not just want to have a table for the supertype and none for the subtypes, because of the resulting poor table structure. We would not just want to use separate tables for subtypes, because of: structure in common and relationships in common; data redundancy when subtypes overlap; the need for a supertype table anyway when the subtypes are not exhaustive.
Notation in following slides is from a previous textbook. See other, better, notations in current textbook and in Additional Notes
Disjoint Subtypes
A Generalization Hierarchy with Overlapping Subtypes
Actual Realization of Subtyping A supertype maintains a 1:1 relationship with each subtype [optional in the super-to-sub direction and mandatory in the other]
The EMPLOYEE/PILOT Supertype/Subtype Relationship
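A hedged sketch of the 1:1 realization for EMPLOYEE and PILOT, using `sqlite3`. Only the two table names come from the slides; the columns `EMP_NUM`, `EMP_NAME` and `PIL_LICENSE` and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    EMP_NUM  INTEGER PRIMARY KEY,
    EMP_NAME TEXT NOT NULL
);

-- The subtype's PK is the supertype's PK, and it is also a foreign key,
-- so each PILOT row corresponds to exactly one EMPLOYEE row.
CREATE TABLE PILOT (
    EMP_NUM     INTEGER PRIMARY KEY REFERENCES EMPLOYEE(EMP_NUM),
    PIL_LICENSE TEXT NOT NULL
);

INSERT INTO EMPLOYEE VALUES (100, 'Employee A');
INSERT INTO EMPLOYEE VALUES (101, 'Employee B');
INSERT INTO PILOT VALUES (100, 'COM');
""")

# Optional in the super-to-sub direction: employee 101 has no PILOT row.
rows = conn.execute("""
    SELECT E.EMP_NUM, P.PIL_LICENSE
    FROM EMPLOYEE E LEFT JOIN PILOT P ON P.EMP_NUM = E.EMP_NUM
    ORDER BY E.EMP_NUM
""").fetchall()

# Mandatory in the sub-to-super direction: a pilot without an employee fails.
try:
    conn.execute("INSERT INTO PILOT VALUES (999, 'ATP')")
    orphan_pilot_allowed = True
except sqlite3.IntegrityError:
    orphan_pilot_allowed = False

print(rows, orphan_pilot_allowed)
```

Shared attributes live once in EMPLOYEE, and only the pilot-specific attributes are stored in PILOT.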
More-Than-Binary Relationships in ERMs and ERDs
Relationship Degree Binary relationship [my definition]: two entities (“entity occurrences”) are associated by each instance of the relationship, as in all previous examples in lecture slides. Ternary relationship [my definition]: three entities are associated by each relationship instance. Etc. NB: the entities associated with each other need not be of different types. Indeed, we could have an entity (occurrence) associated with itself: e.g. an “employs” relationship where someone can employ him/herself. This doesn’t violate binarity.
Relationship Degree: Terminology Problems The standard terminology & definitions relating to “relationship degree” (see the textbook) are mathematically anomalous. The degree of a relationship should be the number of entities (“entity occurrences”) that are associated by each instance of the relationship, no matter how many entity types are involved. But the definitions used in the books count the number of entity types (“entities” in the abbreviated language used), departing from normal mathematical practice.
Terminology Problems contd. A “unary” relationship is standardly defined as being a relationship where the entity occurrences are all within the same entity type (“entity”). E.g., a “manages” relationship between employees. A better name is “recursive” (also used in the books). Above example is binary recursive, under my definition. The books define a binary relationship as one relating two different types, even when there are more than two entity occurrences per relationship instance. Similarly ternary and three different types.
Different Degrees The “unary” case is badly named. It’s really a type of binary. The word “recursive” [later] is better.
Tables for a Ternary Relationship CFR is just like a bridging entity type for a binary M:N relationship, but has 3 links to other types instead of 2
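A sketch of a three-link bridging table in `sqlite3`. The slide text names only CFR, so the three linked tables are given placeholder names (`TYPE_A`, `TYPE_B`, `TYPE_C`); making the three foreign keys together the primary key is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Placeholder entity types; the real names are in the slide's diagram.
CREATE TABLE TYPE_A (A_ID INTEGER PRIMARY KEY);
CREATE TABLE TYPE_B (B_ID INTEGER PRIMARY KEY);
CREATE TABLE TYPE_C (C_ID INTEGER PRIMARY KEY);

-- CFR works like a bridging table, but with three links instead of two.
CREATE TABLE CFR (
    A_ID INTEGER NOT NULL REFERENCES TYPE_A(A_ID),
    B_ID INTEGER NOT NULL REFERENCES TYPE_B(B_ID),
    C_ID INTEGER NOT NULL REFERENCES TYPE_C(C_ID),
    PRIMARY KEY (A_ID, B_ID, C_ID)
);

INSERT INTO TYPE_A VALUES (1), (2);
INSERT INTO TYPE_B VALUES (10);
INSERT INTO TYPE_C VALUES (100), (200);

-- Each row is one relationship instance associating three entities.
INSERT INTO CFR VALUES (1, 10, 100), (1, 10, 200), (2, 10, 100);
""")

triples = conn.execute(
    "SELECT A_ID, B_ID, C_ID FROM CFR ORDER BY A_ID, C_ID"
).fetchall()
print(triples)
```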
Recursive Relationships in ERMs and ERDs
Recursive Relationships A recursive relationship links entities of the same type. E.g.: marriage, management, parthood, … Can have partial recursion: just some of the entity types involved in a relationship (that is ternary or of higher degree in my sense) could be the same.
Recursive Relationships: Symmetry A relationship R between entity types E,F (possibly the same) is symmetric iff: if eRf then fRe (i.e., if R relates entity e of type E to entity f of type F, then it must also relate f to e.) E.g.: marriage, being-sibling-of. Recursive relationships cause major redundancy problems when symmetric. Symmetry can only hold in the 1:1 and M:N cases. ((Can generalize the points to partly-recursive cases.))
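The definition of symmetry, and the redundancy it causes, can be sketched over a relationship stored as a set of ordered pairs. The function name and the sample data are invented for illustration.

```python
def is_symmetric(pairs):
    """True if, whenever (e, f) is in the relationship, (f, e) is too."""
    return all((f, e) in pairs for (e, f) in pairs)


# Marriage stored in both directions, as symmetry requires.
married = {("ann", "bob"), ("bob", "ann")}

# Management is not symmetric.
manages = {("ann", "bob"), ("ann", "cy")}

# The redundancy: two stored pairs carry one fact.
unordered_facts = {frozenset(p) for p in married}

print(is_symmetric(married), is_symmetric(manages))
print(len(married), len(unordered_facts))
```

Half of the stored pairs in a symmetric relationship repeat information already held by the other half.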
An ER Representation of Recursive Relationships I don’t know why the Crow’s Foot model uses two links in each case
(necessarily non-symmetric) 1:M recursive: “EMPLOYEE Manages EMPLOYEE” Just a standard 1:M implementation except linking a table to itself. No redundancy problem.
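A minimal `sqlite3` sketch of a table linked to itself for "EMPLOYEE Manages EMPLOYEE". The column names `EMP_NUM`, `EMP_NAME` and `EMP_MANAGER` and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Standard 1:M implementation: the foreign key sits on the "many" side,
-- which here is the same table.
CREATE TABLE EMPLOYEE (
    EMP_NUM     INTEGER PRIMARY KEY,
    EMP_NAME    TEXT NOT NULL,
    EMP_MANAGER INTEGER REFERENCES EMPLOYEE(EMP_NUM)
);

INSERT INTO EMPLOYEE VALUES (1, 'Ann', NULL);
INSERT INTO EMPLOYEE VALUES (2, 'Bob', 1);
INSERT INTO EMPLOYEE VALUES (3, 'Cy', 1);
""")

# Join the table to itself to list (manager, managed employee) pairs.
managed = conn.execute("""
    SELECT M.EMP_NAME, E.EMP_NAME
    FROM EMPLOYEE E JOIN EMPLOYEE M ON E.EMP_MANAGER = M.EMP_NUM
    ORDER BY E.EMP_NUM
""").fetchall()
print(managed)
```

Each management fact is stored exactly once, in the managed employee's row, which is why there is no redundancy problem.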
non-symmetric M:N recursive: “PART Contains PART” The COMPONENT entity type is just a bridging type, linking PART to itself. NB: its first two columns both refer to PART’s PK but must be differently named. No redundancy problem.
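A sketch of the COMPONENT bridging table in `sqlite3`. PART and COMPONENT are from the slide; the column names (`PART_CODE`, `COMP_WHOLE`, `COMP_PART`, `COMP_QTY`), the quantity column itself and the sample parts are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE PART (PART_CODE TEXT PRIMARY KEY);

-- Both of the first two columns refer to PART's PK,
-- so they must have different names.
CREATE TABLE COMPONENT (
    COMP_WHOLE TEXT NOT NULL REFERENCES PART(PART_CODE),
    COMP_PART  TEXT NOT NULL REFERENCES PART(PART_CODE),
    COMP_QTY   INTEGER NOT NULL,
    PRIMARY KEY (COMP_WHOLE, COMP_PART)
);

INSERT INTO PART VALUES ('BIKE'), ('WHEEL'), ('SPOKE');
INSERT INTO COMPONENT VALUES ('BIKE', 'WHEEL', 2), ('WHEEL', 'SPOKE', 32);
""")

# Parts directly contained in BIKE.
in_bike = conn.execute(
    "SELECT COMP_PART, COMP_QTY FROM COMPONENT WHERE COMP_WHOLE = 'BIKE'"
).fetchall()

# The relationship is non-symmetric: the reverse pair is neither stored nor implied.
reverse_rows = conn.execute(
    "SELECT COUNT(*) FROM COMPONENT "
    "WHERE COMP_WHOLE = 'WHEEL' AND COMP_PART = 'BIKE'"
).fetchone()[0]
print(in_bike, reverse_rows)
```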
symmetric (1:1) recursive relationship: “EMPLOYEE Married to EMPLOYEE”
Symmetry is the Problem A non-symmetric 1-1 relationship would not have the problem shown on previous slide. A symmetric M:N relationship would have a redundancy problem, whether implemented as in the 1-1 case or by a bridging table. E.g.: being-sibling-of.
symmetric (1:1) recursive relationship: redundant & non-redundant implementations As previously: redundant. MARRIED_V1 is just a bridging entity type: still redundant. MARRIAGE together with MARPART act as a sort of bridge: non-redundant.
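A hedged `sqlite3` sketch of the non-redundant MARRIAGE plus MARPART arrangement. The table names are from the slide; the columns `MAR_NUM` and `EMP_NUM`, the UNIQUE constraint used to keep the relationship 1:1, and the helper `spouse_of` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT NOT NULL);
CREATE TABLE MARRIAGE (MAR_NUM INTEGER PRIMARY KEY);

-- One row per partner in a marriage. UNIQUE on EMP_NUM means an
-- employee can take part in at most one marriage (the 1:1 case).
CREATE TABLE MARPART (
    MAR_NUM INTEGER NOT NULL REFERENCES MARRIAGE(MAR_NUM),
    EMP_NUM INTEGER NOT NULL UNIQUE REFERENCES EMPLOYEE(EMP_NUM),
    PRIMARY KEY (MAR_NUM, EMP_NUM)
);

INSERT INTO EMPLOYEE VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO MARRIAGE VALUES (1);
INSERT INTO MARPART VALUES (1, 1), (1, 2);
""")


def spouse_of(emp_num):
    """Return the other partner in the employee's marriage, or None."""
    row = conn.execute(
        """
        SELECT other.EMP_NUM
        FROM MARPART me
        JOIN MARPART other
          ON other.MAR_NUM = me.MAR_NUM AND other.EMP_NUM <> me.EMP_NUM
        WHERE me.EMP_NUM = ?
        """,
        (emp_num,),
    ).fetchone()
    return row[0] if row else None


marriage_count = conn.execute("SELECT COUNT(*) FROM MARRIAGE").fetchone()[0]
print(spouse_of(1), spouse_of(2), spouse_of(3), marriage_count)
```

The marriage is recorded once, yet it can be looked up from either partner, so neither direction is stored twice. Note that this sketch does not enforce exactly two partners per marriage; that would need an extra check.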
Symmetric M:N, etc. Method 3 on previous slide can straightforwardly be generalized to: symmetric recursive M:N relationships; ((and to (partially-)symmetric & recursive (Barnden-)ternary, etc. relationships, whatever the connectivity: M:N:P, 1:1:1, M:1:P, …))