M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #3 Matthew P. Johnson Stern School of Business, NYU Spring, 2005
M.P. Johnson, DBMS, Stern/NYU, Spring Admin Textbooks? This afternoon
M.P. Johnson, DBMS, Stern/NYU, Spring Agenda Last time: E/R models, some design issues This time: More design “carving at the joints” Redundancy Whether an element should be an attribute or entity set Replacing a relationships with entity sets Constraints Identifying & specifying key attributes to an entity set Recognizing other types of single-valued constraints Representing referential integrity constraints Identifying & representing general constraints Weak entity sets
M.P. Johnson, DBMS, Stern/NYU, Spring Review Multiplicity review: Square-of? (e.g., (3,9)) Cube-of? (e.g., (-3,-27)) Wife-of? Wife-of-in-certain-other-cultures?
M.P. Johnson, DBMS, Stern/NYU, Spring Design Principles Faithfulness Simplicity Avoiding redundancy Choice of relationships Picking elements
M.P. Johnson, DBMS, Stern/NYU, Spring Simplicity Einstein: Theories should be as simple as possible, but not simpler. Use as few elements as possible Minimum required relations No unnecessary attributes (will you be using this attribute?) Eliminate “spinning wheels” Example: how can we simplify this? MoviesOwningsStudios Owned-byOwns
M.P. Johnson, DBMS, Stern/NYU, Spring Avoiding redundancy Say everything exactly once Minimize database storage requirements More important: prevent possible update errors simplest but not only e.g.: modify data one place but not the other – more later Example: Spot the redundancy StudiosMovies Own StudioName Name Length Name Address
M.P. Johnson, DBMS, Stern/NYU, Spring Avoiding redundancy Say everything exactly once Minimize database storage requirements More important: prevent possible update errors simplest but not only e.g.: modify data one place but not the other – more later Example: Spot the redundancy StudiosMovies Own StudioName Name Length Name Address Redundancy: Movies “knows” the studio two ways Phone
M.P. Johnson, DBMS, Stern/NYU, Spring Spot more redundancy Different redundancy: studio info listed for every movie! Movies StudioName Name Length SAddress SPhon e Name Length Studio SAddress SPhone Pulp Fiction… Miramax NYC 212-… Sylvia… Miramax NYC 212-… Jay & Sil. Bob … Miramax NYC 212-… …
M.P. Johnson, DBMS, Stern/NYU, Spring Don’t add relships that are implied StudentsCourses TAs Enrolls TA-of Assist Suppose each course again has <=1 TA Q: Is the following good design? A: If TAs other than the course’s TA can help students, then yes; if not, then no: we can connect Students and TAs by going through Courses; redundant!
M.P. Johnson, DBMS, Stern/NYU, Spring Correct E/R models may contain loops Person plays multiple roles: employee of company buyer of product price address namessn Person buys makes employs Company Product namecategory stockprice name
M.P. Johnson, DBMS, Stern/NYU, Spring More design Repeating TA names & IDs – redundant TA is not TAing any course now lose TA’s data! TA should get its own ES StudentsCourses Enrolls Q: What’s wrong with this design? A: TA-NameTA-ID TA- Course-ID CName
M.P. Johnson, DBMS, Stern/NYU, Spring Opposite problem: Entity or attribute? Some E/Rs improved by removing entities Can convert Entity E into attributes of F if 1. R:F E is many-one one-one counts because special case 2. Attributes for E are independent of each other knowing one att val doesn’t tell us another att val Then remove E add all attributes of E to F
M.P. Johnson, DBMS, Stern/NYU, Spring StudentsCourses Enrolls TA-Name Assists TA Entity attribute CName Room StudentsCourses Enrolls CName Room TA-Name Course-ID
M.P. Johnson, DBMS, Stern/NYU, Spring Convert TA entity again? No! Multiple TAs allowed Violates condition (1) Redundant course data StudentsCourses Enrolls Assists TA CName CIDRoom TA-Name DBMS Howard DBMS Wesley … CName Room Course-ID TA-Name
M.P. Johnson, DBMS, Stern/NYU, Spring Convert TA entity again? StudentsCourses Enrolls Assists TA CName Room Course-ID TA-IDTA-Favorite-Color No! TA has dependent fields Violates condition (2) How can it tell? Redundant TA data CName TA-Name TA-ID TA-Color DBMS Ralph 678 Green A.Soft. Ralph 678 Green … TA-Name
M.P. Johnson, DBMS, Stern/NYU, Spring Entity or attributes? Should student address be an entity or an attribute? If student may have multiple addresses, must be entity campus address, permanent address attributes cannot be set-valued If we need to examine structure of address, must be entity find all students from NYS but not NYC If attribute, then it’s probably a simple string no structure! NB: this choice is a microcosm of entire miniworld (much) power of a DB comes from the structure imposed on the data
M.P. Johnson, DBMS, Stern/NYU, Spring Larger example DB design Application: library database. Authors have written books about various subjects; different libraries in the system may carry these books. Entities (with attributes in parentheses): Authors (ssn, name, phone, birthdate) Books (ISDN, title) Subjects (sname, sid) Libraries (lname) Relationships [associating entities in square brackets]: Wrote-on [Authors, Subjects] Cover [Libraries, Subjects] On [Books, Subjects]
M.P. Johnson, DBMS, Stern/NYU, Spring E/R of DB design Name Author ssnphonebirthdate wrote-on Subject SName Title Carries Library LName On Book ISBN
M.P. Johnson, DBMS, Stern/NYU, Spring Poor initial design First design is a poor model of this system Some info not captured: How many copies does a lib. have of a given book? What edition of a book does the library have? Design problems: no direct relship associating authors and books no direct relship associating libraries and books Common queries complex and difficult/expensive What libraries carry books by a given author? What books has a given author written? Who is the author of a given book?
M.P. Johnson, DBMS, Stern/NYU, Spring Larger example DB design 2 Application: library database as before Entities (with attributes in parentheses): Authors (ssn, name, phone, birthdate) Books (ISDN, title) Subjects (sname, sid) Libraries (lname) Relations [associating entities in square brackets] (attributes in parentheses): Wrote [Authors, Books] Carries [Libraries, Books] (quantity, edition) On [Books, Subjects]
M.P. Johnson, DBMS, Stern/NYU, Spring E/R of improved DB design Rule of thumb: often queried together make closely connected Name Author ssnphonebirthdate wrote Book ISBN Title Carries Library LName Edition Quantity On Subject SName
M.P. Johnson, DBMS, Stern/NYU, Spring Next topic: Constraints Review: programmer-defined rules stating what should always be true about consistent databases Restrictions on data: Keys (e.g. SSNs uniquely identify people) Single value constraints (e.g. everyone has 1 father) Referential Integrity (e.g. person’s record refers to father father must exist) Domain constraints (e.g. gender in M/F, age in ) General constraints (e.g. no more than 10 customers per sales rep) Can’t infer constraints from data may hold “accidentally” they are a part of the schema
M.P. Johnson, DBMS, Stern/NYU, Spring E/R keys Uniquely identifies entity in ES Attribute or set of attributes Two entities cannot agree on all key attributes These attributes determine all others Every ES should have a key possibly including all attributes Primary key attributes underlined More than one possible key: Candidate keys, primary key Practical tip: create intentional key attribute E.g. SSN, course-id, employee-id, etc. SSN likely shorter than (name,address) Prevents quasi-redundancy address namessn Person
M.P. Johnson, DBMS, Stern/NYU, Spring Single-valued constraints “at most one” value sharp arrows E.g. attributes: could be null or one Many-one relationships: the “one” part is single-valued. Can think of key atts as (non-null) single- valued TACourse Assists
M.P. Johnson, DBMS, Stern/NYU, Spring Referential integrity “Exactly one value” NOT NULL attributes Relationships Non-null value refers to entity that exists Refer to entity with foreign key HTML analogy: no broken links Programming analogy: no dangling pointers Ways of handling deletion: Prevent deletion as long as referrer exist Enforce deletion of all referrers InstructorCourse Taught
M.P. Johnson, DBMS, Stern/NYU, Spring Referential integrity – E/R e.g. Insertion – must refer to existing entity Suppose need to add course: “Oracle” instructor: MPJ Q: Which order? Q: What if relship were exactly-exactly, say, M(Hs,Ws)? i.e., referential integrity in both directions? A: Put both inserts in one xact – later StudentsCourses Enrolls Instructor Taught
M.P. Johnson, DBMS, Stern/NYU, Spring Other kinds of constraints Domain constraints E.g. date: must be after 1980 Enumerated type: grades A through F, no E No specific E/R notation: mention with attribute or relationship General constraints: A class may have no more than 100 students; a student may not have more than 6 courses: StudentsCourses Enroll <=6<=100
M.P. Johnson, DBMS, Stern/NYU, Spring Next topic: Weak entity sets Definition: Some or all key attributes belong to another ES Why: An entity set is part of a hierarchy (not ISA) Connecting entity sets The key consists of 0, 1 or more of its own attributes Key attributes of entity sets from supporting relationships
M.P. Johnson, DBMS, Stern/NYU, Spring Conditions of Supporting relationships Supporting relationship R:E F R is many-one (E-F) (or one-one) R is binary Referential integrity from E to F a rounded arrow Those atts supplied to E are the key attributes of F F itself may be weak Another entity set G, and so on recursively A1 A2 R E F
M.P. Johnson, DBMS, Stern/NYU, Spring For several supporting relships from E to F Keys of each F role appear as foreign key of E Other many-one relationships Not necessarily supporting Requirements for weak entity sets From By Purchases A1 A2 A3 People Stores At-store
M.P. Johnson, DBMS, Stern/NYU, Spring Weak entity sets Example: Hierarchy – species & genus Idea: species name unique per genus only Species name Belongs-to Genus name
M.P. Johnson, DBMS, Stern/NYU, Spring Next time We’ll finish E/R models and begin the relational model Read chapter 3 through section 3.4 Info on project, hw likely posted soon