44220: Database Design & Implementation Review & Assignment 1 Ian Perry Room: C41C Tel Ext.: 7287 E-mail: I.P.Perry@hull.ac.uk http://itsy.co.uk/ac/0708/sem2/44220_DDI/
Data, Information, …
Data Information What is Data? a series of observations, measurements, or facts. What is Information? data that have been transformed into a meaningful and useful form for people. Information = Data + Structure + Context: the same data can give different information if a different structure and/or context is applied.
Database = Data + Structure A database is: an organised collection of data. A database management system (DBMS) is: software designed to assist in maintaining and utilising large collections of data. A relational database management system (RDBMS) is: a specific type of DBMS.
Modelling the ‘Real’ World Requires us to focus on the critical aspects of the real world’s richness. Models are not Real/Complete. All models require decisions about what to include & what to exclude. These design decisions represent someone’s view of what is important (and what is not important) about a particular reality. As such, there is no right answer! All one might say is that this is a ‘good’ model, given the purpose we want to use it for.
Example Models Model Duck: Purpose; to show shape, colour, size, etc. Model Aeroplane: Purpose; to show general structure, identification of parts, flight characteristics, etc. Data Model: Purpose; the representation of objects of interest to an enterprise, allowing data to be structured (i.e. given meaning) and manipulated (for specific purposes).
The ‘Data Modelling Stack’
Conceptual Data Modelling Identify ALL of the relevant Entities: each must play a necessary role in the business system. Identify those Attributes that adequately describe each Entity: remember to choose ‘key’ attribute(s). Identify the Relationships between Entities: determine the Degree of each Relationship; determine the Type of each Relationship; attempt to decompose any many-to-many Relationships that you have identified.
Entities & Attributes Real World Situation: Hospital Entities = objects of interest, e.g.: Doctor, Nurse, Ward, Patient, etc. Attributes = describing each Entity, e.g.: Patient = Name, Address, Date-of-Birth, Gender, etc. Entity Definitions Staff (StaffID, Role, Name, Room, Extension, Speciality, …) Patient (FirstName, FamilyName, DOB, Address, Gender, …) NB. ‘key’ Attribute(s) MUST be identified.
Occurrence Diagrams? Use these (with values for Key Attributes) to discover how many occurrences of each Entity are actually on either side of a Relationship (i.e. the Degree of the Relationship).
[Figure: occurrence diagrams linking Staff (Fred Smith, Jane Bloggs, Arthur Jones, Angela Oust) to Ward (Ward 1, Ward 2, Ward 3), annotated with Degrees M, M, 1, 1.]
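The counting that an occurrence diagram supports can be sketched in a few lines of Python. This is only an illustration: the function name relationship_degree and the Staff/Ward links listed below are assumptions, not the links drawn on the slide.

```python
from collections import defaultdict


def relationship_degree(pairs):
    """Return '1:1', '1:M', 'M:1' or 'M:M' for a list of (left, right) links.

    Each pair stands for one line drawn between two occurrences.
    """
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)

    # A left occurrence linked to several right occurrences puts 'M' on the right side.
    left_has_many = any(len(rights) > 1 for rights in rights_per_left.values())
    # A right occurrence linked to several left occurrences puts 'M' on the left side.
    right_has_many = any(len(lefts) > 1 for lefts in lefts_per_right.values())

    left_side = "M" if right_has_many else "1"
    right_side = "M" if left_has_many else "1"
    return f"{left_side}:{right_side}"


# Hypothetical links between Staff and Ward occurrences (invented for illustration).
staff_ward_links = [
    ("Fred Smith", "Ward 1"),
    ("Fred Smith", "Ward 2"),
    ("Jane Bloggs", "Ward 2"),
    ("Arthur Jones", "Ward 3"),
    ("Angela Oust", "Ward 3"),
]
print(relationship_degree(staff_ward_links))
```

With these invented links Fred Smith works on two Wards and Ward 2 has two Staff, so the Degree comes out as M:M, which is the kind of Relationship that later needs decomposing.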
Degree, Type & Participation Diagrams
Ward (1) has beds for => / <= stays in (M) Patient
Clinic (1) treats => / <= attends (M) Patient
NB. the above Relationships are also Exclusive.
[Figure: Staff, Ward, Team; Degree M:1; <= employs, work in =>]
[Figure: Patient, Operation, Pat/Op; Degree M:1; <= performed on, has =>]
i.e. having ‘solved’ the M:M Relationship ‘problems’.
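ER Diagram for a Hospital
[Figure: ER diagram of Patient, Clinic, Ward, Operation, Pat/Op, Staff and Team. Labelled Relationships: Clinic treats => / <= attends Patient (1:M); Ward has beds for => / <= stays in Patient; Operation and Pat/Op: <= performed on, has =>; Staff and Team: <= work in.]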
Logical Data Modelling All about: translating our Conceptual Data Model so that it might be implemented using software that matches a specific Database Theory. Relational Database Theory, Codd (1970): allows us to develop mathematically rigorous abstract data models, composed of a number of distinct Relations. Tables are NOT Relations: simply the way we choose to mentally give flesh to our Logical Data Model.
Relations Are defined by a list of Attributes (i.e. columns), that: must be distinctly named. contain data entries that are atomic, of the same type, from the same domain. can be defined in any order. Tuples (i.e. rows): once again, ordering is irrelevant. must be unique (so need a Key). Relationships: are made via Primary/Foreign Key mechanism.
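Some of the rules above can be checked mechanically. The following Python sketch tests distinct Attribute names, atomic values and a unique Key; the function name check_relation, the two-attribute Staff example and its data values are assumptions made purely for illustration.

```python
def check_relation(attributes, rows, key):
    """Return a list of rule violations; an empty list means every check passed."""
    problems = []
    if len(set(attributes)) != len(attributes):
        problems.append("attribute names are not distinct")
    if key not in attributes:
        problems.append("key is not one of the attributes")
        return problems

    key_index = attributes.index(key)
    seen_keys = set()
    for row in rows:
        if len(row) != len(attributes):
            problems.append(f"{row!r}: wrong number of values")
            continue
        # Treat Python containers as non-atomic entries.
        if any(isinstance(value, (list, tuple, set, dict)) for value in row):
            problems.append(f"{row!r}: non-atomic value")
            continue
        if row[key_index] in seen_keys:
            problems.append(f"{row!r}: duplicate key value")
        seen_keys.add(row[key_index])
    return problems


# Hypothetical Staff tuples (values invented for illustration).
attributes = ("SCode", "Name")
good_rows = [(1, "Fred Smith"), (2, "Jane Bloggs")]
bad_rows = [(1, "Fred Smith"), (1, "Jane Bloggs"), (3, ["Arthur", "Jones"])]

print(check_relation(attributes, good_rows, "SCode"))
print(check_relation(attributes, bad_rows, "SCode"))
```

Because ordering is irrelevant, the check gives the same verdict whichever order the rows arrive in; only the Key values and the shape of each Tuple matter.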
Example Relations (+Tables) Staff (SCode, Name, Address, DoB, DoE) Contract (CCode, Site, Begin, End, Super)
Avoid Database Anomalies! What is an Anomaly? Anything we try to do with a database that leads to unexpected (unpredictable) results. Three types of Anomaly: insert, delete, update. Need to check your logical database design carefully: the only good database is an anomaly-free database.
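A small Python sketch may help to show what update and delete Anomalies look like in practice. It assumes a single Staff table that repeats RateOfPay alongside ScalePoint on every row; the rows, the pay figures and the helper name rates_by_scale_point are invented for illustration.

```python
# Hypothetical, un-split Staff rows: RateOfPay is repeated for every ScalePoint.
staff = [
    {"StaffID": 1001, "Name": "Fred Smith", "ScalePoint": 1, "RateOfPay": 14005},
    {"StaffID": 1002, "Name": "Jane Bloggs", "ScalePoint": 1, "RateOfPay": 14005},
    {"StaffID": 1003, "Name": "Arthur Jones", "ScalePoint": 2, "RateOfPay": 14789},
]


def rates_by_scale_point(rows):
    """Collect every RateOfPay recorded against each ScalePoint."""
    rates = {}
    for row in rows:
        rates.setdefault(row["ScalePoint"], set()).add(row["RateOfPay"])
    return rates


# Update anomaly: the rate is changed on one row only, so ScalePoint 1 now has two rates.
staff[0]["RateOfPay"] = 14500
inconsistent = {
    point: rates
    for point, rates in rates_by_scale_point(staff).items()
    if len(rates) > 1
}

# Delete anomaly: removing the only member of Staff on ScalePoint 2 also loses its rate.
remaining = [row for row in staff if row["StaffID"] != 1003]
lost_scale_points = set(rates_by_scale_point(staff)) - set(rates_by_scale_point(remaining))

print(inconsistent)
print(lost_scale_points)
```

The insert Anomaly is the mirror image of the delete case: in this layout a rate for a new ScalePoint cannot be recorded until some member of Staff is placed on it.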
A Conceptual Model Consider the following ‘simple’ conceptual data model:
[Figure: ER diagram of Staff, Course and Student, with Degree markers 1, M, N.]
Staff(Staff-ID, Name, Address, ScalePoint, RateOfPay, DOB, ...)
Student(Enrol-No, Name, Address, OLevelPoints, ...)
Course(CourseCode, Name, Duration, ...)
The ‘Translation’ Process Entities become Relations Attributes become Attributes (?) Entity Identifiers become Primary Keys Relationships are represented by additional Foreign Key Attributes in those Relations that are at the ‘M’ end of a 1:M relationship. Usually end up with more Relations than we originally defined as Entities, with: ‘Artificial’ Relations – to solve M:M problems. ‘Split-off Relations’ – to avoid dependency problems.
5 Relations from 3 Entities Student, Staff, Course; Team (Solves M:M Problem); Pay (Avoids Dependency Problem).
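One way to see the five Relations side by side is to declare them in an in-memory SQLite database from Python. This is a sketch under stated assumptions, not the definitive design: the hyphens are dropped from Staff-ID and Enrol-No to give valid column names, the column types are guesses, and Team is assumed to hold one (StaffID, CourseCode) pair per member of Staff teaching a Course.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Pay (                 -- 'split-off' Relation
    ScalePoint INTEGER PRIMARY KEY,
    RateOfPay  INTEGER NOT NULL
);
CREATE TABLE Staff (
    StaffID    INTEGER PRIMARY KEY,
    Name       TEXT,
    Address    TEXT,
    ScalePoint INTEGER REFERENCES Pay(ScalePoint),
    DOB        TEXT
);
CREATE TABLE Student (
    EnrolNo      INTEGER PRIMARY KEY,
    Name         TEXT,
    Address      TEXT,
    OLevelPoints INTEGER,
    Tutor        INTEGER REFERENCES Staff(StaffID)
);
CREATE TABLE Course (
    CourseCode INTEGER PRIMARY KEY,
    Name       TEXT,
    Duration   TEXT
);
CREATE TABLE Team (                -- 'artificial' Relation (columns are an assumption)
    StaffID    INTEGER REFERENCES Staff(StaffID),
    CourseCode INTEGER REFERENCES Course(CourseCode),
    PRIMARY KEY (StaffID, CourseCode)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces Foreign Keys when asked
conn.executescript(SCHEMA)

# Hypothetical rows, invented for illustration.
conn.execute("INSERT INTO Pay VALUES (1, 14005)")
conn.execute("INSERT INTO Staff VALUES (1001, 'Fred Smith', NULL, 1, NULL)")
conn.execute("INSERT INTO Course VALUES (101, 'Comp', NULL)")
conn.execute("INSERT INTO Team VALUES (1001, 101)")


def insert_is_rejected(sql):
    """Return True if the statement breaks a Primary or Foreign Key constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


table_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(table_names)
# A ScalePoint that does not exist in Pay should be refused.
print(insert_is_rejected("INSERT INTO Staff VALUES (1002, 'Jane Bloggs', NULL, 9, NULL)"))
```

Keeping RateOfPay only in Pay means a rate is stored once per ScalePoint, and the composite Primary Key on Team stops the same Staff/Course pairing being recorded twice.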
Document as Database Schema? A Database Schema defines all Relations (together with Attributes and Primary/Foreign Keys) and their relevant Domains. We should have ‘captured’ the Business situation (assumptions and constraints) in the Conceptual Data Model, e.g.: a College only delivers 10 Courses. These assumptions and constraints become the Domains of the Database Schema.
Database Schema - Domains
Schema College
Domains
    StudentIdentifiers = 1 - 9999;
    StaffIdentifiers = 1001 - 1199;
    PersonNames = TextString (15 Characters);
    Addresses = TextString (25 Characters);
    CourseIdentifiers = 101 - 110;
    CourseNames = Comp, IS, Law, Mkt, ...;
    OLevelPoints = 0 - 100;
    ScalePoints = 1 - 12;
    PayRates = £14,005, £14,789, £15,407, ...;
    StaffBirthDates = Date (dd/mm/yyyy), >21 Years before Today;
Database Schema - Relations
Relation Student
    Enrol-No: StudentIdentifiers;
    Name: PersonNames;
    Address: Addresses;
    OLevel: OLevelPoints;
    Tutor: StaffIdentifiers;
    Primary Key: Enrol-No.
    Foreign Key: Tutor refs Staff.Staff-ID
Relation Staff
    Staff-ID: StaffIdentifiers;
    ScalePoint: ScalePoints;
    DOB: StaffBirthDates;
    Primary Key: Staff-ID.
    Foreign Key: ScalePoint refs Pay.ScalePoint
Continue to define each of the other Relations in a similar manner. Remember to define ALL of the Relations, including: ‘artificial’ ones (e.g. Team) ‘split-off’ ones (e.g. Pay)
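Domains only earn their keep if values are checked against them. The Python sketch below tests a Student tuple against the Domains named in the schema; the names DOMAINS, STUDENT and domain_violations are hypothetical, and reading ‘TextString (15 Characters)’ as ‘at most 15 characters’ is an assumption.

```python
# Each Domain is modelled as a function that accepts or rejects a single value.
DOMAINS = {
    "StudentIdentifiers": lambda v: isinstance(v, int) and 1 <= v <= 9999,
    "StaffIdentifiers": lambda v: isinstance(v, int) and 1001 <= v <= 1199,
    "PersonNames": lambda v: isinstance(v, str) and len(v) <= 15,
    "Addresses": lambda v: isinstance(v, str) and len(v) <= 25,
    "OLevelPoints": lambda v: isinstance(v, int) and 0 <= v <= 100,
}

# Relation Student: Attribute name -> Domain name, as listed in the schema.
STUDENT = {
    "Enrol-No": "StudentIdentifiers",
    "Name": "PersonNames",
    "Address": "Addresses",
    "OLevel": "OLevelPoints",
    "Tutor": "StaffIdentifiers",
}


def domain_violations(relation, row):
    """Return the sorted names of Attributes whose values fall outside their Domain."""
    return sorted(
        attribute
        for attribute, domain in relation.items()
        if not DOMAINS[domain](row[attribute])
    )


# Hypothetical Student tuples (values invented for illustration).
good_student = {"Enrol-No": 42, "Name": "A Student", "Address": "1 High Street",
                "OLevel": 55, "Tutor": 1001}
bad_student = {"Enrol-No": 0, "Name": "A Student", "Address": "1 High Street",
               "OLevel": 55, "Tutor": 2000}

print(domain_violations(STUDENT, good_student))
print(domain_violations(STUDENT, bad_student))
```

Note that a Domain check alone does not enforce the Foreign Key: a Tutor value of 1150 is inside StaffIdentifiers even if no such member of Staff exists, so the ‘refs Staff.Staff-ID’ rule still has to be checked separately.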
Assignment 1? Read the Case Study carefully: NB.: Must understand the Business (i.e. Learn-by-Post) for whom you are developing this database. Two parts: develop an appropriate conceptual data model that might deliver the information requirements. develop a robust logical data model that will deliver the information requirements. NB.: Test BOTH Data Models with the 10 questions at the end of the Learn-by-Post Case Study.
Answer the Questions I have set! Part 1 – Conceptual Data Model (40 Marks) ER Diagram; depicting the Relationships between all Entities, AND indicating the degree, type & participation of each Relationship. Part 2 – Logical Data Model (60 Marks) Database Schema; specifying all Domains, Relations, Attributes and Primary & Foreign Keys. NB.: BOTH of the above MUST be in the format as defined in the Lectures and practised during the Workshops. Ass 1 Deadline = Tuesday, the 1st of April, 2008.