RELATIONAL DATA MODELING MIS2502 Data Analytics


What is a model? Representation of something in the real world

Modeling a database
A representation of the structure of the data
- Describes the data contained in the database
- Explains how the data interrelates
Example: a student is part of a section, which is part of a course

Why bother modeling?
- Creates a blueprint before you start building the database
- Gets the story straight: easy for non-technical people to understand
- Minimizes having to go back and make changes in the implementation stage

The process of analysis and design
Systems Analysis: analysis of complex, large-scale systems and the interactions within those systems
Systems Design: the process of defining the hardware and software architectures, components, models, interfaces, and data for a computer system to satisfy specified requirements
Notice that they are not the same!

Basically…
Systems Analysis is the process of modeling the problem
- Requirements-oriented
- What should we do?
Systems Design is the process of modeling a solution
- Functionality-oriented
- How should we do it?
In the context of database development, analysis is where we define and understand the business scenario, and design is where we implement that scenario as a database.

Start with a problem statement
“We want a database to track orders.”
That’s too vague to create a useful system, so we then gather requirements to learn more
Gather documentation
- About the business process
- About existing systems
Conduct interviews
- Employees directly involved in the process
- Other stakeholders (e.g., customers)
- Management
Why is each of these important? Are there others?

Start with a problem statement
Refine the problem statement
- Getting iterative feedback from the client
End up with a scenario like this:
- The system must track customer orders
- Multiple products can go into an order
- A customer is described by their name, address, and a unique Customer ID number
- An order is described by the date on which it was placed, what was bought, and how much it costs
The specification “what was bought” is a little vague, and that will cause us a problem a little later. But let’s leave it for now…

The Entity Relationship Diagram (ERD)
The primary way of modeling a relational database
Part of the “analysis” process
Implemented as a picture with three key elements:
- Entity: a uniquely identifiable thing (e.g., person, order)
- Relationship: describes how two entities relate to one another (e.g., makes)
- Attribute: a characteristic of an entity or relationship (e.g., first name, order number)

A very simple example
[ERD: Customer makes Order]
Customer attributes: Customer ID, First name, Last name, City, State, Zip
Order attributes: Order number, Order Date, Product name, Price

The primary key
Entities need to be uniquely identifiable
- So you can tell them apart when you retrieve them
Use a primary key
- An attribute (or a set of attributes) that uniquely identifies an entity
Customer ID: uniquely identifies a customer
Order number: uniquely identifies an order
How about these as primary keys for Customer: First name and/or last name? Social security number?
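The primary key idea above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the table layout and the second customer ID (1002) are assumptions made for illustration, not part of the slides. It shows that two customers may share a name, while a repeated Customer ID is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,  -- the primary key
        FirstName  TEXT,
        LastName   TEXT
    )
""")

conn.execute("INSERT INTO Customer VALUES (1001, 'Greg', 'House')")
# Same name, different (hypothetical) ID: allowed, because names are not the key.
conn.execute("INSERT INTO Customer VALUES (1002, 'Greg', 'House')")

try:
    # Reusing CustomerID 1001 violates the primary key.
    conn.execute("INSERT INTO Customer VALUES (1001, 'Lisa', 'Cuddy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

customer_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
print(duplicate_rejected, customer_count)
```

This is one way to see why first and last name make a poor primary key: nothing stops two different people from having the same name.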

Last component: Cardinality
Defines the rules of the association between entities
[ERD: Customer makes Order]
This is a one-to-many relationship:
- One customer can have many orders (at least one, at most many)
- One order can only belong to one customer (at least one, at most one)
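As a rough illustration of how the "order" side of this cardinality might be enforced once the model becomes a database, the sketch below uses Python's sqlite3 module. The column names, order numbers, and dates are assumptions for illustration. A NOT NULL foreign key makes every order belong to exactly one existing customer, while nothing limits how many orders a customer has. (The "at least one order per customer" side is not enforced here.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        LastName   TEXT
    );
    -- "Order" is quoted because ORDER is a reserved word in SQL.
    CREATE TABLE "Order" (
        OrderNumber INTEGER PRIMARY KEY,
        OrderDate   TEXT,
        CustomerID  INTEGER NOT NULL REFERENCES Customer(CustomerID)
    );
    INSERT INTO Customer VALUES (1001, 'House');
    -- One customer, many orders: both rows point at customer 1001.
    INSERT INTO "Order" VALUES (1, '2024-01-05', 1001);
    INSERT INTO "Order" VALUES (2, '2024-01-09', 1001);
""")


def try_insert(sql):
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# An order with no customer breaks "at least one".
no_customer = try_insert("""INSERT INTO "Order" VALUES (3, '2024-01-10', NULL)""")
# An order for a customer that does not exist breaks the relationship itself.
unknown_customer = try_insert("""INSERT INTO "Order" VALUES (4, '2024-01-10', 9999)""")

orders_for_1001 = conn.execute(
    'SELECT COUNT(*) FROM "Order" WHERE CustomerID = 1001'
).fetchone()[0]
print(no_customer, unknown_customer, orders_for_1001)
```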

Crow’s Feet Notation
[ERD: Customer makes Order]
So called because the symbol at the “many” end of the relationship line looks something like a crow’s foot.
There are other ways of denoting cardinality, but this one is pretty standard.
There are also variations of the crow’s feet notation!

Cardinality is defined by business rules
What would the cardinality be in these situations?
- Order contains Product
- Course has Section
- Employee has Office

But we have a problem with our ERD
This assumes every order contains only one product. So if I want two products, I have to make two orders!
The problem: Product is defined as an attribute, not an entity. (Because we didn’t define our requirements clearly enough?)

Here’s a solution
[ERD: Customer makes Order; Order contains Product]
Customer attributes: Customer ID, First name, Last name, City, State, Zip
Order attributes: Order number, Order Date
Product attributes: Product name, Price
Attribute of the “contains” relationship: Quantity
Now:
- A customer can make multiple orders
- An order can contain multiple products
- A product can be part of multiple orders

Implementing the ERD
As a database schema
- A map of the tables and fields in the database
- This is what is implemented in the database management system
- Part of the “design” process
A schema actually looks a lot like the ERD
- Entities become tables
- Attributes become fields
- Relationships can become additional tables

Structure of a database
Data element   Description
Character      Single letter or number (“A”, “Z”, “1”)
Field          Set of related characters (first name)
Record         Set of related fields (all information about a customer)
Table          Set of related records (all customers in the company)
Database       Set of related tables (all information about the company)

The Rules
1. Create a table for every entity
2. Create table fields for every entity’s attributes
3. Implement relationships between the tables
- 1:many relationships: primary key field of the “1” table is put into the “many” table as a foreign key field
- many:many relationships: create a new table that has 1:many relationships with the original tables
- 1:1 relationships: primary key field of one table is put into the other table as a foreign key field

Our Order Database schema
- Customer to Order: the original 1:n relationship
- Order to Product: the original n:n relationship
- Order-Product is a decomposed many-to-many relationship
- Order-Product has a 1:n relationship with Order and Product
Now an order can have multiple products, and a product can be associated with multiple orders
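The schema described above could be written down roughly as follows, using Python's sqlite3 module. This is a sketch under assumptions: the column types, the compact table name OrderProduct, and the OrderProductID and ProductID key columns are choices made for illustration. The comments map each statement back to the three rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Rules 1 and 2: one table per entity, one field per attribute.
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        FirstName TEXT, LastName TEXT, City TEXT, State TEXT, Zip TEXT
    );
    CREATE TABLE Product (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT,
        Price       REAL
    );
    -- Rule 3, 1:many: the key of the "1" table (Customer) goes into the
    -- "many" table (Order) as a foreign key. ORDER is a reserved word,
    -- so the table name is quoted.
    CREATE TABLE "Order" (
        OrderNumber INTEGER PRIMARY KEY,
        OrderDate   TEXT,
        CustomerID  INTEGER NOT NULL REFERENCES Customer(CustomerID)
    );
    -- Rule 3, many:many: a new table with a 1:many link to each original table.
    CREATE TABLE OrderProduct (
        OrderProductID INTEGER PRIMARY KEY,
        OrderNumber    INTEGER NOT NULL REFERENCES "Order"(OrderNumber),
        ProductID      INTEGER NOT NULL REFERENCES Product(ProductID),
        Quantity       INTEGER
    );
""")

# List the tables that now exist, and the tables OrderProduct points at.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
referenced = sorted(row[2] for row in conn.execute(
    "PRAGMA foreign_key_list(OrderProduct)"))
print(tables)
print(referenced)
```

Quantity lives in OrderProduct because it describes one product within one order, not the order or the product alone.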

What the Customer and Order tables look like
Customer Table
CustomerID  FirstName  LastName  City        State  Zip
1001        Greg       House     Princeton   NJ
            Lisa       Cuddy     Plainsboro  NJ
            James      Wilson    Pittsgrove  NJ
            Eric       Foreman   Warminster  PA     19111
Order Table
Order Number  OrderDate  Customer ID
Note that there are no repeating records
- Every customer is unique
- Every order is unique
This is an example of normalization.

Normalization
Organizing data to minimize redundancy (repeated data)
This is good for two reasons
- The database takes up less space
- You have a lower chance of inconsistencies in your data
If you want to make a change to a record, you only have to make it in one place
- The relationships take care of the rest
But you will usually need to link the separate tables together in order to retrieve information

To figure out who ordered what
Match the Customer IDs of the two tables, starting with the table with the foreign key (Order):
Order Table and Customer Table, matched on Customer ID
Order Number  OrderDate  Customer ID  FirstName  LastName  City        State  Zip
                                      Greg       House     Princeton   NJ
                                      Lisa       Cuddy     Plainsboro  NJ
                                      Greg       House     Princeton   NJ
                                      Eric       Foreman   Warminster  PA     19111
We now know which order belonged to which customer
- This is called a join
- But storing the data in this joined form would be inefficient (redundancies)
- So we normalize
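The matching step above is what a SQL join does. Here is a minimal sketch with Python's sqlite3 module. The customer names come from the slides, but the order numbers, order dates, and the customer IDs other than 1001 are placeholder values assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        FirstName TEXT,
        LastName TEXT
    );
    CREATE TABLE "Order" (
        OrderNumber INTEGER PRIMARY KEY,
        OrderDate TEXT,
        CustomerID INTEGER REFERENCES Customer(CustomerID)
    );
    INSERT INTO Customer VALUES
        (1001, 'Greg', 'House'), (1002, 'Lisa', 'Cuddy'),
        (1003, 'James', 'Wilson'), (1004, 'Eric', 'Foreman');
    -- Placeholder orders: customer 1001 has two, customer 1003 has none.
    INSERT INTO "Order" VALUES
        (101, '2024-03-04', 1001), (102, '2024-03-04', 1002),
        (103, '2024-03-05', 1001), (104, '2024-03-07', 1004);
""")

# Start from the table holding the foreign key (Order) and match Customer IDs.
joined = conn.execute("""
    SELECT o.OrderNumber, o.OrderDate, c.FirstName, c.LastName
    FROM "Order" AS o
    JOIN Customer AS c ON c.CustomerID = o.CustomerID
    ORDER BY o.OrderNumber
""").fetchall()

for row in joined:
    print(row)
```

Greg House appears once in the Customer table but twice in the joined result, which is the redundancy the slide warns about.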

Now the many:many relationship
Order Table
Order Number  OrderDate  Customer ID
Product Table
ProductID  ProductName   Price
2251       Cheerios
           Bananas
           Eggo Waffles  2.99
Order-Product Table
Order ProductID  Order Number  Product ID  Quantity
This table relates Order and Product to each other!

To figure out what each order contains
Match the Product IDs and Order Numbers of the tables, starting with the table with the foreign keys (Order-Product):
Order-Product Table columns: Order ProductID, Order Number, Product ID, Quantity
Order Table columns: Order Number, Order Date, Customer ID
Product Table columns: Product ID, Product Name, Price
Product names in the joined rows, in order: Cheerios, Bananas, Eggo Waffles, Cheerios, Bananas, Eggo Waffles, Eggo Waffles (2.99)
Now there is redundant product data as a result of the join!
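A sketch of this three-table match, again with Python's sqlite3 module. Product 2251 (Cheerios) and the 2.99 price for Eggo Waffles come from the slides. The other product IDs, prices, order numbers, dates, and quantities are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Order" (OrderNumber INTEGER PRIMARY KEY, OrderDate TEXT);
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price REAL);
    CREATE TABLE OrderProduct (
        OrderProductID INTEGER PRIMARY KEY,
        OrderNumber INTEGER REFERENCES "Order"(OrderNumber),
        ProductID INTEGER REFERENCES Product(ProductID),
        Quantity INTEGER
    );
    INSERT INTO "Order" VALUES (101, '2024-03-04'), (102, '2024-03-05');
    INSERT INTO Product VALUES
        (2251, 'Cheerios', 3.99), (2282, 'Bananas', 1.29), (2505, 'Eggo Waffles', 2.99);
    INSERT INTO OrderProduct VALUES
        (1, 101, 2251, 2), (2, 101, 2282, 1), (3, 101, 2505, 3),
        (4, 102, 2251, 1), (5, 102, 2505, 1);
""")

# Start from Order-Product (it holds both foreign keys) and follow each key.
contents = conn.execute("""
    SELECT op.OrderNumber, o.OrderDate, p.ProductName, p.Price, op.Quantity
    FROM OrderProduct AS op
    JOIN "Order" AS o ON o.OrderNumber = op.OrderNumber
    JOIN Product AS p ON p.ProductID = op.ProductID
    ORDER BY op.OrderProductID
""").fetchall()

for row in contents:
    print(row)

# The same product name and price now show up once per order line.
eggo_rows = [row for row in contents if row[2] == "Eggo Waffles"]
print(len(eggo_rows))
```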

Why redundant data is a big deal
The redundant data seems harmless, but:
- What if the price of “Eggo Waffles” changes?
- And what if Greg House changes his address?
- And if there are 1,000,000 records?
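To make the price-change question concrete, the sketch below compares a flat table that repeats product details on every order line with a normalized Product table. The table name FlatOrderLine, the order numbers, the product IDs, and all prices other than 2.99 are hypothetical values chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat design: product details are copied onto every order line.
    CREATE TABLE FlatOrderLine (OrderNumber INTEGER, ProductName TEXT, Price REAL);
    INSERT INTO FlatOrderLine VALUES
        (101, 'Eggo Waffles', 2.99), (102, 'Eggo Waffles', 2.99),
        (103, 'Eggo Waffles', 2.99), (103, 'Cheerios', 3.99);

    -- Normalized design: each product is stored once.
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price REAL);
    INSERT INTO Product VALUES (2251, 'Cheerios', 3.99), (2505, 'Eggo Waffles', 2.99);
""")

new_price = 3.49  # hypothetical new price

# rowcount reports how many rows each UPDATE had to touch.
flat_rows_changed = conn.execute(
    "UPDATE FlatOrderLine SET Price = ? WHERE ProductName = 'Eggo Waffles'",
    (new_price,),
).rowcount
normalized_rows_changed = conn.execute(
    "UPDATE Product SET Price = ? WHERE ProductName = 'Eggo Waffles'",
    (new_price,),
).rowcount

print(flat_rows_changed, normalized_rows_changed)
```

With a million order lines the flat design would need correspondingly many row changes, and any row that is missed leaves the data inconsistent.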

Best practices for normalization
Create new entities when there are collections of related attributes, especially when they would repeat
For example, consider a modified Product entity
Don’t do this…
- Product attributes: Product name, Price, Vendor Name, Vendor Address, Vendor Phone
…do this.
- Product attributes: Product name, Price
- Vendor attributes: Vendor ID, Vendor Name, Vendor Address, Vendor Phone
- Relationship: Vendor sells Product
Then you won’t have to repeat vendor information for each product.
Why did we introduce VendorID?

Best practices for normalization
Create new entities to enforce data entry standards
This is fine…
- Customer attributes: Customer ID, First name, Last name, City, State, Zip
…but this can be even better.
- Customer attributes: Customer ID, First name, Last name, Zip
- City attributes: City ID, City Name
- State attributes: State ID, State Name
The city name is entered only once in the City table; CityID is used in the Customer table

City and State as “lookup tables”
Why this can be a better way of doing it
Customer
CustomerID  FirstName  LastName  CityID  StateID  Zip
1001        Greg       House
            Lisa       Cuddy
            James      Wilson
            Eric       Foreman
City
CityID  CityName
1       Princeton
2       Plainsboro
3       Pittsgrove
4       Warminster
State
StateID  StateName     Abbr
1        New Jersey    NJ
2        Pennsylvania  PA
This helps prevent inconsistent spellings (Pennsylvania is always entered as “2”)
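A small sketch of the lookup-table idea using Python's sqlite3 module. The City and State rows follow the slide. Customer ID 1004 for Eric Foreman, the customer's CityID and StateID values, and the rejected test row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE City (CityID INTEGER PRIMARY KEY, CityName TEXT);
    CREATE TABLE State (StateID INTEGER PRIMARY KEY, StateName TEXT, Abbr TEXT);
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        FirstName TEXT, LastName TEXT,
        CityID INTEGER REFERENCES City(CityID),
        StateID INTEGER REFERENCES State(StateID),
        Zip TEXT
    );
    INSERT INTO City VALUES
        (1, 'Princeton'), (2, 'Plainsboro'), (3, 'Pittsgrove'), (4, 'Warminster');
    INSERT INTO State VALUES (1, 'New Jersey', 'NJ'), (2, 'Pennsylvania', 'PA');
    -- Customers store only the IDs; the spellings live in the lookup tables.
    INSERT INTO Customer VALUES
        (1001, 'Greg', 'House', 1, 1, NULL),
        (1004, 'Eric', 'Foreman', 4, 2, '19111');
""")

# Joining back through the lookup tables recovers the names.
eric = conn.execute("""
    SELECT c.FirstName, ci.CityName, s.StateName
    FROM Customer AS c
    JOIN City AS ci ON ci.CityID = c.CityID
    JOIN State AS s ON s.StateID = c.StateID
    WHERE c.LastName = 'Foreman'
""").fetchone()
print(eric)

# A state that is not in the lookup table cannot be entered at all.
try:
    conn.execute("INSERT INTO Customer VALUES (1005, 'Test', 'Customer', 1, 3, NULL)")
    bad_state_accepted = True
except sqlite3.IntegrityError:
    bad_state_accepted = False
print(bad_state_accepted)
```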

The three-way relationship
Sometimes three entities are necessary to capture what happens in a transaction
This would be modeled as a many-to-many-to-many relationship
[ERD: Mechanic performs Repair on Car]
Mechanic attributes: Employee ID, Name, Salary
Car attributes: VIN, Make, Model
Repair attributes: Repair code, Description, Charge
Attribute of the “Performs” relationship: Repair date

The many:many:many table
The many-to-many-to-many relationship would still be represented as a separate table
- Just with three foreign keys, instead of two
Table columns: RepairID, RepairCode, EmployeeID, VIN, Repair date
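One possible rendering of this table with Python's sqlite3 module is sketched below. The relationship table's name (PerformedRepair), the column types, and every data row are invented for illustration. Only the column list RepairID, RepairCode, EmployeeID, VIN, Repair date comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Mechanic (EmployeeID INTEGER PRIMARY KEY, Name TEXT, Salary REAL);
    CREATE TABLE Car (VIN TEXT PRIMARY KEY, Make TEXT, Model TEXT);
    CREATE TABLE Repair (RepairCode TEXT PRIMARY KEY, Description TEXT, Charge REAL);
    -- The relationship table: one row per repair event, with three foreign keys.
    CREATE TABLE PerformedRepair (
        RepairID   INTEGER PRIMARY KEY,
        RepairCode TEXT    NOT NULL REFERENCES Repair(RepairCode),
        EmployeeID INTEGER NOT NULL REFERENCES Mechanic(EmployeeID),
        VIN        TEXT    NOT NULL REFERENCES Car(VIN),
        RepairDate TEXT
    );
    -- Placeholder data.
    INSERT INTO Mechanic VALUES (1, 'Mechanic A', 50000), (2, 'Mechanic B', 52000);
    INSERT INTO Car VALUES ('VIN-0001', 'Make X', 'Model 1'), ('VIN-0002', 'Make Y', 'Model 2');
    INSERT INTO Repair VALUES ('OIL', 'Oil change', 40.0), ('BRK', 'Brake service', 150.0);
    INSERT INTO PerformedRepair VALUES
        (1, 'OIL', 1, 'VIN-0001', '2024-02-01'),
        (2, 'BRK', 1, 'VIN-0002', '2024-02-03'),
        (3, 'OIL', 2, 'VIN-0002', '2024-02-10');
""")

# Who performed which repair on which car, and when.
history = conn.execute("""
    SELECT m.Name, r.Description, c.VIN, pr.RepairDate
    FROM PerformedRepair AS pr
    JOIN Mechanic AS m ON m.EmployeeID = pr.EmployeeID
    JOIN Repair AS r ON r.RepairCode = pr.RepairCode
    JOIN Car AS c ON c.VIN = pr.VIN
    ORDER BY pr.RepairID
""").fetchall()

fk_count = len(conn.execute("PRAGMA foreign_key_list(PerformedRepair)").fetchall())

for row in history:
    print(row)
print(fk_count)
```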