Lecture 6 Data Model Design (continued)

Presentation transcript:

Lecture 6 Data Model Design (continued) Jeffery S. Horsburgh Hydroinformatics Fall 2013 This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732

Objectives
- Identify and describe important entities and relationships to model data
- Develop data models to represent, organize, and store data
- Design and use relational databases to organize, store, and manipulate data

Present and discuss your preliminary designs

Naming Database Objects
Names should:
- Be unique
- Have some meaning to the user
- Be short
- Contain no spaces or reserved characters
Entity and attribute names are nouns; relationship names are verbs.
Example: Many Observations are made at a Site.

More on Attributes
- Attribute values should be atomic: each value presents a single fact
- Atomic values allow for simpler programming, greater reusability of data, and easier implementation of changes

Atomic Attribute Example
Instead of one overloaded attribute:
  VariableName = “Dissolved Oxygen, mg/L, surface water”
you might use three:
  VariableName = “Dissolved Oxygen”
  Units = “mg/L”
  SampleMedium = “surface water”
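
One practical payoff of the three-attribute version is that a query can filter on a single fact without parsing text. The sketch below uses Python's built-in sqlite3 module; the table name Variables, the VariableID column, and the second row (Temperature, degC) are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Variables (
        VariableID   INTEGER PRIMARY KEY,
        VariableName TEXT NOT NULL,
        Units        TEXT NOT NULL,
        SampleMedium TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO Variables (VariableName, Units, SampleMedium) VALUES (?, ?, ?)",
    [
        ("Dissolved Oxygen", "mg/L", "surface water"),
        ("Temperature", "degC", "surface water"),  # hypothetical second row
    ],
)

# Units is its own column, so filtering on it needs no string parsing.
mg_per_l = [
    row[0]
    for row in conn.execute(
        "SELECT VariableName FROM Variables WHERE Units = ? ORDER BY VariableName",
        ("mg/L",),
    )
]
print(mg_per_l)
```

With the overloaded form, the same question would require a pattern match against free text, which breaks as soon as someone writes the units slightly differently.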

Common Attribute Atomicity Violations
- Simple aggregation: Address = “8200 Old Main Hill, Logan, UT, 84322”
- Complex codes: VariableCode = “DO_mgL_Avg”
- Text fields: free-form text. Overreliance on free-form text may mean that the model does not meet the data requirements.
- Mixed domains: the value of an attribute can have different meanings under different conditions.

Primary Keys
An attribute or set of attributes that uniquely identifies a specific instance of an entity (a row in the table).
Primary keys must:
- Have a non-null value for each instance of an entity
- Have a unique value for each instance of an entity
- Have values that do not change or become null
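
As a rough illustration of the uniqueness rule, the sketch below declares a primary key in SQLite (through Python's sqlite3 module) and tries to insert a second row with the same key. The table name Sites is an assumption. Only uniqueness is demonstrated here, because SQLite treats an INTEGER PRIMARY KEY column specially: inserting NULL into it produces an auto-assigned integer rather than an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL)"
)
conn.execute("INSERT INTO Sites (SiteID, SiteName) VALUES (1, 'Logan River')")

# A second row with the same primary key value violates uniqueness.
try:
    conn.execute("INSERT INTO Sites (SiteID, SiteName) VALUES (1, 'Spring Creek')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

site_count = conn.execute("SELECT COUNT(*) FROM Sites").fetchone()[0]
print(duplicate_rejected, site_count)
```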

Normalization
- Organizing the fields and tables in a relational database to minimize redundancy and dependency
- Dividing large tables into smaller tables (with relationships)
- Isolating data so that additions, deletions, and modifications of a field or record can be made in one place
- Reducing the need for restructuring the database as new types of data are introduced

Unnormalized Data Example
[Table: one row per measurement, with columns SiteID, SiteName, VariableID, VariableName, DateTime, Value. Sites: Logan River, Spring Creek. Variables: Temperature, pH. Dates: 1/1/2012, 1/2/2012.]

Issues with Unnormalized Data
- INSERT: The fact that a site or variable exists cannot be asserted until a measurement has been made.
- DELETE: If a row is deleted, information may be lost about not only the measurement, but also the variable and the site.
- UPDATE: If a SiteName or VariableName changes, multiple records have to be updated with the new information.

Normalization Example
Site table (one site has many data values):
  SiteID  SiteName
  1       Logan River
  2       Spring Creek
Variable table (one variable has many data values):
  VariableID  VariableName
  1           Temperature
  2           pH
Data value table columns: SiteID, VariableID, DateTime, Value
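
A minimal sketch of this three-table layout, assuming SQLite via Python's sqlite3 module. The table names Sites, Variables, and DataValues and the renamed site are assumptions, and the measurement rows are illustrative only, since the slide's full table did not survive in this transcript. The point is that a site name is stored once, so renaming it is a one-row update, while a join still reproduces the flat "unnormalized" view on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL);
    CREATE TABLE Variables (VariableID INTEGER PRIMARY KEY, VariableName TEXT NOT NULL);
    CREATE TABLE DataValues (
        SiteID     INTEGER NOT NULL REFERENCES Sites (SiteID),
        VariableID INTEGER NOT NULL REFERENCES Variables (VariableID),
        DateTime   TEXT NOT NULL,
        Value      REAL NOT NULL
    );
    INSERT INTO Sites VALUES (1, 'Logan River'), (2, 'Spring Creek');
    INSERT INTO Variables VALUES (1, 'Temperature'), (2, 'pH');
    -- Illustrative measurements only
    INSERT INTO DataValues VALUES
        (1, 1, '2012-01-01', 5.0),
        (1, 2, '2012-01-01', 8.0),
        (2, 1, '2012-01-01', 7.0),
        (2, 2, '2012-01-01', 7.5);
    """
)

# UPDATE touches exactly one row, however many measurements the site has.
renamed = conn.execute(
    "UPDATE Sites SET SiteName = 'Logan River (renamed)' WHERE SiteID = 1"
).rowcount

# A join rebuilds the flat, human-readable view from the three tables.
rows = conn.execute(
    """
    SELECT s.SiteName, v.VariableName, d.DateTime, d.Value
    FROM DataValues AS d
    JOIN Sites AS s ON s.SiteID = d.SiteID
    JOIN Variables AS v ON v.VariableID = d.VariableID
    ORDER BY d.SiteID, d.VariableID
    """
).fetchall()
for row in rows:
    print(row)
```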

Normalization Tradeoffs
Pros:
- Eliminates redundant data
- Saves space and can improve storage efficiency
- Inserts and updates are done in one place
- Can improve efficiency
Cons:
- May complicate the code of common queries
- Abstracts tables using keys, which can make it harder for a human to “see” the data

Data Integrity Rules: Entity Integrity
- Primary key must exist, be unique, and not be null
[Example tables: data values (ValueID, SiteID, VariableID, DateTime, Value; ValueIDs 101 to 108), sites (1 Logan River, 2 Spring Creek), variables (1 Temperature, 2 pH)]

Data Integrity Rules: Referential Integrity
- Every foreign key value must match a primary key value in an associated table
- Ensures that we can navigate relationships
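
A hedged sketch of referential integrity in practice, again using SQLite through Python's sqlite3 module. The table names and the SiteID value 99 are assumptions. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection; other RDBMS products enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(
    """
    CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL);
    CREATE TABLE DataValues (
        ValueID  INTEGER PRIMARY KEY,
        SiteID   INTEGER NOT NULL REFERENCES Sites (SiteID),
        DateTime TEXT NOT NULL,
        Value    REAL NOT NULL
    );
    INSERT INTO Sites VALUES (1, 'Logan River');
    INSERT INTO DataValues VALUES (101, 1, '2012-01-01', 5.0);
    """
)

# SiteID 99 has no matching row in Sites, so this would create an orphan.
try:
    conn.execute("INSERT INTO DataValues VALUES (102, 99, '2012-01-02', 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

value_count = conn.execute("SELECT COUNT(*) FROM DataValues").fetchone()[0]
print(orphan_rejected, value_count)
```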

Data Integrity Rules: Insert and Delete Rules
- What happens to a parent entity when child entities are deleted?
- What happens to child entities when a parent is deleted?
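
Two common answers to the second question are "delete the children too" (cascade) and "refuse to delete the parent" (restrict). The sketch below shows both in SQLite via Python's sqlite3 module. Which rule goes on which relationship is a design decision; the assignment here (cascade from Sites, restrict from Variables) and the table names are assumptions chosen only to show both behaviors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply the rules
conn.executescript(
    """
    CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL);
    CREATE TABLE Variables (VariableID INTEGER PRIMARY KEY, VariableName TEXT NOT NULL);
    CREATE TABLE DataValues (
        ValueID    INTEGER PRIMARY KEY,
        SiteID     INTEGER NOT NULL
                   REFERENCES Sites (SiteID) ON DELETE CASCADE,
        VariableID INTEGER NOT NULL
                   REFERENCES Variables (VariableID) ON DELETE RESTRICT,
        Value      REAL NOT NULL
    );
    INSERT INTO Sites VALUES (1, 'Logan River'), (2, 'Spring Creek');
    INSERT INTO Variables VALUES (1, 'Temperature'), (2, 'pH');
    INSERT INTO DataValues VALUES (101, 1, 1, 5.0), (102, 1, 2, 8.0), (103, 2, 1, 7.0);
    """
)

# RESTRICT: a variable that still has data values cannot be deleted.
try:
    conn.execute("DELETE FROM Variables WHERE VariableID = 1")
    variable_delete_blocked = False
except sqlite3.IntegrityError:
    variable_delete_blocked = True

# CASCADE: deleting a site also deletes its data values (101 and 102).
conn.execute("DELETE FROM Sites WHERE SiteID = 1")
remaining_value_ids = [
    row[0] for row in conn.execute("SELECT ValueID FROM DataValues ORDER BY ValueID")
]
print(variable_delete_blocked, remaining_value_ids)
```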

Data Integrity Rules: Value Domains
- A value domain is the valid set of values for an attribute
- Defined by controlled vocabulary, data type, length, range, constraints, default value
[Example table annotations: integer fields (ValueID, SiteID, VariableID), date field (DateTime), double (Value: 5.5, 5.678, 8.0, 8.9), controlled domain (variables: 1 Temperature, 2 pH)]
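
Several of these domain rules can be expressed directly in a table definition. The following is a sketch only, using SQLite via Python's sqlite3 module: the controlled vocabulary is written as a CHECK list, and the numeric range (-50 to 50) and the QualityLevel column with its default are hypothetical values invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DataValues (
        ValueID      INTEGER PRIMARY KEY,
        VariableName TEXT NOT NULL
                     CHECK (VariableName IN ('Temperature', 'pH')),  -- controlled vocabulary
        DataValue    REAL NOT NULL
                     CHECK (DataValue BETWEEN -50 AND 50),           -- hypothetical range
        QualityLevel TEXT NOT NULL DEFAULT 'raw'                     -- hypothetical default
    )
    """
)

# A valid row; QualityLevel is omitted and falls back to its default value.
conn.execute("INSERT INTO DataValues (VariableName, DataValue) VALUES ('pH', 8.0)")

# One row outside the vocabulary, one outside the range.
rejected = []
for name, value in [("Turbidity", 3.2), ("Temperature", 999.0)]:
    try:
        conn.execute(
            "INSERT INTO DataValues (VariableName, DataValue) VALUES (?, ?)",
            (name, value),
        )
    except sqlite3.IntegrityError:
        rejected.append((name, value))

stored = conn.execute(
    "SELECT VariableName, DataValue, QualityLevel FROM DataValues"
).fetchall()
print(stored, rejected)
```

In a larger design the controlled vocabulary would more likely live in its own table and be enforced with a foreign key, so that new terms can be added without altering the table definition.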

Specialization
- Designating entity subgroups within a higher level entity
- Entity subgroups have attributes or relationships that do not apply to the higher level entity
- Attributes are inherited: a lower level entity inherits all of the attributes and relationship participation of the higher level entity to which it is linked

Specialization Example A car is a vehicle A truck is a vehicle

Generalization
- Combine a number of entities that share features into a higher level entity
- Specialization and generalization are inverses of each other

Constraints on Specialization/Generalization
- Constraints on which entities can be members of a given lower-level entity set
  - Condition-defined: “all vehicles with a towing capacity of more than 10,000 lbs are trucks”
- Constraints on whether entities can belong to more than one lower-level entity set
  - Disjoint: an entity can belong to only one
  - Overlapping: an entity can belong to more than one
- Completeness constraint: must every higher-level entity belong to at least one lower-level entity?

Mapping Specialization to Tables Option 1: Put everything in one table There will be NULL values where attributes don’t apply

Mapping Specialization to Tables
Option 2: Form tables for the higher level entity and the lower level entities
- Each lower level entity includes the primary key of the higher level entity set
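
A possible shape for Option 2, sketched with the vehicle example in SQLite via Python's sqlite3 module. The column names (Make, NumDoors, TowingCapacityLbs) and the sample rows are assumptions. Each subtype table reuses the Vehicle primary key as both its own primary key and a foreign key, so shared attributes are stored once and recovered with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Higher level entity: attributes shared by all vehicles
    CREATE TABLE Vehicle (VehicleID INTEGER PRIMARY KEY, Make TEXT NOT NULL);

    -- Lower level entities: keyed by the Vehicle primary key
    CREATE TABLE Car (
        VehicleID INTEGER PRIMARY KEY REFERENCES Vehicle (VehicleID),
        NumDoors  INTEGER NOT NULL
    );
    CREATE TABLE Truck (
        VehicleID         INTEGER PRIMARY KEY REFERENCES Vehicle (VehicleID),
        TowingCapacityLbs INTEGER NOT NULL
    );

    INSERT INTO Vehicle VALUES (1, 'Make A'), (2, 'Make B');
    INSERT INTO Car VALUES (1, 4);
    INSERT INTO Truck VALUES (2, 12000);
    """
)

# "Inherited" attributes come from joining the subtype to the supertype.
trucks = conn.execute(
    """
    SELECT v.VehicleID, v.Make, t.TowingCapacityLbs
    FROM Truck AS t
    JOIN Vehicle AS v ON v.VehicleID = t.VehicleID
    """
).fetchall()
print(trucks)
```

Nothing in this schema by itself prevents the same VehicleID from appearing in both Car and Truck, so a disjoint constraint would need extra enforcement.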

Mapping Specialization to Tables
Option 3: Model only the lower level entities
- Repeats attributes

Steps in Data Model Design
1. Identify entities
2. Identify relationships among entities
3. Determine the cardinality and participation of relationships
4. Designate keys / identifiers for entities
5. List attributes of entities
6. Identify constraints and business rules
7. Map 1-6 to a physical implementation

Physical Data Model
The “physical” means a specific implementation of the data model:
- Choice of hardware and operating system
- Choice of relational database management system
- Implementation of tables, relationships, constraints, triggers, indices, data types
- Database access and security
- Performance
- Storage

Relational Database Management Systems (RDBMS)
- File vs. server based
- Free vs. commercial
- Different data types
- Potentially different syntax for SQL queries
- Security models and concurrent users

Reduction of an ER Diagram to Tables
- Converting an ER diagram to table format is the basis for deriving a relational database
- Primary keys allow entities to be expressed as tables that contain data
- A database is a collection of tables
- Tables are assigned the same name as the entity
- Each table has columns that correspond to attributes, and each column has a unique name
- Each column must have a single data type

Advanced Database Objects
- Views
- Stored procedures
- Triggers
- Constraints
Implementation of these objects may depend on your choice of RDBMS software.

Database Views
- A view can be queried like a table, but is defined by a SQL query
- Used to present a set of desired information, independent of the underlying database structure
- Can be used to hide complexities of the underlying data model from the user
- One way to address the cons of normalization
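
As an illustration, a view can package the joins that a normalized design requires so that users query one flat object. This sketch uses SQLite via Python's sqlite3 module; the table names, the view name DataValuesWithNames, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL);
    CREATE TABLE Variables (VariableID INTEGER PRIMARY KEY, VariableName TEXT NOT NULL);
    CREATE TABLE DataValues (
        ValueID    INTEGER PRIMARY KEY,
        SiteID     INTEGER NOT NULL REFERENCES Sites (SiteID),
        VariableID INTEGER NOT NULL REFERENCES Variables (VariableID),
        Value      REAL NOT NULL
    );
    INSERT INTO Sites VALUES (1, 'Logan River'), (2, 'Spring Creek');
    INSERT INTO Variables VALUES (1, 'Temperature'), (2, 'pH');
    INSERT INTO DataValues VALUES (101, 1, 1, 5.0), (102, 2, 2, 7.5);

    -- The view stores the query, not the data.
    CREATE VIEW DataValuesWithNames AS
        SELECT d.ValueID, s.SiteName, v.VariableName, d.Value
        FROM DataValues AS d
        JOIN Sites AS s ON s.SiteID = d.SiteID
        JOIN Variables AS v ON v.VariableID = d.VariableID;
    """
)

# Users select from the view as if it were a single flat table.
rows = conn.execute(
    "SELECT SiteName, VariableName, Value FROM DataValuesWithNames ORDER BY ValueID"
).fetchall()
print(rows)
```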

Stored Procedures
- A set of structured query language (SQL) statements that are stored and executed on the server
- Useful for repetitive tasks
- Encapsulate functionality and isolate users from data tables
- Can provide a security layer: software applications have no access to the database directly, but can execute stored procedures

Triggers
- Special kind of stored procedure
- Automatically executes on a table or view when an event occurs in the database
- Events include: CREATE, ALTER, INSERT, UPDATE, DELETE
- Mostly used to maintain the integrity of information in the database
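
A small sketch of a trigger, assuming SQLite via Python's sqlite3 module. SQLite triggers fire on INSERT, UPDATE, and DELETE only, so this example uses an UPDATE event; triggers on CREATE or ALTER are a feature of some other RDBMS products. The history table, the trigger name LogSiteRename, and the new site name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Sites (SiteID INTEGER PRIMARY KEY, SiteName TEXT NOT NULL);
    CREATE TABLE SiteNameHistory (
        SiteID  INTEGER NOT NULL,
        OldName TEXT NOT NULL,
        NewName TEXT NOT NULL
    );

    -- Runs automatically whenever SiteName is updated.
    CREATE TRIGGER LogSiteRename
    AFTER UPDATE OF SiteName ON Sites
    BEGIN
        INSERT INTO SiteNameHistory VALUES (OLD.SiteID, OLD.SiteName, NEW.SiteName);
    END;

    INSERT INTO Sites VALUES (1, 'Logan River');
    """
)

# The application only issues the UPDATE; the trigger writes the history row.
conn.execute("UPDATE Sites SET SiteName = 'Logan River (renamed)' WHERE SiteID = 1")
history = conn.execute(
    "SELECT SiteID, OldName, NewName FROM SiteNameHistory"
).fetchall()
print(history)
```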

Constraints
A common way to enforce data integrity. Examples:
- Not NULL: value in a column must not be NULL
- Unique: value(s) in specified column(s) must be unique for each row in a table
- Primary Key: value(s) in the specified column(s) must be unique for each row in the table and not be NULL
- Foreign Key: value(s) in the specified column(s) must reference an existing record in another table via its primary key
- Check: an expression that validates data and must not be FALSE
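
To make the first two constraint types concrete, here is a minimal sketch of NOT NULL and UNIQUE in SQLite via Python's sqlite3 module. The Variables table, its VariableCode column values, and the attempted rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Variables (
        VariableID   INTEGER PRIMARY KEY,
        VariableCode TEXT NOT NULL UNIQUE,
        VariableName TEXT NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO Variables (VariableCode, VariableName) VALUES ('TEMP', 'Temperature')"
)

# Each attempt breaks exactly one constraint.
attempts = {
    "duplicate code": ("TEMP", "Water temperature"),  # violates UNIQUE
    "missing name": ("PH", None),                     # violates NOT NULL
}
rejected = []
for label, params in attempts.items():
    try:
        conn.execute(
            "INSERT INTO Variables (VariableCode, VariableName) VALUES (?, ?)",
            params,
        )
    except sqlite3.IntegrityError:
        rejected.append(label)

row_count = conn.execute("SELECT COUNT(*) FROM Variables").fetchone()[0]
print(rejected, row_count)
```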

Data Types
- Each attribute of an entity (column in a database table) must have a single data type
- Data types are enforced by RDBMS software
Table: DataValues (attribute, data type, sample data)
- ValueID, SiteID, VariableID: Integer (samples: ValueID 1, SiteID 5)
- DateTime: Date/Time (sample: 8/15/2013 4:30 PM)
- DataValue: Double (sample: 4.567)

Data Types
Data types can be specific to RDBMS software.
MS SQL Server
- Integer: TINYINT, SMALLINT, INT, BIGINT
- Floating Point: FLOAT, REAL
- Decimal: NUMERIC, DECIMAL, SMALLMONEY, MONEY
- String: CHAR, VARCHAR, TEXT, NCHAR, NVARCHAR, NTEXT
- Date/Time: DATE, DATETIMEOFFSET, DATETIME2, SMALLDATETIME, DATETIME, TIME
MySQL
- Integer: TINYINT (8-bit), SMALLINT (16-bit), MEDIUMINT (24-bit), INT (32-bit), BIGINT (64-bit)
- Floating Point: FLOAT (32-bit), DOUBLE (aka REAL) (64-bit)
- Decimal: DECIMAL
- String: CHAR, BINARY, VARCHAR, VARBINARY, TEXT, TINYTEXT, MEDIUMTEXT, LONGTEXT
- Date/Time: DATETIME, DATE, TIMESTAMP, YEAR
PostgreSQL
- Integer: SMALLINT (16-bit), INTEGER (32-bit), BIGINT (64-bit)
- Floating Point: REAL (32-bit), DOUBLE PRECISION (64-bit)
- Decimal: DECIMAL, NUMERIC
- String: CHAR, VARCHAR, TEXT
- Date/Time: DATE, TIME (with/without TIMEZONE), TIMESTAMP (with/without TIMEZONE), INTERVAL
Quick summary from: http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems

Summary of 3 Levels of Data Model Design
[Table comparing which features appear at each level. Levels: Conceptual, Logical, Physical. Features: Entity Names, Entity Relationships, Attributes, Primary Keys, Foreign Keys, Table Names, Column Names, Column Data Types, Views, Stored Procedures, Triggers, Constraints.]

Summary
- Simple rules for naming objects and specifying domains can help protect the integrity of data
- Normalization can help reduce redundancy, increase storage efficiency, and protect data integrity, but there are tradeoffs
- Data integrity rules include relationships and domains and protect the integrity of data in the database
- Specialization and generalization require special consideration in implementation
- A physical database implementation requires choices about hardware, software, security, formats and storage, and other factors