Data Quality Class 4. Goals Questions Review of SQL select Data Quality Rules.

Presentation transcript:

Data Quality Class 4

Goals Questions Review of SQL select Data Quality Rules

SQL Structured Query Language Used to extract data from databases Used to insert data into a database

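The Select Statement select [all | distinct] <select_list> from [<table_name> | <view_name>] [,[<table_name> | <view_name>]...] [where <search_conditions>] [group by <column_name> [, <column_name>]...] [having <search_conditions>] [order by {<column_name> | <select_list_number>} [asc | desc] [,{<column_name> | <select_list_number>} [asc | desc]]...]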
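As a rough illustration of how the clauses of a select statement combine, the sketch below runs one query against an in-memory SQLite database from Python. The orders table, its columns, and the sample rows are invented for this example, and the SQLite dialect is an assumption; the database used in class may differ.

```python
import sqlite3

# Hypothetical table and rows, made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, item_class TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Ann", "CLOTHING", 40.0),
        ("Ann", "BOOKS", 15.0),
        ("Bob", "BOOKS", 5.0),
        ("Cy", "CLOTHING", 60.0),
    ],
)

# where filters rows, group by forms groups, having filters groups,
# and order by sorts the final result.
rows = conn.execute(
    """
    SELECT customer, SUM(total) AS spent
    FROM orders
    WHERE total > 0
    GROUP BY customer
    HAVING SUM(total) >= 50
    ORDER BY spent DESC
    """
).fetchall()
print(rows)
```

Only customers whose grouped total reaches 50 survive the having clause, and the survivors come back sorted from highest to lowest total.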

Data Quality Rules Definitions Proscriptive Assertions Prescriptive Assertions Conditional Assertions Operational Assertions

Definitions Nulls Domains Mappings

Proscriptive Assertions Describe what is not allowed Used to figure out what is wrong with data Used for validation

Prescriptive Assertions Describe what is supposed to happen with data Can be used for data population, extraction, transformation Can also be used for validation

Conditional Assertions Define an assertion that must be true if a condition is true

Operational Assertions Define an action that must be taken if a condition is true

9 Classes of Rules
1. Null value rules
2. Value rules
3. Domain membership rules
4. Domain mappings
5. Relation rules
6. Table, cross-table, and cross-message assertions
7. In-process directives
8. Operational directives
9. Other rules

Null Value Rules Null value specification – Define GETDATE for unavailable as “fill in date” Null values allowed – Attribute A allowed nulls {GETDATE, U, X} Null values not allowed – Attribute B nulls not allowed
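One possible way to check null value rules in code is sketched below. The attribute names A and B and the tokens GETDATE, U, and X come from the slide. Representing records as Python dicts, and treating a bare None as an undeclared null, are assumptions made for this example.

```python
# Null representations each attribute may hold (from the slide).
ALLOWED_NULLS = {"A": {"GETDATE", "U", "X"}, "B": set()}
# Every declared null representation.
ALL_NULL_TOKENS = {"GETDATE", "U", "X"}


def null_violations(record):
    """Return the attributes whose null value breaks a null value rule."""
    problems = []
    for attr, allowed in ALLOWED_NULLS.items():
        value = record.get(attr)
        # Assumption: None (no value at all) also counts as a null.
        looks_null = value is None or value in ALL_NULL_TOKENS
        if looks_null and value not in allowed:
            problems.append(attr)
    return problems


print(null_violations({"A": "U", "B": "U"}))
```

Here attribute A may carry any of the declared null tokens, while any null in attribute B is reported.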

Value Rules Value restriction rule Restrict GRADE: value >= ‘A’ AND value <= ‘F’ AND value != ‘E’
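The GRADE restriction can be expressed as a small validation function. This is a minimal sketch that assumes a grade is stored as a single uppercase letter, so that string comparison follows alphabetical order; the function name is made up for the example.

```python
def valid_grade(value):
    """Restrict GRADE: value >= 'A' AND value <= 'F' AND value != 'E'."""
    # Assumption: a grade is exactly one uppercase letter.
    return (
        isinstance(value, str)
        and len(value) == 1
        and "A" <= value <= "F"
        and value != "E"
    )


grades = ["A", "E", "F", "G"]
print([g for g in grades if not valid_grade(g)])
```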

Domain Rules Domain Definition Domain Membership Domain Nonmembership Domain Assignment

Mapping Rules Mapping definition Mapping membership Mapping nonmembership Mapping Assignment

Relation Rules Completeness Exemption Consistency Derivation

Completeness Defines when a record is complete (i.e., which fields must be present) IF (Orders.Total > 0.0), Complete With {Orders.Billing_Street, Orders.Billing_City, Orders.Billing_State, Orders.Billing_ZIP}
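The completeness rule for Orders might be checked as follows. This is a sketch only: orders are modeled as Python dicts, and a field counts as present when it is neither None nor an empty string, both of which are assumptions.

```python
REQUIRED_WHEN_BILLED = [
    "Billing_Street",
    "Billing_City",
    "Billing_State",
    "Billing_ZIP",
]


def is_complete(order):
    """If Total > 0.0, every billing field must be present."""
    if order.get("Total", 0.0) > 0.0:
        return all(
            order.get(field) not in (None, "")
            for field in REQUIRED_WHEN_BILLED
        )
    return True  # the condition is false, so the rule does not apply


order = {"Total": 12.5, "Billing_Street": "1 Main St", "Billing_City": "Ames"}
print(is_complete(order))
```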

Exemption Defines which fields may be missing IF (Orders.Item_Class != “CLOTHING”) Exempt {Orders.Color, Orders.Size }
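A sketch of the exemption rule is shown below. It assumes that Color and Size are required whenever no exemption applies, which the slide implies but does not state, and it again models orders as Python dicts.

```python
EXEMPTIBLE = ["Color", "Size"]


def unexcused_missing(order):
    """Return the Color/Size fields that are missing without an exemption."""
    if order.get("Item_Class") != "CLOTHING":
        return []  # exempt: Color and Size may be missing
    # Assumption: for clothing, both fields are required.
    return [f for f in EXEMPTIBLE if order.get(f) in (None, "")]


print(unexcused_missing({"Item_Class": "CLOTHING", "Color": "red"}))
```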

Consistency Defines a relationship between attributes based on field content: IF (Employees.title == “Staff Member”) Then (Employees.Salary >= AND Employees.Salary < 30000)
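The consistency rule could be tested along these lines. The lower salary bound is missing from the transcript, so the sketch takes it as a parameter instead of guessing it; the value 20000 in the usage line is purely hypothetical.

```python
MAX_SALARY = 30000  # upper bound from the slide (exclusive)


def salary_consistent(employee, min_salary):
    """If title is "Staff Member", Salary must lie in [min_salary, 30000)."""
    if employee["title"] == "Staff Member":
        return min_salary <= employee["Salary"] < MAX_SALARY
    return True  # the rule only constrains staff members


# 20000 is a made-up lower bound used only to exercise the function.
print(salary_consistent({"title": "Staff Member", "Salary": 31000}, 20000))
```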

Derivation Prescriptive form of consistency rule Details how one attribute’s value is determined based on other attributes IF (Orders.NumberOrdered > 0) Then { Orders.Total = (Orders.NumberOrdered * Orders.Price) * 1.05 }
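Because a derivation rule is prescriptive, it can be written as code that fills in the derived attribute. The sketch below follows the slide's formula directly; the dict representation of an order and the function name are assumptions.

```python
def derive_total(order):
    """IF NumberOrdered > 0 THEN Total = (NumberOrdered * Price) * 1.05."""
    if order["NumberOrdered"] > 0:
        order["Total"] = (order["NumberOrdered"] * order["Price"]) * 1.05
    return order


order = derive_total({"NumberOrdered": 2, "Price": 10.0})
print(round(order["Total"], 2))
```

The same function can support validation: compare a stored Total with the derived value and flag records where they disagree.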

Table and Cross-Table Rules Functional Dependence Primary Key Assertion Foreign Key Assertion (=referential integrity)

Functional Dependence Functional Dependence between columns X and Y: – For any two records R1 and R2 in a table, if field X of record R1 contains value x and field X of record R2 contains the same value x, then if field Y of record R1 contains the value y, then field Y of record R2 must contain the value y. In other words, attribute Y is said to be determined by attribute X.
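The definition above translates into a single pass over the table that remembers the first Y value seen for each X value. This is a minimal sketch; the zip and city columns in the usage example are hypothetical.

```python
def fd_violations(records, x, y):
    """Return the X values that appear with more than one Y value."""
    first_y = {}
    bad = set()
    for record in records:
        x_val, y_val = record[x], record[y]
        if x_val in first_y and first_y[x_val] != y_val:
            bad.add(x_val)  # same x, different y: X does not determine Y
        first_y.setdefault(x_val, y_val)
    return bad


rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "52240", "city": "Iowa City"},
]
print(fd_violations(rows, "zip", "city"))
```

An empty result means the dependence holds for the data at hand; it does not prove that the dependence holds in general.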

Primary Key Assertion A set of attributes defined as a primary key must uniquely identify a record Enforcement = testing for duplicates across defined key set
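Testing for duplicates across the defined key set might look like the sketch below, which counts each combination of key values. The column names in the usage example are invented for illustration.

```python
from collections import Counter


def duplicate_keys(records, key_fields):
    """Return the key value combinations that occur more than once."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, n in counts.items() if n > 1]


rows = [
    {"order_id": 1, "line": 1},
    {"order_id": 1, "line": 2},
    {"order_id": 1, "line": 2},
]
print(duplicate_keys(rows, ["order_id", "line"]))
```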

Foreign Key Assertion When the values in field f in table T are chosen from the key values in field g in table S, field T.f is said to be a foreign key referencing field S.g. If f is a foreign key, each of its values must exist in table S, column g (=referential integrity)
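Referential integrity can be checked by collecting the key values of field g in table S and looking for values of field f in table T that are not among them. The sketch below assumes tables are lists of dicts; the table contents are made up.

```python
def orphaned_values(t_rows, f, s_rows, g):
    """Return the values of T.f that have no matching key value in S.g."""
    keys = {row[g] for row in s_rows}
    return sorted({row[f] for row in t_rows if row[f] not in keys})


customers = [{"id": 1}, {"id": 2}]
orders = [{"customer_id": 1}, {"customer_id": 3}, {"customer_id": 3}]
print(orphaned_values(orders, "customer_id", customers, "id"))
```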

In-process Directives Definition directives (labeling information chain members) Measurement directives Trigger directives

Operational Directives Transformation Update

Other Rules Approximate Searching rules Approximate Matching rules