Chapter 1 Introduction to Data Quality

Data Quality Characteristics

Data quality affects several attributes associated with data:
- Accuracy: Is it realistic or believable?
- Integrity: Is it structured and managed?
- Consistency: Is it consistently defined and maintained?
- Validity: Is the data valid, based on business or industry rules and standards?

What Causes Poor Data Quality?

These factors can contribute to poor data quality:
- Business rules do not exist, or there are no standards for data capture.
- Standards exist but are not enforced at the point of data capture.
- Data is entered inconsistently (incorrect spelling, use of nicknames, middle names, or aliases).
- Data entry mistakes occur (character transposition, misspellings, and so on).
- Data is integrated from systems with different data standards.
- Data quality issues are perceived as time-consuming and expensive to fix.
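
Enforcing standards at the point of data capture can be sketched as a small set of field rules that are checked before a record is accepted. The field names, patterns, and sample values below are illustrative assumptions, not rules taken from the presentation.

```python
import re

# Hypothetical capture-time rules; the fields and patterns are assumptions
# made for illustration only.
RULES = {
    "name": lambda v: bool(v.strip()),
    "state": lambda v: re.fullmatch(r"[A-Z]{2}", v) is not None,
    "zip": lambda v: re.fullmatch(r"\d{5}(-\d{4})?", v) is not None,
}

def validate(record):
    """Return the fields that are missing or that break a rule."""
    return [field for field, ok in RULES.items()
            if field not in record or not ok(record[field])]

good = {"name": "Mark Craver", "state": "NC", "zip": "12345"}
bad = {"name": "Mark Craver", "state": "N.C.", "zip": "1234"}
print(validate(good))  # []
print(validate(bad))   # ['state', 'zip']
```

A record with a non-empty result would be rejected or sent back for correction before it reaches downstream systems.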

Primary Sources of Data Quality Problems

[Figure not reproduced.] Source: The Data Warehousing Institute, Data Quality and the Bottom Line, 2002.

How Is Clean Data Achieved?

Clean data is the result of a combination of efforts:
- making sure that data entered into the system is clean
- cleaning up problems after the data is accepted

Typical Data Quality Issues

The most common processes in a data quality initiative are:

Data Analysis and Standardization
- consistency analysis
- standardization schemes
- gender analysis
- entity analysis
- data parsing and casing
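
As a rough illustration of data parsing and casing, the sketch below splits a free-text person name into parts and applies consistent capitalization. The function name `standardize_name` and the two accepted input layouts are assumptions for this example. Commercial data quality tools use much richer parsing rules.

```python
def standardize_name(raw):
    """Parse 'Last, First [Middle]' or 'First [Middle] Last' and fix casing."""
    text = " ".join(raw.split())              # collapse repeated whitespace
    if not text:
        raise ValueError("empty name")
    if "," in text:
        last, rest = [part.strip() for part in text.split(",", 1)]
        given = rest.split()
    else:
        tokens = text.split()
        given, last = tokens[:-1], tokens[-1]
    first = given[0] if given else ""
    middle = " ".join(given[1:])
    # str.capitalize() is a simplification: it would turn 'McDonald'
    # into 'Mcdonald', so real casing rules need exception lists.
    return {
        "first": first.capitalize(),
        "middle": middle.capitalize(),
        "last": last.capitalize(),
    }

print(standardize_name("CRAVER, mark w."))
# {'first': 'Mark', 'middle': 'W.', 'last': 'Craver'}
```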

Typical Data Quality Issues (continued)

Matching and Merging
- de-duplication
- householding

Address Verification
- against a CASS certified database

Geocoding
- data enrichment using third-party data elements
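
De-duplication and householding can be sketched together: records are grouped by a normalized address (the household), and repeated names within a household are dropped. The sample records and the crude `address_key` normalization are made up for illustration. Real address verification would compare against a CASS certified database rather than rely on string cleanup.

```python
from collections import defaultdict

def address_key(address):
    """Crude normalization: lowercase, strip punctuation, collapse spaces."""
    cleaned = "".join(ch for ch in address.lower()
                      if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def household(records):
    """Group (name, address) records by address; drop repeated names."""
    groups = defaultdict(list)
    for name, address in records:
        members = groups[address_key(address)]
        if name not in members:               # de-duplicate within a household
            members.append(name)
    return dict(groups)

# Hypothetical sample data, not from the presentation.
records = [
    ("Ann Lee", "12 Oak St."),
    ("Bob Lee", "12 oak st"),
    ("Ann Lee", "12  Oak St"),
    ("Cy Park", "9 Elm Ave."),
]
print(household(records))
# {'12 oak st': ['Ann Lee', 'Bob Lee'], '9 elm ave': ['Cy Park']}
```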

Analysis and Standardization Example

Who is the biggest supplier?

Supplier                 Amount
Anderson Construction    $ 2,
Briggs,Inc               $ 8,
Brigs Inc.               $12,
Casper Corp.             $27,
Caspar Corp              $ 6,
Solomon Industries       $43,
The Casper Corp          $11,500.00

Standardization Scheme

Briggs, Inc        ->  Briggs Inc.
Brigs Inc.         ->  Briggs Inc.
Casper Corp.       ->  Casper Corp.
Caspar Corp        ->  Casper Corp.
The Casper Corp    ->  Casper Corp.
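
A standardization scheme like the one above can be applied as a lookup before totals are computed, so that spending by each spelling variant rolls up to one supplier. This is a minimal sketch: the helper names are assumptions, and the amounts are made-up placeholders, not the figures from the slide.

```python
from collections import defaultdict

def key(name):
    """Compare names ignoring case, punctuation, and spacing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

# Variant -> standard name, following the scheme on the slide.
SCHEME = {key(variant): standard for variant, standard in [
    ("Briggs, Inc", "Briggs Inc."),
    ("Brigs Inc.", "Briggs Inc."),
    ("Casper Corp.", "Casper Corp."),
    ("Caspar Corp", "Casper Corp."),
    ("The Casper Corp", "Casper Corp."),
]}

def standardize(name):
    """Return the standard name, or the input unchanged if no rule applies."""
    return SCHEME.get(key(name), name)

def total_by_supplier(rows):
    """Sum amounts per standardized supplier name."""
    totals = defaultdict(float)
    for name, amount in rows:
        totals[standardize(name)] += amount
    return dict(totals)

# Placeholder amounts for illustration only.
rows = [("Briggs,Inc", 100.0), ("Brigs Inc.", 50.0), ("Caspar Corp", 25.0),
        ("The Casper Corp", 75.0), ("Solomon Industries", 120.0)]
print(total_by_supplier(rows))
# {'Briggs Inc.': 150.0, 'Casper Corp.': 100.0, 'Solomon Industries': 120.0}
```

Without the lookup, each variant would appear as a separate supplier and the question of who is the biggest supplier could get the wrong answer.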

[Figure: Supplier Spending. Bar chart of $ spent (axis 0 to 50,000) for Casper Corp., Solomon Ind., Briggs Inc., and Anderson Cons.]

Data Matching Example

Operational System of Records:
- Mark Carver, SAS, SAS Campus Drive, Cary, N.C.
- Mark W. Craver
- Mark Craver, Systems Engineer, SAS

Data Warehouse:
- 01  Mark Carver, SAS, SAS Campus Drive, Cary, N.C.
- 02  Mark W. Craver
- 03  Mark Craver, Systems Engineer, SAS

Data Quality Process

Operational System of Records:
- Mark Carver, SAS, SAS Campus Drive, Cary, N.C.
- Mark W. Craver
- Mark Craver, Systems Engineer, SAS

DQ

Data Warehouse:
- Mark Craver, Systems Engineer, SAS, SAS Campus Drive, Cary, N.C.
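
The matching and merging step shown here can be approximated with a simple match key. The sketch below is an assumption about one possible approach, not the method used by any particular product. It keys each record on the first initial plus the sorted letters of the last name, so transposed letters (Carver/Craver) fall into the same cluster. It then keeps the most common last-name spelling and fills the remaining fields from whichever record has them.

```python
from collections import Counter, defaultdict

def match_key(name):
    """Hypothetical match key: first initial + sorted letters of last name."""
    tokens = name.lower().replace(".", "").split()
    return tokens[0][0] + "|" + "".join(sorted(tokens[-1]))

def merge(cluster):
    """Build one surviving record from a cluster of matched records."""
    last_names = Counter(r["name"].split()[-1] for r in cluster)
    first = cluster[0]["name"].split()[0]
    merged = {"name": f"{first} {last_names.most_common(1)[0][0]}"}
    for record in cluster:
        for field, value in record.items():
            if field != "name":
                merged.setdefault(field, value)   # first value found wins
    return merged

def match_and_merge(records):
    """Cluster records by match key and merge each cluster."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record["name"])].append(record)
    return [merge(cluster) for cluster in clusters.values()]

records = [
    {"name": "Mark Carver", "company": "SAS",
     "address": "SAS Campus Drive, Cary, N.C."},
    {"name": "Mark W. Craver"},
    {"name": "Mark Craver", "title": "Systems Engineer", "company": "SAS"},
]
print(match_and_merge(records))
# [{'name': 'Mark Craver', 'company': 'SAS',
#   'address': 'SAS Campus Drive, Cary, N.C.', 'title': 'Systems Engineer'}]
```

A sorted-letter key also matches genuine anagrams, so a production matcher would combine several fields and more robust similarity measures before merging.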