Understanding Data Quality Issues: Finding Data Inaccuracies Art DeMaio Evoke Software VP Technical Sales Support.


1 Understanding Data Quality Issues: Finding Data Inaccuracies Art DeMaio Evoke Software VP Technical Sales Support

2 Agenda
Why Is Understanding Data Important?
Methodology for Assessing Data
–Defining
–Weighting
–Profiling
–Revisiting
–Finding
–Addressing
–Maintaining
What Is Profiling?
Benefits of the Assessment

3 What the Experts say… “Information quality is not an esoteric notion; it directly affects the effectiveness and efficiency of business processes. Information quality also plays a major role in customer satisfaction.” - Larry P. English

4 What the Experts say… “Poor data quality is costly. It lowers customer satisfaction, adds expense, and makes it more difficult to run a business and pursue tactical improvements such as data warehouses and re-engineering.” - Thomas C. Redman

5 What’s in Your DATA… “…three-quarters (of participating companies) reported significant problems as a result of defective data, with a third failing to bill or collect receivables as a result.” - In a PricewaterhouseCoopers survey of 600 CIOs, IT directors or similar executives

6 What is Data Quality? Accuracy of Content Structure Completeness Timeliness Presentation

7 Assessing Your Data
[Cycle diagram around Source Data]
1-Define Issues
2-Weight/Impact
3-Profile Data
4-Revisit Definitions, Weights
5-Findings
6-Address
7-Maintain

8 Defining Issues
Standard list
Key requirements
Content
Structure
Completeness
Update list by project or source
[Diagram: Source Data, 1-Define Issues]

9 Defining Issues-sample
[Diagram: Source Data, 1-Define Issues]

10 Weight Impact
After the issues are initially identified:
Some issues are more critical than others
Weights are not priorities
Assign a weighting factor (1-5)
Weighting factors SHOULD change by project
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact]

11 Profile Data
What does Data Profiling mean?
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data]

12 What is Data Profiling? The use of analytical techniques on data for the purpose of developing a thorough knowledge of its content, structure and quality. A process of developing information about data instead of information from data.

13 What is Data Profiling?
Information About Data: (Data Profiling)
–30% of entries in SUPPLIER_ID are blank
–the range of values in UNIT_PRICE is 5.99 to 4599.99
–there are 14 ORDER_HEADER rows with no ORDER_DETAIL rows
Information FROM Data: (not Data Profiling)
–Texas auto buyers buy more Cadillacs per capita than buyers in any other state
–The average mortgage amount increased last year by 6%
–10% of last year's customers did not buy anything this year
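The three "information about data" statements on this slide can each be produced by a small check. The following is a minimal sketch in plain Python, not the method of any particular profiling tool. The function names and the sample values are hypothetical and chosen only to mirror the slide's column names.

```python
def blank_rate(values):
    """Fraction of entries that are None or empty/whitespace-only."""
    if not values:
        return 0.0
    blanks = sum(1 for v in values if v is None or str(v).strip() == "")
    return blanks / len(values)


def value_range(values):
    """(min, max) of the non-missing values, or None if there are none."""
    present = [v for v in values if v is not None]
    return (min(present), max(present)) if present else None


def parents_without_children(parent_keys, child_foreign_keys):
    """Parent keys that no child row refers to (e.g. headers with no details)."""
    referenced = set(child_foreign_keys)
    return [key for key in parent_keys if key not in referenced]


# Hypothetical sample data, for illustration only.
supplier_ids = ["S1", "", "S3", None, "S5", "S6", "S7", " ", "S9", "S10"]
unit_prices = [5.99, 120.00, 4599.99, 42.50]
order_header_ids = [1, 2, 3, 4]
order_detail_order_ids = [1, 1, 3]

low, high = value_range(unit_prices)
orphans = parents_without_children(order_header_ids, order_detail_order_ids)

print(f"{blank_rate(supplier_ids):.0%} of entries in SUPPLIER_ID are blank")
print(f"the range of values in UNIT_PRICE is {low} to {high}")
print(f"there are {len(orphans)} ORDER_HEADER rows with no ORDER_DETAIL rows")
```

Each function reports a fact about the data's content or structure. None of them answers a business question, which is the distinction the slide draws.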

14 Profile Data
This is a multi-step process
Collect documentation
Review the DATA itself
Compare data to documentation
Identify and detail specific issues
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data]
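The "compare data to documentation" and "identify and detail specific issues" steps can be sketched as a rule check. This is an illustrative sketch only: the `documented` rule format, the function name, and the sample rows are assumptions, not something the presentation specifies.

```python
# Hypothetical documented expectations per column (an assumed format).
documented = {
    "SUPPLIER_ID": {"required": True},
    "UNIT_PRICE": {"required": True, "min": 0.0},
}


def compare_to_documentation(rows, documented):
    """Return (row_index, column, description) for each mismatch found."""
    issues = []
    for index, row in enumerate(rows):
        for column, rule in documented.items():
            value = row.get(column)
            if value is None or str(value).strip() == "":
                if rule.get("required"):
                    issues.append((index, column, "missing required value"))
                continue
            minimum = rule.get("min")
            if minimum is not None and value < minimum:
                issues.append(
                    (index, column, f"value {value} below documented minimum {minimum}")
                )
    return issues


# Hypothetical rows, for illustration only.
rows = [
    {"SUPPLIER_ID": "S1", "UNIT_PRICE": 5.99},
    {"SUPPLIER_ID": "", "UNIT_PRICE": 19.50},
    {"SUPPLIER_ID": "S3", "UNIT_PRICE": -1.00},
]

issues = compare_to_documentation(rows, documented)
for row_index, column, description in issues:
    print(f"row {row_index}, {column}: {description}")
```

Recording the row and column with each finding keeps the issues specific enough to be reviewed by a subject matter expert later.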

15 Revisit
Review the issues and weights
Should there be more or fewer issues?
What are they?
Is the relative importance of each issue different?
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data, 4-Revisit Definitions, Weights]

16 Findings
Your findings tell others about the data
Documented reports and/or charts
Results database
Quality Assessment Score
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data, 4-Revisit Definitions, Weights, 5-Findings]

17 Findings-Chart

20 Weighted Issue Rate - 23.8%
Weighted Assessment Score - 76.2%
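The two figures on this slide sum to 100%, so the assessment score appears to be the complement of the weighted issue rate. The transcript does not show how the rate itself is computed. The sketch below assumes a weight-averaged issue rate using the 1-5 weighting factors from the Weight Impact step. The formula, function names, and sample issues are all assumptions, and the sample does not attempt to reproduce the 23.8% figure.

```python
def weighted_issue_rate(issues):
    """Weight-averaged issue rate (ASSUMED formula, not from the slides).

    issues: iterable of (name, weight, rate), weight in 1..5, rate in 0..1.
    """
    total_weight = 0
    weighted_sum = 0.0
    for name, weight, rate in issues:
        if not 1 <= weight <= 5:
            raise ValueError(f"weight for {name!r} must be between 1 and 5")
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"rate for {name!r} must be between 0 and 1")
        total_weight += weight
        weighted_sum += weight * rate
    if total_weight == 0:
        raise ValueError("no issues to score")
    return weighted_sum / total_weight


def weighted_assessment_score(issues):
    """Complement of the issue rate, matching 23.8% + 76.2% = 100%."""
    return 1.0 - weighted_issue_rate(issues)


# Hypothetical issues: (name, weight 1-5, fraction of rows affected).
sample_issues = [
    ("blank SUPPLIER_ID", 5, 0.30),
    ("UNIT_PRICE out of range", 3, 0.10),
    ("ORDER_HEADER without ORDER_DETAIL", 2, 0.05),
]

print(f"Weighted Issue Rate - {weighted_issue_rate(sample_issues):.1%}")
print(f"Weighted Assessment Score - {weighted_assessment_score(sample_issues):.1%}")
```

Because the weights only scale each issue's contribution, changing them per project (as the Weight Impact slide recommends) changes the score without re-profiling the data.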

21 Address the Issues
Addressing your findings
Actual vs. Potential
Subject Matter Expertise
Cleansing Requirements
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data, 4-Revisit Definitions, Weights, 5-Findings, 6-Address]

22 Maintain Vigilance
Maintain
Complete the cycle
Periodic review
Document score changes
[Diagram: Source Data, 1-Define Issues, 2-Weight/Impact, 3-Profile Data, 4-Revisit Definitions, Weights, 5-Findings, 6-Address, 7-Maintain]
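"Document score changes" can be as simple as keeping a dated history of assessment scores and reporting the difference between consecutive reviews. The sketch below assumes that approach. The function name, the review dates, and the scores after the first are hypothetical.

```python
from datetime import date


def score_changes(history):
    """Return (review_date, score, change_since_previous) in date order.

    history: list of (review_date, score_percent); the first review has
    no previous score, so its change is None.
    """
    changes = []
    previous = None
    for review_date, score in sorted(history):
        delta = None if previous is None else round(score - previous, 1)
        changes.append((review_date, score, delta))
        previous = score
    return changes


# Hypothetical review history (dates and later scores are made up).
history = [
    (date(2003, 4, 15), 81.0),
    (date(2003, 1, 15), 76.2),
    (date(2003, 7, 15), 79.5),
]

for review_date, score, delta in score_changes(history):
    change = "first review" if delta is None else f"{delta:+.1f} points"
    print(f"{review_date}: {score:.1f}% ({change})")
```

A drop between reviews, as in the last hypothetical entry, is the signal to run the cycle again from the Define Issues step.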

23 Why Do The Assessment?
Quantify the quality issues
Isolate true problems
Proactive review
–reduces the cost of resolving issues
–reduces the risk of customer dissatisfaction
Define the scope of issues
Determine the resources required to address issues

24 Why Do The Assessment?
[Chart labels: Project Timeline; When you find an Issue; Cost to Address an Issue; Project Costs]

25 Why Should It Be Done?
[Chart label: TIME]
Pay me now or pay me later

26 When Should It Be Done?
Every IT data project
–Warehousing
–CRM
–ERP
–EAI
–M&A
Ongoing based on
–Criticality of the system
–Current status (score)
–Need to re-purpose data

28 Bibliography
Larry P. English, Improving Data Warehouse and Business Information Quality, John Wiley & Sons Inc., 1999
Jack Olson, Data Quality: The Accuracy Dimension, Morgan Kaufmann, 2002
Thomas C. Redman, Data Quality for the Information Age, Artech House, 1996
PricewaterhouseCoopers, “Global Data Management Survey”, 2001

