Published by Swen Bader. Modified over 6 years ago.
1
Driving Data Quality Initiatives with Agile Analytics
Ken Raetz, Principal, Think Data Insights, LLC
2
Agenda
Our time today…
  Agile Development vs. Agile Analytics
  Data Quality Concepts
  Data Wrangling
  Stages of Analytics
  Agile Data Quality Steps during Analytics work
3
Think Data Insights
A little about us…
  Enterprise Business Intelligence Experts
  30+ years of business, technology, consulting experience
  Based in Nashville, TN
  Apply Agile Approach to Analytics and BI
  Microsoft SQL Server BI Solution Experts
  Power BI Solution Experts
  Excel Power BI Solution Experts
  Power Planner, Power Update Solutions
  Healthcare solutions with sister company, Visualize Health, LLC
4
Agile Methodology
  Early and continuous delivery
  Change is normal
  Frequent delivery
  Strong collaboration
  Co-location
  Simplicity
  Sustainable pace
  Adapt to changes
5
Analytics
More than predictive analytics
DW/BI
  Source Data
  Modeling
  ETL/ELT
  Reporting
  Visualizations
Data Management/Data Warehousing
Data integration
Reporting
Business Intelligence
KPI/Dashboards
Predictive Analytics
6
Agile Analytics
  Data complexity and volatility
  Business requirements change
  Must deliver successfully
  Keep it simple
  Work closely with business SMEs
  Identify/document issues as they come
  Don’t stop to address every issue
7
Continuous Delivery
  Objectives mobilize teams
  Teams evaluate data
  Rapid analytics approach
  Data produces insights
  Insights refine objectives
Cycle: Business Objective -> Organize Team -> Evaluate Data -> Rapid Data Analytics -> Deliver Insights
8
Derailed by lack of data quality
  Wrong objective
  Ineffective team selected
  Data correctness becomes focus
  Technical Specs
  More work for IT
  Business objectives not met
Cycle: Business Objective -> Organize Team -> Evaluate Data -> Rapid Data Analytics -> Deliver Insights
Derailed: Wrong Objective -> Wrong Team -> Fix Data -> No Analytics -> IT Backlog
9
Data Quality
  Control
  Precision
  Accuracy
  Usefulness
  Consistency
  Missing / Unknown
  Completeness
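The "Missing / Unknown" and "Completeness" dimensions can be made measurable with very little code. The following is a minimal sketch, not something taken from the deck: the function name `completeness`, the `MISSING_MARKERS` set, and the sample rows are all assumptions chosen for illustration.

```python
from typing import Any, Iterable, Mapping

# Values treated as "missing / unknown" (an assumption; adjust per source system).
MISSING_MARKERS = {None, "", "N/A", "Unknown"}


def completeness(rows: Iterable[Mapping[str, Any]], column: str) -> float:
    """Return the share of rows whose `column` holds a usable value (0.0 to 1.0)."""
    rows = list(rows)
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) not in MISSING_MARKERS)
    return present / len(rows)


# Hypothetical sample data: two of the four customers are usable.
sample_rows = [
    {"customer": "Acme"},
    {"customer": ""},
    {"customer": "Unknown"},
    {"customer": "Globex"},
]
customer_completeness = completeness(sample_rows, "customer")
```

Tracking a ratio like this per column and per load makes it possible to tell whether a completeness problem is growing or shrinking.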
10
Data Wrangling
The key to successful analytics projects
  Extraction – “Data Gathering”
  Analysis – “Data Profiling”
  Transformation – Define Data Quality Rules
  Loading/Visualizing – Destination Platform Consistency
11
Top of the Funnel
Data correction closest to the source
Business Process -> Source DB -> Extraction -> Staging/ODS -> Transformation -> DW/BI -> Report
12
Agile Data Quality Approach
As you go along the path…
Extract -> Analyze -> Transform -> Load/Viz
  1 - Identify: Find data issues
  2 - Evaluate: Potential solutions
  3 - Document: Business/tech DQ specs
  4 - Implement: Short-term fix
  5 - Collaborate: Define new scope
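One way to keep the five steps visible while work continues is a small issue-log record per data quality problem. This is only a sketch under assumptions: the `Stage` enum, the `DataQualityIssue` class, its field names, and the example issue are hypothetical and do not come from the presentation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    EXTRACT = "Extract"
    ANALYZE = "Analyze"
    TRANSFORM = "Transform"
    LOAD_VIZ = "Load/Viz"


@dataclass
class DataQualityIssue:
    stage: Stage
    description: str                                   # 1 - Identify
    options: list[str] = field(default_factory=list)   # 2 - Evaluate
    spec: str = ""                                     # 3 - Document
    short_term_fix: str = ""                           # 4 - Implement
    owners: list[str] = field(default_factory=list)    # 5 - Collaborate

    def open_steps(self) -> list[str]:
        """List the steps that still have nothing recorded against them."""
        checks = {
            "2 - Evaluate": bool(self.options),
            "3 - Document": bool(self.spec),
            "4 - Implement": bool(self.short_term_fix),
            "5 - Collaborate": bool(self.owners),
        }
        return [step for step, done in checks.items() if not done]


# Hypothetical issue found during extraction.
issue = DataQualityIssue(
    stage=Stage.EXTRACT,
    description="Customer name is blank on some orders",
    options=["Default to 'Unknown'", "Look up from CRM"],
    spec="Blank customer names default to 'Unknown' until the source is fixed",
)
remaining = issue.open_steps()
```

Logging the issue and moving on fits the earlier advice to document issues as they come without stopping to address every one.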
13
Agile Data Quality Approach
EXTRACT
Extract -> Analyze -> Transform -> Load/Viz
  1 - Identify
  2 - Evaluate
  3 - Document
  4 - Implement
  5 - Collaborate
14
Stage: Extraction
Data quality at the source
1 - Identify
  Correct source?
  Complete data?
  Accuracy and timing?
  Time-dependent?
2 - Evaluate
  Profile data
  Compare source DB
  Daily comparison
  Completeness (products, customers, etc.)
3 - Document
  Source SME
  DB/File details (tables, views)
  Frequency concerns
  Prioritize
4 - Implement
  Supplement source data
  Fake data
  SQL Rules (IF, CASE WHEN, LIKE)
  Snapshot
5 - Collaborate
  IT/Business
  Dev teams
  Vendor
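The "Daily comparison" check and the "SQL Rules (IF, CASE WHEN, LIKE)" short-term fix from this stage could look roughly like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database so that it runs anywhere. The table names (`source_orders`, `staging_orders`), the columns, the sample rows, and the specific cleanup rule are all assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders  (order_id INTEGER, order_date TEXT, customer TEXT);
CREATE TABLE staging_orders (order_id INTEGER, order_date TEXT, customer TEXT);
INSERT INTO source_orders VALUES
  (1, '2018-12-01', 'Acme'), (2, '2018-12-01', 'TEST - ignore'),
  (3, '2018-12-02', 'Globex'), (4, '2018-12-02', NULL);
INSERT INTO staging_orders VALUES
  (1, '2018-12-01', 'Acme'), (2, '2018-12-01', 'TEST - ignore'),
  (3, '2018-12-02', 'Globex');
""")

# Evaluate: daily row-count comparison between the source and the extract.
daily_gaps = conn.execute("""
SELECT s.order_date, s.n AS source_rows, COALESCE(t.n, 0) AS staging_rows
FROM (SELECT order_date, COUNT(*) AS n FROM source_orders GROUP BY order_date) s
LEFT JOIN (SELECT order_date, COUNT(*) AS n FROM staging_orders GROUP BY order_date) t
  ON t.order_date = s.order_date
WHERE s.n <> COALESCE(t.n, 0)
ORDER BY s.order_date
""").fetchall()

# Implement: a short-term CASE WHEN / LIKE rule applied while extracting.
cleaned = conn.execute("""
SELECT order_id,
       CASE
         WHEN customer IS NULL THEN 'Unknown'
         WHEN customer LIKE 'TEST%' THEN 'Test record'
         ELSE customer
       END AS customer_clean
FROM source_orders
ORDER BY order_id
""").fetchall()
```

In this sample, `daily_gaps` reports only 2018-12-02, where the source has two rows and staging has one. The rule in `cleaned` labels the NULL customer and the test record instead of silently dropping them.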
15
Agile Data Quality Approach
As you go along the path…
Extract -> Analyze -> Transform -> Load/Viz
1 - Identify
  Extract: Source data; Completeness
2 - Evaluate
  Extract: Profiling; DB Compare; Time Compare
3 - Document
  Extract: SME; DB/Files; Prioritize
4 - Implement
  Extract: Supplement data; Other sources
5 - Collaborate
  Extract: Dev/IT; Vendor
16
Agile Data Quality Approach
Analyze
Extract -> Analyze -> Transform -> Load/Viz
  1 - Identify
  2 - Evaluate
  3 - Document
  4 - Implement
  5 - Collaborate
17
Stage: Analyze
Data quality during detailed analysis
1 - Identify
  Integrate-able?
  Missing data?
  Meaningful and complete?
  Historical preservation?
2 - Evaluate
  Detail profiling
  Key analysis
  NULL/Missing data
  Distinct values
  App logic profiling
3 - Document
  Source data completeness
  Mapping rules
  Orphaned data
  Lookup rules
4 - Implement
  Hard-code logic
  Repair data
  Build lookup lists
  Find other data
5 - Collaborate
  IT/Business
  Dev teams
  Leadership
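The profiling checks named in this stage (NULL/missing data, distinct values, key analysis, orphaned data) can be sketched with a few queries. The example uses Python's `sqlite3` module against an in-memory database. The `customers` and `orders` tables and their contents are hypothetical, invented only to show the shape of each check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
INSERT INTO customers VALUES (1, 'East'), (2, NULL), (3, 'West');
INSERT INTO orders VALUES
  (10, 1, 'Open'), (11, 2, 'open'), (12, 9, 'Closed'), (13, NULL, 'Closed');
""")

# NULL/Missing data and distinct values for one column.
null_regions, distinct_regions = conn.execute(
    "SELECT SUM(region IS NULL), COUNT(DISTINCT region) FROM customers"
).fetchone()

# Distinct values often expose inconsistent coding ('Open' vs. 'open').
status_values = [row[0] for row in conn.execute(
    "SELECT DISTINCT status FROM orders ORDER BY status"
)]

# Key analysis: orders that carry no customer key at all.
missing_keys = [row[0] for row in conn.execute(
    "SELECT order_id FROM orders WHERE customer_id IS NULL ORDER BY order_id"
)]

# Orphaned data: orders whose customer key has no match in customers.
orphans = [row[0] for row in conn.execute("""
SELECT o.order_id
FROM orders o
LEFT JOIN customers c ON c.customer_id = o.customer_id
WHERE o.customer_id IS NOT NULL AND c.customer_id IS NULL
ORDER BY o.order_id
""")]
```

Findings such as the mixed-case status codes or the orphaned order feed directly into the mapping and lookup rules documented in step 3.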
18
Agile Data Quality Approach
As you go along the path…
Extract -> Analyze -> Transform -> Load/Viz
1 - Identify
  Extract: Source data; Completeness
  Analyze: Completeness; Accuracy; History
2 - Evaluate
  Extract: Profiling; DB Compare; Time Compare
  Analyze: Detail profile; NULL/Missing; App Logic
3 - Document
  Extract: SME; DB/Files; Prioritize
  Analyze: Mapping rules; Orphaned data; Lookup rules
4 - Implement
  Extract: Supplement data; Other sources
  Analyze: Hard-code logic; Repair data; Build lookup lists
5 - Collaborate
  Extract: Dev/IT; Vendor
  Analyze: Dev/IT; Leadership
19
Agile Data Quality Approach
Transform
Extract -> Analyze -> Transform -> Load/Viz
  1 - Identify
  2 - Evaluate
  3 - Document
  4 - Implement
  5 - Collaborate
20
Stage: Transform
Data quality while transforming data
1 - Identify
  Missing data rules
  Integration rules
  Date-related rules (Service/Posting)
  Calcs needed
2 - Evaluate
  Test rules against source reporting
  Test integration for completeness
  Verify calculations
3 - Document
  Calculation logic
  Quality rules (IF, CASE WHEN)
  Data cleansing needs
  Inconsistent data
4 - Implement
  Hard-code logic
  Rules-based calcs
  Native/Source functions
5 - Collaborate
  Project Mgmt
  Dev teams
  Analysts
  Power Users
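A date rule and the "test rules against source reporting" check from this stage might be sketched as follows. Everything specific here is an assumption rather than the presenter's rule: the fallback from posting date to service date, the discount calculation, the function names, and the sample figures are placeholders.

```python
from datetime import date
from decimal import Decimal
from typing import Iterable, Optional, Tuple


def reporting_date(service_date: date, posting_date: Optional[date]) -> date:
    """Hypothetical date rule: report on the posting date, else the service date."""
    return posting_date if posting_date is not None else service_date


def net_amount(gross: Decimal, discount_pct: Decimal) -> Decimal:
    """Hypothetical rules-based calc: gross less a percentage discount, in cents."""
    factor = Decimal("1") - discount_pct / Decimal("100")
    return (gross * factor).quantize(Decimal("0.01"))


def reconcile(
    calculated: Iterable[Decimal],
    source_report_total: Decimal,
    tolerance: Decimal = Decimal("0.00"),
) -> Tuple[Decimal, bool]:
    """Compare transformed amounts with the total shown on the source report."""
    difference = sum(calculated, Decimal("0")) - source_report_total
    return difference, abs(difference) <= tolerance


# Hypothetical transaction lines: (gross amount, discount percent).
lines = [(Decimal("100.00"), Decimal("10")), (Decimal("200.00"), Decimal("0"))]
calculated = [net_amount(gross, pct) for gross, pct in lines]
difference, ok = reconcile(calculated, Decimal("290.00"))
```

Using `Decimal` rather than floats keeps the reconciliation exact, so any non-zero difference points at a rule problem and not at rounding noise.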
21
Agile Data Quality Approach
As you go along the path…
Extract -> Analyze -> Transform -> Load/Viz
1 - Identify
  Extract: Source data; Completeness
  Analyze: Completeness; Accuracy; History
  Transform: Calculations; Integration; Date logic
2 - Evaluate
  Extract: Profiling; DB Compare; Time Compare
  Analyze: Detail profile; NULL/Missing; App Logic
  Transform: Rule/Calc testing; Completeness
3 - Document
  Extract: SME; DB/Files; Prioritize
  Analyze: Mapping rules; Orphaned data; Lookup rules
  Transform: Calcs; DQ rules; Cleansing needs
4 - Implement
  Extract: Supplement data; Other sources
  Analyze: Hard-code logic; Repair data; Build lookup lists
  Transform: Hard-code logic; Rules-based calcs; Native functions
5 - Collaborate
  Extract: Dev/IT; Vendor
  Analyze: Dev/IT; Leadership
  Transform: Proj Mgmt; Dev/IT; Power Users
22
Agile Data Quality Approach
Load
Extract -> Analyze -> Transform -> Load/Viz
  1 - Identify
  2 - Evaluate
  3 - Document
  4 - Implement
  5 - Collaborate
23
Stage: Load/Visualize
Data quality while loading/using data
1 - Identify
  Future data needed
  Desired reporting structure
  Default values
  Granularity
2 - Evaluate
  Data model
  Test results using defaults/calcs
  Compare data at different granularity
3 - Document
  Data model
  Default rules
  Model/Data granularity
4 - Implement
  Views over tables
  Tables/queries
  Load/build routines
  Visualizations
  Calculations
5 - Collaborate
  Project Mgmt
  Analysts
  Power Users
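"Views over tables", "Default values", and "Compare data at different granularity" can be combined in one small sketch. It again uses Python's `sqlite3` module with an in-memory database. The `fact_sales` table, the two view names, the `'Unknown'` default, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (sale_date TEXT, region TEXT, amount REAL);
INSERT INTO fact_sales VALUES
  ('2018-11-30', 'East', 100.0),
  ('2018-12-01', NULL,    50.0),
  ('2018-12-06', 'West',  25.0);

-- View over the table: the default rule lives in one place, the data stays untouched.
CREATE VIEW v_sales AS
SELECT sale_date, COALESCE(region, 'Unknown') AS region, amount
FROM fact_sales;

-- Coarser granularity built on the same view.
CREATE VIEW v_sales_monthly AS
SELECT substr(sale_date, 1, 7) AS sale_month, region, SUM(amount) AS amount
FROM v_sales
GROUP BY substr(sale_date, 1, 7), region;
""")

monthly_rows = conn.execute(
    "SELECT sale_month, region, amount FROM v_sales_monthly ORDER BY sale_month, region"
).fetchall()

# Compare data at different granularity: the totals must agree.
detail_total = conn.execute("SELECT SUM(amount) FROM v_sales").fetchone()[0]
monthly_total = conn.execute("SELECT SUM(amount) FROM v_sales_monthly").fetchone()[0]
totals_match = detail_total == monthly_total
```

Keeping the default in a view means it can be changed or removed later without reloading the underlying table, which suits a short-term fix.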
24
Agile Data Quality Approach
As you go along the path…
Extract -> Analyze -> Transform -> Load/Viz
1 - Identify
  Extract: Source data; Completeness
  Analyze: Completeness; Accuracy; History
  Transform: Calculations; Integration; Date logic
  Load/Viz: Future data needs; Default values; Granularity
2 - Evaluate
  Extract: Profiling; DB Compare; Time Compare
  Analyze: Detail profile; NULL/Missing; App Logic
  Transform: Rule/Calc testing; Completeness
  Load/Viz: Data model; Granularity; Rules
3 - Document
  Extract: SME; DB/Files; Prioritize
  Analyze: Mapping rules; Orphaned data; Lookup rules
  Transform: Calcs; DQ rules; Cleansing needs
  Load/Viz: Data model; Rules
4 - Implement
  Extract: Supplement data; Other sources
  Analyze: Hard-code logic; Repair data; Build lookup lists
  Transform: Hard-code logic; Rules-based calcs; Native functions
  Load/Viz: Tables; Visualizations; Calcs
5 - Collaborate
  Extract: Dev/IT; Vendor
  Analyze: Dev/IT; Leadership
  Transform: Proj Mgmt; Dev/IT; Power Users
  Load/Viz: Proj Mgmt; Power Users
25
12/6/2018
Agile Data Quality Approach
A word on Collaboration
  Right people/teams
  Clear roadblocks/build new paths
  Not ONE-TIME
  Continuous
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
26
Data Analytics – A new role…
Analytics Prototype Solutions Specialist (APSS)
  Speaks business and IT
  Rapid solutions
  Designs early prototypes
  Defines business metrics
  Creates vision/Defines ROI
  Channels data quality issues
Aka… Chief Data Wrangler
27
Q&A
12/6/2018
Demo the service. Demos are available at //BI
28
Driving Data Quality Initiatives with Agile Analytics
Ken Raetz, Principal, Think Data Insights, LLC