Driving Data Quality Initiatives with Agile Analytics

Presentation transcript:

Driving Data Quality Initiatives with Agile Analytics
Ken Raetz, Principal, Think Data Insights, LLC

Agenda: Our time today…
Agile Development vs. Agile Analytics; Data Quality Concepts; Data Wrangling; Stages of Analytics; Agile Data Quality Steps during Analytics work

Think Data Insights: A little about us…
Enterprise Business Intelligence Experts; 30+ years of business, technology, consulting experience; Based in Nashville, TN; Apply Agile Approach to Analytics and BI; Microsoft SQL Server BI Solution Experts; Power BI Solution Experts; Excel Power BI Solution Experts; Power Planner, Power Update Solutions; Healthcare solutions with sister company, Visualize Health, LLC

Agile Methodology
Early and continuous delivery; Change is normal; Frequent delivery; Strong collaboration; Co-location; Simplicity; Sustainable pace; Adapt to changes

Analytics: More than predictive analytics
DW/BI; Source Data; Modeling; ETL/ELT; Reporting; Visualizations; Data Management/Data Warehousing; Data integration; Reporting; Business Intelligence; KPI/Dashboards; Predictive Analytics

Agile Analytics
Data complexity and volatility; Business requirements change; Must deliver successfully; Keep it simple; Work closely with business SMEs; Identify/document issues as they come; Don’t stop to address every issue

Continuous Delivery
Objectives mobilize teams; Teams evaluate data; Rapid analytics approach; Data produces insights; Insights refine objectives
Cycle: Business Objective → Organize Team → Evaluate Data → Rapid Data Analytics → Deliver Insights

Derailed by lack of data quality
Wrong objective; Ineffective team selected; Data correctness becomes focus; Technical Specs; More work for IT; Business objectives not met
Cycle: Business Objective → Organize Team → Evaluate Data → Rapid Data Analytics → Deliver Insights
Derailments: Wrong Objective; Wrong Team; Fix Data; No Analytics; IT Backlog

Data Quality Control
Precision; Accuracy; Usefulness; Consistency; Missing / Unknown; Completeness

Data Wrangling: The key to successful analytics projects
Extraction: “Data Gathering”
Analysis: “Data Profiling”
Transformation: Define Data Quality Rules
Loading/Visualizing: Destination Platform Consistency

Top of the Funnel: Data correction closest to the source
Business Process → Source DB → Extraction → Staging/ODS → Transformation → DW/BI → Report

Agile Data Quality Approach: As you go along the path…
Stages: Extract → Analyze → Transform → Load/Viz
1 - Identify: Find data issues
2 - Evaluate: Potential solutions
3 - Document: Business/tech DQ specs
4 - Implement: Short-term fix
5 - Collaborate: Define new scope

Agile Data Quality Approach: Extract
Stages: Extract → Analyze → Transform → Load/Viz
Steps: 1 - Identify; 2 - Evaluate; 3 - Document; 4 - Implement; 5 - Collaborate

Stage: Extraction. Data quality at the source
1 - Identify: Correct source? Complete data? Accuracy and timing? Time-dependent?
2 - Evaluate: Profile data; Compare source DB; Daily comparison; Completeness (products, customers, etc.)
3 - Document: Source SME; DB/File details (tables, views); Frequency concerns; Prioritize
4 - Implement: Supplement source data; Fake data; SQL Rules (IF, CASE WHEN, LIKE); Snapshot
5 - Collaborate: IT/Business; Dev teams; Vendor
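The "SQL Rules (IF, CASE WHEN, LIKE)" item under Implement can be illustrated with a small sketch. It assumes a hypothetical `customer_extract` table with invented values and uses Python's built-in `sqlite3` module only so the query can run anywhere. The talk does not prescribe these specific rules; they stand in for the kind of short-term fix applied in the extract query while the real correction is routed back to the source.

```python
import sqlite3

# Hypothetical extract table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_extract (id INTEGER, name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO customer_extract VALUES (?, ?, ?)",
    [
        (1, "Acme Corp", "TN"),
        (2, "TEST - do not use", "TN"),
        (3, "Globex", None),
        (4, "Initech", "tenn."),
    ],
)

# Short-term rules live in the extract query, not in the source system.
CLEANSED_QUERY = """
SELECT
    id,
    name,
    CASE
        WHEN state IS NULL THEN 'Unknown'          -- default for missing values
        WHEN LOWER(state) LIKE 'tenn%' THEN 'TN'   -- normalize a known variant
        ELSE UPPER(state)
    END AS state,
    CASE WHEN name LIKE 'TEST%' THEN 1 ELSE 0 END AS is_test_record
FROM customer_extract
ORDER BY id
"""

cleansed = conn.execute(CLEANSED_QUERY).fetchall()
for row in cleansed:
    print(row)
```

Flagging test records instead of deleting them keeps the issue visible, so it can still be documented and raised with the source team.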

Agile Data Quality Approach: As you go along the path…
Stages: Extract → Analyze → Transform → Load/Viz
1 - Identify. Extract: Source data, Completeness
2 - Evaluate. Extract: Profiling, DB Compare, Time Compare
3 - Document. Extract: SME, DB/Files, Prioritize
4 - Implement. Extract: Supplement data, Other sources
5 - Collaborate. Extract: Dev/IT, Vendor
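A rough sketch of what the "DB Compare" and "Time Compare" evaluation items could look like in practice: count rows per day in the source and in the extract, then report only the days that disagree. The field name `order_date` and the sample rows are assumptions made for this example.

```python
from collections import Counter


def daily_counts(rows, date_field="order_date"):
    """Count rows per day; each row is a dict holding an ISO date string."""
    return Counter(row[date_field] for row in rows)


def compare_daily_counts(source_rows, extract_rows, date_field="order_date"):
    """Return {day: (source_count, extract_count)} for days that disagree."""
    src = daily_counts(source_rows, date_field)
    ext = daily_counts(extract_rows, date_field)
    return {
        day: (src.get(day, 0), ext.get(day, 0))
        for day in sorted(set(src) | set(ext))
        if src.get(day, 0) != ext.get(day, 0)
    }


# Invented sample data: the extract is missing the row for 2019-01-02.
source = [
    {"order_date": "2019-01-01"},
    {"order_date": "2019-01-01"},
    {"order_date": "2019-01-02"},
]
extract = [
    {"order_date": "2019-01-01"},
    {"order_date": "2019-01-01"},
]

gaps = compare_daily_counts(source, extract)
print(gaps)
```

Running the comparison daily turns a one-off completeness check into an early warning for late-arriving or dropped data.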

Agile Data Quality Approach: Analyze
Stages: Extract → Analyze → Transform → Load/Viz
Steps: 1 - Identify; 2 - Evaluate; 3 - Document; 4 - Implement; 5 - Collaborate

Stage: Analyze. Data quality during detailed analysis
1 - Identify: Integrate-able? Missing data? Meaningful and complete? Historical preservation?
2 - Evaluate: Detail profiling; Key analysis; NULL/Missing data; Distinct values; App logic profiling
3 - Document: Source data completeness; Mapping rules; Orphaned data; Lookup rules
4 - Implement: Hard-code logic; Repair data; Build lookup lists; Find other data
5 - Collaborate: IT/Business; Dev teams; Leadership
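The profiling items under Evaluate (NULL/missing data, distinct values, key analysis) and the "Orphaned data" item under Document lend themselves to a short illustration. The helpers `profile_column` and `find_orphans` below are hypothetical names, and the customer and order rows are made up. This is a minimal sketch of the idea, not a profiling tool.

```python
def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }


def find_orphans(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row
        for row in child_rows
        if row.get(fk) is not None and row[fk] not in parent_keys
    ]


# Invented sample data.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},     # orphan: no customer 3
    {"order_id": 12, "customer_id": None},  # missing key, not an orphan
]

profile = profile_column(orders, "customer_id")
orphans = find_orphans(orders, "customer_id", customers, "id")
print(profile)
print(orphans)
```

Keeping missing keys and orphaned keys as separate findings matters because they usually call for different fixes: a default value for the former, a lookup or mapping rule for the latter.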

Agile Data Quality Approach: As you go along the path…
Stages: Extract → Analyze → Transform → Load/Viz
1 - Identify. Extract: Source data, Completeness. Analyze: Completeness, Accuracy, History
2 - Evaluate. Extract: Profiling, DB Compare, Time Compare. Analyze: Detail profile, NULL/Missing, App Logic
3 - Document. Extract: SME, DB/Files, Prioritize. Analyze: Mapping rules, Orphaned data, Lookup rules
4 - Implement. Extract: Supplement data, Other sources. Analyze: Hard-code logic, Repair data, Build lookup lists
5 - Collaborate. Extract: Dev/IT, Vendor. Analyze: Dev/IT, Leadership

Agile Data Quality Approach: Transform
Stages: Extract → Analyze → Transform → Load/Viz
Steps: 1 - Identify; 2 - Evaluate; 3 - Document; 4 - Implement; 5 - Collaborate

Stage: Transform. Data quality while transforming data
1 - Identify: Missing data rules; Integration rules; Date-related rules (Service/Posting); Calcs needed
2 - Evaluate: Test rules against source reporting; Test integration for completeness; Verify calculations
3 - Document: Calculation logic; Quality rules (IF, CASE WHEN); Data cleansing needs; Inconsistent data
4 - Implement: Hard-code logic; Rules-based calcs; Native/Source functions
5 - Collaborate: Project Mgmt; Dev teams; Analysts; Power Users
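To make "Date-related rules (Service/Posting)" and "Test rules against source reporting" concrete, here is a hedged sketch. The fallback rule (prefer the service date, otherwise use the posting date) is an assumption chosen for illustration, not a rule stated in the talk. The field names and the reported totals are likewise invented.

```python
from datetime import date
from decimal import Decimal

ZERO = Decimal("0")


def reporting_date(service_date, posting_date):
    """Assumed rule: prefer the service date, fall back to the posting date."""
    return service_date if service_date is not None else posting_date


def monthly_totals(rows):
    """Sum amounts per (year, month) using the date rule above."""
    totals = {}
    for row in rows:
        d = reporting_date(row["service_date"], row["posting_date"])
        key = (d.year, d.month)
        totals[key] = totals.get(key, ZERO) + row["amount"]
    return totals


def mismatches(calculated, reported, tolerance=Decimal("0.01")):
    """Return months where calculated and reported totals differ."""
    return {
        m: (calculated.get(m, ZERO), reported.get(m, ZERO))
        for m in sorted(set(calculated) | set(reported))
        if abs(calculated.get(m, ZERO) - reported.get(m, ZERO)) > tolerance
    }


# Invented sample data.
rows = [
    {"service_date": date(2019, 1, 15), "posting_date": date(2019, 2, 1),
     "amount": Decimal("100.00")},
    {"service_date": None, "posting_date": date(2019, 2, 3),
     "amount": Decimal("50.00")},
]
# Totals as they might appear on an existing source report (hypothetical).
reported = {(2019, 1): Decimal("100.00"), (2019, 2): Decimal("75.00")}

diffs = mismatches(monthly_totals(rows), reported)
print(diffs)
```

A mismatch like the one for February is the cue to document the rule in question and take it to the business SMEs rather than silently adjusting the calculation.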

Agile Data Quality Approach: As you go along the path…
Stages: Extract → Analyze → Transform → Load/Viz
1 - Identify. Extract: Source data, Completeness. Analyze: Completeness, Accuracy, History. Transform: Calculations, Integration, Date logic
2 - Evaluate. Extract: Profiling, DB Compare, Time Compare. Analyze: Detail profile, NULL/Missing, App Logic. Transform: Rule/Calc testing, Completeness
3 - Document. Extract: SME, DB/Files, Prioritize. Analyze: Mapping rules, Orphaned data, Lookup rules. Transform: Calcs, DQ rules, Cleansing needs
4 - Implement. Extract: Supplement data, Other sources. Analyze: Hard-code logic, Repair data, Build lookup lists. Transform: Hard-code logic, Rules-based calcs, Native functions
5 - Collaborate. Extract: Dev/IT, Vendor. Analyze: Dev/IT, Leadership. Transform: Proj Mgmt, Dev/IT, Power Users

Agile Data Quality Approach: Load
Stages: Extract → Analyze → Transform → Load/Viz
Steps: 1 - Identify; 2 - Evaluate; 3 - Document; 4 - Implement; 5 - Collaborate

Stage: Load/Visualize. Data quality while loading/using data
1 - Identify: Future data needed; Desired reporting structure; Default values; Granularity
2 - Evaluate: Data model; Test results using defaults/calcs; Compare data at different granularity
3 - Document: Data model; Default rules; Model/Data granularity
4 - Implement: Views over tables; Tables/queries; Load/build routines; Visualizations; Calculations
5 - Collaborate: Project Mgmt; Analysts; Power Users
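"Test results using defaults/calcs" and "Compare data at different granularity" can be sketched as follows. The default value `"Unknown"`, the `category` and `qty` fields, and the sample rows are all assumptions made for the example. The point is only that totals should survive a roll-up once defaults are applied.

```python
from collections import defaultdict

DEFAULT_CATEGORY = "Unknown"  # assumed default for missing categories


def apply_defaults(rows):
    """Replace missing or empty categories with the default value."""
    return [
        {**row, "category": row.get("category") or DEFAULT_CATEGORY}
        for row in rows
    ]


def roll_up(rows, key="category", measure="qty"):
    """Aggregate a measure from detail grain up to the given key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


# Invented detail-level rows; one has no category.
detail = [
    {"category": "A", "qty": 2},
    {"category": None, "qty": 3},
    {"category": "A", "qty": 1},
]

rolled = roll_up(apply_defaults(detail))
detail_total = sum(row["qty"] for row in detail)

print(rolled)
print(detail_total == sum(rolled.values()))
```

If the grand totals at the two grains disagree, rows are being dropped or duplicated somewhere between the detail data and the reporting structure. The size of the default bucket also shows how much data still needs a proper rule.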

Agile Data Quality Approach: As you go along the path…
Stages: Extract → Analyze → Transform → Load/Viz
1 - Identify. Extract: Source data, Completeness. Analyze: Completeness, Accuracy, History. Transform: Calculations, Integration, Date logic. Load/Viz: Future data needs, Default values, Granularity
2 - Evaluate. Extract: Profiling, DB Compare, Time Compare. Analyze: Detail profile, NULL/Missing, App Logic. Transform: Rule/Calc testing, Completeness. Load/Viz: Data model, Granularity, Rules
3 - Document. Extract: SME, DB/Files, Prioritize. Analyze: Mapping rules, Orphaned data, Lookup rules. Transform: Calcs, DQ rules, Cleansing needs. Load/Viz: Data model, Rules
4 - Implement. Extract: Supplement data, Other sources. Analyze: Hard-code logic, Repair data, Build lookup lists. Transform: Hard-code logic, Rules-based calcs, Native functions. Load/Viz: Tables, Visualizations, Calcs
5 - Collaborate. Extract: Dev/IT, Vendor. Analyze: Dev/IT, Leadership. Transform: Proj Mgmt, Dev/IT, Power Users. Load/Viz: Proj Mgmt, Power Users

Agile Data Quality Approach: A word on Collaboration
Right people/teams; Clear roadblocks/build new paths; Not ONE-TIME; Continuous

Data Analytics: A new role…
Analytics Prototype Solutions Specialist (APSS)
Speaks business and IT; Rapid solutions; Designs early prototypes; Defines business metrics; Creates vision/Defines ROI; Channels data quality issues
Aka… Chief Data Wrangler

Q&A
