Copyright 2009, Information Builders. Slide 1 iWay Enterprise Information Management (EIM) Data Quality and Master Data Management Kam Wong Solutions Architect.

Presentation transcript:

Copyright 2009, Information Builders. Slide 1 iWay Enterprise Information Management (EIM) Data Quality and Master Data Management Kam Wong Solutions Architect Information Builders D.C. User Forum December 8, 2009

Slide 2 Data Quality and Master Data Management Agenda
- Business Drivers Behind Data Management
- Usage: Where To Use Data Management
- Impact Of Data Quality
- What Is Data Management?
  - Data Profiling
  - Data Cleansing
  - Data Enrichment
  - Match & Merge (De-duplication)
  - Master Data Management
- Examples and Demonstration

Slide 3 Business Drivers

Slide 4 Business Drivers
- Customer Service
- Marketing Campaigns
- Process Improvement
- Regulatory Compliance
- Fraud Detection

Slide 5 Data Drivers
- Accuracy: Correct Information
- Completeness: Thorough Information
- Consistency: Uniform Information
- Validity: Valid Information

Slide 6 Business Intelligence Drivers
Data quality is the cornerstone of effective business intelligence, and of operations for that matter. Companies have spent a significant share of their IT budgets integrating disparate applications and building data warehouses in order to get better business intelligence. However, many companies overlook the fact that, at the end of the day, it is the underlying data that matters. All the pretty screens and reports in the world make no difference if the data in the system is full of errors, inconsistent, and redundant.
- To achieve successful business intelligence, companies need to tackle the data quality problem first.

Slide 7 Usage

Slide 8 EIM Usages
Analytic EIM (Batch or Real-time)
- Analytical EIM focuses on improving the data quality and accuracy of BI reports.
Operational EIM (Real-time)
- The goal is to synchronize operational systems data with the golden record so that you have quality and consistency across enterprise processes.

Slide 9 EIM Dimensions
[Diagram: DW/DM System]

Slide 10 EIM and WebFOCUS Solutions
[Architecture diagram. Labels, in the order they appear in the transcript:]
- Processes; Transactions; Documents; Supplier, Partners; Customer, Exchange
- Data Warehouse, Data Mart, ODS; Applications; Portals; Enterprise Search; BI and Real-Time Dashboards
- Universal Adapter Suite; Core Integration Services; Reporting; Application; Data Management; Core Reporting Services
- Mainframe Data, Applications and Transactions
- Applications, CRM, ERP, etc.
- Databases, Data Warehouse, Data Marts
- Documents, Files, Content Management
- Messages, Transactions, s
- SWIFT, HIPAA, EDI Formats

Slide 11 Impact

Slide 12 Impact of Data Quality: Address Data
- 36% Naturally Correct
- 64% Manual Attention

Slide 13 Impact of Data Quality: Address Data
- 36% Naturally Correct
- + 61% Automated Cleansing
- 3% Manual Attention

Slide 14 What Is Data Management? Data Quality and Master Data Management

Slide 15 Data Profiling
- Profiling
  - Basic Analysis
    - Minimums
    - Maximums
    - Averages
    - Counts
    - Etc.
  - Patterns
  - Extremes
  - Quantities
  - Frequency Analysis
  - Foreign Key Analysis
  - Masking
  - Drilldown
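The basic analysis, masking, and frequency analysis items can be illustrated with a minimal Python sketch. This is not the iWay profiler; the function names (`profile_numeric`, `mask`, `pattern_frequencies`) and the sample values are assumptions made up for illustration.

```python
from collections import Counter
from statistics import mean


def profile_numeric(values):
    """Basic analysis of a numeric column: counts, minimum, maximum, average."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "avg": mean(present),
    }


def mask(value):
    """Masking: letters become 'A', digits become '9', everything else is kept."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in value
    )


def pattern_frequencies(values):
    """Frequency analysis over the masks found in a text column."""
    return Counter(mask(v) for v in values)


# Hypothetical sample columns.
ages = [31, 45, None, 27]
postal_codes = ["V3R 2A9", "M4X 1V5", "L3T 7M8", "L3T7M8", "90210"]

print(profile_numeric(ages))
print(pattern_frequencies(postal_codes).most_common())
```

Rare masks, such as a postal code without its space, are the values a drilldown would typically inspect first.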

Slide 16 Data Cleansing
- Parsing
  - data parsed into components (pattern based)
- Standardization
  - transformation into standard format (Jim Smith -> James Smith)
  - standard and nonstandard abbreviations (Str. -> Street)
  - language-specific replacements
- Data quality improvement
  - validation against rules
  - validation against reference tables
- Large number of domain-oriented algorithms, for example:
  - Address
  - Party
  - Vehicle
  - Name
  - Identification number
  - Credit Card number
  - Bank account number
- Extension by custom validation steps
  - using complex functions and rules, including
    - Levenshtein distance
    - SoundEx
    - internal (Java-based) functions
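Two of the techniques named on this slide, abbreviation standardization and Levenshtein distance, can be sketched in a few lines of Python. The `ABBREVIATIONS` table and both function names are hypothetical; a real reference table would be much larger and language-specific.

```python
import re

# Hypothetical reference table of abbreviations (assumption, not from the slides).
ABBREVIATIONS = {"str": "Street", "ave": "Avenue"}


def standardize_address(text):
    """Expand known abbreviations, with or without a trailing dot."""
    def expand(match):
        word = match.group(1)
        return ABBREVIATIONS.get(word.lower(), match.group(0))
    return re.sub(r"\b([A-Za-z]+)\b\.?", expand, text)


def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


print(standardize_address("Leslie str. Toronto"))
print(levenshtein("Smith", "Smiht"))
```

A custom validation step could, for example, accept a value when its distance to an entry in a reference table is at most one or two edits; that threshold is a tuning choice, not something the slides specify.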

Slide 17 Data Enrichment
- External company register
  - standard company name
  - registration ID
  - official address
  - national bank account classification
- Geocodes
  - adding geo-codes for identified address
  - allows showing map locations
  - used for geomarketing or insurance risks
- External address register
  - adding missing zip-codes, street names, city, etc.
  - validating existence against register of addresses
- List of names, surnames, academic and social titles
  - validating existence
  - standardization (PHD -> Ph.D.)
  - adding missing components
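Enrichment against an external address register amounts to a lookup that validates a record and fills in missing components. The sketch below assumes a tiny in-memory register keyed by postal code; `ADDRESS_REGISTER`, `enrich_address`, and the record layout are hypothetical, and the three entries reuse the postal codes from the example slides later in the deck.

```python
# Hypothetical stand-in for an external address register, keyed by postal code.
ADDRESS_REGISTER = {
    "L3T 7M8": {"city": "Markham", "province": "ON"},
    "M4X 1V5": {"city": "Toronto", "province": "ON"},
    "V3R 2A9": {"city": "Surrey", "province": "BC"},
}


def enrich_address(record):
    """Return a copy of record with missing city/province filled from the register."""
    enriched = dict(record)
    reference = ADDRESS_REGISTER.get(record.get("postal_code"))
    if reference is None:
        # The address could not be validated against the register.
        enriched["validated"] = False
        return enriched
    for field, value in reference.items():
        if not enriched.get(field):
            enriched[field] = value
    enriched["validated"] = True
    return enriched


print(enrich_address({"street": "8500 Leslie Str.", "postal_code": "L3T 7M8"}))
```

This version only fills gaps. Whether a register value should also overwrite a conflicting source value is a business rule left out of the sketch.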

Slide 18 Match & Merge
- Unification
  - identification of the set of records connected to one
    - person
    - address
    - vehicle
    - contact
    - …etc.
- Deduplication
  - golden record creation (the best representation of the identified subject)
- Identification
  - new data entries: identifying the subject (person, address, etc.) to which the new record is connected (matched)
- Complex business rules
  - using sophisticated algorithms and functions, including
    - Levenshtein distance
    - Hamming distance
    - Edit distance
  - Data quality score values
  - Date stamps of last modification
  - Source system originating data
  - etc.
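As a rough illustration of unification, the sketch below groups records that appear to describe the same person, using Hamming distance on the last name. The matching rule in `same_subject` (same first name, same postal code, last names within two positions) and the sample records are assumptions chosen for the example, not rules taken from the product.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def same_subject(a, b, max_distance=2):
    """Hypothetical match rule: same first name and postal code, similar last name."""
    return (
        a["first"] == b["first"]
        and a["postal_code"] == b["postal_code"]
        and len(a["last"]) == len(b["last"])
        and hamming(a["last"], b["last"]) <= max_distance
    )


def unify(records):
    """Greedy unification: attach each record to the first group it matches."""
    groups = []
    for record in records:
        for group in groups:
            if same_subject(group[0], record):
                group.append(record)
                break
        else:
            groups.append([record])
    return groups


# Hypothetical records for illustration.
records = [
    {"first": "John", "last": "Smith", "postal_code": "L3T 7M8"},
    {"first": "John", "last": "Smiht", "postal_code": "L3T 7M8"},
    {"first": "Jane", "last": "Watson", "postal_code": "L3T 7M8"},
    {"first": "John", "last": "Smith", "postal_code": "M4X 1V5"},
]
for group in unify(records):
    print(group)
```

Comparing every record only with the first member of each group keeps the sketch short; production matching would normally use blocking keys and weighted scores across several fields.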

Slide 19 Master Data Management (MDM) Defined
- MDM for customer data systems are software products that:
  - Support the global identification, linking and synchronization of customer information across heterogeneous data sources
  - Create and manage a central, database-based system of record
  - Enable the delivery of a single view for all stakeholders
- MDM architectural styles vary in:
  - Instantiation of the customer master data, varying from the maintenance of a physical customer profile to a more-virtual, metadata-based indexing structure
  - The latency of customer master data maintenance, varying from real-time, synchronous reading and writing of the master data in a transactional context to batch, asynchronous harmonization of the master data across systems
- An MDM program potentially encompasses the management of customer, product, asset, person or party, supplier and financial masters.

Slide 20 MDM Architectures
Consolidated
- Master is Single Version of Truth
- Data Quality at Master
- Updates occur at Sources
- Updates propagated to Master
Registry
- Multiple Versions of Truth
- Data Quality is Ongoing
- Updates occur at Sources
- Keys and Metadata in Registry
- Updates propagated to other Sources (Optional)
Coexistence
- Master is Single Version of Truth
- Data Quality is Ongoing
- Updates occur at Sources or Master
- Updates propagated to other Sources
Centralized
- Master is Single Version of Truth
- Data Quality at Master
- Updates occur at Master
- Updates propagated to Sources

Slide 21 Examples And Demonstration

Slide 22 Data Quality Examples

Slide 23 Original data – before cleansing
Source data
| Name | G | SIN | Birth Date | Address |
| Dr. John Smith | F | | /16/ | Ave Surrey V3R 2A9 |
| Smith W. John | M | | | Surrey Ave |
| John William Smith | | SIN | | Linden Str Toronto M4X 1V5 |
| Dr. J.W. Smith | M | | /16/78 | |
| John Smith | | | | Leslie L3T 7M8 Toronto |
| Smith John | | | | Leslie street Marham |
| John Smiht | | | | |
| Jane Watson | | | | Leslie str. Toronto L3T 7M8 |
| Watson Jane | F | | | Leslei street Toronto L3T 7M8 |
| Jane Smith | F | SIN | | |
| J. Smith | | | | |

Slide 24 Titles Parsing
| Name | G | SIN | Birth Date | Titles | Clearing Codes |
| Dr. John Smith | F | | /16/1978 | Dr. | Academic_Title |
| Smith W. John | M | | | | |
| John William Smith | | SIN | | | |
| Dr. J.W. Smith | M | | /16/78 | Dr. | Academic_Title |
| John Smith | | | | | |
| Smith John | | | | | |
| John Smiht | | | | | |
| Jane Watson | | | | | |
| Watson Jane | F | | | | |
| Jane Smith | F | SIN | | | |
| J. Smith | | | | | |
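A title-parsing step like the one shown on this slide can be approximated by stripping known titles from the front of the name and recording a clearing code. The `TITLES` table below is a hypothetical one-entry reference list (the slide only shows "Dr."), and `parse_titles` is an illustrative name, not a product function.

```python
# Hypothetical reference list: normalized key -> (standard form, clearing code).
TITLES = {"dr": ("Dr.", "Academic_Title")}


def parse_titles(name):
    """Split leading titles from a name.

    Returns (remaining_name, titles, clearing_codes).
    """
    tokens = name.split()
    titles, codes = [], []
    while tokens and tokens[0].rstrip(".").lower() in TITLES:
        title, code = TITLES[tokens.pop(0).rstrip(".").lower()]
        titles.append(title)
        if code not in codes:
            codes.append(code)
    return " ".join(tokens), titles, codes


print(parse_titles("Dr. John Smith"))
print(parse_titles("Smith W. John"))
```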

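Slide 25 Name Parsing
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | F | | /16/1978 | Academic_Title |
| John | W. | Smith | M | | | |
| John | William | Smith | | SIN | | |
| J. | W. | Smith | M | | /16/78 | Academic_Title |
| John | | Smith | | | | |
| John | | Smith | | | | |
| John | | Smiht | | | | Last_name_not_found |
| Jane | | Watson | | | | |
| Jane | | Watson | F | | | |
| Jane | | Smith | F | SIN | | |
| J. | | Smith | | | | |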
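One way to reproduce the name-parsing results on this slide is to look each token up in reference lists of first and last names. The sketch below is an assumption about how such a step could work: `FIRST_NAMES`, `LAST_NAMES`, and `parse_name` are hypothetical, and the fallback of treating the final token as the last name is a design choice of this example.

```python
import re

# Hypothetical reference lists (assumptions for illustration).
FIRST_NAMES = {"john", "jane", "william"}
LAST_NAMES = {"smith", "watson"}


def parse_name(name):
    """Split a title-free name into first, middle, and last components."""
    # "J.W." becomes two tokens, "J." and "W."; plain words stay whole.
    tokens = re.findall(r"[A-Za-z]\.|[A-Za-z]+", name)
    codes = []
    last_index = next(
        (i for i, t in enumerate(tokens) if t.lower() in LAST_NAMES), None
    )
    if last_index is None:
        # No known last name: flag it and fall back to the final token.
        codes.append("Last_name_not_found")
        last_index = len(tokens) - 1
    last = tokens[last_index] if tokens else ""
    rest = tokens[:last_index] + tokens[last_index + 1:]
    # Prefer a token from the first-name list; otherwise take the first one.
    first_index = next(
        (i for i, t in enumerate(rest) if t.lower() in FIRST_NAMES), 0
    )
    first = rest[first_index] if rest else ""
    middle = " ".join(rest[:first_index] + rest[first_index + 1:])
    return {"first": first, "middle": middle, "last": last, "codes": codes}


for raw in ["Smith W. John", "J.W. Smith", "John Smiht"]:
    print(raw, "->", parse_name(raw))
```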

Slide 26 Update gender (based on first name)
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | M | | /16/ | le, Gender_changed |
| John | W. | Smith | M | | | |
| John | William | Smith | M | SIN | | Gender_updated |
| J. | W. | Smith | M | | /16/78 | Academic_Title |
| John | | Smith | M | | | Gender_updated |
| John | | Smith | M | | | Gender_updated |
| John | | Smiht | M | | | Last_name_not_found |
| Jane | | Watson | F | | | Gender_updated |
| Jane | | Watson | F | | | |
| Jane | | Smith | F | SIN | | |
| J. | | Smith | | | | |
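The two clearing codes on this slide suggest a simple rule: fill in a missing gender (`Gender_updated`) or correct a conflicting one (`Gender_changed`) from a first-name reference table. A minimal sketch under that assumption follows; `GENDER_BY_FIRST_NAME` and `update_gender` are hypothetical names.

```python
# Hypothetical first-name reference table (assumption).
GENDER_BY_FIRST_NAME = {"john": "M", "jane": "F"}


def update_gender(record):
    """Return a copy of record with gender derived from the first name."""
    updated = dict(record)
    codes = list(record.get("codes", []))
    expected = GENDER_BY_FIRST_NAME.get(record["first"].lower())
    if expected is not None:
        if not record.get("gender"):
            updated["gender"] = expected
            codes.append("Gender_updated")   # value was missing
        elif record["gender"] != expected:
            updated["gender"] = expected
            codes.append("Gender_changed")   # value contradicted the name
    updated["codes"] = codes
    return updated


print(update_gender({"first": "John", "last": "Smith", "gender": "F"}))
print(update_gender({"first": "Jane", "last": "Watson", "gender": ""}))
print(update_gender({"first": "J.", "last": "Smith", "gender": ""}))
```

An initial such as "J." is not in the table, so the record is left unchanged, which matches the last row of the slide.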

Slide 27 Validate Social Security Number
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | M | | /16/ | nged, SIN_blacklist |
| John | W. | Smith | M | | | SIN_removed_dashes |
| John | William | Smith | M | SIN | | ated, SIN_extra_chars |
| J. | W. | Smith | M | | /16/78 | ...mic_Title, SIN_invalid |
| John | | Smith | M | | | Gender_updated |
| John | | Smith | M | | | updated, SIN_missing |
| John | | Smiht | M | | | Last_name_not_found |
| Jane | | Watson | F | | | Gender_updated |
| Jane | | Watson | F | | | SIN_removed_dashes |
| Jane | | Smith | F | SIN | | SIN_extra_characters |
| J. | | Smith | | | | SIN_removed_dashes |
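The clearing codes on this slide (`SIN_removed_dashes`, `SIN_extra_characters`, `SIN_invalid`, `SIN_missing`) can be mimicked with a small cleaning function. The sketch assumes the column holds Canadian Social Insurance Numbers, which are nine digits with a Luhn check digit; the slides do not say which checks the product applies, and the blacklist check is left out. `clean_sin` and `luhn_valid` are hypothetical names, and 046 454 286 is a commonly cited sample number that passes the Luhn check.

```python
import re


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum the digits."""
    total = 0
    for position, ch in enumerate(reversed(digits)):
        d = int(ch)
        if position % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def clean_sin(raw):
    """Return (cleaned_sin_or_None, clearing_codes)."""
    if raw is None or not raw.strip():
        return None, ["SIN_missing"]
    codes = []
    value = raw.strip()
    if "-" in value:
        value = value.replace("-", "")
        codes.append("SIN_removed_dashes")
    if re.search(r"\D", value):
        value = re.sub(r"\D", "", value)
        codes.append("SIN_extra_characters")
    if len(value) != 9 or not luhn_valid(value):
        codes.append("SIN_invalid")
        return None, codes
    return value, codes


for raw in ["046-454-286", "SIN 046454286", "123456789", ""]:
    print(repr(raw), "->", clean_sin(raw))
```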

Slide 28 Validate Social Security Number (after)
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | M | | 12/16/ | nged, SIN_blacklist |
| John | W. | Smith | M | | | SIN_removed_dashes |
| John | William | Smith | M | | | ated, SIN_extra_chars |
| J. | W. | Smith | M | | 11/16/78 | ...mic_Title, SIN_invalid |
| John | | Smith | M | | | Gender_updated |
| John | | Smith | M | | | updated, SIN_missing |
| John | | Smiht | M | | | Last_name_not_found |
| Jane | | Watson | F | | | Gender_updated |
| Jane | | Watson | F | | | SIN_removed_dashes |
| Jane | | Smith | F | | | SIN_extra_characters |
| J. | | Smith | | | | SIN_removed_dashes |

Slide 29 Validate Birth Date
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | M | | 12/16/ | nged, SIN_blacklist |
| John | W. | Smith | M | | | SIN_removed_dashes |
| John | William | Smith | M | | | ated, SIN_extra_chars |
| J. | W. | Smith | M | | 11/16/78 | ...mic_Title, SIN_invalid |
| John | | Smith | M | | | Gender_updated |
| John | | Smith | M | | | updated, SIN_missing |
| John | | Smiht | M | | | Last_name_not_found |
| Jane | | Watson | F | | | _updated, BD_invalid |
| Jane | | Watson | F | | | SIN_removed_dashes |
| Jane | | Smith | F | | | SIN_extra_characters |
| J. | | Smith | | | | SIN_removed_dashes |
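Birth-date validation can be sketched as parsing the raw value, normalizing it to one format, and flagging anything unparseable with `BD_invalid`. The accepted input formats (month/day/year with a four- or two-digit year), the ISO output format, and the rule that a two-digit year landing in the future is moved back a century are all assumptions of this example; `clean_birth_date` is a hypothetical name.

```python
from datetime import date, datetime


def clean_birth_date(raw, today=None):
    """Return (iso_date_or_None, clearing_codes) for a month/day/year string."""
    today = today or date.today()
    text = (raw or "").strip()
    if not text:
        return None, []
    try:
        parsed = datetime.strptime(text, "%m/%d/%Y").date()
    except ValueError:
        try:
            parsed = datetime.strptime(text, "%m/%d/%y").date()
        except ValueError:
            return None, ["BD_invalid"]
        if parsed > today:
            # A two-digit year that lands in the future belongs to the last century.
            parsed = parsed.replace(year=parsed.year - 100)
    if parsed > today:
        return None, ["BD_invalid"]
    return parsed.isoformat(), []


reference_day = date(2009, 12, 8)
for raw in ["12/16/1978", "11/16/78", "13/45/1978", ""]:
    print(repr(raw), "->", clean_birth_date(raw, today=reference_day))
```

Passing `today` explicitly keeps the function deterministic in tests.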

Slide 30 Validate Birth Date (after)
| First | M | Last | G | SIN | Birth Date | Clearing Codes |
| John | | Smith | M | | | nged, SIN_blacklist |
| John | W. | Smith | M | | | SIN_removed_dashes |
| John | William | Smith | M | | | ated, SIN_extra_chars |
| J. | W. | Smith | M | | | mic_Title, SIN_invalid |
| John | | Smith | M | | | Gender_updated |
| John | | Smith | M | | | updated, SIN_missing |
| John | | Smiht | M | | | Last_name_not_found |
| Jane | | Watson | F | | | _updated, BD_invalid |
| Jane | | Watson | F | | | SIN_removed_dashes |
| Jane | | Smith | F | | | SIN_extra_characters |
| J. | | Smith | | | | SIN_removed_dashes |

Slide 31 Prepared data (after cleansing)
Cleansed data
| First | Last | G | SIN | Birth Date | Address |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | M4X 1V5;ON;Toronto;25 Linden Street |
| | Smith | M | | | |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smiht | M | | | |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Smith | F | | | |
| J. | Smith | | | | |

Slide 32 Master Data Management Examples

Slide 33 Prepared data (after cleansing)
Cleansed data
| First | Last | G | SIN | Birth Date | Address |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | M4X 1V5;ON;Toronto;25 Linden Street |
| | Smith | M | | | |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smiht | | | | |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Smith | F | | | |
| J. | Smith | | | | |

Slide 34 Match
Cleansed data
| First | Last | G | SIN | Birth Date | Address |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | M4X 1V5;ON;Toronto;25 Linden Street |
| | Smith | M | | | |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smith | M | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| John | Smiht | | | | |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Watson | F | | | L3T 7M8;ON;Markham;8500 Leslie Str. |
| Jane | Smith | F | | | |
| J. | Smith | | | | |

Slide 35 Merge
Cleansed data
| First | Last | G | SIN | Birth Date | Address |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | V3R 2A9;BC;Surrey; Avenue |
| John | Smith | M | | | M4X 1V5;ON;Toronto;25 Linden Street |
Golden record
| First | Last | G | SIN | Birth Date | Address |
| John | Smith | M | | | M4X 1V5;ON;Toronto;25 Linden Street |
- The newest permanent address: M4X 1V5;ON;Toronto;25 Linden Street
- The most frequent address: V3R 2A9;BC;Surrey; Avenue
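The two candidate rules on this slide, "newest permanent address" and "most frequent address", are survivorship rules for building the golden record. The sketch below shows both as interchangeable functions. The `modified` dates are invented for the example (the slide gives none), the record layout is hypothetical, and the address strings are copied as they appear in the transcript.

```python
from collections import Counter


def newest_address(records):
    """Survivorship rule: address from the most recently modified record."""
    dated = [r for r in records if r.get("address")]
    return max(dated, key=lambda r: r["modified"])["address"] if dated else None


def most_frequent_address(records):
    """Survivorship rule: address that occurs most often in the group."""
    counts = Counter(r["address"] for r in records if r.get("address"))
    return counts.most_common(1)[0][0] if counts else None


def golden_record(records, address_rule=newest_address):
    """Merge one matched group into a single best-representation record."""
    golden = {}
    for field in ("first", "last", "gender"):
        values = Counter(r[field] for r in records if r.get(field))
        golden[field] = values.most_common(1)[0][0] if values else None
    golden["address"] = address_rule(records)
    return golden


# One matched group; the 'modified' dates are hypothetical.
group = [
    {"first": "John", "last": "Smith", "gender": "M",
     "address": "V3R 2A9;BC;Surrey; Avenue", "modified": "2007-03-01"},
    {"first": "John", "last": "Smith", "gender": "M",
     "address": "V3R 2A9;BC;Surrey; Avenue", "modified": "2008-06-15"},
    {"first": "John", "last": "Smith", "gender": "M",
     "address": "M4X 1V5;ON;Toronto;25 Linden Street", "modified": "2009-10-30"},
]
print(golden_record(group))
print(golden_record(group, address_rule=most_frequent_address))
```

ISO-formatted date strings sort chronologically, which is why `max` can compare them directly.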

Slide 36 Demonstration

Slide 37 Thank-You