1
Achieving better Operations and Analytics through Master Data Management
James Cotton, Sr. Solution Architect, European Headquarters
2
What is Master Data?
Master data is data that is shared by multiple computer systems. (The Information Difference)
Master data is information that is key to the operation of a business … persistent, non-transactional data that defines a business entity for which there is, or should be, an agreed-upon view across the organisation. (Wikipedia)
Master data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts. (Gartner)
Master data is often one of the key assets of a company. (Microsoft)
3
What is Master Data Management?
Master data management is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise’s official shared master data assets. (Gartner)
Master Data Management comprises a set of processes, governance, policies, standards and tools that consistently defines and manages the master data. (Wikipedia)
The creation of:
  The Golden Record
  Single Version of the Truth
4
Types of data in an organisation
Unstructured: Found in e-mail, white papers, magazine articles, corporate intranet portals, product specifications, marketing collateral, and PDF files
Transactional: Related to sales, deliveries, invoices, trouble tickets, claims, and other monetary and non-monetary interactions
Metadata: Data about other data, including report definitions, column descriptions in a database, log files, connections, and configuration files
Hierarchical: Stores the relationships between other data, such as company organisational structures or product lines
Master: The critical nouns of a business, which fall generally into the groupings people, places and things
Source: The What, Why, and How of Master Data Management – Microsoft, November 2006
5
Understanding Master Data
Think of nouns and verbs
Bob Smith buys a widget (SKU #A1234) and ships it to his home address
The master data elements are the nouns: people, things, and places
The transactional data elements are the verbs that describe what happens to those people, places, and things
Nouns in the example: Bob Smith; widget (SKU #A1234); home address
Systems shown: CRM, Marketing, ERP, WMS, Financial
6
Deciding what Master Data should be Managed
Generally speaking, master data should meet the following requirements:
  Cardinality
  Volatility
  Lifetime
  Value
  Reuse
7
Master Data Management
Name: Bob Smith; Tel: ; DOB: 23/10/71; Gender: M
Name: Bob Smith; Tel: ; DOB: ; Gender: M
Name: B Smith; Tel: ; DOB: 23/10/71; Gender: M
Name: Bob Smith; Tel: ; DOB: 23/10/71; Gender:
Name: Bob Smith; Tel: ; DOB: ; Gender: Male
Name: B Smith; Tel: (0); DOB: 23-Oct-71; Gender: M
Name: Smith, Bob; Tel: (01283)56982; DOB: 23/10/1971; Gender:
Systems shown: CRM, Marketing, ERP, WMS, Financial
8
The Current Landscape of MDM Systems
Aberdeen Group – April 2012
9
Operational vs. Analytical Master Data Management
Operational data is the lifeblood of an organisation
Operational MDM centres on assuring a ‘single view’ of master data in the core systems used by business users
  Sales, service, order management, manufacturing, purchasing, billing, accounts receivable, accounts payable, payroll, etc.
Relies heavily on integration technologies to keep systems in sync
10
Operational vs. Analytical Master Data Management
Analytical data is used to support a company's decision making
Analytical MDM centres on assuring a ‘single view’ of master data in the downstream data warehouse, which is most often used to supply data to a business intelligence (BI) solution for historical and predictive analysis
Any data cleansing done inside an Analytical MDM solution is invisible to the transactional applications
11
Master Data Management - Value Across the Enterprise
Single Version of Truth = Better
Operational
  System synchronisation
  Consistency in transactional data
  Party/product data across all systems
  System integration/migration
  Cost reduction within the business process
Analytical
  Data aggregation & analysis
  Marketing segmentation & analysis
  Risk management
  Financial reporting
  Cost reduction and time savings in analysis
Maximum business value comes from managing both operational and analytical master data
12
Data Quality Improvement Concept
Data Integration: Get, Connect, Orchestrate
Data Analysis: Know, Explore, Profile
Data Quality: Improve, Standardise, Enrich
Data Mastering: Build, Match, Merge
Data Distribution: Share, Communicate, Analyse, Propagate
Data Governance: Manage, Control
14
Data Governance
People, Process, Technology
It embodies:
  Data quality
  Data management
  Data policies
  Business process management
  Risk management
It is about putting people in charge of fixing and preventing issues with data so that the enterprise can become more efficient.*
It’s about using technology when necessary in many forms to help aid the process.*
When companies desire, or are required, to gain control of their data, they empower their people, set up processes and get help from technology to do it.*
*Sarsfield, Steve (2009). "The Data Governance Imperative", IT Governance.
16
Data Integration - Batch
17
Data Integration – Real Time
19
Profiling
Profiling – Technical (Pre-built)
  Basic Analysis: minimums, maximums, averages, counts, etc.
  Patterns / Masking
  Extremes
  Quantities
  Frequency Analysis
  Foreign Key Analysis
Profiling – All
  Charting
  Grouping / Aggregate
  Drilldown / Interactive Displays
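The basic analyses listed above (counts, minimums, maximums, averages, pattern masking, frequency analysis) can be sketched in a few lines. This is only an illustration, not the tool shown in the deck: the function names `mask` and `profile`, the A/a/9 masking convention and the sample postcodes are assumptions made for this example.

```python
from collections import Counter
from statistics import mean
import re


def mask(value: str) -> str:
    """Pattern mask: upper-case letters become A, lower-case a, digits 9."""
    out = re.sub(r"[A-Z]", "A", value)
    out = re.sub(r"[a-z]", "a", out)
    return re.sub(r"[0-9]", "9", out)


def profile(values):
    """Basic technical profile of one column of string values."""
    present = [v for v in values if v not in (None, "")]
    lengths = [len(v) for v in present]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "avg_length": mean(lengths) if lengths else 0,
        "patterns": Counter(mask(v) for v in present),   # pattern analysis
        "frequencies": Counter(present),                  # frequency analysis
    }


# Sample column (illustrative values only), including one missing entry.
postcodes = ["DY102YZ", "PL5 2TF", "SE233DE", "TW7-5ED", ""]
report = profile(postcodes)
print(report["patterns"].most_common())
```

The pattern counts make inconsistent formats visible at a glance: values that share a mask were entered the same way, and rare masks are candidates for standardisation.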
20
Monitoring
Advanced Profiling: ‘custom’ analysis of data
  Defined by user and relevant to data context
  Multiple fields can be considered, e.g. address lines
  Complex computation may be required
  Output is binary (true/false): Data Quality Indicators (DQIs)
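A Data Quality Indicator as described above is a user-defined check that may look at several fields and returns true or false. A minimal sketch follows; the field names, the two example indicators and the simplified UK postcode pattern are assumptions for illustration, not rules taken from the deck.

```python
import re

# Simplified UK postcode shape (assumption: real validation is stricter
# and would normally use a reference table).
UK_POSTCODE = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]? [0-9][A-Z]{2}$")


def dqi_postcode_format(record: dict) -> bool:
    """True when the postcode has the standard 'outward inward' shape."""
    return bool(UK_POSTCODE.match(record.get("postcode", "")))


def dqi_address_present(record: dict) -> bool:
    """Multi-field check: at least one address line and a postcode are filled."""
    lines = [record.get(f"address_line_{i}", "") for i in range(1, 6)]
    return any(line.strip() for line in lines) and bool(record.get("postcode", "").strip())


# Registry of indicators; each one maps a record to True/False.
DQIS = {
    "address_present": dqi_address_present,
    "postcode_format": dqi_postcode_format,
}


def evaluate(record: dict) -> dict:
    """Run every indicator against one record."""
    return {name: check(record) for name, check in DQIS.items()}


print(evaluate({"address_line_1": "56 WOULD LANE", "postcode": "TW7-5ED"}))
```

Because every indicator is binary, results can be counted per indicator over time, which is what makes them suitable for monitoring.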
21
Data Governance - Monitoring
Diagram labels: Portal, DQ plan, Profiling/DQIs, Reports
23
Data Quality - Facts and Stats
The amount of data you have doubles every 12 to 18 months. (Thomas Redman – Data-Driven)
The average amount of inaccurate data in an organisation increased by 30% last year. (Experian Data Quality Survey)
50% of Data Warehouse projects will fail or receive limited acceptance because of NOT proactively addressing data quality issues. (Thomas Redman – Data-Driven)
75% of 250 CFOs surveyed said “data quality significantly impedes performance.” (Gartner Survey)
25
Cleansing
Parsing
  Data parsed into components (pattern based), e.g. Jim Smith -> Jim + Smith
Validation of Data Quality
  Validation against rules
  Validation against reference tables
Enrichment
  Adding data
Standardisation
  Transformation into standard format (Jim Smith -> James Smith)
  Standard and nonstandard abbreviations (Str. -> Street)
  Language-specific replacements
Large number of domain-oriented algorithms, for example:
  Name
  Address
  Credit card number
  Bank account number
Extension by custom validation steps, using complex functions and rules including:
  Levenshtein distance
  SoundEx
  Industry standard functions
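The cleansing steps above can be sketched as small functions: parsing a name into components, standardising a nickname and an abbreviation, and validating a town against a reference table with Levenshtein distance. The lookup tables, function names and the distance limit of 1 are hypothetical and deliberately tiny; a real cleansing tool ships far larger dictionaries and more robust parsers.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical reference data for the example.
NICKNAMES = {"JIM": "JAMES"}
ABBREVIATIONS = {"STR.": "Street"}
TOWNS = ["KIDDERMINSTER", "PLYMOUTH", "LONDON", "ISLEWORTH"]


def parse_name(full_name: str):
    """Parsing: split 'First Last' into its two components."""
    first, _, last = full_name.strip().partition(" ")
    return first, last


def standardise_first_name(first: str) -> str:
    """Standardisation: replace a known nickname with its formal form."""
    return NICKNAMES.get(first.upper(), first.upper()).title()


def expand_abbreviations(text: str) -> str:
    """Standardisation: expand known abbreviations word by word."""
    return " ".join(ABBREVIATIONS.get(word.upper(), word) for word in text.split())


def validate_town(town: str, max_distance: int = 1):
    """Validation against a reference table; None when nothing is close enough."""
    best = min(TOWNS, key=lambda ref: levenshtein(town.upper(), ref))
    return best if levenshtein(town.upper(), best) <= max_distance else None


first, last = parse_name("Jim Smith")
print(standardise_first_name(first), last)
print(expand_abbreviations("22 Ringmore Str."))
print(validate_town("PLYMUTH"))
```

Note that plain Levenshtein distance counts a transposition such as Smiht/Smith as two edits, so the distance limit has to be chosen with the expected kinds of typing error in mind.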
26
Scoring
Cleansing: Parsing, Validation, Enrichment, Standardisation
27
Data Before and After Cleansing
Input records (overview)
  ANNE PHILLIPS; Gender: F; Date of Birth: 14/11/1987; Address: 6 BOOTON COURT, KIDDERMINSTER, PORCESTERSHIRE; Postcode: DY102YZ
  CHRISTINE HALL; Gender: N; Date of Birth: 10/12/1940; Address: 56C HORNCHURCH ROAD, PLYMUTH, DEVON; Postcode: PL5 2TF
  JOHN SMITH; Gender: (blank); Date of Birth: 10/01/1971; Address: 22 RINGMORE STREET, LONDON; Postcode: SE233DE
  IAN SCOTT; Gender: Male; Date of Birth: 28.Oct.1956; Address: 56 WOULD LANE, ISLEWORTH, MIDDLESEX; Postcode: TW7-5ED
Cleansed values shown in the overview
  out_first_name: Anne, Christine, John, Ian
  out_last_name: Phillips, Hall, Smith, Scott
  out_gender: M
  out_birthdate: 28/10/1956
  out_address_line_1: 22 RINGMORE RISE, 56 WOOD LANE
  out_post_town: PLYMOUTH
  out_postcode: DY10 2YZ, SE23 3DE, TW7 5ED
  Score: 210, 300, 600
  Explanation: ADDRESS_VALID, GENDER_TAKEN_FROM_NAME, ADDRESS_CORRECTED_MINOR, _INV, DATE_STANDARDIZED, GENDER_STANDARDIZED, TELEPHONE_STANDARDIZED, ADDRESS_CORRECTED_MAJOR
Record detail: IAN SCOTT
  Name: IAN SCOTT -> out_first_name: Ian, out_last_name: Scott
  Gender: Male -> out_gender: M
  Date of Birth: 28.Oct.1956 -> out_birthdate: 28/10/1956
  Telephone: -> out_telephone:
  out_
  Address Line 1: 56 WOULD LANE -> out_address_line_1: 56 WOOD LANE
  Address Line 2: -> out_address_line_2:
  Address Line 3: -> out_address_line_3:
  Address Line 4: ISLEWORTH -> out_address_line_4:
  Address Line 5: MIDDLESEX -> out_post_town:
  Postcode: TW7-5ED -> out_postcode: TW7 5ED
  Score: 600
  Explanation: _INV, DATE_STANDARDIZED, GENDER_STANDARDIZED, TELEPHONE_STANDARDIZED, ADDRESS_CORRECTED_MAJOR
Record detail: JOHN SMITH
  Name: JOHN SMITH -> out_first_name: John, out_last_name: Smith
  Gender: -> out_gender: M
  Date of Birth: 10/01/1971 -> out_birthdate:
  Telephone: -> out_telephone:
  out_
  Address Line 1: 22 RINGMORE STREET -> out_address_line_1: 22 RINGMORE RISE
  Address Line 2: -> out_address_line_2:
  Address Line 3: -> out_address_line_3:
  Address Line 4: -> out_address_line_4:
  Address Line 5: LONDON -> out_post_town:
  Postcode: SE233DE -> out_postcode: SE23 3DE
  Score: 300
  Explanation: GENDER_TAKEN_FROM_NAME, ADDRESS_CORRECTED_MINOR
28
Data Governance – Issue Resolution
Is the score lower than the threshold? Yes No
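The decision above can be expressed as a one-line routing rule. The sketch assumes, as the example scores on the previous slide suggest (the record with a major address correction scores 600, the one with minor corrections 300), that a higher score means more corrections were needed. The threshold value of 500 and the outcome names `accept` and `issue_list` are hypothetical.

```python
THRESHOLD = 500  # hypothetical value; in practice set by the data stewards


def route(record: dict, threshold: int = THRESHOLD) -> str:
    """Score lower than the threshold: accept; otherwise raise an issue."""
    return "accept" if record["score"] < threshold else "issue_list"


for rec in ({"name": "JOHN SMITH", "score": 300}, {"name": "IAN SCOTT", "score": 600}):
    print(rec["name"], "->", route(rec))
```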
29
Data Governance - Issue Management
Diagram labels: Portal, DQ plan, Profiling/DQIs, Reports, Issue data, Issue Database, Issue List, Workflow, Exception Mgt
32
Matching
Goal: Identify groups of records that in reality represent a single client or entity.
This may not be so simple:
  Data comes from different sources
  Must handle data that is missing, wrong or conflicting
  There’s no single ‘correct’ solution
Match & Merge
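One possible matching rule can be sketched with the standard library alone. Everything here is an assumption for illustration: the rule itself (same postcode, same first initial, similar last name), the 0.75 similarity cut-off, and the sample records, which reuse names and postcodes seen elsewhere in the deck but pair them arbitrarily.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.75) -> bool:
    """Fuzzy string comparison that tolerates small typing errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def is_match(r1: dict, r2: dict) -> bool:
    """Hypothetical rule: same postcode, same first initial, similar last name."""
    same_postcode = bool(r1["postcode"]) and r1["postcode"] == r2["postcode"]
    same_initial = r1["first"][0].lower() == r2["first"][0].lower()
    return same_postcode and same_initial and similar(r1["last"], r2["last"])


def group_matches(records: list) -> list:
    """Greedy single pass: a record joins the first group it matches."""
    groups = []
    for record in records:
        for group in groups:
            if any(is_match(record, member) for member in group):
                group.append(record)
                break
        else:
            groups.append([record])
    return groups


# Invented sample records.
records = [
    {"first": "John", "last": "Smith", "postcode": "SE23 3DE"},
    {"first": "John", "last": "Smiht", "postcode": "SE23 3DE"},
    {"first": "J.", "last": "Smith", "postcode": "NE35 9BD"},
    {"first": "Jane", "last": "Watson", "postcode": "SA72 6YB"},
]
groups = group_matches(records)
print(len(groups), "groups")
```

Under this rule "J. Smith" at a different postcode stays separate; a rule keyed on birth date instead of postcode could group it with the other two. That is the sense in which there is no single correct solution. The greedy pass also never merges two existing groups, which a production matcher would need to handle.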
33
How many people are here?
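Cleansed data (table; most cell values were not preserved)
Columns: First, Last, G, SIN, Birth Date, Address
Values shown: First: John, Jane, J.; Last: Smith, Smiht, Watson; G: M, F
Addresses shown:
  22 Ringmore Street, London, SE23 3DE
  74 Arnold Street, Boldon Colliery, Bolton, NE35 9BD
  3 Catalina Avenue, Pembroke Dock, SA72 6YB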
34
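Match
Cleansed data (table; most cell values were not preserved)
Columns: First, Last, G, SIN, Birth Date, Address
Values shown: First: John, Jane, J.; Last: Smith, Smiht, Watson; G: M, F
Addresses shown:
  22 Ringmore Street, London, SE23 3DE
  74 Arnold Street, Boldon Colliery, Bolton, NE35 9BD
  3 Catalina Avenue, Pembroke Dock, SA72 6YB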
35
Merging
Creating the Golden Record
Can cherry-pick the best fields or even the best record
Using rules to determine the best field/record, for example:
  The one from the ‘reference system’
  The newest one
  The one of highest quality
Aggregation functions
  SQL-like: count, sum, minimum, maximum, average
  Mode (most frequent value), concatenate
Match & Merge
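The field-level rules above (reference system, newest, most frequent) can be sketched as small survivorship functions that are combined per field. The choice of CRM as the reference system, the rule assigned to each field, and the sample records with their `updated` dates are all invented for this example.

```python
from collections import Counter
from datetime import date


def from_reference_system(records, field, system="CRM"):
    """Value from the designated reference system, if it has one."""
    for r in records:
        if r["source"] == system and r.get(field):
            return r[field]
    return None


def newest(records, field):
    """Value from the most recently updated record that has the field filled."""
    candidates = [r for r in records if r.get(field)]
    return max(candidates, key=lambda r: r["updated"])[field] if candidates else None


def most_frequent(records, field):
    """Mode of the non-empty values."""
    values = [r[field] for r in records if r.get(field)]
    return Counter(values).most_common(1)[0][0] if values else None


def golden_record(records):
    """Cherry-pick the best value per field using a different rule for each."""
    return {
        "first": from_reference_system(records, "first") or most_frequent(records, "first"),
        "last": most_frequent(records, "last"),
        "address": newest(records, "address"),
    }


# One group of matched records (invented sources and dates).
matched = [
    {"source": "CRM", "first": "John", "last": "Smith",
     "address": "22 Ringmore Street, London, SE23 3DE", "updated": date(2011, 5, 1)},
    {"source": "ERP", "first": "J.", "last": "Smith",
     "address": "74 Arnold Street, Boldon Colliery, Bolton, NE35 9BD", "updated": date(2012, 3, 1)},
    {"source": "WMS", "first": "John", "last": "Smiht",
     "address": "", "updated": date(2012, 4, 1)},
]
golden = golden_record(matched)
print(golden)
```

Swapping `newest` for `most_frequent` on the address field changes the surviving address, so the choice of rule per field is a business decision rather than a technical one.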
37
Merge
Cleansed data (columns: First, Last, G, SIN, Birth Date, Address)
  John Smith, M
  22 Ringmore Street, London, SE23 3DE
  74 Arnold Street, Boldon Colliery, Bolton, NE35 9BD
Golden record (columns: First, Last, G, SIN, Birth Date, Address)
  Address rules shown: the most frequent address; the newest permanent address
39
Master Data Management Architectures
Consolidated
  Master is Single Version of Truth
  Data Quality at Master
  Updates occur at Sources
  Updates propagated to Master
Coexistence
  Master is Single Version of Truth
  Data Quality is ongoing
  Updates occur at Sources or Master
  Updates propagated to other Sources
Registry
  Multiple Versions of Truth
  Data Quality is ongoing
  Updates occur at Sources
  Keys and Metadata in Registry
  Updates optionally propagated to other Sources
Centralised
  Master is Single Version of Truth
  Data Quality at Master
  Updates occur at Master
  Updates propagated to Sources
40
Data Distribution
41
Intelligence
42
Summary
It’s a process and it’s iterative
Enable the process via technology
Start small with an eye on the long-term
Understand that requirements will change over time
Know that Information Builders can help you
43
Data Quality Challenge
44
Thank you!