Presentation is loading. Please wait.

Presentation is loading. Please wait.

How well do you know your DATA?

Similar presentations


Presentation on theme: "How well do you know your DATA?"— Presentation transcript:

1 How well do you know your DATA?
John Ramoutsakis May 10, 2012

2 Is Data Liability? $$$ for Data Storage $$$ for Data Backups
$$$ for Data Archiving $$$ for Data Replication $$$ for Data Synchronization $$$ for Disaster Recovery Planning

3 Is Data Asset? Helps in making decisions
Provides 360 degree view across the enterprise Helps to understand the customer Helps in building effective Marketing Campaigns Predictive Analysis Statistical Analysis Sentimental Analysis

4 Data Governance Program
People Organizations need executive sponsorship Process Documented repeatable processes and procedures Technology Data Integration, Data Quality, Data Synchronization, and Data Management Data Governance People Process Technology

5 iWay Data Integration Enablement
ERP/Financials Ariba I2 JD Edwards Lawson Manugistics Microsoft Oracle SAP Industry HIPAA CIDX HL7 RNIF SWIFT 1Sync Legacy Systems CICS IMS VSAM .NET Java TUXEDO etc SFA/CRM Amdocs/Clarify BMC/Remedy MSDynamics Oracle/Siebel Salesforce.com SAP Data Warehouse DB2 ETL Oracle/Essbase MS SSAS/OLAP Netezza SAP BW Teradata B2B Internet EDI Legacy EDI MFT Online B2B XML 300+ Adapters

6 Data Profiling Statistical Analysis
An overview of summary values, such as extremes, distribution and frequency analysis. Domain Analysis A configurable analysis of data types. Mask and Group Analysis An overview of value formats, groups and dimensions. Business Rules An analysis of the results of user-defined business rules. Foreign Key and Dependency Analyses An inside look into complex connections in the data. Drill Through The option to display individual records that correspond to aggregated results. Data Mart Reporting and analysis across multiple data set analyses Web and/or hardcopy report viewing and distribution

7 Data Quality Management Cycle
Parsing Association (householding) Format correction Issues causes identification Content evaluation Metadata understanding Automatic Profiling Context-based cleansing Deviance Standardization Ongoing monitoring Enrichment KPI definition Unification Deduplication / identification Data Quality Management Cycle Monitoring and reporting Data understanding Data enhancement Data cleansing

8 iWay Data Quality Center
Parsing: Decomposition of fields into component parts. Cleansing: Modification of data values to meet domain restrictions, integrity constraints or other business rules that define sufficient data quality for the organization. Standardization: Formatting of values into consistent layouts based on industry standards, local standards, user-defined business rules and knowledge bases of values and patterns. Validation: Formatting of values into consistent layouts based on industry standards, local standards, user-defined business rules and knowledge bases of values and patterns. Enrichment: Enhancing the value of internally held data by appending related attributes from external sources. Matching: Identification, linking or merging related entries within or across sets of data.

9 Mastering Master Data What is Master Data?
Data describing your main business entities Data duplicated in multiple systems Data reused by multiple business processes Examples Customer/Citizen/Patient Company/Partner/Agency Products/Items/Equipment Vendors/Suppliers Cost Centers/Employees Etc, etc, …

10 Master Data – Match & Merge
Unification identification of the set of records connected to one person address vehicle contact …etc. Deduplication golden record creation (the best representation of the identified subject) Identification new data entries – to identify subject (person, address, etc.) to which the new record is connected (matched) Complex business rules using sophisticated algorithms and functions including Levenstein distance Hamming distance Edit distance Data quality scores values Data stamps of last modification Source system originating data etc.

11 Data Quality Portal - Complex Exception Handling
KPI / DQI calculation DQ plan Reports Invalid data extraction Resolution queue Resolution Queue Workflow Exception DB Exception management

12 Human Mind vs. Computer Systems
Hahaha raed tihs! i cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg. The phaonemnel pweor of the hmuan mnid, aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it dseno't mtaetr in waht oerdr the ltteres in a wrod are, the olny iproamtnt tihng is taht the frsit and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it whotuit a pboerlm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Azanmig huh?

13 Original data – before cleansing
Source data Name G SIN Birth Date Address Dr. John Smith M 12/16/1978 Ave Surrey V3R 2A9 Smtih W. John Surrey Ave Jhon William Simth SIN 781612 25 Linden Str Toronto M4X 1V5 Dr. J.W. Smith 11/16/78 John Smith 8500 Leslie L3T 7M8 Toronto Smith Jhon 8500 Leslie street Marham John Smiht

14 Prepared data (after cleansing)
Cleansed data First Last G SIN Birth Date Address John Smith M V3R 2A9;BC;Surrey; Avenue Smtih Jhon Simth M4X 1V5;ON;Toronto;25 Linden Street L3T 7M8;ON;Markham;8500 Leslie Str. Smiht

15 Match Cleansed data First Last G SIN Birth Date Address John Smith M
V3R 2A9;BC;Surrey; Avenue Smtih Jhon M4X 1V5;ON;Toronto;25 Linden Street L3T 7M8;ON;Markham;8500 Leslie Str. Smiht

16 Merge Cleansed data First Last G SIN Birth Date Address Golden record
John Smith M V3R 2A9;BC;Surrey; Avenue Smtih Jhon M4X 1V5;ON;Toronto;25 Linden Street John Smith M V3R 2A9;BC;Surrey; Avenue M4X 1V5;ON;Toronto;25 Linden Street Golden record First Last G SIN Birth Date Address The newest permanent address The most frequent address

17 Merged records – before update
Source data First Last G SIN Birth Date Address John Smith M V3R 2A9;BC;Surrey; Avenue M4X 1V5;ON;Toronto;25 Linden Street L3T 7M8;ON;Markham;8500 Leslie Str. Smiht Golden record First Last G SIN Birth Date Address John Smith M M4X 1V5;ON;Toronto;25 Linden Street L3T 7M8;ON;Markham;8500 Leslie Str.

18 Merged records – after update
Source data First Last G SIN Birth Date Address John Smith M V3R 2A9;BC;Surrey; Avenue M4X 1V5;ON;Toronto;25 Linden Street L3T 7M8;ON;Markham;8500 Leslie Str. Smiht One updated source record may cause modification in several records in MDC Golden record First Last G SIN Birth Date Address John Smith M V3R 2A9;BC;Surrey; Avenue M4X 1V5;ON;Toronto;25 Linden Street

19 Real World Use Case The Goal The Challenge The Strategy The Benefits
Major hospital group is building a Master Patient Index Need to bring in acquisitioned systems Cleanse, Standard, Deduplicate The Challenge Previously manually processed by hiring temporary staff Current phase projected to take temporary staff of 20 over 18 months The Strategy Automate the cleansing, matching and merging business rules Data Stewardship provides human oversight to automated process The Benefits Identifies the duplicate records according to very complex business rules Reusable rules for future phases Significantly reduced project time – from 18 down to 4 months. Over 400% ROI projected

20 Real World Use Case Goal The Challenge The Strategy The Benefits
Performance Management Business Intelligence Change Management Process The Challenge 100 Locations 14 Systems with out-of-sync master data The Strategy Cleanse, Standardize, Match Master Data Management – Directorate, Borough, Site, Service Type, Service Point, Team, Staff, Patient Master Data Governance Workflow The Benefits Dynamic organizational change to support strategic initiatives Complete visibility into performance of organization vs goals

21 Real World Use Case The Goal The Challenge The Strategy The Benefits
Services organization supporting the airline industry sells decision support information to the industry members. The Challenge Data Quality was adversely affecting the customer base satisfaction Data Quality was impacting new revenue generation opportunities The Strategy Profile analysis according to specific business validation rules Monitor rolling 13 month window comparison of monthly data profiles Accumulate and report analysis to data providers The Benefits Improves customer satisfaction and confidence in the information Increases reliability of the information as new data sources are added Documents and audits quality-control processes for customer review Reduces the dependency on human resources to detect and correct data quality issues

22 Summary of considerations
Information Access Data Quality Master Management Governance Access to variety of data sources Ability to influence data improvement anywhere in the process Useable in batch and/or (real) real-time processing mode Extensible by customized business rules Access to third party data and services Historical and distributable analysis Reusability across multiple phases and projects Integrated data stewardship Platform flexibility for deployment and licensing Vendor partnership and support

23 iWay Software Benefits
Integrate All Information Any Data Any System Any Protocol Any Platform Real-time, Online, and Batch Data Integration Application Integration Business Integration Service Oriented Architecture Any Process Latency Scheduled Process Driven Event Driven User Driven Single Solution Platform Single Engine Fast and Scalable Secure and Reliable Fully Extensible

24 Questions?


Download ppt "How well do you know your DATA?"

Similar presentations


Ads by Google