Steve Simon MVP SQL Server BI

Presentation transcript:

A dive into Data Quality Services. SQL Server 2014. Boston, MA, September 27, 2014. Steve Simon, MVP SQL Server BI, http://www.infogoldusa.com

Steve Simon is a SQL Server MVP and a Senior Business Intelligence Development Engineer with Atrion Networking Corporation, Providence, RI, USA. He has been involved with database design and analysis for over 29 years. Steve has presented numerous papers at PASS summits over the years, including PASS Europe, in addition to numerous presentations at SQL Saturday events, the Amsterdam and Copenhagen user groups, and other local user groups. He is a PASS Virtual Chapter Regional Mentor.

Our business challenge

Lack of conformity. Products, manufacturers and descriptions can be added to the database table in many ways. This leaves these ‘descriptions’ unreliable for use within query predicates. With SQL Server 2008 we had the Data Profiling task, which advised of a problem but offered no intelligent solutions.
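The predicate problem described above can be illustrated with a small Python sketch. This is not DQS and not the presenter's demo: the rows, the manufacturer names and the toy `conformed` rule are all invented for illustration, assuming only that the same manufacturer was typed in several ways.

```python
# Hypothetical product rows; the manufacturer names are invented for illustration.
rows = [
    {"id": 1, "manufacturer": "Acme Corp"},
    {"id": 2, "manufacturer": "ACME Corporation"},
    {"id": 3, "manufacturer": "acme corp."},
    {"id": 4, "manufacturer": "Globex"},
]

def exact_match(rows, wanted):
    """Mimic a query predicate such as WHERE manufacturer = 'Acme Corp'."""
    return [r["id"] for r in rows if r["manufacturer"] == wanted]

def conformed(name):
    """Toy conformity rule (an assumption): lower-case, drop a trailing
    period, and expand the abbreviation 'corp' to 'corporation'."""
    cleaned = name.strip().lower().rstrip(".")
    words = ["corporation" if w == "corp" else w for w in cleaned.split()]
    return " ".join(words)

def conformed_match(rows, wanted):
    """Compare conformed values instead of the raw strings."""
    target = conformed(wanted)
    return [r["id"] for r in rows if conformed(r["manufacturer"]) == target]

exact_ids = exact_match(rows, "Acme Corp")
conformed_ids = conformed_match(rows, "Acme Corp")
print(exact_ids)      # [1]
print(conformed_ids)  # [1, 2, 3]
```

The exact predicate finds one of the three rows for the same manufacturer, which is why unconformed descriptions are unreliable in a WHERE clause.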

Lack of conformity. Weekly modifications are made to core data that should have been entered correctly the first time. OLAP solutions seemed to be the least flexible. Reprocessing of cubes, and the time expended, can mount up.

Enter Data Quality Services

What is Data Quality Services? A set of tools that allows data stewards (domain experts) to improve data quality. It produces a result set with suggested improvements. It does not change the original source data set.

Why should we use Data Quality Services? You can get Subject Matter Expert (SME) input. Manually define, match and cleanse (man and machine). Computer cleansing of your data. How? Programmatically, then manually approve. The system learns.

Why should we use Data Quality Services? Can integrate with third party data. Can integrate with other data processing e.g. SSIS.

How to use Data Quality Services List of basic steps: Create / Refine / Use a knowledge base. Perform a data quality evaluation. Generate output.
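The three steps above (knowledge base, evaluation, output) can be sketched in plain Python. This is only a conceptual sketch and does not call DQS, which the talk notes has no API. The `knowledge_base` layout, the status labels, the city names and the use of `difflib` for similarity are all assumptions made for illustration.

```python
import difflib

# Step 1: a hypothetical knowledge base of valid values plus known synonyms.
knowledge_base = {
    "valid": ["Boston", "Providence", "Hartford"],
    "synonyms": {"Bostn": "Boston"},
}

def evaluate(source, kb, cutoff=0.8):
    """Step 2: evaluate each source value against the knowledge base.
    Returns a new result set; the source list is never modified."""
    results = []
    for value in source:
        if value in kb["valid"]:
            output, status = value, "correct"
        elif value in kb["synonyms"]:
            output, status = kb["synonyms"][value], "corrected"
        else:
            # Similarity matching stands in for the machine's suggestion.
            close = difflib.get_close_matches(value, kb["valid"], n=1, cutoff=cutoff)
            if close:
                output, status = close[0], "suggested"
            else:
                output, status = value, "new"
        results.append({"source": value, "output": output, "status": status})
    return results

def approve(kb, record):
    """A data steward approves a suggestion, so the knowledge base 'learns' it."""
    kb["synonyms"][record["source"]] = record["output"]

source = ["Boston", "Bostn", "Providense", "Springfield"]

# Step 3: generate output, here just the list of statuses.
first_pass = evaluate(source, knowledge_base)
statuses = [r["status"] for r in first_pass]
print(statuses)  # ['correct', 'corrected', 'suggested', 'new']

# The steward approves the suggested fix; the next run corrects it automatically.
approve(knowledge_base, first_pass[2])
second_pass = evaluate(source, knowledge_base)
print(second_pass[2]["status"])  # corrected
```

Returning a separate result set mirrors the point that the original source data is left unchanged, and the approval step shows how manual review can shrink as the knowledge base matures.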

List of components DQS Server. DQS Client.

Three main activities

When do we use Data Quality Services?
Issue: Detail
Completeness: Is all the data there?
Conformity: Is all the data in the correct format? (capitals?)
Consistency: Do values represent the same meaning? (data mining and human check)
Accuracy: Do data objects represent real-world values?
Validity: Do data values fall within acceptable ranges?
Duplication: Are there multiple copies of the same data?

Courtesy Elad Ziklik
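Four of the issues in the table above lend themselves to simple mechanical checks, sketched below in Python. The records, the two-letter state format and the 0 to 120 age range are assumptions invented for illustration. Consistency and accuracy are left out because, as the table notes, they need reference data or a human check.

```python
import re

# Hypothetical customer records, invented for illustration.
records = [
    {"id": 1, "state": "MA", "age": 34,   "email": "ann@example.com"},
    {"id": 2, "state": "ma", "age": 41,   "email": "bob@example.com"},
    {"id": 3, "state": "RI", "age": None, "email": "cy@example.com"},
    {"id": 4, "state": "CT", "age": 230,  "email": "di@example.com"},
    {"id": 5, "state": "MA", "age": 34,   "email": "ann@example.com"},
]

def incomplete(rows):
    """Completeness: rows with any missing value."""
    return [r["id"] for r in rows if any(v is None for v in r.values())]

def nonconforming(rows):
    """Conformity: state must be exactly two capital letters (assumed format)."""
    return [r["id"] for r in rows
            if r["state"] is not None and not re.fullmatch(r"[A-Z]{2}", r["state"])]

def invalid(rows, low=0, high=120):
    """Validity: age must fall within an assumed acceptable range."""
    return [r["id"] for r in rows
            if r["age"] is not None and not (low <= r["age"] <= high)]

def duplicates(rows):
    """Duplication: rows whose non-id fields repeat an earlier row."""
    seen, dupes = set(), []
    for r in rows:
        key = (r["state"], r["age"], r["email"])
        if key in seen:
            dupes.append(r["id"])
        seen.add(key)
    return dupes

report = {
    "completeness": incomplete(records),
    "conformity": nonconforming(records),
    "validity": invalid(records),
    "duplication": duplicates(records),
}
print(report)
# {'completeness': [3], 'conformity': [2], 'validity': [4], 'duplication': [5]}
```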

Requirements: SQL Server 2012, BI Edition or Enterprise Edition. Post install: must run the DQS installation after the SQL Server install. Enterprise Edition: do Master Data Services integration.

Installing the server portion

After installation in SSMS

After having installed the DQS Server portion of the application, you get a few new databases and new security roles.

There is no API at present, so we must work with the ‘native’ client. Designed for single write (at this moment).

Being a single-write system: while a KB is being edited it is locked, and others cannot use it. For a complex KB, split it up and import later. Each person can create their own entities or domains. KB ‘pieces’ can be imported.

Meanwhile, back at the ranch! ‘The challenge’

Cleansing Issues Are duplicates wrong here?

Demo

SSIS

Take Aways. Data quality can be problematic: human problems. DQS, like any mining model, is able to ‘learn’. DQS learns data patterns and can detect data duplication. Human intervention decreases as models mature.

Take Aways. Better quality data reduces data processing times, e.g. cubes become effective and efficient. DQS knowledge bases can be used with SSIS. Output (correct or problematic) can be sent to data stewards for validation and verification. Less frustration for end users and decision makers.

At the end of the day it all boils down to More importantly, why all of us can definitely benefit from a..
