SQL Server Data Quality Services: A Knowledge-Driven Data Quality Solution

Microsoft Charlotte, NC
Microsoft Charlotte has ~900 employees.
CTS Support (Windows, Exchange, SQL, Visual Studio, .NET, SharePoint, Office 365), MCS Consulting, MS Sales, Premier Technical Account Managers, Premier Field Engineers, Premier Labs

Defining EIM: Enterprise Information Management
The set of capabilities enabling the enterprise to get the right data to the right consumers, reliably, repeatably, efficiently, and with high confidence.
Technology phrases you hear:
  Enterprise Information Management, Data Governance, Data Stewardship, Metadata Management
  Data Quality, Data Cleansing, Matching, Deduplication, Identity Resolution
  Master Data Management, Dimension Management, Reference Data Management
  Data Integration, ETL, ELT, Replication, EII, Federated Query, IaaS, CDC
  and more …

Enterprise Information Management in SQL Server “Denali”
  Data Quality Services: knowledge-based data cleansing and matching
  Master Data Services: master and reference data management
  Integration Services: ETL and data integration tool
Audience poll: how many of you use any of these 3 features today?

What is Data Quality?

Common Data Quality Issues
(Data quality issue: question. Sample data problem.)
  Standard: Are data elements consistently defined and understood? Gender code = M, F, U in one system and Gender code = 0, 1, 2 in another system.
  Complete: Is all necessary data present? 20% of customers’ last name is blank, 50% of zip-codes are …
  Accurate: Does the data accurately represent reality or a verifiable source? A Supplier is listed as ‘Active’ but went out of business six years ago.
  Valid: Do data values fall within acceptable ranges? Salary values should be between 60, ,000
  Unique: Data appears several times. Both John Ryan and Jack Ryan appear in the system. Are they the same person?
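As a rough illustration of the Standard, Complete and Valid checks listed above, the following Python sketch flags issues in a few records. The sample rows, the field names and the salary bounds are all assumptions made for the example (the bounds on the slide did not survive extraction), and this is not how DQS itself evaluates data.

```python
# Hypothetical sample rows; field names and values are illustrative only.
CUSTOMERS = [
    {"id": "1", "last_name": "Ryan", "gender": "M", "salary": "75000"},
    {"id": "2", "last_name": "", "gender": "1", "salary": "82000"},
    {"id": "3", "last_name": "Smith", "gender": "F", "salary": "9000000"},
]

# Assumed standard code set and assumed salary bounds (not taken from the slide).
GENDER_CODES = {"M", "F", "U"}
SALARY_MIN, SALARY_MAX = 50_000, 150_000


def check_row(row):
    """Return the list of quality issues found in one record."""
    issues = []
    # Complete: is all necessary data present?
    if not (row.get("last_name") or "").strip():
        issues.append("complete: last_name is blank")
    # Standard: is the value drawn from the agreed code set?
    if row.get("gender") not in GENDER_CODES:
        issues.append("standard: gender code not in M/F/U")
    # Valid: does the value fall within the acceptable range?
    salary = int(row["salary"])
    if not SALARY_MIN <= salary <= SALARY_MAX:
        issues.append("valid: salary out of range")
    return issues


def check_all(rows):
    """Map each record id to the issues found for that record."""
    return {row["id"]: check_row(row) for row in rows}


if __name__ == "__main__":
    for record_id, issues in check_all(CUSTOMERS).items():
        print(record_id, issues or "ok")
```

Accuracy and uniqueness are left out on purpose: the first needs an external source of truth and the second needs matching logic.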

Audience Poll: who is responsible for Data Quality in your organization?
  DBA
  Data Steward / Business Analyst
  BI Developer

Requirements for Data Quality Solutions
  Cleansing: Amend, remove or enrich data that is incorrect or incomplete. This includes correction, standardization and enrichment.
  Matching: Identifying, linking or merging related entries within or across sets of data.
  Profiling: Analysis of the data source to provide insight into the quality of the data and help to identify data quality issues.
  Monitoring: Tracking and monitoring the state of quality activities and quality of data.
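To make the Profiling requirement concrete, here is a small Python sketch that computes a few per-column statistics (completeness, distinct values, most frequent value). The function name and the choice of statistics are assumptions for illustration; the profiling built into DQS reports its own set of measures.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: how full it is and how varied its values are."""
    total = len(values)
    # Treat None and whitespace-only strings as missing.
    filled = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(filled)
    return {
        "total": total,
        "completeness": len(filled) / total if total else 0.0,
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


if __name__ == "__main__":
    print(profile_column(["WA", "MA", "", None, "WA"]))
```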

What is DQS?
Data Quality Services (DQS) is a knowledge-driven data quality solution that enables IT pros and data stewards to easily improve the quality of their data.

  Knowledge-Driven: Based on a Data Quality Knowledge Base (DQKB)
  Semantics: Data Domains capture the semantics of your data
  Knowledge Discovery: Acquires additional knowledge the more you use it
  Open and Extendible: Support use of user-generated knowledge and IP by 3rd party reference data providers
  Easy to use: Compelling user experience designed for increased productivity

Make Data Quality Approachable To Everyone

DQS Process (diagram)
  Build: Knowledge Management (Discover / Explore Data / Connect; Manage Knowledge)
  Use: DQ Projects (Correct & standardize; Match & De-dupe)
  Knowledge Base
  Sources shown: Enterprise Data, Reference Data, Cloud Services
  Also shown: Integrated Profiling, Notifications, Progress, Status

DQS High Level Scenarios
  Knowledge Management & Reference Data
    Creating and managing the Data Quality Knowledge Bases
    Discover knowledge from your org’s data samples
    Exploration and integration with 3rd party reference data
  Cleansing & Matching
    Correction, de-duplication and standardization of the data
  Administration
    Tools to monitor and control data quality processes

1. Run SQL Setup to add DQS features
   Need to be Administrator
   64-bit recommended
   One DQS server per SQL instance possible
   Separate checkboxes for Client and Server and SSIS
2. Run DQSInstaller.exe
   Be Windows Admin
   Be SQL SysAdmin
   Find DQSInstaller.exe
   Run as UAC elevated Admin
   Enter Password
   Overwrite existing DQS?
3. Set up Initial Security and Connectivity
   Sysadmin add logins and users
   Enable users in DQS_MAIN
   Map to dqs_* roles
   Enable TCP connectivity
   Enable Access to Data Sources
   Excel bit

C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Binn\DQSInstaller.exe
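Step 3 of the setup ("Enable users in DQS_MAIN" and "Map to dqs_* roles") comes down to a few T-SQL statements per login. The Python sketch below only builds that T-SQL as text so it can be reviewed before a sysadmin runs it. The helper names are hypothetical, and the three role names are the ones DQS_MAIN is expected to define; confirm them on your own instance before relying on this.

```python
# Role names assumed to exist in DQS_MAIN; verify on your instance.
DQS_ROLES = ("dqs_administrator", "dqs_kb_editor", "dqs_kb_operator")


def quote_name(name):
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def grant_dqs_role_sql(login, role):
    """Build T-SQL that creates a DQS_MAIN user for a login and adds it to a role."""
    if role not in DQS_ROLES:
        raise ValueError(f"unknown DQS role: {role}")
    user = quote_name(login)
    return "\n".join([
        "USE [DQS_MAIN];",
        f"CREATE USER {user} FOR LOGIN {user};",
        f"ALTER ROLE {quote_name(role)} ADD MEMBER {user};",
    ])


if __name__ == "__main__":
    # Hypothetical Windows login used only as an example.
    print(grant_dqs_role_sql("CONTOSO\\steward", "dqs_kb_editor"))
```

The script assumes the server login already exists; creating logins and enabling TCP connectivity are separate steps on the slide.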

Data Quality Knowledge Base (DQKB) (diagram)
  Knowledge Base: Domains, Composite Domains, Matching Policy
  Domains: represent the data type
  Per domain: Values, Rules & Relations, 3rd party Reference Data

Create a KB / Domain Management
  Create a new KB or open an existing one
    Define Domains and their data types, rules, reference data, domain rules, and term-based relationships
    Define Composite Domains to combine multiple simple domains into a single complex domain entity
  Define Matching Policy
    Point to example source data
    Define Matching Rules
  Run Data Discovery
    Prime the KB with knowledge values and terms into the various KB Domains
    Import clean knowledge data from a table or type in manual entries
    Correct data manually and define the standard for what is correct
  Publish the KB
    Data Projects can reference and use the KB once it is published
    You can go back and edit a KB as needed, but data projects cannot see edits until published again.
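The publish rule above (edits stay invisible to data projects until the KB is published again) can be pictured as a draft copy plus a published snapshot. The following Python sketch models only that behavior. The class, its methods and the sample domain are hypothetical and are not the DQS object model.

```python
import copy


class KnowledgeBase:
    """Toy knowledge base with an editable draft and a published snapshot."""

    def __init__(self, name):
        self.name = name
        self._draft = {}        # domain name -> {"values": set, "synonyms": dict}
        self._published = None  # snapshot that data projects are allowed to see

    def add_domain(self, domain):
        self._draft.setdefault(domain, {"values": set(), "synonyms": {}})

    def add_value(self, domain, value):
        self._draft[domain]["values"].add(value)

    def add_synonym(self, domain, synonym, leading_value):
        # A synonym always points at a leading (standard) value in the domain.
        self.add_value(domain, leading_value)
        self._draft[domain]["synonyms"][synonym] = leading_value

    def publish(self):
        # Freeze the current draft; later edits do not leak into the snapshot.
        self._published = copy.deepcopy(self._draft)

    def lookup(self, domain, raw):
        """What a data project sees: the standard value, or None if unknown."""
        if self._published is None:
            raise RuntimeError("knowledge base has not been published")
        entry = self._published[domain]
        if raw in entry["values"]:
            return raw
        return entry["synonyms"].get(raw)


if __name__ == "__main__":
    kb = KnowledgeBase("Venues")
    kb.add_domain("State")
    kb.add_value("State", "WA")
    kb.publish()
    kb.add_synonym("State", "Washington", "WA")
    print(kb.lookup("State", "Washington"))  # edit not yet published
    kb.publish()
    print(kb.lookup("State", "Washington"))
```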

Build, Use, Monitor/Configure

Publish
  Data Projects can reference and use the KB once it is published
  You can go back and edit a KB as needed, but data projects cannot see edits until published again.
Cleansing
  Point to source data from a SQL table or Excel worksheet
  Map source columns to KB domains
  Run the Cleanse to find mistakes, empty values, non-standard values, and values that do not meet rule requirements
  Manually review the automatic suggestions and corrections. Tweak low confidence values.
  Export to save the cleansed results to a SQL table or Excel
Matching
  Point to the source data to import from a SQL table or Excel workbook
  Run Matching to find similar values
  Review results and suggested synonyms
  Export to save the results to a SQL table or Excel workbook
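The Matching step above groups records that are similar enough to be the same entity. The Python sketch below shows the general idea with a greedy clustering over a plain string-similarity score from the standard library. The scoring function, the 0.8 threshold and the greedy strategy are assumptions for illustration; DQS applies its own matching policy and rules.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_clusters(names, threshold=0.8):
    """Greedy clustering: a name joins the first cluster whose pivot it resembles."""
    clusters = []
    for name in names:
        for cluster in clusters:
            pivot = cluster[0]  # the first member acts as the cluster's pivot
            if similarity(pivot, name) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no cluster was close enough: start a new one
    return clusters


if __name__ == "__main__":
    print(match_clusters(["John Ryan", "Jon Ryan", "Mary Smith"]))
```

With this threshold, "John Ryan" and "Jack Ryan" stay in separate clusters, which is the kind of borderline case a data steward would review by hand.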

DQ Client: User Interaction. DQS Server: Algorithms.
  Create/Open Project
  Pick Source. Map Source columns to Domain
  Run the Cleansing and review Profiler progress
  Manage and View Results interactively
  Export Results
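The cleansing run in the workflow above ends with each value in a review bucket. The Python sketch below shows one way such a decision could be made against a single domain: exact hits are Correct, known synonyms and close fuzzy matches are Corrected, weaker matches are Suggested, rule failures are Invalid, and everything else is New. The function, the two thresholds and the similarity measure are assumptions for the example and do not reproduce the DQS algorithms.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cleanse_value(raw, valid_values, synonyms, rule=None,
                  auto_threshold=0.8, suggest_threshold=0.7):
    """Return (status, output value) for one source value."""
    if raw in valid_values:
        return ("Correct", raw)
    if raw in synonyms:
        return ("Corrected", synonyms[raw])
    if rule is not None and not rule(raw):
        return ("Invalid", raw)
    # Fuzzy match against the known values; sorting keeps ties deterministic.
    best = max(sorted(valid_values), key=lambda v: similarity(raw, v), default=None)
    if best is not None:
        score = similarity(raw, best)
        if score >= auto_threshold:
            return ("Corrected", best)
        if score >= suggest_threshold:
            return ("Suggested", best)
    return ("New", raw)


if __name__ == "__main__":
    states = {"Washington", "Massachusetts", "New York"}
    state_synonyms = {"WA": "Washington"}

    def no_digits(value):
        # Hypothetical domain rule: state names contain no digits.
        return not any(ch.isdigit() for ch in value)

    for source in ["Washington", "WA", "Washingtn", "Washin", "Oregon", "12345"]:
        print(source, cleanse_value(source, states, state_synonyms, rule=no_digits))
```

Values that come back as Suggested or New are the ones the interactive review step is meant for.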

Building Your Knowledge
Source columns: Account ID, Home Team, Team Type, Revenue Type, Sales, Home Arena, Address Line, City, State, Zip
Sample rows (some cells were lost in extraction):
  A124324, Boston Celtics, Basketball, Food & Beverages, 655, TD Garden, 100 Legends Way, Boston, MA
  New York Yankees, Baseball, Music, 389, Yankee Stadium, East 161st Street & River Avenue, NY
  Seattle Mariners, Baseball, Music, 443, Safeco Field, 1516 First Avenue S, Seattle, WA, 98134
Domains:
  Account ID: A
  Team Type: Basketball, Baseball, MLB
  Composite Domain - Full Address (Address Line, City, State, Zip):
    100 Legends Way, Boston, MA, 2114
    East 161st Street & River Avenue, NY
    1516 First Avenue S, Seattle, WA, 98134
Reference Data Service: Composite Domain containing Address Line, City, State & Zip Domains
BIA-319-M | Data Quality Services – A Closer Look

Demo: DQS Demo 1 - Interactive Cleanse & Knowledge Management

DQS Architecture Overview (diagram)
  DQ Clients: DQS UI (Knowledge Discovery and Management, Interactive DQ Projects, Data Exploration); future clients: Excel, SharePoint…
  DQ Server: DQ Engine (Knowledge Discovery, Data Profiling & Exploration, Cleansing, Matching, Reference Data)
  Stores: DQ Projects Store, Common Knowledge Store, Knowledge Base Store
  Store contents: DQ Active Projects, MS Data Domains, Local Data Domains, Published KBs
  Reference data: Azure Market Place (Categorized Reference Data, Categorized Reference Data Services), MS DQ Domains Store, 3rd Party Reference Data Services, Reference Data Sets
  APIs: Reference Data API (Browse, Get, Update…), RD Services API (Browse, Set, Validate…)

DQS Knowledge Sources
  DataMarket: Easily cleanse and enrich data with Reference Data Services from Azure MarketPlace
  DQS Data Store: Website that contains DQS knowledge available for downloading
  Organization Data: Discover knowledge from data samples of your organization
  Out of the Box Knowledge: A set of data domains that come out of the box with DQS

demo

Reference Data Services (RDS)

Batch Cleansing - Using SSIS (diagram)
  SSIS Package: SSIS Data Flow with Source + Mapping, DQS Cleansing Component, Destination
  DQS Server: Reference Data Definition, Values/Rules; Reference Data Services
  Outputs: New, Corrections & Suggestions, Correct, Invalid
Microsoft Confidential: Preliminary Information Subject to Change
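Downstream of the cleansing component, a batch flow usually sends rows to different destinations depending on their status (in SSIS this kind of routing is typically done with a Conditional Split transformation). The Python sketch below mimics only that routing idea. The column name record_status, the status labels and the two-way split are assumptions for illustration and are not taken from the SSIS component's actual output schema.

```python
# Statuses assumed safe to load without review; everything else goes to a steward.
ACCEPTED = {"Correct", "Corrected"}


def split_rows(rows, status_field="record_status"):
    """Split cleansed rows into (accepted, needs_review) by their status column."""
    accepted, review = [], []
    for row in rows:
        target = accepted if row[status_field] in ACCEPTED else review
        target.append(row)
    return accepted, review


if __name__ == "__main__":
    cleansed = [
        {"state": "Washington", "record_status": "Correct"},
        {"state": "Washington", "record_status": "Corrected"},
        {"state": "Washin", "record_status": "Suggested"},
        {"state": "Oregon", "record_status": "New"},
        {"state": "12345", "record_status": "Invalid"},
    ]
    ok, todo = split_rows(cleansed)
    print(len(ok), "rows to load;", len(todo), "rows for review")
```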

Demo: DQS Demo 3 - Cleansing using Reference Data Services & Composite Domains

Knowledge-driven
  Rich Knowledge Base
  Continuous improvement and knowledge acquisition
  Build once, reuse for multiple DQ improvements
Easy To Use
  Focus on productivity and user experience
  Designed for business users
  Out-of-the-box knowledge
Open & Extendible
  Focus on cloud-based Reference Data
  User-generated knowledge
  Integration with SSIS

DQS TechNet Wiki will list major known issues
  Install Issues:
  Operational Issues:
DQS Documentation
DQS Azure DataMarket

DQS Blog
DQS Forum US/sqldataqualityservices/
DQS Videos

SQL Connect
SQL Support

Cleanse and Match data with SQL Server 2012 Data Quality Services. Please enjoy DQS responsibly.