Introducing Data Quality Services and its role in an Enterprise Information Management (EIM) Process. James Beresford, Group Manager, Avanade. Session DBI217
Who am I? … and what am I doing here?
Introducing Data Quality Services and its role in an Enterprise Information Management (EIM) Process. Goals of session: introduce DQS (concepts, terminology); showcase automation and integration into an EIM process.
Data Quality Services and its role in an Enterprise Information Management (EIM) Process. Outcomes for you: manage data quality; automate data cleansing; support Master Data Management.
The Data Quality Problem !=
Hands up if you’ve been 2 years old
Your first data quality problem
Data Quality = Shape Sorting
There is Good Data
There is Bad Data
There is Repairable Data
Data Quality Services Terms: Collection of Shapes = Knowledge Base
Data Quality Services Terms: A Shape = Domain
Data Quality Services Terms: Shape Sorting = Domain Rule
The Data Quality Client allows you to create or maintain: Knowledge Bases; Data Quality Projects.
demo Create a Knowledge Base using Knowledge Discovery
The Data Quality Client: Create a Knowledge Base. What did we see? Create a Knowledge Base from data using Knowledge Discovery; set values as Correct, Error or Invalid; teach the KB some simple auto-correction using Domain Rules.
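To make the Correct / Error / Invalid idea concrete, here is a minimal Python sketch of a single domain. It is not the DQS API: the `Domain` class, the `cleanse` method, the extra `NEW` status for unseen values, and the sample country values are all assumptions made for illustration, and the rule shown only flags values rather than correcting them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    CORRECT = "Correct"
    ERROR = "Error"      # known wrong value that has a known correction
    INVALID = "Invalid"  # value that does not belong in the domain
    NEW = "New"          # assumption: value the domain has not seen yet


@dataclass
class Domain:
    """Hypothetical stand-in for one domain in a knowledge base."""
    name: str
    correct: set = field(default_factory=set)
    corrections: dict = field(default_factory=dict)  # error value -> corrected value
    invalid: set = field(default_factory=set)
    rules: list = field(default_factory=list)        # predicates a value must satisfy

    def cleanse(self, raw):
        """Return (output_value, status) for one raw input value."""
        value = raw.strip()
        if value in self.corrections:
            return self.corrections[value], Status.ERROR
        if value in self.invalid or not all(rule(value) for rule in self.rules):
            return value, Status.INVALID
        if value in self.correct:
            return value, Status.CORRECT
        return value, Status.NEW


# Illustrative data only: a country domain with one simple rule (no digits).
country = Domain(
    name="Country",
    correct={"Australia", "New Zealand"},
    corrections={"Austrlia": "Australia"},
    invalid={"N/A"},
    rules=[lambda v: not any(ch.isdigit() for ch in v)],
)
```

Calling `country.cleanse("Austrlia")` returns the corrected value together with the `ERROR` status, which mirrors the "repairable data" shape from the analogy.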
The Data Quality Client: Create a Knowledge Base. What can we do? Keep our knowledge about data quality in one location; accept, correct or reject values; update it as an ongoing process.
The Data Quality Client: Pop Quiz. Who owns a Knowledge Base? Maintaining KBs is done by: a) Qualified Data Professionals b) A BI Developer c) Users d) Microsoft
The Data Quality Client Create a Composite Domain + n
demo Create a Composite Domain with a Domain Rule
The Data Quality Client: Composite Domains. What did we see? Use values from one domain to interact with another.
The Data Quality Client: Composite Domains. What can we do? Manage interdependent data fields for quality purposes.
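A composite domain rule can be pictured as an "if this field has value X, then that field must have value Y" check across two domains. The sketch below is plain Python, not DQS itself; the function name, the rule tuple layout, and the City/State example are assumptions chosen for illustration.

```python
def apply_composite_rules(record, rules):
    """Apply cross-field rules to one record.

    rules: list of (if_field, if_value, then_field, then_value) tuples.
    Returns a corrected copy of the record and the list of fields changed.
    """
    fixed = dict(record)
    changed = []
    for if_field, if_value, then_field, then_value in rules:
        if fixed.get(if_field) == if_value and fixed.get(then_field) != then_value:
            fixed[then_field] = then_value
            changed.append(then_field)
    return fixed, changed


# Hypothetical rule: if City is Sydney, State must be NSW.
RULES = [("City", "Sydney", "State", "NSW")]

fixed, changed = apply_composite_rules({"City": "Sydney", "State": "VIC"}, RULES)
untouched, no_change = apply_composite_rules({"City": "Perth", "State": "WA"}, RULES)
```

Here the first record has its State corrected because the City value triggers the rule, while the second record passes through unchanged.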
The Data Quality Client: Other Domain Functions. Things that I won’t demonstrate: Term Based Relations: autocorrect substrings (e.g. Inc. > Incorporated); Reference Data: validation against external sources on Azure DataMarket, e.g. Melissa Data.
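Since Term Based Relations are not demonstrated, a rough Python sketch of the idea may help: replace a known term inside a longer value while leaving other words alone. This is an approximation of the concept, not how DQS implements it; the function name and the "Ltd." entry are assumptions (only "Inc. > Incorporated" comes from the slide).

```python
import re

# "Inc." is the example from the slide; "Ltd." is an added, hypothetical entry.
TERM_RELATIONS = {"Inc.": "Incorporated", "Ltd.": "Limited"}


def apply_term_relations(value, relations=TERM_RELATIONS):
    """Replace each known term where it appears as a whole term in the value."""
    out = value
    for term, replacement in relations.items():
        # Lookarounds stop the term matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        out = re.sub(pattern, replacement, out)
    return out


cleaned = apply_term_relations("Contoso Inc.")
```

The lookarounds mean a value such as "Incline Ltd." keeps "Incline" intact and only has its trailing term expanded.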
The Data Quality Client: Data Quality Projects. Application of a Knowledge Base through the DQS Client: interactively process data; output results.
demo Create a Data Quality Project
The Data Quality Client: Create a Data Quality Project. What did we see? Interactive cleansing of data; export of the cleansed data set.
The Data Quality Client: Create a Data Quality Project. What can we do? Clean data; export results.
The Data Quality Client: Other DQS Features. Things that I won’t demonstrate: matching; de-duplication.
Automation with SSIS
The DQS Cleansing Task
demo Automation with SSIS and the DQS Cleansing Task
Automation with SSIS: The DQS Cleansing Task. What did we see? Automatic cleansing of data; managing different results.
Automation with SSIS: The DQS Cleansing Task. What can we do? Clean data in an integrated manner; user input to data quality can affect the data warehouse (DW); updating DQ is independent of updating ETL.
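"Managing different results" generally means sending each cleansed row down a different path according to its status. In SSIS the DQS Cleansing component adds a record status to each row (Correct, Corrected, Auto Suggested, New or Invalid), and a Conditional Split is a common way to route on it. The Python below only imitates that routing logic; the three destination names and the default column name passed to the function are assumptions for illustration.

```python
from collections import defaultdict

# Assumed routing policy: trusted rows load, uncertain rows go to a person,
# invalid rows are rejected. Adjust to suit the actual process.
ROUTES = {
    "Correct": "load",
    "Corrected": "load",
    "Auto Suggested": "review",
    "New": "review",
    "Invalid": "reject",
}


def route_rows(rows, status_column="Record Status"):
    """Group rows by destination, based on each row's cleansing status."""
    outputs = defaultdict(list)
    for row in rows:
        # Unknown or missing statuses fall back to manual review.
        destination = ROUTES.get(row.get(status_column), "review")
        outputs[destination].append(row)
    return dict(outputs)


sample = [
    {"Country": "Australia", "Record Status": "Correct"},
    {"Country": "Australia", "Record Status": "Corrected"},
    {"Country": "Fiji", "Record Status": "New"},
    {"Country": "N/A", "Record Status": "Invalid"},
]
routed = route_rows(sample)
```

Keeping the routing policy in a small lookup like `ROUTES` reflects the point on the slide: the knowledge base can change without the ETL package changing.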
Data Quality Services Master Data Services Integration Services EIM: Credible, Consistent Data
Master Data Services (MDS). MDS provides: a central store of reliable data; Web and Excel UIs; easy access to Master Data.
Loading MDS Creating our Entity
Updating MDS Handling new, trusted data
Feeding from MDS Publishing with Subscription Views ?
demo Integrating DQS processes with MDS processes
DQS + MDS as part of an EIM process. What did we see? Knowledge from DQS captured in MDS using Excel; new knowledge captured in DQS transferred to MDS using SSIS.
MDS as part of an EIM process. What can we do? Use DQS to provide the foundation for Master Data; use DQS as a means of providing updated Master Data; have a user-driven EIM process.
Introducing Data Quality Services and its role in an Enterprise Information Management (EIM) Process. Goals of session: introduce DQS (concepts, terminology); showcase automation and integration into an EIM process.
Introducing Data Quality Services and its role in an Enterprise Information Management (EIM) Process. You can find me at the speaker meet & greet – , speakers lounge, Thursday 13th. And follow me at: MSDN SSIS & DQS Forums