Jared Kuehn – Skyline Technologies

Presentation transcript:

When Low-Quality Data Strikes: Fuzzy Tools Provide Clarity in Matching and Deduplication
Jared Kuehn – Skyline Technologies

About me
- Likes BLTs
- Male pattern baldness for a theater production
- Weird Al is my hero
- I like hats
- My daughter is adorable
- My dog is fuzzy

This tangent is too divergent. Let’s get to our topic!

Today’s agenda
- What is fuzzy logic?
- What are the typical matching approaches?
- Let’s see it in action! Demo, demo, demo!

What is fuzzy logic?
[Image: stock photo I found online that clearly displays my point…kind of]
- Taking two pieces of information and identifying a match based on how similar they are.
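
To make "how similar they are" concrete, here is a minimal sketch that scores two strings between 0.0 and 1.0. It uses Python's standard-library `difflib.SequenceMatcher` purely as an illustration; the talk does not say which similarity measure its tools use, and the sample names are made up.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 (nothing in common) and 1.0 (identical)."""
    # Lowercase and trim first so casing and stray spaces do not count as differences.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


# A one-letter typo still scores high; unrelated names score low.
print(similarity("Jonathan Smith", "Jonathon Smith"))
print(similarity("Jonathan Smith", "Maria Garcia"))
```

A fuzzy match is then just a decision about how high the score must be before two values are treated as the same thing.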

Case study!!!
- Two datasets of people for your data warehouse
  - Both contain names and demographic information
- One comes from your company’s main application
  - Already in the warehouse
  - High-quality, managed well
- The other comes from a new application
  - Data has been identified as low-quality
  - Typos, blank fields, varied formatting
- A person can exist in both lists
- Goal is to merge the two lists into one master person dataset for your warehouse
  - Minimize the number of duplicates without finding bad matches
  - Here’s a second bullet point because I couldn’t think of a second point and I learned in high school that having only one sub bullet point is frowned upon

Approaches to matching
- Exact Match
- Fuzzy Match
- Manual Match
- Match Game

Exact Match
- Define columns that you want to compare
- Data in columns must match exactly to find matching records
- Strict rules result in more confidence in matches
- Can define multiple rules
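
The exact-match idea can be sketched as a list of rules, where each rule names the columns that must all be equal. This is only an illustration: the column names (`first_name`, `last_name`, `birth_date`, `email`) and the two rules are hypothetical, not taken from the talk's demo.

```python
from typing import Dict, List, Optional, Tuple

Person = Dict[str, str]

# Hypothetical rules: every column in a rule must be equal for that rule to match.
RULES: List[Tuple[str, ...]] = [
    ("first_name", "last_name", "birth_date"),
    ("email",),
]


def exact_match(record: Person, master: List[Person]) -> Optional[Person]:
    """Return the first master record that satisfies any rule, or None."""
    for rule in RULES:
        # A blank field cannot prove anything, so skip rules the record cannot fill.
        if any(not record.get(col) for col in rule):
            continue
        for candidate in master:
            if all(record.get(col) == candidate.get(col) for col in rule):
                return candidate
    return None
```

Rules are tried in order, so the strictest rule should come first. A typo in any compared column makes that rule fail, which is where low-quality data starts to hurt.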

Fuzzy Match
- Define which columns you want to compare
- Find matches based on similarity
- Faster to set up for complex, low-quality scenarios
- Better at handling low-quality data
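
A rough sketch of fuzzy matching over several columns follows. It averages a per-column similarity score and accepts the best candidate only above a threshold. The column names, the equal weighting, and the 0.85 threshold are all assumptions for illustration; `difflib.SequenceMatcher` stands in for whatever algorithm a real tool such as SSIS Fuzzy Lookup uses internally.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple

Person = Dict[str, str]

# Hypothetical columns to compare.
COLUMNS = ("first_name", "last_name", "city")


def column_similarity(a: str, b: str) -> float:
    """Score two field values from 0.0 to 1.0, ignoring case and outer spaces."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0  # a blank field gives no evidence of a match
    return SequenceMatcher(None, a, b).ratio()


def record_similarity(left: Person, right: Person) -> float:
    """Average the per-column scores (equal weights, an arbitrary choice)."""
    scores = [column_similarity(left.get(c, ""), right.get(c, "")) for c in COLUMNS]
    return sum(scores) / len(scores)


def best_fuzzy_match(
    record: Person, master: List[Person], threshold: float = 0.85
) -> Optional[Tuple[Person, float]]:
    """Return (best candidate, score) if the score reaches the threshold, else None."""
    best, best_score = None, 0.0
    for candidate in master:
        score = record_similarity(record, candidate)
        if score > best_score:
            best, best_score = candidate, score
    if best is not None and best_score >= threshold:
        return best, best_score
    return None
```

Lowering the threshold finds more duplicates but also more bad matches, so it is worth tuning against a sample of your own data.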

Manual match
- Trust the human brain to find accurate matches
- Can account for any number of variances in data
- Most accurate form of matching
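
Manual matching is accurate but slow, so one possible way to combine it with an automated score is to send only the uncertain pairs to a person. The sketch below is an assumption layered on top of the slides, not something the talk prescribes; the function name `triage` and both cut-off values are hypothetical.

```python
def triage(score: float, accept_at: float = 0.9, review_at: float = 0.7) -> str:
    """Route a similarity score (0.0 to 1.0) to one of three outcomes."""
    if score >= accept_at:
        return "match"   # confident enough to merge automatically
    if score >= review_at:
        return "review"  # ambiguous: queue for a human to decide
    return "new"         # too different: treat as a new person


# Example: only the middle score would need manual attention.
for score in (0.95, 0.80, 0.30):
    print(score, triage(score))
```

The wider the review band, the more human time is spent and the fewer bad automatic merges slip through.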

Still there? Good, ’cause it’s demo time!!!!

Which approach or tool do I pick?
- How much time do you want to invest in finding accurate matches?
- What resources are available for you to use?
  - Business users?
  - Yet another second bullet point with no information. I really need to be better about this.
- Oh no, I did it again…

Fuzzy tool options I know of
- SQL Server Integration Services (SSIS)
  - Versions 2005 and later
  - Fuzzy Lookup and Fuzzy Grouping
- SQL Server Full Text Search
  - Analyzes character patterns and linguistics
  - Restricted to only text data
  - Allows configuration for specific languages
- CLR functions
- Data Quality Services (DQS) and Master Data Services (MDS)
  - DQS: Versions 2012 and later
  - MDS: Versions 2008 R2 and later
  - Engaging business users
  - Business user friendly?
- Fuzzy Lookup for Excel Add-In (https://www.microsoft.com/en-us/download/details.aspx?id=15011)

Final thoughts
- Fuzzy logic is another tool that you can use. But it's still a tool
  - Don't hammer a nail with a screwdriver
  - Also, I need to improve my use of sub bullet points
- If you want to try it, plan some time to experiment with it
- Useful information to follow up on
  - My email: jkuehn@skylinetechnologies.com
  - Skyline blogs (https://www.skylinetechnologies.com/Blog)
  - Fuzzy Lookup Excel Add-In (https://www.microsoft.com/en-us/download/details.aspx?id=15011)
  - Check SQL Saturday website for script/SSIS packages

When your memory is fuzzy, stay fuzzy!