1 Chapter 8: Introduction to Pattern Discovery 8.1 Introduction 8.2 Cluster Analysis 8.3 Market Basket Analysis (Self-Study)

Presentation transcript:

1 Chapter 8: Introduction to Pattern Discovery 8.1 Introduction 8.2 Cluster Analysis 8.3 Market Basket Analysis (Self-Study)

3 Pattern Discovery: The Essence of Data Mining? “…the discovery of interesting, unexpected, or valuable structures in large data sets.” (David Hand) “If you’ve got terabytes of data, and you’re relying on data mining to find interesting things in there for you, you’ve lost before you’ve even begun.” (Herb Edelstein)

5 Pattern Discovery Caution: poor data quality, opportunity, interventions, separability, obviousness, non-stationarity.

6 Pattern Discovery Applications: data reduction, novelty detection, profiling, market basket analysis, sequence analysis.

7 Pattern Discovery Tools: data reduction, novelty detection, profiling, market basket analysis, sequence analysis.

9 Chapter 8: Introduction to Pattern Discovery 8.1 Introduction 8.2 Cluster Analysis 8.3 Market Basket Analysis (Self-Study)

10 Unsupervised Classification: grouping of cases based on similarities in input values. (Figure: input cases grouped into cluster 1, cluster 2, and cluster 3.)

12 k-means Clustering Algorithm (training data): 1. Select inputs. 2. Select k cluster centers. 3. Assign cases to closest center. 4. Update cluster centers. 5. Reassign cases. 6. Repeat steps 4 and 5 until convergence.
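The six steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SAS Enterprise Miner implementation: the function names `assign` and `k_means`, the Euclidean distance, the choice of random training cases as starting centers, and the sample points are all assumptions made for the example.

```python
import numpy as np

def assign(data, centers):
    """Steps 3 and 5: index of the closest center for every case."""
    distances = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)

def k_means(data, k, max_iter=100, seed=0):
    """Cluster the rows of `data` into k groups (steps 2-6 of the slide)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 2: select k cluster centers (assumption: k distinct training cases).
    centers = data[rng.choice(len(data), size=k, replace=False)]
    # Step 3: assign cases to the closest center.
    labels = assign(data, centers)
    for _ in range(max_iter):
        # Step 4: move each center to the mean of its assigned cases.
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        # Step 5: reassign cases to the updated centers.
        new_labels = assign(data, new_centers)
        # Step 6: stop once the assignments no longer change.
        converged = np.array_equal(new_labels, labels)
        centers, labels = new_centers, new_labels
        if converged:
            break
    return centers, labels

# Hypothetical training data: two well-separated groups of three cases.
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centers, labels = k_means(points, k=2)
print(labels)
```

Step 1 (selecting inputs) happens before the call: it decides which columns go into `points`. Because the result depends on the starting centers, running the algorithm from several seeds and comparing the results is a common precaution.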

27 Segmentation Analysis: When no clusters exist, use the k-means algorithm to partition cases into contiguous groups.

28 Demographic Segmentation Demonstration. Analysis goal: Group geographic regions into segments based on income, household size, and population density. Analysis plan: 1. Select and transform segmentation inputs. 2. Select the number of segments to create. 3. Create segments with the Cluster tool. 4. Interpret the segments.

29 Segmenting Census Data This demonstration introduces SAS Enterprise Miner tools and techniques for cluster and segmentation analysis.

30 Exploring and Filtering Analysis Data This demonstration introduces SAS Enterprise Miner tools and techniques that explore and filter analysis data, particularly data source exploration and case filtering.

31 Setting Cluster Tool Options This demonstration illustrates how to use the Cluster tool to segment the cases in the CENSUS2000 data set.

32 Creating Clusters with the Cluster Tool This demonstration illustrates how the Cluster tool determines the number of clusters in the data.

33 Specifying the Segment Count This demonstration illustrates how you can change the number of clusters created by the Cluster node.

34 Exploring Segments This demonstration illustrates how to use graphical aids to explore the segments.

35 Profiling Segments This demonstration illustrates using the Segment Profile tool to interpret the composition of clusters.
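One simple way to read a segment's composition is to compare each input's within-segment mean to its overall mean. The sketch below is only an illustration of that idea, not the Segment Profile tool itself: the region values, the segment labels, the input names, and the `profile` helper are all hypothetical.

```python
import numpy as np

# Hypothetical inputs for six regions: income (thousands) and household size.
names = ["income", "household_size"]
inputs = np.array([[30.0, 2.0], [32.0, 2.2],
                   [80.0, 3.0], [85.0, 3.2],
                   [50.0, 4.5], [52.0, 4.7]])
# Hypothetical segment assignment produced by an earlier clustering step.
segments = np.array([0, 0, 1, 1, 2, 2])

overall_mean = inputs.mean(axis=0)
overall_std = inputs.std(axis=0)

def profile(segment_id):
    """Within-segment mean of each input, in overall standard deviations."""
    segment_mean = inputs[segments == segment_id].mean(axis=0)
    z_scores = (segment_mean - overall_mean) / overall_std
    return dict(zip(names, z_scores))

for segment_id in np.unique(segments):
    print(segment_id, profile(segment_id))
```

Inputs whose within-segment value sits far from zero on this scale are the ones that distinguish the segment from the population as a whole.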

36 Exercises This exercise reinforces the concepts discussed previously.

37 Chapter 8: Introduction to Pattern Discovery 8.1 Introduction 8.2 Cluster Analysis 8.3 Market Basket Analysis (Self-Study)

38 Market Basket Analysis. Transactions: {A, B, C}, {A, C, D}, {B, C, D}, {A, D, E}, {B, C, E}.
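Reading the item letters on this slide as five baskets of three items each (an assumption about the lost layout), support and confidence for a rule can be computed directly. The helper names `support` and `confidence` are illustrative, and the rules evaluated are examples chosen here rather than rules taken from the transcript.

```python
from fractions import Fraction

# Five baskets, assuming the slide's letters group into sets of three.
transactions = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]

def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"A", "D"}))        # 2/5
print(confidence({"A"}, {"D"}))   # 2/3
print(confidence({"C"}, {"A"}))   # 1/2
```

Exact fractions are used so the results can be checked by hand against the baskets.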

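40 Implication? Savings Account by Checking Account table (only the totals survive in this transcript): Savings Account = No 4,000; Savings Account = Yes 6,000; all customers 10,000. Support(SVG → CK) = 50%. Confidence(SVG → CK) = 83%. Expected Confidence(SVG → CK) = 85%. Lift(SVG → CK) = 0.83/0.85 < 1.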
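The four measures on this slide follow from simple counts. In the sketch below, the grand total (10,000) and the number of savings customers (6,000) come from the slide; the counts `both` and `checking` are not in the transcript and are inferred here from the stated 50% support and 85% expected confidence, so treat them as assumptions.

```python
from fractions import Fraction

total = 10_000     # all customers (from the slide)
savings = 6_000    # customers with a savings account (from the slide)
both = 5_000       # inferred: 50% support of 10,000
checking = 8_500   # inferred: 85% expected confidence of 10,000

# Support: share of all customers holding both accounts.
support = Fraction(both, total)
# Confidence: share of savings customers who also hold checking.
confidence = Fraction(both, savings)
# Expected confidence: share of all customers who hold checking.
expected_confidence = Fraction(checking, total)
# Lift: confidence relative to what the checking rate alone would predict.
lift = confidence / expected_confidence

print(float(support), round(float(confidence), 2),
      float(expected_confidence), round(float(lift), 2))
```

A lift below 1 means savings customers are slightly less likely to hold a checking account than customers in general, even though the rule's confidence looks high on its own.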

41 Barbie Doll → Candy: 1. Put them closer together in the store. 2. Put them far apart in the store. 3. Package candy bars with the dolls. 4. Package Barbie + candy + poorly selling item. 5. Raise the price on one, and lower it on the other. 6. Offer Barbie accessories for proofs of purchase. 7. Do not advertise candy and Barbie together. 8. Offer candies in the shape of a Barbie doll.

42 Data Capacity (Figure: baskets built from items A, B, C, and D; layout not preserved in this transcript.)

43 Association Tool Demonstration. Analysis goal: Explore associations between retail banking services used by customers. Analysis plan: 1. Create an association data source. 2. Run an association analysis. 3. Interpret the association rules. 4. Run a sequence analysis. 5. Interpret the sequence rules.

44 Market Basket Analysis This demonstration illustrates how to conduct market basket analysis.

45 Sequence Analysis This demonstration illustrates how to conduct a sequence analysis.

46 Pattern Discovery Tools: Review. Generate cluster models using automatic settings and segmentation models with user-defined settings. Compare within-segment distributions of selected inputs to overall distributions; this helps you understand how each segment is defined. Conduct market basket and sequence analysis on transaction data. The data source must have one target, one ID, and (if desired) one sequence variable.
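The one-target, one-ID, optional-sequence layout corresponds to a long table with one row per item. The sketch below shows how such rows could be turned into baskets and ordered sequences; the customer IDs, the `ATM` service code, and the `to_sequences` helper are hypothetical, while `CK` and `SVG` reuse the abbreviations from the earlier slide.

```python
from collections import defaultdict

# Hypothetical transaction rows: (ID, sequence variable, target item).
rows = [
    ("cust1", 2, "SVG"),
    ("cust1", 1, "CK"),
    ("cust2", 1, "CK"),
    ("cust2", 2, "ATM"),
    ("cust2", 3, "SVG"),
]

def to_sequences(rows):
    """Group items by ID and order them by the sequence variable."""
    by_id = defaultdict(list)
    for customer_id, order, item in rows:
        by_id[customer_id].append((order, item))
    return {cid: [item for _, item in sorted(pairs)]
            for cid, pairs in by_id.items()}

# Sequence analysis keeps the order; market basket analysis ignores it.
sequences = to_sequences(rows)
baskets = {cid: set(items) for cid, items in sequences.items()}

print(sequences)
print(baskets)
```

Without a sequence variable only the `baskets` view is available, which is enough for association rules but not for sequence rules.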