Chapter 8: Introduction to Pattern Discovery. Sections: 8.1 Introduction; 8.2 Cluster Analysis; 8.3 Market Basket Analysis (Self-Study).

1 Chapter 8: Introduction to Pattern Discovery. Sections: 8.1 Introduction; 8.2 Cluster Analysis; 8.3 Market Basket Analysis (Self-Study).

2 Chapter 8: Introduction to Pattern Discovery. Sections: 8.1 Introduction; 8.2 Cluster Analysis; 8.3 Market Basket Analysis (Self-Study).

4 Pattern Discovery. The Essence of Data Mining? “…the discovery of interesting, unexpected, or valuable structures in large data sets.” – David Hand. “If you’ve got terabytes of data, and you’re relying on data mining to find interesting things in there for you, you’ve lost before you’ve even begun.” – Herb Edelstein

5 Pattern Discovery Caution: poor data quality; opportunity; interventions; separability; obviousness; non-stationarity.

6 Pattern Discovery Applications: data reduction; novelty detection; profiling; market basket analysis; sequence analysis.

7 Pattern Discovery Tools: data reduction; novelty detection; profiling; market basket analysis; sequence analysis.

9 Chapter 8: Introduction to Pattern Discovery. Sections: 8.1 Introduction; 8.2 Cluster Analysis; 8.3 Market Basket Analysis (Self-Study).

10 Unsupervised Classification: grouping of cases based on similarities in input values. [Figure: cases and their inputs grouped into cluster 1, cluster 2, and cluster 3.]

12 k-means Clustering Algorithm (applied to the training data): 1. Select inputs. 2. Select k cluster centers. 3. Assign cases to closest center. 4. Update cluster centers. 5. Reassign cases. 6. Repeat steps 4 and 5 until convergence.
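
The six steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the procedure used by the Cluster tool in the demonstrations: the function name kmeans, the use of Euclidean distance, the random choice of starting centers, and the toy cases are all assumptions, since the slide does not say how centers are selected or how closeness is measured.

```python
import math
import random


def kmeans(cases, k, centers=None, max_iter=100, seed=0):
    """Cluster `cases` (equal-length numeric tuples) into k groups.

    Returns (centers, assignment), where assignment[i] is the index of
    the center closest to cases[i].
    """
    # Step 2: select k cluster centers. Assumption: unless the caller
    # supplies starting centers, pick k distinct cases at random.
    if centers is None:
        centers = random.Random(seed).sample(cases, k)
    centers = list(centers)

    assignment = None
    for _ in range(max_iter):
        # Steps 3 and 5: assign every case to its closest center
        # (Euclidean distance is an assumption of this sketch).
        new_assignment = [
            min(range(k), key=lambda j: math.dist(case, centers[j]))
            for case in cases
        ]
        # Step 6: converged once no case changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 4: move each center to the mean of its assigned cases.
        for j in range(k):
            members = [c for c, a in zip(cases, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


# Step 1: the selected inputs are the two coordinates of each toy case.
cases = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centers, labels = kmeans(cases, k=2, centers=[(0.0, 0.0), (10.0, 10.0)])
```

Passing explicit starting centers, as in the last line, makes the run reproducible; with random starts the cluster numbering and sometimes the final partition can differ between runs.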

27 Segmentation Analysis: When no clusters exist, use the k-means algorithm to partition cases into contiguous groups.

28 Demographic Segmentation Demonstration. Analysis goal: Group geographic regions into segments based on income, household size, and population density. Analysis plan: 1. Select and transform segmentation inputs. 2. Select the number of segments to create. 3. Create segments with the Cluster tool. 4. Interpret the segments.
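
The first step of the plan, transforming the segmentation inputs, matters because income, household size, and population density are measured on very different scales. The slide does not say which transformation the demonstration applies; the sketch below assumes z-score standardization, a common choice before distance-based clustering. The function name standardize and the region values are invented for illustration and are not taken from the CENSUS2000 data.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant input carries no information for clustering.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical regions: (median income, mean household size, people per sq. mile).
regions = [
    (42000.0, 2.4, 1500.0),
    (58000.0, 3.1, 300.0),
    (35000.0, 2.0, 9000.0),
    (76000.0, 2.8, 50.0),
]

# Standardize each input column separately, then reassemble the rows.
columns = [standardize(list(col)) for col in zip(*regions)]
scaled_regions = list(zip(*columns))
```

After scaling, each input contributes comparably to the distances used to form segments, instead of income dominating simply because its raw numbers are largest.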

29 Segmenting Census Data: This demonstration introduces SAS Enterprise Miner tools and techniques for cluster and segmentation analysis.

30 Exploring and Filtering Analysis Data: This demonstration introduces SAS Enterprise Miner tools and techniques that explore and filter analysis data, particularly data source exploration and case filtering.

31 Setting Cluster Tool Options: This demonstration illustrates how to use the Cluster tool to segment the cases in the CENSUS2000 data set.

32 Creating Clusters with the Cluster Tool: This demonstration illustrates how the Cluster tool determines the number of clusters in the data.

33 Specifying the Segment Count: This demonstration illustrates how you can change the number of clusters created by the Cluster node.

34 Exploring Segments: This demonstration illustrates how to use graphical aids to explore the segments.

35 Profiling Segments: This demonstration illustrates using the Segment Profile tool to interpret the composition of clusters.

36 Exercises: This exercise reinforces the concepts discussed previously.

37 Chapter 8: Introduction to Pattern Discovery. Sections: 8.1 Introduction; 8.2 Cluster Analysis; 8.3 Market Basket Analysis (Self-Study).

38 Market Basket Analysis. Example baskets: A B C; A C D; B C D; A D E; B C E; ...
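
The raw material of market basket analysis is a count of how often items and item combinations occur together. As a rough sketch, assuming the letters on this slide group into five three-item baskets (the grouping is an inference from the flattened text), the counts can be tallied like this:

```python
from collections import Counter
from itertools import combinations

# Assumed grouping of the slide's letters into five baskets.
baskets = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]

# Number of baskets containing each single item.
item_counts = Counter(item for basket in baskets for item in basket)

# Number of baskets containing each unordered pair of items.
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)
```

Under this grouping, item C appears in four of the five baskets and the pair (B, C) in three, which is the kind of co-occurrence that association rules summarize.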

40 Implication? Checking Account (No/Yes) by Savings Account (No/Yes). Savings Account totals: No 4,000; Yes 6,000; all customers 10,000. Support(SVG ⇒ CK) = 50%. Confidence(SVG ⇒ CK) = 83%. Expected Confidence(SVG ⇒ CK) = 85%. Lift(SVG ⇒ CK) = 0.83/0.85 < 1.
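
The four measures on this slide follow from four counts. Two of them appear on the slide (10,000 customers in total and 6,000 with a savings account); the other two are not in the transcript and are back-calculated here from the stated percentages: 5,000 customers with both accounts (50% support) and 8,500 with a checking account (85% expected confidence). The function name rule_metrics is a label chosen for this sketch.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence, expected confidence, and lift for antecedent => consequent."""
    support = n_both / n_total                  # share of all cases with both items
    confidence = n_both / n_antecedent          # share of antecedent cases that also have the consequent
    expected_confidence = n_consequent / n_total  # share of all cases with the consequent
    lift = confidence / expected_confidence
    return support, confidence, expected_confidence, lift


# SVG => CK: the counts 5,000 and 8,500 are derived from the slide's percentages.
support, confidence, expected_confidence, lift = rule_metrics(
    n_total=10_000, n_antecedent=6_000, n_consequent=8_500, n_both=5_000
)
```

A lift below 1 means that customers with a savings account are slightly less likely to hold a checking account than customers overall, even though the rule's confidence of 83% looks high on its own.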

41 Barbie Doll ⇒ Candy: 1. Put them closer together in the store. 2. Put them far apart in the store. 3. Package candy bars with the dolls. 4. Package Barbie + candy + poorly selling item. 5. Raise the price on one, and lower it on the other. 6. Offer Barbie accessories for proofs of purchase. 7. Do not advertise candy and Barbie together. 8. Offer candies in the shape of a Barbie doll.

42 Data Capacity [Figure: items labeled A, B, C, and D; original layout not recoverable.]

43 Association Tool Demonstration. Analysis goal: Explore associations between retail banking services used by customers. Analysis plan: 1. Create an association data source. 2. Run an association analysis. 3. Interpret the association rules. 4. Run a sequence analysis. 5. Interpret the sequence rules.

44 Market Basket Analysis: This demonstration illustrates how to conduct market basket analysis.

45 Sequence Analysis: This demonstration illustrates how to conduct a sequence analysis.

46 Pattern Discovery Tools: Review. Generate cluster models using automatic settings and segmentation models with user-defined settings. Compare within-segment distributions of selected inputs to overall distributions; this helps you understand how each segment is defined. Conduct market basket and sequence analysis on transaction data. The data source must have one target, one ID, and (if desired) one sequence variable.
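
The comparison of within-segment and overall distributions mentioned in this review can be approximated, very crudely, by comparing means. The sketch below is not the Segment Profile tool; it assumes a list of (segment, value) pairs for one input, and both the pairs and the variable names are made up for illustration.

```python
from statistics import mean

# Hypothetical (segment id, input value) pairs, e.g. household size per region.
records = [(1, 2.0), (1, 2.2), (2, 3.4), (2, 3.6), (3, 2.8), (3, 2.8)]

# Mean of the input over all cases.
overall_mean = mean(value for _, value in records)

# Collect the input values that fall in each segment.
by_segment = {}
for segment, value in records:
    by_segment.setdefault(segment, []).append(value)

# Difference between each segment's mean and the overall mean.
profile = {seg: mean(vals) - overall_mean for seg, vals in by_segment.items()}
```

Segments whose difference is far from zero are the ones that this input helps to define; a full profile would compare whole distributions, for example with histograms, rather than means alone.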

