N. Kumar, Asst. Professor of Marketing Database Marketing Cluster Analysis.


Agenda
- Discussion of the first assignment
- Motivation for conducting cluster analysis
- Benefit segmentation
- Cluster analysis: basic concepts
- Hierarchical/non-hierarchical clustering
- Implementation in SAS and interpreting the output

Voter Profiling
- What are the different voting segments out there?
- What do they want to hear, i.e., which issues do they care about?
- What should I say?

Ad Campaign
- How many customer segments are there?
- How many do I want to target?
- How should I target them: what message should I communicate to each segment?

Promotional Strategies
- Coupon drops: who should they be targeted at?
- Catalog example: should the catalog be accompanied by a $5 coupon, a $10 coupon, or no coupon?

What is Cluster Analysis?
Cluster analysis is a technique for combining observations into groups or clusters such that:
- Each group is homogeneous with respect to certain characteristics (that you specify)
- Each group is different from the other groups with respect to the same characteristics

Data
[Table: Consumer, Income ($1000s), Education (years)]

Geometrical View of Cluster Analysis
[Figure: consumers plotted by Education and Income]

Similarity Measures
- Why are consumers 1 and 2 similar?
- $\text{Distance}(1,2) = (5-6)^2 + (5-6)^2 = 2$
- More generally, if there are $p$ variables: $\text{Distance}(i,j) = \sum_{k=1}^{p} (x_{ik} - x_{jk})^2$
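The distance formula above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the coordinates for C1 and C2 are the ones used in the worked example, and C3 = (15, 14) is taken from the first-K-observations centroid slide later in the deck.

```python
# Illustrative coordinates: (income in $1000s, education in years).
# C1 and C2 follow the worked example; C3 is taken from the initial-centroid slide.
consumers = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14)}


def squared_distance(x, y):
    """Sum of squared differences over all p variables."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def distance_matrix(points):
    """Pairwise squared distances, keyed by (name_i, name_j)."""
    names = list(points)
    return {(i, j): squared_distance(points[i], points[j])
            for i in names for j in names}


d = distance_matrix(consumers)
print(d[("C1", "C2")], d[("C1", "C3")], d[("C2", "C3")])
```

Note that the measure is a squared Euclidean distance, so a smaller value means the two consumers are more similar.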

Similarity Matrix
[Table: pairwise distances among C1, C2, C3, C4, C5, C6]

Clustering Techniques
- Hierarchical clustering
- Non-hierarchical clustering

Hierarchical Clustering
- Distance(1,2) = 2 = Distance(3,4)
- Say we group 1 and 2 together and leave the others as is
- How do we compute the distance between a group that has two (or more) members and the others?

Hierarchical Clustering Algorithms
- Centroid method
- Nearest-neighbor or single-linkage
- Farthest-neighbor or complete-linkage
- Average-linkage
- Ward's method

Centroid Method
- Each group is replaced by an average consumer
- Cluster 1: average income = 5.5 and average education = 5.5

Data for Five Clusters
Cluster  Members  Income  Education
1        C1&C2    5.5     5.5
2        C3
3        C4
4        C5
5        C6       30      19

Similarity Matrix
[Table: pairwise distances among C1&C2, C3, C4, C5, C6]

Data for Four Clusters
Cluster  Members  Income  Education
1        C1&C2    5.5     5.5
2        C3&C4
3        C5
4        C6       30      19

Similarity Matrix
         C1&C2  C3&C4  C5  C6
C1&C2    0
C3&C4    181    0
C5
C6

Data for Three Clusters
Cluster  Members  Income  Education
1        C1&C2    5.5     5.5
2        C3&C4
3        C5&C6

Similarity Matrix
         C1&C2  C3&C4  C5&C6
C1&C2    0
C3&C4    181    0
C5&C6
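The centroid-method walk-through above (six consumers merged down to three clusters) can be reproduced with a short sketch. The coordinates are partly assumptions: C1, C2, C3 and C6 appear in the slides; C4 = (16, 15) is inferred from Distance(3,4) = 2 and the centroid distance of 181; C5's education of 20 is implied by the later centroid table, while its income of 25 is an assumed value. The function names are hypothetical.

```python
from itertools import combinations

# (income in $1000s, education in years); C4 is inferred, C5's income is assumed.
points = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
          "C4": (16, 15), "C5": (25, 20), "C6": (30, 19)}


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def centroid(members):
    """The 'average consumer' that replaces a group."""
    coords = [points[m] for m in members]
    return tuple(sum(col) / len(col) for col in zip(*coords))


def centroid_method(target_k):
    """Repeatedly merge the two clusters whose centroids are closest."""
    clusters = [(name,) for name in points]
    while len(clusters) > target_k:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: squared_distance(centroid(pair[0]),
                                                     centroid(pair[1])))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return clusters


three = centroid_method(3)
print(three)
print(squared_distance(centroid(("C1", "C2")), centroid(("C3", "C4"))))
```

Ties (here Distance(1,2) = Distance(3,4) = 2) are broken by taking the first pair encountered, which matches the order used in the slides.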

Dendrogram for the Data
[Figure: dendrogram with leaves C1, C2, C3, C4, C5, C6]

Single Linkage
- The first cluster is formed in the same fashion
- The distance between Cluster 1 (customers 1 and 2) and customer 3 is the minimum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 145

Similarity Matrix
         C1&C2  C3  C4  C5  C6
C1&C2    0
C3       145    0
C4
C5
C6

Complete Linkage
- The distance between Cluster 1 (customers 1 and 2) and customer 3 is the maximum of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 181

Similarity Matrix
         C1&C2  C3  C4  C5  C6
C1&C2    0
C3       181    0
C4
C5
C6

Average Linkage
- The distance between Cluster 1 (customers 1 and 2) and customer 3 is the average of Distance(1,3) = 181 and Distance(2,3) = 145, i.e., 163

Similarity Matrix
         C1&C2  C3  C4  C5  C6
C1&C2    0
C3       163    0
C4
C5
C6
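The three linkage rules differ only in how they summarize the pairwise distances between members of two clusters. A minimal sketch, assuming the same squared-distance measure as before and a hypothetical helper named `linkage_distance`:

```python
# Coordinates consistent with Distance(1,3) = 181 and Distance(2,3) = 145.
points = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14)}


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def linkage_distance(cluster_a, cluster_b, rule):
    """Distance between two clusters under a given linkage rule."""
    pairwise = [squared_distance(points[i], points[j])
                for i in cluster_a for j in cluster_b]
    if rule == "single":      # nearest neighbor
        return min(pairwise)
    if rule == "complete":    # farthest neighbor
        return max(pairwise)
    if rule == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown rule: {rule}")


for rule in ("single", "complete", "average"):
    print(rule, linkage_distance(("C1", "C2"), ("C3",), rule))
```

The printed values correspond to the C3 entries in the three similarity matrices above.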

Ward’s Method
- Does not compute distances between clusters
- Forms clusters by maximizing within-cluster homogeneity, i.e., minimizing the error sum of squares (ESS)
- ESS for a cluster with two observations (say, C1 and C2): $\text{ESS} = (5-5.5)^2 + (6-5.5)^2 + (5-5.5)^2 + (6-5.5)^2 = 1$

Ward’s Method
    CL1    CL2  CL3  CL4  CL5  ESS
1   C1,C2  C3   C4   C5   C6   1
2   C1,C3  C2   C4   C5   C6
3   C1,C4  C2   C3   C5   C6
4   C1,C5  C2   C3   C4   C6
5   C1,C6  C2   C3   C4   C5
6   C2,C3  C1   C4   C5   C6
7   C2,C4  C1   C3   C5   C6   90.5
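The ESS column in the table above can be computed directly. The sketch below is illustrative: `ess` and `best_first_merge` are hypothetical names, C4 = (16, 15) is inferred from the distances quoted earlier, and C5's income of 25 is an assumed value (its education of 20 is implied by the later centroid table).

```python
from itertools import combinations

# (income in $1000s, education in years); C4 is inferred, C5's income is assumed.
points = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
          "C4": (16, 15), "C5": (25, 20), "C6": (30, 19)}


def ess(members):
    """Error sum of squares: squared deviations from the cluster means."""
    coords = [points[m] for m in members]
    means = [sum(col) / len(col) for col in zip(*coords)]
    return sum((v - m) ** 2 for row in coords for v, m in zip(row, means))


def best_first_merge():
    """First Ward step: the pair whose merger gives the smallest ESS.

    Singleton clusters contribute an ESS of 0, so the total ESS of a
    five-cluster solution equals the ESS of its one two-member cluster.
    """
    return min(combinations(points, 2), key=ess)


print(ess(("C1", "C2")), ess(("C2", "C4")))
print(best_first_merge())
```

For a two-member cluster the ESS is half the squared distance between the members, which is why (C2, C4) gives 181 / 2 = 90.5.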

Non-Hierarchical Clustering
- Data are grouped into K clusters
- Requires a priori knowledge of K

Basic Steps in Non-Hierarchical Clustering
1. Select K initial cluster centroids
2. Assign each observation to the cluster to which it is closest
3. Reassign or reallocate each observation to one of the K clusters according to a pre-determined stopping rule
4. Stop if there is no reallocation
Approaches differ in Step 1 and/or Step 3

Algorithm I
- Selects the first K observations as cluster centers

Initial Cluster Centroids
Variable   CL1  CL2  CL3
Income     5    6    15
Education  5    6    14

Initial Assignment
[Table: for each of C1 to C6, the distance from C1, the distance from C2, the distance from C3, and the cluster it is assigned to]

New Cluster Centroids
Variable   CL1  CL2  CL3
Income
Education  5    6    17

Distance Matrix
[Table: for each of C1 to C6, the distance from CL1, the distance from CL2, the distance from CL3, the previous assignment, and the current assignment]
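The Algorithm I walk-through above (first K observations as seeds, assign, recompute centroids, stop when nothing moves) can be sketched as follows. This is a minimal illustration, not the SAS implementation: the function name is hypothetical, C4 = (16, 15) is inferred from the distances quoted earlier, and C5's income of 25 is an assumed value (its education of 20 follows from the CL3 education mean of 17).

```python
# (income in $1000s, education in years); C4 is inferred, C5's income is assumed.
points = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
          "C4": (16, 15), "C5": (25, 20), "C6": (30, 19)}


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def cluster_first_k_seeds(points, k, max_iter=100):
    names = list(points)
    centroids = [points[n] for n in names[:k]]   # Step 1: first K observations
    assignment = None
    for _ in range(max_iter):
        # Steps 2-3: (re)assign every observation to its closest centroid.
        new_assignment = {
            n: min(range(k), key=lambda c: squared_distance(points[n], centroids[c]))
            for n in names
        }
        if new_assignment == assignment:         # Step 4: no reallocation
            break
        assignment = new_assignment
        for c in range(k):
            members = [points[n] for n in names if assignment[n] == c]
            if members:
                centroids[c] = tuple(sum(col) / len(col) for col in zip(*members))
    return assignment, centroids


assignment, centroids = cluster_first_k_seeds(points, k=3)
print(assignment)
print(centroids)
```

With these seeds, C1 and C2 each remain alone in their own cluster and C3 to C6 share the third, which illustrates how sensitive the result is to the choice of initial centroids.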

Algorithm II
- Differs from Algorithm I in how the initial seeds are modified
- As before, the first K observations are selected as the initial cluster seeds
- The candidates for replacement are the two seeds that are closest to each other
- An observation qualifies to replace one of the two candidates if the distance between those two seeds is less than the distance between the observation and its closest seed

Algorithm II (contd.)
- C1, C2 and C3 are the initial seeds
- The smallest distance between the seeds is between C1 and C2
- Observation C4 does not qualify as a replacement because Distance(C1,C2) = 2 is not less than Distance(C4, nearest seed C3) = 2
- Observation C5 does qualify as a replacement because Distance(C1,C2) < Distance(C5, nearest seed C3): replace C2 with C5
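The qualification test for seed replacement can be checked with a short sketch. It covers only the test itself, not the choice of which of the two candidate seeds to replace. The function names are hypothetical, C4 = (16, 15) is inferred from Distance(3,4) = 2, and C5's income of 25 is an assumed value; the outcome for C5 does not depend on it, since the education difference alone already exceeds the seed gap.

```python
from itertools import combinations

# (income in $1000s, education in years); C4 is inferred, C5's income is assumed.
points = {"C1": (5, 5), "C2": (6, 6), "C3": (15, 14),
          "C4": (16, 15), "C5": (25, 20), "C6": (30, 19)}


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def closest_seed_pair(seeds):
    """The two seeds that are closest to each other (replacement candidates)."""
    return min(combinations(seeds, 2),
               key=lambda pair: squared_distance(points[pair[0]], points[pair[1]]))


def qualifies(observation, seeds):
    """True if the gap between the closest seeds is smaller than the
    distance from the observation to its nearest seed."""
    a, b = closest_seed_pair(seeds)
    seed_gap = squared_distance(points[a], points[b])
    nearest = min(squared_distance(points[observation], points[s]) for s in seeds)
    return seed_gap < nearest


seeds = ["C1", "C2", "C3"]
print(closest_seed_pair(seeds))
print(qualifies("C4", seeds), qualifies("C5", seeds))
```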

Initial Assignment
[Table: for each of C1 to C6, the distance from C1, the distance from C2, the distance from C3, and the cluster it is assigned to]

New Cluster Centroids
[Table: Income and Education for CL1, CL2, CL3]

Distance Matrix
[Table: for each of C1 to C6, the distance from CL1, the distance from CL2, the distance from CL3, the previous assignment, and the current assignment]

Hierarchical vs. Non-Hierarchical Clustering
- Hierarchical clustering does not require a priori knowledge of the number of clusters
- Assignments are static: once an observation joins a cluster, it is never reassigned
- Use hierarchical clustering for exploratory purposes
- Non-hierarchical methods can be viewed as complementary to hierarchical methods rather than competing with them

Voter Profiling
- A survey of voters' concerns may help us group voters with similar concerns; perhaps they all live in a certain area?
- Target ads/mailings with customized messages

Ad Campaign
- Use attitudinal data to segment customers
- Target the message appropriately

Promotional Strategies
- Use transaction data to identify customers who are more prone to purchasing the product on deal
- Give a stronger incentive to the price-sensitive segment