Business Application & Conceptual Issues

William B. Hakes, Ph.D.-V Cluster Analysis & Hybrid Models Business Application & Conceptual Issues March 3, 2005.
Presentation transcript:

Cluster Analysis: Business Application & Conceptual Issues. Decisionistics: Statistical Insight… Better Decisions.

Introduction
A financial analyst at an investment firm is interested in identifying a group of mutual funds that look alike in a “true” context, not simply based on the way Morningstar rates them.
A marketing manager is interested in identifying similar cities that can be used for a test marketing campaign in which a new product might be introduced.
The Director of Marketing at a telecom firm wants to understand the types of people that he already knows are candidates for the firm’s new long-distance service.
A Golf Club General Manager wants to understand the “natural” segments of his members so that he can better utilize his club’s assets and understand how he might ideally want the club to look in the future.

Cluster Overview
Cluster Analysis- A technique used for combining observations into groups or clusters such that:
  Each group is homogeneous, or compact, with respect to certain characteristics.
  Each group is different from the other groups with respect to the same characteristics.
Mathematically, we minimize the within-cluster sum of squares and maximize the between-cluster sum of squares.
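The within/between criterion can be made concrete with a small sketch. The data, the labels, and the helper name `sum_of_squares_split` below are hypothetical and not from the slides. The sketch only illustrates that, for a single variable, the total sum of squares splits into a within-cluster part and a between-cluster part, so a grouping that lowers one raises the other.

```python
from collections import defaultdict


def sum_of_squares_split(values, labels):
    """Return (within_ss, between_ss, total_ss) for one numeric variable."""
    grand_mean = sum(values) / len(values)

    # Group the observations by their cluster label.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    within_ss = 0.0
    between_ss = 0.0
    for members in groups.values():
        cluster_mean = sum(members) / len(members)
        # Spread of the members around their own cluster mean.
        within_ss += sum((v - cluster_mean) ** 2 for v in members)
        # Distance of the cluster mean from the grand mean, weighted by size.
        between_ss += len(members) * (cluster_mean - grand_mean) ** 2

    total_ss = sum((v - grand_mean) ** 2 for v in values)
    return within_ss, between_ss, total_ss


# Hypothetical data: six observations on one variable, grouped two ways.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
good_labels = ["A", "A", "A", "B", "B", "B"]  # compact, well-separated groups
poor_labels = ["A", "B", "A", "B", "A", "B"]  # groups that mix low and high values

good = sum_of_squares_split(values, good_labels)
poor = sum_of_squares_split(values, poor_labels)
print("good grouping (within, between, total):", good)
print("poor grouping (within, between, total):", poor)
```

Because the total is fixed by the data, comparing the within-cluster figure across candidate groupings is enough to rank them on this criterion.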

Cluster Overview
Cluster Analysis- It’s easy when:
  You have a relatively small sample.
  You have nice, neat data.
  Your variables are continuous.
Cluster Analysis- The Real World
  Sometimes samples are small, but in business they’re large.
  We’d like our data to be free from error, containing no outliers, but that is rarely the case.
  Variables are often a mix of continuous and categorical data.

Clustering- A Problematic Example
Take the following example: You are a firm trying to generate clusters for the Atlanta area, with the objective of understanding the zip codes to which you want to “mass” market your products. Many different races exist. How do you cluster them? Typically, it’s coded as:
  1) White
  2) African American
  3) Asian
  4) Hispanic
  5) Native American
  6) Non-white other
What will clustering do with this variable as it groups people?

A Problematic Example cont’d
Can you cluster this simple example? How will you interpret it (e.g., what is a common way to look at the “answer” to see if you agree with the differentiation)?

A Problematic Example cont’d
Cluster Means- What do they tell us? Assume we have three clusters, and along the “race” dimension, they are as follows:
  Cluster 1- Mean = 2
  Cluster 2- Mean = 4
  Cluster 3- Mean = 1
How do you use this data to assign people into clusters?
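A short sketch shows why a cluster mean on this variable is hard to interpret. The code-to-label mapping follows the six-category coding listed in the example. The cluster membership is hypothetical, chosen only to make the point.

```python
# Nominal codes from the example; the numbers are labels, not quantities.
RACE_CODES = {
    1: "White",
    2: "African American",
    3: "Asian",
    4: "Hispanic",
    5: "Native American",
    6: "Non-white other",
}


def mean_code(codes):
    """Arithmetic mean of the category codes, as a clustering routine would compute it."""
    return sum(codes) / len(codes)


# Hypothetical cluster: two members coded 1 (White) and two coded 3 (Asian).
cluster = [1, 3, 1, 3]

cluster_mean = mean_code(cluster)
label_at_mean = RACE_CODES[round(cluster_mean)]

print("cluster mean:", cluster_mean)
print("category whose code equals the mean:", label_at_mean)
print("members actually holding that code:", cluster.count(round(cluster_mean)))
```

The mean lands on a code that no member of the cluster holds, so a reported mean of 2 does not by itself say who is in the cluster.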

Binary Variables- One Possible Solution?
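One way to read the binary-variable idea is to replace the single coded variable with one 0/1 indicator per category. The sketch below assumes that reading. The helper names `to_indicators` and `cluster_profile` and the sample cluster are hypothetical. With indicators, each cluster mean is simply the share of members in that category.

```python
CATEGORIES = [
    "White",
    "African American",
    "Asian",
    "Hispanic",
    "Native American",
    "Non-white other",
]


def to_indicators(category):
    """Turn one category label into a list of 0/1 flags, one flag per category."""
    return [1 if category == name else 0 for name in CATEGORIES]


def cluster_profile(members):
    """Mean of each indicator across the cluster, i.e. the share in each category."""
    rows = [to_indicators(member) for member in members]
    return {
        name: sum(row[i] for row in rows) / len(rows)
        for i, name in enumerate(CATEGORIES)
    }


# Hypothetical cluster with two White and two Asian members.
members = ["White", "Asian", "White", "Asian"]
profile = cluster_profile(members)

for name, share in profile.items():
    print(f"{name}: {share:.0%}")
```

Read this way, the cluster is described as 50% White and 50% Asian instead of as a single mean code.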

Application of Binary Clustering
A Golf Club General Manager wants to understand the “natural” segments of his members so that he can better utilize his club’s assets and understand how he might ideally want the club to look in the future. How can cluster analysis help? We took a look at the following:
  Demographic Information
  Usage Information
  Cost Information
Some data was measured and some was survey data.

Application of Binary Clustering

Binary Variables- A Closer Look How will these cases cluster? What can we do about it?

Jaccard Coefficient
$S_J = \dfrac{a}{a + b + c}$
It has many different uses, and it works well for clustering (see SPSS). Here a is the sum of agreement (+ +), and b and c represent the sums of the absent/present combinations (i.e., + - and - +, respectively). The table below shows this convention of lettering for counts when calculating the similarity between two objects. Values of d are not considered because they count joint absences (- -), and a characteristic that both objects lack is not treated as evidence of similarity.

                OBJECT 1
                +     -
  OBJECT 2  +   a     b
            -   c     d
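The coefficient can be computed directly from two 0/1 profiles. This is a minimal sketch: the function names and the two example profiles are hypothetical, and returning 0.0 when both profiles are all zeros is an assumption made here, not something the slides specify. Since b and c only appear as a sum, it does not matter which mismatch direction is labelled b and which is labelled c.

```python
def match_counts(x, y):
    """Count the four cells (a, b, c, d) for two equal-length 0/1 profiles."""
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1  # present in both (+ +)
        elif xi and not yi:
            b += 1  # present in the first profile only
        elif yi and not xi:
            c += 1  # present in the second profile only
        else:
            d += 1  # absent in both (- -), ignored by Jaccard
    return a, b, c, d


def jaccard(x, y):
    """Jaccard similarity a / (a + b + c)."""
    a, b, c, _ = match_counts(x, y)
    denominator = a + b + c
    # Assumption: two all-zero profiles get similarity 0.0 instead of raising an error.
    return a / denominator if denominator else 0.0


# Hypothetical profiles over six yes/no characteristics.
member_1 = [1, 1, 0, 0, 1, 0]
member_2 = [1, 0, 0, 0, 1, 1]

counts = match_counts(member_1, member_2)
similarity = jaccard(member_1, member_2)
print("a, b, c, d =", counts)
print("Jaccard similarity =", similarity)
```

A clustering routine would typically work with the corresponding distance, 1 minus the similarity, computed for every pair of cases.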

Another Application of Binary Clustering
A Security Company wants to understand the “natural” segments of those who have bought its service in the past and those who have not.
What methods do we use first?
How can binary cluster analysis help (using Jaccard)?
  It allows us to use categorical data.
  It gives us unique summary insight into the true percentages of each cluster along various dimensions.
  It is not tricked by the zero problem.
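The slides do not define the zero problem. The sketch below assumes it refers to shared absences (0/0 pairs) inflating similarity when profiles are sparse. It compares Jaccard with the simple matching coefficient, (a + d) / n, a measure brought in here only for contrast. The two customer profiles are hypothetical.

```python
def cell_counts(x, y):
    """Return (a, mismatches, d) for two equal-length 0/1 profiles."""
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)          # present in both
    d = sum(1 for xi, yi in zip(x, y) if not xi and not yi)  # absent in both
    mismatches = len(x) - a - d                              # b + c
    return a, mismatches, d


def simple_matching(x, y):
    """Simple matching coefficient: joint absences count as agreement."""
    a, _, d = cell_counts(x, y)
    return (a + d) / len(x)


def jaccard(x, y):
    """Jaccard coefficient: joint absences are left out entirely."""
    a, mismatches, _ = cell_counts(x, y)
    denominator = a + mismatches
    # Assumption: two all-zero profiles get similarity 0.0.
    return a / denominator if denominator else 0.0


# Hypothetical sparse profiles: each customer has exactly one of ten
# characteristics, and they do not have the same one.
customer_1 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
customer_2 = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

sm = simple_matching(customer_1, customer_2)
sj = jaccard(customer_1, customer_2)
print("simple matching:", sm)
print("Jaccard:", sj)
```

Under simple matching the two customers look highly similar purely because of what neither has, while Jaccard reports no similarity at all.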

Issues for Further Research
Logistic Regression generates clusters… How many?
  Generate model.
  Assess predictive power.
  Score against database.
  We know “who” is important, but “how” do we reach them?
Cluster Analysis
  Which variables are important in clustering? How do you know?
Clustering followed by Rule Induction
  Develop clusters.
  Use as inputs into algorithm (CHAID).
  Take simple rules and use to assess cases across a database.