MKT 700 Business Intelligence and Decision Models Algorithms and Customer Profiling (1)

Classification and Prediction

Classification → Unsupervised Learning

Prediction → Supervised Learning

SPSS Direct Marketing

                        Classification                                  Predictive
Unsupervised Learning   RFM, Cluster analysis, Postal Code Responses    NA
Supervised Learning     Customer Profiling                              Propensity to buy

SPSS Analysis

                        Classification                                          Predictive
Unsupervised Learning   Hierarchical Cluster, Two-Step Cluster, K-Means Cluster  NA
Supervised Learning     Classification Trees (CHAID, CART)                      Linear Regression, Logistic Regression, Artificial Neural Nets

Major Algorithms

                        Classification                                               Predictive
Unsupervised Learning   Euclidean Distance, Log Likelihood                           NA
Supervised Learning     Chi-square Statistics, Log Likelihood, GINI Impurity Index,  Log Likelihood, F-Statistics (ANOVA)
                        F-Statistics (ANOVA)

Nominal variables: Chi-square, Log Likelihood; Continuous variables: F-Statistics, Log Likelihood

Euclidean Distance

Euclidean Distance for Continuous Variables
Pythagorean distance: d = √(a² + b²)
Euclidean space (3 dimensions): d = √(a² + b² + c²)
Euclidean distance: d = [Σ(dᵢ)²]^(1/2) (cluster analysis with continuous variables)
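The distance formula above can be sketched in a few lines of Python (a minimal illustration of the formula, not SPSS's implementation):

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# 3-4-5 right triangle: distance between (0, 0) and (3, 4)
print(euclidean_distance((0, 0), (3, 4)))  # -> 5.0
```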

Pearson’s Chi-Square

Contingency Table

        North   South   East   West   Tot.
Yes       68      75     57     79
No        32      45     33
Tot.

Observed and Theoretical Frequencies

          North   South   East   West   Tot.
Yes (%)
No (%)
Tot.

Chi-Square

Cell   fo    fe    fo − fe    (fo − fe)²/fe
1,1    68
1,2    75
1,3    57
1,4    79
2,1    32
2,2    45
2,3    33
2,4
X² = 3.032

Statistical Inference
DF = (4 columns − 1) × (2 rows − 1) = 3
Critical value (p < .05, DF = 3) = 7.81; since X² = 3.032 < 7.81, the association is not significant.
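The expected frequencies, X² statistic, and degrees of freedom can be computed from any observed table as follows (a plain-Python sketch with a small hypothetical 2×2 table, not the slide's data):

```python
def chi_square(observed):
    """Pearson chi-square and DF for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (fo - fe) ** 2 / fe
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# hypothetical table: expected frequencies are all 20, so X² = 4 * (10²/20) = 20
print(chi_square([[10, 30], [30, 10]]))  # -> (20.0, 1)
```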

Log Likelihood Chi-Square

Log Likelihood
Based on probability distributions rather than contingency (frequency) tables. Applicable to both categorical and continuous variables, unlike chi-square, which requires continuous variables to be discretized.

Contingency Table (Observed Frequencies)

        Cluster 1   Cluster 2   Total
Male        10          30        40

Contingency Table (Expected Frequencies)

        Cluster 1   Cluster 2   Total
Male        20          20        40

Chi-Square

Cell   fo    fe    fo − fe    (fo − fe)²/fe
1,1    10    20     −10           5.0
1,2    30    20      10           5.0
X² = 10.0
p < 0.05; DF = 1; critical value = 3.84

Log Likelihood Distance & Probability

Male         O     E     O/E    Ln (O/E)   O × Ln (O/E)
Cluster 1   10    20    0.5     −0.693        −6.93
Cluster 2   30    20    1.5      0.405        12.16

2∑O·Ln(O/E) = 2 × (−6.93 + 12.16) = 10.46
p < 0.05; critical value = 3.84
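The likelihood-ratio statistic in the table above can be verified directly (a minimal sketch of the G² = 2ΣO·ln(O/E) formula):

```python
import math

def log_likelihood_chi_square(observed, expected):
    """Likelihood-ratio statistic G² = 2 * sum(O * ln(O/E))."""
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

# Male counts by cluster, expected 20 in each cluster (from the slide)
g2 = log_likelihood_chi_square([10, 30], [20, 20])
print(round(g2, 2))  # -> 10.46, above the critical value of 3.84
```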

Variance, ANOVA, and F Statistics

F-Statistics For metric or continuous variables Compares explained (in the model) and unexplained variances (errors)

Variance

COUNT = 20     SS = 1461     DF = 19
MEAN = 43.6    VAR = 76.88   SD = 8.768

SS is the Sum of Squares (squared differences from the mean)
DF = N − 1
VAR = SS/DF
SD = √VAR
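The definitions above translate directly into a short helper (a sketch of the SS/DF/VAR/SD chain, shown on made-up data rather than the slide's 20 values):

```python
import math

def variance_stats(values):
    """Return (SS, DF, VAR, SD) for a sample, following SS/DF = VAR, SD = sqrt(VAR)."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)   # sum of squared deviations
    df = n - 1                                  # degrees of freedom
    var = ss / df                               # sample variance
    return ss, df, var, math.sqrt(var)

# small illustrative sample: mean 4, SS = 8, DF = 2, VAR = 4, SD = 2
print(variance_stats([2, 4, 6]))  # -> (8.0, 2, 4.0, 2.0)
```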

ANOVA
Two groups: t-test
Three or more groups: Are errors (discrepancies between observations and the overall mean) explained by group membership or by some other (random) effect?

Oneway ANOVA

               Group 1   Group 2   Group 3
Grand mean
Group means
(X − Mean)²
SS Within
Total SS

MSS(Between)/MSS(Within)

                Between Groups   Within Groups   Total
Errors (SS)
DF              3 − 1 = 2        24 − 3 = 21     24 − 1 = 23
Mean SS                          0.696

F = Between-Groups Mean SS / Within-Groups Mean SS; p-value < .05

ONEWAY (Excel or SPSS)

Anova: Single Factor

SUMMARY
Groups    Count   Sum   Average   Variance
Group 1
Group 2
Group 3

ANOVA
Source of Variation   SS   df   MS   F   P-value   F crit
Between Groups
Within Groups
Total
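The F statistic that Excel or SPSS reports can be reproduced from the between/within sums of squares (a from-scratch sketch on three small hypothetical groups, not the slide's data):

```python
def one_way_anova(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    # between-groups SS: group sizes times squared deviation of group mean from grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # within-groups SS: squared deviations of each value from its own group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# group means 2, 3, 4 around grand mean 3: SS_between = 6, SS_within = 6, F = 3
print(one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))  # -> (3.0, 2, 6)
```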

Profiling

Customer Profiling: Documenting or Describing Who is likely to buy or not respond? Who is likely to buy what product or service? Who is in danger of lapsing?

CHAID or CART

CHAID (Chi-Square Automatic Interaction Detector)
- Based on chi-square
- All variables discretized
- Dependent variable: nominal

CART (Classification and Regression Tree)
- Variables can be discrete or continuous
- Based on GINI or F-test
- Dependent variable: nominal or continuous

Use of Decision Trees
- Classify observations from a target binary or nominal variable → Segmentation
- Predictive response analysis from a target numerical variable → Behaviour
- Decision support rules → Processing

Decision Tree

Example: dmdata.sav
Underlying theory: X²

CHAID Algorithm: Selecting Variables

Example: Region (4 categories), Gender (3, including Missing), Age (6, including Missing)
- For each variable, collapse categories to maximize the chi-square test of independence, e.g. Region (N, S, E, W, *) → (WSE, N*)
- Select the most significant variable
- Go to the next branch … and the next level
- Stop growing if the estimated X² < the theoretical X²
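The category-collapsing step can be illustrated with a small sketch (a deliberate simplification of CHAID, not SPSS's implementation; the category counts below are hypothetical): for each pair of categories, compute a pairwise 2×2 chi-square and treat the least significantly different pair (smallest X²) as the merge candidate.

```python
from itertools import combinations

def pairwise_chi_square(cell_a, cell_b):
    """Chi-square for a 2x2 table built from two (yes, no) category rows."""
    table = [cell_a, cell_b]
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(2) for j in range(2)
    )

def best_merge(categories):
    """Pair of categories whose (yes, no) response counts differ least."""
    return min(combinations(categories, 2),
               key=lambda pair: pairwise_chi_square(categories[pair[0]],
                                                    categories[pair[1]]))

# hypothetical response counts per category: A and C are identical, so they merge first
regions = {"A": (50, 50), "B": (30, 70), "C": (50, 50)}
print(best_merge(regions))  # -> ('A', 'C')
```

Real CHAID repeats this merging until all remaining pairs are significantly different, then compares variables on their merged tables.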

CART (Nominal Target)
Nominal targets: GINI (impurity reduction) or entropy
Gini Index = 1 − ∑pᵢ² (squared probabilities of node membership)
Gini = 0 when targets are perfectly classified.
Example: P(Bus) = 0.4, P(Car) = 0.3, P(Train) = 0.3
Gini = 1 − (0.4² + 0.3² + 0.3²) = 1 − 0.34 = 0.66
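The Gini calculation above is one line of code (a direct sketch of the 1 − ∑pᵢ² formula):

```python
def gini_impurity(probabilities):
    """Gini index = 1 - sum of squared class probabilities."""
    return 1 - sum(p ** 2 for p in probabilities)

# the slide's Bus/Car/Train example
print(round(gini_impurity([0.4, 0.3, 0.3]), 2))  # -> 0.66
# a perfectly classified node has impurity 0
print(gini_impurity([1.0]))  # -> 0.0
```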

CART (Metric Target) Continuous Variables: Variance Reduction (F-test)

Comparative Advantages (from Wikipedia)
- Simple to understand and interpret
- Requires little data preparation
- Able to handle both numerical and categorical data
- Uses a white-box model easily explained by Boolean logic
- Possible to validate a model using statistical tests
- Robust