Lecture 8 MARK2039 Summer 2006 George Brown College Wednesday 9-12.

Presentation transcript:

Lecture 8 MARK2039 Summer 2006 George Brown College Wednesday 9-12

2 Assignment 6 Backend: H4B2E5STRUGER Marketing list: H4B2E5STRUGERJOHN4849MAYFAIR Unaddressed Campaign: H4B2E5

3 Assignment 6 [Table; columns: Id, Total Amount, # of months since last transaction. Cell values not recovered.]

4 Assignment 6 Data needs to be standardized such that we have one value for each gender outcome

5 Assignment 6 Use the purchase behaviour field and look at a purchase window of, say, 3 months (April 2006 to June 2006). A purchase in the window means the customer is a non-defector (0), while no purchase in the window means the customer is a defector (1). I would use the other information (income, region, age, and tenure) as potential variables to help predict defection.

6 Classification/Profiling vs. Predictive Modelling
Predictive modelling: independent variables from the pre period (age, tenure, income, transaction behaviour) are used to predict the dependent variable in the post period (defector or non-defector).
Classification/profiling: defectors and non-defectors are each described by the same variables (age, tenure, income, transaction behaviour).

7 Predictive Modelling Examples: Discrete Models
– Response Models (Cross-Sell, Upsell, Acquisition)
– Attrition Models
– Product Affinity Models
– Risk Models

8 Predictive Modelling Examples: Continuous Models
– Profitability/Value Models
– Spending Models

9 Types of Predictive Models An acquisition campaign with no targeting was conducted in January. The available information is as follows:
– Mail files containing name and address
– Responder files containing name and address
– 2001 Stats Can Census data available at the enumeration area level
– A conversion table which maps enumeration areas to postal codes
How would you use the above information to better target prospects to become new customers? Describe how the analytical file would be created.
1) Define the objective function by creating a response variable.
2) Create the response variable by matching the responder file to the mail file using a match key of postal code and last name. Assign a value of 1 for matches (responders) and 0 for non-matches (non-responders). This field is created on the mail file, which becomes the analytical file.
3) Match the analytical file to the Stats Can conversion file (which contains the enumeration area) by postal code. Then match the new output file by enumeration area to the Stats Can Census file, which contains the very rich demographic information.
Remember, the end deliverable is a table with the dependent variable (objective function) and examples of independent (predictor) variables.
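
The matching steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`last_name`, `postal_code`, `avg_income`, `pct_owners`) and the enumeration area codes are all hypothetical, and real files would need more careful name and address cleaning than shown here.

```python
# Hypothetical records; all field names and values are illustrative assumptions.
mail_file = [
    {"last_name": "STRUGER", "postal_code": "H4B2E5"},
    {"last_name": "SMITH", "postal_code": "M5V1K4"},
    {"last_name": "LEE", "postal_code": "K1A0B1"},
]
responder_file = [{"last_name": "Struger", "postal_code": "h4b 2e5"}]

# Conversion table: postal code -> enumeration area (EA).
postal_to_ea = {"H4B2E5": "EA001", "M5V1K4": "EA002"}

# Census demographics keyed by enumeration area.
census_by_ea = {
    "EA001": {"avg_income": 52000, "pct_owners": 0.61},
    "EA002": {"avg_income": 71000, "pct_owners": 0.35},
}


def match_key(record):
    """Build the match key (postal code, last name) in a normalised form."""
    postal = record["postal_code"].replace(" ", "").upper()
    return (postal, record["last_name"].strip().upper())


def build_analytical_file(mail, responders, pc_to_ea, census):
    """Flag responders on the mail file, then append EA-level demographics."""
    responder_keys = {match_key(r) for r in responders}
    rows = []
    for record in mail:
        key = match_key(record)
        row = dict(record)
        # Step 2: response variable, 1 for a match and 0 otherwise.
        row["response"] = 1 if key in responder_keys else 0
        # Step 3: postal code -> EA -> census demographics.
        ea = pc_to_ea.get(key[0])
        row["ea"] = ea
        row.update(census.get(ea, {}))
        rows.append(row)
    return rows


analytical_file = build_analytical_file(
    mail_file, responder_file, postal_to_ea, census_by_ea
)
```

Each row of `analytical_file` then carries the dependent variable (`response`) next to candidate predictors. Prospects whose postal code is missing from the conversion table keep `ea` set to `None` and receive no demographic fields.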

10 Types of Predictive Models You have been asked to create programs that better target existing customers for insurance products. You have the following info: What would you do and how would you create the analytical file?
1) Define the objective function and create the insurance response variable.
2) Create the insurance response variable by looking at the amount spent in a certain transaction type and within a certain timeframe. Assign a value of 1 if this condition is met and 0 if not. This field will be created on the analytical file.
3) Create independent model predictors by creating recency, frequency, and amount variables, overall and by type, from the transaction file. Create demographic variables from the customer file such as region of country, tenure, age, income, etc.
Remember, the end deliverable is a table with the dependent variable (objective function) and examples of independent (predictor) variables.
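
A minimal sketch of how such an analytical file might be assembled follows. The transaction records, customer fields, cut-off date and response window are hypothetical. The split into a pre period for predictors and a later window for the response is an assumption borrowed from the pre/post framing earlier in the lecture.

```python
from datetime import date

# Hypothetical transaction and customer files.
transactions = [
    {"cust": "C001", "type": "insurance", "date": date(2006, 5, 10), "amount": 300.0},
    {"cust": "C001", "type": "retail", "date": date(2006, 2, 1), "amount": 50.0},
    {"cust": "C002", "type": "retail", "date": date(2006, 1, 20), "amount": 80.0},
    {"cust": "C002", "type": "retail", "date": date(2006, 3, 5), "amount": 20.0},
]
customers = {
    "C001": {"region": "Ontario", "tenure_years": 4},
    "C002": {"region": "Quebec", "tenure_years": 9},
}

# Assumed design: predictors come from before AS_OF, response from the window after it.
AS_OF = date(2006, 4, 1)
WINDOW_END = date(2006, 6, 30)


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def build_row(cust_id, txns, demographics):
    """One analytical-file row: response flag, RFM predictors, demographics."""
    pre = [t for t in txns if t["date"] < AS_OF]
    in_window = [
        t for t in txns
        if AS_OF <= t["date"] <= WINDOW_END and t["type"] == "insurance"
    ]
    row = {"cust": cust_id}
    # Dependent variable: 1 if any insurance spend falls inside the window.
    row["response"] = 1 if sum(t["amount"] for t in in_window) > 0 else 0
    # Recency, frequency and amount from the pre period only.
    last_purchase = max((t["date"] for t in pre), default=None)
    row["recency_months"] = (
        months_between(last_purchase, AS_OF) if last_purchase else None
    )
    row["frequency"] = len(pre)
    row["amount"] = sum(t["amount"] for t in pre)
    row.update(demographics)
    return row


analytical_file = [
    build_row(cust_id, [t for t in transactions if t["cust"] == cust_id], demo)
    for cust_id, demo in customers.items()
]
```

The same recency, frequency and amount calculation could be repeated per transaction type by filtering `pre` on `type` before summarising.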

11 Types of Predictive Models You have been asked to build a targeting tool for a cross-sell campaign to get existing customers to purchase an insurance policy. A campaign was conducted in May of
What questions do you need to ask in order to help design a proper tool? Was the campaign data captured? Are responders clearly identified, or do we have to impute them through the database based on the transaction data that occurred within a certain time frame of the campaign?

12 Types of Predictive Models You have been asked to target customers that will not only purchase insurance but will also purchase the largest premiums. What type of model would be built here? A two-stage model, whereby we target both insurance response and premium. The objective function is the expected value of premium: Pr(Response) X Premium.
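
As a small illustration of the expected-value objective, the sketch below ranks prospects by Pr(Response) X Premium. The probabilities and premium estimates are hypothetical stand-ins for the outputs of the two stages; no actual model is fitted here.

```python
# Hypothetical scores from the two stages of the model.
prospects = [
    {"id": "C001", "p_response": 0.10, "premium": 500.0},
    {"id": "C002", "p_response": 0.30, "premium": 100.0},
    {"id": "C003", "p_response": 0.05, "premium": 2000.0},
]


def expected_premium(prospect):
    """Objective function: Pr(Response) x Premium."""
    return prospect["p_response"] * prospect["premium"]


# Highest expected premium first.
ranked = sorted(prospects, key=expected_premium, reverse=True)
ranked_ids = [p["id"] for p in ranked]
```

In this made-up data the customer with the highest response probability (C002) ranks last, because the premium it would bring in is small.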

13 Types of Predictive Models Creating The Analytical File –Defining the objective function –Defining the Model predictors Once this is done, the first diagnostic that can be done is the correlation matrix.

14 Correlation
– We want to determine which variables have the greatest relationship with response.
– Run the correlation of the dependent variable with all the independent variables (in your reduced set).
– Select the best variables based on the highest correlation coefficients in absolute value (usually those that meet a statistical significance criterion of at least 95%).
– Correlation can be negative or positive.
– Serves as a great pre-screening tool.
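
The pre-screening step can be sketched as follows. The data are hypothetical, and the significance check uses the t-statistic for a correlation compared against 1.96, which is only a large-sample approximation to a 95% two-sided test. With a real analytical file a statistics package would report exact significance levels.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable carries no correlation
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for testing whether a correlation differs from zero."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical analytical file: response plus two candidate predictors.
response = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]
predictors = {
    "age": [25, 30, 28, 55, 60, 52, 33, 58, 49, 27],
    "household_size": [2, 3, 4, 2, 3, 4, 3, 3, 3, 3],
}

CRITICAL_T = 1.96  # assumed cut-off, roughly 95% two-sided for large samples

results = {}
for name, values in predictors.items():
    r = pearson(values, response)
    results[name] = {"r": r, "t": t_statistic(r, len(values))}

# Keep variables whose correlation passes the screen, strongest first.
selected = sorted(
    (name for name, stats in results.items() if abs(stats["t"]) > CRITICAL_T),
    key=lambda name: abs(results[name]["r"]),
    reverse=True,
)
```

Here `age` shows a strong negative correlation with response and passes the screen, while `household_size` has a correlation of zero and is dropped.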

15 The Concept of Correlation Using correlation analysis to select variables for our response model. The analytical file contains one dependent (modelled) variable, Response, and six independent variables: Age, Tenure, # of Products, # of Promotions, Income, Household Size. The key diagnostics in this routine are the correlation coefficient and the confidence level.

16 Correlation Coefficient

17 Correlation Analysis The male gender variable has a perfect correlation of +1. The female gender variable has a perfect correlation of -1. Household size has no correlation with response, hence the correlation coefficient is 0.

18 Correlation Results Show the level of confidence which a given variable has with the modelled behaviour, i.e. response. Reported for each variable: correlation coefficient and confidence interval.

19 Correlation Why couldn’t we just use the results of correlation to create the model and create index values for each significant variable?
– Age
– Tenure
– # of products purchased
– # of promotions since last purchase
Because there is interaction between variables that needs to be accounted for in the modelling exercise (multicollinearity). You can review this concept in more detail in any introductory stats textbook.

20 Examples: Correlation, Response Model Listed below is an example of a correlation matrix. Answer the following:
– Is each variable relevant? All, with the exception of live in Quebec, # in household and # of months since last purchase.
– What is the relationship or impact of each variable with response? The sign of the correlation coefficient tells you the direction of the relationship, while its size tells you the impact.
– What is the strongest variable and what is the weakest variable? Strongest: # of months since last promoted. Weakest: live in Quebec.

21 More examples of correlation
– Younger people are more likely to respond
– Higher income people are more likely to respond
– Males are less likely to respond
Would the correlation values against response be highly positive, close to zero or negative for age, income, and females? Age: highly negative. Income: highly positive. Females: highly positive.
People who live in Quebec exhibit no impact on response; people with high tenure and a high number of months since last promotion are less likely to respond. Would the correlation values against response for each variable be highly positive, close to zero or negative?
– Quebec: close to zero
– Tenure: highly negative
– Number of months since last promotion: highly negative

22 More examples of correlation Previous analysis has indicated the following trends. Would the correlations be closer to 1, -1, or 0 here for both variables? Spending: close to 0. Tenure: close to -1.

23 More examples of correlation Would the correlations be closer to 1, -1, or 0 here for both variables? Spending: close to 1. Tenure: close to 0. What is the learning here vs. the previous slide? The variables have changed in their impact on response.

24 Exploratory Data Analysis Reports(EDA) After looking at the correlation reports, we also need to create EDA reports, which help to better understand the relationship of a given variable with the desired marketing behaviour. They help the business people and marketers to get inside the so-called black box of modelling.

25 Exploratory Data Analysis Reports(EDA)

26 Exploratory Data Analysis Reports(EDA) Let’s take a look at an example of a binary variable. [Table; rows: Male = Yes, Male = No, Average; columns: # of Observations, Response Rate. Values not recovered.] On the next page are some examples of EDA reports of variables that are not statistically significant according to the correlation matrix.
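
An EDA report of this shape (one row per level of the variable plus an overall average) can be produced with a short routine. The sketch below uses hypothetical records and a hypothetical `male` field; it is not the routine used to build the slide's table.

```python
# Hypothetical analytical file with a binary variable and the response flag.
rows = [
    {"male": "Yes", "response": 1},
    {"male": "Yes", "response": 0},
    {"male": "Yes", "response": 0},
    {"male": "Yes", "response": 0},
    {"male": "No", "response": 1},
    {"male": "No", "response": 1},
    {"male": "No", "response": 1},
    {"male": "No", "response": 0},
    {"male": "No", "response": 0},
    {"male": "No", "response": 0},
]


def eda_report(records, variable, target="response"):
    """Number of observations and response rate per level, plus the average."""
    groups = {}
    for record in records:
        group = groups.setdefault(record[variable], {"n": 0, "responders": 0})
        group["n"] += 1
        group["responders"] += record[target]
    report = {
        level: {"n": g["n"], "response_rate": g["responders"] / g["n"]}
        for level, g in groups.items()
    }
    total_n = sum(g["n"] for g in groups.values())
    total_responders = sum(g["responders"] for g in groups.values())
    report["Average"] = {"n": total_n, "response_rate": total_responders / total_n}
    return report


report = eda_report(rows, "male")
for level, stats in report.items():
    print(f"{level:<8} {stats['n']:>4} {stats['response_rate']:.1%}")
```

Comparing each level's response rate with the average row is what lets a marketer see, without any statistics, whether the variable separates responders from non-responders.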

27 Exploratory Data Analysis Reports(EDA) EDAs of variables that are not statistically significant

28 Exploratory Data Analysis Reports: What does this tell us?

29 Exploratory Data Analysis Reports What does this mean?

30 Creating the Final Model Why couldn’t we just use the results of correlation to create the model and create index values for each significant variable?
– Age
– Tenure
– # of products purchased
– # of promotions since last purchase
Hint: think statistics here.

31 The Data Mining Process: Application of Data Mining Techniques, Creating the Final Model
Problems with Multicollinearity. Example: Years of Education and Income on Response Rate.
Regression Equation is: Response= *income -.03*yrs. of education
[Table; correlation coefficient and confidence interval of Income and Years of Education against Response. Recovered confidence values: 99%, 99.50%. Correlation coefficients not recovered.]
What is the problem here and what do you do?
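
One quick diagnostic for the situation on this slide is to correlate the predictors with each other, not only with response. The sketch below flags predictor pairs that move closely together. The data and the 0.8 threshold are illustrative assumptions, not values from the lecture.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predictor columns (income in thousands of dollars).
predictors = {
    "income": [30, 45, 50, 62, 80, 95],
    "yrs_education": [10, 12, 13, 15, 17, 19],
    "household_size": [3, 1, 4, 2, 5, 2],
}


def collinear_pairs(data, threshold=0.8):
    """Return predictor pairs whose absolute correlation meets the threshold."""
    flagged = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


flagged = collinear_pairs(predictors)
```

When a pair such as income and years of education is flagged, the two variables carry largely the same information, and putting both in a regression can produce unstable or wrongly signed coefficients. Typical remedies are to drop one of the pair or to derive a single combined variable.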

32 Continuing to build the model Multivariate analytical techniques such as multiple regression, logistic regression, etc. may be employed to produce the final model. Final equation: Predicted Response Rate = A - B1*Age + B2*Tenure. What is the problem here?

33 Continuing to build the model
Variable          Correlation
Spend             0.6
Live in Ontario   0.5
Number in House   -0.3
Response = A + (.05 X Spend) - (.03 X Live in Ontario) - (.01 X Number in House)
Variable          Correlation
# of products     0.6
Credit Score      0.4
Tenure            -0.2
Response = A - (.03 X # of products) + (.08 X Credit Score) - (.01 X Tenure)
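
One check suggested by these two examples is to compare the sign of each variable's correlation with the sign of its coefficient in the equation. The sketch below uses the correlations and coefficients from the slide; treating a sign mismatch as a warning is an interpretation, since the slide leaves the diagnosis to the reader.

```python
# (correlation with response, coefficient in the equation), values from the slide.
models = {
    "model_1": {
        "Spend": (0.6, 0.05),
        "Live in Ontario": (0.5, -0.03),
        "Number in House": (-0.3, -0.01),
    },
    "model_2": {
        "# of products": (0.6, -0.03),
        "Credit Score": (0.4, 0.08),
        "Tenure": (-0.2, -0.01),
    },
}


def sign_conflicts(model):
    """Variables whose coefficient sign disagrees with their correlation sign."""
    return [
        name
        for name, (correlation, coefficient) in model.items()
        if correlation * coefficient < 0
    ]


conflicts = {name: sign_conflicts(model) for name, model in models.items()}
```

In the first equation Live in Ontario is flagged, and in the second # of products is flagged: each has a positive correlation with response but enters the equation with a negative coefficient, a pattern often associated with multicollinearity.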

34 Continuing to build the model After observing the correlation results and EDAs, what can we begin to do at this point?
– Derive new variables: EDAs
– Derive new variables: multicollinearity
– Derive new variables: Factor Analysis
– Derive new variables: CHAID (will explore later)
Reference material: Factor Analysis: look up in any statistics handbook. Regression: look up in a textbook under Regression and Statistics Regression.

35 Continuing to build the model Running further statistical routines, we are able to develop a final model. The marketer or business person should receive a report that looks as follows: For those of you that have statistics training, how is the % Contribution to model calculated?

36 Continuing to Build the Model [Table; columns: Variable Entered, Partial R-Square, Model R-Square. Six variable rows; values not recovered.]
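
One common convention, assumed here rather than taken from the lecture, is to express each variable's % contribution as its partial R-square divided by the final model R-square. The variable names and partial R-square values below are hypothetical, since the slide's own figures were not recovered.

```python
# Hypothetical stepwise output: (variable entered, partial R-square).
stepwise = [("var1", 0.10), ("var2", 0.05), ("var3", 0.03), ("var4", 0.02)]


def contributions(steps):
    """Percent contribution per variable: partial R-square over model R-square.

    Assumes the final model R-square equals the sum of the partial
    R-squares, as in a typical stepwise regression summary.
    """
    model_r2 = sum(partial for _, partial in steps)
    return {name: 100.0 * partial / model_r2 for name, partial in steps}


pct = contributions(stepwise)
for name, value in pct.items():
    print(f"{name}: {value:.1f}%")
```

Under this convention the contributions always sum to 100%, which makes the report easy for a marketer to read even when the model R-square itself is small.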

37 Continuing to Build the Model What would be the final equation in terms of the sign?

38 Continuing to build the model What would you do here?

39 Continuing to build the model Suppose we have the following equation: Response= X Income +.06 X Tenure +.08 X Product Spend -.04 X Male What is the problem here?