Correlation and Regression Analysis: Lecture 11
Concepts About Relationships: Presence, Nature, Direction, Strength of Association
Relationship Presence: assesses whether a systematic relationship exists between two or more variables. If we find statistical significance between the variables, we say a relationship is present.
Nature of Relationships: Relationships between variables typically are described as either linear or nonlinear. Linear relationship = a “straight-line association” between two or more variables. Nonlinear relationship = often referred to as curvilinear; it is best described by a curve instead of a straight line.
Direction of Relationship: The direction of a relationship can be either positive or negative. Positive relationship = when one variable increases, e.g., loyalty to employer, then so does another related one, e.g., effort put forth for employer. Negative relationship = when one variable increases, e.g., satisfaction with job, then a related one decreases, e.g., likelihood of searching for another job.
Strength of Association: When a consistent and systematic relationship is present, the researcher must determine the strength of association. The strength ranges from very strong to slight.
Covariation: exists when one variable consistently and systematically changes relative to another variable. The correlation coefficient is used to assess this linkage.
Zero Correlation = the value of Y does not increase or decrease with the value of X.
Positive Correlation = when the value of X increases, the value of Y also increases. When the value of X decreases, the value of Y also decreases.
Negative Correlation = when the value of X increases, the value of Y decreases. When the value of X decreases, the value of Y increases.
Exhibit 11-1 Rules of Thumb about Correlation Coefficient Size
Coefficient Range       Strength of Association
+/–.91 to +/–1.00       Very Strong
+/–.71 to +/–.90        High
+/–.41 to +/–.70        Moderate
+/–.21 to +/–.40        Small
+/–.01 to +/–.20        Slight
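As a rough illustration, the rules of thumb in Exhibit 11-1 can be written as a small lookup. This is only a sketch: the function name, the rounding of the coefficient to two decimals, and the "None" label for a coefficient of exactly zero are assumptions that do not come from the exhibit.

```python
def strength_of_association(r: float) -> str:
    """Map a correlation coefficient to the Exhibit 11-1 strength label."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    # The exhibit's ranges are stated to two decimals, so round first
    # (an assumption about how borderline values should be binned).
    size = round(abs(r), 2)
    if size >= 0.91:
        return "Very Strong"
    if size >= 0.71:
        return "High"
    if size >= 0.41:
        return "Moderate"
    if size >= 0.21:
        return "Small"
    if size >= 0.01:
        return "Slight"
    return "None"  # hypothetical label; the exhibit has no row for zero


# The two coefficients reported later in the lecture exhibits.
print(strength_of_association(-0.585))
print(strength_of_association(-0.801))
```

The sign is ignored on purpose: the table describes the size of the coefficient, while the sign only gives the direction of the relationship.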
Pearson Correlation: The Pearson correlation coefficient measures the linear association between two metric variables. It ranges from –1.00 to +1.00, with zero representing absolutely no linear association. The larger the absolute value of the coefficient, the stronger the linkage; the smaller the absolute value, the weaker the relationship.
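To make the definition concrete, here is a minimal sketch of the Pearson coefficient computed from its textbook formula (the covariation of X and Y divided by the product of their spreads). The function name and the five data points are made up for illustration and are not taken from the lecture data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations: the covariation of X and Y.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores: y tends to rise with x, so r is positive.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 1, 4, 3, 5]
print(round(pearson_r(x_scores, y_scores), 2))
```

Reversing the order of `y_scores` would flip the sign of the result, which corresponds to a negative relationship.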
Coefficient of Determination: The coefficient of determination is the square of the correlation coefficient, or r². It ranges from 0.00 to 1.00 and is the proportion of variation in one variable explained by one or more other variables.
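A short sketch of the squaring step, using two coefficients that appear in the lecture exhibits (the correlation of -.585 between work group cooperation and intention to search, and the R of .513 for Samouel's bivariate regression). The function name is an assumption.

```python
def coefficient_of_determination(r: float) -> float:
    """Square a correlation coefficient to get the share of variation explained."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# Squaring removes the sign, so a negative correlation still explains variation.
print(round(coefficient_of_determination(-0.585), 3))
print(round(coefficient_of_determination(0.513), 3))
```

The second value rounds to .263, which matches the R Square reported for Samouel's in the Model Summary.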
Exhibit 11-3 Bivariate Correlation Between Work Group Cooperation and Intention to Search for Another Job. Descriptive Statistics: Mean, Standard Deviation, and N for X4 – Work Group Cooperation and X16 – Intention to Search.
Exhibit 11-3 Bivariate Correlation Between Work Group Cooperation and Intention to Search for Another Job. Correlations: Pearson correlation between X4 – Work Group Cooperation and X16 – Intention to Search = -.585*; Sig. (2-tailed) = .000; N = 63.
* Coefficient is significant at the 0.01 level (2-tailed).
Exhibit 11-5 Bar Charts for Rankings for Food Quality and Atmosphere
Exhibit 11-4 Correlation of Food Quality and Atmosphere Using Spearman’s rho. Spearman’s rho between X13 – Food Quality Ranking and X14 – Atmosphere Ranking: Correlation Coefficient = -.801*; Sig. (2-tailed) = .000; N = 200.
* Coefficient is significant at the 0.01 level (2-tailed).
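Spearman's rho is the Pearson correlation applied to ranks rather than raw scores, which is why it suits ranking data like X13 and X14. The sketch below shows that idea under stated assumptions: tied values receive the average of their ranks, the function names are hypothetical, and the four rankings are invented rather than taken from the customer survey.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical rankings from four respondents: whoever ranks food quality
# high ranks atmosphere low, so rho is strongly negative.
food_quality_rank = [1, 2, 3, 4]
atmosphere_rank = [4, 3, 2, 1]
print(round(spearman_rho(food_quality_rank, atmosphere_rank), 2))
```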
Exhibit 11-6 Customer Rankings of Restaurant Selection Factors. Statistics (N Valid, N Missing, Median, Minimum, Maximum) for X13 – Food Quality Ranking, X14 – Atmosphere Ranking, X15 – Prices Ranking, and X16 – Employees Ranking. N Valid = 200; N Missing = 0 for all four variables; Minimum = 2, 2, 1, 1 respectively.
Exhibit 11-7 Classification of Statistical Techniques
Exhibit 11-8 Definitions of Statistical Techniques
ANOVA (analysis of variance) is used to examine statistical differences between the means of two or more groups. The dependent variable is metric and the independent variable(s) is nonmetric. One-way ANOVA has a single nonmetric independent variable and two-way ANOVA can have two or more nonmetric independent variables.
Bivariate regression has a single metric dependent variable and a single metric independent variable.
Cluster analysis enables researchers to place objects (e.g., customers, brands, products) into groups so that objects within the groups are similar to each other. At the same time, objects in any particular group are different from objects in all other groups.
Correlation examines the association between two metric variables. The strength of the association is measured by the correlation coefficient.
Conjoint analysis enables researchers to determine the preferences individuals have for various products and services, and which product features are valued the most.
Exhibit 11-8 Definitions of Statistical Techniques (continued)
Discriminant analysis enables the researcher to predict group membership using two or more metric independent variables. The group membership variable is a nonmetric dependent variable.
Factor analysis is used to summarize the information from a large number of variables into a much smaller number of variables or factors. This technique is used to combine variables, whereas cluster analysis is used to identify groups with similar characteristics.
Logistic regression is a special type of regression that can have a nonmetric/categorical dependent variable.
Multiple regression has a single metric dependent variable and several metric independent variables.
MANOVA is similar to ANOVA, but it can examine group differences across two or more metric dependent variables at the same time.
Perceptual mapping uses information from other statistical techniques to map customer perceptions of products, brands, companies, and so forth.
Exhibit Bivariate Regression of Satisfaction and Food Quality
Descriptive Statistics
X25 – Competitor    Variable              Mean
Samouel’s           X17 – Satisfaction    4.78
                    X1 – Food Quality     5.24
Gino’s              X17 – Satisfaction    5.96
                    X1 – Food Quality
Exhibit Bivariate Regression of Satisfaction and Food Quality (continued)
Model Summary
X25 – Competitor    Model    R        R Square
Samouel’s           1        .513*    .263
Gino’s              1        .331*
* Predictors: (Constant), X1 – Excellent Food Quality
Exhibit Other Aspects of Bivariate Regression. For Model 1 of each X25 – Competitor group (Samouel’s, Gino’s): Sum of Squares, Mean Square, F, and Sig. for the Regression*, Residual, and Total rows.
* Predictors: (Constant), X1 – Excellent Food Quality
Dependent Variable: X17 – Satisfaction
Exhibit Other Aspects of Bivariate Regression (continued). Coefficients* for Model 1 of each X25 – Competitor group (Samouel’s, Gino’s): Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, and Sig.
* Dependent Variable: X17 – Satisfaction
Calculating the “Explained” and “Unexplained” Variance in Regression: The unexplained variance in regression, referred to as residual variance, is calculated by dividing the residual sum of squares by the total sum of squares. For example, in Exhibit 11-11, divide the residual sum of squares for Samouel’s by the total sum of squares and you get .737. This tells us that a lot of the variance (73.7%) in the dependent variable is not explained by this regression equation. The explained variance in regression, referred to as r², is calculated by dividing the regression sum of squares by the total sum of squares. For example, in Exhibit 11-11, divide the regression sum of squares for Samouel’s of 35.001 by the total sum of squares and you get .263.
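The two ratios can be sketched on a tiny data set. Everything below is illustrative: the function name is hypothetical, the five observations are invented (they are not the Samouel's survey data), and the line is fitted by ordinary least squares.

```python
def variance_shares(x, y):
    """Fit y = a + b*x by least squares; return (explained, unexplained) shares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)
    if ss_x == 0:
        raise ValueError("x must vary to fit a regression line")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / ss_x
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]

    # Total = Regression + Residual sums of squares.
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    ss_residual = sum((b - p) ** 2 for b, p in zip(y, predicted))
    if ss_total == 0:
        raise ValueError("y must vary for the shares to be defined")
    return ss_regression / ss_total, ss_residual / ss_total


# Invented data for illustration only.
explained, unexplained = variance_shares([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(round(explained, 2), round(unexplained, 2))
```

Because the regression and residual sums of squares add up to the total, the two shares always sum to 1, just as .263 and .737 do for Samouel's.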
How to calculate the t-value? The t-value is calculated by dividing the regression coefficient by its standard error. In the Coefficients table, if you divide the Unstandardized Coefficient for Samouel’s of .459 by the Standard Error of .078, the result is the calculated t-value. Note that the t-value reported in the table differs slightly from this calculated value. The difference arises because the computer reports rounded numbers for the Unstandardized Coefficient and the Standard Error, while the t-value itself is calculated from the unrounded numbers.
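The division can be checked directly with the two rounded figures quoted on the slide (.459 and .078). The function name is an assumption, and the result is the hand-calculated value, not the figure SPSS prints.

```python
def t_value(coefficient: float, standard_error: float) -> float:
    """t statistic for a regression coefficient: estimate / standard error."""
    if standard_error <= 0:
        raise ValueError("the standard error must be positive")
    return coefficient / standard_error


# Rounded inputs from the Coefficients table for Samouel's.
print(round(t_value(0.459, 0.078), 3))  # 5.885
```

Since both inputs are already rounded to three decimals, a small gap between this result and the software's reported t-value is expected.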
How to interpret the regression coefficient? The regression coefficient of .459 for Samouel’s X1 – Food Quality reported in the Coefficients table is interpreted as follows: “... for every unit that X1 increases, X17 will increase by .459 units.” Recall that in this example X1 is the independent (predictor) variable and X17 is the dependent variable.
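The interpretation can be sketched as a prediction difference. The slope of .459 comes from the slide; the intercept of 2.0 used below is a made-up placeholder (the constant is not given in the text), and the function names are hypothetical. The intercept cancels out when two predictions are compared, so it does not affect the result.

```python
SLOPE_FOOD_QUALITY = 0.459  # unstandardized coefficient for X1 (from the slide)


def predict_satisfaction(food_quality: float, intercept: float,
                         slope: float = SLOPE_FOOD_QUALITY) -> float:
    """Predicted X17 (satisfaction) from X1 (food quality) on a fitted line."""
    return intercept + slope * food_quality


def predicted_change(delta_x: float, slope: float = SLOPE_FOOD_QUALITY) -> float:
    """Change in predicted X17 when X1 changes by delta_x units."""
    return slope * delta_x


# Hypothetical intercept, chosen only so the example runs.
assumed_intercept = 2.0
difference = (predict_satisfaction(6, assumed_intercept)
              - predict_satisfaction(5, assumed_intercept))
print(round(difference, 3))           # one-unit increase in X1
print(round(predicted_change(2), 3))  # two-unit increase in X1
```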
Exhibit Multiple Regression of Return in Future and Food Independent Variables
Descriptive Statistics
X25 – Competitor    Variable                            Mean
Samouel’s           X18 – Return in Future              4.37
                    X1 – Excellent Food Quality         5.24
                    X4 – Excellent Food Taste           5.16
                    X9 – Wide Variety of Menu Items     5.45
Gino’s              X18 – Return in Future              5.55
                    X1 – Excellent Food Quality         5.81
                    X4 – Excellent Food Taste           5.73
                    X9 – Wide Variety of Menu Items
Exhibit Multiple Regression of Return in Future and Food Independent Variables (continued)
Model Summary
X25 – Competitor    Model    R        R Square    Adjusted R Square
Samouel’s           1        .512*
Gino’s              1        .482*
* Predictors: (Constant), X9 – Wide Variety of Menu Items, X1 – Excellent Food Quality, X4 – Excellent Food Taste
Dependent Variable: X18 – Return in Future
Exhibit Other Information for Multiple Regression Models. ANOVA for Model 1 of each X25 – Competitor group (Samouel’s, Gino’s): Sum of Squares, Mean Square, F, and Sig. for the Regression*, Residual, and Total rows.
* Predictors: (Constant), X9 – Wide Variety of Menu Items, X1 – Excellent Food Quality, X4 – Excellent Food Taste
Dependent Variable: X18 – Return in Future
Exhibit Other Information for Multiple Regression Models (continued). Coefficients* for Model 1 of each X25 – Competitor group (Samouel’s, Gino’s): Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, and Sig. for (Constant), X1 – Exc. Food Quality, X4 – Exc. Food Taste, and X9 – Wide Variety.
* Dependent Variable: X18 – Return in Future
Exhibit Summary Statistics for Employee Regression Model. Model Summary: R, R Square, and Adjusted R Square for Model 1. ANOVA: Sum of Squares, Mean Square, F, and Sig. for the Regression, Residual, and Total rows.
* Predictors: (Constant), X12 – Benefits Reasonable, X9 – Pay Reflects Effort, X1 – Paid Fairly
Dependent Variable: X14 – Effort
Exhibit Coefficients for Employee Regression Model. Coefficients* for Model 1: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig., and Collinearity Statistics (Tolerance, VIF) for (Constant), X1 – Paid Fairly, X9 – Pay Reflects Effort, and X12 – Benefits Reasonable.
* Dependent Variable: X14 – Effort
Exhibit Bivariate Correlations of Effort and Compensation Variables. Pearson Correlations among X14 – Effort, X1 – Paid Fairly, X9 – Pay Reflects Effort, and X12 – Reasonable Benefits.
Exhibit Bivariate Correlations of Effort and Compensation Variables (continued). Statistical Significance of Pearson Correlations (1-tailed) among X14 – Effort, X1 – Paid Fairly, X9 – Pay Reflects Effort, and X12 – Reasonable Benefits.
Exhibit Stepwise Regression Based on Samouel’s Customer Survey. Model Summary: R, R Square, Adjusted R Square, and Std. Error of the Estimate.
* Predictors: (Constant), X1 – Excellent Food Quality, X6 – Friendly Employees
Dependent Variable: X17 – Satisfaction
Exhibit Stepwise Regression Based on Samouel’s Customer Survey (continued). ANOVA: Sum of Squares, Mean Square, F, and Sig. for the Regression, Residual, and Total rows at each step of the stepwise procedure.
* Predictors: (Constant), X1 – Excellent Food Quality, X6 – Friendly Employees
Dependent Variable: X17 – Satisfaction
Exhibit Means and Correlations for Selected Variables from Samouel’s Customer Survey
Descriptive Statistics
Variable                            Mean
X17 – Satisfaction                  4.78
X1 – Excellent Food Quality         5.24
X4 – Excellent Food Taste           5.16
X9 – Wide Variety of Menu Items     5.45
X6 – Friendly Employees             2.89
X11 – Courteous Employees           1.96
X12 – Competent Employees
Exhibit Independent Variables in Stepwise Regression Model. Variables Entered, in order: X1 – Excellent Food Quality, then X6 – Friendly Employees. Variables Removed: none. Method: Stepwise (Criteria: Probability-of-F-to-enter = …).
* Predictors: (Constant), X1 – Excellent Food Quality, X6 – Friendly Employees
Dependent Variable: X17 – Satisfaction
Exhibit Coefficients for Stepwise Regression Model. Coefficients*: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig., and Collinearity Statistics (Tolerance, VIF). The first step contains (Constant) and X1 – Excellent Food Quality; the next step contains (Constant), X1 – Excellent Food Quality, and X6 – Friendly Employees.
* Dependent Variable: X17 – Satisfaction