Testing Multivariate Assumptions


Slide 1 Testing Multivariate Assumptions

The multivariate statistical techniques which we will cover in this class require one or more of the following assumptions about the data: normality of the metric variables, homoscedastic relationships between the dependent variable and the metric and nonmetric independent variables, linear relationships between the metric variables, and absence of correlated prediction errors.

Multivariate analysis requires that the assumptions be tested twice: first for the separate variables as we prepare for the analysis, and second for the multivariate variate, which combines the variables and must meet the same assumptions as the individual variables. In this section, we examine the tests that we normally perform prior to computing the multivariate statistic. Since the pattern of prediction errors cannot be examined without computing the multivariate statistic, we defer that discussion until we examine each of the specific techniques.

If the data fail to meet an assumption required by the analysis, we can attempt to correct the problem with a transformation of the variable. There are two classes of transformations: for violations of normality and homoscedasticity, we transform the individual metric variable to an inverse, logarithmic, or squared form; for violations of linearity, we either apply a power transformation (e.g., raise the data to a squared or square-root power) or add a polynomial variable that contains a power term.
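The transformation families named above can be tried directly. The sketch below is a minimal illustration in Python (standing in for SPSS COMPUTE commands) on a hypothetical right-skewed variable, not the actual HATCO data; it applies each form and reports how the skewness changes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical right-skewed metric variable standing in for a HATCO variable.
x = rng.lognormal(mean=2.0, sigma=0.6, size=100)

# The transformation family for normality/homoscedasticity violations.
# Inverse and logarithm require strictly positive values; if the variable
# can be zero or negative, add a constant before transforming.
# (A squared form is typically tried for negatively skewed variables.)
candidates = {
    "square root": np.sqrt(x),
    "logarithm":   np.log(x),
    "inverse":     1.0 / x,
    "squared":     x ** 2,
}

print(f"raw skewness: {stats.skew(x):+.3f}")
for name, t in candidates.items():
    print(f"{name:12s} skewness: {stats.skew(t):+.3f}")
```

For this kind of right-skewed variable, the logarithm usually pulls the skewness closest to zero, which previews the trial-and-error comparison the next slide describes.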

Slide 2 Testing Multivariate Assumptions - 2

Transforming variables is a trial-and-error process: we apply the transformation and then check whether it has corrected the problem with the data. It is not usually possible to be certain in advance that a transformation will correct the problem; sometimes it only reduces the degree of the violation. Even when a transformation would decrease the violation of an assumption, we might opt not to include it in the analysis because of the complexity it adds to the interpretation and discussion of the results.

One transformation often solves multiple problems. For example, skewed variables can produce violations of both normality and homoscedasticity; no matter which test of assumptions identified the violation, the remedy is a transformation of the metric variable to reduce the skewness.

Slide 3 1. Evaluating the Normality of Metric Variables

Whether the distribution of values for a metric variable complies with the definition of a normal curve is tested with histograms, normality plots, and statistical tests.

The histogram shows the relative frequency of different ranges of values for the variable. If the variable is normally distributed, we expect the greatest frequency of values to occur in the center of the distribution, with decreasing frequency for values away from the center. In addition, a normally distributed variable will be symmetric, showing the same proportion of cases in the left and right tails of the distribution.

In a normality plot in SPSS, the actual distribution of cases is plotted in red against the distribution of cases that would be expected if the variable were normally distributed, plotted as a green line. Our conclusion about normality is based on the convergence or divergence between the plotted red points and the green line.

There are two statistical tests for normality: the Kolmogorov-Smirnov statistic with the Lilliefors correction for variables with 50 cases or more, and the Shapiro-Wilk test for variables with fewer than 50 cases. SPSS computes the test appropriate to the sample size. These tests are regarded as sensitive to violations of normality, especially in large samples, so we should examine the histogram and normality plot for confirmation of a distribution problem.

The statistical test for normality is a test of the null hypothesis that the distribution is normal. The desirable outcome is a significance value greater than 0.05, so that we fail to reject the null hypothesis and conclude that the variable meets the normality assumption. If the significance value is smaller than 0.05, we reject the null hypothesis of normality and see whether a transformation of the variable can induce normality.
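The decision rule above (Shapiro-Wilk below 50 cases, Kolmogorov-Smirnov at 50 or more, with p > 0.05 as the desirable outcome) can be sketched outside SPSS. This is a hedged Python stand-in: plain `scipy.stats.kstest` does not apply the Lilliefors correction SPSS uses when the mean and standard deviation are estimated from the data (statsmodels' `lilliefors` function is the closer match), so it is shown here only to illustrate the logic:

```python
import numpy as np
from scipy import stats

def normality_test(values, alpha=0.05):
    """Choose the test by sample size as the slides describe: Shapiro-Wilk
    for fewer than 50 cases, Kolmogorov-Smirnov for 50 or more."""
    values = np.asarray(values, dtype=float)
    if len(values) < 50:
        stat, p = stats.shapiro(values)
        name = "Shapiro-Wilk"
    else:
        # Standardize and compare against N(0, 1).  Note: plain kstest lacks
        # the Lilliefors correction, so its p-value is only approximate here.
        z = (values - values.mean()) / values.std(ddof=1)
        stat, p = stats.kstest(z, "norm")
        name = "Kolmogorov-Smirnov"
    # Failing to reject (p > alpha) is the desirable outcome: the variable
    # is treated as meeting the normality assumption.
    return name, stat, p, p > alpha

rng = np.random.default_rng(1)
print(normality_test(rng.normal(size=100)))  # 50+ cases  -> K-S
print(normality_test(rng.normal(size=30)))   # < 50 cases -> Shapiro-Wilk
```

As the slide cautions, a significant result from either test should be confirmed against the histogram and normality plot before transforming the variable.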

Slide 4 Requesting Statistics to Test Normality

Slide 5 Requesting the Plot to Test Normality

Slide 6 Output for the Statistical Tests of Normality

Slide 7 The Histogram for Delivery Speed (X1)

Slide 8 The Normality Plot for Delivery Speed (X1)

Slide 9 The Histogram for Price Level (X2)

Slide 10 The Normality Plot for Price Level (X2)

Slide 11 Transformations to Induce Normality

Slide 12 Computing the Square Root Transformation for Price Level

Slide 13 Request the Normality Analysis for the Transformed Price Level Variable

Slide 14 The K-S Lilliefors Test for the Transformed Price Level Variable

Slide 15 The Histogram for the Transformed Price Level Variable

Slide 16 The Normality Plot for the Transformed Price Level Variable

Slide 17 The Histogram for Price Flexibility (X3)

Slide 18 The Normality Plot for Price Flexibility (X3)

Slide 19 Computing the Square Root Transformation for Price Flexibility

Slide 20 Computing the Logarithmic Transformation for Price Flexibility

Slide 21 Computing the Inverse Transformation for Price Flexibility

Slide 22 Request the Explore Command for the Three Transformed Variables

Slide 23 The K-S Lilliefors Tests for the Transformed Variables

Slide 24 2. Evaluating Homogeneity of Variance for Non-metric Variables

The Levene statistic tests for equality of variance across the subgroups of a non-metric variable. The null hypothesis is that the variance of each subgroup is the same, so the desired outcome is a failure to reject the null hypothesis. If we do reject the null hypothesis and conclude that the variance of at least one subgroup differs, we can use a special formula for computing the test statistic where one exists (as we do with t-tests), or we can apply one of the transformations used to induce normality to the metric variable.

While the Levene statistic is available through several statistical procedures in SPSS, we can obtain it for any number of groups using the One-way ANOVA procedure. We will demonstrate this test by checking the homogeneity of variance for the metric variables 'Delivery Speed', 'Price Level', 'Price Flexibility', 'Manufacturer Image', 'Service', 'Salesforce Image', 'Product Quality', 'Usage Level', and 'Satisfaction Level' across the subgroups of the non-metric variable 'Firm Size'.
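The same Levene test that SPSS reports through the One-way ANOVA procedure is available in scipy. This sketch uses hypothetical data for a metric variable split by a two-level grouping variable, not the HATCO file:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical metric variable split by the subgroups of a non-metric
# variable such as Firm Size (illustrative data, not the HATCO data set).
small_firms = rng.normal(loc=5.0, scale=1.0, size=40)
large_firms = rng.normal(loc=5.0, scale=3.0, size=40)

# Null hypothesis: the subgroup variances are equal.  As with the
# normality tests, failing to reject (p > 0.05) is the desired outcome.
stat, p = stats.levene(small_firms, large_firms)
print(f"Levene W = {stat:.3f}, p = {p:.4f}")
if p < 0.05:
    print("Variances are heterogeneous; consider a corrective transformation.")
```

`stats.levene` accepts any number of group arrays, so the same call extends to a non-metric variable with more than two subgroups.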

Slide 25 Requesting a One-way ANOVA

Slide 26 Request the Levene Homogeneity of Variance Test

Slide 27 The Tests of Homogeneity of Variances

Slide 28 Compute the Transformed Variables for 'Manufacturer Image' (x4)

Slide 29 Request the Levene Test for the Transformed Manufacturer Image Variables

Slide 30 Levene Test Results for the Transformed Manufacturer Image Variables

The results of the Levene tests of homogeneity of variances indicate that none of the transformations is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Manufacturer Image. We would note this problem in the statement of the limitations of our analysis.

Slide 31 Compute the Transformed Variables for 'Product Quality' (x7)

Slide 32 Request the Levene Test for the Transformed Product Quality Variables

Slide 33 Results of the Levene Test for the Transformed Product Quality Variables

The results of the Levene tests of homogeneity of variances indicate that either the logarithmic transformation or the square root transformation is effective in resolving the homogeneity of variance problem for the subgroups of Firm Size on the variable Product Quality.
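The comparison carried out across slides 31-33 — transform the variable several ways, then re-run the Levene test on each version — can be sketched as follows. The data are hypothetical (a positive, right-skewed variable whose spread grows with its level, mimicking the Product Quality situation), not the HATCO values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical positive, right-skewed variable whose spread grows with its
# level across two subgroups of a non-metric variable.
group1 = rng.lognormal(mean=1.0, sigma=0.3, size=50)
group2 = rng.lognormal(mean=2.0, sigma=0.3, size=50)

transforms = {
    "raw":         lambda v: v,
    "square root": np.sqrt,
    "logarithm":   np.log,
}

# Re-run Levene's test on each candidate; a transformation "works" when it
# yields a non-significant result (variances treated as homogeneous).
for name, f in transforms.items():
    stat, p = stats.levene(f(group1), f(group2))
    verdict = "homogeneous" if p > 0.05 else "heterogeneous"
    print(f"{name:12s} Levene p = {p:.4f} -> {verdict}")
```

This mirrors the trial-and-error process from Slide 2: the raw variable fails the test, and the variance-stabilizing transformations move the significance value toward the desired non-significant region.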

Slide 34 3. Evaluate Linearity and Homoscedasticity of Metric Variables with Scatterplots

Other assumptions required for multivariate analysis focus on the relationships between pairs of metric variables: the relationship is assumed to be linear, and the variance is assumed to be homogeneous through the range of both variables. If both the linearity and the homoscedasticity assumptions are met, the points will appear as a rectangular band in a scatterplot. If the relationship between the variables is strong, the band will be narrow; if the relationship is weaker, the band becomes broader. If the pattern of points is curved instead of rectangular, the assumption of linearity is violated. If the band of points is narrower at one end than at the other (funnel-shaped), the assumption of homogeneity of variance is violated.

Violations of linearity and homoscedasticity may be correctable through transformation of one or both variables, similar to the transformations employed for violations of the normality assumption. A diagnostic graphic with recommended transformations is available in the text on page 77.

SPSS provides a scatterplot matrix as a diagnostic tool for examining the linearity and homoscedasticity of a set of metric variables. If greater detail is required, a bivariate scatterplot is available for pairs of variables. We will request a scatterplot matrix for the eight metric variables from the HATCO data set shown in the scatterplot matrix on page 43 of the text. None of the relationships in this scatterplot matrix shows a serious problem with linearity or heteroscedasticity, so this exercise will not afford an opportunity to examine transformations. Examples of transformations to achieve linearity are included in the next set of exercises, titled A Further Look at Transformations.
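The funnel shape described above can also be checked numerically rather than by eye. This sketch fabricates a hypothetical heteroscedastic relationship, fits the straight line, and applies a Glejser-style check (correlating the absolute residuals with the predictor); it is a rough complement to the scatterplot, not a substitute for inspecting it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.uniform(1.0, 10.0, size=200)

# Hypothetical funnel-shaped relationship: the noise grows with x, which a
# scatterplot would show as a band that widens from left to right.
y = 2.0 + 0.5 * x + rng.normal(scale=0.3 * x)

# Fit the straight line and inspect the residuals numerically.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# Glejser-style check: if |residuals| correlate with x, the band of points
# is narrower at one end than the other -- the funnel shape described above.
r, p = stats.pearsonr(x, np.abs(residuals))
print(f"corr(|residual|, x) = {r:.3f}, p = {p:.4f}")
```

A significant positive correlation here flags the same heteroscedasticity the scatterplot matrix would reveal visually, and the remedy is the same: transform one or both variables.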

Slide 35 Requesting the Scatterplot Matrix

Slide 36 Specify the Variables to Include in the Scatterplot Matrix

Slide 37 Add Fit Lines to the Scatterplot Matrix

Slide 38 Requesting the Fit Lines

Slide 39 Changing the Thickness of the Fit Lines

Slide 40 Changing the Color of the Fit Lines

Slide 41 The Final Scatterplot Matrix