Appendix I: A Refresher on Some Statistical Terms and Tests
Chapter Objectives
–Provide a ‘refresher’ of some statistical terms and tests
–Explain what types of analysis are appropriate, under what conditions and for what objectives
–Give examples of SPSS computer output
–Explain descriptive statistics, including frequencies, means, standard deviations and variance
–Present a process for statistical hypothesis testing using a computer package
–Demonstrate how inferential statistics can be used to test hypotheses
Statistics
Descriptive Statistics
–Help to describe the phenomena of interest
Inferential Statistics
–Help to draw conclusions from the analysis of data
Parametric
–Assumes the sample is drawn from a normally distributed population
Non-parametric
–Makes no assumption that the population is normally distributed
Properties of the Four Measurement Scales Note: The interval scale has 1 as an arbitrary starting point. The ratio scale has the natural origin 0, which is meaningful.
Sample Questionnaire for Data Analysis
Business Research Class Questionnaire
The purpose of this short questionnaire is to collect some nominal, ordinal, interval and ratio data that can be used to demonstrate some of the basic statistical methods for analysing quantitative data. The individual responses will be anonymous and the data collected will be used only for class exercises. Please tick the appropriate box, provide the data requested or circle a number, where appropriate.
1. What is your gender? Female / Male
2. Please indicate your height to the nearest centimetre (cm)
3. Please indicate your weight to the nearest kilogram (kg)
4. What is the colour of your eyes? (just tick one box please!) Blue / Brown / Other
5. Please indicate the extent to which you disagree or agree with the following statements (Strongly disagree / Disagree / Neutral / Agree / Strongly agree):
–Statistics is interesting
–Statistics knowledge is useful in business
Many thanks for your time and assistance with completing this questionnaire. We will now proceed to analyse the data collected!
Variable Names, Labels and Values for Sample Data Set
Example of SPSS Data Editor Input Data – Data View
Example of SPSS Data Editor Input Data – Variable View
Descriptive Statistics
Frequencies
Measures of central tendency
–mean, median, mode
Measures of dispersion
–range, variance, standard deviation
–other measures
–standard error of the mean
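As a rough illustration of frequencies and the measures of central tendency, the following Python sketch uses only the standard library. The `responses` list is made-up data for a five-point agreement item, not the class data set, and the variable names are assumptions for this example.

```python
from collections import Counter
import statistics

# Hypothetical answers to a 5-point item
# (1 = strongly disagree ... 5 = strongly agree); not real class data.
responses = [4, 5, 3, 4, 2, 4, 5, 3, 4, 1]

frequencies = Counter(responses)        # how often each answer occurs
mean = statistics.mean(responses)       # arithmetic average
median = statistics.median(responses)   # middle value of the sorted answers
mode = statistics.mode(responses)       # most frequent answer

# Print a simple frequency table, then the three measures.
for value in sorted(frequencies):
    print(value, frequencies[value])
print("mean:", mean, "median:", median, "mode:", mode)
```

For ordinal items such as this one, the median and mode are usually the safer summaries; the mean is shown only because SPSS reports it as well.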
Example – Responses to the statement ‘Statistics is interesting’
The Mean
Range Represents the difference between the highest and lowest values of a variable of interest. Eg, max = 50, min = 30, range = 20
Variance Formula Note: This formula is correct. The formula in the book is incorrect
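The formula itself is not reproduced in this text. A minimal sketch, assuming the usual sample variance s² = Σ(x − x̄)² / (n − 1), is given below in Python. The data are hypothetical and chosen so that the minimum and maximum match the range example (30 and 50).

```python
import math

# Hypothetical data; min = 30 and max = 50 as in the range example.
values = [30, 35, 40, 45, 50]

n = len(values)
mean = sum(values) / n
value_range = max(values) - min(values)

# Sample variance: squared deviations from the mean, divided by n - 1.
# (Dividing by n instead would give the population variance.)
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std_dev = math.sqrt(variance)  # standard deviation is the square root

print(value_range, variance, round(std_dev, 3))
```

The standard deviation is reported in the same units as the data, which is why it is usually easier to interpret than the variance.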
Area under the Normal Curve
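The proportions of the area lying within one, two and three standard deviations of the mean can be checked with a short Python sketch. It uses `statistics.NormalDist` from the standard library; the helper name `area_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

def area_within(k):
    """Proportion of the area within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The same function with k = 1.96 gives an area of about 0.95, which is where the familiar 95% confidence level comes from.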
Box and Whisker Plots
Normal, Skewed and Sampling Distributions Source: Adapted from Zikmund (2000:381).
Standard Error of the Mean
–When a number of samples are taken from the population, the sample means form a distribution.
–The standard deviation of these sample means is called the standard error of the mean.
–As the sample size increases, the standard error gets smaller.
Standard Error of the Mean - Formula
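The formula is not reproduced in this text. A minimal sketch, assuming the usual estimate (sample standard deviation divided by the square root of the sample size), might look as follows in Python. The function name and the data are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical data: the larger sample repeats the same values four times,
# so the spread is similar but n is four times as big.
small = [30, 35, 40, 45, 50]
large = small * 4

print(round(standard_error(small), 4))
print(round(standard_error(large), 4))
```

The second value is smaller than the first, which illustrates that the standard error shrinks as the sample size grows.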
Example of SPSS output of Descriptive Statistics
Inferential Statistics
Helps to draw inferences or conclusions from the analysis of the data, such as:
–The relationships between two variables
–Differences in variables among different subgroups
–How several independent variables might explain the variance in a dependent variable
Inferential Statistics Statistical hypothesis testing –The null and alternate hypotheses –Choosing a statistical test –Significance level Correlations
Process for Statistical Hypothesis Testing using a Computer
The null and alternate hypotheses Null hypothesis –the conjecture that postulates no differences or no relationship between or among variables Alternate hypothesis – an educated conjecture that sets the parameters one expects to find
Choosing a Statistical Test
–Parametric tests can be applied to interval and ratio data (and also ordinal data where they are expressed in numeric form and ‘interval’ features are present).
–Non-parametric tests are applied to categorical data, ie nominal and most ordinal data.
Classification of Statistical Tests
Significance level
–The probability of rejecting the null hypothesis when it is true (a Type I error)
–This probability is denoted α (alpha); it fixes the critical value that the test statistic must exceed for the null hypothesis to be rejected
–Significance level = 1 – confidence level
–Eg, significance level = 0.05 indicates confidence level = 0.95 (or 95%)
Hypothesis Testing and Statistical Decision Making
True state of the situation | Statistical decision: Accept H0 | Statistical decision: Reject H0
H0 is true | Correct (Probability = 1 – α) | Type I error (Probability = α)
H0 is false | Type II error (Probability = β) | Correct (Probability = 1 – β)
Relationship between Type I and II Errors Source: D. A. Aaker, V. Kumar and G. S. Day 1995, Marketing Research, 5th edition. New York: John Wiley & Sons, p. 473
Pearson Correlation
–Indicates the direction, strength and significance of the bivariate relationships between interval or ratio variables, eg:
H0: Role overload and performance are not related to each other. [r = 0]
HA: The two are significantly negatively correlated. [r < 0]
r = , p = ; r = -0.29, p = ; r = -0.33, p = 0.049
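To make the direction and strength of r concrete, here is a minimal Python sketch that computes the Pearson coefficient from its definition. The function name `pearson_r` and both data lists are hypothetical; they are not the role overload data behind the values quoted above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled to lie in [-1, 1]."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores: performance tends to fall as overload rises.
overload = [1, 2, 3, 4, 5]
performance = [9, 7, 8, 5, 4]

r = pearson_r(overload, performance)
print(round(r, 3))  # negative sign = negative relationship
```

The sign gives the direction and the absolute size gives the strength. The significance (the p value) additionally depends on the sample size, which is why a package such as SPSS reports r and p together.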
Scatter Diagrams of two Variables with different Correlation Coefficients
Procedure for Chi-square Test with SPSS
Step 1: Formulating the hypotheses
Step 2: Decision criterion
Step 3: Analyse data with computer package
Step 4: Make a statistical decision
Step 5: Interpret the decision
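The five steps above can be sketched in Python for a small crosstab. This is an illustration under stated assumptions rather than a reproduction of the SPSS procedure: the 2x2 table is made up, the function name `chi_square` is hypothetical, and the decision uses the tabulated critical value 3.841 (df = 1, α = 0.05) instead of a p value.

```python
def chi_square(table):
    """Return the chi-square statistic and degrees of freedom for a crosstab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Step 1: H0 = the two variables are independent; HA = they are related.
# Step 2: decision criterion, alpha = 0.05; critical value for df = 1 is 3.841.
CRITICAL_05_DF1 = 3.841

# Step 3: analyse the (hypothetical) data, rows = gender, columns = agree/disagree.
observed = [[20, 10], [10, 20]]
stat, df = chi_square(observed)

# Step 4: statistical decision.
reject_h0 = stat > CRITICAL_05_DF1

# Step 5: interpretation.
print(round(stat, 3), df, "reject H0" if reject_h0 else "do not reject H0")
```

With these made-up counts the statistic exceeds the critical value, so the null hypothesis of independence would be rejected at the 5% level.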
Example of SPSS Output for Crosstabs and Chi-square tests
Example of SPSS Output for Crosstabs and Chi-square tests (cont.)
t distribution
–Is suitable for analysing the means of small samples drawn from a population that is normally distributed
–The shape of the t distribution depends on the degrees of freedom (df)
t distribution formula
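The formula is not reproduced in this text. A minimal sketch, assuming the standard one-sample statistic t = (x̄ − μ) / (s / √n) with n − 1 degrees of freedom, is shown below in Python. The function name, the heights and the test value of 165 cm are all hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for testing the mean against mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - mu0) / standard_error
    return t, n - 1

# Hypothetical heights in cm, tested against an assumed population mean of 165.
heights = [168, 172, 175, 170, 165]
t, df = one_sample_t(heights, 165)
print(round(t, 3), df)
```

The resulting t is compared with the critical value for the relevant degrees of freedom. For df = 4 and a two-tailed test at α = 0.05 the tabulated value is 2.776, so a t above that would lead to rejecting the null hypothesis.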
Comparison of t distribution & normal curve
Example of SPSS output for single sample tests
Example of SPSS output for two independent samples t-tests
Example of SPSS output for one-way between groups ANOVA
Regression Analysis
–Explains the variance in the dependent variable by a set of predictors
–R-square (R²) is the proportion of variance explained
–Step-wise regression will indicate the order of importance of the significant predictors in the regression model
–The Beta weight of each predictor (independent variable) and its significance indicate how much that predictor contributes to explaining the variance in the dependent variable
A Simple Regression Model
General Form of Simple Regression Line
Y = a + bX
where:
–Y is the dependent variable
–X is the independent variable
–a is the intercept of the regression line on the Y (vertical) axis
–b is the slope of the regression line
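The intercept a and slope b of this line are normally estimated by least squares. The Python sketch below shows one way to do that and to compute R²; the function names and the height and weight figures are hypothetical, not values from the class data set.

```python
def fit_line(x, y):
    """Least-squares estimates of intercept a and slope b in Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

def r_squared(x, y, a, b):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total

# Hypothetical data: height in cm (X) and weight in kg (Y).
heights = [160, 165, 170, 175, 180]
weights = [55, 60, 63, 70, 72]

a, b = fit_line(heights, weights)
r2 = r_squared(heights, weights, a, b)
print(round(a, 2), round(b, 2), round(r2, 4))
```

Here b is read as the change in Y for a one-unit change in X, and R² as the proportion of variance in Y accounted for by X.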
Assumptions of Regression Analysis
Example of SPSS output for simple regression analysis
Example of SPSS output for simple regression analysis (cont.)
Factor Analysis helps to reduce a vast number of variables (for example all the questions tapping several variables of interest in a questionnaire) to a meaningful, interpretable and manageable set of factors
Output of a Factor Analysis for the Evaluation Questionnaire
Items under each Factor for the Evaluation Questionnaire
Items under each Factor for the Evaluation Questionnaire (cont.)
Multivariate Analysis
–Examines several variables and their relationships simultaneously
Multivariate techniques include:
–MANOVA
–Discriminant analysis
–Canonical correlation
–Factor analysis
–Cluster analysis
–Multidimensional scaling