Analysis of Survey Results
Nora Galambos, PhD, Office of Institutional Research, Stony Brook University
Survey Planning: Questions to Ask
- What hypotheses are being tested?
- What types of analyses are planned to test the hypotheses? Look over the instrument and create a map or outline of possible analysis methods.
- What is the magnitude of the differences you would like to detect?
You may be asked to analyze survey data where the survey instrument was created by someone without much survey experience. If possible, review the instrument prior to data collection, and suggest revisions if there is no hypothesis or if the design is not compatible with statistical software. For an original survey, the magnitude of the differences may not be known unless there is pilot testing or similar questions have been asked in the literature; it is, however, helpful in determining the sample size.
Importance of Pilot Testing
- The most obvious reason for pilot testing is to be able to estimate the sample size.
- It can find potential sources of bias.
- It assists in power calculations.
- It can discover possible distribution problems prior to surveying the entire sample.
Bonferroni Adjustment
Another issue that needs to be dealt with is the Bonferroni adjustment. In NIH grants and in journal articles in the social sciences and medicine, this type of correction is expected. Some believe that α/n is a bit harsh, and there are other formulas in the literature; however, it is generally acknowledged that some adjustment of the alpha level is necessary to control the experiment-wise error rate. When using an original survey instrument, it may be necessary to take this into consideration during the planning phase.
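As a minimal sketch of how such a correction can be applied, the Python snippet below uses the statsmodels implementation of the Bonferroni method; the p-values are made up for illustration.

```python
# Bonferroni correction for a family of significance tests.
from statsmodels.stats.multitest import multipletests

p_values = [0.004, 0.021, 0.045, 0.120, 0.380]  # hypothetical raw p-values

# method="bonferroni" compares each p-value to alpha/n (here 0.05/5 = 0.01)
reject, p_adjusted, _, alpha_per_test = multipletests(
    p_values, alpha=0.05, method="bonferroni")

print(alpha_per_test)  # 0.01, the adjusted per-test alpha
print(reject)          # [ True False False False False]
```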
Type I and Type II Errors
A Type I error occurs when a true null hypothesis is rejected. The probability of a Type I error is denoted by α and is the significance level of the hypothesis test, with 0.05 being a common value for α. A Type II error, on the other hand, occurs when the null hypothesis is false and it is not rejected. The probability of a Type II error is denoted by β and is often set to 0.20.
Hypothesis Testing Table

| Experimental Results | True state: Ho is true | True state: Ho is false |
|----------------------|------------------------|-------------------------|
| Reject Ho            | α (Type I error rate)  | Power = 1 - β           |
| Accept Ho            | Correct decision       | β (Type II error rate)  |
Power Calculations
(Reference: Jacob Cohen, Statistical Power Analysis for the Behavioral Sciences)
- The power of a significance test is the probability of rejecting a false null hypothesis, and is equal to 1 - β. If β is 0.20, the power is 0.80.
- A power of 0.80 is generally considered adequate.
- Sample size and power are related: a small sample size results in less power, that is, a reduced probability of rejecting a false null hypothesis.
- Statistical power is important in survey planning. Pilot testing can help to ensure that the sample size will be sufficient for hypothesis testing.
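A minimal sketch of these calculations for a two-sample t-test, using statsmodels; the medium effect size d = 0.5 and the conventional α = 0.05 are assumed for illustration.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Per-group sample size needed to reach power = 0.80 for a medium effect
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))  # about 64 per group

# Conversely, the power achieved with only 30 subjects per group
power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=30)
print(round(power, 2))  # about 0.47
```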
Using Sample Size Tables
Power for a two-sided test, α = 0.01
d = 0.2, 0.5, 0.8 (small, medium, and large effects)

| n (for each group) | d = 0.2 | d = 0.5 | d = 0.8 |
|--------------------|---------|---------|---------|
| 30                 | 0.03    | 0.24    | 0.66    |
| 40                 | 0.04    | 0.35    | 0.82    |
| 50                 | 0.06    | 0.45    | 0.91    |
| 60                 | 0.07    | 0.55    | >0.995  |
| 80                 | 0.12    |         |         |
| 100                | 0.29    | 0.99    |         |
| 200                |         |         |         |
| 500                | 0.72    |         |         |

The table demonstrates the wide range of power across varying sample sizes. This highlights the importance of having information about the magnitude of the effects you plan to detect, as well as having some idea about the response rate.
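A table of this kind can be reproduced with statsmodels, as sketched below, assuming a two-sample, two-sided t-test at α = 0.01.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
effect_sizes = [0.2, 0.5, 0.8]                    # small, medium, large
group_sizes = [30, 40, 50, 60, 80, 100, 200, 500]

print("n      d=0.2  d=0.5  d=0.8")
for n in group_sizes:
    powers = [analysis.solve_power(effect_size=d, nobs1=n, alpha=0.01)
              for d in effect_sizes]
    print(f"{n:<6} " + "  ".join(f"{p:.2f}" for p in powers))
```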
Types of Missing Data

Missing Completely at Random (MCAR)
- Given two variables X and Y, the missingness is unrelated to either: the missing values in X are independent of Y, and vice versa.
- If the data are MCAR, then listwise deletion is appropriate.

Missing at Random (MAR)
- Given two variables X and Y, the missingness is related to or dependent upon X, but not Y.
- Suppose X = age and Y = income, and income is more often missing in certain age groups. If, within each age group, no income group is missing more often than any other, then the data are MAR.

Nonignorable
- Given two variables X and Y, the missingness is related to X, but may also be related to Y.
- In the age-income example, certain income groups within an age group may be less likely to respond.
Evaluating Missing Data
Once the data have been collected, the first step is evaluating the missing data to check for missing-data bias.
- Select items with a missing percentage greater than 1% or 2% and recode them into binary variables, with 1 = missing and 0 = non-missing.
- Analyze these dichotomized variables by the demographic variables using t-tests or chi-square tests, as appropriate (a sketch follows this list). Significant results indicate that missingness is associated with one or more of the demographic variables.
- The demographic proportions or means should be the same for the missing group as for the non-missing group.
- We will also want to check whether the demographic variables are an accurate representation of the campus population.
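A minimal sketch of such a check in Python, assuming a pandas DataFrame with a survey item q12 and a demographic variable gender; the file and column names are hypothetical.

```python
import pandas as pd
from scipy.stats import chi2_contingency

survey = pd.read_csv("survey.csv")  # hypothetical file

# Dichotomize the item: 1 = missing, 0 = non-missing
survey["q12_missing"] = survey["q12"].isna().astype(int)

# Chi-square test of missingness against a categorical demographic
table = pd.crosstab(survey["q12_missing"], survey["gender"])
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
# A small p-value suggests missingness is associated with gender,
# i.e., the data are not missing completely at random.
```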
Data Reduction Methods
- Used to uncover relationship patterns among a group of variables, with the goal of reducing the variables to a smaller group
- There are two types of data reduction methods: confirmatory and exploratory
- Exploratory factor analysis does not assume any particular structure prior to the analysis and is used to “explore” relationships between variables
- Confirmatory factor analysis is used to test hypotheses regarding the underlying structure of a group of variables
- Traditional factor analysis and principal components analysis are exploratory data reduction methods
Principal Components Analysis
- Principal components analysis is a method often used for reducing the number of variables
- Principal components analysis is part of the factor analysis procedures in SAS and SPSS
- Although factor analysis (FA) and principal components analysis (PCA) have mathematical differences, the results are often similar
- Many authors loosely use the term “factor analysis” to refer to data reduction methods in general
Principal Components Analysis
- Finds groups of variables that are correlated with each other, possibly measuring the same construct
- Reduces the variables in the data to a smaller number of components that account for most of the variance of all of the variables in the data
- The first component accounts for the greatest amount of variance; the second accounts for the greatest amount of variance not accounted for by the first, and is uncorrelated with the first component
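A minimal sketch of a PCA in Python with scikit-learn (the slides use SAS and SPSS; this is an alternative, not the presenter's procedure). The file name and the choice of three components are assumptions for illustration.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

items = pd.read_csv("survey_items.csv")  # hypothetical file of numeric items

# Standardize so that each item contributes equally to the total variance
z = StandardScaler().fit_transform(items)

pca = PCA(n_components=3)
scores = pca.fit_transform(z)

# Each successive component explains the largest share of the variance
# left unexplained by the earlier, uncorrelated components
print(pca.explained_variance_ratio_)
```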
Necessary Assumptions
- Suggested sample size: at least 100 subjects and 10 observations per variable
- A correlation analysis of the variables should result in most correlations greater than 0.3
- Bartlett’s test of sphericity is significant (p < 0.05)
- Kaiser-Meyer-Olkin (KMO) test of sampling adequacy ≥ 0.6
- The determinant of the correlation matrix is greater than 0.00001, which indicates that multicollinearity is not a problem
Bartlett’s test of sphericity should be significant, which means the correlation matrix is not the identity matrix; if the correlation matrix is the identity matrix, then the factor model is inappropriate. The Kaiser-Meyer-Olkin test determines how well the data factor, and should be ≥ 0.60.
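These checks can be sketched in Python with the factor_analyzer package, assuming the hypothetical items DataFrame from the earlier sketch.

```python
import numpy as np
import pandas as pd
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity, calculate_kmo)

items = pd.read_csv("survey_items.csv")  # hypothetical file, as before

chi_square, p = calculate_bartlett_sphericity(items)  # want p < 0.05
kmo_per_item, kmo_total = calculate_kmo(items)        # want kmo_total >= 0.6

# Determinant of the correlation matrix as a multicollinearity check
determinant = np.linalg.det(items.corr())             # want > 0.00001
print(p, kmo_total, determinant)
```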
Obtaining a PCA
- In SPSS, select principal components under “Extraction Method”
- Select varimax rotation; a rotation uses a transformation to aid in the interpretation of the factor solution
- A varimax rotation is orthogonal, so the components remain uncorrelated; it maximizes the variance of the squared loadings within each column of the loading matrix, which simplifies interpretation
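A sketch of the equivalent steps in Python, using the factor_analyzer package; method="principal" extracts principal components and rotation="varimax" applies the orthogonal rotation described above. Three components are assumed, as before.

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

items = pd.read_csv("survey_items.csv")  # hypothetical file, as before

fa = FactorAnalyzer(n_factors=3, method="principal", rotation="varimax")
fa.fit(items)

print(fa.loadings_)              # varimax-rotated component loadings
print(fa.get_factor_variance())  # variance, proportion, and cumulative
                                 # proportion explained per component
```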
Evaluating PCA Results
- Kaiser criterion: choose components with eigenvalues greater than one
- Scree plot: a plot of the eigenvalues; retain the eigenvalues before the leveling-off point of the plot (a sketch for producing one follows the Scree Plot slide below)
- The proportion of variance accounted for by each retained factor (or component) should be 5% to 10%
- The cumulative variance accounted for should be 70% to 80%
Abbreviated Table of Variance Explained
Scree Plot
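A sketch of how a scree plot like this one can be produced, from the eigenvalues of the correlation matrix of the hypothetical items DataFrame used in the earlier sketches.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

items = pd.read_csv("survey_items.csv")  # hypothetical file, as before

# Eigenvalues of the correlation matrix, sorted largest first
eigenvalues = np.linalg.eigvalsh(items.corr())[::-1]

plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, "o-")
plt.axhline(1.0, linestyle="--")  # Kaiser criterion: eigenvalue > 1
plt.xlabel("Component number")
plt.ylabel("Eigenvalue")
plt.title("Scree Plot")
plt.show()
```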
More about PCA Results
- There should be at least three items with significant loadings on each component
- Check the conceptualization of the component items
- With an orthogonal rotation, the factor loadings equal the correlations between the variables and the components
- A communality is the proportion of variance in a variable that is accounted for by the retained components or factors; a variable’s communality is large if it loads heavily on at least one component
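Continuing with the fitted FactorAnalyzer model from the earlier sketch, the communalities can be inspected directly:

```python
# h2 is the proportion of the item's variance explained by the retained
# components; low values flag items that fit the solution poorly
for item, h2 in zip(items.columns, fa.get_communalities()):
    print(f"{item}: {h2:.2f}")
```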
Obtaining Scores
Factor score:
- Save the regression scores as variables
- Standardize the survey responses
- For each subject, multiply the standardized survey responses by the corresponding regression weights and sum the results (sketched below)
Factor-based score:
- Average the responses to the items in the component
- Check for reverse coding and missing data
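A sketch of both kinds of scores in Python, continuing from the fitted model above; the item names for the first component are hypothetical.

```python
# Factor scores: regression-based, computed from the standardized
# responses and the model's regression weights
factor_scores = fa.transform(items)        # one column per component

# Factor-based scores: the simple average of a component's items
# (check reverse coding and missing data before averaging)
component1_items = ["q1", "q2", "q3"]      # hypothetical item names
factor_based = items[component1_items].mean(axis=1)
```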
Cronbach’s Alpha
Cronbach’s alpha is used to measure the reliability, or internal consistency, of the factors or components. The variables in a scale are all entered into the calculation to obtain the alpha score. A Cronbach’s alpha > 0.7 is considered sufficient for demonstrating internal consistency in most social science research, while values > 0.6 are marginally acceptable.
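A minimal sketch computing Cronbach's alpha from its definition, alpha = k/(k-1) * (1 - sum of item variances / variance of the scale total), for the k items in one scale; the item names are hypothetical, as before.

```python
def cronbach_alpha(scale):
    """Cronbach's alpha for a DataFrame whose columns are one scale's items."""
    k = scale.shape[1]
    item_variances = scale.var(axis=0, ddof=1).sum()
    total_variance = scale.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

print(cronbach_alpha(items[["q1", "q2", "q3"]]))  # want > 0.7
```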