1
Advanced Quantitative Analysis
Shannon Milligan, PhD, Institutional Research & Market Analytics
Jen Sweet, PhD, Teaching, Learning & Assessment
April 27, 2018, 8009 DePaul Center, 1:00pm-2:30pm
2
Workshop Outcomes
By the end of this workshop, participants will be able to:
Distinguish between parametric and nonparametric statistics.
Adjust interpretation of the results of parametric statistics when assumptions are violated.
Define the General Linear Model (GLM).
Describe how this model is related to many common statistical methods.
Use SPSS for basic statistical analyses.
Determine when to use an ANOVA analysis and, if appropriate, run the analysis using SPSS.
3
Workshop Agenda
Parametric v. Nonparametric Statistics
General Linear Model
Brief SPSS Overview
ANOVA
Running Descriptive Statistics in SPSS
Running an ANOVA in SPSS
4
Parametric v Nonparametric Analysis
5
Parametric Tests
Make assumptions about the parameters (or defining properties) of the population being studied.
Most frequent assumptions:
Distribution of the dependent variable(s)
Nature of the data (at least interval-level data, i.e., data measured on a scale with fixed, equal, and measurable intervals)
6
Non-Parametric Tests
Make few or no assumptions about the underlying population distribution or the nature of the data being collected.
7
Is there Something In Between?
Yes! Semi-Parametric Statistics
Some statistics, such as Bayesian statistics, can begin with a defined underlying population distribution, then update that distribution with known information, such as data about the population or sample data.
Sadly, we won’t have time to get into this or any non-parametric statistics :(
8
Why is this Important to Know?
The statistics we’ll discuss today are parametric statistics that assume:
The population is normally distributed along the dependent variable.
You are measuring the data using at least an interval scale.
Homogeneity of variance: the population, and all samples you could draw from it, have equal variance with regard to your dependent variable.
9
What if I Violate these Assumptions?
Most likely, you will…
Ideally, you should select a statistical test that is more appropriate given your data.
Minimally, you should understand how these violations affect the interpretation of your results.
Refer to:
10
General linear model
11
GLM
The General Linear Model is a basic statistical model on which many common statistics are based.
Loosely based on the formula for a straight line: Y = mX + b
Y = outcome (or dependent) variable
m = slope of the line
X = independent variable
b = Y intercept
12
However, the GLM is expressed a little differently: Y = B0 + B1X + E
Y = outcome (or dependent) variable
X = independent variable
B0 = intercept (the value of Y when X is 0)
B1 = beta weight (or regression coefficient) of the first independent variable; it represents the independent contribution of X (the independent variable) to Y (the dependent variable)
E = error (the part of Y that the model does not explain)
13
Examples
T-test: Y = B0 + B1*X1 + E
Where B1 is the difference between the means of two groups.
Example: determine whether ACT scores differ between males and females.
ANOVA: Y = B0 + B1*X1 + B2*X2 + E
Where the B weights capture differences between the means of two or more groups.
Example: determine whether ACT scores differ by gender and race.
Multiple Regression: Y = B0 + B1*X1 + B2*X2 + B3*X3 + E
Where each B represents the independent contribution of its associated X to Y.
Example: account for the effects of gender, race, and class rank in predicting students’ ACT scores.
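The t-test case can be sketched in a few lines of Python. This is a minimal illustration, not part of the workshop materials: the scores are invented, and the helper name `fit_line` is hypothetical. It shows that when X is coded 0 for one group and 1 for the other, the fitted B0 is the mean of the group coded 0 and B1 is the difference between the two group means.

```python
from statistics import mean

# Hypothetical scores for two groups (invented for illustration only).
group0 = [20.0, 22.0, 24.0]  # coded X = 0
group1 = [25.0, 27.0, 29.0]  # coded X = 1

x = [0.0] * len(group0) + [1.0] * len(group1)
y = group0 + group1


def fit_line(x, y):
    """Least squares estimates (b0, b1) for Y = B0 + B1*X + E."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(x, y)
mean_difference = mean(group1) - mean(group0)

print("B0:", b0, "mean of group 0:", mean(group0))
print("B1:", b1, "difference in means:", mean_difference)
```

The same idea extends to ANOVA by adding one 0/1 coded X per additional group.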
14
Common Statistics that Use GLM
Many common statistics use the General Linear Model as their base:
Student’s T-test
Analysis of Variance (ANOVA, ANCOVA, MANOVA)
Multiple Regression
Multivariate Regression
Structural Equation Modeling (SEM)
Hierarchical Linear Modeling (HLM)
15
Least Squares Estimation Method
The GLM uses the least squares method to estimate the parameters of the model.
The GLM fits a straight line to your data that minimizes the sum of the squared vertical distances between each data point and the ‘best fit’ line.
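As a rough illustration of what "least squares" means, the sketch below fits a line to a handful of invented points and then checks that nudging either estimate away from the fitted value makes the total squared error larger. The data and the function name `sum_squared_errors` are assumptions made for this example; SPSS performs the equivalent estimation internally.

```python
from statistics import mean

# Hypothetical (x, y) observations, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def sum_squared_errors(b0, b1):
    """Sum of squared vertical distances between each point and the line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form least squares estimates for Y = B0 + B1*X + E.
x_bar, y_bar = mean(x), mean(y)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

best = sum_squared_errors(b0, b1)

# Moving either estimate away from the fitted value increases the error.
nudged = [
    sum_squared_errors(b0 + 0.1, b1),
    sum_squared_errors(b0 - 0.1, b1),
    sum_squared_errors(b0, b1 + 0.1),
    sum_squared_errors(b0, b1 - 0.1),
]

print("fitted line error:", best)
print("smallest error among nudged lines:", min(nudged))
```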
16
Brief overview of SPSS
17
What is SPSS? Statistical Package for the Social Sciences (SPSS)
Widely used statistical analysis program (across disciplines and industries)
Menu-driven program, though syntax can also be used
*DePaul access?
18
Pros and Cons of SPSS
Advantages:
Widely used
Easy to import Excel files
User-friendly “plug and chug”
Does all calculations for you
Disadvantages:
Requires some training
A lot of options; need to know how to select appropriate options for the analysis you would like to run
Need to be able to read and appropriately interpret output
Potential problem = too easy to run analyses without understanding them
May be expensive
Limited data visualization capabilities
19
Running descriptive statistics in SPSS
20
Sample Dataset
Chicago Public Schools Progress Report Card (publicly available from Chicago Data Portal)
N = 566 schools
79 variables in dataset
21
Sample Question (Frequencies)
How many elementary schools were in CPS in ? Use “ElementaryMiddleorHighSchool” variable Frequency analysis
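Outside SPSS, a frequency analysis amounts to counting how often each category occurs. The sketch below is only illustrative: the seven values and the category codes "ES", "MS", and "HS" are assumptions, not values taken from the CPS dataset.

```python
from collections import Counter

# Hypothetical values of the "ElementaryMiddleorHighSchool" variable.
# The codes "ES", "MS", "HS" and the rows themselves are invented.
school_types = ["ES", "ES", "HS", "ES", "MS", "HS", "ES"]

# Count how many schools fall into each category.
frequencies = Counter(school_types)

# Express each count as a percentage of all schools.
percentages = {
    category: 100 * count / len(school_types)
    for category, count in frequencies.items()
}

for category, count in frequencies.most_common():
    print(category, count, round(percentages[category], 1))
```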
22
Selecting Variable(s) for Analysis
23
Answer: 462 elementary schools
24
Sample Question (Descriptives)
On average, how many CPS elementary school students exceeded state expectations on the Illinois Standards Achievement Test (ISAT) Math? Use “ISATExceedingMath” variable
25
Selecting Descriptive Analysis
26
Alternate Approaches
27
Answer: 20%
Across the CPS elementary schools, roughly 20% of students exceeded state standards for the ISAT Math.
The median of 16% is lower than the mean, which tells us that the data are right-skewed: a smaller number of schools with high values pulls the mean up.
The min and max values tell us that the values vary widely across schools.
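The relationship between the mean and the median can be seen in a small example. The eight percentages below are invented and were chosen only so that the mean and median land near the values reported above; they are not drawn from the CPS data.

```python
from statistics import mean, median

# Hypothetical percentages of students exceeding standards at eight schools.
# Invented for illustration; NOT taken from the CPS dataset.
pct_exceeding = [5.0, 10.0, 12.0, 16.0, 16.0, 20.0, 30.0, 51.0]

avg = mean(pct_exceeding)
mid = median(pct_exceeding)
lowest, highest = min(pct_exceeding), max(pct_exceeding)

# A mean above the median suggests a tail of high values (right skew).
right_skewed = avg > mid

print("mean:", avg, "median:", mid)
print("min:", lowest, "max:", highest)
print("mean above median:", right_skewed)
```

Here one school at 51% is enough to pull the mean above the median, which is the pattern a right-skewed variable shows in the SPSS descriptives output.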
28
ANOVA
29
ANOVA Stands for Analysis of Variance
It is used to compare means among different groups.
Examples: gender; different age groups; race.
Used to answer questions like: Is there a difference in ACT performance between students based on their racial identity?
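To make "comparing means among groups" concrete, here is a minimal hand computation of the one-way ANOVA F statistic. The three groups and their scores are invented for illustration. The sketch stops at F because the Python standard library has no F distribution, so the p-value that SPSS reports is not reproduced here.

```python
from statistics import mean

# Hypothetical scores for three groups (invented for illustration only).
groups = {
    "Group A": [4.0, 5.0, 6.0],
    "Group B": [6.0, 7.0, 8.0],
    "Group C": [8.0, 9.0, 10.0],
}

all_scores = [score for scores in groups.values() for score in scores]
grand_mean = mean(all_scores)

# Between-groups variability: how far each group mean is from the grand mean.
ss_between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2 for scores in groups.values()
)

# Within-groups variability: how far each score is from its own group mean.
ss_within = sum(
    (score - mean(scores)) ** 2 for scores in groups.values() for score in scores
)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

# F is the ratio of the between-groups to the within-groups mean square.
f_statistic = (ss_between / df_between) / (ss_within / df_within)

print("F =", f_statistic, "with df", df_between, "and", df_within)
```

A large F means the group means differ by more than the spread inside each group would suggest.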
30
One-Way versus Two-Way ANOVA
One-Way has multiple levels of one independent variable (race).
Two-Way looks at two different independent variables (gender and race).
31
Formula for ANOVA
Y = B0 + B1X1 + B2X2 + B3X3 + B4X4 + E
Where:
Y = ACT score
B1X1 = Race 1
B2X2 = Race 2
B3X3 = Race 3
B4X4 = Race 4
32
Results of ANOVA
Initial results only tell you whether there are differences between the groups.
To determine where, specifically, the differences are, you need to run additional (post hoc) analyses.
This can be done in SPSS.
33
Running ANOVA in SPSS
34
Sample Question (ANOVA)
Is there a difference in college enrollment between collaborative networks?
Use “CollaborativeName” variable as Independent Variable (Factor)
5 groups:
Far South Side Collaborative
North-Northwest Side Collaborative
South Side Collaborative
Southwest Side Collaborative
West Side Collaborative
Use “CollegeEnrollmentRate” variable as Dependent Variable
Note: the data needs to be coded to run an ANOVA (e.g., Far South Side Collaborative is coded as “1”)
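The coding step mentioned in the note can be sketched as a simple lookup from network name to number. Only the code "1" for Far South Side Collaborative comes from the slide; the codes 2 through 5 and the sample rows are assumptions made for this example.

```python
# The five collaborative networks named on the slide.
networks = [
    "Far South Side Collaborative",
    "North-Northwest Side Collaborative",
    "South Side Collaborative",
    "Southwest Side Collaborative",
    "West Side Collaborative",
]

# Assign numeric codes starting at 1. Only Far South Side = 1 is stated in
# the source; the remaining assignments are assumed for illustration.
codes = {name: code for code, name in enumerate(networks, start=1)}

# Hypothetical rows of the "CollaborativeName" variable.
collaborative_name = [
    "South Side Collaborative",
    "Far South Side Collaborative",
    "West Side Collaborative",
]

# Recode each text value to its numeric code.
collaborative_code = [codes[name] for name in collaborative_name]

print(collaborative_code)
```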
35
Selecting ANOVA
Select one-way ANOVA because we have 1 IV with 5 levels.
If we had more than 1 IV, we’d use “General Linear Model” and then “Univariate”.
36
Selecting Variables for Analysis
Remember that “Factor” = “Independent Variable”
37
Selecting Post-Hoc Analysis
This is for follow-up analyses
38
Descriptives
These tell us the average college enrollment rate for each of the 5 collaborative networks.
39
ANOVA Table
This tells us that there is a statistically significant difference between collaborative networks on college enrollment.
We check the significance value against .05: if it is less than .05, we conclude there is a significant difference.
But we don’t know where the difference(s) are.
40
Follow-up Analysis
The statistically significant differences are between:
Far South Side and North-Northwest Side
Far South Side and Southwest Side
North-Northwest Side and South Side
North-Northwest Side and West Side
South Side and Southwest Side
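A post hoc analysis compares every pair of groups. With 5 networks there are 10 possible pairs, so the five significant pairs listed above imply five pairs that did not differ significantly. The sketch below only enumerates the pairs. It does not run a post hoc test, and which test the workshop used is not stated in the slides.

```python
from itertools import combinations

networks = [
    "Far South Side",
    "North-Northwest Side",
    "South Side",
    "Southwest Side",
    "West Side",
]

# Every unordered pair of networks that a post hoc analysis would compare.
all_pairs = [frozenset(pair) for pair in combinations(networks, 2)]

# Pairs reported as significantly different on the slide.
reported_significant = {
    frozenset(("Far South Side", "North-Northwest Side")),
    frozenset(("Far South Side", "Southwest Side")),
    frozenset(("North-Northwest Side", "South Side")),
    frozenset(("North-Northwest Side", "West Side")),
    frozenset(("South Side", "Southwest Side")),
}

# The remaining pairs were not reported as significantly different.
not_significant = [pair for pair in all_pairs if pair not in reported_significant]

print("pairs compared:", len(all_pairs))
for pair in not_significant:
    print("no significant difference:", " vs. ".join(sorted(pair)))
```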
41
Follow-up Questions
What do the statistically significant differences (and lack thereof) tell us?
Why are there so many differences between the North-Northwest Side and other networks?
42
Any Questions?
43
Contact Information
Jen Sweet, Associate Director, TLA
Shannon Milligan, Research Associate, IRMA