Published by Francis Wright. Modified over 6 years ago.
1
Just Enough to be Dangerous: Basic Statistics for the Non-Statistician
South Carolina Association for Institutional Research Columbia, SC February 28, 2011
2
Why You Signed Up
Maybe you were recruited into this job from another area.
Maybe you sometimes feel out of your depth.
Maybe you don’t like not understanding the terms and references (model, design, significance) you sometimes hear from colleagues.
Maybe you want to empower yourself to talk about these things.
Maybe you just want to understand better what we do every day.
Maybe you have plans for using the data we collect in a new way (see also: the Big Idea).
Today we will cover only enough to make you more familiar with basic concepts; this is no substitute for coursework and experience.
Talk about my experience. Talk about software issues.
There’s also a wide variety of experience in this group; we’re going to shoot for the middle. There are more details in the handout.
3
Basic Design & Definitions
First, write down what you want to find out. This may be a hypothesis or a research question.
Draw a picture if needed. Define the parts: the independent variable (IV) and the dependent variable (DV).
Identify all variables and operationally define them.
The research design will determine what sort of statistical procedure(s) you will use.
How will you sample? How will you collect data, and with what tools?
Throughout this, you have to maintain objectivity; don’t alter your design to favor a particular outcome (hard to do).
Identify scales of measurement (nominal, ordinal, interval, ratio).
4
Try these
For each scenario: What are the IV and DV? How would you design this?
A college wants to know if an increase in failing grades from last fall to this fall is due to hours spent on MySpace.
Another college wants to know if their new experiential marine science curriculum is producing better results on the capstone evaluation of marine science seniors.
Another college wants to know if they are looking at the best set of variables to consider when making freshman admissions decisions.
Another college wants to know if they can predict who will be most likely to graduate from items on the NSSE.
Another college wants to know if they can predict who will be most likely to gain the ‘freshman 15’ from items on the NSSE.
Scales of measurement
  NSSE frequently/sometimes/occasionally/never items
  Which marine science curriculum
  Graduation
  Hours spent on MySpace
5
First Issue: Am I going to stick to describing something, or am I looking to determine the significance of my results or predict future results?
Describing
  Frequencies (counts)
  Measures of central tendency
  Measures of variation
  Correlation
  Covariance
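A minimal sketch of the descriptive measures listed above, using only the Python standard library. The data (class levels, weekly study hours, GPAs) are made up for illustration, and the helper names `covariance` and `correlation` are not from the presentation.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical data for ten students (invented for illustration).
class_level = ["FR", "SO", "FR", "JR", "SR", "FR", "SO", "JR", "FR", "SO"]
study_hours = [5, 8, 3, 10, 12, 4, 7, 9, 2, 6]
gpa = [2.4, 3.0, 2.1, 3.4, 3.8, 2.5, 2.9, 3.3, 2.0, 2.8]


def covariance(x, y):
    """Sample covariance: average co-movement of x and y around their means."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation: covariance rescaled to the range -1 to 1."""
    return covariance(x, y) / (stdev(x) * stdev(y))


frequencies = Counter(class_level)       # counts per category
center = (mean(study_hours), median(study_hours))  # central tendency
spread = stdev(study_hours)              # variation (sample standard deviation)
r = correlation(study_hours, gpa)

print("Frequencies:", dict(frequencies))
print("Mean and median hours:", center)
print("SD of hours:", round(spread, 2))
print("Covariance:", round(covariance(study_hours, gpa), 3))
print("Correlation:", round(r, 3))
```

Frequencies suit the nominal variable (class level); the mean, standard deviation, covariance, and correlation assume interval or ratio data.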
6
Significance (Inferential Statistics)
  Need to know assumptions
  Select an appropriate distribution
  Type I & Type II error/power
  Decision rules
Prediction
  Also inferential, but uses bivariate or multivariate relationships in existing data to model ‘future’ data.
  Probably the type of inferential procedure used most in higher ed.
7
Debbie Downer Section
Assumptions
  Independence
  Equal variance of groups
  Random sampling & assignment
  Normal distribution
  (If you can’t meet one of these, there are assumption-reduced procedures.)
Power issues
  Research is often published that was done with a sample size or Type I error level inadequate to detect a significant result.
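To make the power issue concrete, here is a rough sketch of a sample-size calculation for comparing two group means. It uses the common normal-approximation formula, n per group = 2((z for alpha + z for power) / d)^2, where d is the standardized difference you want to detect. The function name and the default alpha and power values are assumptions for illustration; a t-based calculation in a statistics package will give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate cases needed per group for a two-sided, two-group comparison.

    effect_size is the standardized mean difference (difference / SD).
    Uses the normal approximation, so treat the result as a lower bound.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the Type I error level
    z_power = z.inv_cdf(power)          # value corresponding to the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Small effects need far more cases than large ones: under these assumptions a standardized difference of 0.2 calls for several hundred per group, while 0.8 needs only a few dozen.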
8
Debbie Downer, again
Problems of mathematical maximization
  Correlations can be statistically but not practically significant.
  Can be influenced by variability in samples.
  Regression models can work for the data that was used to produce them, but may not work on samples of new data.
  SEM models also require data validation (replication).
  If it can’t be replicated, a predictive technique is useless.
Effect size
  Need to plan from the outset how much of a difference will be enough to act on.
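A small sketch of how a correlation can be statistically but not practically significant. It uses the standard t statistic for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r^2)). The values r = 0.05 and n = 10,000 are invented for illustration, and 1.96 is used as an approximate two-sided 5% cutoff, which is reasonable only for large samples.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * sqrt((n - 2) / (1 - r ** 2))


# Hypothetical result: a tiny correlation found in a very large data set.
r, n = 0.05, 10_000
t = correlation_t(r, n)
shared_variance = r ** 2  # proportion of variance the two variables share

print(f"t = {t:.2f} (beyond 1.96, so 'significant' at the .05 level)")
print(f"shared variance = {shared_variance:.2%}")

# The same correlation in a sample of 100 would not reach significance.
print(f"t with n = 100: {correlation_t(r, 100):.2f}")
```

With enough cases almost any nonzero correlation passes the significance test, which is why the size of the effect has to be judged separately.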
9
Assessing Group Differences
t-tests: look at significant differences of a mean or proportion (or total, with some caveats) between two groups. Variants: dependent groups, independent groups, tests of gain scores.
Analysis of variance (ANOVA): same type of analysis, but the groups have expanded to at least three. The model asks: are differences due to the grouping variable?
Expansions of ANOVA: two-way and three-way effects, repeated measures models, split plot (with blocking variables), multivariate ANOVA (profile analysis).
Nonparametric equivalents: median test, sign test, Wilcoxon, Mann-Whitney. These are reduced-assumption procedures. If you don’t have enough cases in your groups, have a really unbalanced design, or have reason to believe the variances are different, one of these may give you more power than the parametric test.
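As a rough illustration of the independent-groups t-test, the sketch below computes the classic pooled-variance t statistic and the Welch version, which does not assume equal variances. The capstone scores and function names are hypothetical. Only the statistic and degrees of freedom are computed; the p-value would come from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-groups t statistic assuming equal variances; returns (t, df)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch t statistic, which does not assume equal variances; returns (t, df)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical capstone scores under a new and an old curriculum.
new_curriculum = [78, 85, 90, 72, 88, 81]
old_curriculum = [70, 75, 80, 68, 77, 74]

t_pooled, df_pooled = pooled_t(new_curriculum, old_curriculum)
t_welch, df_welch = welch_t(new_curriculum, old_curriculum)
print(f"pooled: t = {t_pooled:.3f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
```

With equal group sizes the two t values coincide and only the degrees of freedom differ; with unbalanced groups and unequal variances they can diverge noticeably.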
10
Assessing Group Differences, pt. 2
Chi-square analysis and other categorical techniques
Basically comparing cell frequencies to what you would expect based on marginal totals.
Marginal totals are those at the end of the row or column; if you multiply a cell’s row and column marginals and divide by the N at the bottom of the table, you get what you would ‘expect’ in that cell based on the joint distribution of those two categorical variables.
The chi-square test calculates a deviation score for each cell from its expected value: the squared difference between the observed and expected counts, divided by the expected count. The statistic is the sum of these scores over all cells.
The chi-square distribution has the number of groups minus 1 (for a one-way table) or (r-1)(c-1) (for a table with r rows and c columns) as degrees of freedom.
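The expected-count arithmetic described above can be sketched in a few lines. The 2x2 table (say, two curricula by pass/fail) is invented for illustration, and `chi_square` is a hypothetical helper rather than a library function.

```python
def chi_square(table):
    """Chi-square test of independence for a table of observed counts.

    Returns (statistic, degrees of freedom, expected counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count for each cell: row marginal * column marginal / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Deviation score per cell: (observed - expected)^2 / expected, then sum.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical counts: rows are two groups, columns are pass / fail.
observed = [[30, 10],
            [20, 40]]

stat, df, expected = chi_square(observed)
print("Expected counts:", expected)
print(f"Chi-square = {stat:.3f} with {df} degree(s) of freedom")
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom.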
11
Modeling
Built on correlation, covariance
  Shows how variables move with each other
Simple regression: X → Y, Y = a + bX
Multiple regression: X1, X2 → Y, Y = a + b1X1 + b2X2
Multivariate regression, logistic regression, logit models (when there isn’t a clear Y)
Structural Equation Modeling (SEM) is covariance based; it attempts to build simple to very complex models that reproduce the covariance matrix of a sample of data. It is important to do model building carefully and use validation.
HLM, MLM (hierarchical linear models, multilevel models)
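To show how simple regression is built on covariance, here is a minimal least-squares sketch for Y = a + bX. The slope b is the covariance of X and Y divided by the variance of X, and the intercept a makes the line pass through the two means. The high school and first-year GPA values are made up, and `fit_line` is a hypothetical helper name.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares estimates (a, b) for the simple regression Y = a + bX."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # co-movement of X and Y
    sxx = sum((xi - mx) ** 2 for xi in x)                     # variation in X
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Hypothetical data: high school GPA (X) and first-year college GPA (Y).
hs_gpa = [2.5, 3.0, 3.2, 3.6, 3.9]
college_gpa = [2.2, 2.6, 3.0, 3.1, 3.7]

a, b = fit_line(hs_gpa, college_gpa)
print(f"Y = {a:.2f} + {b:.2f}X")

# Prediction for a new student; validate on new data before relying on it.
print(f"Predicted college GPA for a 3.4 high school GPA: {a + b * 3.4:.2f}")
```

A model fit this way describes the sample that produced it; checking it against a fresh sample is what tells you whether it predicts.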