Simple Statistical Designs One Dependent Variable.

If I have one Dependent Variable, which statistical test do I use?

Is your Dependent Variable (DV) continuous?
• YES: Is your Independent Variable (IV) continuous?
    • YES: Correlation or Linear Regression
    • NO: Do you have only 2 treatments?
        • YES: T-test
        • NO: ANOVA
• NO: Is your Independent Variable (IV) continuous?
    • YES: Logistic Regression
    • NO: Chi Square

Chi Square

Chi Square (χ²)
• Non-parametric: no parameters estimated from the sample
• Chi Square is a distribution with one parameter: degrees of freedom (df).
• Positively skewed, but skew decreases as df increases.
• Mean is df
• Goodness-of-fit and Independence Tests

Chi-Square Goodness of Fit Test
• How well do observed proportions or frequencies fit theoretically expected proportions or frequencies?
• Example: Was test performance better than chance?

χ² = Σ (Observed − Expected)² / Expected
df = # groups − 1

             Observed   Expected
Correct         62         50
Incorrect       38         50
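The goodness-of-fit example can be checked by hand. The following is a minimal sketch that applies the χ² formula to the observed and expected counts from the slide; the variable names are illustrative, and the critical value of 3.84 (1 df, p = .05) is the one quoted later in this deck.

```python
# Observed and expected counts from the slide: correct vs. incorrect answers.
observed = [62, 38]
expected = [50, 50]

# Chi-square = sum over groups of (Observed - Expected)^2 / Expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Critical value for 1 df at p = .05, as quoted in this deck.
CRITICAL_1DF_05 = 3.84

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("better than chance" if chi_square > CRITICAL_1DF_05 else "not significant")
```

With these counts each group contributes 144/50, so the statistic exceeds the critical value.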

Chi Square Test for Independence
• Is the distribution of one variable contingent on another variable?
• Contingency Table
• df = (# Rows − 1)(# Columns − 1)
• Example:
    H₀: depression and gender are independent
    H₁: depression and gender are not independent

Observed frequencies (expected frequencies in parentheses):

                  Male      Female    Total
Depressed         10 (15)   20 (15)     30
Not Depressed     40 (35)   30 (35)     70
Total             50        50         100

Chi Square Test for Independence
Same χ² formula, except expected frequencies are derived from the row and column totals:

Expected = cell proportion × Total, e.g. (30/100)(50/100)(100) = 15

χ² = (10 − 15)²/15 + (20 − 15)²/15 + (40 − 35)²/35 + (30 − 35)²/35 = 4.76

Critical χ² with 1 df = 3.84 at p = .05
Reject H₀: depression and gender are NOT independent
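The independence test can be reproduced from the contingency table alone. This sketch derives each expected frequency from the row and column totals and then sums the four cell contributions; the names are illustrative, not from the original slides.

```python
# Observed 2x2 table from the slide: rows = Depressed / Not Depressed,
# columns = Male / Female.
observed = [[10, 20],
            [40, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = (row total / N) * (column total / N) * N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, df = {df}")
```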

Assumptions of Chi Square  Independence of observations  Categories are mutually exclusive  Sampling distribution in each cell is normal  Violated if expected frequencies are very low ( 20.  Fisher’s Exact Test can correct for violations of these assumptions in 2x2 designs.

Correlation and Regression

Recall the Bivariate Distribution
[Figure: bivariate distribution, r = −.17, p = .09]

Interpretation of r
• Slope of the best-fitting straight regression line when both variables are standardized
• Measure of the strength of the relationship between 2 variables
• r² measures the proportion of variability in one measure that can be explained by the other
• 1 − r² measures the proportion of unexplained variability.
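To make r, r² and 1 − r² concrete, here is a small sketch that computes Pearson's r from deviation scores. The data are hypothetical values chosen for illustration and do not come from the slides.

```python
import math

# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sum of cross-products and sums of squares of the deviation scores.
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)

r = sum_xy / math.sqrt(ss_x * ss_y)
explained = r ** 2          # proportion of variability explained
unexplained = 1 - r ** 2    # proportion left unexplained

print(f"r = {r:.2f}, r^2 = {explained:.2f}, 1 - r^2 = {unexplained:.2f}")
```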

Correlation Coefficients

Coefficient        Variable 1 Type         Variable 2 Type
Pearson r          continuous              continuous
Point Biserial     continuous              dichotomy
Phi Coefficient    dichotomy               dichotomy
Biserial           continuous              artificial dichotomy
Tetrachoric        artificial dichotomy    artificial dichotomy
Spearman’s Rho     ranks                   ranks

Simple Regression
• Prediction: What is the best prediction of variable Y from variable X?
• Regress Y on X (i.e. regress outcome on predictor)

The fit of a straight line
• The straight line is a summary of a bivariate distribution
• Y = a + bX + ε
• DV = intercept + slope(IV) + error
• Least Squares Fit: minimize error by minimizing the sum of squared deviations: Σ(Actual Y − Predicted Y)²
• Regression lines ALWAYS pass through the mean of X and mean of Y
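A least squares fit for one predictor has a closed form, sketched below with hypothetical data. It assumes the textbook formulas (slope = sum of cross-products / sum of squares of X, intercept = mean of Y minus slope times mean of X) and then checks two properties: the line passes through the two means, and the residuals sum to zero.

```python
# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least squares estimates for Y = a + bX.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
sum_squared_error = sum(e ** 2 for e in residuals)

print(f"Y = {a:.2f} + {b:.2f}X, SSE = {sum_squared_error:.2f}")
print(f"prediction at mean of X: {a + b * mean_x:.2f} (mean of Y = {mean_y:.2f})")
```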

b
• Slope: the magnitude of change in Y for a 1 unit change in X
• b = r(SD_y / SD_x)
• Because of this relationship: predicted Z_y = r Z_x
• Standardized beta: if X and Y are converted to Z scores, the slope becomes the beta, which equals r in simple regression. It is expressed in SD units, so it is not interpretable as a slope in the original units.
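The link between the slope and r can be verified numerically. This sketch, using hypothetical data, compares the least squares slope with r(SD_y / SD_x) and then refits the line on Z scores to show that the standardized slope equals r.

```python
import math
from statistics import mean, stdev

# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 10]


def least_squares_slope(xs, ys):
    """Slope of the least squares line for predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )


def pearson_r(xs, ys):
    """Pearson correlation computed from deviation scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / math.sqrt(
        sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys)
    )


r = pearson_r(x, y)
b = least_squares_slope(x, y)
b_from_r = r * stdev(y) / stdev(x)

# Convert both variables to Z scores and refit: the slope is now r.
zx = [(v - mean(x)) / stdev(x) for v in x]
zy = [(v - mean(y)) / stdev(y) for v in y]
standardized_beta = least_squares_slope(zx, zy)

print(f"b = {b:.3f}, r * SDy/SDx = {b_from_r:.3f}")
print(f"r = {r:.3f}, standardized beta = {standardized_beta:.3f}")
```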

Residuals
• The error in the estimate from the regression line: Actual Y − Predicted Y
• Mean is always 0
• Residual plots are very informative: they tell you how well your line fits the data

Assumptions & Violations
• Homoscedasticity: uniform variance across the whole bivariate distribution.
• Bivariate outlier: not an outlier on either X or Y alone
• Influential Outliers: ones that move the regression line
• Y is independent and normally distributed at all points along the line (residuals are normally distributed)
• Omission of important variables
• Non-linear relationship of X and Y
• Mismatched distributions (e.g. neg skew and pos skew, but you already corrected those with transformations, right?)
• Group membership (e.g. neg r within groups, pos r across groups)

Logistic Regression
• Continuous predictor(s) but DV is now dichotomous.
• Predicts probability of a dichotomous outcome (e.g. pass/fail, recover/relapse)
• Not least squares but maximum likelihood estimation
• Fewer assumptions than multiple regression
• “Reverse” of ANOVA
• Similar to Discriminant Function Analysis, which predicts nominal-scaled DVs of > 2 categories
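To illustrate what maximum likelihood estimation means here, the sketch below fits a one-predictor logistic model by plain gradient ascent on the log-likelihood. The data, learning rate, and step count are all assumptions made for illustration; statistical packages use faster iterative methods, so treat this as a demonstration of the idea rather than a recommended fitting routine.

```python
import math


def sigmoid(z):
    """Convert log-odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of the 0/1 outcomes under the logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def fit_logistic(xs, ys, learning_rate=0.01, steps=20000):
    """Climb the log-likelihood surface with small gradient steps."""
    b0 = b1 = 0.0
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(errors)
        b1 += learning_rate * sum(e * x for e, x in zip(errors, xs))
    return b0, b1


# Hypothetical data: hours studied (continuous IV) and pass/fail (DV).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(hours, passed)
p_pass_5h = sigmoid(b0 + b1 * 5)

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"predicted P(pass | 5 hours) = {p_pass_5h:.2f}")
```

The fitted coefficients are on the log-odds scale; `sigmoid` converts a predicted log-odds value back to a probability.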

T-test
• Similar to Z but with estimates instead of actual population parameters
• One- or two-tailed; use one-tailed if you can justify it through your hypothesis (more power)
• Effect size is Cohen’s d:

d = (mean1 − mean2) / pooled within-group SD

One Sample t-test
Compare the mean of one variable to a specific value (e.g. Is IQ in your sample different from the national norm?)

(Sample mean − 100) / 15
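A one-sample comparison against the IQ norm can be sketched as follows. The scores are hypothetical, and the code assumes the usual one-sample statistic, t = (sample mean − norm) / (s / √n), with n − 1 degrees of freedom. The difference divided by the norm SD of 15 is reported separately as a standardized difference.

```python
import math
from statistics import mean, stdev

# Hypothetical IQ scores for one sample (illustration only).
iq_scores = [104, 110, 98, 115, 107, 101, 112, 109]
NORM_MEAN = 100
NORM_SD = 15

n = len(iq_scores)
sample_mean = mean(iq_scores)

# Estimated standard error of the mean uses the sample SD.
standard_error = stdev(iq_scores) / math.sqrt(n)
t_statistic = (sample_mean - NORM_MEAN) / standard_error
df = n - 1

# Difference expressed in units of the norm SD.
standardized_difference = (sample_mean - NORM_MEAN) / NORM_SD

print(f"mean = {sample_mean}, t({df}) = {t_statistic:.2f}")
print(f"standardized difference = {standardized_difference:.2f}")
```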

Independent Sample t-test
• Are 2 groups significantly different from each other?
• Assumes independence of groups, normality in both populations, and equal variances (although t is robust against violations of normality).
• Pooled variance = mean of the two variances (weighted by df if n’s are unequal)
• If variances are unequal, use the Welch t-test
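The pooled-variance t, Cohen's d, and the Welch alternative differ only in how the denominator is built. The sketch below shows all three side by side on hypothetical data; the function names are illustrative, and it assumes the standard formulas (df-weighted pooled variance, and separate variances for Welch).

```python
import math
from statistics import mean, variance


def pooled_variance(a, b):
    """Average of the two sample variances, weighted by their df."""
    df_a, df_b = len(a) - 1, len(b) - 1
    return (df_a * variance(a) + df_b * variance(b)) / (df_a + df_b)


def pooled_t(a, b):
    """Independent-samples t assuming equal population variances."""
    se = math.sqrt(pooled_variance(a, b) * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se


def cohens_d(a, b):
    """Mean difference in units of the pooled within-group SD."""
    return (mean(a) - mean(b)) / math.sqrt(pooled_variance(a, b))


def welch_t(a, b):
    """t statistic that keeps the two variances separate."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical scores for two independent groups (illustration only).
group_a = [10, 12, 9, 11, 13]
group_b = [7, 8, 6, 9, 5, 7]

print(f"pooled t = {pooled_t(group_a, group_b):.2f}")
print(f"Welch t  = {welch_t(group_a, group_b):.2f}")
print(f"Cohen's d = {cohens_d(group_a, group_b):.2f}")
```

With equal group sizes the df weights are equal, so the pooled variance reduces to the simple mean of the two variances.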

Dependent Samples t-test (aka Paired Samples t-test)
• Dependent Samples:
    • Same subjects, same variables
    • Same subjects, different variables
    • Related subjects, same variables (e.g. mom and child)
• More powerful: the error term (denominator) is smaller when the paired scores are positively correlated
• But fewer df, higher critical t

Univariate (aka One-Way) ANOVA
Analysis of Variance
• 2 or more levels of a factor
• ANOVA tests H₀ that the means of all levels are equal
• A significant F only indicates that the means are not all equal; it does not say which means differ.

F
• F statistic = Between Group Variance / Within Group Variance = signal / noise
• With only 2 groups, F = t²
• Robust against violations of normality unless n is small
• Robust against violations of homogeneity of variances unless n’s are unequal
• If n’s are unequal, use Welch F’ or Brown-Forsythe F*

Effect size
• Large F does NOT equal large effect
• Eta Squared (η²) = Sum-of-Squares between / Sum-of-Squares Total
    • Variance proportion estimate
    • Positively biased: OVERestimates true effect
• Omega squared (ω²) adjusts for within factor variability and is a better estimate
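The F ratio and both effect sizes come from the same sums of squares. Here is a minimal sketch on hypothetical scores for three groups. It assumes the common one-way formula for omega squared, ω² = (SS between − df between × MS within) / (SS total + MS within), which the slides do not spell out.

```python
from statistics import mean

# Hypothetical scores for three levels of one factor (illustration only).
groups = {
    "A": [8, 10, 12],
    "B": [18, 20, 22],
    "C": [3, 5, 7],
}

all_scores = [s for scores in groups.values() for s in scores]
grand_mean = mean(all_scores)

# Between-groups: group means around the grand mean, weighted by group size.
ss_between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2 for scores in groups.values()
)
# Within-groups: scores around their own group mean.
ss_within = sum(
    (s - mean(scores)) ** 2 for scores in groups.values() for s in scores
)
ss_total = ss_between + ss_within

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_statistic = ms_between / ms_within
eta_squared = ss_between / ss_total
omega_squared = (ss_between - df_between * ms_within) / (ss_total + ms_within)

print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
print(f"eta^2 = {eta_squared:.3f}, omega^2 = {omega_squared:.3f}")
```

Omega squared comes out smaller than eta squared on the same data, which is the correction for positive bias that the slide describes.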

Family-wise error
• F is a non-directional, omnibus test and provides no info about specific comparisons between levels of the factor. In fact, a non-significant omnibus F does not mean that there are no significant differences between specific means.
• However, you can’t just run a separate test for each comparison: each independent test has its own error rate (α).
• Family-wise error rate = 1 − (1 − α)^c, where c = # comparisons
• Example: 3 comparisons with α = .05: 1 − (1 − .05)³ = .143
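The family-wise error formula is easy to tabulate. This short sketch reproduces the slide's example and shows how quickly the rate grows with more comparisons; the function name is illustrative, and the formula assumes the comparisons are independent.

```python
def familywise_error_rate(alpha, comparisons):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** comparisons


# Slide example: 3 comparisons at alpha = .05.
example = familywise_error_rate(0.05, 3)
print(f"3 comparisons: {example:.3f}")

# The rate climbs as the number of comparisons grows.
for c in (1, 3, 6, 10):
    print(f"c = {c:2d}: family-wise error = {familywise_error_rate(0.05, c):.3f}")
```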

Contrasts
• A linear combination of contrast coefficients (weights) on the means of each level of the factor

         Control   Drug 1   Drug 2
mean        10       20        5

To contrast the Control group against the Drug 1 group, the contrast would look like this:
Contrast = 1(Control) + (−1)(Drug 1) + 0(Drug 2)
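Evaluating a contrast is a weighted sum of the level means. The sketch below applies the Control vs. Drug 1 weights to the means from the slide; the helper name `contrast_value` is illustrative.

```python
# Level means from the slide.
means = {"Control": 10, "Drug 1": 20, "Drug 2": 5}

# Weights for Control vs. Drug 1; Drug 2 is left out with a weight of 0.
weights = {"Control": 1, "Drug 1": -1, "Drug 2": 0}


def contrast_value(level_means, level_weights):
    """Weighted sum of the level means."""
    return sum(level_weights[name] * level_means[name] for name in level_means)


value = contrast_value(means, weights)
print(f"Control vs. Drug 1 contrast = {value}")
```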

Unplanned (Post-hoc) Contrasts
• Risk of Family-wise error
• Correct with:
    • Bonferroni inequality: divide α by # comparisons
    • Tukey’s Honest Significant Difference (HSD): minimum difference between means necessary for significance
    • Scheffé test: critical F’ = (# groups − 1)(F); ultraconservative
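A Bonferroni correction tests each comparison at α divided by the number of comparisons, which keeps the family-wise rate at or below the nominal α. The sketch below shows the arithmetic for 3 comparisons. The p-values are hypothetical, and the family-wise calculation assumes independent tests.

```python
ALPHA = 0.05
comparisons = 3

# Each comparison is tested at the stricter per-comparison level.
per_comparison_alpha = ALPHA / comparisons

# Resulting family-wise rate if the tests are independent.
familywise = 1 - (1 - per_comparison_alpha) ** comparisons

# Hypothetical p-values from three post-hoc comparisons (illustration only).
p_values = {"Control vs Drug 1": 0.004, "Control vs Drug 2": 0.030, "Drug 1 vs Drug 2": 0.0005}
significant = {name: p < per_comparison_alpha for name, p in p_values.items()}

print(f"per-comparison alpha = {per_comparison_alpha:.4f}")
print(f"family-wise error rate = {familywise:.4f}")
print(significant)
```

A comparison with p = .03 would pass an uncorrected .05 threshold but fails the corrected one.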

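Planned Contrasts
• Polynomial: linear, quadratic, cubic, etc. pattern of means across levels of the factor
• For every contrast, the coefficients (weights) sum to 0.
• Orthogonal: for each pair of contrasts, the products of corresponding coefficients sum to 0 (with equal n’s).
• Non-orthogonal: the products of corresponding coefficients do not sum to 0.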
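Whether a set of planned contrasts is orthogonal can be checked directly from the weights. This sketch assumes the usual definitions for equal group sizes: each contrast's weights sum to zero, and two contrasts are orthogonal when the products of their corresponding weights sum to zero. The weight vectors are illustrative.

```python
def is_contrast(weights):
    """A valid contrast has weights that sum to zero."""
    return sum(weights) == 0


def are_orthogonal(weights_a, weights_b):
    """Two contrasts are orthogonal if their cross-products sum to zero."""
    return sum(a * b for a, b in zip(weights_a, weights_b)) == 0


# Weights are ordered as (Control, Drug 1, Drug 2).
control_vs_drug1 = [1, -1, 0]
both_vs_drug2 = [1, 1, -2]      # Control and Drug 1 together vs. Drug 2
control_vs_drug2 = [1, 0, -1]

print(are_orthogonal(control_vs_drug1, both_vs_drug2))     # independent questions
print(are_orthogonal(control_vs_drug1, control_vs_drug2))  # overlapping questions
```

The second pair overlaps because both contrasts involve the Control mean with the same sign.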

Polynomial Contrasts (aka Trend Analysis)
• Special case of orthogonal contrasts, but the IV must be ordered (e.g. time, age, drug dosage)
• Trend shapes: Linear, Quadratic, Cubic, Quartic

Orthogonal Contrasts

         Control   Drug 1   Drug 2   Grand Mean
mean        10       20        5       11.67

• Deviation: Compares the mean of each level (except one) to the mean of all of the levels (grand mean). Levels of the factor can be in any order.
• Simple: Compares the mean of each level to the mean of a specified level. This type of contrast is useful when there is a control group. You can choose the first or last category as the reference.
• Helmert: Compares the mean of each level of the factor (except the last) to the mean of subsequent levels combined.
• Difference: Compares the mean of each level (except the first) to the mean of previous levels (aka reverse Helmert contrasts).
• Repeated: Compares the mean of each level (except the last) to the mean of the subsequent level.
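The named contrast types differ only in which means are compared. The sketch below computes each comparison as a simple difference on the three example means. It assumes equal group sizes, so the grand mean is the plain average of the level means. It uses Control as the reference level for the simple contrasts and omits the last level for the deviation contrasts.

```python
from statistics import mean

# Level means from the slides, in factor order.
means = {"Control": 10, "Drug 1": 20, "Drug 2": 5}
levels = list(means.values())
grand_mean = mean(levels)
last = len(levels) - 1

# Deviation: each level (except the last) vs. the grand mean.
deviation = [m - grand_mean for m in levels[:-1]]

# Simple: each other level vs. the reference level (Control, listed first).
simple = [m - levels[0] for m in levels[1:]]

# Helmert: each level (except the last) vs. the mean of all later levels.
helmert = [levels[i] - mean(levels[i + 1:]) for i in range(last)]

# Difference: each level (except the first) vs. the mean of all earlier levels.
difference = [levels[i] - mean(levels[:i]) for i in range(1, len(levels))]

# Repeated: each level (except the last) vs. the next level.
repeated = [levels[i] - levels[i + 1] for i in range(last)]

for name, values in [("deviation", deviation), ("simple", simple),
                     ("helmert", helmert), ("difference", difference),
                     ("repeated", repeated)]:
    print(f"{name:10s} {[round(v, 2) for v in values]}")
```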

Non-orthogonal Contrasts
• Not used often
• Dunn’s test (Bonferroni t): controls the family-wise error rate by dividing α by the number of comparisons.
• Dunnett’s test: use a t-test, but critical t values come from a different table (Dunnett’s) that restricts family-wise error.
