“I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it.” — William Thomson, 1st Baron Kelvin
Statistics = “getting meaning from data” (Michael Starbird)
Descriptive statistics: measures of central values, measures of variation, visualization
Inferential statistics: beating chance! We draw an inference from a sample to a population — sample ESTIMATES stand in for population PARAMETERS
But what’s the value of inferential statistics in our field? 1. More explicit theories 2. More constraints on theory 3. (Limited) generalizability
The (twisted) logic of hypothesis testing: H0 = there is no difference, or there is no correlation; Ha = there is a difference, or there is a correlation
The (twisted) logic of hypothesis testing: Type I error = behind bars… but not guilty (a false positive); Type II error = guilty… but not behind bars (a false negative)
p < 0.05 — what does it really mean?
p < 0.05 = given that H0 is true, data like this would be fairly unlikely
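To make this concrete, here is a minimal sketch (not from the slides; the coin-flip scenario and the function name are my own illustration): under H0 the coin is fair, and the p-value is the probability of an outcome at least as extreme as the one observed, assuming H0.

```python
from math import comb

def p_value_heads(n, k):
    """One-tailed p-value: probability of k or more heads
    in n flips of a fair coin (i.e., assuming H0 is true)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# 9 or more heads in 10 flips is unlikely under H0:
print(p_value_heads(10, 9))  # 11/1024 ≈ 0.0107, below 0.05
# 5 or more heads is entirely unremarkable:
print(p_value_heads(10, 5))  # ≈ 0.62, far above 0.05
```

Note what the small p-value does and does not say: the *data* are unlikely given H0 — it is not the probability that H0 is true.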
One-sample t-test, paired t-test, unpaired t-test, ANOVA, ANCOVA, regression, MANOVA, χ² test, Discriminant Function Analysis
Linear Model → General Linear Model → Generalized Linear Model → Generalized Linear Mixed Model
What you measure = the “response”; what you manipulate = the “predictor”: RT ~ Noise
The best-fitting line (the least squares estimate)
the intercept the slope
Same intercept, different slopes
Positive vs. negative slope
Same slope, different intercepts
Different slopes and intercepts
The Linear Model: response ~ intercept + slope * predictor
The Linear Model: Y ~ b0 + b1*X1, where b0 and b1 are the coefficients — b0 is the intercept, b1 is the slope
With the fitted model Y ~ b0 + b1*x, what is the response time for a noise level of x = 10? Plugging in: b0 + b1*10 = 390
Deviation from the regression line = residual; the predicted points on the line itself are the “fitted values”
The Linear Model: Y ~ b0 + b1*X1 + error
The predictor (noise) is continuous — and the response (RT) is continuous, too!
RT ~ Noise [plot: one regression line through the pooled data from men and women]
RT ~ Noise + Gender [plot: separate lines for men and women]
The Linear Model: Y ~ b0 + b1*X1 + b2*X2 — b0 is the coefficient of the intercept; b1 and b2 are the coefficients of the slopes; X1 = noise (continuous), X2 = gender (categorical)
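A categorical predictor like gender enters the model as a dummy-coded 0/1 variable, so its coefficient simply shifts the intercept. A sketch with hypothetical, made-up coefficients (b0, b1, b2 below are assumptions, not fitted values):

```python
# Hypothetical coefficients: intercept, noise slope, gender offset.
b0, b1, b2 = 400.0, 1.5, 20.0  # made-up values for illustration

def predicted_rt(noise, gender):
    """Y = b0 + b1*X1 + b2*X2, with gender dummy-coded:
    0 = men (reference level), 1 = women."""
    x2 = {"men": 0, "women": 1}[gender]
    return b0 + b1 * noise + b2 * x2

# Same noise level, different genders: the dummy shifts the line up by b2.
print(predicted_rt(30, "men"))    # 400 + 1.5*30 + 20*0 = 445.0
print(predicted_rt(30, "women"))  # 400 + 1.5*30 + 20*1 = 465.0
```

This is exactly the “same slope, different intercepts” picture from the earlier slides: the two groups share the noise slope b1 but sit on parallel lines b2 apart.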
The Linear Model: “Response” ~ Predictor(s). The response has to be one thing (we’ll relax that constraint later) and has to be continuous. The predictors can be one thing or many things (“multiple regression”), and of any data type (continuous or categorical).
The Linear Model — examples: RT ~ noise + gender; pitch ~ politeness (polite vs. informal); word length ~ word frequency
Correlation is (still) not causation — Bohrnstedt & Carter (1971); Heise (1969); Duncan (1975); cited in Edwards & Lambert (2007)
Correlation is (still) not causation: writing “Response” ~ Predictor(s) assumes a direction of causality