1
DCAL Stats Workshop Bodo Winter
2
Outline Two learning curves Friday Jan 19 Saturday Jan 20
5
What is statistics? “Math-assisted thinking”
“Statistics, more than most other areas of mathematics, is just formalized common sense, quantified straight thinking.” Paulos (1992: 58)
Paulos, J. A. (1992). Beyond numeracy: Ruminations of a numbers man. New York: Vintage Books.
6
Statistics is part of the entire research cycle
Theory/Hypothesis → Data collection → Preprocessing/Data Preparation → Statistical Analysis → Write-up → Publish paper, data and scripts
ALL OF THAT IS STATISTICS
7
Statistics is part of the entire research cycle
Theory/Hypothesis → Data collection → Preprocessing/Data Preparation → Statistical Analysis → Write-up → Publish paper, data and scripts
“confirmatory statistics”
ALL OF THAT IS STATISTICS
8
Statistics is part of the entire research cycle
“confirmatory statistics” = hypothesis-testing
“exploratory statistics” = hypothesis-generating
ALL OF THAT IS STATISTICS
9
“Getting meaning from data”
Michael Starbird
Descriptive Statistics
Inferential Statistics
10
“Getting meaning from data”
Word        Emotional Valence
minty       +1.52
juicy       +1.56
smelly      -1.87
sweet       +2.12
putrid      -1.78
delicious   +1.82
stinky      -1.49
rancid      -2.11
Michael Starbird
Descriptive Statistics
Inferential Statistics
Winter (2016), Language, Cognition and Neuroscience
12
“Getting meaning from data”
Word        Emotional Valence
sweet       +2.12
delicious   +1.82
juicy       +1.56
minty       +1.52
M = 1.8
stinky      -1.49
putrid      -1.78
smelly      -1.87
rancid      -2.11
M = -1.8
Winter (2016), Language, Cognition and Neuroscience
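The two group means on this slide can be reproduced in R. This is a minimal sketch: the valence values are copied from the slide, while the variable names (valence, positive, negative) are made up for illustration and do not come from the original analysis.

```r
# Valence values copied from the slide (Winter 2016); names are illustrative
valence <- c(sweet = 2.12, delicious = 1.82, juicy = 1.56, minty = 1.52,
             stinky = -1.49, putrid = -1.78, smelly = -1.87, rancid = -2.11)

# Split the words by the sign of their valence score
positive <- valence[valence > 0]
negative <- valence[valence < 0]

# One mean per group, rounded to one decimal as on the slide
round(mean(positive), 1)   # 1.8
round(mean(negative), 1)   # -1.8
```

Two numbers now summarize eight, which is the sense in which descriptive statistics gets "meaning from data".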
13
Everything is grounded in the notion of a “distribution”
20
Everything is grounded in the notion of a “distribution”
“uniform distribution”
21
Everything is grounded in the notion of a “distribution”
“uniform distribution” Inspired by Cartoon Guide to Statistics
24
Everything is grounded in the notion of a “distribution”
“normal distribution” Inspired by Cartoon Guide to Statistics
25
Everything is grounded in the notion of a “distribution”
“Gaussian distribution” Inspired by Cartoon Guide to Statistics
26
Everything is grounded in the notion of a “distribution”
“distribution with positive skew” Inspired by Cartoon Guide to Statistics
27
Ways continuous distributions differ
Location: Mean, Median, Mode
Spread: Range, Variance, Standard deviation, Inter-Quartile Range
Shape
28
Differences in location
Warriner et al. (2013), Behavior Research Methods
29
Differences in location
-4 +4 Warriner et al. (2013), Behavior Research Methods
31
Differences in location
M = 0.2 -4 +4 Warriner et al. (2013), Behavior Research Methods
32
Differences in location
M = -0.6 -4 +4 Warriner et al. (2013), Behavior Research Methods
34
Differences in location
The mean: the sum of all the numbers (from the first number to the nth number), divided by how many numbers you have
Warriner et al. (2013), Behavior Research Methods
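The verbal definition on this slide corresponds to the usual formula for the mean. The symbols are notational assumptions not shown in the transcript: x_i is the i-th number, n is how many numbers there are, and x-bar is the mean.

```latex
\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i
```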
35
Example: the mean of three response times
300 ms, 200 ms, 400 ms
Sum: 300 + 200 + 400 = 900
Divided by N: 900 / 3 = 300 ms
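The same calculation can be written out in R, as a small sketch. The vector name rts is an illustrative choice; the three values are the ones from the slide.

```r
# The three response times from the slide, in milliseconds
rts <- c(300, 200, 400)

total <- sum(rts)      # 900
n <- length(rts)       # 3
total / n              # 300

# The built-in function performs the same two steps in one call
mean(rts)              # 300
```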
36
The mean is a “balance point”. The median is a “half-way point”.
38
The mean is a “balance point”. The median is a “half-way point”.
50% 50%
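One way to see the difference between a "balance point" and a "half-way point" is to add a single extreme value and compare how far each measure moves. The sketch below uses made-up response times; the 2000 ms value is a deliberately chosen outlier and is not from the slides.

```r
# Made-up response times in ms; not data from the workshop
rts <- c(200, 300, 400)
rts_outlier <- c(rts, 2000)   # same values plus one extreme observation

mean(rts)             # 300
median(rts)           # 300

mean(rts_outlier)     # 725: the balance point is pulled toward the outlier
median(rts_outlier)   # 350: half of the values still lie on either side
```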
40
Differences in spread: range
-2.11 +1.56 -4 +4 Warriner et al. (2013), Behavior Research Methods
41
Differences in spread: standard deviation
-4 +4 SD = 1.21 Warriner et al. (2013), Behavior Research Methods
43
the mean Warriner et al. (2013), Behavior Research Methods
44
differences from the mean
Warriner et al. (2013), Behavior Research Methods
45
squared differences from the mean
Warriner et al. (2013), Behavior Research Methods
46
sum of squared differences from the mean
Warriner et al. (2013), Behavior Research Methods
47
conceptually: “average” of the squared differences from the mean
Warriner et al. (2013), Behavior Research Methods
48
conceptually: “undoing” the squaring
Warriner et al. (2013), Behavior Research Methods
49
You can think of the standard deviation conceptually as the “average deviation” from the mean*
* it is not technically the average deviation, but the basic idea is right Warriner et al. (2013), Behavior Research Methods
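The steps on the preceding slides can be followed by hand in R. This is a sketch using the three response times from the earlier example on the mean; all variable names are illustrative. The "average" step divides by n - 1 rather than n, which is also what R's sd() does, and which is one reason the standard deviation is not literally the average deviation.

```r
# Three response times in ms, from the example on the mean
x <- c(300, 200, 400)

m <- mean(x)                       # the mean: 300
deviations <- x - m                # differences from the mean: 0, -100, 100
squared <- deviations^2            # squared differences: 0, 10000, 10000
ss <- sum(squared)                 # sum of squared differences: 20000
variance <- ss / (length(x) - 1)   # "average" squared difference, using n - 1: 10000
sd_by_hand <- sqrt(variance)       # "undoing" the squaring: 100

sd_by_hand
sd(x)                              # the built-in function gives the same value
```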
50
Differences in spread: SD
-4 +4 SD = 1.21 Warriner et al. (2013), Behavior Research Methods
51
Differences in spread: SD
-4 +4 SD = 0.41
52
The 68%-95% rule of thumb
53
The 68%-95% rule of thumb
If the distribution is approximately normal:
68% of the data fall within the interval: [ mean - SD, mean + SD ]
95% of the data fall within the interval: [ mean - 2 * SD, mean + 2 * SD ]
54
The 68%-95% rule of thumb
Imagine a paper reports these two numbers: M = 600 ms, SD = 50 ms
Between which two numbers do you expect 68% of the data?
550 ms – 650 ms
55
The 68%-95% rule of thumb
Imagine a paper reports these two numbers: M = 600 ms, SD = 50 ms
Between which two numbers do you expect 95% of the data?
500 ms – 700 ms
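Both answers follow from simple arithmetic on the reported mean and SD, sketched below in R. The variable names are illustrative. The last lines use pnorm(), the cumulative distribution function of the normal distribution, to show where the 68% and 95% figures come from under the assumption of an exactly normal distribution.

```r
m <- 600   # reported mean in ms
s <- 50    # reported standard deviation in ms

interval_68 <- c(m - s, m + s)
interval_95 <- c(m - 2 * s, m + 2 * s)

interval_68   # 550 650
interval_95   # 500 700

# Proportion of a normal distribution within 1 and within 2 SDs of the mean
p68 <- pnorm(1) - pnorm(-1)
p95 <- pnorm(2) - pnorm(-2)
round(p68, 3)   # 0.683
round(p95, 3)   # 0.954
```

The rule is only a rule of thumb: for real data that are only approximately normal, the observed proportions will differ somewhat.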
56
In R, computing all of this is easy...
mean(yournumbers)
sd(yournumbers)
median(yournumbers)
range(yournumbers)
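To try these functions, yournumbers first has to exist as a numeric vector. The sketch below fills it with the eight valence scores from the earlier word table, purely as example data. var() and IQR() are added here because variance and inter-quartile range appear in the list of spread measures; they are base R functions, but they are not on the original slide.

```r
# Example data: the eight valence scores from the word table
yournumbers <- c(1.52, 1.56, -1.87, 2.12, -1.78, 1.82, -1.49, -2.11)

mean(yournumbers)
sd(yournumbers)
median(yournumbers)
range(yournumbers)   # smallest and largest value: -2.11 2.12

var(yournumbers)     # variance, equal to sd(yournumbers)^2
IQR(yournumbers)     # inter-quartile range

# Note: mode() in R reports the storage type of an object ("numeric" here),
# not the most frequent value.
```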
59
Approaching R: Having the right attitude
“I have been writing R code for years, and every day I still write code that doesn’t work!” Wickham & Grolemund (2017: 7)
Wickham, H. & Grolemund, G. (2017). R for Data Science. Sebastopol, CA: O’Reilly.