DCAL Stats Workshop Bodo Winter.

1 DCAL Stats Workshop Bodo Winter

2 Outline Two learning curves Friday Jan 19 Saturday Jan 20

5 What is statistics? “Math-assisted thinking”
“Statistics, more than most other areas of mathematics, is just formalized common sense, quantified straight thinking.” Paulos (1992: 58) Paulos, J. A. (1992). Beyond numeracy: Ruminations of a numbers man. New York: Vintage Books.

6 Statistics is part of the entire research cycle
Theory/Hypothesis → Data collection → Preprocessing/Data Preparation → Statistical Analysis → Write-up → Publish paper, data and scripts
ALL OF THAT IS STATISTICS

7 Statistics is part of the entire research cycle
Theory/Hypothesis → Data collection → Preprocessing/Data Preparation → Statistical Analysis → Write-up → Publish paper, data and scripts
“confirmatory statistics”
ALL OF THAT IS STATISTICS

8 Statistics is part of the entire research cycle
“confirmatory statistics” = hypothesis-testing
“exploratory statistics” = hypothesis-generating
ALL OF THAT IS STATISTICS

9 “Getting meaning from data”
Descriptive Statistics
Inferential Statistics
Michael Starbird

10 “Getting meaning from data”
Word        Emotional Valence
minty       +1.52
juicy       +1.56
smelly      -1.87
sweet       +2.12
putrid      -1.78
delicious   +1.82
stinky      -1.49
rancid      -2.11
Descriptive Statistics
Inferential Statistics
Michael Starbird
Winter (2016), Language, Cognition and Neuroscience

11 “Getting meaning from data”
Word        Emotional Valence
minty       +1.52
juicy       +1.56
smelly      -1.87
sweet       +2.12
putrid      -1.78
delicious   +1.82
stinky      -1.49
rancid      -2.11
Winter (2016), Language, Cognition and Neuroscience

12 “Getting meaning from data”
Word        Emotional Valence
sweet       +2.12
delicious   +1.82
juicy       +1.56
minty       +1.52
M = 1.8 (positive words)
stinky      -1.49
putrid      -1.78
smelly      -1.87
rancid      -2.11
M = -1.8 (negative words)
Winter (2016), Language, Cognition and Neuroscience
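As a small illustration of how the two means on this slide can be obtained, here is a minimal R sketch. The vector name `valence` and the split at zero are assumptions made for this example; only the eight values come from the slide.

```r
# Valence values copied from the slide; the names are the words.
valence <- c(sweet = 2.12, delicious = 1.82, juicy = 1.56, minty = 1.52,
             stinky = -1.49, putrid = -1.78, smelly = -1.87, rancid = -2.11)

# Split into positive and negative words (splitting at 0 is an assumption).
positive <- valence[valence > 0]
negative <- valence[valence < 0]

# One mean per group, rounded to one decimal as on the slide.
round(mean(positive), 1)  # 1.8
round(mean(negative), 1)  # -1.8
```

Sorting the words and summarising each group with a single number is the "getting meaning from data" step: eight raw values become two means that are easy to compare.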

13 Everything is grounded in the notion of a “distribution”

20 Everything is grounded in the notion of a “distribution”
“uniform distribution”

21 Everything is grounded in the notion of a “distribution”
“uniform distribution” Inspired by Cartoon Guide to Statistics

23 Everything is grounded in the notion of a “distribution”
Inspired by Cartoon Guide to Statistics

24 Everything is grounded in the notion of a “distribution”
“normal distribution” Inspired by Cartoon Guide to Statistics

25 Everything is grounded in the notion of a “distribution”
“Gaussian distribution” Inspired by Cartoon Guide to Statistics

26 Everything is grounded in the notion of a “distribution”
“distribution with positive skew” Inspired by Cartoon Guide to Statistics
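The three shapes named on these slides can be approximated with simulated data. The sketch below is only an illustration: the sample size, the seed, and the use of an exponential distribution as the positively skewed example are assumptions, not something taken from the slides.

```r
set.seed(1)   # arbitrary seed so the simulation is reproducible
n <- 10000    # arbitrary sample size

uniform_sample <- runif(n, min = 0, max = 1)   # every value equally likely
normal_sample  <- rnorm(n, mean = 0, sd = 1)   # bell-shaped (Gaussian)
skewed_sample  <- rexp(n, rate = 1)            # long right tail (positive skew)

# Draw the three histograms side by side.
old_par <- par(mfrow = c(1, 3))
hist(uniform_sample, main = "Uniform")
hist(normal_sample,  main = "Normal (Gaussian)")
hist(skewed_sample,  main = "Positive skew")
par(old_par)
```

In the positively skewed sample the long right tail pulls the mean above the median, which is one way to spot skew without plotting.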

27 Ways continuous distributions differ
Location: Mean, Median, Mode
Spread: Range, Variance, Standard deviation, Inter-Quartile Range
Shape

28 Differences in location
Warriner et al. (2013), Behavior Research Methods

29 Differences in location
Axis from -4 to +4. Warriner et al. (2013), Behavior Research Methods

31 Differences in location
M = 0.2 (axis from -4 to +4). Warriner et al. (2013), Behavior Research Methods

32 Differences in location
M = -0.6 (axis from -4 to +4). Warriner et al. (2013), Behavior Research Methods

34 Differences in location
The mean: the sum of all the numbers (from the first number to the nth number), divided by how many numbers you have
Warriner et al. (2013), Behavior Research Methods

35 Example: the mean of three response times
300 ms, 200 ms, 400 ms
Sum: 300 + 200 + 400 = 900 ms
Divided by N: 900 / 3 = 300 ms
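The same calculation can be followed step by step in R. This is a minimal sketch; the variable names `rts`, `total` and `n` are chosen for the example.

```r
# The three response times from the slide, in milliseconds.
rts <- c(300, 200, 400)

total <- sum(rts)     # 900
n     <- length(rts)  # 3

total / n             # 300, the mean computed by hand
mean(rts)             # 300, the built-in function gives the same result
```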

36 The mean is a “balance point”. The median is a “half-way point”.

38 The mean is a “balance point”. The median is a “half-way point”.
50% of the data lie below the median, 50% above.
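A short R sketch can make the difference between the balance point and the half-way point concrete. The numbers are hypothetical, including the slow 2000 ms response added to show how one extreme value moves the mean much more than the median.

```r
# Hypothetical response times in milliseconds.
rts <- c(200, 300, 400)
mean(rts)    # 300
median(rts)  # 300

# Add one very slow (hypothetical) response.
rts_outlier <- c(200, 300, 400, 2000)
mean(rts_outlier)    # 725: the balance point is pulled towards the extreme value
median(rts_outlier)  # 350: the half-way point barely moves

# Half of the values lie below the median.
mean(rts_outlier < median(rts_outlier))  # 0.5
```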

40 Differences in spread: range
-2.11 to +1.56 (axis from -4 to +4). Warriner et al. (2013), Behavior Research Methods

41 Differences in spread: standard deviation
-4 +4 SD = 1.21 Warriner et al. (2013), Behavior Research Methods

43 the mean Warriner et al. (2013), Behavior Research Methods

44 differences from the mean
Warriner et al. (2013), Behavior Research Methods

45 squared differences from the mean
Warriner et al. (2013), Behavior Research Methods

46 sum of squared differences from the mean
Warriner et al. (2013), Behavior Research Methods

47 conceptually: “average” of sum of squared differences from the mean
Warriner et al. (2013), Behavior Research Methods

48 conceptually: “undoing” the squaring
Warriner et al. (2013), Behavior Research Methods

49 You can think of the standard deviation conceptually as the “average deviation” from the mean*
* it is not technically the average deviation, but the basic idea is right Warriner et al. (2013), Behavior Research Methods
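The build-up on the preceding slides (mean, differences, squaring, summing, averaging, undoing the squaring) can be traced in R. This is a sketch with a hypothetical vector `x`. One detail is not on the slides: R's `sd()` divides the sum of squares by n - 1 rather than n, so it differs slightly from the purely conceptual "average".

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical data

m          <- mean(x)            # the mean: 5
deviations <- x - m              # differences from the mean
squared    <- deviations^2       # squared differences from the mean
sum_sq     <- sum(squared)       # sum of squared differences: 32

# Conceptual version: average the squared differences, then undo the squaring.
conceptual_sd <- sqrt(sum_sq / length(x))        # 2

# What sd() computes: divide by n - 1 instead of n.
sample_sd <- sqrt(sum_sq / (length(x) - 1))
all.equal(sample_sd, sd(x))                      # TRUE

# The literal average absolute deviation is a different number again.
mean(abs(deviations))                            # 1.5
```

The last line shows why the standard deviation is only conceptually an "average deviation": squaring weights large deviations more heavily, so the result (2) is larger than the plain average of the absolute deviations (1.5).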

50 Differences in spread: SD
-4 +4 SD = 1.21 Warriner et al. (2013), Behavior Research Methods

51 Differences in spread: SD
-4 +4 SD = 0.41

52 The 68%-95% rule of thumb

53 The 68%-95% rule of thumb
If the distribution is approximately normal:
68% of the data fall within the interval [ mean - SD, mean + SD ]
95% of the data fall within the interval [ mean - 2 * SD, mean + 2 * SD ]

54 The 68%-95% rule of thumb
Imagine a paper reports these two numbers: M = 600 ms, SD = 50 ms
Between which two numbers do you expect 68% of the data?
550 ms to 650 ms

55 The 68%-95% rule of thumb
Imagine a paper reports these two numbers: M = 600 ms, SD = 50 ms
Between which two numbers do you expect 95% of the data?
500 ms to 700 ms
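The two intervals can be checked in R. This is a minimal sketch using the reported M and SD; the variable names are chosen for the example. The `pnorm()` lines show where the 68% and 95% figures come from for an exactly normal distribution.

```r
m <- 600   # reported mean in ms
s <- 50    # reported standard deviation in ms

c(m - s, m + s)           # 550 650: about 68% of the data
c(m - 2 * s, m + 2 * s)   # 500 700: about 95% of the data

# Proportion of a normal distribution within 1 and 2 SDs of the mean.
within_1sd <- pnorm(1) - pnorm(-1)   # roughly 0.68
within_2sd <- pnorm(2) - pnorm(-2)   # roughly 0.95
```

The rule of thumb rounds these proportions, and it only applies when the data are approximately normal.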

56 In R, computing all of this is easy...
mean(yournumbers)    # mean
sd(yournumbers)      # standard deviation
median(yournumbers)  # median
range(yournumbers)   # minimum and maximum
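To try these four functions on real numbers, the valence values from the earlier slide can stand in for `yournumbers`. This is a sketch; the vector name `valence` is an assumption made for the example.

```r
# The eight valence values from the earlier slide (Winter, 2016).
valence <- c(1.52, 1.56, -1.87, 2.12, -1.78, 1.82, -1.49, -2.11)

mean(valence)    # close to zero: positive and negative words roughly cancel out
sd(valence)      # spread around the mean
median(valence)  # 0.015, half-way between -1.49 and 1.52
range(valence)   # -2.11  2.12, the minimum and the maximum
```

Note that `range()` returns two numbers, the smallest and the largest value, rather than their difference; `diff(range(valence))` gives the width of the range.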

59 Approaching R: Having the right attitude
“I have been writing R code for years, and every day I still write code that doesn’t work!” Wickham & Grolemund (2017: 7)
Wickham, H. & Grolemund, G. (2017). R for Data Science. Sebastopol, CA: O’Reilly.
