How to get the most out of null results using Bayes
Zoltán Dienes


1 How to get the most out of null results using Bayes Zoltán Dienes

2 The problem: Does a non-significant result count as evidence for the null hypothesis or as no evidence either way?

3 Geoff Cumming: http://www.latrobe.edu.au/psy/esci/index.html

4 The solutions:
1. Power
2. Interval estimates
3. Bayes Factors

5 Problems with Power
i) Power depends on specifying the minimal effect of interest (which may be poorly specified by the theory)
ii) Power cannot make use of your actual data to determine the sensitivity of those data
Confidence intervals solve the second problem. Bayes Factors can solve both problems: by making use of the full range of predictions of the theory, a Bayes Factor makes maximal use of the data in assessing how well the data distinguish your theory from the null. A Bayes Factor can show strong evidence for the null hypothesis over your theory when it is impossible to say anything using power or confidence intervals.

6 The four principles of inference by intervals (figure: a difference-between-means axis with a null region around 0, bounded by the minimal interesting value):
Rule (i): accept the null region hypothesis
Rule (ii): reject the null region hypothesis
Rule (iii): reject a directional theory of a positive difference
Rule (iv): suspend judgment

7 The Bayes Factor

8

9 Cat hypothesis Devil hypothesis

10

11 If a cat, you lose a finger only 1/10 of the time. If a devil, you will lose a finger 9/10 of the time.


15 Evidence supports a theory that most strongly predicted it John puts his hand in the box and loses a finger. Which hypothesis is most strongly supported, the cat hypothesis or the devil hypothesis? Cat hypothesis predicts this result with probability = 1/10 Devil hypothesis predicts this result with probability = 9/10 Strength of evidence for devil over cat hypothesis = 9/10 divided by 1/10 = 9

16 The evidence is nine times as strong for the devil over the cat hypothesis OR Bayes Factor (B) = 9
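The strength of evidence here is just the ratio of the two likelihoods; a minimal sketch:

```python
# Bayes factor as a likelihood ratio for the cat vs devil example.
p_lose_given_cat = 1 / 10    # a cat takes your finger 1/10 of the time
p_lose_given_devil = 9 / 10  # a devil takes it 9/10 of the time

# John loses a finger: evidence for devil over cat
bf_devil_over_cat = p_lose_given_devil / p_lose_given_cat
print(round(bf_devil_over_cat, 2))   # -> 9.0

# John keeps his finger: the complementary probabilities favour the cat
bf_cat_over_devil = (1 - p_lose_given_cat) / (1 - p_lose_given_devil)
print(round(bf_cat_over_devil, 2))   # -> 9.0, i.e. 1/9 for devil over cat
```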


18 Consider: John does not lose a finger. Now the evidence strongly supports the cat over the devil hypothesis (B = 9 for cat over devil, or 1/9 for devil over cat).

19 Probability of losing a finger given a cat = 4/10. Probability of losing a finger given a devil = 6/10. Now if John loses a finger, the strength of evidence for devil over cat = 6/4 = 1.5. Not very strong.

20 We can distinguish:
Evidence for the cat hypothesis over the devil
Evidence for the devil hypothesis over the cat
Not much evidence either way

21 Bayes factor tells you how strongly the data are predicted by the different theories (e.g. your pet theory versus null hypothesis): B = Probability of your data given your pet theory divided by probability of data given null hypothesis

22 If B is greater than 1, the data support your theory over the null
If B is less than 1, the data support the null over your theory
If B is about 1, the experiment was not sensitive
(We automatically get a notion of sensitivity; contrast just relying on p values in significance testing.)
Jeffreys (1939/1961): Bayes factors greater than 3 or less than 1/3 are substantial

23 To know which theory the data support, we need to know what the theories predict. The null is normally the prediction of, e.g., no difference. (Figure: plausibility against the population difference between conditions, axis from -2 to 4; on the null hypothesis only the value 0 is plausible.)

24 To know which theory data support need to know what the theories predict The null is normally the prediction of e.g. no difference Need to decide what difference or range of differences are consistent with one’s theory Difficult - but forces one to think clearly about one’s theory.

25 To calculate a Bayes factor must decide what range of differences are predicted by the theory 1)Uniform distribution 2)Normal 3)Half normal
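The three options can be made concrete with a minimal numerical sketch (assumptions: a normal likelihood for the observed effect and simple midpoint integration; the function names are mine, and this is not Dienes's actual calculator):

```python
import math

def likelihood(theta, mean_obs, se):
    """Normal likelihood of the observed mean given population effect theta."""
    z = (mean_obs - theta) / se
    return math.exp(-0.5 * z * z) / (se * math.sqrt(2 * math.pi))

def bayes_factor(mean_obs, se, prior, lo=-50.0, hi=50.0, n=50000):
    """B = P(data | theory) / P(data | null), integrating the likelihood
    against the prior density representing the theory's predictions."""
    dx = (hi - lo) / n
    marginal = 0.0
    for i in range(n):
        theta = lo + (i + 0.5) * dx
        marginal += likelihood(theta, mean_obs, se) * prior(theta) * dx
    return marginal / likelihood(0.0, mean_obs, se)

# The three representations of a theory's predictions:
def uniform(a, b):
    return lambda t: 1.0 / (b - a) if a <= t <= b else 0.0

def normal(mu, sd):
    return lambda t: math.exp(-0.5 * ((t - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def half_normal(sd):
    return lambda t: 2 * math.exp(-0.5 * (t / sd) ** 2) / (sd * math.sqrt(2 * math.pi)) if t >= 0 else 0.0

# The robustness example later in the deck (sample slope = 4, SE = 4, maximum = 10):
print(round(bayes_factor(4, 4, uniform(0, 10)), 2))   # -> 1.28
print(round(bayes_factor(4, 4, normal(5, 2.5)), 2))   # -> 1.37
print(round(bayes_factor(4, 4, half_normal(5)), 2))   # -> 1.33 (slide reports 1.32)
```

The uniform and normal values match the worked example on slides 70-72; the half-normal comes out at 1.33 here versus the 1.32 reported from the online calculator, a second-decimal difference attributable to implementation details.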

26 Example: Does imagining a sports move improve sports performance?

27 Example: Does imagining a sports move improve sports performance? (Figure: plausibility against the population difference in means between practice versus no practice, axis from -2 to 8, with performance after real practice for the same amount of time marked as a reference point.)

28 Similar sorts of effects as those predicted in the past have been on the order of a 5% difference between conditions

29 (Figure: plausibility against the difference between conditions, axis 0 to 10, scaled by the typical 5% effect.)

30 (Figure: half-normal plausibility on the population difference in means between conditions, SD = 5.) Implies: smaller effects are more likely than bigger ones; effects bigger than 10% are very unlikely.


32 To calculate a Bayes factor in a t-test situation, you need the same information from the data as for a t-test: the mean difference, Mdiff, and the SE of the difference, SEdiff.
Note: t = Mdiff / SEdiff, so SEdiff = Mdiff / t. Also note F(1, x) = t²(x).
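So when a paper reports only the means and an F(1, x), the needed SE can be recovered; for instance, with the interaction values from the therapy example later in the deck:

```python
# Recover SEdiff from reported statistics:
# t = Mdiff / SEdiff  =>  SEdiff = Mdiff / t, and F(1, x) = t(x)^2  =>  t = sqrt(F).
m_diff = 1.08          # reported mean (interaction) difference
F = 0.18               # reported F(1, x)
t = F ** 0.5           # ~0.42
se_diff = m_diff / t
print(round(se_diff, 2))  # -> 2.55
```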

33 To calculate a Bayes factor: 1) Google “Zoltan Dienes” 2) First site to come up is the right one: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/ 3) Click on “Click here for a Bayes factor calculator” 4) Scroll down and click on “Click here to calculate your Bayes factor!”

34

35

36

37 http://www.latrobe.edu.au/psy/esci/index.html
Simulated replications: the tai chi of the Bayes factors, the dance of the p values.
Bayes     p
2.96     .081
4.88     .034
0.52     .74
4.88     .034
2.70     .09
0.46     .817
4.40     .028
1024.6   .001
3.33     .056
4.88     .031
1.73     .279
4.28     .024
2.96     .083
49.86    .002
2.16     .167
2.12     .172
1.01     .387
0.65     .614
0.75     .476
28.00    .006
4.28     .028
49.86    .002
5.60     .024
2.36     .144
1.73     .23


39 A Bayes Factor requires establishing predicted effect sizes. How?
Do digit-colour synesthetes show a Stroop effect on digits?
You display: 3 … 4 … 5 … 6. What they see: 3 … 4 … 5 … 6 (in their synesthetic colours).
You get a null effect (incongruent minus congruent RTs)... What size effect would be predicted if there were one?
Run normals on a condition in which the digits are coloured in the way the synesthetes say they are. That Stroop effect is presumably the maximum one could expect synesthetes to show. Use a uniform: (Figure: uniform plausibility on possible population Stroop effects, from 0 to the effect for normals with real colours.)


41 Another group in your experiment might help settle expectations:
Subliminal effect using backward masking: 5%, SE = 1.5%, p < .05
Subliminal effect using a new method, “gaze contingent crowding”: 1%, SE = 2%
Is there evidence for any subliminal perception using the new method?
Thus: use a half-normal with SD = 5%. B H(0,5) = 0.56. Nothing follows about whether or not there was subliminal perception with the new method. Need to run more subjects.
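For a half-normal representation with a normal likelihood there is a simple closed form; a sketch (the function name is mine, not a function from Dienes's calculator):

```python
import math

def phi(x):   # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cumulative distribution
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bf_half_normal(mean_obs, se, sd):
    """B for a half-normal H(0, sd) representation of the theory
    versus the point null, with a normal likelihood for the data."""
    s2 = sd * sd + se * se
    post_mean = mean_obs * sd * sd / s2
    post_sd = math.sqrt(sd * sd * se * se / s2)
    marginal = 2 * phi(mean_obs / math.sqrt(s2)) / math.sqrt(s2) * Phi(post_mean / post_sd)
    return marginal / (phi(mean_obs / se) / se)

# New-method effect 1%, SE 2%, half-normal SD = 5% (from backward masking):
print(round(bf_half_normal(1, 2, 5), 2))  # -> 0.56
```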


44 If you have a manipulation meant to reduce an effect, the effect of the manipulation is unlikely to be larger than the basic effect.
e.g. Dienes, Baddeley & Jansari (2012) predicted that sad mood would reduce learning compared to neutral mood.
So e.g. if on a 2-alternative forced choice test people in the neutral condition get 70% correct, the sad condition is expected to be somewhere between 50 and 70%, so the effect of mood must be between 0 and 20 percentage points. (Figure: uniform plausibility from 0 to 20.)


48 Generalising to categorical data
An intervention about the harm of smoking is given to one of two groups:

              intervention   no intervention
Not smoker         20               15
Smoker             16               17

Odds ratio = (20*17) / (16*15) = 1.42
The ln odds ratio (0.35) is approximately normally distributed with squared SE = 1/20 + 1/16 + 1/15 + 1/17 = 0.24 (z = 0.35/√0.24 = 0.71, p = .48)
A different intervention had reduced smoking with an odds ratio of 3, so represent the theory's predictions as a half-normal with SD = ln 3 ≈ 1.1: B H(0, 1.1) = 0.78
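The ln odds ratio arithmetic on this slide, as a quick sketch:

```python
import math

# 2x2 table: rows = not smoker / smoker, columns = intervention / no intervention
a, b = 20, 15   # not smoker
c, d = 16, 17   # smoker

odds_ratio = (a * d) / (c * b)
ln_or = math.log(odds_ratio)
se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of the ln odds ratio
z = ln_or / se

print(round(odds_ratio, 2), round(ln_or, 2), round(z, 2))  # -> 1.42 0.35 0.71
```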


52 Watson, Gordon, Stermac, Kalogerakos, and Steckley (2003) compared Process Experiential Therapy (PET) with Cognitive Behavioural Therapy (CBT) in treating depression.

BDI score:    pre      post     Change
CBT          25.09    12.56     12.53
PET          24.50    13.05     11.45

F for the group (CBT vs PET) x time (pre vs post) interaction = 0.18, non-significant.
The sample raw interaction effect is 12.53 - 11.45 = 1.08. t = √0.18 = 0.42. Thus SE = 1.08/0.42 = 2.55.
Representing the prediction as a uniform from 0 to 13: B U[0,13] = 0.36


54 For the same Watson et al. data, representing the prediction instead as a uniform from -12 to 12: B U[-12, 12] = 0.29
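Both uniform representations can be checked with a closed form (a sketch assuming a normal likelihood for the observed effect; the function name is mine):

```python
import math

def Phi(x):   # standard normal cumulative distribution
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bf_uniform(mean_obs, se, lo, hi):
    """B for a uniform [lo, hi] representation of the theory versus
    the point null, with a normal likelihood for the observed effect."""
    marginal = (Phi((mean_obs - lo) / se) - Phi((mean_obs - hi) / se)) / (hi - lo)
    null = math.exp(-0.5 * (mean_obs / se) ** 2) / (se * math.sqrt(2 * math.pi))
    return marginal / null

# Interaction effect 1.08, SE 2.55:
print(round(bf_uniform(1.08, 2.55, 0, 13), 2))    # -> 0.36
print(round(bf_uniform(1.08, 2.55, -12, 12), 2))  # -> 0.29
```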

55 My typical practice:
If I can think of a way of determining an approximate expected size of the effect => use a half-normal with SD = that typical size
If I can think of a way of determining an approximate upper limit of the effect => use a uniform from 0 to that limit

56 Moral and inferential paradoxes of orthodoxy: 1.On the orthodox approach, standardly you should plan in advance how many subjects you will run. If you just miss out on a significant result you are not allowed to just run 10 more subjects and test again. You are not allowed to run until you get a significant result. Bayes: It does not matter when you decide to stop running subjects. You can always run more subjects if you think it will help.

57 Moral paradox: if p = .07 after running the planned number of subjects:
i) If you run more and report significance at 5%, you have cheated
ii) If you don't run more and bin the results, you have wasted taxpayers' money, your time, and relevant data
You are morally damned either way.
Inferential paradox: two people with the same data and theories could draw opposite conclusions.

58 (Table: simulated rates of rejecting and accepting the null for Bayes factor thresholds from 3 to 10, with optional stopping, under population effects d = 0 and d = 1. For comparison, threshold = 3 with a fixed 10 trials: d = 0 gives Reject = 2, Accept = 55; d = 1 gives Reject = 91, Accept = 0.)

59 Moral and inferential paradoxes of orthodoxy:
2. On the orthodox approach, it matters whether you formulated your hypothesis before or after looking at the data (post hoc vs planned comparisons): predictions made in advance of looking at the data are treated differently from those made after.
Bayesian inference: it does not matter what day of the week you thought of your theory. The evidence for your theory is just as strong regardless of its timing.

60 But Bayes is still susceptible to analytic flexibility (outlier exclusion process, transforms, etc.), so Registered Reports are still a good idea!! Specifying the analysis in advance means that if 9/10 analyses lead to one conclusion, you are likely to choose an analysis that reflects what the data by and large say. However, the relation of data to theory depends only on the data and the theory, to be judged by the simplicity, elegance, tightness, or defensibility of the theory-prediction connections.

61 Moral and inferential paradoxes of orthodoxy:
3. On the orthodox approach, you must correct for how many tests you conduct in total. For example, if you ran 100 correlations and 4 were just significant, researchers would not try to interpret those significant results.
On Bayes, it does not matter how many other statistical hypotheses you investigated (or your RA investigated without telling you). All that matters is the data relevant to each hypothesis under investigation.


63 Smith et al: time to close a sale is 15 seconds faster if the client is in a soft rather than a hard chair. Replication with 6 studies, all priming “closing” in some way. All non-significant except the 6th: mean effect = 10 seconds, SE = 5 seconds, t(30) = 2.0, p < .05. B H(0,15) = 3.72
Orthodoxy: corrected threshold: .05/6 = .008, so the study is no longer significant.
How about Bayes? If the mean priming effect for the other studies = 0, the overall priming effect across all studies is (10 + 0)/6 = 1.7 seconds. Assume all studies had identical standard deviations and Ns. The standard error of the overall mean effect is 5/√6 = 2.0. Thus B H(0,15) = 0.30, substantial support for the null hypothesis rather than the superordinate theory that priming “closing” helps close a sale.
You must use all the data relevant to assessing a theory.
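Both half-normal Bayes factors on this slide can be reproduced with the closed form for a half-normal representation and a normal likelihood (a sketch; the function name is mine):

```python
import math

def phi(x):   # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cumulative distribution
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bf_half_normal(mean_obs, se, sd):
    """B for a half-normal H(0, sd) representation versus the point null."""
    s2 = sd * sd + se * se
    post_mean = mean_obs * sd * sd / s2
    post_sd = math.sqrt(sd * sd * se * se / s2)
    marginal = 2 * phi(mean_obs / math.sqrt(s2)) / math.sqrt(s2) * Phi(post_mean / post_sd)
    return marginal / (phi(mean_obs / se) / se)

# Study 6 alone: effect 10 s, SE 5 s, half-normal H(0, 15):
print(round(bf_half_normal(10, 5, 15), 2))                  # -> 3.72
# All six studies pooled: effect 1.7 s, SE 5/sqrt(6) s:
print(round(bf_half_normal(1.7, 5 / math.sqrt(6), 15), 2))  # -> 0.30
```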

64 For orthodoxy but not Bayes: different people with the same data and theories can come to different conclusions. You can thus be tempted to make false (albeit inferentially irrelevant) claims, such as about when you thought of your theory.

65 Robustness checks: make sure that different reasonable ways of representing the predictions of the theory give the same results (i.e. B > 3, or < 1/3). If not, collect more data until the results are consistent. Your assumptions must be defensible under intense cross-examination.

66 But why not represent the predictions of the theory differently? (Figure: instead of a uniform from 0 to max, a normal distribution centred on ½*max.)

67 OR: (Figure: a half-normal distribution scaled relative to max.)


72 Sample slope = 4, SE = 4. Estimated maximum = 10.
Assuming a uniform from 0 to 10: B = 1.28 (data insensitive)
Assuming a normal, mean = 5, SD = 2.5: B = 1.37
Assuming a half-normal, SD = 5: B = 1.32
Different reasonable assumptions barely change the B at all.

73 Determining that knowledge is unconscious often involves accepting the null hypothesis:
1. Objective measures: classification at chance while priming shows knowledge
2. Subjective measures: no relation between confidence (guess versus sure) and accuracy

74 1. Objective measures: classification at chance when RTs show knowledge

75 Serial reaction time task (Nissen & Bullemer, 1987): on each trial a location is indicated; just press the corresponding button. (Figure: RTs for rule-governed trials versus trials that violate the rules.)

76 Recognition of the allowable versus unallowable triplets at chance e.g. 52% correct. Is this evidence for unconscious knowledge or just insensitivity of the recognition test?


79 Shang et al. (2013): based on the RT data we can determine which triplets have been learned, e.g. 5 out of 12. We can predict recognition performance if memory were perfect: 5 plus 7/2 out of 12, i.e. 8.5/12 or 71%. But recognition is likely to be noisy, so the theory that the RT knowledge can be recognised can be modelled as a uniform from 50 to 71% (you would enter 0 to 21). (Figure: uniform plausibility on population recognition performance, from 0 to 21 above baseline.)


82 If recognition = 52%, SE = 6% (t(30) = 0.33, p = 0.74): B = 0.48. This is not strong evidence for the null hypothesis.
If recognition = 52%, SE = 2% (t(30) = 1.00, p = 0.33): B = 0.33. Now we do have strong evidence for the null, and hence for implicit learning!
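Both recognition Bayes factors follow from the uniform-from-0-to-21 representation, scoring recognition as percentage points above the 50% baseline (so 52% is a mean of 2). A sketch assuming a normal likelihood (the function name is mine):

```python
import math

def Phi(x):   # standard normal cumulative distribution
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bf_uniform(mean_obs, se, lo, hi):
    """B for a uniform [lo, hi] representation of the theory versus
    the point null, with a normal likelihood for the observed effect."""
    marginal = (Phi((mean_obs - lo) / se) - Phi((mean_obs - hi) / se)) / (hi - lo)
    null = math.exp(-0.5 * (mean_obs / se) ** 2) / (se * math.sqrt(2 * math.pi))
    return marginal / null

# Recognition 2 points above the 50% baseline:
print(round(bf_uniform(2, 6, 0, 21), 2))  # -> 0.48
print(round(bf_uniform(2, 2, 0, 21), 2))  # -> 0.33
```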


84 2. Subjective measures: knowledge is unconscious if confidence is unrelated to accuracy. But how strong a relationship can we expect?
Express accuracy as Type I d’, and the confidence-accuracy relationship as Type II d’ or meta-d’. Given the theory that Type II cannot exceed Type I, put a uniform on Type II between 0 and the upper limit defined by Type I. (Figure: uniform plausibility on population Type II d’, from 0 to Type I d’.)

85
            Signal              Noise
“Yes”       Hit                 False Alarm
“No”        Miss                Correct Rejection

Type I d’: Signal = item present, grammatical, old... (states of the world); “Yes” = “present”, “grammatical”, “old”
Type II d’: Signal = response correct (state of knowledge); “Yes” = confident (“No” = guess)


88 Guo et al. (2013): express the relation between confidence and accuracy as a slope: accuracy when “confident” minus accuracy when “guessing”.
A bit of algebra shows this slope cannot be higher than Slopemax = (overall accuracy above baseline) / (proportion of confident responses). (Figure: uniform plausibility on the population slope, from 0 to slopemax.)

89 Let overall performance be 50 + X%, let p = the proportion of confident responses, and let C and G = accuracy above chance when confident and when guessing.
X = p*C + (1-p)*G (overall performance is a weighted average of guess and confident performance)
Slope = C - G, which is largest when G = 0. Then Slope = C, i.e. X = p*C (since G = 0, (1-p)*G = 0), so X = p*Slope and Slopemax = X/p.
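A hypothetical numeric illustration of the bound (the numbers here are made up, not from the deck):

```python
# Suppose overall accuracy is 60% against a 50% baseline (X = 10 points)
# and 40% of responses are made with confidence (p = 0.4).
X = 10.0
p = 0.4
slope_max = X / p   # the largest possible confidence-accuracy slope
print(slope_max)    # 25 percentage points
```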


93 Subliminal perception. “Priming = X%, p < .05; classification = Y%, p > .05 => subliminal.” Such reasoning depends on asserting the null for classification. What classification would be expected if the knowledge were conscious?
(Figure: regression of classification on priming for conditions where people said they ‘saw’, so the knowledge is clearly conscious; reading off the regression line at the mean priming X found in the subliminal condition gives the rough level of classification expected, E.)


95 (Figure: candidate representations of the prediction: a half-normal with SD = E; a normal with mean = E and SD = ½*E; or SD = the SE of the regression prediction.)



101 Priming = 5%, p < .05; classification = 51%, p > .05.
(Figure: line through conscious classification against conscious priming.) So if conscious classification = 70% and conscious priming is twice unconscious priming, E = 60%.


103 Priming = 5%, p < .05; classification = 51%, p > .05. Classification above baseline: mean = 1%, SE = 2%. To represent the alternative, assume a normal with mean 10%, SD = 5%. B = .10. There is subliminal priming!
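With a normal representation the marginal likelihood is itself normal, so B has a one-line closed form (a sketch; the function name is mine):

```python
import math

def phi(x):   # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def bf_normal_prior(mean_obs, se, prior_mean, prior_sd):
    """B for a normal(prior_mean, prior_sd) representation of the theory
    versus the point null; the marginal has variance se^2 + prior_sd^2."""
    s = math.sqrt(se * se + prior_sd * prior_sd)
    return (phi((mean_obs - prior_mean) / s) / s) / (phi(mean_obs / se) / se)

# Classification 1% above baseline, SE 2%, alternative normal(10, 5):
print(round(bf_normal_prior(1, 2, 10, 5), 2))  # -> 0.1
```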

104 Brown et al. (2013): implicit perception in the legally blind.
Which clock has a hand? 4.6% above a baseline of 25%, SE = 9.0, p = 0.61. Objective threshold reached??
Which direction is the hand pointing? 12.3% above a baseline of 8.3%, p = .0003.
So subliminal, below the objective threshold!

105 Brown et al. (2013): implicit perception in the legally blind.
Which clock has a hand? 4.6% above a baseline of 25%, SE = 9.0, p = 0.61. Objective threshold reached??
Use 12.3 as the expected performance for H1. Scale by (100 - 8.3)/(100 - 25) = 1.22, so the detection effect becomes 4.6*1.22 = 5.6%, SE = 9*1.22 = 11. B H(0, 12.3) = 0.93.
Objective threshold not reached! (But subjects still claimed not to see, so the subjective threshold was reached.)
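The rescaling and the resulting Bayes factor, using the half-normal closed form as before (a sketch; the function name is mine):

```python
import math

def phi(x):   # standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cumulative distribution
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bf_half_normal(mean_obs, se, sd):
    """B for a half-normal H(0, sd) representation versus the point null."""
    s2 = sd * sd + se * se
    post_mean = mean_obs * sd * sd / s2
    post_sd = math.sqrt(sd * sd * se * se / s2)
    marginal = 2 * phi(mean_obs / math.sqrt(s2)) / math.sqrt(s2) * Phi(post_mean / post_sd)
    return marginal / (phi(mean_obs / se) / se)

# Put detection performance on the same scale as the pointing task:
scale = (100 - 8.3) / (100 - 25)      # ~1.22
mean_obs = 4.6 * scale                # ~5.6%
se = 9.0 * scale                      # ~11%
print(round(bf_half_normal(mean_obs, se, 12.3), 2))  # -> 0.93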

106 Dienes 2011 Perspectives on Psychological Science

107 Cumulating evidence: study 1, study 2, study 3... What is the overall evidence for the theory?
If we use representation 1 of the theory each time, we can't just combine the individual Bs, i.e. we can't use B(overall) = B(study1) x B(study2) x B(study3)...
Why? Because each data set should also update the predictions of the theory, i.e. we should use B(study 1 given representation 1) x B(study 2 given representation 1 + study 1)...
So the easiest way to get a B for all the data: combine all the data first and calculate B(all data given representation 1).

108 Study 1: mean1, SE1. Study 2: mean2, SE2. Study 3: mean3, SE3.
Weight for study 1: W1 = 1/SE1² = the “precision” of study 1’s estimate
Overall mean = (mean1*W1 + mean2*W2 + mean3*W3) / (W1 + W2 + W3)
Overall precision P = W1 + W2 + W3
Overall SE = sqrt(1/P)
The overall mean and SE can be used for
a) an overall B
b) an overall confidence interval (or Bayesian equivalent)
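The precision-weighted combination on this slide, as a sketch (`combine` and the study numbers are illustrative, not from the deck):

```python
import math

def combine(studies):
    """Precision-weighted combination of (mean, SE) pairs, per the slide:
    weight each study by 1/SE^2; overall SE = sqrt(1/total precision)."""
    weights = [1.0 / se ** 2 for _, se in studies]
    precision = sum(weights)
    mean = sum(w * m for (m, _), w in zip(studies, weights)) / precision
    return mean, math.sqrt(1.0 / precision)

# three hypothetical studies: (mean, SE)
overall_mean, overall_se = combine([(5.0, 2.0), (3.0, 1.5), (4.0, 2.5)])
print(round(overall_mean, 2), round(overall_se, 2))  # -> 3.77 1.08
```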

109 Raw versus standardised effects:
Raw: mean difference in units of the DV (seconds, scale points)
Standardised: Cohen’s d, r
Standardised effects are affected by how much noise is present in the design, i.e. by the number of trials per condition and other factors in the analysis. Thus standardised effect sizes are often affected by theory-irrelevant factors, and Bs are typically best calculated on raw effect sizes.
r² = t² / (t² + df)
Fisher’s z = 0.5 loge[(1 + r) / (1 - r)] (on your calculator loge may be written “Ln”)
SE = 1 / sqrt(df - 1)
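The conversion formulas at the bottom of the slide, sketched with made-up t and df values:

```python
import math

t, df = 2.5, 30                            # hypothetical reported t(30)
r = math.sqrt(t ** 2 / (t ** 2 + df))      # r from t
z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher's z ("Ln" on a calculator)
se = 1 / math.sqrt(df - 1)                 # SE of Fisher's z
print(round(r, 3), round(z, 3), round(se, 3))  # -> 0.415 0.442 0.186
```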

110 (Table: decision rates for accepting/rejecting the null region NR[-0.1, 0.1] as a function of the minimum interval width required before stopping, from none down to 0.5 times the null region width NRW, under actual effects d = 0 and d = 1; MaxN = 100, MinN = 1. Also shown: NR[0,0], MaxN = 100, d = 0, accept = 36.)

111 (Table: the same decision rates with MaxN = 1000, MinN = 1.)

112 (Table: the same decision rates with MaxN = 1000, MinN = 10.)

