
Research planning

Planning v. evaluating research
- To a large extent, these are the same skill.
- Plan a study so that it is capable of yielding data from which a relevant conclusion could actually be drawn.
- Evaluate other studies to check that the conclusions they claim really do follow from their data.

Summary
- Quality of the research question: link to previous theory (or theories); precision
- Design and ‘causal’ research questions
- Power
- Sample size
- Effect size
- Confidence intervals

Imaginary study
- Research question: do second-year students have a ‘sweeter tooth’ than third-year students?
- Method: give the WSS to a sample of current Y2 and Y3 psychology students.
- Prediction: M(Y2) > M(Y3)
- Is this any good as a research question?

Not a terribly good research question
- Theoretically vacuous:
  - Why would we expect third years to lose their taste for sweet things?
  - What psychological theories are supposed to be relevant?

Could be made into a better question
- Link the research question, in a specific and precise way, to previous research. For example:
  - The sugar-experience theory claims that as people acquire more memories, they develop a denser neural network. This density requires more sugar for fuel.
  - The sugar-young theory claims that as people get older, they lose bits of brain stuff, so the fuel requirements of the brain reduce. Consequently, sugar becomes less desirable.
- Of course, it doesn’t have to be a neuropsychological theory.

Causal conclusion?
- We can’t draw a causal conclusion, because this is a quasi-experimental design:
- there may be differences between second- and third-year students other than year of study.

… so if the result is M(Y2) > M(Y3)
- It could be because loss of brain stuff due to ageing reduces the need for sugar.
- Or it could be that:
  - larger class sizes drive you to sugar
  - living on campus puts you off sugar
  - …
- Or we were unlucky, and this is just one of the 5% of samples…

Design of study limits conclusions
- Experiment, with random allocation of participants to conditions → could allow a causal conclusion.
- Quasi-experiment, or correlational study → no causal conclusion.


Directness of measures
- Year of study (2 versus 3) is our IV.
- However, “year” is standing in for the amount of neural material (one hypothesis says it is lost, the other says it is gained).
- Ideally, we would measure that directly.
- Aim for the most direct measures you can get.

What if there is no significant difference? What can you conclude?
- There really is no effect, or
- There really is an effect, but we did not detect it because:
  - we were unlucky (again!)
  - the measures lack validity or reliability
  - the sample size was too small

Power
- Power is the probability that any particular (random) sample will produce a statistically significant effect.
- E.g. power = 0.9 → a 90% chance of detecting an effect, if there really is an effect.
- Researchers usually aim for power of 80–90%.
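The slides give no code, but power can be estimated with a short Monte Carlo sketch: simulate many studies in which the effect really exists, and count how often the test comes out significant. The effect size, SD, and group size below are invented numbers for illustration (difference 2, SD 4, i.e. d = 0.5), and the test is a simple two-tailed z-test.

```python
import random
import statistics
from statistics import NormalDist

def simulate_power(true_diff, sd, n_per_group, alpha=0.05, reps=2000, seed=1):
    """Estimate power by simulation: repeatedly draw two samples from
    populations that really differ by true_diff, test each pair with a
    two-tailed z-test, and count how often the result is significant."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = .05
    hits = 0
    for _ in range(reps):
        g1 = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        g2 = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        se = ((statistics.variance(g1) + statistics.variance(g2)) / n_per_group) ** 0.5
        if abs((statistics.fmean(g2) - statistics.fmean(g1)) / se) > crit:
            hits += 1
    return hits / reps

# A medium effect (d = 0.5) with 64 participants per group should give
# power of roughly 0.8.
print(simulate_power(true_diff=2, sd=4, n_per_group=64))
```

With `true_diff=0` the same function returns the false-positive rate, which should hover around alpha.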

Making it easier to detect an effect
- The test statistic for ANOVA is a ratio:
  F = (variance due to the effect we are interested in) / (error variance)
- To make an effect easier to detect: effect size ↑, reliability of measures ↑, other sources of error ↓

Tip: power & ANOVA
- Each effect in an ANOVA has its own power.
- E.g. a 2 × 3 ANOVA has three effects: main effect A, main effect B, and the interaction A × B.
- Tip: power is lower for interactions than for main effects.

Power and sample size
- All else being equal, to get more power you need more participants.
- Where “all else” means: reliability of measures, other sources of error variance, the p-value (significance level), and the true size of the effect.

Small samples
- Fewer repetitions of measurement → less reliability
- Anomalies can have more influence
- More likely to be quirky

Sample size – ethical issues
- Too small a sample → can’t detect significant effects → wastes all participants’ time
- Too large a sample → wastes resources, and wastes the extra participants’ time

Sample size – practical issues
- Resources: time; the cost of running each participant
- Availability: clinical populations are often small; access can take time and require permission

Choosing an appropriate sample size
- Shortcut: base the sample size on previous research (but make sure the previous research is of high quality!)

If you know these…
- the effect size
- the variance of the measures
…you can work out what the sample size should be.
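As a sketch of that calculation, the standard normal-approximation formula gives the per-group n from the standardised effect size d, the desired power, and the significance level. (A t-based calculation, as in dedicated power software, gives slightly larger answers for small samples.)

```python
import math
from statistics import NormalDist

def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate participants needed per group to detect a standardised
    effect size d with a two-tailed, two-sample comparison of means.
    Normal-approximation formula: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2)

print(n_per_group(0.5))  # medium effect → 63 per group
print(n_per_group(0.2))  # small effect → 393 per group
```

Note how sharply the required n grows as the effect to be detected shrinks.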

Effect size
- Do year 2 like sweet things better than year 3? Should we order more sugar for the café?
- M(Y2) = 42, M(Y3) = 40 → effect size = 42 – 40 = 2
- Statistical significance: p < .05
- Practical (‘clinical’) significance: is there an effect that matters?

Significance level (p-value) & sample size
- A very large sample can detect tiny effects; a small sample can miss even a large effect.
- A very small p (like p = .001) does not mean a strong effect.
- Significance and effect size are different things:
  - n = 3000, a difference in mean WSS score of 0.1 → p < .0001
  - n = 3, a difference in mean WSS score of 3 → p > .10
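The slide's two examples can be reproduced with a quick z-test sketch. The standard deviations below (1 and 4 WSS points) are assumptions chosen for illustration; the slide does not state them.

```python
from statistics import NormalDist

def two_sample_z_p(diff, sd, n):
    """Two-tailed p-value for a difference between two group means,
    assuming each group has n scores with known standard deviation sd."""
    z = diff / (sd * (2 / n) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(two_sample_z_p(diff=0.1, sd=1.0, n=3000))  # tiny effect, huge n: p ≈ .0001
print(two_sample_z_p(diff=3.0, sd=4.0, n=3))     # big effect, tiny n: p ≈ .36
```

Same lesson as the slide: the p-value reflects sample size as much as it reflects the size of the effect.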

Standardised effect size
- d = (M1 – M2) / σ
- M1 and M2 are the respective population means; σ is an estimate of the population SD.
- By convention, d = 0.2 is a “small” effect and d = 0.8 is a “large” effect (Cohen, 1977).
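A minimal sketch of computing d from two samples, using the pooled standard deviation as the estimate of σ. The scores are invented WSS-style values, not data from the slides.

```python
import statistics

def cohens_d(group1, group2):
    """Cohen's d: difference between the sample means divided by the
    pooled standard deviation of the two groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_var ** 0.5

y2_scores = [42, 44, 40, 46]   # invented example data
y3_scores = [40, 42, 38, 44]
print(cohens_d(y2_scores, y3_scores))  # ≈ 0.77, between "medium" and "large"
```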

Confidence intervals (CI)
- A p-value answers one question: is the difference significant?
- A CI answers three: Is the difference significant? What is the effect size? How well have we estimated the difference?

Confidence interval
- A range of effect sizes, with the most likely effect size in the middle, e.g. CI95 = 2.37 (1.5 – 3.24).
- A 95% CI corresponds to a test at the 5% level: if the interval includes 0, the difference is not statistically significant.
- The data are consistent with any value in this range.

Confidence interval
- The wider the interval, the less precisely we have measured the effect, and the more uncertainty remains about the true effect size.
- Compare CI95 = 2.37 (1.5 – 3.24) with the wider CI95 = 2.37 (0.5 – 4.24).
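As a sketch, here is a normal-approximation CI for the difference between two independent group means (small samples would normally use the t distribution instead; the data below are invented):

```python
import statistics
from statistics import NormalDist

def diff_ci(group1, group2, level=0.95):
    """Point estimate and normal-approximation confidence interval for
    the difference between two independent group means."""
    diff = statistics.fmean(group1) - statistics.fmean(group2)
    se = (statistics.variance(group1) / len(group1)
          + statistics.variance(group2) / len(group2)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)   # 1.96 for a 95% CI
    return diff, (diff - z * se, diff + z * se)

diff, (lo, hi) = diff_ci([40, 42, 44, 46, 48], [38, 40, 42, 44, 46])
print(f"{diff:.2f} ({lo:.2f} – {hi:.2f})")  # 2.00 (-1.92 – 5.92): the interval
                                            # includes 0, so not significant at 5%
```

One line of output gives all three answers: the estimate, its size, and (via whether 0 falls inside) its significance.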

Summary
- Quality of the research question: link to previous theory (or theories); precision
- Design and ‘causal’ research questions
- Power
- Sample size
- Effect size
- Confidence intervals

These concepts are inter-related
- Desired power ↑ → N ↑
- Acceptable p-value ↓ → N ↑
- Effect size to detect ↓ → N ↑
- Reliability of measures ↓ → N ↑
- Other error variance ↑ → N ↑