Hypothesis Testing Introduction to Study Skills & Research Methods (HL10040) Dr James Betts.



Lecture Outline: What is Hypothesis Testing?; Hypothesis Formulation; Statistical Errors; Effect of Study Design; Test Procedures; Test Selection.

Statistics: Descriptive (organising, summarising & describing data); Correlational (relationships); Inferential (generalising; significance).

Statistics and Sampling Error: The dependent variable can be generalised from the sample (n) to the population (N). Effective sampling is essential to correctly generalise back to our target population.

What is Hypothesis Testing? Null Hypothesis: A = B. Alternative Hypothesis: A ≠ B. We also need to establish: 1) How unequal are these observations? 2) Are these observations reflective of the general population?

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? Null Hypothesis: ♂ = ♀. Alternative Hypothesis: ♂ ≠ ♀.

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? Null Hypothesis (H0): There is not a significant difference in the dependent variable (DV) between males and females. Alternative Hypothesis (HA), or experimental hypothesis (HE): There is a significant difference in the DV between males and females. n.b. These are 2-tailed hypotheses, the most common and most recommended type.

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? Useful analogy: the criminal trial. Imagine you are the prosecutor. H0 = Defendant not guilty; HA = Defendant guilty. We must assume that the defendant is innocent until proven guilty.

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? [Figure: Sustained Isometric Torque (seconds); populations N♂ and N♀, samples n♂ and n♀] n.b. This is why effective sampling is so important...

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? [Figure: Sustained Isometric Torque (seconds); populations N♂ and N♀, samples n♂ and n♀] …poor/insufficient sampling can lead to errors…

Statistical Errors. Type 1 Errors: rejecting H0 when it is actually true, i.e. concluding a difference when one does not actually exist. Type 2 Errors: accepting H0 when it is actually false (e.g. previous slide), i.e. concluding no difference when one does exist. Errors can occur due to biased/inadequate sampling, poor experimental design or the use of inappropriate/non-parametric tests.

Back to Study Design. Independent Measures: individual scores in each data set are independent of one another. Repeated Measures: individual scores in each data set are dependent/paired/correlated.

Back to Study Design. Independent Measures: individual scores in each data set are independent of one another (2 distinct groups). Repeated Measures: individual scores in each data set are dependent/paired/correlated (same individuals tested twice). Pre-Experimental designs. [Design diagrams in T / P / O notation]

Back to Study Design. Independent Measures: individual scores in each data set are independent of one another. Repeated Measures. True-Experimental design: which applies depends on how equivalent groups were achieved (Random Group Assignment or Cross-Over Design). [Design diagrams in R / T / P / O notation]

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? So the above example is an independent measures design, which therefore requires an independent t-test, AKA Student's (Gosset's) t-test.

Independent t-test: Calculation. [Figure: Sustained Isometric Torque (seconds) for samples n♂ and n♀; table of Mean, SD and n for ♀ and ♂] Is this a significant effect?

Independent t-test: Calculation. Step 1: Calculate the Standard Error for Each Mean. SEM♀ = SD/√n = 1.74/5; SEM♂ = SD/√n = 1.72/5 = 0.344

Independent t-test: Calculation. Step 2: Calculate the Standard Error for the difference in means. SEMdiff = √(SEM♀² + SEM♂²) = 0.501

Independent t-test: Calculation. Step 3: Calculate the t statistic. t = (Mean♀ - Mean♂) / SEMdiff = 2.00

Independent t-test: Calculation. Step 4: Calculate the degrees of freedom (df). df = (n♀ - 1) + (n♂ - 1) = 48

Independent t-test: Calculation. Step 5: Determine the critical value for t using a t-distribution table (Degrees of Freedom against Critical t-ratio). n.b. Use 0.05 for a 2-tailed test.

Independent t-test: Calculation. Step 6: Compare t calculated with t critical. Calculated t = 2.00; critical t = 2.01. Therefore, t calculated < t critical, so the effect is not significant (n.s.).
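Steps 1 to 6 can be sketched in a few lines of Python. This is a minimal illustration only: the function name `independent_t` and the summary values passed to it are hypothetical (the means from the slides are not preserved in this transcript), chosen so that the arithmetic is easy to follow by hand. The critical value 2.01 is the one quoted in Step 6 for df = 48.

```python
import math


def independent_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent t-test from summary statistics, following Steps 1-4."""
    sem_a = sd_a / math.sqrt(n_a)                  # Step 1: SEM for each mean
    sem_b = sd_b / math.sqrt(n_b)
    sem_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)  # Step 2: SEM of the difference
    t = (mean_a - mean_b) / sem_diff               # Step 3: t statistic
    df = (n_a - 1) + (n_b - 1)                     # Step 4: degrees of freedom
    return t, df


# Hypothetical summary data (NOT the values from the lecture).
t, df = independent_t(mean_a=12.0, sd_a=2.0, n_a=25,
                      mean_b=11.0, sd_b=1.5, n_b=25)

critical_t = 2.01                   # Step 5: table value for df = 48, P = 0.05, 2-tailed
significant = abs(t) > critical_t   # Step 6: compare calculated with critical t

print(f"t = {t:.2f}, df = {df}, significant = {significant}")
```

With these made-up inputs the SEMs are 0.4 and 0.3, the SEM of the difference is 0.5, and t is 2.00 on 48 df, which falls just short of the critical value.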

Independent t-test: Calculation. Interpretation: P > 0.05, so reject HA and accept H0. Conclusion: There is not a significant difference in the DV between males and females.

Independent t-test: Calculation. Evaluation: The wealth of available literature supports that females can sustain isometric contractions longer than males. This may suggest that the findings of the present study represent a type 2 error. Possible solution: Increase n.
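The suggestion to increase n can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not part of the lecture: it draws two normally distributed groups whose true means differ by half a standard deviation (a hypothetical effect), runs the independent t-test from the steps above, and counts how often the real difference is missed. The critical value for n = 100 per group (df = 198) is taken as approximately 1.97 from standard t tables.

```python
import math
import random
import statistics


def t_statistic(a, b):
    """Independent t statistic: difference in means over SEM of the difference."""
    sem_a = statistics.stdev(a) / math.sqrt(len(a))
    sem_b = statistics.stdev(b) / math.sqrt(len(b))
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sem_a ** 2 + sem_b ** 2)


def type2_error_rate(n, critical_t, true_diff=0.5, trials=1000, seed=0):
    """Proportion of simulated studies that fail to detect a real difference."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        a = [rng.gauss(true_diff, 1.0) for _ in range(n)]  # group with the higher true mean
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(a, b)) <= critical_t:           # H0 retained although it is false
            misses += 1
    return misses / trials


rate_small = type2_error_rate(n=25, critical_t=2.01)   # df = 48
rate_large = type2_error_rate(n=100, critical_t=1.97)  # df = 198 (approximate table value)

print(f"Type 2 error rate with n = 25 per group:  {rate_small:.2f}")
print(f"Type 2 error rate with n = 100 per group: {rate_large:.2f}")
```

Under these assumptions the larger sample misses the difference far less often, which is the reasoning behind increasing n.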

Independent t-test: SPSS Output (swim data from SPSS session 8). [SPSS output table] Compare the calculated t (ignore its sign) with the critical t for df = 18: calculated t > critical t, so P < 0.05.

Repeated Measures Designs. As shown earlier, a repeated measures design implies that the data in each data set can be paired or correlated with one another. An independent t-test is inappropriate for analysing such data; instead, a paired t-test should be used…

Advantages of using Paired Data. Data from independent samples are heavily influenced by variance between subjects, i.e. these data would have a large SD (variance) associated with an independent t-test simply because some subjects performed better than others. HOWEVER…

Advantages of using Paired Data. Data from independent samples are heavily influenced by variance between subjects… using the same participants on two occasions allows us to pair up the data… now we can remove between-subject variance from subsequent analysis…

Paired t-test: Calculation. Steps 1 & 2: Complete this table. Columns: Subject | Week 1 | Week 2 | Diff (D) | Diff² (D²). Totals: ∑D and ∑D².

Paired t-test: Calculation. Step 3: Calculate the t statistic from ∑D and ∑D²:

$$t = \frac{\sum D}{\sqrt{\dfrac{n \sum D^2 - \left(\sum D\right)^2}{n - 1}}}$$

Paired t-test: Calculation. Step 3: Calculate the t statistic, with n = 8, ∑D = 31 and ∑D² = 137:

$$t = \frac{31}{\sqrt{\dfrac{8 \times 137 - 31^2}{7}}} = 7.06$$

Paired t-test: Calculation. Steps 4 & 5: Calculate the df (df = n - 1) and use a t-distribution table (Degrees of Freedom against Critical t-ratio at the 0.05 and 0.01 levels) to find t critical.

Paired t-test: Calculation. Step 6: Compare t calculated with t critical. Calculated t = 7.06. Therefore, t calculated > t critical, so the effect is significant (sig.).
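The paired calculation can be checked with a short Python sketch. The function names are hypothetical; the inputs n = 8, ∑D = 31 and ∑D² = 137 are the totals used in Step 3, and the critical values for df = 7 (2.365 at the 0.05 level, 3.499 at the 0.01 level) are standard two-tailed t-table entries rather than figures recovered from the slides.

```python
import math


def difference_sums(first, second):
    """Steps 1 & 2: return n, sum of differences and sum of squared differences."""
    diffs = [b - a for a, b in zip(first, second)]
    return len(diffs), sum(diffs), sum(d * d for d in diffs)


def paired_t_from_sums(n, sum_d, sum_d2):
    """Step 3: paired t statistic computed from the column totals."""
    return sum_d / math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))


t = paired_t_from_sums(n=8, sum_d=31, sum_d2=137)
df = 8 - 1                                  # Step 4: df = n - 1

critical_05, critical_01 = 2.365, 3.499     # Step 5: two-tailed table values for df = 7

print(f"t = {t:.2f}, df = {df}")
print("significant at 0.05:", abs(t) > critical_05)   # Step 6
print("significant at 0.01:", abs(t) > critical_01)
```

`difference_sums` is included only to show how the table totals would be produced from raw Week 1 and Week 2 scores; the individual scores themselves are not preserved in this transcript.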

Paired t-test: Calculation. Interpretation: P < 0.05, so reject H0 and accept HA. Conclusion: There is a significant difference in the DV between week 1 and week 2.

Paired t-test: SPSS Output (push-up data from lecture 3). [SPSS output table] Compare the calculated t (ignore its sign) with the critical t for df = 7 at the 0.05 and 0.01 levels: calculated t > critical t, so P < 0.01.

Parametric versus Non-Parametric. Both the t-tests just shown are parametric tests. These examine differences in the mean; therefore the mean must be an accurate descriptor. [Figure: Normal vs. Non-normal distributions]

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? [Figure: Sustained Isometric Torque (seconds); Mean A and Mean B] Normal distribution: the mean is appropriate, so use a t-test.

Example Hypotheses: Isometric Torque. Is there any difference in the length of time that males and females can sustain an isometric muscular contraction? [Figure: Sustained Isometric Torque (seconds); Mean A and Mean B] NON-normal distribution: the mean is INappropriate (Type 2 error).

…assumptions of parametric analyses: All means and paired differences are normally distributed (ND; this is the main consideration). N acquired through random sampling. Data must be of at least the interval level of measurement (LOM). Data must be continuous. …but see Norman (2010) Adv. Health Sci. Educ.

Non-Parametric Tests. These tests use the median and do not assume anything about distribution, i.e. they are ‘distribution free’. Mathematically, value is ignored (i.e. the magnitude of differences is not compared). Instead, data are analysed simply according to rank.

Non-Parametric Tests. Independent Measures: Mann-Whitney Test. Repeated Measures: Wilcoxon Test. e.g. Exam grades (ordinal) from 14 students in 2 separate schools.

Mann-Whitney U: Calculation. Step 1: Rank all the data from both groups in one series, then total the ranks for each group (∑R_A and ∑R_B).
School A (student, grade): J.S. B-; L.D. B-; H.L. A+; M.J. D-; T.M. B+; T.S. A-; P.H. F. Median = B-.
School B (student, grade): T.J. D; M.M. C+; K.S. C+; P.S. B-; R.M. E; P.W. C-; A.F. A-. Median = C+.

Mann-Whitney U: Calculation. Step 2: Calculate two versions of the U statistic using:

$$U_1 = (n_A \times n_B) + \frac{(n_A + 1) \times n_A}{2} - \sum R_A$$

AND…

$$U_2 = (n_A \times n_B) + \frac{(n_B + 1) \times n_B}{2} - \sum R_B$$

Mann-Whitney U: Calculation. Step 2: Calculate two versions of the U statistic using:

$$U_1 = (n_A \times n_B) + \frac{(n_A + 1) \times n_A}{2} - \sum R_A$$

…OR to save time you can calculate U_1 and then U_2 as follows:

$$U_2 = (n_A \times n_B) - U_1$$

Mann-Whitney U: Calculation. Step 3: Select the smaller of the two U statistics (U1 = 17.5; U2 = 31.5), then consult a table of critical values for the Mann-Whitney test. Calculated U must be less than critical U to conclude a significant difference. Conclusion: Median A = Median B.
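The three Mann-Whitney steps can be sketched in Python using the exam grades from Step 1. Several things here are assumptions rather than statements from the lecture: the first seven grades in the flattened table are taken to belong to School A and the last seven to School B (this agrees with the quoted medians), the grade ordering in `GRADE_SCALE` (F lowest, A+ highest) is a guess at the marking scheme, rank 1 is given to the lowest grade, and tied grades share their average rank. The critical value of 8 is the one shown against the SPSS output for two groups of 7.

```python
# Assumed ordinal coding of the grades, lowest to highest.
GRADE_SCALE = ["F", "E", "D-", "D", "D+", "C-", "C", "C+",
               "B-", "B", "B+", "A-", "A", "A+"]


def average_ranks(values):
    """Rank values from lowest (rank 1) upwards; tied values get their mean rank."""
    ordered = sorted(values)
    ranks = {}
    for value in set(ordered):
        first = ordered.index(value) + 1       # rank of the first tied value
        count = ordered.count(value)           # number of tied values
        ranks[value] = first + (count - 1) / 2
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U1, U2) using the rank-sum formulas from Step 2."""
    ranks = average_ranks(group_a + group_b)   # Step 1: one combined ranking
    n_a, n_b = len(group_a), len(group_b)
    sum_r_a = sum(ranks[v] for v in group_a)
    u1 = n_a * n_b + n_a * (n_a + 1) / 2 - sum_r_a
    u2 = n_a * n_b - u1                        # shortcut for the second statistic
    return u1, u2


school_a = [GRADE_SCALE.index(g) for g in ["B-", "B-", "A+", "D-", "B+", "A-", "F"]]
school_b = [GRADE_SCALE.index(g) for g in ["D", "C+", "C+", "B-", "E", "C-", "A-"]]

u1, u2 = mann_whitney_u(school_a, school_b)
u = min(u1, u2)                                # Step 3: take the smaller U
critical_u = 8
significant = u < critical_u                   # significant only if U is BELOW critical U

print(f"U1 = {u1}, U2 = {u2}, significant = {significant}")
```

Under these assumptions the sketch reproduces the U1 = 17.5 and U2 = 31.5 quoted in Step 3, which is some reassurance that the grade assignment and ordering are the ones used in the lecture.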

Mann-Whitney U: SPSS Output. Calculated U (lower value) = 17.5 > critical U = 8, so P > 0.05 (n.s.).

Non-Parametric Tests. Independent Measures: Mann-Whitney Test. Repeated Measures: Wilcoxon Test. e.g. One group pre-test post-test, assumed non-normal.

Wilcoxon Signed Ranks: Calculation. Step 1: Rank all the differences in one series (ignoring signs), then total each. Columns: Athlete | Pre-training OBLA (kph) | Post-training OBLA (kph) | Diff. | Rank | Signed Ranks. Athletes: J.S., L.D., H.L., M.J., T.M., T.S., P.H. Totals: Medians and ∑Signed Ranks.

Wilcoxon Signed Ranks: Calculation. Step 2: The smaller of the T values is our test statistic (T+ = 18; T- = 10), so now consult a table of critical values for the Wilcoxon test. Calculated T must be less than critical T to conclude a significant difference. Conclusion: Median A = Median B.
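A minimal Python sketch of the signed-ranks procedure follows. The athletes' OBLA values are not preserved in this transcript, so the `pre` and `post` lists are hypothetical numbers chosen only so that the rank totals match the T+ = 18 and T- = 10 quoted in Step 2. Dropping zero differences and averaging tied ranks are common conventions assumed here, not details taken from the slides; the critical value of 2 is the one shown against the SPSS output for n = 7.

```python
def wilcoxon_t(before, after):
    """Return (T+, T-): rank totals of the positive and negative differences."""
    diffs = [b - a for a, b in zip(before, after) if b != a]  # zero differences dropped
    magnitudes = sorted(abs(d) for d in diffs)                # rank ignoring signs

    def rank(magnitude):
        first = magnitudes.index(magnitude) + 1
        return first + (magnitudes.count(magnitude) - 1) / 2  # mean rank for ties

    t_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus


# Hypothetical pre- and post-training OBLA speeds (kph) for 7 athletes.
pre = [14.0, 13.5, 15.0, 12.75, 14.5, 13.0, 15.25]
post = [15.75, 15.0, 16.25, 12.5, 14.0, 12.25, 14.25]

t_plus, t_minus = wilcoxon_t(pre, post)
t_stat = min(t_plus, t_minus)      # the smaller T is the test statistic
critical_t = 2
significant = t_stat < critical_t  # significant only if T is BELOW critical T

print(f"T+ = {t_plus}, T- = {t_minus}, significant = {significant}")
```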

Wilcoxon Signed Ranks: SPSS Output. Calculated T = 10 > critical T = 2, so P > 0.05 (n.s.).

So which stats test should you use? Q1. What is the LOM? (Nominal; Ordinal; Interval/Ratio) Q2. Are the data ND? (Yes; No) Q3. Are the data paired or independent?
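The three questions can be read as a small decision procedure. The sketch below is one interpretation, assembled from the tests covered in this lecture rather than copied from the original flow chart (whose branches are not preserved here); the function name `choose_test` and its arguments are hypothetical. Nominal data are not dealt with in this lecture, so the function simply says so.

```python
def choose_test(level_of_measurement, normally_distributed, paired):
    """Suggest a two-group test from Q1 (LOM), Q2 (ND?) and Q3 (paired?)."""
    lom = level_of_measurement.lower()
    if lom not in ("nominal", "ordinal", "interval", "ratio"):
        raise ValueError(f"unknown level of measurement: {level_of_measurement}")
    if lom == "nominal":
        return "not covered in this lecture"

    # Parametric tests need at least interval data AND a normal distribution.
    parametric = lom in ("interval", "ratio") and normally_distributed
    if parametric:
        return "paired t-test" if paired else "independent t-test"
    return "Wilcoxon test" if paired else "Mann-Whitney test"


print(choose_test("ratio", normally_distributed=True, paired=False))
print(choose_test("ordinal", normally_distributed=False, paired=False))
print(choose_test("interval", normally_distributed=False, paired=True))
```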

Why do we use Hypothesis Testing? It is easy (i.e. data in → P value out). It provides the ‘Illusion of Scientific Objectivity’. Everybody else does it.

Problems with Hypothesis Testing? P < 0.05 is an arbitrary probability (P < 0.06?). The size of the effect is not expressed. The variability of this effect is not expressed. Overall, hypothesis testing ignores ‘judgement’.