University of Ottawa - Bio 4118 – Applied Biostatistics © Antoine Morin and Scott Findlay 21/10/2015 10:24 PM
Review and important concepts: Biological questions and statistical hypotheses

Presentation transcript:

Slide 1. Review and important concepts: Biological questions and statistical hypotheses

Slide 2. Concepts map

Slide 3. Biological questions vs. statistical null hypotheses
Statistics can help you answer biological questions. However, you must learn to translate biological questions into null hypotheses to be tested.
Biological question: Do males and females differ in size? Null hypothesis: the average sizes of males and females are equal.
Biological question: Does age structure differ between two fish populations? Null hypothesis: age (frequency) distribution is independent of population.

Slide 4. The meaning of p
Informal: the probability that the null hypothesis is true.
Strictly correct: the probability of observing data at least as deviant (from the expected results) as the observed results if in fact the null hypothesis were true, assuming the data were properly collected and all statistical assumptions are met.

Slide 5. To reject or not reject?
The decision to reject or accept the null hypothesis is based on p. This requires some agreement (convention) as to what p value we will consider significant. This threshold value is arbitrary!

Slide 6. Test statistics
In standard statistical analysis, p is estimated by reference to the distribution of an appropriate test statistic. If we know the distribution of the test statistic, we can calculate the probability of getting a test statistic value at least as large (or as small) as the calculated value if H0 were true, i.e., p.

Slide 7. An example
Two samples (1, 2) with mean values that differ by some amount δ. What is the probability p of observing a difference at least this large under H0 that the two means are in fact equal?
[Figure: frequency distributions of Sample 1 and Sample 2]

Slide 8. An example (cont’d)
If H0 is true, and if other assumptions are met (we will get back to this…), the expected distribution of the test statistic t is the Student t distribution.
[Figure: frequency distributions of Sample 1 and Sample 2; probability (p) plotted against t]

Slide 9. An example (cont’d)
For the two samples, suppose t = 2.01. What is the probability of getting a value at least this large under H0 that the two means are in fact equal? Since p is small, a value this large would be unlikely if H0 were true. Therefore, reject H0.
[Figure: frequency distributions of Sample 1 and Sample 2; probability plotted against t, with t = 2.01 marked]

Slide 10. Inference: How to translate p into a conclusion?
If p < 0.05, reject the null hypothesis... but keep p in mind! Report p, not just whether it is “significant” (or not). Remember, the p < 0.05 “convention” is entirely arbitrary!

Slide 11. “Statistical significance” and real-world decision-making: an example
If you were offered the same odds on each horse, on which would you bet? If you were a bookie, would you offer the same odds on each horse? And if you did, would you still be in business?
[Figure: two horses, Clyde’s Fancy and Hypattia]

Slide 12. Statistical errors in hypothesis testing
Two types: a true null hypothesis may be rejected, or a false null hypothesis may be accepted.
Type I error (α): the probability of rejecting a true null hypothesis.
Type II error (β): the probability of accepting a false null hypothesis.

Slide 13. Errors in inference

Conclusion    Reality: H0 is true    Reality: H0 is false
Accept H0     no error               Type II error (β)
Reject H0     Type I error (α)       no error

Slide 14. Errors in inference: an example
[Table: test result (seronegative, seropositive) against reality (no HIV, HIV), with cell values 99%, 95%, 5% and 1%; α and β label the two error cells]

Slide 15. One- and two-tailed null hypotheses
For a 2-tailed H0, there are two rejection regions of size α/2. For a 1-tailed H0, there is one rejection region of size α.
[Figure: probability plotted against t for each case, showing the acceptance region (1 - α) and the rejection regions]
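As a rough numerical sketch of the two cases, the snippet below computes the cutoffs of the rejection regions. It uses the standard normal distribution as a stand-in for Student's t (an assumption that is reasonable only for large samples), and the helper name `critical_values` is hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return (two_tailed, one_tailed) cutoffs for a standard normal test statistic."""
    z = NormalDist()  # standard normal, used here in place of Student's t
    two_tailed = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    one_tailed = z.inv_cdf(1 - alpha)      # all of alpha in a single tail
    return two_tailed, one_tailed

two, one = critical_values(0.05)
print(f"2-tailed: reject if |z| > {two:.3f}")
print(f"1-tailed: reject if z > {one:.3f}")
```

The one-tailed cutoff is the smaller of the two, so a difference in the specified direction reaches the rejection region sooner.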

Slide 16. Example: 2-tailed H0
Biological hypothesis: no difference between populations. H0: μ1 = μ2. Since H0 is 2-tailed, we would reject H0 if μ1 - μ2 > 0 or μ1 - μ2 < 0.
[Figure: frequency distributions of Sample 1 and Sample 2; probability distribution of the test statistic]

Slide 17. Example: 1-tailed H0
Biological hypothesis: the average size of individuals in population 1 is greater than in population 2. H0: μ1 - μ2 ≤ 0. Since H0 is 1-tailed, we would reject H0 if μ1 - μ2 > 0 only.
[Figure: frequency distributions of Sample 1 and Sample 2; probability distribution of the test statistic]

Slide 18. One- versus two-tailed hypotheses
2-tailed hypothesis: reject if any non-random pattern is detected.
1-tailed hypothesis: reject if a specified directional non-random pattern is detected.
For the samples shown, H0: μ1 = μ2 (2-tailed) is rejected, whereas a 1-tailed H0 stated in the direction of the observed difference is accepted.
[Figure: frequency distributions of Sample 1 and Sample 2]

Slide 19. Important note!
For a given “directionality”, a 1-tailed test is more powerful than a 2-tailed test. Therefore, always specify the nature of H0 before your analysis!
[Figure: two probability distributions comparing α and β for the 1-tailed and 2-tailed cases]

Slide 20. Parameters of statistical inference
Type I error rate (α)
Power (1 - Type II error rate = 1 - β)
Sample size (N)
Effect size (δ)
Each of the above is a function of the other three. Hence, if three are known, so is the fourth.

Slide 21. Power
Power is the probability of rejecting the null hypothesis when it is false and a specified alternate hypothesis is true, i.e. 1 - β. Power can only be calculated when a specific alternate hypothesis is specified. Therefore, power depends on the alternate hypothesis. Powerful tests can detect small differences, weak tests only large differences.

Slide 22. Calculating power: an example
Expected distribution of means of samples of 5 housefly wing lengths from normal populations specified by μ (shown above the curves) and σ_Ȳ = [value missing]. The centre curve represents the null hypothesis, H0: μ = 45.5; the curves at the sides represent the alternative hypotheses, H1: μ = 37 or H1: μ = 54. Vertical lines delimit the 5% rejection regions for the null hypothesis.

Slide 23. Power: cont’d
H0: μ = μ0; H1: μ = μ1.
Type II error, β, increases as the alternative hypothesis, H1, approaches the null hypothesis, H0; that is, as μ1 approaches μ0. Shading represents β. Vertical lines mark off 5% critical regions (2.5% in each tail) for the null hypothesis. To simplify the graph, the alternative distributions are shown for one tail only.
[Figure: alternative distributions for μ1 = 54, 53, 50 and 48.5 against the null distribution at μ0 = 45.5, each panel labelled with its β (values missing)]

Slide 24. Effect size
Every null hypothesis in any statistical test implies a value for some population parameter. E.g. under H0 that two population means are equal, the absolute value of the difference δ between the two populations is zero: δ = |μ1 - μ2| = 0.
[Figure: frequency distributions of Sample 1 and Sample 2 along X]

Slide 25. Effect size (cont’d)
More generally, since H0 specifies a lack of some phenomenon, δ quantifies the degree to which the phenomenon is present. So if H0 is false, it is false to some specific degree, quantified by δ, the effect size.
[Figure: frequency distributions of Sample 1 and Sample 2 along X]

Slide 26. Types of power analysis I: power as a function of α, δ and N
Often done after a statistical test, where N (sample size) and effect size (δ) are determined and the null hypothesis has been accepted. Then, for a specified α, we can calculate 1 - β (the power of the test). If 1 - β is low, then the Type II error rate is large, so there is a good chance we have accepted a false H0.
[Figure: frequency distributions of Sample 1 and Sample 2 along X]

Slide 27. Types of power analysis II: N as a function of α, δ and power
A certain effect size (δ) is anticipated (perhaps based on a preliminary sample) with a desired α and 1 - β. Given α, β and δ, we can calculate the minimum sample size N_min required to achieve the desired specifications. This exercise can be very useful in planning experiments.
[Figure: frequency distributions of Pre-sample 1 and Pre-sample 2 along X]
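A minimal sketch of this calculation, under assumptions the slide does not state: a two-tailed one-sample z-test with known standard deviation `sigma`, and the usual normal-approximation formula N = ((z_{1-α/2} + z_{1-β}) · σ / δ)². The function name `min_sample_size` and the example values are hypothetical; δ denotes the anticipated effect size.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Smallest N giving at least the requested power (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = z.inv_cdf(power)           # quantile matching 1 - beta
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical planning values: effect of half a standard deviation
print(min_sample_size(delta=0.5, sigma=1.0))
# Halving the anticipated effect roughly quadruples the required N
print(min_sample_size(delta=0.25, sigma=1.0))
```

A t-based calculation would give slightly larger values for small N, so this is best read as a lower bound for planning.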

Slide 28. Types of power analysis III: δ as a function of α, N and power
Given a desired α, 1 - β and N, what is the minimal detectable effect size δ_min? If δ_min is large, then only large deviations from H0 will be detected (i.e. will result in rejection of H0). Thus, we should be VERY VERY careful NOT to infer that some phenomenon does not exist if we accept H0.
[Figure: frequency distributions of Pre-sample 1 and Pre-sample 2 along X]
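The minimal detectable effect can be sketched the same way. The snippet assumes a two-tailed one-sample z-test with known standard deviation `sigma` and uses the normal-approximation formula δ_min = (z_{1-α/2} + z_{1-β}) · σ / √N; the function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_effect(n, sigma, alpha=0.05, power=0.80):
    """Smallest effect size detectable with the requested power (normal approximation)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * sigma / sqrt(n)

# Hypothetical sample sizes, with sigma expressed in standard-deviation units
for n in (5, 32, 200):
    print(f"N = {n:>3}: minimal detectable effect = {min_detectable_effect(n, 1.0):.2f} sigma")
```

With a small N the detectable effect is large, which is why accepting H0 from a small sample says little about whether a modest effect exists.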

Slide 29. Power: dependence on sample size
Power curves for testing H0: μ = 45.5 against H1: μ ≠ 45.5, for n = 5 and for n = 35. For a given alternative mean wing length, the probability of rejecting a false null hypothesis decreases as N decreases.
[Figure: power (1 - β) plotted against wing length (x 0.1 mm) for n = 5 and n = 35, with μ0 marked]
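The shape of such power curves can be approximated with a short calculation. This is only a sketch: it uses a two-tailed z-test rather than the exact method behind the original figure, and the population standard deviation `SIGMA` is an assumed illustrative value, since the transcript does not preserve it. The alternative means are the ones named in the deck.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 45.5   # null-hypothesis mean from the slides (wing length, x 0.1 mm)
SIGMA = 3.9   # ASSUMED population standard deviation, for illustration only

def power(mu_true, n, alpha=0.05):
    """Approximate power of a two-tailed z-test of H0: mu = MU_0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = (mu_true - MU_0) * sqrt(n) / SIGMA  # true difference in standard-error units
    # Probability that the test statistic lands in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

for mu_1 in (48.5, 50, 53, 54):
    print(f"mu_1 = {mu_1}: power(n=5) = {power(mu_1, 5):.2f}, power(n=35) = {power(mu_1, 35):.2f}")
```

When `mu_true` equals `MU_0` the function returns α, the probability of rejecting a true H0.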

Slide 30. Why power matters
Two pairs of samples with identical means and variances, but differing in N. In the first case (N = 200), power is large, p < .05, therefore reject H0. In the second case (N = 30), power is low, p > .05, therefore accept H0.
[Figure: frequency distributions of size for N = 200 and N = 30]

Slide 31. Power: conclusions
If sample sizes are small, the power of any test is usually low. So, unless one knows the power of the analysis, a decision to accept the null hypothesis is meaningless! Conversely, if power is very high, rejection of the null is very likely, even if deviations from null expectations are small (and perhaps biologically meaningless)!

Slide 32. Statistical hypothesis testing: problems and caveats
Problem 1: many H0s are very unlikely to be true a priori, so that their rejection is not very informative.
[Figure: average yield by treatment (Treatment 1, Treatment 2, Control)]

Slide 33. Statistical hypothesis testing: problems and caveats
Problem 2: the nominal type I error (e.g. α = 0.05) is entirely arbitrary, and may not bear any relationship to biological significance, and even less to decision-making.
[Figure: probability plotted against t, with the threshold for decision-making marked]

Slide 34. Statistical hypothesis testing: problems and caveats
Problem 3: p is the probability of obtaining a test statistic at least as extreme as that observed if H0 is true, but often the actual (sampling) distribution of the test statistic does not match the (assumed) distribution under the null.
[Figure: probability plotted against t for the null and the sampled distributions]

Slide 35. Statistical hypothesis testing: problems and caveats
Problem 4: for a fixed effect size, p depends on sample size (n), so that one can almost always reject H0 if the sample is sufficiently large, even if the observed effect is trivial.
[Figure: curves for a larger and a smaller effect size plotted against sample size (n), with the type I error level 0.05 marked]
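A small numerical illustration of this dependence, assuming a two-tailed z-test with known standard deviation; the observed difference and `sigma` below are hypothetical values chosen to represent a trivial effect.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(diff, sigma, n):
    """Two-tailed p for an observed mean difference `diff` from the H0 value (z-test)."""
    z_obs = abs(diff) * sqrt(n) / sigma
    return 2 * (1 - NormalDist().cdf(z_obs))

# Same trivial observed effect (0.1 standard deviations), increasing sample size
for n in (25, 100, 400, 1600):
    print(f"n = {n:>4}: p = {two_tailed_p(0.1, 1.0, n):.4f}")
```

The effect never changes, yet p crosses the 0.05 convention once n is large enough.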

Slide 36. Statistical hypothesis testing: problems and caveats
Problem 5: since p depends on sample size (n), using a fixed nominal α (e.g. α = 0.05) as n increases is logically inconsistent: even for n = infinity and true H0, α = 0.05!
[Figure: nominal type I error (α) plotted against sample size (n), comparing a fixed α (e.g. 0.05) with an α that depends on n]

Slide 37. Statistical hypothesis testing: solutions
Avoid testing trivial null hypotheses.
Distinguish between biological (or other) significance and statistical significance.
Always provide estimates of effect sizes and their precision, statistical significance (or lack thereof) notwithstanding.
Consider using randomization and/or resampling methods to generate the actual distribution of test statistics.
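One way to apply the last suggestion is a permutation (randomization) test, sketched below for a difference between two sample means. The data are hypothetical, and the function name, the number of permutations and the seed are illustrative choices, not part of the original slides.

```python
import random
from statistics import mean

def permutation_p_value(sample1, sample2, n_perm=5000, seed=0):
    """Two-tailed permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(sample1) - mean(sample2))
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign observations to groups at random
        diff = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly zero
    return (extreme + 1) / (n_perm + 1)

# Hypothetical measurements for two groups
group_a = [10.1, 9.8, 11.2, 10.5, 9.9, 10.8]
group_b = [12.0, 11.5, 12.8, 11.9, 12.4, 13.1]
print(f"permutation p = {permutation_p_value(group_a, group_b):.4f}")
```

The reference distribution here comes from the data themselves, so no assumption about the shape of the test statistic's distribution is needed, only that observations are exchangeable under H0.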

Slide 38. The underlying principle of the t-test
If the match between observed and expected is poorer than would be expected on the basis of measurement precision, then we should reject the null hypothesis.
[Figure: frequency distributions of fork length, observed versus expected, for a “Reject H0” case and an “Accept H0” case]

Slide 39. Why correct for precision?
Large differences between observed and expected may occur because (1) measurements are imprecise, or (2) the hypothesis is false, or (3) some combination of the two. Therefore, to conclude (2), we must first eliminate (1) and (3).
[Figure: frequency distributions of fork length showing the observations, the true distribution, and the expected and observed values, for a “Reject H0” case and an “Accept H0” case]

Slide 40. Principle of the t-test
If the difference between the observed and expected results is much larger than the precision of the measurement, then something is wrong. If the difference between the observed results and those expected under the null hypothesis is much larger than the standard error, then the null hypothesis is probably incorrect.

Slide 41. Components of the t-test
Null hypothesis (H0)
Observations
Test statistic (t)
Assumptions

Slide 42. Testing an extrinsic hypothesis
Test that the mean of the sampled population is equal to some theoretical value μ_T by calculating t = (Ȳ - μ_T) / s_Ȳ, where Ȳ is the sample mean and s_Ȳ is its standard error. What is the probability of obtaining a t value at least as deviant as that observed, given that the null hypothesis is true?
[Figure: observations, true distribution, and expected and observed values, for a “Reject H0” case and an “Accept H0” case]
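A minimal sketch of the calculation, assuming the conventional one-sample statistic t = (sample mean - μ_T) / (s / √n); the transcript does not preserve the slide's own formula, and the observations below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(observations, mu_t):
    """t statistic for H0: the population mean equals the theoretical value mu_t."""
    n = len(observations)
    standard_error = stdev(observations) / sqrt(n)  # sample SD with n - 1 in the denominator
    return (mean(observations) - mu_t) / standard_error

# Hypothetical observations tested against a theoretical mean of 10.0
obs = [10.4, 11.1, 9.8, 10.9, 11.5, 10.2]
print(f"t = {one_sample_t(obs, 10.0):.3f} with {len(obs) - 1} degrees of freedom")
```

Converting t into p requires the Student t distribution with n - 1 degrees of freedom, which Python's standard library does not provide; a statistics package or a printed table would supply that step.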

Slide 43. An example: growth rate in rainbow trout
Use the observed relationship between growth rate and pH to predict the growth rate in a lake of pH = 4.5. The null hypothesis is that the mean growth rate in that lake equals the predicted (theoretical) value. Compare the expected value with the average observed in a lake with pH = 4.5.
[Figure: growth rate (mm/m) plotted against pH, with the predicted value marked; frequency distribution showing the expected value, the observed mean and the true distribution; outcome: accept H0]

Slide 44. Inference: How to translate p into a conclusion?
If p < 0.05, reject the null hypothesis... but keep p in mind! Report p, not just whether it is “significant” (or not). Remember, the p < 0.05 convention is entirely arbitrary!

Slide 45. Assumptions
p is calculated assuming the test statistic t is distributed as Student’s t (t_s), which has a well-known distribution. This assumption is true only if the data are normally distributed.

Slide 46. The distribution of t versus Student’s t (t_s)
Calculation of p assumes p(t) = p(t_s). But, as data become increasingly non-normal, the deviation between the two increases. Therefore, calculated p values are incorrect.
[Figure: probability (p) curves for t_s, for t with slightly non-normal data, and for t with highly non-normal data]

Slide 47. What if data are not normal?
Translation of t into p is incorrect. But the bias is often very small, especially with large samples, due to the Central Limit Theorem. So, use common sense... and worry only when p is close to the nominal α level.

Slide 48. What if data are not normal and p is close to α?
Increase sample size.
Transform data.
Use another (non-parametric) test, one that does not assume normality.

Slide 49. Data transformations
Typically simple mathematical functions like log(X), sqrt(X), arcsin(X). Choice based upon theory or trial and error.
Problem 1: finding an appropriate transformation can be like finding a needle in a haystack.
Problem 2: some data cannot be normalized!
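The trial-and-error approach can be sketched by comparing a simple shape measure before and after each transformation. The data below are hypothetical and strongly right-skewed, and skewness is used only as a rough indicator of departure from normality, not as a formal test.

```python
from math import log, sqrt
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

raw = [1, 2, 2, 3, 4, 6, 9, 15, 30, 80]  # hypothetical, strongly right-skewed data

transforms = {"raw": lambda x: x, "sqrt": sqrt, "log": log}
results = {name: skewness([f(v) for v in raw]) for name, f in transforms.items()}

for name, value in results.items():
    print(f"{name:>4}: skewness = {value:.2f}")
```

For these particular values the log transformation brings skewness closest to zero; other data sets may respond better to a different function, or to none at all.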

Slide 50. Statistical analysis as model building
All statistical analyses begin with a mathematical model that supposedly “describes” the data, e.g., regression, ANOVA. “Model fitting” is then the process by which model parameters are estimated.
[Figure: two panels, linear regression (Y against X) and ANOVA (Y for Group 1, Group 2 and Group 3)]

Slide 51. Translating biological questions into statistical models
Question: Does blackfly abundance vary spatially?
Hypothesis: Food is the answer to everything.
Prediction: Abundance is related to food availability.
Model: Abundance = k + Food + Error
H0: Abundance = k + Error
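As a sketch of what fitting these two models involves, the snippet below compares the H0 model with a model that includes food. It assumes the food term enters as a straight-line effect (Abundance = k + b * Food + Error), which the slide does not spell out, and fits it by ordinary least squares; all data values are hypothetical.

```python
from statistics import mean

food = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]              # hypothetical food availability
abundance = [12.0, 15.0, 19.0, 24.0, 26.0, 31.0]   # hypothetical blackfly abundance

def residual_ss(observed, fitted):
    """Sum of squared differences between observed and fitted values."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))

# H0 model: Abundance = k + Error, with k estimated by the overall mean
k_null = mean(abundance)
ss_null = residual_ss(abundance, [k_null] * len(abundance))

# Model with food: Abundance = k + b * Food + Error (assumed linear form)
food_mean = mean(food)
b = (sum((f - food_mean) * (a - k_null) for f, a in zip(food, abundance))
     / sum((f - food_mean) ** 2 for f in food))
k = k_null - b * food_mean
ss_full = residual_ss(abundance, [k + b * f for f in food])

# Proportion of the variation in abundance accounted for by food
r_squared = 1 - ss_full / ss_null
print(f"k = {k:.2f}, b = {b:.2f}, R^2 = {r_squared:.3f}")
```

Testing H0 then amounts to asking whether the drop in residual variation from `ss_null` to `ss_full` is larger than expected by chance.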