Copyright (c) Bani Mallick1 STAT 651 Lecture 7. Copyright (c) Bani Mallick2 Topics in Lecture #7 Sample size for fixed power Never, ever, accept a null.

Presentation transcript:

Slide 1: STAT 651 Lecture 7. Copyright (c) Bani Mallick.

Slide 2: Topics in Lecture #7: sample size for fixed power; never, ever, accept a null hypothesis; paired comparisons in SPSS; Student's t-distribution; confidence intervals when $\sigma$ is unknown; SPSS output on confidence intervals, without formulae.

Slide 3: Book Sections Covered in Lecture #7: Chapter 5.5 (sample size); Chapter 6.4 (paired data); Chapter 5.7 (t-distribution); my own screed (never, ever, accept a null hypothesis).

Slide 4: Lecture 6 Review: Hypothesis Testing. Suppose you want to know whether the population mean change in reported caloric intake equals zero. We have already done this! Confidence intervals tell you where the population mean $\mu$ is, with specified probability. If zero is not in the confidence interval, then you can reject the hypothesis.

Slide 5: Lecture 6 Review: Type I Error (False Reject). A Type I error occurs when you say that the null hypothesis is false when in fact it is true. You can never know for certain whether or not you have made such an error. You can only control the probability that you make such an error. It is convention to make the probability of a Type I error 5%, although 1% and 10% are also used.

Slide 6: Lecture 6 Review: Type I Error Rates. Choose a confidence level, call it $1-\alpha$. The Type I error rate is $\alpha$. 90% confidence interval: $\alpha$ = 10%. 95% confidence interval: $\alpha$ = 5%. 99% confidence interval: $\alpha$ = 1%.
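The link between the confidence level and the Type I error rate can be checked by simulation. The sketch below is an illustration only, not part of the lecture: it assumes normally distributed data with a known standard deviation, and the function name, sample size, number of replications, and seed are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

def type_one_error_rate(alpha=0.05, n=25, sigma=1.0, reps=2000, seed=651):
    """Fraction of simulated studies that reject a TRUE null hypothesis (mean = 0).

    Assumption: sigma is known, so the interval uses a normal (z) critical value.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    half_width = z_crit * sigma / n ** 0.5        # half-length of the confidence interval
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]  # the null is true here
        # Reject when zero falls outside the (1 - alpha) confidence interval
        if abs(mean(sample)) > half_width:
            rejections += 1
    return rejections / reps

for alpha in (0.10, 0.05, 0.01):
    rate = type_one_error_rate(alpha)
    print(f"{1 - alpha:.0%} CI -> simulated Type I error rate {rate:.3f}")
```

Each simulated rate should land close to the chosen $\alpha$, up to simulation noise.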

Slide 7: Lecture 6 Review: Type II: The Other Kind of Error. The other type of error occurs when you do NOT reject the null hypothesis even though it is false. This often occurs because your study's sample size is too small to detect meaningful departures from the null hypothesis. Statisticians spend a lot of time trying to figure out a priori if a study is large enough to detect meaningful departures from a null hypothesis.

Slide 8: Lecture 6 Review: P-values. Small p-values indicate that you have rejected the null hypothesis. If p < 0.05, this means that you have rejected the null hypothesis at a confidence level of 95%, i.e., with a Type I error rate of 0.05. If p > 0.05, you did not reject the null hypothesis at these levels.

Slide 9: Lecture 6 Review: Statistical Power. Statistical power is defined as the probability that you will reject the null hypothesis when you should reject it. If $\beta$ is the Type II error, power = $1-\beta$. The Type I error (test level) does NOT depend on the sample size: you chose it (5%?). The power depends crucially on the sample size.
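To see how strongly power depends on the sample size, here is a minimal sketch for a two-sided test of mean = 0. It is an illustration under assumptions, not the lecture's own method: it treats the population standard deviation as known (a z-test), and it borrows the values 180 for the true shift and 600 for the standard deviation from the sample size example later in this lecture. The function name is hypothetical.

```python
from statistics import NormalDist

def power_two_sided_z(delta, sigma, n, alpha=0.05):
    """Probability of rejecting 'mean = 0' when the true mean is delta.

    Assumption: sigma is known and the sample mean is normally distributed.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * n ** 0.5 / sigma  # true mean in standard-error units
    # Reject when the z statistic is above z_crit or below -z_crit
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

for n in (10, 50, 100, 200, 400):
    print(f"n = {n:3d}: power = {power_two_sided_z(180, 600, n):.2f}")
```

When the true shift is zero the function returns $\alpha$ itself, which matches the statement that the test level does not depend on the sample size.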

Slide 10: Sample Size Calculations. You want to test, at level (Type I error) $\alpha$, the null hypothesis that the mean = 0. You want power $1-\beta$ to detect a change from the hypothesized mean by the amount $\Delta$ or more, i.e., the mean is greater than $\Delta$ or the mean is less than $-\Delta$. There is a formula for this!

Slide 11: Sample Size Calculations. Look up $z_{\alpha/2}$ and $z_{\beta}$. Remember what they are? Find the values in Table 1 which give you readings of $1-\alpha/2$ and $1-\beta$. Required sample size is $n = \sigma^2 (z_{\alpha/2} + z_{\beta})^2 / \Delta^2$.

Slide 12: Sample Size Calculations. $\alpha = 0.01$, power $1-\beta = 0.90$, $\Delta = 180$, $\sigma = 600$. Look up $z_{\alpha/2} = 2.58$ and $z_{\beta} = 1.28$ (check this): n = 166. With $\alpha = 0.01$, power $1-\beta = 0.80$, $\Delta = 180$, $\sigma = 600$, $z_{\beta} = 0.84$ (check this): n = 130. The less power you want, the smaller the sample size.
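The two worked sample sizes can be reproduced with a short calculation. This sketch assumes the standard two-sided formula $n = \sigma^2 (z_{\alpha/2} + z_{\beta})^2 / \Delta^2$, rounded up to the next whole subject; that formula is consistent with the slide's answers of 166 and 130. The function name is hypothetical, and the normal quantiles come from Python's standard library rather than Table 1.

```python
import math
from statistics import NormalDist

def required_sample_size(alpha, power, delta, sigma):
    """Sample size for a two-sided test of mean = 0 at level alpha.

    power is 1 - beta; delta is the smallest change worth detecting;
    sigma is the (assumed known) population standard deviation.
    """
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # about 2.58 for alpha = 0.01
    z_beta = std_normal.inv_cdf(power)                # about 1.28 for power = 0.90
    n_exact = sigma ** 2 * (z_alpha_half + z_beta) ** 2 / delta ** 2
    return math.ceil(n_exact)  # round up: you cannot recruit a fraction of a subject

print(required_sample_size(alpha=0.01, power=0.90, delta=180, sigma=600))  # 166
print(required_sample_size(alpha=0.01, power=0.80, delta=180, sigma=600))  # 130
```

Halving the detectable change `delta` roughly quadruples the required sample size, because `delta` enters the formula squared.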

Slide 13: More on Sample Size Calculations. Most often, sample sizes are done by convention or convenience: your professor has used 5 rats/group before successfully; you have time only to interview 50 subjects in total.

Slide 14: More on Sample Size Calculations. Most often, sample sizes are done by convention or convenience. In this case, the sample size calculations can be used after a study if you find no statistically significant effect. You can then guess how large a study you would have needed to detect the effect you have just seen but which was not statistically significant.

Slide 15: Never Accept a Null Hypothesis. Suppose we use a 95% confidence interval, and it includes zero. Why do I say: with 95% confidence, I cannot reject that the population mean is zero? I never, ever say: I can therefore conclude that the population mean is zero. Why is this? Are statisticians just weird? (Maybe so, but not in this case.)

Slide 16: Never Accept a Null Hypothesis: Reason 1. Suppose we use a 95% confidence interval, and it includes zero: [-3, 6]. Why do I say: with 95% confidence, I cannot reject that the population mean is zero? Remember the definition of a confidence interval: with 95% confidence, the true population mean is between -3 and 6. Hence, the true population mean could be 5, and is not necessarily 0.

Slide 17: Never Accept a Null Hypothesis: Reason 2. Suppose we use a 95% confidence interval, and it includes zero: [-3, 6]. Why do I say: with 95% confidence, I cannot reject that the population mean is zero? Potential for chicanery: if you want to accept the null hypothesis, how can you best ensure it?

Slide 18: Never Accept a Null Hypothesis: Reason 2. An example of chicanery: generic drugs. In the pharmaceutical industry, all the expense involves getting a drug approved by the FDA. After a drug goes off-patent, generic drugs can be marketed. The main regulation is that the generic must be shown to be “bioequivalent” to the patent drug.

Slide 19: Never Accept a Null Hypothesis: Reason 2. The generic must be shown to be “bioequivalent” to the patent drug. One way would be to run a study and do a statistical test to see whether the drugs have the same effects/actions: the null hypothesis is that the patent and generic are the same. The alternative is that they are not. If the null is rejected, the generic is rejected, and $$$ issues arise.

Slide 20: Never Accept a Null Hypothesis: Reason 2. Test to see whether the drugs have the same effects/actions: the null hypothesis is that the patent and generic are the same. If the null is rejected, the generic is rejected, and $$$ issues arise. If you pick a tiny sample size, there is no statistical power to reject the null hypothesis.

Slide 21: Never Accept a Null Hypothesis: Reason 2. If you pick a tiny sample size, there is no statistical power to reject the null hypothesis. The FDA is not stupid: they insist that the sample size be large enough that any medically important differences can be detected with 80% ($1-\beta$) statistical power.

Slide 22: Never Accept a Null Hypothesis. p-values are not the probability that the null hypothesis is true. For example, suppose you have a vested interest in not rejecting the null hypothesis. Small sample sizes have the least power for detecting effects. Small sample sizes imply large p-values. Large p-values can be due to a lack of power, or a lack of an effect.
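A quick calculation shows why a large p-value may reflect a small sample rather than the absence of an effect. The sketch below is illustrative only: it assumes a known standard deviation (a z-test rather than the t-test SPSS reports), holds a hypothetical observed mean of -180 and standard deviation of 600 fixed, and varies only the sample size.

```python
from statistics import NormalDist

def two_sided_p_value(xbar, sigma, n):
    """Two-sided p-value for the null 'mean = 0', assuming sigma is known."""
    z = abs(xbar) * n ** 0.5 / sigma  # observed mean in standard-error units
    return 2 * (1 - NormalDist().cdf(z))

# Same observed mean and spread, different sample sizes (hypothetical values)
for n in (3, 10, 30, 121):
    p = two_sided_p_value(xbar=-180, sigma=600, n=n)
    decision = "reject" if p < 0.05 else "cannot reject"
    print(f"n = {n:3d}: p = {p:.4f} -> {decision} the null at the 5% level")
```

The observed effect is identical in every row; only the amount of data changes whether the null is rejected.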

Slide 23: Paired Comparisons: Count Your Number of Populations! The hormone assay data illustrate an important point. Sometimes, we measure 2 variables on the same individuals (here, Reference Method and Test Method). There is only 1 population. How do we compare the two variables to see if they have the same mean?

Slide 24: Paired Comparisons: Count Your Number of Populations! There is only 1 population. How do we compare the two variables to see if they have the same mean? Answer (Ott & Longnecker, Chapter 6.4): do what we did, and first compute the difference of the variables and make inference on this difference: we now have 1 variable. In making inference, match the number of variables to the number of populations!
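The "take the difference, then analyze one variable" idea can be sketched in a few lines. The measurements below are made-up numbers for illustration, not the hormone assay data from the lecture, and the function name is hypothetical.

```python
from statistics import mean, stdev

# Hypothetical paired measurements on the same six individuals
reference = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
test = [10.6, 11.9, 9.7, 12.8, 11.4, 11.5]

def paired_summary(first, second):
    """Reduce two paired variables to one (first - second) and summarize it."""
    if len(first) != len(second):
        raise ValueError("paired data need one value of each variable per individual")
    diffs = [a - b for a, b in zip(first, second)]  # now a single variable
    n = len(diffs)
    mean_diff = mean(diffs)
    std_error = stdev(diffs) / n ** 0.5  # standard error of the mean difference
    return {
        "n": n,
        "mean_diff": mean_diff,
        "std_error": std_error,
        "t": mean_diff / std_error,  # compare with a t table on n - 1 degrees of freedom
    }

summary = paired_summary(reference, test)
for name, value in summary.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Everything after the subtraction is an ordinary one-sample analysis of the differences, which is the point of matching the number of variables to the number of populations.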

Slide 25: Paired Comparisons in SPSS. SPSS has a nice routine way of doing a paired comparison analysis, providing confidence intervals and p-values: “Analyze”, “Compare Means”, “Paired Samples t-test”. Highlight the variables that are paired and select them; use “Options” to get other than a 95% CI.

Slide 26: Paired Comparisons in SPSS. Demo using computer comes next.

Slide 27: Boxplots and Histograms for Paired Data. For paired data, SPSS makes it easy to automatically get confidence intervals: it takes the difference of the paired variables for you. However, for boxplots, q-q plots, etc., you have to do this manually. Here is how you can define a new variable, called “differen”, in the armspan data for males.

Slide 28: Computing the Difference in Paired Comparisons. Click on “Transform”. Click on “Compute”. A new window shows up; in “Target Variable” type in differen. Click on “Type & Label” and type in your label (Height - Armspan in Inches). Click on “Continue”.

Slide 29: Computing the Difference in Paired Comparisons. Highlight height and move it over by clicking the mover button. In “Numeric Expression”, type in the minus sign (-). Highlight armspan and move it over. Click on “OK”. You are done!

Slide 30: Selecting Cases in SPSS. “Data”, “Select Cases”. Push the button “If condition is satisfied”. Select “If”. Select “Gender” and move it over. Then type = ‘Female’ and click “Continue”, then “OK”. All analyses will now be on Females.

Slide 31: Student’s t-Distribution. In real life, the population standard deviation $\sigma$ is never known. We estimate it by the sample standard deviation s. To account for this estimation, we have to make our confidence intervals (make a guess): longer or shorter? Stump the experts!

Slide 32: Student’s t-Distribution. Of course: you have to make the confidence interval longer! This fact was discovered by W. Gosset, the brewmaster of Guinness in Dublin. He wrote it up under the pseudonym “Student”, and his discovery is hence called Student’s t-distribution because he used the letter t in his paper.

Slide 33: Student’s t-Distribution. Effectively, if you want a $(1-\alpha)100\%$ confidence interval, what you do is replace $z_{\alpha/2}$ (1.645, 1.96, 2.58) by $t_{\alpha/2}(n-1)$, found in Table 2 of the book. $n-1$ is called the degrees of freedom. The increase in length of the confidence interval depends on n. If n gets larger, does the CI get larger or smaller?

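Slide 34: Student’s t-Distribution. The $(1-\alpha)100\%$ CI when $\sigma$ was known was $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The $(1-\alpha)100\%$ CI when $\sigma$ is unknown is $\bar{X} \pm t_{\alpha/2}(n-1)\,s/\sqrt{n}$. You replace $\sigma$ by s and $z_{\alpha/2}$ by $t_{\alpha/2}(n-1)$.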

Slide 35: Student’s t-Distribution. Take 95% confidence, $\alpha = 0.05$, $z_{\alpha/2} = 1.96$. n = 3, n-1 = 2: $t_{\alpha/2}(n-1) = 4.303$. n = 10, n-1 = 9: $t_{\alpha/2}(n-1) = 2.262$. n = 30, n-1 = 29: $t_{\alpha/2}(n-1) = 2.045$. n = 121, n-1 = 120: $t_{\alpha/2}(n-1) = 1.98$.
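The critical values on this slide normally come from Table 2 of the book, but they can also be computed. The sketch below is one possible way, not the lecture's method: it integrates the t density with Simpson's rule and inverts the result by bisection, using only the Python standard library. All function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x): 0.5 plus Simpson's-rule area from 0 to x (steps must be even)."""
    if x < 0:
        return 1 - t_cdf(-x, df, steps)  # the distribution is symmetric about 0
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Value t with P(T <= t) = p, for p >= 0.5, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence: alpha = 0.05, so look up the 0.975 quantile
for n in (3, 10, 30, 121):
    print(f"n = {n:3d}, df = {n - 1:3d}: t = {t_quantile(0.975, n - 1):.3f}")
```

As n grows, the computed values shrink toward the normal value 1.96, which answers the slide's question about whether the interval gets longer or shorter.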

Slide 36: Student’s t-Distribution. Luckily, SPSS is smart. It automatically uses Student’s t-distribution in constructing confidence intervals and p-values! So, all the output you will see in SPSS has this correction built in.

Slide 37: Student’s t-Distribution. In the old days, people used the t-test to decide whether the hypothesized value is in the CI. If your hypothesis is that $\mu = 0$, then you reject the hypothesis if $|\bar{X}| / (s/\sqrt{n}) > t_{\alpha/2}(n-1)$. You learn nothing from this that is not available in a CI, but its value is in the SPSS output.

Slide 38: WISH Numerical Illustration. s = 613, $\bar{X} = -180$. n = 3: s.e. = $613/\sqrt{3}$ = 354, $t_{\alpha/2}(n-1)$ = 4.303, CI is -180 plus and minus 1523, hence the interval is [-1703, 1343]. n = 121: s.e. = $613/\sqrt{121}$ = 56, $t_{\alpha/2}(n-1)$ = 1.98, CI is -180 plus and minus 110, hence the interval is [-290, -70]. Note the change in conclusions!
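The two intervals can be recomputed from the stated inputs, s = 613 and a sample mean of -180, using the tabulated critical values 4.303 and 1.98. This is a minimal sketch with a hypothetical function name; small differences from hand-rounded figures are to be expected.

```python
def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper, standard error) for xbar +/- t_crit * s / sqrt(n)."""
    std_error = s / n ** 0.5
    margin = t_crit * std_error
    return xbar - margin, xbar + margin, std_error

# (sample size, tabulated t critical value for 95% confidence)
for n, t_crit in ((3, 4.303), (121, 1.98)):
    lower, upper, std_error = t_confidence_interval(xbar=-180, s=613, n=n, t_crit=t_crit)
    verdict = "includes" if lower <= 0 <= upper else "excludes"
    print(f"n = {n:3d}: s.e. = {std_error:6.1f}, CI = [{lower:8.1f}, {upper:8.1f}], {verdict} zero")
```

With the same mean and standard deviation, the n = 3 interval includes zero and the n = 121 interval does not, which is the change in conclusions the slide points to.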

Slide 39: Armspan Data for Males. Outcome is height - armspan in inches. In SPSS, “Analyze”, “Descriptives”, “Explore” will get you to the right analysis. Illustrate how to do this in SPSS.

Slide 40: Armspan Data for Males. Quantities reported in the SPSS output: sample mean, sample standard error, lower bound of 95% CI, upper bound of 95% CI. Is there evidence with 95% confidence that armspans for males differ systematically from heights?

Slide 41: Armspan Data for Males. Might ask: what about with 90% confidence? Illustrate how to do this in SPSS.

Slide 42: Armspan Data for Males. Quantities reported in the SPSS output: sample mean, sample standard error, lower bound of 90% CI, upper bound of 90% CI. Is there evidence with 90% confidence that armspans for males differ systematically from heights?

Slide 43: Armspan Data for Males. SPSS will compute the p-value for you as well as confidence intervals. For paired comparisons: “Analyze”, “Compare Means”, “Paired Sample”. Highlight the paired variables. It computes the difference of the first named variable in the list minus the second. Illustration in SPSS.

Slide 44: Armspan Data for Males. Quantities reported in the SPSS output: t and the p-value (significance level). SPSS also automatically does a 95% confidence interval for the population mean difference between heights and armspans.