Stor 155, Section 2, Last Time Hypothesis Testing –Assess strength of evidence with P-value P-value interpretation: –Yes – No –Gray – level –1 - sided.

Presentation transcript:

Stor 155, Section 2, Last Time Hypothesis Testing –Assess strength of evidence with P-value P-value interpretation: –Yes – No –Gray – level –1 - sided vs. 2 - sided “paradox”

Reading In Textbook Approximate Reading for Today’s Material: Pages , , Approximate Reading for Next Class: Pages

Hypothesis Testing, III A “paradox” of 2-sided testing: Can get strange conclusions (why is gray level sensible?) Fast food example: suppose gathered more data, so n = 20, and other results are the same

Hypothesis Testing, III One-sided test of: P-value = … = Part 5 of Two-sided test of: P-value = … = 0.062

Hypothesis Testing, III Yes-no interpretation: Have strong evidence But no evidence !?! (shouldn’t bigger imply different?)

Hypothesis Testing, III Notes: i. Shows that yes-no testing is different from usual logic (so be careful with it!) ii. Reason: 2-sided admits more uncertainty into process (so near boundary could make a difference, as happened here) iii. Gray level view avoids this: (1-sided has stronger evidence, as expected)
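The "paradox" can be reproduced numerically. The sketch below is not the class spreadsheet: it assumes a Normal (Z) test statistic and works backwards from the two-sided P-value of 0.062 quoted on the slide. The function name `z_test_p_values` is made up for illustration.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return (one-sided upper-tail, two-sided) P-values for a Z statistic."""
    std_normal = NormalDist()  # N(0, 1)
    one_sided = 1.0 - std_normal.cdf(z)
    two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    return one_sided, two_sided


# Z value whose two-sided P-value is 0.062 (the value on the slide).
z = NormalDist().inv_cdf(1.0 - 0.062 / 2.0)
one_sided, two_sided = z_test_p_values(z)

print(f"one-sided P-value = {one_sided:.3f}")
print(f"two-sided P-value = {two_sided:.3f}")
print("yes-no at level 0.05, one-sided:", one_sided < 0.05)
print("yes-no at level 0.05, two-sided:", two_sided < 0.05)
```

For a positive Z the one-sided P-value is half the two-sided one, so the same data can land on opposite sides of the 0.05 cutoff. In the gray level view the two numbers simply express somewhat stronger and somewhat weaker evidence.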

Hypothesis Testing, III Lesson: 1-sided vs. 2-sided issues need careful: 1. Implementation (choice does affect answer) 2. Interpretation (idea being tested depends on this choice) Better from gray level viewpoint

Hypothesis Testing, III CAUTION: Read problem carefully to distinguish between: One-sided Hypotheses - like: H 0: μ = c vs. H A: μ > c Two-sided Hypotheses - like: H 0: μ = c vs. H A: μ ≠ c

Hypothesis Testing Hints: Use 1-sided when see words like: –Smaller –Greater –In excess of Use 2-sided when see words like: –Equal –Different Always write down H 0 and H A –Since then easy to label “more conclusive” –And get partial credit….

Hypothesis Testing E.g. Textbook problem 6.34: In each of the following situations, a significance test for a population mean, μ, is called for. State the null hypothesis, H 0 and the alternative hypothesis, H A in each case….

Hypothesis Testing E.g. 6.34a An experiment is designed to measure the effect of a high soy diet on bone density of rats. Let μ_S = average bone density of high soy rats, μ_O = average bone density of ordinary rats. H 0: μ_S = μ_O vs. H A: μ_S ≠ μ_O (2-sided, since no question of “bigger” or “smaller”)

Hypothesis Testing E.g. 6.34b Student newspaper changed its format. In a random sample of readers, ask opinions on scale of -2 = “new format much worse”, -1 = “new format somewhat worse”, 0 = “about same”, +1 = “new format somewhat better”, +2 = “new format much better”. Let μ = average opinion score

Hypothesis Testing E.g. 6.34b (cont.) No reason to choose one over other, so do two sided: H 0: μ = 0 vs. H A: μ ≠ 0. Note: Use one sided if question is of form: “is the new format better?”

Hypothesis Testing E.g. 6.34c The examinations in a large history class are scaled after grading so that the mean score is 75. A teaching assistant thinks that his students have a higher average score than the class as a whole. His students can be considered as a sample from the population of all students he might teach, so he compares their score with 75. Let μ = average score for all students of this TA

Hypothesis Testing E.g. Textbook problem 6.36 Translate each of the following research questions into appropriate H 0 and H A. Be sure to identify the parameters in each hypothesis (generally useful, so already did this above).

Hypothesis Testing E.g. 6.36a A researcher randomly divides 6th graders into 2 groups for PE Class, and teaches volleyball skills to both. She encourages Group A, but acts cool towards Group B. She hopes that encouragement will result in a higher mean test score for Group A. Let μ_A = mean test score for Group A, μ_B = mean test score for Group B

Hypothesis Testing E.g. 6.36a Recall: Set up point to be proven as H A

Hypothesis Testing E.g. 6.36b Researcher believes there is a positive correlation between GPA and esteem for students. To test this, she gathers GPA and esteem score data at a university. Let ρ = correlation between GPA & esteem

Hypothesis Testing E.g. 6.36c A sociologist asks a sample of students which subject they like best. She suspects a higher percentage of females, than males, will name English. Let: p_F = prop’n of Females preferring English, p_M = prop’n of Males preferring English

Hypothesis Testing HW on setting up hypotheses: 6.35, 6.37

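Hypothesis Testing Connection between Confidence Intervals and Hypothesis Tests (2-sided): Reject H 0 at Level 0.05 ⟺ P-value < 0.05 ⟺ 2-sided tail area of dist’n < 0.05 ⟺ hypothesized mean is farther than the margin of error from the sample mean (i.e. outside the 95% CI)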

Hypothesis Testing & CIs Reject at Level 0.05 Notes: 1. This is why EXCEL’s CONFIDENCE function uses α = 1 – coverage prob. 2. If only care about 2-sided hypos, then could work only with CIs (and not learn about hypo. tests)
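A small check of the CI and test connection, as a sketch under assumptions: a 2-sided Z test with known SD, and made-up numbers (hypothesized mean 10, SD 2, n = 25). The helper names are hypothetical. `margin_of_error` plays the role of the margin that EXCEL's CONFIDENCE function returns for a given α.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)


def two_sided_p_value(xbar, mu0, sigma, n):
    """2-sided P-value for H0: mu = mu0, with known SD sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def margin_of_error(sigma, n, alpha=0.05):
    """Half-width of the (1 - alpha) confidence interval for the mean."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) * sigma / sqrt(n)


def decisions_agree(xbar, mu0, sigma, n, alpha=0.05):
    """True when 'P-value < alpha' and 'mu0 outside the CI' say the same thing."""
    reject_by_p = two_sided_p_value(xbar, mu0, sigma, n) < alpha
    reject_by_ci = abs(xbar - mu0) > margin_of_error(sigma, n, alpha)
    return reject_by_p == reject_by_ci


# Hypothetical sample means, some near and some far from mu0 = 10.
for xbar in [9.0, 9.5, 10.0, 10.5, 11.0, 12.0]:
    p = two_sided_p_value(xbar, 10.0, 2.0, 25)
    print(xbar, round(p, 4), decisions_agree(xbar, 10.0, 2.0, 25))
```

Both rules compare the same distance, |sample mean – hypothesized mean|, against the same cutoff, so they cannot disagree for a 2-sided test.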

Hypothesis Testing & CIs HW: 6.71

Hypothesis Testing The three traps of Hypothesis Testing (and how to avoid them…) Trap 1: Statistically Significant is different from Really Significant (don’t confuse them)

Hypothesis Testing Traps Trap 1: Statistically Significant is different from Really Significant E.g. To test a painful diet program, 10,000 people were put on it. Their average weight loss was 1.7 lbs, with s = 73. Assess “significance” by hypothesis testing.

Hypothesis Testing Traps Trap 1: Statistically Significant is different from Really Significant See Class Example 25: Trap 1 P-value = Strongly Statistically Significant Careful: Is this practically significant?
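The diet numbers can be run through a quick Z calculation. This is a sketch, not Class Example 25 itself: it assumes a 1-sided test of H 0: mean weight loss = 0 against H A: mean weight loss > 0, and treats s = 73 as the SD, which is reasonable for n = 10,000.

```python
from math import sqrt
from statistics import NormalDist

n = 10_000         # people on the diet (from the slide)
mean_loss = 1.7    # average weight loss in lbs (from the slide)
s = 73.0           # sample SD (from the slide)

# Assumed hypotheses: H0: mu = 0 vs. HA: mu > 0.
standard_error = s / sqrt(n)
z = (mean_loss - 0.0) / standard_error
p_value = 1.0 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f} lbs")
print(f"Z = {z:.2f}")
print(f"1-sided P-value = {p_value:.4f}")
```

The standard error is tiny because n is huge, so even a 1.7 lb average loss sits more than 2 SDs from zero. The P-value measures that, not whether 1.7 lbs matters to anyone.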

Hypothesis Testing Traps Trap 1, e.g: Is this practically significant? NO! Not worth painful diet to lose 1.7 lbs. Resolution: Hypo. testing resolves question: Could observed results be due to chance variation? Answer here is no, since n is really large.

Hypothesis Testing Traps Trap 1, e.g: Is this practically significant? Hypo. testing resolves question: Could observed results be due to chance variation? Answer here is no, since n is really large. But this is different from question: Do results show a big difference?

Hypothesis Testing Traps Trap 2: Insignificant results do not mean nothing is there, Only: Didn’t have strong enough data to actually prove results. E.g. Class 25, Trap 2

Hypothesis Testing Traps Trap 3: Try enough tests, and you will find “something” even where it doesn’t exist. Revisit Class Example 21, Q4 We saw about 5% of CIs don’t cover. So, (using CI – Hypo Test connection), expect about 5% of tests to choose H A, (and claim “strong evidence”) even when H 0 true.
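Trap 3 is easy to see by simulation. The sketch below is not Class Example 21: it draws many samples from a population where H 0 is true (mean 0, SD 1, both assumed), runs a 2-sided Z test at level 0.05 on each, and counts how often the test "finds something". The function name and the simulation sizes are made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def false_positive_rate(num_tests=2000, n=25, alpha=0.05, seed=0):
    """Fraction of tests that reject H0: mu = 0 when H0 is actually true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    std_normal = NormalDist()
    rejections = 0
    for _ in range(num_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * sqrt(n)  # (xbar - 0) / (1 / sqrt(n))
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / num_tests


rate = false_positive_rate()
print(f"fraction of tests claiming strong evidence: {rate:.3f}")
```

The printed fraction should land near 0.05, so running 100 unrelated tests on pure noise would be expected to produce about 5 "discoveries".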

Hypothesis Testing Traps Strategies to avoid Trap 3: 1. Scientific Method: Form Hypothesis tests once. 2. For repeated tests: use careful adjustments: (beyond scope of this course) Get help if needed

Hypothesis Testing Traps HW: 6.74 (about 50) 6.82 (0.382, 0.171, ) 6.83, 6.84 (0.0505, )

Some stuff for next time (next slides)

Hypothesis Testing View 3: level testing Idea: instead of reporting P-value, choose a fixed level, say 5% Then reject H 0, i.e. find strong evidence… When P-value < 0.05 (more generally α) (slight recasting of yes-no version of testing) HW: 6.53 (careful, already assigned above)

Hypothesis Testing View 4: P-value shows only half of the decision problem: Graphical Illustration:
                          Truth: H 0 true    Truth: H A true
Test Result: Choose H 0   Correct            Type II Error
Test Result: Choose H A   Type I Error       Correct

Hypothesis Testing View 4: Both sides of decision problem: Small P-values ⇒ Small Type I error What about Type II error? (seems part of problem) a. Simplistic Answer: Don’t care because have put burden of proof on H A.

Hypothesis Testing View 4: What about Type II error? b. Deeper Answer: Does matter for test sensitivity issues, and test power issues E.g. how large a sample size is needed for a given test power (treated above)?

Hypothesis Testing View 4: Terminology: P{Type I Error} is called level of test 1 - P{Type I Error} is called specificity 1 – P{Type II Error} is called power of test 1 – P{Type II Error} is called sensitivity
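To make the four terms concrete, here is a sketch for one specific setting, chosen as an assumption: a 1-sided Z test of H 0: μ = mu0 against H A: μ > mu0 with known SD, evaluated at one particular alternative value mu1. The function `error_rates` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(mu0, mu1, sigma, n, alpha=0.05):
    """Level, specificity, Type II error and power of a 1-sided Z test.

    Rejects H0: mu = mu0 in favor of HA: mu > mu0 when Z exceeds the
    upper-alpha Normal cutoff; power is evaluated at the true mean mu1.
    """
    std_normal = NormalDist()
    z_cutoff = std_normal.inv_cdf(1.0 - alpha)
    shift = (mu1 - mu0) / (sigma / sqrt(n))  # how far mu1 moves the Z statistic
    power = 1.0 - std_normal.cdf(z_cutoff - shift)
    return {
        "level": alpha,              # P{Type I Error}
        "specificity": 1.0 - alpha,  # 1 - P{Type I Error}
        "type_ii": 1.0 - power,      # P{Type II Error}
        "power": power,              # 1 - P{Type II Error} (= sensitivity)
    }


for n in (10, 25, 50):
    rates = error_rates(mu0=0.0, mu1=0.5, sigma=1.0, n=n)
    print(n, {name: round(value, 3) for name, value in rates.items()})
```

The level is fixed by the choice of α, while power depends on n and on how far the truth is from H 0, which is why sample size calculations are phrased in terms of power.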

And now for something completely different… A statistician’s view on politics… Some Current Controversial Issues: North Carolina State Lottery Replace Social Security by Individual Retirement Plans Debate is passionate, (natural for complex and important issues) But what is missing?

And now for something completely different… Review Ideas on State Lotteries, from our study of Expected Value Not an obvious choice because: Gambling is (at least) unsavory: –Religious objections –Some like it too much –Destroys some lives

And now for something completely different… State Lotteries, not an obvious choice: The only totally voluntary tax: –Nobody required, unlike all other taxes –Money often used for education –Good or bad, given state of economy??? Highest tax burden on the poor –Poor enjoy playing much more –Higher taxes on poor better for society??? –Tendency towards “rich get richer”???

And now for something completely different… What about Individual Retirement Plans: Main Benefit: On average individual investments return greater yields than government investments So can we conclude: “Overall we are all better off”??? Since more total money to go around?

And now for something completely different… Very common mistake in this reasoning: Notice “on average” part of statement Should also think about variation about the average???

And now for something completely different… Variation about average Issue 1: Should think of population of people Average is over this population Expect some to do great And expect some to lose everything What will the percentage of losers be? What do we do with those who lose all? What will that cost?

And now for something completely different… Variation about average Issue 2: Also are averaging over time Overall gains of stock market happen only over this average Some need $$$ when market is down How often will this happen? How do we deal with it?

And now for something completely different… Main concept I hope you carry away from this course: Variation is a fundamental concept Look for it Think about it Ask questions about it (Vital to informed citizenship)

And now for something completely different… Australian joke about Variation: Did you hear about the man who drowned in a lake with average depth 6 inches?

And now for something completely different… Australian joke about Variation: He understood “average”, but not variation about the average

And now for something completely different… Really have such lakes? Yes, in Australia

And now for something completely different… Suggestions of such issues (politics, controversy…) for discussion are welcome….

Hypothesis Testing Other views of hypothesis testing: View 2: Z-scores Idea: instead of reporting p-value (to assess statistical significance) Report the Z-score A different way of measuring significance

Hypothesis Testing – Z scores E.g. Fast Food Menus: Test Using P-value = P{what saw or m.c.| H 0 & H A bd’ry}

Hypothesis Testing – Z scores P-value = P{what saw or m.c.| H 0 & H A bd’ry}

Hypothesis Testing – Z scores P-value This is the Z-score Computation: Class E.g. 24, Part 6 Distribution: N(0,1)

Hypothesis Testing – Z scores P-value So instead of reporting tail probability, Report this cutoff instead, as “SDs away from mean $20,000” HW: 6.67, but use NORMDIST, not Table D
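A sketch of the Z-score view, with made-up data since the class example numbers are not in this transcript: only the hypothesized mean $20,000 comes from the slides, while the sample mean, SD and n below are hypothetical. `NormalDist(mean, sd).cdf(x)` is used here in the same way as a cumulative NORMDIST call.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 20_000.0   # hypothesized mean (from the slides)
xbar = 21_000.0  # hypothetical sample mean
sd = 2_000.0     # hypothetical SD
n = 16           # hypothetical sample size

standard_error = sd / sqrt(n)
z_score = (xbar - mu0) / standard_error  # "SDs away from mean $20,000"

# Upper-tail P-value computed two ways: on the original scale, and
# after standardizing to N(0, 1). They are the same area.
p_original_scale = 1.0 - NormalDist(mu0, standard_error).cdf(xbar)
p_standardized = 1.0 - NormalDist().cdf(z_score)

print(f"Z-score = {z_score:.2f}")
print(f"P-value (original scale) = {p_original_scale:.4f}")
print(f"P-value (standardized)   = {p_standardized:.4f}")
```

Reporting Z = 2 and reporting the tail area carry the same information; the Z-score just states the cutoff rather than the probability beyond it.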

Sec. 7.1: Deeper look at Inference Recall: “inference” = CIs and Hypo Tests Main Issue: In sampling distribution X̄ ~ N(μ, σ/√n), usually σ is unknown, so replace with an estimate, s. For n large, should be “OK”, but what about: n small? How large is n “large”?

Unknown SD Approach: Account for “extra variability in the approximation” Mathematics: Assume individual X_i ~ N(μ, σ), i.e. data have mound shaped histogram Recall averages generally normal But now must focus on individuals

Unknown SD Then (X̄ – μ)/(σ/√n) ~ N(0,1). Replace σ by s, then (X̄ – μ)/(s/√n) has a distribution named: “t-distribution with n-1 degrees of freedom”

t - Distribution Notes: 1. n is a parameter (like μ and σ) that controls “added variability from approximation” View: Study Densities, over degrees of freedom…

t - Distribution Comments on movie Serious differences for n <= 10 (since approximation is bad) Very little difference for n large (since approximation really good) Rule of thumb cutoff: many say “negligible difference for n > 30” Cutoffs more informative than curves
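The size of the difference can be checked without tables. The sketch below simulates the standardized mean with the SD estimated from the data, and counts how often it falls beyond ±1.96, the usual Normal cutoff for 5%. The population (mean 0, SD 1), the sample sizes and the function name `tail_rate` are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import fmean


def tail_rate(n, cutoff=1.96, reps=10_000, seed=1):
    """Fraction of samples whose t statistic is beyond +/- cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    exceed = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = fmean(sample)
        # Sample SD s, with n - 1 in the denominator.
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        t = (xbar - 0.0) / (s / sqrt(n))
        if abs(t) > cutoff:
            exceed += 1
    return exceed / reps


for n in (5, 10, 30, 50):
    print(f"n = {n:2d}: P{{|t| > 1.96}} is about {tail_rate(n):.3f}")
```

For small n the fraction is well above 0.05, because estimating the SD adds variability; for n around 30 and beyond it is close to 0.05, in line with the rule of thumb on the slide.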

t - Distribution Nice Alternate View: Webster West (U. So. Carolina) Applet: Notes: –Big change for small degrees of freedom –Looks very normal for large degrees of freedom –I would overlay with Normal…

t - Distribution Notes: 2. Careful: set “degrees of freedom” = n – 1 (not n) Easy to forget later Good to add to sheet of notes for exam HW: 7.21 a

t - Distribution Notes: 3. Must work with standardized version of X̄, i.e. (X̄ – μ)/(s/√n) No longer can plug mean and SD into EXCEL formulas In text this was already done, Since need this for Normal table calc’ns