Stat 155, Section 2, Last Time
Binomial Distribution
– Normal Approximation
– Continuity Correction
– Proportions (different scale from “counts”)
Distribution of Sample Means
– Law of Averages, Part 1
– Normal Data → Normal Mean
– Law of Averages, Part 2: Everything (averaged) → Normal

Reading In Textbook
Approximate Reading for Today’s Material: Pages 382-396, 400-416
Approximate Reading for Next Class: Pages 425-428, 431-439

Chapter 6: Statistical Inference Main Idea: Form conclusions by quantifying uncertainty (will study several approaches, first is…)

Section 6.1: Confidence Intervals Background: The sample mean, $\bar{X}$, is an “estimate” of the population mean, $\mu$. How accurate? (there is “variability”, how much?)

Confidence Intervals Recall the Sampling Distribution: $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$ (maybe an approximation), where the second parameter is the standard deviation.

Confidence Intervals Thus understand error as: $\bar{X} - \mu \sim N(0, \sigma/\sqrt{n})$. How to explain to untrained consumers? (who don’t know randomness, distributions, normal curves)

Confidence Intervals Approach: present an interval with endpoints: Estimate ± margin of error, i.e. $\bar{X} \pm m$, reflecting variability. How to choose $m$?

Confidence Intervals Choice of “Confidence Interval radius”, i.e. margin of error, $m$. Notes: No absolute range (i.e. including “everything”) is available, because of the infinite tails of the normal dist’n. So need to specify desired accuracy.

Confidence Intervals HW: 6.1

Confidence Intervals Choice of margin of error, $m$: Approach: Choose a Confidence Level, often 0.95 (e.g. FDA likes this number for approving new drugs, and it is a common standard for publication in many fields), and take the margin of error to include that part of the sampling distribution.

Confidence Intervals E.g. For confidence level 0.95, want the area under the $\bar{X}$ distribution between $\mu - m$ and $\mu + m$ to be 0.95, where $m$ = margin of error.

Confidence Intervals Computation: Recall NORMINV takes areas (probs), and returns cutoffs. Issue: NORMINV works with lower areas (note: lower tail included).

Confidence Intervals So adapt needed probs to lower areas: when inner area = 0.95, right tail = 0.025 and shaded (lower) area = 0.975. So need to compute the cutoff: NORMINV(0.975, $\mu$, $\sigma/\sqrt{n}$)
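
The conversion from inner area to lower area can be sketched in Python. This is a minimal illustration, assuming the standard-library `statistics.NormalDist`, whose `inv_cdf` method plays the role of NORMINV here; the helper name `lower_area` is illustrative and not from the lecture.

```python
from statistics import NormalDist

def lower_area(level):
    """Convert an inner (central) area into the lower area that
    NORMINV-style inverse functions expect."""
    right_tail = (1 - level) / 2   # e.g. 0.025 when level = 0.95
    return 1 - right_tail          # e.g. 0.975

# Cutoff on the standard normal scale for confidence level 0.95
p = lower_area(0.95)
z = NormalDist(0, 1).inv_cdf(p)
print(p, round(z, 3))
```

The same `inv_cdf` call works with any center and spread, which is what makes recentering at 0 convenient when $\mu$ is unknown.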

Confidence Intervals Need to compute: NORMINV(0.975, $\mu$, $\sigma/\sqrt{n}$). Major problem: $\mu$ is unknown. But should the answer depend on $\mu$? “Accuracy” is only about spread, not centerpoint. Need another view of the problem.

Confidence Intervals Approach to unknown $\mu$: Recenter, i.e. look at the dist’n of $\bar{X} - \mu$. Key concept: centered at 0. Now can calculate as: $m$ = NORMINV(0.975, 0, $\sigma/\sqrt{n}$)

Confidence Intervals Computation of $m$ = NORMINV(0.975, 0, $\sigma/\sqrt{n}$): Smaller problem: don’t know $\sigma$. Approach 1: Estimate $\sigma$ with $s$ (leads to complications, will study later). Approach 2: Sometimes know $\sigma$.

Confidence Intervals E.g. Crop researchers plant 15 plots with a new variety of corn. The yields, in bushels per acre, are:
138 139.1 113 132.5 140.7 109.7 118.9 134.8 109.6 127.3 115.6 130.4 130.2 111.7 105.5
Assume that $\sigma$ = 10 bushels / acre.

Confidence Intervals E.g. Find:
a) The 90% Confidence Interval for the mean value, $\mu$, for this type of corn.
b) The 95% Confidence Interval.
c) The 99% Confidence Interval.
d) How do the CIs change as the confidence level increases?
Solution: Part 1 of http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg22.xls
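
As a rough cross-check on the spreadsheet solution, the three intervals can also be computed in Python. This is a sketch under the slide's assumption that $\sigma$ = 10 is known; `z_interval` is an illustrative helper name, and `NormalDist.inv_cdf` stands in for NORMINV.

```python
from math import sqrt
from statistics import NormalDist, mean

# Corn yields (bushels per acre) from the example
yields = [138, 139.1, 113, 132.5, 140.7, 109.7, 118.9, 134.8,
          109.6, 127.3, 115.6, 130.4, 130.2, 111.7, 105.5]
sigma = 10.0  # assumed known, as stated in the example

def z_interval(data, sigma, level):
    """Confidence interval for the mean when sigma is known."""
    n = len(data)
    xbar = mean(data)
    se = sigma / sqrt(n)  # SD of the sample mean
    # Recentered at 0: margin of error is the upper cutoff of N(0, se)
    m = NormalDist(0, se).inv_cdf(1 - (1 - level) / 2)
    return xbar - m, xbar + m

for level in (0.90, 0.95, 0.99):
    low, high = z_interval(yields, sigma, level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f})")
```

All three intervals share the same center; only the margin of error grows with the confidence level, which answers part d).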

Confidence Intervals An EXCEL shortcut: CONFIDENCE. Careful: the parameter is $\alpha$ = 2-tailed outer area. So for level = 0.90, $\alpha$ = 0.10.
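
For readers without EXCEL, a small Python function can mimic what CONFIDENCE returns. This is only a sketch: the function name and argument order below mirror the spreadsheet call for convenience and are not a library API.

```python
from math import sqrt
from statistics import NormalDist

def confidence(alpha, standard_dev, size):
    """Margin of error for a mean with known SD.
    alpha is the two-tailed OUTER area, i.e. 1 - confidence level."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / sqrt(size)

# Level 0.90 means alpha = 0.10; sigma = 10 and n = 15 as in the corn example
m = confidence(0.10, 10, 15)
print(round(m, 3))
```

Passing the confidence level itself (0.90) instead of $\alpha$ (0.10) is the easy mistake here: it would produce a far smaller margin than intended.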

Confidence Intervals HW: 6.5, 6.9, 6.13, 6.15, 6.19

Choice of Sample Size Additional use of margin of error idea. Background: distributions of $\bar{X}$ for small n (more spread) and large n (less spread).

Choice of Sample Size Could choose n to make $\sigma/\sqrt{n}$ = desired value. But S. D. is not very interpretable, so make “margin of error”, m = desired value. Then get: “$\bar{X}$ is within m units of $\mu$, 95% of the time”

Choice of Sample Size Given m, how do we find n? Solve for n (the equation): $P\{|\bar{X} - \mu| \le m\} = 0.95$

Choice of Sample Size Graphically, find m so that, for the dist’n of $\bar{X} - \mu$: area between $-m$ and $m$ = 0.95, i.e. area below $m$ = 0.975.

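Choice of Sample Size Thus solve: $m = \text{NORMINV}(0.975, 0, \sigma/\sqrt{n}) = \text{NORMINV}(0.975, 0, 1) \cdot \sigma/\sqrt{n}$, which gives $n = \left(\text{NORMINV}(0.975, 0, 1) \cdot \sigma / m\right)^2$

Choice of Sample Size Numerical fine points: Change the lower area 0.975 for coverage prob. ≠ 0.95. Round decimals upwards, to be “sure of desired coverage”.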
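
The sample size calculation, including the round-up rule, can be sketched in Python as follows. The inputs $\sigma$ = 10 and m = 2 are hypothetical values chosen for illustration, and `sample_size` is an illustrative helper name.

```python
from math import ceil
from statistics import NormalDist

def sample_size(m, sigma, level=0.95):
    """Smallest n whose margin of error is at most m, for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # 0.975 lower area when level = 0.95
    return ceil((z * sigma / m) ** 2)              # round UP to keep the desired coverage

# Hypothetical inputs: sigma = 10, desired margin of error m = 2
n = sample_size(m=2.0, sigma=10.0)
print(n)
```

Rounding down would give a margin of error slightly larger than requested, which is why the slide insists on rounding upwards.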

Choice of Sample Size EXCEL Implementation: Class Example 22, Part 2: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg22.xls HW: 6.22 (1945), 6.23

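Interpretation of Conf. Intervals 2 Equivalent Views:
– pic 1: distribution of $\bar{X}$, centered at $\mu$: 95% of the area lies within $m$ of $\mu$
– pic 2: interval centered at $\bar{X}$: covers $\mu$ 95% of the time

Interpretation of Conf. Intervals Mathematically: $0.95 = P\{\mu - m \le \bar{X} \le \mu + m\}$ (pic 1) $= P\{\bar{X} - m \le \mu \le \bar{X} + m\}$ (pic 2) $= P\{|\bar{X} - \mu| \le m\}$ (no pic)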

Interpretation of Conf. Intervals Frequentist View: If we repeat the experiment many times, about 95% of the time the CI will contain $\mu$ (and 5% of the time it won’t).
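
The frequentist view can be checked with a small simulation. This is a sketch under assumed values (true mean 120, $\sigma$ = 10, n = 15, all hypothetical), using normal data so that the sampling distribution is exact; `coverage` is an illustrative helper name.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, level=0.95, reps=10_000, seed=0):
    """Fraction of simulated known-sigma CIs that contain the true mean."""
    rng = random.Random(seed)
    m = NormalDist().inv_cdf(1 - (1 - level) / 2) * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # One repetition of the experiment: draw a sample, form the interval
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - m <= mu <= xbar + m:
            hits += 1
    return hits / reps

frac = coverage(mu=120.0, sigma=10.0, n=15)
print(frac)
```

The fraction should land near 0.95 without hitting it exactly; the leftover wobble is natural sampling variation in the simulation itself.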

Confidence Intervals Nice Illustration: Publisher’s Website Statistical Applets Confidence Intervals Shows proper interpretation: If repeat drawing the sample Interval will cover truth 95% of time

Interpretation of Conf. Intervals Revisit Class Example 17: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls
Recall Class HW: Estimate % of Male Students at UNC
C.I. View: Class Example 23: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg23.xls
Illustrates idea: CI should cover 95% of time

Interpretation of Conf. Intervals Class Example 23: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg23.xls
– Q1: SD too small → Too many cover
– Q2: SD too big → Too few cover
– Q3: Big Bias → Too few cover
– Q4: Good sampling → About right
– Q5: Simulated Bi → Shows “natural var’n”

Interpretation of Conf. Intervals HW: 6.27, 6.29, 6.31

And now for something completely different…. A fun dance video: http://ebaumsworld.com/2006/07/robotdance.html Suggested by David Moltz

Sec. 6.2 Tests of Significance = Hypothesis Tests Big Picture View: Another way of handling random error I.e. a different view point Idea: Answer yes or no questions, under uncertainty (e.g. from sampling or measurement error)

Hypothesis Tests Some Examples:
– Will Candidate A win the election?
– Does smoking cause cancer?
– Is Brand X better than Brand Y?
– Is a drug effective?
– Is a proposed new business strategy effective? (marketing research focuses on this)

Hypothesis Tests E.g. A fast food chain currently brings in profits of $20,000 per store, per day. A new menu is proposed. Would it be more profitable? Test: Have 10 stores (randomly selected!) try the new menu, let $\bar{X}$ = average of their daily profits.

Fast Food Business Example Simplest View: for $\bar{X} > \$20{,}000$: new menu looks better. Otherwise looks worse. Problem: New menu might be no better (or even worse), but could have $\bar{X} > \$20{,}000$ by bad luck of sampling (only a sample of size 10).

Fast Food Business Example Problem: How to handle & quantify the gray area in these decisions. Note: Can never make a definite conclusion (e.g. as in Mathematics); Statistics is more about real life… (E.g. even if $\bar{X}$ is extremely far from \$20,000 in either direction, that might be bad luck of sampling, although very unlikely)

Hypothesis Testing Note: Can never make a definite conclusion, instead measure strength of evidence. Approach I: (note: different from text) Choose among 3 Hypotheses:
– $H_+$: Strong evidence new menu is better
– $H_0$: Evidence is inconclusive
– $H_-$: Strong evidence new menu is worse

Caution!!! Not following text right now This part of course can be slippery I am “breaking this down to basics” Easier to understand (If you pay careful attention) Will “tie things together” later And return to textbook approach later

Hypothesis Testing Terminology: $H_0$ is called the null hypothesis. Setup: $H_+$, $H_0$, $H_-$ are in terms of parameters, i.e. population quantities (recall population vs. sample).

Fast Food Business Example E.g. Let $\mu$ = true (over all stores) daily profit from new menu.
– $H_+$: $\mu > \$20{,}000$ (new is better)
– $H_0$: $\mu = \$20{,}000$ (about the same)
– $H_-$: $\mu < \$20{,}000$ (new is worse)

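Fast Food Business Example Base decision on best guess: $\bar{X}$. Will quantify strength of the evidence using the probability distribution of $\bar{X}$. E.g.
– $\bar{X} \gg \$20{,}000$ → Choose $H_+$
– $\bar{X} \approx \$20{,}000$ → Choose $H_0$
– $\bar{X} \ll \$20{,}000$ → Choose $H_-$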

Fast Food Business Example How to draw line? (There are many ways, here is traditional approach) Insist that $H_+$ (or $H_-$) show strong evidence, i.e. they get burden of proof. (Note: one way of solving gray area problem)

Fast Food Business Example Assess strength of evidence by asking: “How strange is observed value $\bar{X}$, assuming $H_0$ is true?” In particular, use tails of $H_0$ distribution as measure of strength of evidence.

Fast Food Business Example Use tails of $H_0$ distribution as measure of strength of evidence: take the $\bar{X}$ distribution under $H_0$, mark the observed value of $\bar{X}$, and use the tail probability beyond it to measure strength of evidence.

Hypothesis Testing Define the p-value, for either $H_+$ or $H_-$, as: P{what was seen, or more conclusive | $H_0$}. Note 1: small p-value → strong evidence against $H_0$, i.e. for $H_+$ (or $H_-$). Note 2: p-value is also called observed significance level.

Fast Food Business Example Suppose we observe a sample mean $\bar{X}$ and a sample standard deviation, based on $n = 10$ stores. Note $\bar{X} > \$20{,}000$, but is this conclusive? Or could this be due to natural sampling variation? (i.e. do we risk losing money from new menu?)

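Fast Food Business Example Assess evidence for $H_+$ by: $H_+$ p-value = $P\{\bar{X} \ge \text{observed value} \mid H_0\}$ = area in the upper tail of the $H_0$ distribution, to the right of the observed $\bar{X}$.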

Fast Food Business Example Computation in EXCEL: Class Example 24, Part 1: http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg24.xls
P-value = 0.094. “1 in 10”, “could be random variation”, “not very strong evidence”
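
An upper-tail p-value of this kind can be sketched in Python. The observed mean (21,000) and standard deviation (2,400) below are hypothetical inputs chosen for illustration and are not taken from this transcript; the sketch also assumes the standard deviation can be treated as known, so that a normal sampling distribution applies. `upper_tail_p_value` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(xbar, mu0, sd, n):
    """P{sample mean >= xbar | H0: mu = mu0}, normal sampling distribution."""
    null_dist = NormalDist(mu0, sd / sqrt(n))  # distribution of the sample mean under H0
    return 1 - null_dist.cdf(xbar)

# Hypothetical inputs (assumptions, not from the transcript)
p = upper_tail_p_value(xbar=21_000, mu0=20_000, sd=2_400, n=10)
print(round(p, 3))
```

With these assumed inputs the tail area comes out at about 0.094, the same order as the slide's “1 in 10”: an outcome this favorable would occur roughly one time in ten even if the new menu changed nothing.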
