QM Spring 2002 Business Statistics Introduction to Inference: Hypothesis Testing
Student Objectives – Review concepts of sampling distributions – List and distinguish between the two types of inference – Summarize hypothesis-testing procedures – Conduct hypothesis tests concerning population/process averages – Understand how to use tables for the t distribution
Recall: Parameters versus Statistics Descriptive numerical measures calculated from the entire population are called parameters. – Quantitative data: μ and σ – Qualitative data: π (proportion) Corresponding measures for a sample are called statistics. – Quantitative data: x-bar and s – Qualitative data: p
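As a minimal sketch of how the sample statistics named above would be computed, here is a short Python example; the data values are invented purely for illustration:

```python
import statistics

# Hypothetical quantitative sample
values = [72, 85, 90, 68, 77, 81]
x_bar = statistics.mean(values)      # sample mean (x-bar)
s = statistics.stdev(values)         # sample standard deviation (n - 1 denominator)

# Hypothetical qualitative sample: 1 = "yes", 0 = "no"
responses = [1, 0, 1, 1, 0, 1, 0, 1]
p = sum(responses) / len(responses)  # sample proportion

print(f"x-bar = {x_bar:.2f}, s = {s:.2f}, p = {p:.2f}")
```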
The Sampling Process [diagram: Population or Process → Sample; the parameter describes the population, the statistic is computed from the sample]
Sampling Distributions Quantitative data – Expected value of x-bar is the population or process average (i.e., μ) – Expected variation in x-bar from one sample average to another, known as the standard error of the mean, equals σ/√n – Distribution of x-bar is approx normal (CLT) Qualitative data – E(p) is π – Standard error is √(π(1-π)/n) – Distribution of p is approx normal (CLT)
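The two standard-error formulas can be checked numerically; this is a sketch only, with the proportion example invented for illustration (the quantitative figures anticipate the WNB example on the next slide):

```python
import math

# Standard error of the sample mean: sigma / sqrt(n)
sigma = 18_599                      # process standard deviation (from the WNB example)
n = 15                              # sample size (from the WNB example)
se_mean = sigma / math.sqrt(n)
print(f"SE of x-bar: {se_mean:.1f}")          # about 4,802

# Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)
pi = 0.40                           # assumed population proportion (illustrative)
n_p = 200                           # assumed sample size (illustrative)
se_prop = math.sqrt(pi * (1 - pi) / n_p)
print(f"SE of p: {se_prop:.4f}")              # about 0.0346
```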
A Review Example from the Homework Supposedly, WNB executive salaries equal the industry average (μ = 80,000) But sample results (n = 15) were – x-bar = $68,270 – s = $18,599 If truly μ = 80,000 – Assume for now that σ = s = $18,599 – What is P(x-bar < 68,270)? – What is P(x-bar ≥ 91,730)?
Some Answers Given the assumptions about μ and σ – Standard error: σ/√n = 18,599/√15 ≈ 4,800 – An x-bar value of 68,270 is (68,270 − 80,000)/4,800 ≈ −2.44 standard errors from the supposed population average – Table probability = 0.4927, thus P(x-bar < 68,270) = 0.5 − 0.4927 = 0.0073 ≈ 0.7% – By symmetry, P(x-bar ≥ 91,730) ≈ 0.7% as well, so P(x-bar ≤ 68,270 or x-bar ≥ 91,730) ≈ 1.4% Now, consider how this might be put to use in addressing the claim – Bring action against WNB (false claim?) – What's the probability of doing so in error?
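The same arithmetic can be reproduced with Python's scipy library; this sketch uses the figures from the slide and, like the slide, treats s as if it were σ:

```python
import math
from scipy import stats

mu0 = 80_000      # supposed industry average
xbar = 68_270     # observed sample mean
s = 18_599        # sample standard deviation, used in place of sigma
n = 15            # sample size

se = s / math.sqrt(n)              # standard error, about 4,802
z = (xbar - mu0) / se              # about -2.44
lower_tail = stats.norm.cdf(z)     # P(x-bar < 68,270), about 0.007
two_tail = 2 * lower_tail          # P(x-bar <= 68,270 or x-bar >= 91,730), about 0.015
print(f"z = {z:.2f}, one-tail = {lower_tail:.4f}, two-tail = {two_tail:.4f}")
```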
Putting Sampling Theory to Work We need to make decisions based on characteristics of a process or population But it's not feasible to measure the entire population or process; instead we sample Therefore, we must draw conclusions about those characteristics from limited sets of observations (samples) These conclusions, made by applying knowledge of sampling theory, are inferences
The Sampling Process (revisited) [diagram: Population or Process → Sample; the parameter describes the population, the statistic is computed from the sample]
Two Types of Statistical Inference Hypothesis testing – Starts with a hypothesis (i.e., claim, assumption, standard, etc.) about a population parameter (μ, σ, π, the shape of the distribution, ...) – Sample results are compared with the hypothesis – Based upon how likely the observed results are, given the hypothesis, a conclusion is made Estimation: a population parameter is concluded to be equal to a sample result, give or take a margin of error, which is based upon a desired level of confidence
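To illustrate the estimation side, a confidence interval takes the sample result plus or minus a margin of error; this sketch reuses the WNB sample figures and assumes a 95% confidence level (the confidence level is an assumption, not from the slides):

```python
import math
from scipy import stats

xbar, s, n = 68_270, 18_599, 15
conf = 0.95

se = s / math.sqrt(n)                      # estimated standard error
z_mult = stats.norm.ppf((1 + conf) / 2)    # about 1.96 for 95% confidence
margin = z_mult * se                       # margin of error, about 9,400

print(f"{conf:.0%} interval for mu: {xbar - margin:,.0f} to {xbar + margin:,.0f}")
# roughly 58,860 to 77,680 under these assumptions
```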
Hypothesis Testing Start by defining hypotheses – Null (H0): what we will believe until proven otherwise; we state this first when testing whether something has changed – Alternate (HA): the opposite of H0; if we are trying to prove something, we state it as HA and start with it, not the null Then state our willingness to reach a wrong conclusion (α, the significance level) Determine the decision rule (DR) Gather data and compare results to the DR
The Logic Involved Suppose someone makes a statement and you wonder about whether or not it’s true You typically do some research and get some evidence If the evidence contradicts the statement but not by much, you typically let it slide (but you’re not necessarily convinced) However, if the evidence is overwhelming, you’re convinced and you take action This is hypothesis testing! Statistics helps us to determine what is “overwhelming”
Errors in Hypothesis Testing Type I: rejecting a true H0 Type II: accepting a false H0 Probabilities – α = P(Type I) – β = P(Type II) Power = P(rejecting a false H0) = 1 − β Controlling risks – The decision rule controls α – The sample size controls β Worst error: Type III (solving the wrong problem)! Hence, be sure H0 and HA are correct
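The meanings of α and power can be illustrated with a small simulation of the sampling distribution; this is a sketch only, and the alternative true mean (70,000) and the α = 0.05 cutoff are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
mu0, sigma, n = 80_000, 18_599, 15        # hypothesized mean, assumed sigma, sample size
z_crit = 1.645                            # lower-tail test at alpha = 0.05: reject if z < -1.645

def rejection_rate(true_mu, trials=100_000):
    """Fraction of simulated samples (with mean true_mu) that lead to rejecting H0."""
    se = sigma / np.sqrt(n)
    xbars = rng.normal(true_mu, se, size=trials)   # simulate the sampling distribution of x-bar
    z = (xbars - mu0) / se
    return np.mean(z < -z_crit)

print("Type I error rate when mu really is 80,000:", rejection_rate(80_000))  # close to 0.05
print("Power when mu really is 70,000:", rejection_rate(70_000))              # 1 - beta, about 0.67
```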
Stating the Decision Rule First, note that no analysis should take place before the DR is in place! The DR can be stated in any of three ways – Critical value of the observed statistic (x-bar) – Critical value of the test statistic (z) – Critical value of the likelihood of the observed result (p-value) Generally, test statistics are used when results are generated manually and p-values are used when results are determined via computer Always indicate the DR on a sketch of the distribution
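The three ways of stating a DR can be computed side by side; this sketch assumes a lower-tail test on the WNB figures at α = 0.05 (the α value is an assumption, not from the slides):

```python
import math
from scipy import stats

mu0, sigma, n, alpha = 80_000, 18_599, 15, 0.05
se = sigma / math.sqrt(n)

# 1. Critical value of the observed statistic (x-bar)
xbar_crit = mu0 + stats.norm.ppf(alpha) * se      # reject if x-bar falls below this
# 2. Critical value of the test statistic (z)
z_crit = stats.norm.ppf(alpha)                    # reject if z < about -1.645
# 3. p-value of the observed result
xbar_obs = 68_270
p_value = stats.norm.cdf((xbar_obs - mu0) / se)   # reject if p-value < alpha

print(f"reject if x-bar < {xbar_crit:,.0f}")      # about 72,100
print(f"reject if z < {z_crit:.3f}")              # about -1.645
print(f"p-value = {p_value:.4f}")                 # about 0.007, so reject at alpha = 0.05
```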
Some Exercises Addressing the Mean Don't forget to sketch distributions! Large sample (CLT applies) – One-tail hypothesis (#8-3) – Two-tail hypothesis (#8-8) Small sample (introducing the t distribution) – One-tail hypothesis (#8-5) – Note: we're really always using the t distribution whenever s is used to estimate the standard error; the difference from z only becomes noticeable when sample sizes are small
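For the small-sample case, a one-sample t test can be run directly with scipy; the data below are invented for illustration and are not the data from exercise #8-5:

```python
from scipy import stats

# Hypothetical small sample (n = 8) and a hypothesized mean of 50, both invented for illustration
data = [52.1, 48.3, 55.0, 49.7, 53.4, 51.2, 47.9, 54.6]
mu0 = 50

# Two-sided test; scipy uses the t distribution with n - 1 degrees of freedom
t_stat, p_two_sided = stats.ttest_1samp(data, popmean=mu0)

# For an upper one-tail test, halve the two-sided p-value when t_stat is positive
p_upper = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
print(f"t = {t_stat:.3f}, two-sided p = {p_two_sided:.3f}, upper-tail p = {p_upper:.3f}")
```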
Homework Section 8-1: – Reread – Rework exercises Read Section 8-3 Work exercises: 28, 29, 30, 34