"Classical" Inference
Two simple inference scenarios
Question 1: Are we in World A or World B?
Possible worlds:

World A:
  X              number   added
  [-0.5, 0.5]    38
  [-1, 1]        68       30
  [-1.5, 1.5]    87       19
  [-2, 2]        95       8
  [-2.5, 2.5]    99       4
  (-∞, ∞)        100      1

World B:
  X              number   added
  [4, 6]         38
  [3, 7]         68       30
  [2, 8]         87       19
  [1, 9]         95       8
  [0, 10]        99       4
  (-∞, ∞)        100      1
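A small sketch of where such figures come from. The slides do not state the two distributions; the percentages happen to match World A ≈ N(0, 1) and World B ≈ N(5, 2²), so those are assumed below purely to reproduce the pattern in the tables.

```python
# Sketch: reproduce the in-interval percentages in the two tables.
# Assumption (not stated on the slides): World A is N(0, 1), World B is N(5, 2^2);
# these choices merely reproduce the 38/68/87/95/99 pattern above.
from scipy.stats import norm

worlds = {"A": norm(loc=0, scale=1), "B": norm(loc=5, scale=2)}
intervals = {
    "A": [(-0.5, 0.5), (-1, 1), (-1.5, 1.5), (-2, 2), (-2.5, 2.5)],
    "B": [(4, 6), (3, 7), (2, 8), (1, 9), (0, 10)],
}

for name, dist in worlds.items():
    print(f"World {name}")
    for lo, hi in intervals[name]:
        pct = 100 * (dist.cdf(hi) - dist.cdf(lo))   # probability X lands in [lo, hi]
        print(f"  [{lo}, {hi}]: {pct:.0f}")
```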
Jerzy Neyman and Egon Pearson
D: Decision in favor of H0 (null hypothesis) or H1 (alternative hypothesis)
T: The truth of the matter: H0 or H1

Correct acceptance of H0:  pr(D = H0 | T = H0) = (1 − α)
Type I Error:              pr(D = H1 | T = H0) = α   [aka size]
Type II Error:             pr(D = H0 | T = H1) = β
Correct acceptance of H1:  pr(D = H1 | T = H1) = (1 − β)   [aka power]
Definition. A subset C of the sample space is a best critical region of size α for testing the hypothesis H0 against the hypothesis H1 if

  pr(X ∈ C | H0) = α,

and for every subset A of the sample space, whenever

  pr(X ∈ A | H0) = α,

we also have

  pr(X ∈ C | H1) ≥ pr(X ∈ A | H1).
Neyman-Pearson Theorem: Suppose that for some k > 0:

  (i)   pr(X ∈ C | H0) = α,
  (ii)  L(H0; x)/L(H1; x) ≤ k for every x ∈ C,
  (iii) L(H0; x)/L(H1; x) ≥ k for every x ∉ C,

where L(H; x) is the likelihood of H on data x. Then C is a best critical region of size α for the test of H0 vs. H1.
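A minimal sketch of the construction the theorem licenses, for one illustrative pair of simple hypotheses not taken from the slides (H0: Xi ~ N(0, 1) vs. H1: Xi ~ N(1, 1), i.i.d.). For this pair the likelihood-ratio condition L(H0)/L(H1) ≤ k reduces to rejecting when the sample mean is large, so the best critical region of size α is a cutoff on the sample mean.

```python
# Sketch of the Neyman-Pearson construction for two simple hypotheses
# (illustrative values, not from the slides): H0: Xi ~ N(0, 1) vs. H1: Xi ~ N(1, 1).
# Here L(H0)/L(H1) <= k is equivalent to xbar >= c, so the best critical
# region of size alpha is C = { x : xbar >= c }.
import numpy as np
from scipy.stats import norm

alpha, n = 0.05, 25
mu0, mu1, sigma = 0.0, 1.0, 1.0

# Choose c so that pr(xbar >= c | H0) = alpha.
c = mu0 + norm.ppf(1 - alpha) * sigma / np.sqrt(n)

rng = np.random.default_rng(0)
x = rng.normal(mu1, sigma, size=n)          # data drawn under H1, for illustration
print("cutoff c =", round(c, 3),
      " xbar =", round(x.mean(), 3),
      " reject H0:", bool(x.mean() >= c))
```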
When the null and alternative hypotheses are both Normal, the relation between the power of a statistical test (1 − β) and α is given by the formula

  1 − β = Φ( (μ1 − μ0)·√n / σ − q ),

where Φ is the cdf of N(0, 1), and q is the quantile determined by α. α fixes the type I error probability, but increasing n reduces the type II error probability β.
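A short sketch of the formula in use, with placeholder values for μ0, μ1, and σ (none are given on the slide): α stays fixed while the power grows, i.e., β shrinks, as n increases.

```python
# Sketch: power of the one-sided Normal test as a function of n, using the formula above.
# mu0, mu1, and sigma are placeholder values, not taken from the slides.
import numpy as np
from scipy.stats import norm

def power(n, mu0=0.0, mu1=1.0, sigma=2.0, alpha=0.05):
    q = norm.ppf(1 - alpha)                      # quantile determined by alpha
    return norm.cdf((mu1 - mu0) * np.sqrt(n) / sigma - q)

for n in (5, 10, 25, 50, 100):
    print(n, round(power(n), 3))                 # alpha is fixed; beta = 1 - power shrinks
```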
Question 2: Does the evidence suggest our world is not like World A?
World A:
  X              number   added
  [-0.5, 0.5]    38
  [-1, 1]        68       30
  [-1.5, 1.5]    87       19
  [-2, 2]        95       8
  [-2.5, 2.5]    99       4
  (-∞, ∞)        100      1
Sir Ronald Aylmer Fisher
Fisherian theory
Significance tests: their disjunctive logic, and p-values as evidence:
``[This very low p-value] is amply low enough to exclude at a high level of significance any theory involving a random distribution… The force with which such a conclusion is supported is logically that of the simple disjunction: Either an exceptionally rare chance has occurred, or the theory of random distribution is not true.'' (Fisher 1959, 39)
Fisherian theory
``The meaning of `H is rejected at level α' is `Either an event of probability α has occurred, or H is false', and our disposition to disbelieve H arises from our disposition to disbelieve in events of small probability.'' (Barnard 1967, 32)
Fisherian theory: Distinctive features
Notice that the actual data x is used to define the event whose significance is evaluated.
Also based on H0 and H1.
Can only reject H0; evidence cannot allow one to accept H0.
Many other theories besides H0 could also explain the data.
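A minimal sketch of the p-value computation behind this disjunctive logic, assuming for illustration a two-sided z-test with an N(0, 1) null sampling distribution and a made-up observed statistic.

```python
# Sketch: a p-value in the Fisherian spirit -- the probability, under H0, of data at
# least as extreme as what was actually observed. Illustrative z-test; the N(0, 1)
# null sampling distribution and the observed value are assumptions for the example.
from scipy.stats import norm

z_observed = 2.9                                 # hypothetical observed statistic
p_value = 2 * (1 - norm.cdf(abs(z_observed)))    # two-sided tail area under H0
print(round(p_value, 4))  # either a rare event has occurred, or H0 is false
```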
Common philosophical simplification:
Hypothesis space given qualitatively: H0 vs. ¬H0; the murderer was Professor Plum, Colonel Mustard, Miss Scarlett, or Mrs. Peacock.
More typical situation:
Very strong structural assumptions.
Hypothesis space given by unknown numeric `parameters'.
The test uses a transformation of the raw data, and a probability distribution for this transformation (≠ the original distribution of interest).
Three Commonly Used Facts
Assume {X1, …, Xn} is a collection of independent and identically distributed (i.i.d.) random variables. Assume also that the Xi share a mean of μ and a standard deviation of σ.
Three Commonly Used Facts
For the mean estimator X̄ = (X1 + … + Xn)/n:
1. E[X̄] = μ
2. SD(X̄) = σ/√n (equivalently, Var(X̄) = σ²/n)
Three Commonly Used Facts
The Central Limit Theorem. If {X1, …, Xn} are i.i.d. random variables from a distribution with mean μ and variance σ², then:
3. (X̄ − μ)/(σ/√n) converges in distribution to N(0, 1) as n → ∞.
Equivalently: for large n, X̄ is approximately distributed as N(μ, σ²/n).
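A sketch checking facts 1-3 by simulation for one arbitrarily chosen non-Normal distribution (Exponential with mean 2); the distribution and sample size are assumptions for the demonstration, not taken from the slides.

```python
# Sketch: verifying the three facts by simulation for an Exponential with mean 2
# (so mu = 2 and sigma = 2); the choice of distribution is arbitrary.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 2.0, 50, 100_000

xbars = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print(xbars.mean())                      # fact 1: close to mu
print(xbars.std(), sigma / np.sqrt(n))   # fact 2: close to sigma / sqrt(n)

# fact 3 (CLT): the standardized means are approximately N(0, 1)
z = (xbars - mu) / (sigma / np.sqrt(n))
print(np.mean(np.abs(z) <= 1.96))        # close to 0.95
```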
Examples
Data: January 2012 CPS
Sample: PhDs, working full time, age
H0: mean income is $75k
[One-sample test output: hypothesized value, test statistic Value, and Probability; figures not reproduced.]
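A sketch of the kind of test summarized in that output, using made-up sample statistics in place of the CPS figures (which are not reproduced here); the large-sample Normal approximation is used.

```python
# Sketch of a one-sample test of H0: mean income is 75k.
# xbar, s, and n are hypothetical stand-ins, not the CPS sample statistics.
import numpy as np
from scipy.stats import norm

mu0 = 75_000                           # H0: mean income is 75k
xbar, s, n = 68_898.0, 61_000.0, 97    # hypothetical sample mean, sd, and size

z = (xbar - mu0) / (s / np.sqrt(n))    # standardized test statistic
p = 2 * (1 - norm.cdf(abs(z)))         # two-sided probability under H0 (Normal approx.)
print(round(z, 3), round(p, 3))
```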
Comments
The background conditions (e.g., the i.i.d. condition behind the sample) are a clear example of `Quine-Duhem' conditions.
When background conditions are met, ``large samples'' don't make inferences ``more certain''.
Multiple tests
Monitoring or ``peeking'' at data, etc.
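A sketch of the ``peeking'' point: under H0, testing repeatedly at the nominal 5% level as data accumulate, and stopping at the first rejection, inflates the Type I error rate well beyond 5%. The number of looks and the batch size are arbitrary choices for the illustration.

```python
# Sketch: monitoring/"peeking" inflates the Type I error rate.
# H0 is true throughout: the data are N(0, 1) with sigma known.
import numpy as np

rng = np.random.default_rng(0)
reps, looks, step = 20_000, 10, 20              # test after every 20 new observations

false_rejections = 0
for _ in range(reps):
    x = rng.normal(0.0, 1.0, size=looks * step)
    for k in range(1, looks + 1):
        n = k * step
        z = x[:n].mean() * np.sqrt(n)           # z-statistic with known sigma = 1
        if abs(z) > 1.96:                       # nominal 5% two-sided test
            false_rejections += 1
            break                               # stop at the first "significant" look

print(false_rejections / reps)                  # noticeably larger than 0.05
```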
Point estimates and Confidence Intervals
Many desiderata of an estimator:
Consistent
Maximum likelihood
Unbiased
Sufficient
Minimum variance
Minimum MSE (mean squared error)
(Most) efficient
By the CLT, approximately:
  (X̄ − μ)/(σ/√n) ~ N(0, 1).
Thus:
  pr( −z ≤ (X̄ − μ)/(σ/√n) ≤ z ) ≈ 1 − α, where z is the upper α/2 quantile of N(0, 1).
By algebra:
  pr( X̄ − z·σ/√n ≤ μ ≤ X̄ + z·σ/√n ) ≈ 1 − α.
So:
  X̄ ± z·σ/√n is an approximate (1 − α) confidence interval for μ (e.g., X̄ ± 1.96·σ/√n for 95%).
Interpreting confidence intervals
The only probabilistic component that determines what occurs is X̄; everything else is a constant.
Simulations, examples (see the sketch below)
Question: Why ``center'' the interval?
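A sketch of the coverage simulation referred to above: the endpoints of the interval move with X̄, μ does not, and across repetitions roughly 95% of the intervals contain μ. The true μ, σ, and n are arbitrary values chosen for the demonstration.

```python
# Sketch: coverage of the interval xbar +/- 1.96 * sigma / sqrt(n).
# mu, sigma, n, and reps are arbitrary values for the demonstration.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 3.0, 40, 50_000

covered = 0
for _ in range(reps):
    xbar = rng.normal(mu, sigma, size=n).mean()
    moe = 1.96 * sigma / np.sqrt(n)              # margin of error
    covered += (xbar - moe <= mu <= xbar + moe)  # did this interval capture mu?

print(covered / reps)                            # close to 0.95
```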
Confidence Intervals
``C.I. = mean ± m.o.e.'' ≈ $68,898 ± $12,153 = ($56,745.32, $81,051.01)
Using similar logic, but different computing formulae, one can extend these methods to address further questions, e.g., for standard deviations, equality of means across groups, etc.
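A sketch of the equality-of-means case mentioned here, applied to two simulated groups rather than the CPS data in the tables below; a Welch two-sample t-test is used.

```python
# Sketch: testing equality of means across two groups.
# Both samples are simulated stand-ins, not the CPS data summarized below.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
group1 = rng.normal(70_000, 20_000, size=120)   # hypothetical incomes, group 1
group2 = rng.normal(64_000, 18_000, size=150)   # hypothetical incomes, group 2

stat, p = ttest_ind(group1, group2, equal_var=False)   # Welch's t-test
print(round(stat, 3), round(p, 4))
```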
Equality of Means: BAs
[Table: income summary by Sex (Count, Mean, Std. Dev.), with a row for All, plus test statistic Value and Probability; figures not reproduced.]
Equality of Means: PhDs
[Table: income summary by Sex (Count, Mean, Std. Dev.), with a row for All, plus test statistic Value and Probability; figures not reproduced.]