The Holy Seven Steps of Developing a Statistical Test
Tugrul Temel
Center for World Food Studies, Free University, Amsterdam
February 2000
Step 1. A Data-Generating Process
Given a random sample of n i.i.d. random variables (r.v.)
  – Independence of obs. on a r.v.
  – Identical (or constant) prob. dist. for r.v.
  – Stationary dist. for r.v.
The Borel field
  – The history does not forget
  – Sample space, events, prob. of a set of events (or test size)
A Borel measurable function
Integration over events
The Holy Seven Steps for a Statistical Test
1. A data-generating process
2. A model
3. The hypotheses
4. Asymptotic distributions of distance functions implied by the hypotheses
5. A test statistic
6. A critical region
7. A decision rule
Step 2. A Model
Model specification
  – Parametric, semi-parametric, and non-parametric
Misspecification in
  – degree of polynomial, choice of variables, or both
Measurement error in
  – dependent, independent, or both
Nonlinearity in
  – policy variables, coefficients, or both
Step 3. The Hypotheses
The null and alternative hypotheses
The underlying distance function (d)
The underlying loss and risk functions
  – Subjectivity of risk functions
  – Form of objective functions, e.g., Mean Squared Error
A transformation of d, T=T(d)
Step 4. The Asymptotic Distribution of Tne
Tn is an estimator of T
Tne is the estimated Tn
A value of Tne far from zero is evidence against the null.
To tell how far Tne must be from zero to reject the null, we find its asymptotic distribution.
Apply a CLT and a LLN to obtain a normal dist. for Tne
Estimate the variance of Tne
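A minimal sketch of Step 4, assuming (for illustration only) that the distance is T = mu - mu0 for a population mean mu, so that Tne is the sample mean minus mu0. The names `standardized_distance` and `mu0` are hypothetical and do not come from the slides. Under a CLT, Tne is approximately normal, and the sample variance divided by n estimates its variance.

```python
import math
import random
import statistics

def standardized_distance(sample, mu0):
    """Return (t_ne, se, z): estimated distance, its standard error, and z = t_ne / se."""
    n = len(sample)
    t_ne = statistics.fmean(sample) - mu0      # estimated distance from the null value
    var_hat = statistics.variance(sample) / n  # estimated variance of t_ne
    se = math.sqrt(var_hat)
    return t_ne, se, t_ne / se

# Illustrative data: 500 draws from N(0, 1), tested against mu0 = 0.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
t_ne, se, z = standardized_distance(sample, mu0=0.0)
print(t_ne, se, z)
```

Under the null, z is approximately standard normal, which is what lets "far from zero" be given a probability.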
Step 5. A Test Statistic (TS)
Define TS=f(Tne)
Using different random samples (i), find a distribution of TS(i)
Is the TS sufficient?
In reality, we have only one random sample, i.e., i=1.
Obtain TS(i) by applying
  – The classical approach
  – The Bayesian approach
  – The bootstrap
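With only one random sample available, the bootstrap option can be sketched as resampling that sample with replacement and recomputing the statistic each time. This is a minimal illustration, not the author's procedure: `bootstrap_ts`, `n_boot`, and `seed` are assumed names, and the sample mean stands in for TS.

```python
import random
import statistics

def bootstrap_ts(sample, stat, n_boot=1000, seed=0):
    """Return n_boot values of stat, each computed on a resample of the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Draw n observations with replacement, n_boot times, and recompute the statistic.
    return [stat(rng.choices(sample, k=n)) for _ in range(n_boot)]

# Illustrative data: one sample of 100 draws from N(0, 1).
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
ts_draws = bootstrap_ts(sample, statistics.fmean)
print(len(ts_draws), min(ts_draws), max(ts_draws))
```

The list `ts_draws` plays the role of TS(i): an empirical distribution built from a single observed sample.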
Step 6. A Critical Region
Construct a critical region given the distribution of TS(i) and the test size (alpha)
The link between the critical region and the Borel field
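A critical region is a set of TS values whose probability under the null equals the test size alpha. The sketch below assumes a two-sided test and shows two ways to get its boundaries: from the standard normal distribution, or from empirical quantiles of simulated TS(i) values. The function names are illustrative, and the simple index-based quantile rule is one choice among several.

```python
from statistics import NormalDist

def asymptotic_region(alpha):
    """Two-sided critical values (-c, c) from the standard normal distribution."""
    c = NormalDist().inv_cdf(1 - alpha / 2)
    return -c, c

def empirical_region(ts_draws, alpha):
    """Two-sided critical values taken as empirical quantiles of simulated TS(i)."""
    s = sorted(ts_draws)
    last = len(s) - 1
    lower = s[round((alpha / 2) * last)]
    upper = s[round((1 - alpha / 2) * last)]
    return lower, upper

def in_critical_region(ts, lower, upper):
    """True when ts lies outside the interval [lower, upper]."""
    return ts < lower or ts > upper

lower, upper = asymptotic_region(0.05)
print(lower, upper, in_critical_region(2.5, lower, upper))
```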
Step 7. A Decision Rule
Reject the null if TS falls in the critical region; do not reject it if TS falls in the acceptance region
Asymptotic versus bootstrapped critical values
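The decision rule can be illustrated with both kinds of critical values side by side. This is a sketch under assumptions not in the slides: the null is H0: mean = 0, TS is a standardized sample mean, and the bootstrap imposes the null by centring each resample at the observed sample mean. All function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def t_stat(sample, mu0):
    """Standardized distance of the sample mean from mu0."""
    n = len(sample)
    return (statistics.fmean(sample) - mu0) / math.sqrt(statistics.variance(sample) / n)

def bootstrap_bounds(sample, alpha, n_boot=1000, seed=0):
    """Bootstrapped two-sided critical values for t_stat under the null."""
    rng = random.Random(seed)
    n = len(sample)
    centre = statistics.fmean(sample)  # centring resamples here imposes the null
    draws = sorted(t_stat(rng.choices(sample, k=n), centre) for _ in range(n_boot))
    last = n_boot - 1
    return draws[round((alpha / 2) * last)], draws[round((1 - alpha / 2) * last)]

def decide(ts, lower, upper):
    """Reject when ts falls outside [lower, upper]; otherwise do not reject."""
    return "reject" if (ts < lower or ts > upper) else "do not reject"

# Illustrative data with true mean 1.0, so H0: mean = 0 is false by construction.
data_rng = random.Random(42)
sample = [data_rng.gauss(1.0, 1.0) for _ in range(100)]
alpha = 0.05
ts = t_stat(sample, mu0=0.0)
z = NormalDist().inv_cdf(1 - alpha / 2)
print("asymptotic:", decide(ts, -z, z))
print("bootstrap: ", decide(ts, *bootstrap_bounds(sample, alpha)))
```

In small samples or with non-normal data the two sets of critical values can differ, which is the comparison the slide points to.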
The Bootstrap
The bootstrap to detect
  – violations of parametric distributional assumptions, such as a non-normal error structure where normal errors are assumed in OLS regression
  – characteristics of TS
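One possible way to use resampling to detect non-normal errors in an OLS regression is a parametric bootstrap of residual skewness: simulate normal errors, pass them through the same regression, and see how often the simulated skewness is as extreme as the observed one. This is an illustrative sketch only. The choice of skewness as the diagnostic, the simple one-regressor model, and all function names are assumptions rather than anything stated in the slides.

```python
import random
import statistics

def ols_residuals(x, y):
    """Residuals from a simple OLS fit of y on a constant and x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    m = statistics.fmean(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5

def normality_pvalue(x, y, n_boot=500, seed=0):
    """Parametric-bootstrap p-value for H0: the regression errors are normal."""
    rng = random.Random(seed)
    resid = ols_residuals(x, y)
    observed = abs(skewness(resid))
    sigma = statistics.pstdev(resid)
    exceed = 0
    for _ in range(n_boot):
        # Under H0 the residuals depend only on the errors, so simulate
        # normal errors and pass them through the same regression on x.
        e_star = [rng.gauss(0.0, sigma) for _ in x]
        if abs(skewness(ols_residuals(x, e_star))) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)

# Illustrative data: a linear model with skewed (exponential) errors.
data_rng = random.Random(7)
x = [i / 10 for i in range(200)]
y = [1.0 + 2.0 * xi + data_rng.expovariate(1.0) for xi in x]
p = normality_pvalue(x, y)
print(p)
```

A small p-value indicates that the residual skewness is larger than normal errors would typically produce.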