ESTIMATING RATIOS OF MEANS IN SURVEY SAMPLING
Olivia Smith
March 3, 2016
Overview
Introduction and fundamental concepts
Problem 1: Use simple random sampling without replacement to estimate a population ratio.
Problem 2: Use simple random sampling without replacement to estimate a population total.
Problem 3: Use stratified random sampling without replacement to estimate a population total.
Introduction
Survey sampling is used to estimate characteristics of a population that cannot be fully observed.
Certain sampling methods and estimators are more reliable – more accurate and/or more precise.
Focus of the Talk
Estimating a ratio of means and its application to other estimation problems.
Sampling Methods
Simple random sampling without replacement
Stratified random sampling
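Ratio of Means
Ω: finite population; #(Ω) = N, the population size
X: Ω → R, Y: Ω → R, random variables
ω_j ∈ Ω: the jth individual of the population, j = 1, …, N
X(ω_j) = x_j, Y(ω_j) = y_j
q: ratio of population means, $q = \mu_Y / \mu_X = \sum_{j=1}^{N} y_j \big/ \sum_{j=1}^{N} x_j$, where $\mu_X = \frac{1}{N}\sum_{j=1}^{N} x_j$ and $\mu_Y = \frac{1}{N}\sum_{j=1}^{N} y_j$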
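Mean of Ratios vs. Ratio of Means
Mean of ratios: $\frac{1}{N}\sum_{j=1}^{N} y_j / x_j$
Ratio of means: $q = \mu_Y / \mu_X$, with $\mu_X$ and $\mu_Y$ the population means of X and Y
The two are generally different. We will be discussing the ratio of means.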
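A minimal sketch of the distinction, using a made-up four-unit population (the values are hypothetical and chosen only so the two quantities differ visibly):

```python
# Hypothetical toy population of (x_j, y_j) values, for illustration only.
x = [1.0, 2.0, 4.0, 8.0]
y = [1.0, 1.0, 2.0, 2.0]
N = len(x)

# Ratio of means: q = mean(y) / mean(x), the quantity this talk estimates.
ratio_of_means = (sum(y) / N) / (sum(x) / N)

# Mean of ratios: average of the unit-level ratios y_j / x_j.
mean_of_ratios = sum(yj / xj for xj, yj in zip(x, y)) / N

print(ratio_of_means, mean_of_ratios)
```

Here the ratio of means is 6/15 = 0.4 while the mean of ratios is 2.25/4 = 0.5625, so the two targets should not be confused.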
Problem 1
Use simple random sampling without replacement to estimate the population ratio of means, q.
Simple Random Sampling w/o Replacement
Finite population of size N.
The sample space is made up of n-tuples of distinct units.
There are $N!/(N-n)!$ possible samples, all equally likely.
We will denote the sample mean by $\bar{X}$ when it is treated as a random variable and by $\bar{x}$ when it is an observed value.
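Ratio Estimator
Obtain measurements for both variables, X and Y, from each individual in the sample.
Ratio estimator (random variable): $\hat{Q} = \bar{Y} / \bar{X}$, where $\bar{X}$ and $\bar{Y}$ are the sample means
Ratio estimate (observed value): $\hat{q} = \bar{y} / \bar{x}$, where $\bar{x}$ and $\bar{y}$ are the observed sample means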
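The sampling step and the ratio estimate can be sketched as follows. The population, the seed, and the helper name `ratio_estimate` are all hypothetical; `random.sample` draws distinct units, which matches simple random sampling without replacement.

```python
import random

def ratio_estimate(sample):
    """Return y_bar / x_bar for a list of (x, y) measurements."""
    n = len(sample)
    x_bar = sum(x for x, _ in sample) / n
    y_bar = sum(y for _, y in sample) / n
    return y_bar / x_bar

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: y is roughly 2x plus bounded noise.
population = [(float(j), 2.0 * j + rng.uniform(-1.0, 1.0)) for j in range(1, 201)]

# Simple random sample without replacement: 20 distinct units.
sample = rng.sample(population, 20)
q_hat = ratio_estimate(sample)

# True ratio of population means, available here only because the
# population is simulated.
q = sum(y for _, y in population) / sum(x for x, _ in population)
print(q_hat, q)
```

In a real survey only `q_hat` would be available; `q` is printed here to show how close the estimate lands for this simulated population.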
Example Application
Estimate the percentage of registered Independents voting for a Republican (X) who are going to vote for Donald Trump (Y). (Code “1” for “Yes,” “0” for “No.”)
The number of Independents voting for Trump is unknown.
The number of Independents voting Republican is unknown.
Since the target is the ratio of these two unknown quantities, use the ratio estimator.
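Mean Squared Error (MSE)
A measure of the quality of an estimator: for an estimator $\hat{\theta}$ of a population quantity $\theta$, $\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.
The value of $\hat{\theta}$ depends on the sample n-tuple; the expectation averages the squared error over all possible n-tuples.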
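Lemma 1: MSE of the Sample Mean
Under simple random sampling without replacement, the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ satisfies
$E[(\bar{X} - \mu_X)^2] = \frac{\sigma_X^2}{n} \cdot \frac{N-n}{N-1}$,
where $\mu_X = \frac{1}{N}\sum_{j=1}^{N} x_j$ and $\sigma_X^2 = \frac{1}{N}\sum_{j=1}^{N} (x_j - \mu_X)^2$.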
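Proof of Lemma 1
Write $\mu_X$ for the population mean and $\sigma_X^2 = \frac{1}{N}\sum_{j=1}^{N} (x_j - \mu_X)^2$ for the population variance. We start with
$n(\bar{X} - \mu_X) = \sum_{i=1}^{n} (X_i - \mu_X)$.
If we square this equation, we get
$n^2(\bar{X} - \mu_X)^2 = \sum_{i=1}^{n} (X_i - \mu_X)^2 + \sum_{i \ne k} (X_i - \mu_X)(X_k - \mu_X)$.
Each $X_i$ is equally likely to be any of $x_1, \ldots, x_N$, so $E[(X_i - \mu_X)^2] = \sigma_X^2$. Each pair $(X_i, X_k)$ with $i \ne k$ is equally likely to be any ordered pair of distinct units, so
$E[(X_i - \mu_X)(X_k - \mu_X)] = \frac{1}{N(N-1)} \sum_{j \ne l} (x_j - \mu_X)(x_l - \mu_X)$.
Recall that completing the square yields
$\sum_{j \ne l} a_j a_l = \Big(\sum_{j} a_j\Big)^2 - \sum_{j} a_j^2$,
where in our case $a_j = x_j - \mu_X$.
After completing the square on the cross-product term,
$n^2 E[(\bar{X} - \mu_X)^2] = n\sigma_X^2 + \frac{n(n-1)}{N(N-1)} \Big[ \Big(\sum_{j} (x_j - \mu_X)\Big)^2 - \sum_{j} (x_j - \mu_X)^2 \Big]$.
The sum $\sum_{j} (x_j - \mu_X)$ equals zero, so that term disappears, leaving
$n^2 E[(\bar{X} - \mu_X)^2] = n\sigma_X^2 - \frac{n(n-1)}{N-1}\sigma_X^2 = n\sigma_X^2 \cdot \frac{N-n}{N-1}$.
Dividing by $n^2$ we get
$E[(\bar{X} - \mu_X)^2] = \frac{\sigma_X^2}{n} \cdot \frac{N-n}{N-1}$.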
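Lemma 1 can be checked by brute force on a small population. The sketch below assumes the lemma in the form $E[(\bar{X} - \mu)^2] = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$ with $\sigma^2$ computed with divisor N; the population values are hypothetical. It enumerates unordered samples, which gives the same average as enumerating ordered n-tuples because the sample mean does not depend on order.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population values; exact rational arithmetic avoids rounding.
population = [Fraction(v) for v in (2, 3, 5, 7, 11, 13)]
N, n = len(population), 3

mu = sum(population) / N
sigma2 = sum((v - mu) ** 2 for v in population) / N  # divisor N

# Average squared error of the sample mean over all equally likely samples.
samples = list(combinations(population, n))
mse = sum((sum(s) / n - mu) ** 2 for s in samples) / len(samples)

# Right-hand side of Lemma 1 (assumed form, see lead-in).
lemma = sigma2 / n * Fraction(N - n, N - 1)

print(mse == lemma)
```

Because the arithmetic is exact, the two sides agree exactly rather than approximately.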
Properties of Expectation and Variance
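Covariance and Correlation
These are measures of the statistical dependence of random variables X and Y on each other:
$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
where $\mu_X$, $\mu_Y$ are the means and $\sigma_X$, $\sigma_Y$ the standard deviations of X and Y.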
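For a finite population in which every unit is equally likely, these quantities reduce to plain averages. A small sketch, with the helper name `population_corr` and the data being hypothetical:

```python
import math

def population_corr(xs, ys):
    """Population covariance and correlation, using divisor N."""
    N = len(xs)
    mu_x, mu_y = sum(xs) / N, sum(ys) / N
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / N
    var_x = sum((x - mu_x) ** 2 for x in xs) / N
    var_y = sum((y - mu_y) ** 2 for y in ys) / N
    return cov, cov / math.sqrt(var_x * var_y)

# Hypothetical data: y exactly proportional to x, then exactly reversed.
cov_up, rho_up = population_corr([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
cov_down, rho_down = population_corr([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
print(rho_up, rho_down)
```

A perfectly increasing linear relation gives a correlation of 1 and a perfectly decreasing one gives -1.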
Behavior of the Sample Mean for Large n
Example: N = 100,000, n = 1,000
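One way to read this example, assuming the sample mean has mean squared error $\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$ under simple random sampling without replacement: the factor $\frac{N-n}{N-1}$ is close to 1, and the standard deviation of the sample mean is a small fraction of the population standard deviation. The variable names are illustrative.

```python
import math

N, n = 100_000, 1_000  # values from the example

# Finite population correction factor (N - n) / (N - 1).
fpc = (N - n) / (N - 1)

# Standard deviation of the sample mean as a multiple of the
# population standard deviation: sqrt(fpc / n).
sd_factor = math.sqrt(fpc / n)

print(fpc, sd_factor)
```

With these values the correction factor is about 0.99 and the sample mean's standard deviation is roughly 3.1% of the population standard deviation, which is why the sample mean can be treated as close to the population mean when n is large.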
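MSE of the Ratio Estimator
Theorem 1. Assuming simple random sampling without replacement, if n is large, the MSE and variance of $\hat{Q} = \bar{Y}/\bar{X}$ satisfy
$\mathrm{MSE}(\hat{Q}) \approx \mathrm{Var}(\hat{Q}) \approx \frac{1}{n\mu_X^2} \cdot \frac{N-n}{N-1} \left( \sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y \right)$,
where $q = \mu_Y/\mu_X$, $\sigma_X^2$ and $\sigma_Y^2$ are the population variances of X and Y, and $\rho$ is the population correlation between X and Y.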
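MSE of the Ratio Estimator
Proof. If n is large enough, $\bar{X} \approx \mu_X$, so
$\hat{Q} - q = \frac{\bar{Y} - q\bar{X}}{\bar{X}} \approx \frac{\bar{Y} - q\bar{X}}{\mu_X}$.
Therefore,
$\mathrm{MSE}(\hat{Q}) = E[(\hat{Q} - q)^2] \approx \frac{1}{\mu_X^2} E[(\bar{Y} - q\bar{X})^2]$.
Now $\bar{Y} - q\bar{X}$ is the sample mean of $D = Y - qX$, whose population mean is $\mu_Y - q\mu_X = 0$ and whose population variance is $\sigma_D^2 = \sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y$. Lemma 1 applied to D gives $E[(\bar{Y} - q\bar{X})^2] = \frac{\sigma_D^2}{n} \cdot \frac{N-n}{N-1}$, so
$\mathrm{MSE}(\hat{Q}) \approx \frac{1}{n\mu_X^2} \cdot \frac{N-n}{N-1} \left( \sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y \right)$.
To this order of approximation $\hat{Q}$ is unbiased, so the variance has the same approximation. This completes the proof.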
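A rough numerical check of the large-sample approximation, assuming it takes the form $\frac{1}{n\mu_X^2} \cdot \frac{N-n}{N-1}(\sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y)$ with population variances computed with divisor N. The simulated population, the seed, and the number of replications are all hypothetical choices; this is a Monte Carlo sketch, not a proof.

```python
import math
import random

rng = random.Random(42)
N, n = 2000, 200

# Hypothetical population: y is about 2x plus Gaussian noise.
xs = [rng.uniform(5.0, 15.0) for _ in range(N)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

mu_x, mu_y = sum(xs) / N, sum(ys) / N
q = mu_y / mu_x
var_x = sum((x - mu_x) ** 2 for x in xs) / N
var_y = sum((y - mu_y) ** 2 for y in ys) / N
cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / N
rho = cov / math.sqrt(var_x * var_y)

# Large-sample approximation to the MSE of the ratio estimator.
fpc = (N - n) / (N - 1)
approx_mse = fpc / (n * mu_x ** 2) * (
    var_y + q ** 2 * var_x - 2.0 * q * rho * math.sqrt(var_x * var_y)
)

# Monte Carlo estimate: repeated simple random samples without replacement.
reps = 2000
total_sq_err = 0.0
for _ in range(reps):
    idx = rng.sample(range(N), n)
    q_hat = sum(ys[i] for i in idx) / sum(xs[i] for i in idx)
    total_sq_err += (q_hat - q) ** 2
mc_mse = total_sq_err / reps

print(approx_mse, mc_mse)
```

The two printed values should be of the same size; they will not match exactly because one is an approximation and the other carries simulation noise.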
Problem 2
Use simple random sampling without replacement to estimate the population total of Y, $\tau_Y = \sum_{j=1}^{N} y_j = N\mu_Y$.
Simple Estimator of Y Total
$N\bar{Y}$, where $\bar{Y}$ is the sample mean of Y and N is the population size.
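MSE for Simple Estimator of Y Total
By Lemma 1, $\mathrm{MSE}(N\bar{Y}) = N^2 E[(\bar{Y} - \mu_Y)^2] = \frac{N^2 \sigma_Y^2}{n} \cdot \frac{N-n}{N-1}$, where $\mu_Y$ and $\sigma_Y^2$ are the population mean and variance of Y.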
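Ratio Estimator of Y Total
$\hat{Q} \cdot N\mu_X = \frac{\bar{Y}}{\bar{X}} \cdot N\mu_X$, where $\bar{X}$ and $\bar{Y}$ are the sample means and $N\mu_X$ is the known population total of X.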
Example Application
Estimate the number of registered Democrats (X) who are going to vote for Bernie Sanders (Y). (Code “1” for “Yes,” “0” for “No.”)
The number of Democrats voting for Bernie is unknown.
The number of registered Democrats IS known.
The population mean of X, $\mu_X$, is also known.
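A sketch of how the ratio estimate of a total would be computed for 0/1 coded data like this. Every number below is invented for illustration (the sample counts and the known total of X are hypothetical), and `ratio_total_estimate` is an illustrative helper name.

```python
def ratio_total_estimate(sample_x, sample_y, x_total):
    """Ratio estimate of the Y total: (y_bar / x_bar) * known X total."""
    x_bar = sum(sample_x) / len(sample_x)
    y_bar = sum(sample_y) / len(sample_y)
    return (y_bar / x_bar) * x_total

# Hypothetical sample of 1,000 voters: 400 are registered Democrats (X = 1),
# and 180 of those plan to vote for the candidate (Y = 1).
sample_x = [1] * 400 + [0] * 600
sample_y = [1] * 180 + [0] * 820

# Hypothetical known number of registered Democrats in the population.
x_total = 50_000

estimate = ratio_total_estimate(sample_x, sample_y, x_total)
print(estimate)
```

The estimate scales the sample ratio 180/400 = 0.45 up by the known X total, giving about 22,500 with these made-up numbers.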
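MSE for Ratio Estimate of Y Total
For large n, $\mathrm{MSE}(\hat{Q} \cdot N\mu_X) = N^2\mu_X^2 \, \mathrm{MSE}(\hat{Q}) \approx \frac{N^2}{n} \cdot \frac{N-n}{N-1} \left( \sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y \right)$, where $q = \mu_Y/\mu_X$ and $\rho$ is the population correlation between X and Y.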
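Comparing Estimators of Y Total
In large samples, with simple random sampling without replacement, the ratio estimator $\hat{Q} \cdot N\mu_X$ has a smaller variance than the simple estimator $N\bar{Y}$ if
$\rho > \frac{q\sigma_X}{2\sigma_Y} = \frac{1}{2} \cdot \frac{\sigma_X/\mu_X}{\sigma_Y/\mu_Y}$.
Comparing Estimators of Y Total (Proof)
The large-sample variances are $\mathrm{Var}(N\bar{Y}) = \frac{N^2}{n} \cdot \frac{N-n}{N-1} \sigma_Y^2$ and $\mathrm{Var}(\hat{Q} \cdot N\mu_X) \approx \frac{N^2}{n} \cdot \frac{N-n}{N-1} \left( \sigma_Y^2 + q^2\sigma_X^2 - 2q\rho\sigma_X\sigma_Y \right)$. Their difference is $\frac{N^2}{n} \cdot \frac{N-n}{N-1} \left( 2q\rho\sigma_X\sigma_Y - q^2\sigma_X^2 \right)$.
Hence the ratio estimator has the smaller variance if $2q\rho\sigma_X\sigma_Y > q^2\sigma_X^2$, that is (for $q > 0$), if $\rho > \frac{q\sigma_X}{2\sigma_Y}$.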
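The comparison can be illustrated exactly on a tiny population by enumerating every sample. The data below are hypothetical, with y close to 2x so that the correlation is high; the threshold is assumed to be $\rho > q\sigma_X/(2\sigma_Y)$ with divisor-N variances. With n this small the large-sample condition is only a guide, so the sketch reports the exact MSEs alongside it.

```python
import math
from fractions import Fraction
from itertools import combinations

# Hypothetical population: y is roughly 2x, so X and Y are highly correlated.
xs = [Fraction(v) for v in (1, 2, 3, 4, 5, 6, 7, 8)]
ys = [Fraction(v) for v in (2, 4, 7, 8, 10, 13, 14, 16)]
N, n = len(xs), 4
tau_x, tau_y = sum(xs), sum(ys)

simple_sq_err, ratio_sq_err = [], []
for idx in combinations(range(N), n):
    x_sum = sum(xs[i] for i in idx)
    y_sum = sum(ys[i] for i in idx)
    simple = N * y_sum / n           # N * y_bar
    ratio = (y_sum / x_sum) * tau_x  # (y_bar / x_bar) * known X total
    simple_sq_err.append((simple - tau_y) ** 2)
    ratio_sq_err.append((ratio - tau_y) ** 2)

# Exact MSEs: averages over all equally likely samples.
simple_mse = sum(simple_sq_err) / len(simple_sq_err)
ratio_mse = sum(ratio_sq_err) / len(ratio_sq_err)

# Large-sample condition: rho > q * sigma_x / (2 * sigma_y).
mu_x, mu_y = tau_x / N, tau_y / N
var_x = sum((x - mu_x) ** 2 for x in xs) / N
var_y = sum((y - mu_y) ** 2 for y in ys) / N
cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / N
rho = float(cov) / math.sqrt(float(var_x * var_y))
threshold = float(mu_y / mu_x) * math.sqrt(float(var_x / var_y)) / 2.0
condition_holds = rho > threshold

print(float(simple_mse), float(ratio_mse), condition_holds)
```

For this population the condition holds and the ratio estimator's exact MSE is far smaller than the simple estimator's.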
Problem 3
Use stratified random sampling without replacement to estimate the population total of Y.
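Stratified Sampling Ratio Estimator for Y Total
Divide the population into L strata of sizes N_h, h = 1, 2, …, L.
Use SRS w/o replacement in each stratum.
Use sample size n_h in stratum h, h = 1, 2, …, L.
Suppose the population mean of X, $\mu_{X,h}$, is known for each stratum.
Estimator: $\sum_{h=1}^{L} \frac{\bar{Y}_h}{\bar{X}_h} \cdot N_h \mu_{X,h}$, where $\bar{X}_h$ and $\bar{Y}_h$ are the sample means in stratum h.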
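MSE of Stratified Sampling Ratio Estimator of Y Total
Assume large n_h in each stratum. Because the strata are sampled independently, the stratum variances add:
$\mathrm{MSE} \approx \sum_{h=1}^{L} \frac{N_h^2}{n_h} \cdot \frac{N_h - n_h}{N_h - 1} \left( \sigma_{Y,h}^2 + q_h^2\sigma_{X,h}^2 - 2q_h\rho_h\sigma_{X,h}\sigma_{Y,h} \right)$,
where $q_h$, $\sigma_{X,h}^2$, $\sigma_{Y,h}^2$, and $\rho_h$ are the ratio of means, the variances, and the correlation within stratum h.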
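A minimal sketch of a stratified ratio estimate of a total, assuming the estimator is the sum over strata of (stratum sample ratio) times (known stratum X total). The two strata, their slopes, the sample sizes, and the helper name `stratified_ratio_total` are hypothetical.

```python
import random

def stratified_ratio_total(strata):
    """Sum over strata of (y_bar_h / x_bar_h) * known X total of stratum h.

    `strata` is a list of (sample, x_total) pairs, where `sample` is a
    list of (x, y) measurements drawn from that stratum.
    """
    total = 0.0
    for sample, x_total in strata:
        x_sum = sum(x for x, _ in sample)
        y_sum = sum(y for _, y in sample)
        total += (y_sum / x_sum) * x_total  # y_bar/x_bar equals y_sum/x_sum
    return total

rng = random.Random(1)

def make_stratum(slope):
    # Hypothetical stratum: x = 1..100, y about slope * x plus bounded noise.
    return [(float(j), slope * j + rng.uniform(-0.5, 0.5)) for j in range(1, 101)]

populations = [make_stratum(1.0), make_stratum(3.0)]
true_total = sum(y for pop in populations for _, y in pop)

# SRS without replacement of n_h = 20 units within each stratum.
strata = [
    (rng.sample(pop, 20), sum(x for x, _ in pop))
    for pop in populations
]
estimate = stratified_ratio_total(strata)
print(estimate, true_total)
```

Because the ratio y/x differs sharply between the two strata but is nearly constant within each, estimating a separate ratio per stratum tracks the true total closely in this simulated example.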
Stratification and Correlation
              Var(X)    Var(Y)    ρ(X, Y)
Population
Stratum
Stratum
In summary
Ratio estimators can be used to estimate a ratio characteristic of a population.
Ratio estimators can also be used to estimate totals of a variable in a population.
Of the estimators we discussed, when X and Y are strongly enough correlated:
Simple estimators are the least reliable.
Ratio estimators with simple random sampling come next.
Ratio estimators with stratified random sampling are the most reliable.
Acknowledgments
Friends
Family (my Mom)
Math Department
Professor Buckmire
Professor Knoerr