STAT262: Lecture 5 (Ratio estimation)

Presentation on theme: "STAT262: Lecture 5 (Ratio estimation)"— Presentation transcript:

1 STAT262: Lecture 5 (Ratio estimation)

2 Introduction Today we introduce new estimation methods
Ratio estimators. Ratio estimators involve two characteristics. Y: a characteristic we are interested in. X: a characteristic that is related to Y. The sampling method is the same: Simple Random Sampling (SRS).

3 Why consider ratio estimation?
In many situations we want to estimate the ratio of two population characteristics, e.g., the average yield of corn per acre. We can also use a ratio to assist the estimation of a correlated quantity. When N is unknown, we can estimate it by $\hat{N} = t_x / \bar{x}$. Finally, ratio estimators can increase the precision of estimated means or totals.

4 Examples E.g. 1: The average yield of corn per acre
E.g. 2: The number of hummingbirds in a national forest. Sample a few regions and record the number of hummingbirds ($y_i$) and the area ($x_i$) for each region. Calculate the sample ratio $\hat{B} = \bar{y} / \bar{x}$. The total area of the national forest is $t_x$. The estimate of $t_y$ is $\hat{t}_{yr} = \hat{B} t_x$.
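The hummingbird calculation can be sketched in a few lines of Python. The function name, the data values, and the total area below are all made up for illustration; only the structure (sample ratio times known total) comes from the slide.

```python
def ratio_estimate_total(x, y, t_x):
    """Ratio estimate of the total t_y from sampled pairs (x_i, y_i).

    x   : auxiliary values for the sampled units (e.g., area of each region)
    y   : values of interest for the same units (e.g., bird counts)
    t_x : known population total of the auxiliary variable
    """
    b_hat = sum(y) / sum(x)   # sample ratio, equal to y_bar / x_bar
    return b_hat, b_hat * t_x


# Hypothetical sample of four regions (not data from the lecture).
areas = [10.0, 20.0, 30.0, 40.0]    # x_i
counts = [12, 18, 33, 37]           # y_i
b_hat, t_y_hat = ratio_estimate_total(areas, counts, t_x=1000.0)
print(b_hat, t_y_hat)
```

Because the same sampled regions supply both the numerator and the denominator, the estimate adapts automatically when the sample happens to contain unusually large or small regions.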

5 Examples E.g. 3: Laplace wanted to know the number of persons living in France in a year for which there was no census. Two candidate estimators: the SRS estimator $N \bar{y}$ and the ratio estimator $(\bar{y} / \bar{x}) t_x$. Which was Laplace's choice?
                        # persons     # registered births
Sample: 30 counties     2,037,615     71,866
France (N known)        t_y (???)     t_x (known)

6 Examples Laplace reasoned that using the ratio estimator is more accurate.
Large counties have more registered births. The number of registered births and the number of persons are positively correlated. Thus, using the information in x is likely to improve our estimate of y.
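As a small arithmetic sketch of Laplace's ratio estimator, the sample figures quoted in the example give the estimated number of persons per registered birth. The national number of births is not given in the transcript, so the value passed in at the end is a placeholder chosen only to show the calculation.

```python
# Sample totals quoted in the example (30 counties).
persons_in_sample = 2_037_615
births_in_sample = 71_866

# Sample ratio: estimated persons per registered birth.
b_hat = persons_in_sample / births_in_sample


def estimate_population(total_births):
    """Ratio estimate of the population: b_hat times the known birth total t_x."""
    return b_hat * total_births


# Placeholder for t_x (NOT a historical figure), used only for illustration.
hypothetical_total_births = 1_000_000
print(round(b_hat, 3), round(estimate_population(hypothetical_total_births)))
```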

7 Examples E.g. 4: McDonald's Corp. The average annual sales for this year
One can use information from last year. Details will be discussed later.

8 Ratio estimators in SRS
Sampling method: SRS. Two quantities (xi, yi) are measured in each sampled unit, where xi is an auxiliary variable.

9 Population quantities
Size: N Totals: Means: Ratio: Variances and covariance: Correlation coefficient:
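For reference, the usual definitions of these population quantities can be written as follows. Only $N$, $t_x$, $t_y$ and $R$ appear elsewhere in these slides; the symbols $\bar{y}_U$, $\bar{x}_U$, $B$, $S_x$, $S_y$ and $S_{xy}$ are assumptions that follow common survey-sampling notation.

```latex
\begin{align*}
t_y &= \sum_{i=1}^{N} y_i, &
t_x &= \sum_{i=1}^{N} x_i, \\
\bar{y}_U &= \frac{t_y}{N}, &
\bar{x}_U &= \frac{t_x}{N}, \\
B &= \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U}, \\
S_y^2 &= \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y}_U)^2, &
S_x^2 &= \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x}_U)^2, \\
S_{xy} &= \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x}_U)(y_i - \bar{y}_U), &
R &= \frac{S_{xy}}{S_x S_y}.
\end{align*}
```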

10 Example of population quantities

11 Contents
Ratio estimator in SRS; Bias – the exact expression; Bias – an approximate formula; MSE and variance; Examples; Efficiency

12 Bias – the exact expression
Ratio estimators are usually biased

13 Bias – the exact expression

14 Bias – the exact expression
Exact, but not easy to use with data.
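One standard way to obtain an exact expression for the bias of $\hat{B} = \bar{y} / \bar{x}$ under SRS is sketched below. It may differ in form from the derivation shown on the slides, and the symbols $\bar{x}_U$, $\bar{y}_U$ and $B = \bar{y}_U / \bar{x}_U$ are assumed notation.

```latex
\begin{align*}
\operatorname{Cov}(\hat{B}, \bar{x})
  &= E[\hat{B}\,\bar{x}] - E[\hat{B}]\,E[\bar{x}]
   = E[\bar{y}] - E[\hat{B}]\,\bar{x}_U
   = \bar{y}_U - E[\hat{B}]\,\bar{x}_U, \\
\text{so}\quad
\operatorname{Bias}(\hat{B})
  &= E[\hat{B}] - B
   = -\frac{\operatorname{Cov}(\hat{B}, \bar{x})}{\bar{x}_U}.
\end{align*}
```

The result is exact, but the covariance on the right depends on the full sampling distribution of $\hat{B}$, which is why it is hard to evaluate from a single sample.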

15 Bias – an approximation

16 Bias – an approximation
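A commonly used first-order approximation to the bias is given below as a sketch; the slides may present it in a different but equivalent form. Here $S_x^2$ is the population variance of x, $R$ is the population correlation, and $B = \bar{y}_U / \bar{x}_U$ (assumed notation).

```latex
\begin{align*}
\operatorname{Bias}(\hat{B})
  &\approx \left(1 - \frac{n}{N}\right) \frac{1}{n\,\bar{x}_U^{2}}
     \left(B S_x^{2} - R S_x S_y\right), \\
\operatorname{Bias}(\hat{\bar{y}}_r)
  &= \bar{x}_U \operatorname{Bias}(\hat{B})
   \approx \left(1 - \frac{n}{N}\right) \frac{1}{n\,\bar{x}_U}
     \left(B S_x^{2} - R S_x S_y\right).
\end{align*}
```

Read term by term, the approximation is small when n is large, when the sampling fraction n/N is large, when $\bar{x}_U$ is large, when $S_x$ is small, or when $B S_x$ is close to $R S_y$.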

17 The bias is usually small if

18 Variance and Mean Squared Error
The bias is usually small and can thus be ignored.

19 A hypothetical example
Population. N=8

20 A hypothetical example

21 A hypothetical example
Mean estimate = Bias = Bias approx: Mean estimate = 40
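For a population this small, the bias can be checked by brute force: enumerate every possible SRS, compute the ratio estimate for each, and average. The sketch below does this for a made-up population of N = 8 and samples of size n = 4. These are not the values from the slide, and the variable names are illustrative. It also evaluates the covariance form of the exact bias and the first-order approximation $(1 - n/N)(B S_x^2 - S_{xy}) / (n \bar{x}_U^2)$.

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical population (N = 8); NOT the data used in the lecture.
x = [3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 3, 5, 5, 8, 8, 10, 11]
N, n = len(x), 4

x_bar_U, y_bar_U = mean(x), mean(y)
B = sum(y) / sum(x)                      # population ratio

# Every possible simple random sample of size n (each equally likely).
samples = list(combinations(range(N), n))
x_bars = [mean(x[i] for i in s) for s in samples]
B_hats = [sum(y[i] for i in s) / sum(x[i] for i in s) for s in samples]

# Exact bias by direct averaging over all samples.
exact_bias = mean(B_hats) - B

# Exact bias via the identity Bias = -Cov(B_hat, x_bar) / x_bar_U.
cov_B_xbar = (mean(b * xb for b, xb in zip(B_hats, x_bars))
              - mean(B_hats) * mean(x_bars))
bias_from_cov = -cov_B_xbar / x_bar_U

# First-order approximation using population variances and covariance.
S2_x = variance(x)                       # divisor N - 1
S_xy = sum((xi - x_bar_U) * (yi - y_bar_U) for xi, yi in zip(x, y)) / (N - 1)
approx_bias = (1 - n / N) / (n * x_bar_U ** 2) * (B * S2_x - S_xy)

print(len(samples), exact_bias, bias_from_cov, approx_bias)
```

The first two numbers agree up to floating-point error because they are the same quantity computed two ways; the third is only an approximation.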

22 Estimate variance

23 Estimate variance
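A minimal sketch of one common variance estimator for the ratio follows: $\hat{V}(\hat{B}) = (1 - n/N)\, s_e^2 / (n \bar{x}_U^2)$ with residuals $e_i = y_i - \hat{B} x_i$ and $s_e^2 = \sum e_i^2 / (n - 1)$. The function name and the fallback to the sample mean when $\bar{x}_U$ is unknown are assumptions of this sketch, and the slides may use a slightly different form.

```python
import math


def ratio_with_se(x, y, N, x_bar_U=None):
    """Return (B_hat, SE(B_hat)) for an SRS of pairs (x_i, y_i) from N units.

    If the population mean x_bar_U is not supplied, the sample mean of x
    is used in its place (an assumption of this sketch).
    """
    n = len(x)
    x_bar = sum(x) / n
    b_hat = sum(y) / sum(x)
    if x_bar_U is None:
        x_bar_U = x_bar
    # Residuals around the line through the origin with slope B_hat.
    s2_e = sum((yi - b_hat * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    var_b = (1 - n / N) * s2_e / (n * x_bar_U ** 2)
    return b_hat, math.sqrt(var_b)


# Hypothetical data, for illustration only.
b_hat, se_b = ratio_with_se([1, 2, 3, 4], [1, 3, 2, 4], N=8)
print(b_hat, se_b)
```

Standard errors for the estimated mean or total follow by scaling: multiply `se_b` by $\bar{x}_U$ for $\hat{\bar{y}}_r$ or by $t_x$ for $\hat{t}_{yr}$.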

24 Example 1

25 Example 1

26 Example 2

27 Example 2

28 Example 2

29 Efficiency of ratio estimation
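A sketch of the usual efficiency comparison, using the large-sample approximation to the MSE of the ratio estimator of the mean; the notation ($B$, $R$, $S_x$, $S_y$, and CV for the coefficient of variation) is assumed, and $B > 0$ is taken for granted.

```latex
\begin{align*}
\operatorname{MSE}(\hat{\bar{y}}_r)
  &\approx \left(1 - \frac{n}{N}\right) \frac{1}{n}
     \left(S_y^{2} - 2 B R S_x S_y + B^{2} S_x^{2}\right), \qquad
V(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{S_y^{2}}{n}, \\
\operatorname{MSE}(\hat{\bar{y}}_r) \le V(\bar{y})
  &\iff B^{2} S_x^{2} \le 2 B R S_x S_y
   \iff R \ge \frac{B S_x}{2 S_y}
      = \frac{\operatorname{CV}(x)}{2\,\operatorname{CV}(y)}.
\end{align*}
```

In words, ratio estimation beats the plain sample mean when the correlation between x and y is large relative to half the ratio of their coefficients of variation.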

30 STAT262: Lecture 6 (Regression estimation)

31 Regression estimation
Ratio estimation works well if the data are well fit by a straight line through the origin. Often, data are scattered around a straight line that does not go through the origin.

32 Regression estimation
The regression estimator of the population mean is
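In its usual form the regression estimator is $\hat{\bar{y}}_{reg} = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x})$, where $\hat{B}_1$ is the ordinary least-squares slope. The sketch below assumes that form; the function name and the data are illustrative.

```python
def regression_estimate_mean(x, y, x_bar_U):
    """Regression estimate of the population mean of y.

    x, y    : sampled values of the auxiliary variable and the variable of interest
    x_bar_U : known population mean of the auxiliary variable
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Ordinary least-squares slope of y on x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1_hat = s_xy / s_xx
    # Shift the sample mean along the fitted line to the known x_bar_U.
    return y_bar + b1_hat * (x_bar_U - x_bar)


# Hypothetical data lying exactly on y = 2x + 1, for illustration only.
y_reg = regression_estimate_mean([1, 2, 3], [3, 5, 7], x_bar_U=4.0)
print(y_reg)
```

Unlike the ratio estimator, this adjustment does not force the fitted line through the origin, which is the situation described on the previous slide.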

33 Bias For large SRS, the bias is usually small

34 Variance and MSE Bias is small

35 Variance

36 Standard error

37 The McDonald Example

38 The McDonald Example

39 Relative Efficiencies

40 Relative Efficiencies

41 Relative Efficiencies

42 Relative Efficiencies

43 Summary We introduced two new estimators:
Ratio estimator: $\hat{\bar{y}}_r = \hat{B} \bar{x}_U$. Regression estimator: $\hat{\bar{y}}_{reg} = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x})$. Both exploit the association between x and y. The regression estimator is the most efficient (asymptotically). The ratio estimator is more efficient than the SRS estimator if R is large.

44 Estimation in Domains: A motivating example
We are often interested in separate estimates for subpopulations (also called domains). E.g., after taking an SRS of 1000 persons, we want to estimate the average salary for men and the average salary for women.

45 Estimation in Domains: A motivating example

46 Estimation in Domains: A motivating example
The calculation in the previous slide treats the number of sampled units that fall in the domain, $n_d$, as a constant. But it is not: it varies from sample to sample. We should take this randomness into consideration. The formulas we derived for ratio estimators can be used.
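One way to see the connection is to write the domain mean as a ratio. Let $x_i = 1$ if unit i is in the domain and 0 otherwise, and let $u_i = y_i x_i$; then $\bar{y}_d = \bar{u} / \bar{x}$. The sketch below applies the ratio-estimator variance formula $(1 - n/N)\, s_e^2 / (n \bar{x}^2)$ to that ratio. The function name and the data are hypothetical.

```python
import math


def domain_mean_with_se(y, in_domain, N):
    """Estimate a domain mean and its SE from an SRS of size len(y) out of N.

    y         : values for ALL sampled units
    in_domain : booleans, True when the unit belongs to the domain
    """
    n = len(y)
    x = [1 if d else 0 for d in in_domain]     # domain indicator
    u = [yi * xi for yi, xi in zip(y, x)]      # y inside the domain, 0 outside
    n_d = sum(x)                               # random domain sample size
    y_bar_d = sum(u) / n_d                     # ratio u_bar / x_bar
    # Residuals of the ratio fit; they are zero for units outside the domain.
    s2_e = sum((ui - y_bar_d * xi) ** 2 for ui, xi in zip(u, x)) / (n - 1)
    x_bar = n_d / n
    var = (1 - n / N) * s2_e / (n * x_bar ** 2)
    return y_bar_d, math.sqrt(var)


# Hypothetical salaries; the first three sampled units are in the domain.
est, se = domain_mean_with_se(
    [10, 20, 30, 40, 50, 60],
    [True, True, True, False, False, False],
    N=60,
)
print(est, se)
```

For large samples this variance is close to $(1 - n/N)\, s_{yd}^2 / n_d$, where $s_{yd}^2$ is the sample variance within the domain.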

47 Estimations in Domains

48 Estimations in Domains

49 Estimations in Domains
If the sample is large

50 A new sampling method: stratified sampling
In stratified sampling, we conduct SRS in each stratum. Outline: definition and motivation; statistical inference (theory of stratified sampling); advantages of stratified sampling; sample size calculation.

51 Stratified sampling: definition and motivation
A motivating example: the average number of words in saved messages of people in this room. What is stratified sampling? Stratify: make layers. Strata: subpopulations. Strata do not overlap. Each sampling unit belongs to exactly one stratum. Strata constitute the whole population.

52 Why do we use stratified sampling?
Be protected from obtaining a really bad sample. Example: the population size is N=500 (250 women and 250 men), and we take an SRS of size n=50. It is possible to obtain a sample with no men or only a few men: Pr(15 or fewer men in an SRS)=0.003 and Pr(20 or fewer men in an SRS)=0.10. In stratified sampling, we can sample 25 men and 25 women.
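The probabilities in this example are hypergeometric tail probabilities, since an SRS draws without replacement. A short sketch of the calculation using only the standard library follows; the function name and its default arguments are illustrative.

```python
from math import comb


def prob_at_most_men(k_max, N=500, men=250, n=50):
    """P(number of men in an SRS of size n is <= k_max), hypergeometric."""
    total = comb(N, n)                       # number of possible samples
    favourable = sum(
        comb(men, k) * comb(N - men, n - k)  # samples with exactly k men
        for k in range(k_max + 1)
    )
    return favourable / total


p15 = prob_at_most_men(15)
p20 = prob_at_most_men(20)
print(round(p15, 3), round(p20, 2))
```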

53 Why do we use stratified sampling?
Stratified sampling allows us to compare subgroups. Convenient, reduce cost, easy to sample. More precise; see the following example.

54 Total number of farm acres (3078 counties)
SRS of 300 counties from the Census of Agriculture. Estimate: , standard error: Stratified sampling: sample about 10% of the counties in each stratum (region).

55 Total number of farm acres (3078 counties)
Estimate: Standard error:

56 Theory of stratified sampling

57 Notation for Stratification: Population

58 Notation for Stratification: Sample

59 Stratified sampling: estimation

60 Statistical Properties: Bias and Variance

61 Variance Estimates for stratified samples

62 Confidence intervals for stratified samples
Some books use a t distribution with n-H degrees of freedom.
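A minimal sketch of stratified estimation, assuming the usual formulas $\bar{y}_{str} = \sum_h (N_h / N)\, \bar{y}_h$ and $\hat{V}(\bar{y}_{str}) = \sum_h (N_h / N)^2 (1 - n_h / N_h)\, s_h^2 / n_h$, with a normal-approximation 95% interval. The function name, stratum labels, and data are hypothetical.

```python
import math
from statistics import mean, variance


def stratified_mean_with_se(samples, stratum_sizes):
    """Stratified estimate of the population mean and its standard error.

    samples       : dict mapping stratum label -> list of sampled y values
    stratum_sizes : dict mapping stratum label -> population size N_h
    """
    N = sum(stratum_sizes.values())
    y_bar_str = 0.0
    var = 0.0
    for h, y_h in samples.items():
        N_h, n_h = stratum_sizes[h], len(y_h)
        W_h = N_h / N                          # stratum weight N_h / N
        y_bar_str += W_h * mean(y_h)
        # Within-stratum SRS variance with finite population correction.
        var += W_h ** 2 * (1 - n_h / N_h) * variance(y_h) / n_h
    return y_bar_str, math.sqrt(var)


# Hypothetical two-stratum example.
est, se = stratified_mean_with_se(
    {"men": [10, 12, 14], "women": [20, 22, 24]},
    {"men": 30, "women": 70},
)
ci = (est - 1.96 * se, est + 1.96 * se)      # approximate 95% interval
print(est, se, ci)
```

With only a few observations per stratum, as in this toy example, replacing 1.96 by a t quantile with n-H degrees of freedom would give a wider interval.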

63 Sampling probabilities and weights
In a population with 1600 men and 400 women, suppose the stratified sample design specifies sampling 200 men and 200 women. Each man in the sample has weight 8 and each woman has weight 2. Each woman in the sample represents herself and 1 other woman not selected. Each man represents himself and 7 other men not in the sample.

64 Sampling probabilities and weights
The sampling probability for the jth unit in the hth stratum is $\pi_{hj} = n_h / N_h$. Sampling weight: $w_{hj} = 1 / \pi_{hj} = N_h / n_h$. The sum of the sampling weights is N.
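The men and women example can be checked numerically. The sketch below computes each stratum's weight as $N_h / n_h$ and confirms that the weights, summed over all sampled units, add up to the population size; the function and dictionary names are illustrative.

```python
def sampling_weights(stratum_sizes, sample_sizes):
    """Weight of each sampled unit in stratum h: N_h / n_h."""
    return {h: stratum_sizes[h] / sample_sizes[h] for h in stratum_sizes}


stratum_sizes = {"men": 1600, "women": 400}   # N_h
sample_sizes = {"men": 200, "women": 200}     # n_h

weights = sampling_weights(stratum_sizes, sample_sizes)

# Sum of the weights over every sampled unit; this should equal N = 2000.
total_weight = sum(weights[h] * sample_sizes[h] for h in weights)
print(weights, total_weight)
```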

65 Sampling probabilities and weights

66 Sampling probabilities and weights
example

67 Sampling probabilities and weights in proportional allocation
In proportional allocation, the number of sampled units in each stratum is proportional to the size of the stratum, i.e., $n_h / N_h = n / N$ for every stratum h. Every unit in the sample has the same weight and represents the same number of units in the population. The sample is called self-weighting.

68 Sampling probabilities and weights in proportional allocation
Sampling probability for all units is about 10% All the weights are the same: 10
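Proportional allocation can be sketched as follows. The stratum sizes are made up and chosen so that the allocation comes out in whole numbers; with real data the rounded sizes may not add up exactly to n and would need a small adjustment.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of size n so that n_h is proportional to N_h."""
    N = sum(stratum_sizes.values())
    return {h: round(n * N_h / N) for h, N_h in stratum_sizes.items()}


# Hypothetical strata, sampled at an overall rate of 10%.
stratum_sizes = {"A": 1000, "B": 3000, "C": 6000}
allocation = proportional_allocation(stratum_sizes, n=1000)

# Every sampled unit carries the same weight N_h / n_h (self-weighting).
weights = {h: stratum_sizes[h] / allocation[h] for h in stratum_sizes}
print(allocation, weights)
```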

