Sampling Design and Analysis MTH 494 Lecture-22 Ossam Chohan Assistant Professor CIIT Abbottabad.

1 Sampling Design and Analysis MTH 494 Lecture-22 Ossam Chohan Assistant Professor CIIT Abbottabad

2 Review

3 Regression Estimation We observed that the ratio estimator is most appropriate when the relationship between y and x is linear through the origin. If there is evidence of a linear relationship between the observed y’s and x’s, but not necessarily one that passes through the origin, then the extra information provided by the auxiliary variable x may be taken into account through a regression estimator of the mean µ_y.

4 As in the case of ratio estimation of µ_y, one must still know µ_x before the estimator can be employed. The underlying line that shows the basic relationship between the y’s and x’s is sometimes referred to as the regression line of y upon x. Thus the subscript L in the ensuing formulas denotes linear regression.

5 The estimator given in the next section assumes the x’s to be fixed in advance and the y’s to be random variables. We can think of the x values as something that has already been observed, like last year’s first-quarter earnings, and the y response as a random variable yet to be observed, such as the current quarterly earnings of a company for which x is already known. The probabilistic properties of the estimator then depend only on y for a given set of x’s.

6 If stratum sample sizes are very small, or if the within-stratum ratios are all approximately equal, then the combined ratio estimator may perform better. Of course, an estimator of the population total can be found by multiplying either of the estimators above by the population size N, and the variances can be adjusted accordingly. Thus we might use the notation

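7 Estimators Regression estimator of the population mean µ_y:
$$\hat{\mu}_{yL} = \bar{y} + b(\mu_x - \bar{x}) \tag{3.28}$$
where
$$b = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
Estimated variance of $\hat{\mu}_{yL}$:
$$\hat{V}(\hat{\mu}_{yL}) = \left(\frac{N-n}{Nn}\right)\frac{1}{n-2}\left[\sum_{i=1}^{n}(y_i - \bar{y})^2 - b^2\sum_{i=1}^{n}(x_i - \bar{x})^2\right] \tag{3.29}$$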

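8 Estimator Bound on the error of estimation:
$$2\sqrt{\hat{V}(\hat{\mu}_{yL})} \tag{3.30}$$
When calculating b from observed pairs (y_1, x_1), …, (y_n, x_n), we may use the fact that
$$\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}, \qquad \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$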

9 Example 3.9 A mathematical achievement test was given to 486 students prior to their entering a certain college. From these students a simple random sample of n = 10 students was selected and their progress in calculus observed. Final calculus grades were then reported, as given in the accompanying table. It is known that µ_x = 52 for all 486 students taking the achievement test. Estimate µ_y for this population, and place a bound on the error of estimation.

10 Data for problem

Student   Achievement test score, x   Final calculus grade, y
1         39                          65
2         43                          78
3         21                          52
4         64                          82
5         57                          92
6         47                          89
7         28                          73
8         75                          98
9         34                          56
10        52                          75

11 Solution

12 Solution
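
The computation for Example 3.9 can be sketched in Python. This is a minimal sketch that assumes the standard simple-random-sampling forms of the regression estimator: slope b from least squares, estimate ȳ + b(µ_x − x̄), estimated variance with divisor n − 2 and finite population correction (N − n)/(Nn), and a bound of two standard errors. The function name `regression_estimate` is illustrative and not from the lecture.

```python
import math

# Data from Example 3.9: achievement test score x, final calculus grade y.
x = [39, 43, 21, 64, 57, 47, 28, 75, 34, 52]
y = [65, 78, 52, 82, 92, 89, 73, 98, 56, 75]
N = 486       # population size
mu_x = 52.0   # known population mean of x


def regression_estimate(x, y, N, mu_x):
    """Return (b, estimate of mu_y, estimated variance, bound on error)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross products.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    b = s_xy / s_xx                       # least-squares slope
    mu_hat = y_bar + b * (mu_x - x_bar)   # regression estimator of mu_y
    # Assumed variance form: residual sum of squares over (n - 2),
    # scaled by the finite population correction (N - n) / (N n).
    var_hat = (N - n) / (N * n) * (s_yy - b ** 2 * s_xx) / (n - 2)
    bound = 2 * math.sqrt(var_hat)
    return b, mu_hat, var_hat, bound


b, mu_hat, var_hat, bound = regression_estimate(x, y, N, mu_x)
print(f"b = {b:.3f}, estimate = {mu_hat:.2f}, bound = {bound:.2f}")
```

With these data the sample means are x̄ = 46 and ȳ = 76, so the estimate of µ_y comes out at about 80.6 with a bound on the error of estimation of about 5.4.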

13 A close examination of the data on sugar content and weight of oranges given in Example 3.2 might suggest that a regression estimator is more appropriate than a ratio estimator. A plot of the points will show that the regression line does not appear to go through the origin. However, the regression estimator of a total is of the form $N\hat{\mu}_{yL}$, specifically requiring knowledge of N. Since the ratio estimator also works well in this case, determining the number of oranges in the truckload may not be worth the extra cost and time.

14 In other cases N may be known or easily found. Thus one should carefully consider the choice between ratio and regression estimators when estimating population means or totals.

15 Difference Estimation The difference method of estimating a population mean or total is similar to the regression method in that it adjusts the ȳ value up or down by an amount depending on the difference (µ_x − x̄). However, the regression coefficient b is not computed. In effect, b is set equal to unity. The difference method is, then, easier to employ than the regression method and frequently works just as well.

16 It is commonly employed in auditing procedures, and we will consider such an example in this section. The following formulas hold provided that simple random sampling was employed.

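17 Estimators Difference estimator of a population mean µ_y:
$$\hat{\mu}_{yD} = \bar{y} + (\mu_x - \bar{x}) = \mu_x + \bar{d}, \qquad d_i = y_i - x_i \tag{3.31}$$
Estimated variance of $\hat{\mu}_{yD}$:
$$\hat{V}(\hat{\mu}_{yD}) = \left(\frac{N-n}{Nn}\right)\frac{\sum_{i=1}^{n}(d_i - \bar{d})^2}{n-1} \tag{3.32}$$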

18 Estimators Bound on the error of estimation:
$$2\sqrt{\hat{V}(\hat{\mu}_{yD})} \tag{3.33}$$

19 Example 3.10 Auditors are often interested in comparing the audited value of items with the book value. Generally, book values are known for every item in the population, and audit values are obtained for a sample of these items. The book values can be used to obtain a good estimate of the total or average audit value for the population. Suppose a population contains 180 inventory items with a stated book value of $13,320. Let x_i denote the book value and y_i the audit value of the i-th item. A simple random sample of n = 10 items yields the results shown in the accompanying table. Estimate the mean audit value µ_y by the difference method and estimate the variance of $\hat{\mu}_{yD}$.

20 Data for Problem

Sample   Audit value, y_i   Book value, x_i   d_i = y_i − x_i
1        9                  10                -1
2        14                 12                2
3        7                  8                 -1
4        29                 26                3
5        45                 47                -2
6        109                112               -3
7        40                 36                4
8        238                240               -2
9        60                 59                1
10       170                167               3

21 Solution
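
The computation for Example 3.10 can be sketched in Python. This is a minimal sketch assuming the standard simple-random-sampling form of the difference estimator, µ_x + d̄ with d_i = y_i − x_i, and an estimated variance equal to the sample variance of the d_i divided by n, times the finite population correction. The function name `difference_estimate` is illustrative and not from the lecture.

```python
import math

# Data from Example 3.10.
y = [9, 14, 7, 29, 45, 109, 40, 238, 60, 170]    # audit values
x = [10, 12, 8, 26, 47, 112, 36, 240, 59, 167]   # book values
N = 180             # number of inventory items in the population
tau_x = 13320.0     # total stated book value
mu_x = tau_x / N    # mean book value, 74.0


def difference_estimate(x, y, N, mu_x):
    """Return (estimate of mu_y, estimated variance, bound on error)."""
    n = len(x)
    d = [yi - xi for xi, yi in zip(x, y)]   # audit minus book value
    d_bar = sum(d) / n
    mu_hat = mu_x + d_bar                   # same as y_bar + (mu_x - x_bar)
    # Assumed variance form: sample variance of d over n,
    # scaled by the finite population correction (N - n) / N.
    s2_d = sum((di - d_bar) ** 2 for di in d) / (n - 1)
    var_hat = (N - n) / (N * n) * s2_d
    bound = 2 * math.sqrt(var_hat)
    return mu_hat, var_hat, bound


mu_hat, var_hat, bound = difference_estimate(x, y, N, mu_x)
print(f"estimate = {mu_hat:.2f}, variance = {var_hat:.3f}, bound = {bound:.2f}")
```

Here d̄ = 0.4 and µ_x = 74, so the estimated mean audit value is 74.4 with an estimated variance of about 0.59.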

22 Systematic Sampling

23 Session Objectives: – To introduce basic concepts in systematic sampling – To demonstrate how to select a random sample using a systematic sampling design – To estimate different parameters in systematic random sampling

24 Sample Selection Procedure List all the units in the population from 1, 2, …, N (the sampling frame). Select a random number g in the interval 1 ≤ g ≤ k, using a random mechanism such as random number tables, where k = N/n. Here k is called the sampling interval, N is the population size, and n is the sample size. The random number g is called the random start and constitutes the first unit of the sample.

25 Sample Selection Procedure Take every k-th unit after the random start. The selected units will be g, g+k, g+2k, g+3k, g+4k, …, g+(n−1)k, giving n units in all. Example: N = 10000 and n = 100, so k = 10000/100 = 100. Suppose g = 87.

26 Sample Selection Procedure We select the following units: 87, 187, 287, 387, …, 9987. NB: This procedure is valid only if k is an integer (whole number). If k is not an integer, there are a number of methods we can use. We will consider just two of them.
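
The selection procedure for an integer sampling interval can be sketched in Python as follows. The function name `systematic_sample` and its parameters are illustrative assumptions, and the standard-library `random` module stands in for the random number tables mentioned in the slides.

```python
import random


def systematic_sample(N, n, g=None, rng=random):
    """Select n unit labels from 1..N with sampling interval k = N / n.

    Only valid when k is a whole number, as noted in the slides.
    """
    if N % n != 0:
        raise ValueError("N/n is not an integer; use another method")
    k = N // n
    if g is None:
        g = rng.randint(1, k)   # random start, 1 <= g <= k
    if not 1 <= g <= k:
        raise ValueError("random start must satisfy 1 <= g <= k")
    # Units g, g + k, g + 2k, ..., g + (n - 1)k
    return [g + j * k for j in range(n)]


# The example from the slides: N = 10000, n = 100, g = 87.
units = systematic_sample(10000, 100, g=87)
print(units[:4], units[-1])   # [87, 187, 287, 387] 9987
```

Passing `g` explicitly reproduces the slide example; leaving it out draws the random start from 1 to k.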

27 Sample Selection Procedure Method 1: Use Circular Sampling. Treat the list as circular so that the last unit is followed by the first. Select a random start g between 1 and N, using a random mechanism. Add the interval k repeatedly until n units are selected. Any convenient interval k will result in a random sample.

28 Sample Selection Procedure One suitable suggestion is to choose the integer k closest to the ratio N/n. Method 2: Use Fractional Intervals. Suppose we want to select a sample of 100 units from a population of 21,156. Calculate k = 21156/100 = 211.56. Select a random start g between 1 and 21156 using a random mechanism.

29 Sample Selection Procedure Suppose g = 582. Add the interval 21156 (that is, 100k) successively, obtaining exactly 100 numbers. The numbers will be 582, 21738, 42894, … Divide each number by 100 and round to the nearest whole number to get the selected sample, i.e. 6, 217, 429, etc.
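
Both methods for a non-integer interval can be sketched in Python. The function names are illustrative assumptions. Two details are not covered by the slides and are choices made here: in the fractional method a result that rounds to 0 (possible when g is smaller than n/2) is mapped to unit 1, and halves are rounded up.

```python
import random


def circular_systematic_sample(N, n, k=None, g=None, rng=random):
    """Method 1: treat the list 1..N as circular and step by an integer k."""
    if k is None:
        k = max(1, round(N / n))   # integer closest to N/n
    if g is None:
        g = rng.randint(1, N)      # random start anywhere on the list
    # Wrap around so that unit N is followed by unit 1.
    return [(g - 1 + j * k) % N + 1 for j in range(n)]


def fractional_interval_sample(N, n, g=None, rng=random):
    """Method 2: work on a scale n times larger, then divide by n and round."""
    if g is None:
        g = rng.randint(1, N)      # random start on the enlarged scale
    sample = []
    for j in range(n):
        scaled = g + j * N         # add the interval N = n * k each time
        unit = (2 * scaled + n) // (2 * n)   # scaled / n, rounded half up
        sample.append(max(1, unit))          # assumption: map 0 to unit 1
    return sample


# The example from the slides: N = 21156, n = 100, g = 582.
frac = fractional_interval_sample(21156, 100, g=582)
print(frac[:3])   # [6, 217, 429]

# A small circular example: N = 10, n = 4, k = 3, start at unit 9.
circ = circular_systematic_sample(10, 4, k=3, g=9)
print(circ)       # [9, 2, 5, 8]
```

Integer arithmetic is used for the rounding step because Python's built-in `round` sends exact halves to the nearest even number, which may not match rounding done by hand.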

30 Advantages and Disadvantages of Systematic Sampling Advantages: – The major advantage is that it is easy, almost foolproof, and flexible to implement – It is especially easy to give instructions to fieldworkers – If we order our list prior to taking the sample, the sample will reflect the ordering and as such can easily give a proportionate sample

31 Advantages and Disadvantages of Systematic Sampling Disadvantages: – The main disadvantage is that if there is an ordering (monotonic trend or periodicity) in the list which is unknown to the researcher, this may bias the resulting estimates – There is a problem of estimating variance from a systematic sample: the variance estimate is biased
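
The periodicity problem can be illustrated with a small Python sketch. The population here is hypothetical (350 daily values following an exact weekly cycle) and is not taken from the lecture. The helper `all_systematic_means` is also an illustrative name. It lists the mean of every possible 1-in-k systematic sample, so the effect of the interval can be seen directly.

```python
def all_systematic_means(values, k):
    """Mean of each of the k possible 1-in-k systematic samples."""
    means = []
    for start in range(k):
        sample = values[start::k]   # every k-th value from this start
        means.append(sum(sample) / len(sample))
    return means


# Hypothetical population: 50 weeks of daily values with a weekly cycle.
population = [day % 7 for day in range(7 * 50)]
pop_mean = sum(population) / len(population)

# Interval equal to the period: each sample sees one day of the week only.
aligned = all_systematic_means(population, 7)

# Interval that does not share a factor with the period.
unaligned = all_systematic_means(population, 5)

print(pop_mean)    # 3.0
print(aligned)     # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(unaligned)   # [3.0, 3.0, 3.0, 3.0, 3.0]
```

When the interval matches the period, the estimate depends entirely on the random start, and most starts give a sample mean far from the population mean of 3. With an interval of 5 every sample cycles through all seven days and recovers the population mean exactly in this constructed case.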

