Statistics Stratification.


Stratification

The process of dividing members of the population into homogeneous subgroups before sampling. In general, stratification is used to gain efficiency. If variability is primarily between strata rather than within strata, fewer samples may be needed to reach a given precision.
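To make the between-strata versus within-strata idea concrete, the following minimal sketch splits the total variance of a small data set into its two components. The stratum names and per-tree volumes are hypothetical, chosen only for illustration.

```python
from statistics import mean, pvariance

# Hypothetical per-tree volumes in two strata (illustrative numbers only).
strata = {
    "pulp": [10, 12, 11, 9, 13],
    "sawtimber": [50, 52, 49, 51, 48],
}

all_values = [v for values in strata.values() for v in values]
n = len(all_values)
grand_mean = mean(all_values)

# Variance of the whole population, ignoring strata.
total_var = pvariance(all_values)

# Within-strata variance: size-weighted average of each stratum's own variance.
within_var = sum(len(v) * pvariance(v) for v in strata.values()) / n

# Between-strata variance: size-weighted spread of stratum means around the grand mean.
between_var = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in strata.values()) / n

print(f"total={total_var:.2f} within={within_var:.2f} between={between_var:.2f}")
```

When nearly all of the variance sits between strata, as in this made-up data, sampling each stratum separately leaves only the small within-stratum variance to be estimated.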

Bimodal Distributions

A great case for stratification!

Stratification Strategies

Proportionate allocation uses a sampling fraction in each of the strata that is proportional to that of the total population. For instance, if the population X consists of m in the male stratum and f in the female stratum (where m + f = X), then the relative size of the two samples (x1 = m/K males, x2 = f/K females) should reflect this proportion.

Optimum allocation (or disproportionate allocation): the sample size in each stratum is proportional to both the stratum size and the standard deviation of the variable within that stratum. Larger samples are taken in the strata with the greatest variability to generate the least possible sampling variance.
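The two strategies can be sketched in a few lines. The function names, stratum sizes, and standard deviations below are assumptions made for illustration. The optimum rule shown is the equal-cost form usually called Neyman allocation, where each stratum's share is proportional to its size times its standard deviation.

```python
def proportional_allocation(sizes, n):
    """Split a total sample of n across strata in proportion to stratum size."""
    total = sum(sizes.values())
    return {h: n * size / total for h, size in sizes.items()}


def neyman_allocation(sizes, sds, n):
    """Split n in proportion to (stratum size x stratum standard deviation)."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}


# Hypothetical population: 600 males and 400 females, with the female
# stratum three times as variable as the male stratum.
sizes = {"male": 600, "female": 400}
sds = {"male": 10.0, "female": 30.0}

prop = proportional_allocation(sizes, 100)
opt = neyman_allocation(sizes, sds, 100)
print(prop)
print(opt)
```

The results are fractional sample sizes. Rounding them to whole numbers, and enforcing any minimum sample per stratum, is left out of this sketch.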

The Nature of Risk

Statements (or inferences) about things are based on the best information at hand. In a forestry context, statements are made about stand volumes based on a sample rather than on all of the trees in the stand. There is risk in making any statement, particularly in today's litigious society. Decisions are made as well as possible, with consideration given to the probability that the statement is right or wrong and to the cost of being wrong. Being proactive in finding potential problems in timber sales is essential to efficient cruising. A forestry example is a stand where part is low-value pulp and the rest is high-value sawtimber. Good information about volumes by product is necessary, rather than just a total volume. Accurate representation of what is being sold is important (within reasonable cruising cost guidelines), in fairness to both the purchaser and the seller.

Units and Strata

Unit refers to a cutting unit (a physical piece of ground). There are two levels of stratification: the strata and the sub-strata, or sample groups. Stratification groups similar things together into a population. In forestry, the typical unit of observation is a plot or a tree. Typically, volume is the variable of interest, and its variability drives the coefficient of variation (CV).

Strata versus Attribute

Stratification is used to group similar individuals together into populations. These populations are the basis for statistical calculations, and the error standards in the handbook are written for these strata. Attributes, or categorical variables, can be used to aggregate total volumes in different ways. While averages and totals are available for these different groupings, it would be statistically invalid to post-stratify and calculate confidence intervals about those numbers. The user always has the option of creating strata from these attributes, which places individuals into those populations.

Rules for using

1) Use only one sampling method per stratum (for point/plot cruising, only one BAF or plot size per stratum).
2) Change the sampling frequency within a stratum by defining sub-strata (sample groups).

User Defined Populations

Defining populations is the crux of cruise design. Before a cruise can be designed effectively, the prescriptions must be finalized. It is necessary to know what kind of information is needed in the prospectus in order to design the cruise. For example, if there are big differences in the value of a tree because of size or species, then it is probably necessary to stratify on those characteristics. Once the populations are defined, the next task is to decide how best to sample each population.

Sampling and Attributes

Each individual (a tree or a plot) needs to be identified by the population (stratum) it is in and by where it is located. An individual can belong to only one population and can be located in only one unit. The unit may be used as a stratification variable to place an individual into a population. Membership in a population determines if and when the individual is a measured sample, and the rules for selecting samples vary with the cruise method. When the unit is recorded as an attribute, it can be used to summarize volumes by unit. Other attributes, such as species and logging method, can also be recorded and used to summarize volumes. The key point is that although averages or totals can be calculated by these other attributes, sampling errors and confidence intervals cannot legitimately be calculated for an attribute that is not a stratification variable.

Expanding Samples

Each tree sampled represents other trees that were not sampled. Since sample selection takes place at the population level, the expanded volumes, sampling errors, and statistics are also at the population level. For sample tree cruises, volume is apportioned to a unit in proportion to that unit's percentage of the trees (tally by species). For area-based sampling, the population volume per acre is multiplied by the unit acreage. This, of course, results in all units within a stratum having the same species mix and volume per acre.

Simple Example

Two units, single stratum. Calculate the expansion factor as the number of counted trees divided by the number sampled: 10 divided by 2 equals 5. Each measured tree therefore represents 5 trees in the population (itself and 4 that were not measured). Since stratification was not by unit, the volume needs to be prorated back to the units. Six of the 10 trees observed were in unit 1, so 60% of the PP, WF, and total volume is assumed to be in unit 1. Similarly, 4 of the 10 trees observed were in unit 2, so 40% of the PP, WF, and total volume is assumed to be in unit 2.
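The arithmetic of this example can be sketched as follows. The tree counts (10 counted, 2 measured, 6 in unit 1 and 4 in unit 2) come from the example. The measured volumes of 40 CF for the PP tree and 25 CF for the WF tree are hypothetical, since the slide's own figures are not in the transcript.

```python
# Tree counts per unit, from the example: 6 in unit 1, 4 in unit 2.
counted_by_unit = {1: 6, 2: 4}

# Measured sample trees as (species, volume in CF). Volumes are hypothetical.
measured = [("PP", 40.0), ("WF", 25.0)]

counted = sum(counted_by_unit.values())
expansion_factor = counted / len(measured)  # 10 / 2 = 5.0

# Expand each measured tree to the stratum (population) level.
stratum_volume = {}
for species, volume in measured:
    stratum_volume[species] = stratum_volume.get(species, 0.0) + volume * expansion_factor

# Prorate stratum volume back to each unit by its share of the counted trees.
unit_volume = {
    unit: {sp: vol * trees / counted for sp, vol in stratum_volume.items()}
    for unit, trees in counted_by_unit.items()
}

for unit, volumes in unit_volume.items():
    print(unit, volumes, "total:", sum(volumes.values()))
```

Note that every unit ends up with the same species proportions, which is a consequence of prorating rather than something observed on the ground.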

Volume Expansion Side Effects

Precaution for Point/FIX Plot Sampling

Take a stand with two components, where the S's could represent sawtimber or a species such as spruce, and the P's could represent pulp or pine. Two sample groups could be created, an 'S' and a 'P', and the S's and P's sampled separately on the points/plots. For the S sample group, the error is based on the plot volume of S's, so the variability is very high: some plots have all their volume in S's while others have none. In addition, the presence of S's on a plot means there is less room left for P's, and vice versa, so the volumes of S's and P's are inversely correlated. This high variability, and the resulting high CVs, calls for more plots to meet the sampling error. However, adding more plots could drive the CVs higher still if the variability increases, which would indicate that even more plots are needed, and so on. This type of stand needs to be sampled in a different manner. Either sample minor or highly variable species or products with a separate method in a separate stratum, or split out the portions of units that are highly variable and sample them separately.
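A small numerical sketch shows why the per-group CV is so much higher than the CV of total plot volume. The six plot volumes below are hypothetical and deliberately extreme: each plot holds either S's or P's, never both.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation as a percent of the mean."""
    return 100 * stdev(values) / mean(values)


# Hypothetical volume per plot for each sample group (illustrative only).
s_per_plot = [1000, 0, 950, 0, 1050, 0]
p_per_plot = [0, 900, 0, 1000, 0, 1100]
total_per_plot = [s + p for s, p in zip(s_per_plot, p_per_plot)]

cv_s = cv_percent(s_per_plot)
cv_p = cv_percent(p_per_plot)
cv_total = cv_percent(total_per_plot)

print(f"CV of S group: {cv_s:.0f}%")
print(f"CV of P group: {cv_p:.0f}%")
print(f"CV of total:   {cv_total:.0f}%")
```

With these numbers, each sample group has a CV above 100% while total plot volume has a CV under 10%, because the zeros in one group line up with the volume in the other.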

Point/FIX Plot Expansion and Proration

Remember, if the unit is not used as a stratification variable, then some assumptions are needed to allocate volumes back to the unit level. In this example, a single stratum composed of two units is established. Plots are then placed in the population and the volume per acre is calculated for each plot. Suppose this results in an average volume of 1000 CF per acre. The unit volumes are calculated by multiplying the volume per acre by the unit acres. In prorating point/plot sample volumes, the number of points/plots is used at the stratum level to calculate volume per acre for the stratum, so the number of points/plots is not considered at the unit level.

One precaution in using sample groups with point/plot sampling: don't use sample groups to try to get unit volumes. Because the point/plot count and expansion are at the stratum level, if the units in the previous example were treated as sample groups, it would appear that sample group 1 had eight plots with volume and four without, and the reverse for sample group 2. This increases the variability and also results in strange-looking expanded volumes for units.
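The area-based expansion reduces to one multiplication per unit, sketched below. The 1000 CF per acre average is from the example. The individual plot volumes and the unit acreages of 12 and 8 acres are hypothetical.

```python
from statistics import mean

# Hypothetical per-plot volumes (CF per acre) for the single stratum,
# chosen so that they average to the 1000 CF per acre used in the example.
plot_volumes_per_acre = [800.0, 1200.0, 900.0, 1100.0]

# The stratum average uses every plot, regardless of which unit it fell in.
stratum_volume_per_acre = mean(plot_volumes_per_acre)

# Hypothetical unit acreages.
unit_acres = {1: 12.0, 2: 8.0}

# Unit volume = stratum volume per acre x unit acres.
unit_volume = {unit: stratum_volume_per_acre * acres for unit, acres in unit_acres.items()}

print(stratum_volume_per_acre)
print(unit_volume)
```

Because both units share the stratum average, their volumes differ only by acreage. The number of plots that happened to land in each unit plays no part in the result.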