
1 Sampling
• External Validity
• Sampling Terminology
• Probability in Sampling
• Probability Sampling Designs
• Nonprobability Sampling Designs

3 External Validity
Review: types of studies
• Descriptive
• Relational
• Causal
Review: general forms of validity
• Conclusion, Internal, Construct, External (cumulative)
Definition
• External validity refers to the ability to generalize the results of a study to the broader population of interest.
• Two models can be used to establish and argue for external validity: Model 1 (Sampling) and Model 2 (Proximal Similarity).
• Before examining these two models, review the "Validity Graphic" for the causal context (see next slide).

4 Validity Graphic: Causal Context - Review
[Figure: at the theory level (what you think), a cause construct and an effect construct are linked by the cause-effect construct. At the observation level (what you test), the program (what you do) and the observations (what you see) are linked by the program-outcome relationship. The constructs are operationalized into the program and the observations in this study.]
The External Validity Question (next slide)

5 The External Validity Question
Can we generalize to other persons, places, or times?
[Figure: the theory-level cause-effect construct (cause construct, effect construct) shown above several repeated observation-level panels, each with a program (what you do), observations (what you see), and a program-outcome relationship.]

6 How Do We Establish "External Validity"? Model I: Sampling
Identify the population of interest: specified persons, places, and times.
[Figure: Population]

7 How Do We Establish "External Validity"? Model I: Sampling
Draw a sample from the population.
[Figure: Population, Sample, with an arrow labeled "Draw Sample"]

8 How Do We Establish "External Validity"? Model I: Sampling
What is learned from the study of the sample is generalized back to the population.
[Figure: Population, Sample, with an arrow labeled "Generalize Back to Population"]

9 How Do We Establish "External Validity"? Model II: Proximal Similarity
• Relabeling external validity: the Principle of Proximal Similarity (see next slide)
• First suggested by Donald Campbell, 1986

10 Principle of Proximal Similarity
• A reaction to the sampling theory of generalizability
• Based on the natural science model of high internal validity
• Places the current study within a generalizability space
• Views generalization as a question of degree

11 How Do We Establish "External Validity"? Model II: Proximal Similarity
[Figure: "Our Study" at the center, with gradients of similarity running outward along four dimensions (times, people, places, settings) toward "less similar" contexts.]

12 Threats to External Validity
• Interaction of selection and treatment: maybe it is just these people.
• Interaction of setting and treatment: maybe it is just these places.
• Interaction of history and treatment: maybe it is just these times.

13 External Validity - Evidence?
[Figures: the sampling model (Population, Sample) and the proximal similarity model ("Our Study" with people, places, times, settings)]
• Random sampling
• And... replicate, replicate, replicate
• Use theory

15 Sampling Terminology

16 Q: Who do you want to generalize to? Groups in Sampling

17 A: The theoretical population

18 Q: What population can you get access to? Groups in Sampling The theoretical population

19 Groups in Sampling The Theoretical Population A: The study population

20 Q: How can you get access to them? Groups in Sampling The theoretical population The study population

21 Groups in Sampling The theoretical population The study population A: The sampling frame

22 Q: Who is in your study? Groups in Sampling The theoretical population The study population The sampling frame

23 Groups in Sampling The theoretical population The study population The sampling frame A: The sample

24 Threats to External Validity – Where Can We Go Wrong? The theoretical population The study population The sampling frame The sample

28 Two Major Types of Sampling Methods
Probability sampling
• Uses some form of random selection.
• Requires that each unit have a known (often equal) probability of being selected.
Nonprobability sampling
• Selection is systematic or haphazard, but not random.

30 Terms, Theory, & Probability in Sampling

31 Sampling Terms & Theory
• Sample
  – The group from which information is obtained.
  – The larger group to which one hopes to generalize the results is called the population.
  – The larger the sample size, the better the chance of making accurate generalizations (external validity).
  – Limited resources usually dictate limits to the sample size.
  – If data are collected from the entire population, the result is referred to as a "census" and NOT a sample.

32 Sampling Terms & Theory
• Sample Size
  – The more the better; there is no clear-cut answer, but there are some general rules of thumb.
  – Depends upon the type of study. Minimum recommended:
    » Descriptive study = 100
    » Relational study = 50 (within-group / between-group considerations)
    » Causal study = 30 per group
  – In experimental designs, groups as small as 15 may be valid, due to tight controls, random selection, and random assignment.
  – Sample size is reflected in the results of some statistical procedures (e.g., standard error).

33 Sampling Terms & Theory
• Terms
  – Statistic: computed from a sample and used to estimate a parameter of the population.
    » What the investigator knows.
  – Parameter
    » Characteristic of the population of interest.
    » What the investigator wants to know.
• When estimating a parameter
  – Accuracy: how close is the estimate?
    » Commonly reported as a +/- figure (48% +/- 2%)
  – For population mean estimation
    » Need the sample mean and sample standard deviation

34 Statistical Terms in Sampling
[Figure: the variable is self-esteem, recorded on a 1-5 response scale. The statistic is the sample average (3.72); the parameter is the population average (3.75).]

35 Sampling: Important Concepts (slide 1 of 3)
• Normal Distribution
  – Pattern for the distribution of a set of data that follows a bell-shaped curve, a.k.a. the Gaussian distribution (Carl Gauss).
  – The bell-shaped curve has several properties:
    » Concentrated in the center and decreasing on either side.
    » Symmetric.
  – Probability of deviations from the mean is comparable in either direction.
    » 68%, 95%, 99% rule (based on standard deviation or standard error)
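The 68%, 95%, 99% rule can be checked numerically. The sketch below is an illustration rather than part of the slides: it uses the standard normal-distribution identity P(|Z| < k) = erf(k / sqrt(2)), and the helper name `prob_within` is made up for this example.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Roughly 68%, 95%, and 99.7% for k = 1, 2, 3
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The "99%" in the rule is a rounded label; the three-standard-deviation band actually covers slightly more than 99%.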

36 Sampling: Important Concepts (slide 2 of 3)
• The Sampling Distribution
  – Important concept for explaining how a statistic can be used to estimate a population parameter
  – Variability of a statistic over repeated sampling from the population
  – Established by the statistic measured on an infinite number of samples (theoretical)
• Standard Error (SE)
  – Measure of the spread of the sampling distribution
  – The standard deviation of a sampling distribution
  – Can't be measured directly but can be estimated from sample data
• Confidence Interval (CI)
  – Based on the sampling distribution
  – A statistic, plus or minus an error measure
  – Commonly reported in political polls
    » CI = 48% +/- 2% (46% - 50%)

37 The Sampling Distribution
The sampling distribution is the distribution of a statistic across an infinite number of samples.
[Figure: three sample histograms (x-axis 3.0 to 4.4), each reduced to its average; the averages together form the sampling distribution histogram (x-axis 3.0 to 4.4, y-axis 0 to 15).]
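A small simulation can make the sampling distribution concrete. This is a sketch under stated assumptions: the population is synthetic (10,000 normally distributed scores with mean 3.75 and standard deviation 0.25, chosen only for illustration), and a finite number of repeated samples stands in for the theoretical infinite number.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of self-esteem scores (assumed values).
population = [random.gauss(3.75, 0.25) for _ in range(10_000)]

def sample_means(pop, n, repeats):
    """Draw `repeats` simple random samples of size n and return their means."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(repeats)]

n = 100
means = sample_means(population, n, repeats=2_000)

# The standard deviation of the sample means approximates the standard error.
empirical_se = statistics.stdev(means)

# In practice only one sample is available, so the SE is estimated as s / sqrt(n).
one_sample = random.sample(population, n)
estimated_se = statistics.stdev(one_sample) / n ** 0.5

print(round(empirical_se, 4), round(estimated_se, 4))
```

Both numbers should land near 0.25 / sqrt(100) = 0.025, which is why a single sample is enough to estimate the spread of the sampling distribution.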

38 Sampling: Important Concepts (slide 3 of 3)
• Sampling Error
  – In the sampling context, the standard error is called the sampling error; they are the same thing.
  – Sampling error is the error caused by observing a sample instead of the whole population.
    » Remember, there is some "true" value that represents the actual population measure (the parameter). Your sample statistic is likely to vary from this value. This variation is considered error.
  – Estimated from the sample mean and standard deviation
• Standard Deviation (s)
  – Measure of the variability within the sample data.
    » Take a sample of 100 values.
    » These values are likely to differ from each other.
    » s is a measure that represents the extent of this variation.
    » The 68%, 95%, 99% rule describes the distribution of the sample data (normal distribution).
  – Used to estimate the standard error

39 Sample Standard Deviation & Standard Error
• Refresher: sample standard deviation
• Standard Error (SE)
• Note: the larger the sample size (n), the smaller the SE, which leads to a more precise confidence interval.
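For reference, the usual textbook forms of these two quantities are given below. The notation is assumed rather than taken from the slide: x_i are the individual sample values, \bar{x} is the sample mean, and n is the sample size.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
\qquad\qquad
SE = \frac{s}{\sqrt{n}}
```

Because n appears under the square root in the denominator, quadrupling the sample size halves the standard error.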

40 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: histogram of self-esteem (x-axis, 3.0 to 4.5) by frequency (y-axis, 0 to 150)]
The population has a mean of 3.75.

41 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: same histogram]
The population has a mean of 3.75 and a standard error of .25.

42 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: same histogram]
The population has a mean of 3.75 and a standard error of .25. This means that...

43 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: same histogram]
The population has a mean of 3.75 and a standard error of .25. This means that about 68% of our sample cases fall between 3.5 and 4.0.

44 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: same histogram]
The population has a mean of 3.75 and a standard error of .25. This means that about 68% of our sample cases fall between 3.5 and 4.0. About 95% of sample cases fall between 3.25 and 4.25.

45 Standard Error – Relationship Between Population and Sample Results
68%, 95%, 99% Rule
[Figure: same histogram]
The population has a mean of 3.75 and a standard error of .25. This means that about 68% of the sample cases fall between 3.5 and 4.0. About 95% of the sample cases fall between 3.25 and 4.25. About 99% of the sample cases fall between 3.0 and 4.5.

46 Sampling Error (Standard Error in Sampling)
68%, 95%, 99% Rule
[Figure: histogram of self-esteem (x-axis, 3.0 to 4.5) by frequency (y-axis, 0 to 150)]
The standard error is called the sampling error.
The sample of 1000 has a mean of 3.74 and a standard error of .0074.

47 Sampling Error
68%, 95%, 99% Rule
[Figure: same histogram, with the confidence level marked]
The sample of 1000 has a mean of 3.74 and a standard error of .0074.
The sampling error shows that the odds are .95 that the population mean is 3.74 ± 2(.0074).

48 Sampling Error
68%, 95%, 99% Rule
[Figure: same histogram]
The sample of 1000 has a mean of 3.74 and a standard error of .0074.
The sampling error shows that the odds are .95 that the population mean is 3.74 ± 2(.0074).
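The interval on these slides can be worked out directly. The following is a minimal sketch using the slide's own numbers (mean 3.74, standard error .0074) and the rule-of-thumb multiplier of 2 standard errors for roughly 95% confidence; the function name is hypothetical.

```python
def confidence_interval(mean, standard_error, z=2.0):
    """Return (lower, upper) bounds computed as mean ± z * standard_error."""
    margin = z * standard_error
    return mean - margin, mean + margin

low, high = confidence_interval(3.74, 0.0074)
print(f"{low:.4f} to {high:.4f}")  # 3.7252 to 3.7548
```

A multiplier of 1.96 is the more exact value for a 95% interval; the slides round it to 2.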

50 Probability Sampling Designs

51
• Uses some form of random selection.
• Requires that each unit have a known (often equal) probability of being selected.
• Primary characteristic: some form of random sampling is used
  – Simple random sampling
  – Stratified sampling
  – Systematic sampling
  – Cluster (area) sampling
  – Multistage sampling

52 Some Definitions
• N = the number of cases in the sampling frame
• n = the number of cases in the sample
• NCn = the number of combinations (subsets) of n from N
• f = n/N = the sampling fraction
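These definitions translate directly into a few lines of code. The values N = 100 and n = 20 below are illustrative assumptions, chosen to match the systematic sampling example later in the deck.

```python
import math

N = 100  # cases in the sampling frame (illustrative value)
n = 20   # cases in the sample (illustrative value)

num_possible_samples = math.comb(N, n)  # NCn: distinct subsets of size n
f = n / N                               # sampling fraction

print(num_possible_samples, f)
```

Even for a small frame, the number of possible samples is enormous, which is why the selection has to be made by a random mechanism rather than by enumerating subsets.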

53 Simple Random Sampling
Objective: Select n units out of N such that each of the NCn possible samples has an equal chance of being selected.
Procedure: Use a table of random numbers, a computer random number generator, or a mechanical device.
MS Excel: the formula for random number generation is =RAND()
f = n/N is the sampling fraction.

54 Simple Random Sampling
Example:
• Small service agency.
• Client assessment of quality of service.
• Get list of clients over the past year.
• Draw a simple random sample of n/N.

55 Simple Random Sampling List of clients

56 Simple Random Sampling List of clients Random subsample
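The client-list example can be sketched with Python's standard library. The client IDs are invented for illustration; `random.sample` draws without replacement, so every subset of size n is equally likely to be chosen.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n units from the sampling frame without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: client IDs from the past year.
clients = [f"client_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(clients, n=20, seed=42)

print(len(sample) / len(clients))  # sampling fraction f = n/N
```

Passing a seed makes the draw repeatable, which helps when the selection needs to be documented or audited.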

57 Stratified Random Sampling
Sometimes called "proportional" or "quota" random sampling.
Objective: The population of N units is divided into nonoverlapping strata N1, N2, N3, ... Ni such that N1 + N2 + ... + Ni = N; then draw a simple random sample of n/N in each stratum.

58 Stratified Sampling - Purposes:
• To ensure representation of each stratum; oversample smaller population groups.
• Sampling problems may differ in each stratum.
• Increase precision (lower variance) if strata are homogeneous within.

59 Stratified Random Sampling List of clients

60 Stratified Random Sampling List of clients Strata: African-American, Others, Hispanic-American

61 Stratified Random Sampling List of clients Random subsamples of n/N Strata: African-American, Others, Hispanic-American

62 Proportionate vs. Disproportionate Stratified Random Sampling
• Proportionate: the sampling fraction is equal for each stratum
  – Results represent the population
• Disproportionate: the sampling fraction is unequal across strata
  – Needed to enable better representation of smaller (minority) groups
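The difference between proportionate and disproportionate designs comes down to the sampling fraction used in each stratum. The sketch below assumes a made-up client list with three strata labeled A, B, and C; the function and its parameters are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fractions, seed=None):
    """Draw a simple random sample within each stratum.

    units      -- list of sampling units
    stratum_of -- function mapping a unit to its stratum label
    fractions  -- dict of stratum label -> sampling fraction for that stratum
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for label, members in strata.items():
        k = round(fractions[label] * len(members))
        sample[label] = rng.sample(members, k)
    return sample

# Hypothetical client list of (id, group) pairs: 80 in A, 15 in B, 5 in C.
clients = ([(i, "A") for i in range(80)]
           + [(i, "B") for i in range(80, 95)]
           + [(i, "C") for i in range(95, 100)])

def group(client):
    return client[1]

# Same fraction everywhere: the sample mirrors the population mix.
proportionate = stratified_sample(clients, group, {"A": 0.2, "B": 0.2, "C": 0.2}, seed=1)
# Larger fractions for the small strata: oversampling the smaller groups.
disproportionate = stratified_sample(clients, group, {"A": 0.1, "B": 0.4, "C": 1.0}, seed=1)

print({k: len(v) for k, v in proportionate.items()})
print({k: len(v) for k, v in disproportionate.items()})
```

With a disproportionate design, overall population estimates need to be weighted back by the stratum fractions, since the raw sample no longer mirrors the population.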

63 Systematic Random Sampling
Procedure:
• Number the units in the population from 1 to N.
• Decide on the n that you want or need.
• N/n = k, the interval size.
• Randomly select a number from 1 to k.
• Take every kth unit.

64 Systematic Random Sampling
• Assumes that the population is randomly ordered.
• Example: The library (ACM) study.

65 Systematic Random Sampling
[Figure: the units numbered 1 to 100, arranged in four columns (1-25, 26-50, 51-75, 76-100)]
N = 100

66 Systematic Random Sampling
[Figure: same grid of units 1 to 100]
N = 100
Want n = 20

67 Systematic Random Sampling
[Figure: same grid of units 1 to 100]
N = 100
Want n = 20
N/n = 5

68 Systematic Random Sampling
[Figure: same grid of units 1 to 100]
N = 100
Want n = 20
N/n = 5
Select a random number from 1-5: chose 4

69 Systematic Random Sampling
[Figure: same grid of units 1 to 100]
N = 100
Want n = 20
Interval size: N/n = 5
Select a random number from 1-5: chose 4
Start with #4 and take every 5th unit
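The worked example (N = 100, n = 20, interval 5, random start 4) can be reproduced in a few lines. This sketch assumes N is an exact multiple of n, as it is on the slides; the function name is made up for illustration.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers (1..N) chosen by taking every k-th unit.

    k = N // n is the interval size; `start` is a random number from
    1 to k unless one is supplied.
    """
    k = N // n
    if start is None:
        start = random.Random(seed).randint(1, k)
    return list(range(start, N + 1, k))[:n]

print(systematic_sample(100, 20, start=4))  # 4, 9, 14, ..., 99
```

Only the starting point is random; every later unit is determined by it, which is why the method depends on the list having no periodic ordering that lines up with the interval.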

70 Cluster (Area) Random Sampling
Procedure:
• Divide population into clusters.
• Randomly sample clusters.
• Measure all units within sampled clusters.

71 Cluster (Area) Random Sampling
• Advantages: Administratively useful, especially when you have a wide geographic area to cover.
• Example: Randomly sample from city blocks and measure all homes in selected blocks.
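The city-block example can be sketched as follows. The blocks and homes are invented data, and the function name is hypothetical; the key point is that randomness applies to the clusters, and every unit inside a chosen cluster is measured.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly select whole clusters and return every unit inside them.

    clusters -- dict of cluster label -> list of units in that cluster
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {label: list(clusters[label]) for label in chosen}

# Hypothetical city: 10 blocks with 4 homes each.
blocks = {f"block_{b}": [f"home_{b}_{h}" for h in range(4)] for b in range(10)}
selected = cluster_sample(blocks, num_clusters=3, seed=7)

for block, homes in selected.items():
    print(block, homes)
```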

72 Multi-Stage Sampling
• Cluster (area) random sampling can be multi-stage.
• Any combination of single-stage methods.

73 Multi-Stage Sampling
Example: Choosing students from schools
• Select all schools; then sample within schools.
• Sample schools; then measure all students.
• Sample schools; then sample students.
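The third option, "sample schools; then sample students", is a two-stage design and can be sketched as below. The schools, student names, and sample sizes are all assumptions made for illustration.

```python
import random

def two_stage_sample(schools, num_schools, students_per_school, seed=None):
    """Stage 1: sample schools. Stage 2: sample students within each sampled school."""
    rng = random.Random(seed)
    chosen_schools = rng.sample(sorted(schools), num_schools)
    return {
        school: rng.sample(schools[school],
                           min(students_per_school, len(schools[school])))
        for school in chosen_schools
    }

# Hypothetical district: 8 schools with 30 students each.
schools = {f"school_{s}": [f"student_{s}_{i}" for i in range(30)] for s in range(8)}
sample = two_stage_sample(schools, num_schools=3, students_per_school=5, seed=3)

for school, students in sample.items():
    print(school, students)
```

Setting `students_per_school` to the full school size turns this into the second option (sample schools, then measure all students), which is single-stage cluster sampling.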

75 Nonprobability Sampling Designs
• Non-random designs
• Major issues
  – Likely to misrepresent the population
  – May be difficult or impossible to detect this misrepresentation

76 Types of Nonprobability Samples
• Accidental, haphazard, convenience
• Purposive
  – Modal instance
  – Expert
  – Quota
  – Snowball
  – Heterogeneity sampling

77 Accidental, Haphazard or Convenience Sampling
• One of the most common methods of sampling: collect data from easy sources.
  – "Man on the street"
  – College psychology majors
  – Available or accessible clients
  – Volunteer samples
• Problem: No evidence for representativeness of the population of interest.

78 Purposive Sampling (multiple subcategories on following slides)
• Researchers use their judgment to select a sample they believe, based on prior information, will provide the data they need.
• Issues:
  – Researcher's judgment may be wrong.
  – Need theory to sample correctly.
  – Inference

79 Modal Instance Sampling
In statistics, the mode is the most frequently occurring value in a distribution; therefore, modal instance sampling is sampling for the typical case.
Example: Public opinion poll. Typical voter? Will it play in Peoria?
Problem: The criteria used to identify the modal group may not be correct.

80 Expert Sampling
• Involves assembling a sample of persons with expertise in some area.
  – "Panel of experts."
• Two potential reasons to use expert sampling
  – Best way to elicit the views of persons with specific expertise.
  – Provide evidence for the validity of another sampling approach.
    » For example, you do modal instance sampling and are concerned about the criteria used for defining the modal instance. An expert panel can examine the modal definitions and comment on their appropriateness and validity.
• Advantage: Expert judgment supports the sampling.
• Problem: The "experts" may be wrong.

81 Quota Sampling
• Select nonrandomly according to a quota. Two types.
• Proportional quota sampling
  – Want to represent the major characteristics of the population by sampling a proportional amount of each.
• Nonproportional quota sampling
  – Specify the minimum number of sampled units in each category.
  – Not concerned with numbers that match proportions.
  – Instead, enough to assure that small groups in the population can be studied.
  – Nonprobabilistic analogue of stratified random sampling, typically used to assure smaller groups are represented in the sample.
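Quota sampling can be sketched as filling fixed-size bins in whatever order respondents happen to become available. Everything in this example (the respondent stream, the categories, the quota sizes) is assumed for illustration.

```python
def quota_sample(arrivals, category_of, quotas):
    """Accept units in the order they arrive until each category's quota is full.

    Selection is nonrandom: whoever is available first fills the quota.
    """
    counts = {category: 0 for category in quotas}
    selected = []
    for unit in arrivals:
        category = category_of(unit)
        if category in counts and counts[category] < quotas[category]:
            selected.append(unit)
            counts[category] += 1
        if counts == quotas:
            break  # every quota is filled
    return selected

# Hypothetical stream of respondents as (id, group) pairs.
arrivals = [(1, "women"), (2, "women"), (3, "men"), (4, "women"),
            (5, "men"), (6, "men"), (7, "women")]
picked = quota_sample(arrivals, lambda r: r[1], {"women": 2, "men": 2})

print(picked)  # [(1, 'women'), (2, 'women'), (3, 'men'), (5, 'men')]
```

Unlike stratified random sampling, no random draw happens within a category, so the units selected depend entirely on arrival order.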

82 Snowball Sampling
• Begin by identifying someone who meets the criteria for inclusion in your study.
• Then ask them to recommend others who also meet the criteria.
  – One person recommends another, who recommends another, who recommends...
• Good way to identify hard-to-reach populations
  – Homeless persons
  – Drug users
  – And others.
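The referral chain can be modeled as a traversal of a "who recommends whom" network. This is only a sketch: the names and the referral table are invented, and a real study would collect referrals interactively rather than from a prepared dictionary.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Follow referral chains from the seed participants until max_size is reached.

    referrals -- dict mapping a participant to the people they recommend
    """
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:  # skip anyone already recruited or queued
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral network.
referrals = {"Ana": ["Ben", "Cy"], "Ben": ["Dee"], "Cy": ["Ana", "Eli"], "Dee": []}
print(snowball_sample(["Ana"], referrals, max_size=4))  # ['Ana', 'Ben', 'Cy', 'Dee']
```

The sample can only reach people connected to the seeds, which is one reason snowball samples are not representative of the wider population.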

83 Heterogeneity Sampling
• Use when you:
  – Want to include all opinions or views
  – Aren't concerned about representing these views proportionately.
• Another term: diversity.
• Brainstorming (e.g., concept mapping)
  – Uses some form of heterogeneity sampling
  – Primary interest is in getting a broad spectrum of ideas, not identifying the "average" or "modal instance" ones.
• Sampling is about ideas, not people.

