Sampling
Sampling is the process of selecting a subset of a population for study. It is a critical step in research design because it determines the power of the study and the degree of confidence in the results. Any defect in sampling reduces the accuracy of the results and calls into question the internal and external validity of the whole study. Ideally, research would study the whole population, but this is usually (though not always) not feasible, so we take a sample of the population. This is done in three steps:
1- Define the target population: the large set of all patients or people throughout the world to whom the results will be generalized, specified by clinical and demographic characteristics (e.g. all teenagers with asthma).
2- Define the accessible population: the subset of the target population that is available for the study, specified by geographic and temporal characteristics (e.g. teenagers with asthma living in Mosul city during the year 2013).
3- Define the sample: it is selected from the accessible population by either a randomized (probability) or a non-randomized (non-probability) method of sampling. It should be representative of the accessible population and easy to obtain.
Sampling in Epidemiology
Why sample?
1- We are unable to study all members of a population.
2- To reduce bias.
3- To save time and money.
4- Measurements may be better in a sample than in the entire population.
5- Feasibility.
Sampling Techniques
I- Randomized Sampling (probability sampling):
1- Simple random sampling: units are drawn using a table of random numbers. We use this method when we have a defined population (a known list of its members).
2- Systematic sampling: used when the population is not defined, by choosing every nth person; n can be any number, chosen according to the size of the sample that we need. It is sometimes considered not fully random.
3- Stratified sampling: the population is divided into groups (strata), then equal samples are taken from each group. We use it when each group is defined by a specific property that may affect the data, e.g. sex, age, economic status, occupation, geographical distribution, ...
4- Cluster sampling: used when the population is very large and we need to take many samples from different parts of the population, so we take a number of clusters (e.g. 30 clusters and 30 or 40 individuals or households from each).
5- Multi-stage sampling: we take samples from provinces, then from districts of these provinces, then from villages of these districts, and so on. It is also used for sampling from schools.
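The probability designs listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 4000-person population, and its "sex" and "village" fields are all hypothetical, and the cluster function picks clusters with equal probability, whereas field surveys may select them with probability proportional to size instead. Stratified sampling uses equal allocation per stratum, as described in the text.

```python
import random
from collections import defaultdict
from itertools import islice


def simple_random_sample(frame, size, rng):
    """Every unit in the listed population has the same chance of selection."""
    return rng.sample(frame, size)


def systematic_sample(units, interval, rng):
    """Take every interval-th unit after a random start within the first interval.

    Works on a stream of units, so no complete list is needed in advance.
    """
    start = rng.randrange(interval)
    return list(islice(units, start, None, interval))


def stratified_sample(units, stratum_of, per_stratum, rng):
    """Group units into strata, then draw the same number from each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, min(per_stratum, len(members)))
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, per_cluster, rng):
    """Stage 1: draw clusters. Stage 2: draw units inside each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], min(per_cluster, len(clusters[name])))
            for name in chosen}


rng = random.Random(2013)  # fixed seed so the illustration is repeatable

# Hypothetical population: 4000 people in 100 villages of 40 people each.
people = [{"id": i, "sex": "F" if i % 3 else "M", "village": (i - 1) // 40}
          for i in range(1, 4001)]
villages = defaultdict(list)
for person in people:
    villages[person["village"]].append(person["id"])

srs = simple_random_sample(people, 50, rng)
systematic = systematic_sample(iter(people), 100, rng)    # every 100th person
stratified = stratified_sample(people, lambda p: p["sex"], 25, rng)
clustered = cluster_sample(villages, 30, 30, rng)         # 30 clusters x 30 people

print(len(srs), len(systematic), len(clustered))
```

Adding a further stage (province, then district, then village) would repeat the same two steps at each level, which is the multi-stage idea described in item 5.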
II- Non-randomized Sampling (non-probability sampling):
1- Inclusion-exclusion criteria.
2- Convenience sampling.
3- Consecutive sampling.
4- Judgmental sampling.
5- Quota sampling.
Specification
Done by inclusion and exclusion criteria:
Inclusion criteria: define the main characteristics of the target and accessible populations on the basis of demographic and clinical characteristics, and also geographic and temporal characteristics (e.g. for a 5-year trial of calcium supplementation for preventing osteoporosis, the inclusion criteria will be white females 45-50 years of age in good general health attending Al-Yarmouk hospital between January 1st and December 31st, 2013).
Exclusion criteria: indicate the subset of individuals who meet the inclusion criteria but are likely to interfere with the quality of the data or the interpretation of the findings (e.g. for the same sample: those who are alcoholic, plan to move out of the province, are disoriented or have a language barrier, have an illness, or are unwilling to accept the possibility of random allocation to the placebo group, ...).
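As a rough sketch, the criteria above can be written as an explicit eligibility check applied to every candidate. The record fields and function names below are hypothetical, and only some of the criteria from the osteoporosis example are encoded.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical screening record; field names are illustrative only.
    age: int
    sex: str
    good_general_health: bool
    alcoholic: bool = False
    plans_to_move: bool = False
    language_barrier: bool = False
    accepts_randomization: bool = True


def meets_inclusion(c):
    """Inclusion: women aged 45-50 in good general health."""
    return c.sex == "F" and 45 <= c.age <= 50 and c.good_general_health


def is_excluded(c):
    """Exclusion: traits likely to harm data quality or follow-up."""
    return (c.alcoholic or c.plans_to_move or c.language_barrier
            or not c.accepts_randomization)


def eligible(c):
    return meets_inclusion(c) and not is_excluded(c)


candidates = [
    Candidate(age=47, sex="F", good_general_health=True),
    Candidate(age=47, sex="F", good_general_health=True, plans_to_move=True),
    Candidate(age=52, sex="F", good_general_health=True),
]
enrolled = [c for c in candidates if eligible(c)]
print(len(enrolled))
```

Keeping inclusion and exclusion as separate functions mirrors the two-step logic of the text: a person must first qualify, and only then can be excluded.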
Convenience sample
A non-random collection of sampling units from an undefined sampling frame.
Advantages: convenient and easy to perform.
Disadvantages: no statistical justification for the sample.
Examples of convenience samples:
1- A case series of patients with a particular condition at a certain hospital.
2- “Normal” graduate students walking down the hall are asked to donate blood for a study.
3- Children with febrile seizures reporting to an emergency room.
4- The investigator decides who is enrolled in a study.
Consecutive sample
A case series of consecutive patients with a condition of interest. A consecutive series means ALL patients with the condition within the hospital or clinic, not just the patients the investigators happen to know about.
Advantages:
1- Removes the investigator from deciding who enters the study.
2- Requires a protocol with definitions of the condition of interest.
3- A straightforward way to enroll subjects.
Disadvantage: non-random.
Examples of consecutive samples:
1- Outcome of 1000 consecutive patients presenting to the emergency room with chest pain.
2- Natural history of all 125 patients with HIV-associated TB during a 5-year period.
Explicit efforts must be made to identify and recruit ALL persons with the condition of interest.
Other non-randomized sampling methods:
1- Volunteer sample.
2- Seeding sampling.
3- Capture-recapture.
Capture-recapture sampling
A non-random method of sampling that relies on lists of sampling units obtained from multiple sources. The overlap between the lists allows one to estimate the number of individuals not ‘captured’.
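One common way to turn the overlap into an estimate is the two-source Lincoln-Petersen estimator, N = (n1 × n2) / m, where n1 and n2 are the sizes of the two lists and m is the number of individuals appearing on both. The notes do not name a specific estimator, so its use here is an assumption, as are the function name and the hospital and clinic lists.

```python
def lincoln_petersen(list_a, list_b):
    """Estimate total population size from two overlapping case lists.

    Returns (estimated_total, estimated_missed), where estimated_missed is the
    number of individuals that appear on neither list.
    """
    a, b = set(list_a), set(list_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("the lists share no individuals; no estimate is possible")
    estimated_total = len(a) * len(b) / overlap
    observed = len(a | b)  # individuals captured by at least one source
    return estimated_total, estimated_total - observed


# Hypothetical case IDs: 60 cases in a hospital register, 80 in a clinic register,
# with 20 individuals (IDs 41 to 60) present on both.
hospital_register = range(1, 61)
clinic_register = range(41, 121)

total, missed = lincoln_petersen(hospital_register, clinic_register)
print(total, missed)  # 240.0 120.0
```

The estimate relies on being able to match individuals across the lists and on the two sources listing cases independently of each other.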
Uses of this method:
1- Estimate a parameter when incomplete information is available from 2 sources.
2- Refine prevalence or incidence estimates from population surveys.
3- Assess the completeness of event reporting.
4- Derive plausible upper and lower limits on the total population affected.
Advantages:
1- Does not require a random sample.
2- Can give a more precise estimate of a parameter than a probability sample.
3- Easy to perform in the field.
4- Useful in estimating events in difficult-to-access populations.
Disadvantages:
1- Analysis of the lists may be complicated.
2- Need to be able to match individuals across lists.
3- Requires assumptions regarding the probability of being listed by a source.
4- Unfamiliar to epidemiologists.
Which sampling design is best?
Choose the method that gives the greatest degree of accuracy and precision for a given cost.