for epidemiological studies

Slides:



Advertisements
Similar presentations
17th EPIET introductory course Lazareto, Menorca, Spain
Advertisements

Sampling techniques & sample size
Faculty of Allied Medical Science Biostatistics MLST-201
1 1 Slide 2009 University of Minnesota-Duluth, Econ-2030(Dr. Tadesse) Chapter 7, Part B Sampling and Sampling Distributions Other Sampling Methods Other.
Taejin Jung, Ph.D. Week 8: Sampling Messages and People
MISUNDERSTOOD AND MISUSED
Who and How And How to Mess It up
Sampling.
Why sample? Diversity in populations Practicality and cost.
Sampling Prepared by Dr. Manal Moussa. Sampling Prepared by Dr. Manal Moussa.
Sampling and Sampling Distributions: Part 2 Sample size and the sampling distribution of Sampling distribution of Sampling methods.
Responding driven sampling Principles of Sampling Session 1.
11 Populations and Samples.
Sampling and Sampling Procedures.  In most epidemiologic studies, we deal with a sample of the population  The study population may be:  An entire.
Sampling Concepts Population: Population refers to any group of people or objects that form the subject of study in a particular survey and are similar.
1 COMM 301: Empirical Research in Communication Kwan M Lee Lect5_1.
Sampling Moazzam Ali.
Sampling Designs and Sampling Procedures
Sample Design.
COLLECTING QUANTITATIVE DATA: Sampling and Data collection
McGraw-Hill/Irwin McGraw-Hill/Irwin Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.
1 1 Slide © 2009 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
1 1 Slide © 2005 Thomson/South-Western Slides Prepared by JOHN S. LOUCKS St. Edward’s University Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
Sampling Basics Jeremy Kees, Ph.D.. Conceptually defined… Sampling is the process of selecting units from a population of interest so that by studying.
IB Business and Management
Chapter 7 Sampling and Sampling Distributions Sampling Distribution of Sampling Distribution of Introduction to Sampling Distributions Introduction to.
CHAPTER 12 – SAMPLING DESIGNS AND SAMPLING PROCEDURES Zikmund & Babin Essentials of Marketing Research – 5 th Edition © 2013 Cengage Learning. All Rights.
1 1 Slide Chapter 7 (b) – Point Estimation and Sampling Distributions Point estimation is a form of statistical inference. Point estimation is a form of.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
Sampling Methods. Definition  Sample: A sample is a group of people who have been selected from a larger population to provide data to researcher. 
Research Methodology Lecture No :14 (Sampling Design)
Shooting right Sampling methods FETP India. Competency to be gained from this lecture Select a sample from a population to generate precise and valid.
1 Chapter 7 Sampling and Sampling Distributions Simple Random Sampling Point Estimation Introduction to Sampling Distributions Sampling Distribution of.
CLUSTER SAMPLING a survey of selected groups within a population the entire population is divided into groups (clusters), and a random sample of these.
Sampling “Sampling is the process of choosing sample which is a group of people, items and objects. That are taken from population for measurement and.
Chapter Twelve Chapter 12.
Chapter Twelve. Figure 12.1 Relationship of Sampling Design to the Previous Chapters and the Marketing Research Process Focus of This Chapter Relationship.
Lecture 9 Prof. Development and Research Lecturer: R. Milyankova
1 UNIT 10: POPULATION AND SAMPLE. 2 Population The entire set of people, things or objects to be studied An element is a single member of the population.
SAMPLING TECHNIQUES. Definitions Statistical inference: is a conclusion concerning a population of observations (or units) made on the bases of the results.
Sampling Methods. Probability Sampling Techniques Simple Random Sampling Cluster Sampling Stratified Sampling Systematic Sampling Copyright © 2012 Pearson.
Tahir Mahmood Lecturer Department of Statistics. Outlines: E xplain the role of sampling in the research process D istinguish between probability and.
Business Project Nicos Rodosthenous PhD 04/11/ /11/20141Dr Nicos Rodosthenous.
Chapter Eleven The entire group of people about whom information is needed; also called the universe or population of interest. The process of obtaining.
Sampling Sources: -EPIET Introductory course, Thomas Grein, Denis Coulombier, Philippe Sudre, Mike Catchpole -IDEA Brigitte Helynck, Philippe Malfait,
The Sampling Design. Sampling Design Selection of Elements –The basic idea of sampling is that by selecting some of the elements in a population, we may.
Chapter 6: 1 Sampling. Introduction Sampling - the process of selecting observations Often not possible to collect information from all persons or other.
Chapter 10 Sampling: Theories, Designs and Plans.
LIS 570 Selecting a Sample.
Ch. 11 SAMPLING. Sampling Sampling is the process of selecting a sufficient number of elements from the population.
 When every unit of the population is examined. This is known as Census method.  On the other hand when a small group selected as representatives of.
Sampling Presented by: Dr. Amira Yahia INTRODUCTION It might be impossible to investigate everybody in a population.Thus, you need to select a sample.
Sampling Design and Procedure
Chapter Eleven Sampling: Design and Procedures © 2007 Prentice Hall 11-1.
Statistics – Chapter 1 Data Collection
Graduate School of Business Leadership
Population and samples
Sampling.
Slides by JOHN LOUCKS St. Edward’s University.
محيط پژوهش محيط پژوهش كه قلمرو مكاني نيز ناميده مي شود عبارت است از مكاني كه نمونه هاي آماري مورد مطالعه از آنجا گرفته مي شود .
Welcome.
Sampling techniques & sample size.
Sampling Sampling is choosing a sample.
BUSINESS MARKET RESEARCH
Sample-Sampling-Pengelompokan Data
Sampling Sampling is choosing a sample.
Sampling: How to Select a Few to Represent the Many
Sampling Techniques Assist. Prof. Maha A. AL-Nuaimi Ph.D/ Comm. Med
Presentation transcript:

for epidemiological studies Sampling Techniques for epidemiological studies Biagio Pedalino

Objectives To decide whether to conduct sampling To choose among a list of sampling techniques To define sampling To describe sampling techniques

Approach We are normally interested in: - Distribution of a variable of interest in a specific population

Definition of population A population is defined by: Its nature (an individual, housing, a firm etc.) Its intrinsic characteristics (gender, housing type, industries) Its localisation (city, neighbourhood etc.) who, where and when

Examples of populations The inhabitants of London in 2010 French nationality women living in Paris in 2010 Children in elementary schools in France in 2010 HIV seropositive patients in hospital centers in France in 2006 Individuals recently entered in the French prisons in 2010

Variable of interest We are interested in a (non random) variable y in a population U (of k units) It must be defined carefully and accurately (i.e. vaccination status) Example We are interested in HIV (variable y) prevalence in a population. The variable of interest is defined, for each unit k in the population by: y= 1 (HIV seropositive); 0 (otherwise, i.e. negative, not known, etc)

Example of research question What is the proportion of individuals vaccinated against Hepatitis B in Lazareto, in October 2012 ? Variable of interest? Population? Time? How to obtain the information about the variable ? Census Build a sample

Example of research question What is the proportion of individuals vaccinated against Hepatitis B in Lazareto, in October 2012 ? Variable of interest? Population? Time? How to obtain the information about the variable ? Census Build a sample

Example of research question What is the proportion of children vaccinated against Hepatitis B in Minorca, in 2012 ? Variable of interest? Population? Time? How to obtain the information about the variable ? Census Build a sample

Example of research question What is the proportion of children vaccinated against Hepatitis B in Minorca, in 2012 ? Variable of interest? Population? Time? How to obtain the information about the variable ? Census Build a sample

Definition of sampling Sampling is the process of selecting units from a specific population to collect information on a variable of interest

Sample Sampling frame Target population

Why bother in the first place? Get information from large populations with: Reduced costs Reduced field time Increased accuracy

Definition of sampling terms Sampling frame List of all the sampling units from which sample is drawn Lists: e.g. all children < 5 years of age, households, health care units… Sampling scheme Method of selecting sampling units from sampling frame Randomly, convenience sample…

Definition of sampling terms Sampling unit (element) Subject under observation from whom information is collected Example: children <5 years, hospital discharges, health events… Sampling fraction Ratio between sample size and population size Example: 100 out of 2000 (5%)

Sampling errors Systematic error (or bias) Representativeness (validity) Information bias Sampling error (random error) Precision

Validity Sample should accurately reflect the distribution of relevant variable in population Person (age, sex) Place (urban vs. rural) Time (seasonality) Representativeness essential to generalise

Representativeness Often used as synonym of validity of a sample General rule: to build a sample representative of the whole population of interest Erroneous

Which is the correct sample? Populations Women Men Women Men Women Men 50% 50% 50% 50% 50% 50% All of them are correct !!! Samples Women Men Women Men Women Men 50% 50% 30% 70% 70% 30%

Example: question Aim of the study is to estimate the national prevalence of elevated Blood Lead Level (BLL 100μg/L) determine the risk factors associated to elevated BLL Among children aged 1 to 6 years in Minorca in 2008-2009 We want to recruit 3000 children through hospitals Is the best design to select each unit with the same inclusion probability?

Example: answer No! ... because if the expected prevalence of elevated BLL is 1% With a sample size = 3000, we expect to have 30 children with an elevated BLL in the sample Small number to achieve the second objective of the study, i.e. identification of risk factors It is one of the reasons why we do not perform surveys with equally-represented individuals in epidemiology

Example: solution We want to over-represent children with an elevated BLL in the sample If we know that some hospitals stand in areas where the risk of lead exposure in the dwellings is high, then we will over-represent hospitals in these high risk areas

Representativeness: take home messages A sample is correct if randomly built It is not necessary that the distributions in the sample and in the population are the same

Information bias Systematic problem in collecting information Inaccurate measuring Scales (weight), ultrasound, lab tests (dubious results) Badly asked questions Ambiguous, not offering right options…

Sampling error (random error) No sample is an exact mirror image of the population Standard error depends on size of the sample distribution of character of interest in population Size of error can be measured in probability samples standard error

Quality of a sampling estimate No precision Random error Precision but no validity Systematic error (bias) Precision & validity 7

Survey errors: example Measuring height: Measuring tape held differently by different investigators → loss of precision → large standard error Tape too short → systematic error → bias (cannot be corrected retrospectively)

Types of sampling Non-probability samples Probability samples Convenience samples Biased Subjective samples Based on knowledge In the presence of time/resource constraints Probability samples Random only method that allows valid conclusions about population and measurements of sampling error 28

Non-probability samples Convenience samples (ease of access) Snowball sampling (friend of friend….etc.) Purposive sampling (judgemental) You chose who you think should be in the study Probability of being chosen is unknown Cheaper- but unable to generalise, potential for bias

Example of a non-probability sample Take a sample of the population of Minorca to ask about possible exposures following a gastroenteritis outbreak Sampling frame: people walking around the Es Castel harbour at noon on a Monday

Probability samples Random sampling Each unit has a known probability of being selected Allows application of statistical sampling theory to results in order to: Generalise Test hypotheses

Methods used in probability samples Simple random sampling Systematic sampling Stratified sampling Multi-stage sampling Cluster sampling

Simple random sampling Principle Equal chance/probability of each unit being drawn Procedure Take sampling population Need listing of all sampling units (“sampling frame”) Number all units Randomly draw units

Simple random sampling 5 20 27 29 32 40

Simple random sampling Advantages Simple Sampling error easily measured Disadvantages Need complete list of units Units may be scattered and poorly accessible Heterogeneous population  important minorities might not be taken into account

Systematic sampling Principle Procedure Select sampling units at regular intervals (e.g. every 20th unit) Procedure Arrange the units in some kind of sequence Divide total sampling population by the designated sample size (eg 1200/60=20) Choose a random starting point (for 20, the starting point will be a random number between 1 and 20) Select units at regular intervals (in this case, every 20th unit), i.e. 4th, 24th, 44th etc. 37

Systematic sampling Advantages Disadvantages Ensures representativity across list Easy to implement Disadvantages Need complete list of units Periodicity-underlying pattern may be a problem (characteristics occurring at regular intervals) 38

More complex sampling methods

Stratified sampling When to use Procedure Population with distinct subgroups Procedure Divide (stratify) sampling frame into homogeneous subgroups (strata) e.g. age-group, urban/rural areas, regions, occupations Draw random sample within each stratum 40

Selecting a sample with probability proportional to size Stratified sampling Selecting a sample with probability proportional to size Area Population Proportion Sample size Sampling size fraction Urban 7000 70% 1000 x 0.7 = 700 10 % Rural 3000 30% 1000 x 0.3 = 300 10 % 1000 Total 10000 41

Stratified sampling Advantages Disadvantages Can acquire information about whole population and individual strata Precision increased if variability within strata is smaller (homogenous) than between strata Disadvantages Sampling error is difficult to measure Different strata can be difficult to identify Loss of precision if small numbers in individual strata (resolved by sampling proportional to stratum population) 42

Multiple stage sampling Principle: Consecutive sampling Example : sampling unit = household 1st stage: draw neighbourhoods 2nd stage: draw buildings 3rd stage: draw households 8

Cluster sampling Principle Whole population divided into groups e.g. neighbourhoods A type of multi-stage sampling where all units at the lower level are included in the sample Random sample taken of these groups (“clusters”) Within selected clusters, all units e.g. households included (or random sample of these units) Provides logistical advantage

Number of cluster needed=25

Number of cluster needed=25

All third-stage units might be included in the sample Stage 3: Selection of the sampling unit All third-stage units might be included in the sample 54

Stage 3: Selection of the sampling unit Second-stage units => Households Third-stage unit => Individuals 55

Cluster sampling Advantages Simple as complete list of sampling units within population not required Less travel/resources required Disadvantages Cluster members may be more alike than those in another cluster (homogeneous) This needs to be taken into account in the sample size and in the analysis (“design effect”)

Selecting a sampling method Population to be studied Size/geographical distribution Heterogeneity with respect to variable Availability of list of sampling units Level of precision required Resources available

Conclusions Probability samples are the best Ensure Validity Precision …..within available constraints

Conclusions If in doubt… Call a statistician !!!!

Questions?