Essentials of Marketing Research Chapter 13: Determining Sample Size
WHAT DO STATISTICS MEAN?
- DESCRIPTIVE STATISTICS
  - NUMBER OF PEOPLE
  - TRENDS IN EMPLOYMENT DATA
- INFERENTIAL STATISTICS
  - MAKE AN INFERENCE ABOUT A POPULATION FROM A SAMPLE
POPULATION PARAMETER VERSUS SAMPLE STATISTICS
POPULATION PARAMETER
- VARIABLES IN A POPULATION
- MEASURED CHARACTERISTICS OF A POPULATION
- GREEK LOWER-CASE LETTERS AS NOTATION, e.g., $\mu$, $\sigma$
SAMPLE STATISTICS
- VARIABLES IN A SAMPLE
- MEASURES COMPUTED FROM SAMPLE DATA
- ENGLISH LETTERS FOR NOTATION, e.g., $\bar{X}$ or $S$
MAKING DATA USABLE
Data must be organized into:
- FREQUENCY DISTRIBUTIONS
- PROPORTIONS
- CENTRAL TENDENCY: MEAN, MEDIAN, MODE
- MEASURES OF DISPERSION: range, deviation, standard deviation, variance
Frequency Distribution of Deposits
MEASURES OF CENTRAL TENDENCY
- MEAN: ARITHMETIC AVERAGE
- MEDIAN: MIDPOINT OF THE DISTRIBUTION
- MODE: THE VALUE THAT OCCURS MOST OFTEN
Number of Sales Calls Per Day by Salespersons
Salesperson    Sales calls
Mike           4
Patty          3
Billie         2
Bob            5
John           3
Frank          3
Chuck          1
Samantha       5
Total          26
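As an illustration, the three measures of central tendency can be computed for the sales-call data above. This is a minimal sketch using Python's standard `statistics` module; the variable names are chosen here for illustration and do not come from the slides.

```python
import statistics

# Sales calls per day, copied from the slide's table.
sales_calls = {
    "Mike": 4, "Patty": 3, "Billie": 2, "Bob": 5,
    "John": 3, "Frank": 3, "Chuck": 1, "Samantha": 5,
}
values = list(sales_calls.values())

mean_calls = statistics.mean(values)      # arithmetic average: 26 calls / 8 people
median_calls = statistics.median(values)  # midpoint of the sorted values
mode_calls = statistics.mode(values)      # value that occurs most often

print("mean:", mean_calls)
print("median:", median_calls)
print("mode:", mode_calls)
```

For this small data set the median and mode coincide, while the mean is pulled slightly upward by the two salespersons who made five calls.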
Sales for Products A and B, Both Average 200
Product A    Product B
196          150
198          160
199          176
199          181
200          192
200          200
200          201
201          202
201          213
201          224
202          240
202          261
MEASURES OF DISPERSION
- THE RANGE
- STANDARD DEVIATION
Low Dispersion Versus High Dispersion
[Figure: two frequency charts, one labeled "Low Dispersion" and one labeled "High dispersion". Horizontal axis: Value on Variable (150 to 210). Vertical axis: Frequency (1 to 5).]
Standard Deviation
$S = \sqrt{S^2} = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$
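To make the contrast between Products A and B concrete, the range and the sample standard deviation (with the n - 1 divisor) can be computed directly from the sales figures on the earlier slide. This is a rough sketch; the function names `value_range` and `sample_std` are hypothetical helpers introduced here.

```python
import math

# Sales figures for the two products, copied from the slide.
product_a = [196, 198, 199, 199, 200, 200, 200, 201, 201, 201, 202, 202]
product_b = [150, 160, 176, 181, 192, 200, 201, 202, 213, 224, 240, 261]


def value_range(xs):
    """Range: largest value minus smallest value."""
    return max(xs) - min(xs)


def sample_std(xs):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(xs)
    mean = sum(xs) / n
    squared_deviations = sum((x - mean) ** 2 for x in xs)
    return math.sqrt(squared_deviations / (n - 1))


range_a, range_b = value_range(product_a), value_range(product_b)
std_a, std_b = sample_std(product_a), sample_std(product_b)

print(f"Product A: range={range_a}, std={std_a:.2f}")
print(f"Product B: range={range_b}, std={std_b:.2f}")
```

Both products average about 200, yet Product B's standard deviation is many times larger than Product A's, which is exactly the difference the mean alone hides.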
THE NORMAL DISTRIBUTION
- NORMAL CURVE
- BELL-SHAPED
- ALMOST ALL OF ITS VALUES ARE WITHIN PLUS OR MINUS 3 STANDARD DEVIATIONS
- I.Q. IS AN EXAMPLE
NORMAL DISTRIBUTION
[Figure: normal curve with the MEAN marked at its center.]
Conventional Product Adoption Life Cycle: Five types of customers who will end up adopting a product
- INNOVATORS (2.5%): People who are the first to adopt a product. They are trend-setting, risk-taking, and are not typical consumers. Example: See a movie the first weekend it’s out or in a preview.
- EARLY ADOPTERS (13.5%): People who are among the first but not as risk-taking. They adopt ideas early but with consideration, and they enjoy roles as opinion leaders. They spread the word about the product. Example: See a movie the first week of its release.
- EARLY MAJORITY (34%): Deliberate customers; adopt earlier than most customers but are not leaders. Example: See a movie after a few weeks, after reading all the reviews and getting recommendations from early adopters.
- LATE MAJORITY (34%): Skeptical customers; will adopt an idea only if the majority of people have tried it. Example: See a movie after it has been nominated for an Oscar.
- LAGGARDS (16%): Tradition-bound, suspicious of change; will adopt an idea only after it has been around long enough. Example: See a movie after it has come out on video.
Normal Distribution
[Figure: normal curve divided into bands by standard deviation. Areas on each side of the mean: 34.13%, 13.59%, 2.14%.]
An example of the distribution of Intelligence Quotient (IQ) scores
[Figure: normal curve of IQ scores. Horizontal axis: IQ, marked at 70, 85, 100, 115, 130. Areas on each side of the mean: 34.13%, 13.59%, 2.14%.]
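The percentages on the IQ curve can be reproduced from the normal distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes a mean of 100 and a standard deviation of 15, which is inferred from the axis marks (70, 85, 100, 115, 130) rather than stated on the slide.

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(mean=100, sd=15), inferred from the axis marks.
iq = NormalDist(mu=100, sigma=15)


def area_between(low, high):
    """Share of the distribution lying between two IQ scores."""
    return iq.cdf(high) - iq.cdf(low)


band_0_to_1_sd = area_between(100, 115)  # mean to +1 standard deviation
band_1_to_2_sd = area_between(115, 130)  # +1 to +2 standard deviations
band_2_to_3_sd = area_between(130, 145)  # +2 to +3 standard deviations
within_3_sd = area_between(55, 145)      # -3 to +3 standard deviations

for label, area in [
    ("100 to 115", band_0_to_1_sd),
    ("115 to 130", band_1_to_2_sd),
    ("130 to 145", band_2_to_3_sd),
    ("55 to 145", within_3_sd),
]:
    print(f"{label}: {area * 100:.2f}%")
```

Because the curve is symmetrical, the same areas apply to the bands below the mean.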
STANDARDIZED NORMAL DISTRIBUTION
- SYMMETRICAL ABOUT ITS MEAN
- MEAN IDENTIFIES HIGHEST POINT
- INFINITE NUMBER OF CASES: A CONTINUOUS DISTRIBUTION
- TOTAL AREA UNDER THE CURVE (TOTAL PROBABILITY) = 1.0
- MEAN OF ZERO, STANDARD DEVIATION OF 1
A STANDARDIZED NORMAL CURVE
[Figure: standardized normal curve. Horizontal axis marked at -2, -1, 1, and 2.]
STANDARDIZED SCORES
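The slide gives only the title, so the following is a sketch of the usual definition rather than the deck's own formula: a standardized score expresses a value as its distance from the mean in standard deviation units, z = (x - mean) / standard deviation. The IQ figures (mean 100, standard deviation 15) are assumptions inferred from the axis of the earlier IQ figure.

```python
def z_score(x, mean, std_dev):
    """Standardized score: distance from the mean in standard deviation units."""
    return (x - mean) / std_dev


# Assumed IQ parameters (mean 100, standard deviation 15).
z_high = z_score(130, mean=100, std_dev=15)
z_low = z_score(85, mean=100, std_dev=15)
z_mean = z_score(100, mean=100, std_dev=15)

print("IQ 130 ->", z_high)
print("IQ 85  ->", z_low)
print("IQ 100 ->", z_mean)
```

Standardizing puts any normally distributed variable on the scale of the standardized normal curve, with a mean of zero and a standard deviation of 1.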
- POPULATION DISTRIBUTION
- SAMPLE DISTRIBUTION
- SAMPLING DISTRIBUTION
POPULATION DISTRIBUTION
[Figure: distribution of x in the population, centered on the mean $\mu$, with $-\sigma$ and $\sigma$ marked on either side.]
SAMPLE DISTRIBUTION
[Figure: distribution of X in a sample, with the sample mean $\bar{X}$ and the sample standard deviation $S$ marked.]
SAMPLING DISTRIBUTION
[Figure: distribution of the sample mean $\bar{X}$, centered on $\mu_{\bar{X}}$, with the standard error $S_{\bar{X}}$ marked.]
STANDARD ERROR OF THE MEAN
- STANDARD DEVIATION OF THE SAMPLING DISTRIBUTION
CENTRAL LIMIT THEOREM
PARAMETER ESTIMATES
- POINT ESTIMATES
- CONFIDENCE INTERVAL ESTIMATES
RANDOM SAMPLING ERROR AND SAMPLE SIZE ARE RELATED
SAMPLE SIZE
- VARIANCE (STANDARD DEVIATION)
- MAGNITUDE OF ERROR
- CONFIDENCE LEVEL
Determining Sample Size Recap
Sample Accuracy
- How close the sample’s profile is to the true population’s profile
- Sample size is not related to representativeness; sample size is related to accuracy
Methods of Determining Sample Size
- Compromise between what is theoretically perfect and what is practically feasible.
- Remember, the larger the sample size, the more costly the research.
- Why sample one more person than necessary?
Methods of Determining Sample Size
- Arbitrary
  - Rule of thumb (e.g., a sample should be at least 5% of the population to be accurate)
  - Not efficient or economical
- Conventional
  - Assumes there is some “convention” or number believed to be the right size
  - Easy to apply, but can end up with a sample that is too small or too large
Methods of Determining Sample Size
- Cost Basis
  - based on budgetary constraints
- Statistical Analysis
  - certain statistical techniques require a certain number of respondents
- Confidence Interval
  - theoretically the most correct method
Notion of Variability
[Figure: two distributions around the same mean, one labeled "Little variability" and one labeled "Great variability".]
Notion of Variability
- Standard deviation
  - approximates the average distance from the mean for all respondents to a specific question
  - indicates the amount of variability in the sample
  - e.g., compare standard deviations of 500 and 1,000: which exhibits more variability?
Measures of Variability
- Standard deviation: indicates the degree of variation or diversity in the values in such a way as to be translatable into a normal curve distribution
- Variance = $\sum_i (x_i - \bar{x})^2 / (n - 1)$
- With a normal curve, the midpoint (apex) of the curve is also the mean, and exactly 50% of the distribution lies on either side of the mean.
Normal Curve and Standard Deviation
Notion of Sampling Distribution
- The sampling distribution refers to what would be found if the researcher could take many, many independent samples.
- The means of all the samples should align themselves in a normal, bell-shaped curve.
- Therefore, there is a high probability that any given sample result will be close to, but not exactly equal to, the population mean.
[Figure: normal, bell-shaped curve with the midpoint (mean) marked.]
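The idea of taking many independent samples can be illustrated with a small simulation. Everything specific in this sketch is an assumption made for illustration: the skewed (exponential) population, the sample size of 50, the 2,000 repeated samples, and the fixed random seed. It uses only the Python standard library.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable (assumption)

# Hypothetical, deliberately skewed population (exponential, mean near 1).
population = [rng.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

sample_size = 50
num_samples = 2_000

# Draw many independent samples and record each sample's mean.
sample_means = [
    statistics.fmean(rng.sample(population, sample_size))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)      # spread of the sample means
predicted_se = pop_sd / math.sqrt(sample_size)    # standard deviation / sqrt(n)

print(f"population mean      : {pop_mean:.3f}")
print(f"mean of sample means : {mean_of_means:.3f}")
print(f"observed std. error  : {observed_se:.3f}")
print(f"predicted std. error : {predicted_se:.3f}")
```

Even though the population is far from bell-shaped, the sample means cluster tightly around the population mean, and their spread is close to the standard deviation divided by the square root of the sample size.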
Notion of Confidence Interval
- A confidence interval defines endpoints based on knowledge of the area under a bell-shaped curve.
- Normal curve
  - the mean plus or minus 1.96 times the standard deviation theoretically contains 95% of the population
  - the mean plus or minus 2.58 times the standard deviation theoretically contains 99% of the population
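The multipliers 1.96 and 2.58 come from the standardized normal curve: they are the z values that leave 95% and 99% of the area in the middle. A minimal sketch using `statistics.NormalDist` follows; the helper name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def z_for_confidence(level):
    """z value such that the central `level` share of the curve lies within +/- z."""
    upper_tail_cutoff = (1 + level) / 2  # e.g. 0.975 for a 95% level
    return standard_normal.inv_cdf(upper_tail_cutoff)


z_95 = z_for_confidence(0.95)
z_99 = z_for_confidence(0.99)

print(f"95% -> z = {z_95:.2f}")
print(f"99% -> z = {z_99:.2f}")
```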
Notion of Confidence Interval: Example
- Mean = 12,000 miles
- Standard deviation = 3,000 miles
- We are confident that 95% of the respondents’ answers fall between 6,120 and 17,880 miles
  - 12,000 + (1.96 * 3,000) = 17,880
  - 12,000 - (1.96 * 3,000) = 6,120
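The arithmetic in this example can be checked with a few lines of Python. The function name `interval_around_mean` is a hypothetical helper; the inputs are the slide's own figures.

```python
def interval_around_mean(mean, std_dev, z):
    """Endpoints lying z standard deviations below and above the mean."""
    margin = z * std_dev
    return mean - margin, mean + margin


# Figures from the slide: mean 12,000 miles, standard deviation 3,000 miles, z = 1.96.
lower, upper = interval_around_mean(mean=12_000, std_dev=3_000, z=1.96)

print(f"lower endpoint: {lower:,.0f} miles")
print(f"upper endpoint: {upper:,.0f} miles")
```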
Notion of Standard Error of a Mean
Standard error is an indication of how far away from the true population value a typical sample result is expected to fall.
Formulas
- $S_{\bar{X}} = s / \sqrt{n}$
- $S_p = \sqrt{(p \cdot q) / n}$
where
- $S_{\bar{X}}$ is the standard error of the mean
- s = standard deviation of the sample
- n = sample size
- $S_p$ is the standard error of the percentage
- p = % found in the sample and q = (100 - p)
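Both standard error formulas translate directly into code. In the sketch below the function names are hypothetical, and the inputs (a sample of 100 respondents, a 50% result) are invented for illustration; only the standard deviation of 3,000 echoes the earlier mileage example.

```python
import math


def standard_error_of_mean(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def standard_error_of_percentage(p, n):
    """Standard error of a percentage: sqrt(p * q / n), with q = 100 - p."""
    q = 100 - p
    return math.sqrt(p * q / n)


# Hypothetical inputs: n = 100 respondents.
se_mean = standard_error_of_mean(s=3_000, n=100)
se_pct = standard_error_of_percentage(p=50, n=100)

# Quadrupling the sample size halves the standard error.
se_mean_larger_sample = standard_error_of_mean(s=3_000, n=400)

print("standard error of the mean      :", se_mean)
print("standard error of the percentage:", se_pct)
print("standard error with n = 400     :", se_mean_larger_sample)
```

Because n sits under a square root, each halving of the standard error requires four times as many respondents.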
Computing Sample Size Using The Confidence Interval Approach
To compute sample size, three factors need to be considered:
- amount of variability believed to be in the population
- desired accuracy
- level of confidence required in your estimates of the population values
Determining Sample Size Using a Mean
- Formula for a percentage: $n = (p q z^2) / e^2$
- Formula for a mean: $n = (s^2 z^2) / e^2$
Where
- n = sample size
- z = level of confidence (indicated by the number of standard errors associated with it)
- s = variability indicated by an estimated standard deviation
- p = estimated variability in the population, expressed as a percentage
- q = (100 - p)
- e = acceptable error in the sample estimate of the population value
Determining Sample Size Using a Mean: An Example
- 95% level of confidence (z = 1.96)
- Standard deviation of 100 (from previous studies)
- Desired precision is 10 (+ or -)
- Therefore $n = (100^2 \times 1.96^2) / 10^2 = 384.16 \approx 384$
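The worked example can be reproduced with the two sample size formulas. This is a sketch with hypothetical function names; the inputs for the mean come from the slide, while the inputs for the percentage (p = 50, e = 5) are assumptions added for illustration.

```python
import math


def sample_size_for_mean(s, z, e):
    """n = (s^2 * z^2) / e^2"""
    return (s ** 2 * z ** 2) / e ** 2


def sample_size_for_percentage(p, z, e):
    """n = (p * q * z^2) / e^2, with q = 100 - p and e in percentage points."""
    q = 100 - p
    return (p * q * z ** 2) / e ** 2


# Slide example: standard deviation 100, 95% confidence (z = 1.96), precision +/- 10.
n_mean = sample_size_for_mean(s=100, z=1.96, e=10)

# Hypothetical percentage example: p = 50%, 95% confidence, precision +/- 5 points.
n_pct = sample_size_for_percentage(p=50, z=1.96, e=5)

print(f"mean example      : n = {n_mean:.2f}")
print(f"percentage example: n = {n_pct:.2f}")
print("rounded up        :", math.ceil(n_mean))
```

The slide reports the result as 384; rounding the raw value up instead, so that the stated precision is fully met, would give 385.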
Practical Considerations in Sample Size Determination
- How to estimate variability in the population
  - prior research
  - experience
  - intuition
- How to determine amount of precision desired
  - small samples are less accurate
  - how much error can you live with?
Practical Considerations in Sample Size Determination
- How to calculate the level of confidence desired
  - risk
  - normally use either 95% or 99%
Determining Sample Size
Higher n (sample size) needed when:
- the standard error of the estimate is high (there is more variability in the sampling distribution of the test statistic)
- higher precision (low degree of error) is needed (i.e., it is important to have a very precise estimate)
- higher level of confidence is required
Constraints: cost and access
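How strongly each factor drives the required sample size can be seen by varying one input at a time in the formula n = (s^2 * z^2) / e^2. The baseline below reuses the earlier worked example (s = 100, z = 1.96, e = 10); the alternative values are assumptions chosen for illustration, and the function name is hypothetical.

```python
def sample_size_for_mean(s, z, e):
    """n = (s^2 * z^2) / e^2"""
    return (s ** 2 * z ** 2) / e ** 2


baseline = sample_size_for_mean(s=100, z=1.96, e=10)

# Vary one factor at a time, holding the other two at their baseline values.
more_variability = sample_size_for_mean(s=200, z=1.96, e=10)  # standard deviation doubled
more_precision = sample_size_for_mean(s=100, z=1.96, e=5)     # acceptable error halved
more_confidence = sample_size_for_mean(s=100, z=2.58, e=10)   # 99% instead of 95%

print(f"baseline              : {baseline:.2f}")
print(f"double the variability: {more_variability:.2f}")
print(f"halve the error       : {more_precision:.2f}")
print(f"99% confidence        : {more_confidence:.2f}")
```

Doubling the variability or halving the acceptable error each quadruples the required sample, while moving from 95% to 99% confidence raises it by roughly three quarters. Population size is not an input to this formula.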
Notes About Sample Size
- Population size does not determine sample size.
- What most directly affects sample size is the variability of the characteristic in the population.
- Example: if all population elements have the same value of a characteristic, then we only need a sample of one!