Reasoning in Psychology Using Statistics 2017
Announcements: Quiz 2 is due Fri. Feb. 3 (11:59 pm). Don’t forget Exam 1 is coming up (Feb 8): the in-class part is multiple choice, closed book; the in-lab part is open book/notes. Today: sampling and basic probability.
Sampling: discussion of the 1976 Landers survey. Ann Landers asked readers, “If you had to do it again, would you have children?” (1975-76). Nearly 10,000 responses; 70% said kids not worth it! Do you believe the results? Is the sample representative of all parents, that is, does it reflect the population of parents?
Sampling. Population: those the research is about. Sample: the subset that participates in the research (giving us our data).
Sampling. From population to sample: sampling makes data collection manageable. From sample back to population: inferential statistics let us generalize.
Sampling. A local politician wants to know opinions on proposed rate hikes. Population (N = 25): each person is either for or against rate hikes. Proportion “for hikes” in population = (# “for hikes”) / (total #) = 10/25 = 0.4.
Sampling. Population (N = 25), sample (n = 5). Proportion “for hikes” in sample = (# “for hikes”) / (total #) = 2/5 = 0.4. Should we expect to get a sample that matches the population exactly? If it does not: SAMPLING ERROR.
Sampling. Goals of sampling: reduce sampling error; maximize representativeness; minimize bias.
Sampling. Goals of sampling: reduce sampling error (the difference between the population parameter and the sample statistic, BUT we usually don’t know what the population parameter is!); maximize representativeness; minimize bias.
Sampling Error. Population (N = 25): proportion “for hikes” = (# “for hikes”) / (total #) = 10/25 = 0.4 (parameter). Sample (n = 5): proportion “for hikes” = 2/5 = 0.4 (statistic). Sampling error = 0.4 - 0.4 = 0.
Sampling Error. Population (N = 25): proportion “for hikes” = (# “for hikes”) / (total #) = 10/25 = 0.4 (parameter). Sample (n = 5): proportion “for hikes” = 3/5 = 0.6 (statistic). Sampling error = 0.6 - 0.4 = 0.2.
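The two rate-hike examples can be checked with a few lines of Python. This is only a sketch: the helper names `proportion` and `sampling_error` are made up for illustration, and the error is taken as statistic minus parameter, as in the slides above.

```python
from fractions import Fraction

def proportion(count_for, total):
    """Proportion 'for hikes', kept as an exact fraction."""
    return Fraction(count_for, total)

def sampling_error(statistic, parameter):
    """Sample statistic minus population parameter."""
    return statistic - parameter

parameter = proportion(10, 25)   # population: 10 of 25 are for hikes
sample_a = proportion(2, 5)      # first sample: 2 of 5
sample_b = proportion(3, 5)      # second sample: 3 of 5

print(float(sampling_error(sample_a, parameter)))  # 0.0
print(float(sampling_error(sample_b, parameter)))  # 0.2
```

In real research the parameter is usually unknown, so this subtraction is possible only in toy examples like this one.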
Sampling Error: Games of chance. Population (N = 52); lots of samples (hands, n = 5). http://www.intmath.com/counting-probability/poker.php. Lucky numbers: Marcus du Sautoy (~14 mins).
Sampling Error: Games of chance. Population (N = 52): proportion of spades in deck = 13/52 = 0.25 (parameter). Sample (n = 5): proportion of spades in a draw = 1/5 = 0.20 (statistic). Size of sampling error = |0.20 - 0.25| = 0.05.
Sampling Error: Games of chance. Population (N = 52): proportion of any one suit in deck = 13/52 = 0.25 (parameter). Sample (n = 5): proportion of that suit in a draw = 5/5 = 1.0 (statistic). Size of sampling error = |1.0 - 0.25| = 0.75.
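To see how sampling error varies from hand to hand, one can deal many 5-card hands from a simulated deck. This is a rough sketch, not part of the original slides: the deck encoding, the seed, and the choice of 1,000 hands are all arbitrary assumptions.

```python
import random

SUITS = ["spades", "hearts", "diamonds", "clubs"]
RANKS = list(range(1, 14))
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]  # population, N = 52

def spade_proportion(cards):
    """Proportion of the given cards that are spades."""
    return sum(1 for _, suit in cards if suit == "spades") / len(cards)

def deal_hands(num_hands, hand_size=5, seed=0):
    """Deal independent hands, each drawn without replacement from a full deck."""
    rng = random.Random(seed)
    return [rng.sample(DECK, hand_size) for _ in range(num_hands)]

parameter = spade_proportion(DECK)        # 13/52 = 0.25
hands = deal_hands(1000)
errors = [spade_proportion(hand) - parameter for hand in hands]
mean_error = sum(errors) / len(errors)

print("parameter:", parameter)
print("smallest and largest error:", min(errors), max(errors))
print("average error over all hands:", round(mean_error, 3))
```

Single hands can miss the parameter by a lot, but the average error over many hands should sit near zero.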
Sampling Error. Use the sample (statistic) to estimate the population (parameter). Problem: samples vary, so different samples give different estimates. But we know what affects the size of sampling error (this can be proved mathematically): variability in the population (+ relationship: as variability increases, sampling error increases) and size of the sample (- or inverse relationship: as sample size increases, sampling error decreases). Formula we will learn later: SE = SD/√n. Parameter: from the Greek for “beside the measure” (compare paralegal, paramilitary).
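The formula SE = SD/√n captures both relationships. Here is a small sketch; the SD values and sample sizes are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def standard_error(sd, n):
    """SE = SD / sqrt(n), the formula quoted on the slide."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)

# Hypothetical values, for illustration only.
for sd in (10.0, 20.0):
    for n in (4, 25, 100):
        print(f"SD={sd}, n={n}: SE={standard_error(sd, n):.2f}")
```

With SD = 10, going from n = 4 to n = 100 drops SE from 5.0 to 1.0; doubling SD to 20 doubles SE at every sample size.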
Sampling. Goals of sampling: reduce sampling error (the difference between the population parameter and the sample statistic); maximize representativeness (to what extent the characteristics of the sample reflect those in the population); minimize bias (a systematic difference between sample and population).
Sampling Methods. Probability sampling: simple random sampling, systematic random sampling, stratified sampling. Non-probability sampling: convenience sampling, quota sampling.
Sampling Methods: simple random sampling (probability sampling). Every individual has an equal and independent chance of being selected from the population.
Sampling Methods: systematic random sampling (probability sampling). Step 1: compute K = population size / sample size, e.g. 22/6 ≈ 3.67, so K = 4. Step 2: starting from a randomly chosen person, select every Kth person.
Sampling Methods: stratified sampling (probability sampling). Step 1: identify groups (strata): blue, green, red. Step 2: randomly select from each group, proportional to the size of the group: 8/23 = .35, 11/23 = .48, 4/23 = .17. If n = 5, that works out to about 2, 2, and 1 from the three groups respectively.
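The three probability sampling methods can be sketched in Python as follows. Several things here are assumptions made for illustration: the function names, the numbered 22-person population, the pairing of the colours with group sizes 8, 11, and 4, and the decision to round K and the per-stratum shares to the nearest whole number.

```python
import random

def simple_random_sample(population, n, rng):
    """Each individual has the same chance of being picked."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every Kth individual."""
    k = max(1, round(len(population) / n))  # e.g. 22 / 6 rounds to K = 4
    start = rng.randrange(k)
    # Can return fewer than n when the population size is not a multiple of K.
    return population[start::k][:n]

def stratified_sample(strata, n, rng):
    """Sample each stratum in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    picked = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)  # rounding need not sum to n in general
        picked[name] = rng.sample(members, share)
    return picked

rng = random.Random(0)
population = list(range(1, 23))  # 22 hypothetical people, numbered 1 to 22
print(simple_random_sample(population, 6, rng))
print(systematic_sample(population, 6, rng))

# Assumed pairing of colours to stratum sizes (8, 11, 4).
strata = {
    "blue": [f"blue{i}" for i in range(8)],
    "green": [f"green{i}" for i in range(11)],
    "red": [f"red{i}" for i in range(4)],
}
picked = stratified_sample(strata, 5, rng)
sizes = {name: len(members) for name, members in picked.items()}
print(sizes)  # {'blue': 2, 'green': 2, 'red': 1}
```

Which individuals get picked depends on the seed, but the stratified allocation of 2, 2, and 1 follows from the group sizes alone.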
Sampling Methods: quota sampling (non-probability sampling). Step 1: identify groups: blue, green, red. Step 2: pick the first # from each group (not proportional to group size). If n = 6, pick 2 from each group.
Sampling Methods: convenience sampling (non-probability sampling). Uses easily available participants, e.g. a voluntary response method of sampling. Results are typically biased: typical respondents have very strong opinions (NOT representative of the population). Ann Landers: 70% of parents say kids not worth it! A Newsday random sample (n = 1373) found 91% said “yes”. For more discussion: David Bellhouse.
Sampling Methods. Probability sampling (simple random, systematic random, stratified) and non-probability sampling (convenience, quota) range from good to poor representativeness. Bias: a “stacked deck”.
Inferential statistics: where does “probability” fit in? Drawing from a population, many different samples are possible. Randomness in sampling leads to variability in sampling error. “Randomness” is unpredictable in the short run but predictable in the long run (e.g. odds in games of chance). This allows predictions about the likelihood of getting particular samples.
Inferential statistics: where does “probability” fit in? If we know the proportions in the population and we know how we sampled (deal 5 cards), we can find the probability of a sample with particular characteristics, e.g. probability of 4 of a kind = 0.00024. This allows predictions about the likelihood of getting particular samples. Inferential statistics: tools that use our estimates of sampling error to generalize from observations from samples to statements about the populations.
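The 0.00024 figure for four of a kind can be reproduced by counting hands. This sketch uses the standard counting argument (13 ranks for the four matching cards, times 48 choices for the fifth card), which the slide itself does not spell out.

```python
import math

total_hands = math.comb(52, 5)   # number of distinct 5-card hands
four_of_a_kind = 13 * 48         # rank of the four matching cards, then any fifth card

p_four = four_of_a_kind / total_hands
print(total_hands)        # 2598960
print(round(p_four, 5))   # 0.00024
```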
Basics of probability: derived from games with all outcomes known. Draw lettered tiles from a bag. The bag contains A’s, B’s, and C’s, in both upper and lower case: A a b B c C (the sample space). What is the probability of getting an A (upper or lower case)? Prob. of A = p(A) = (total number of outcomes classified as A) / (total number of possible outcomes).
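The definition of p(A) translates directly into code. This sketch assumes the bag holds exactly the six tiles listed, one of each; the helper name `probability` is made up for illustration.

```python
from fractions import Fraction

# Assumed contents of the bag: one tile per letter and case.
bag = ["A", "a", "b", "B", "c", "C"]

def probability(outcomes, is_event):
    """Outcomes classified as the event, divided by all possible outcomes."""
    favorable = sum(1 for outcome in outcomes if is_event(outcome))
    return Fraction(favorable, len(outcomes))

p_a = probability(bag, lambda tile: tile.upper() == "A")
print(p_a, float(p_a))
```

Under that assumption, two of the six tiles count as an A, so p(A) = 2/6 = 1/3.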
Flipping a coin example: 1 flip. What are the odds of getting heads? One outcome classified as heads out of a total of two outcomes: 1/2 = 0.5. This simplest case is known as the binomial. 2^n = 2^1 = 2 total outcomes; p^n = (0.5)^1 = 0.5, the prob. of a single outcome.
Flipping a coin example: 2 flips. What are the odds of getting all heads? Four total outcomes (HH, HT, TH, TT), one of them with 2 heads: 1/4 = 0.25. 2^n = 2^2 = 4 total outcomes; p^n = (0.5)^2 = 0.25 for 1 outcome twice in a row. All heads on 3 flips? 2^3 = 8 outcomes; p^3 = (0.5)^3 = 0.125 or ⅛.
Flipping a coin example: 2 flips. What are the odds of getting only one heads? Four total outcomes (HH, HT, TH, TT), two of them with exactly 1 heads: 2/4 = 0.50.
Flipping a coin example: 2 flips. What are the odds of getting at least one heads? Four total outcomes (HH, HT, TH, TT), three of them with at least one heads: 3/4 = 0.75.
Flipping a coin example: 2 flips. What are the odds of getting no heads? Four total outcomes (HH, HT, TH, TT), one of them with no heads: 1/4 = 0.25.
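All of the coin-flip answers come from listing the sample space and counting. This is a minimal sketch assuming a fair coin; the helper names `sample_space`, `prob`, and `heads` are illustrative.

```python
from fractions import Fraction
from itertools import product

def sample_space(n_flips):
    """All 2**n equally likely sequences of heads (H) and tails (T)."""
    return list(product("HT", repeat=n_flips))

def prob(n_flips, is_event):
    """Outcomes classified as the event, divided by all possible outcomes."""
    space = sample_space(n_flips)
    return Fraction(sum(1 for outcome in space if is_event(outcome)), len(space))

def heads(outcome):
    """Number of heads in one sequence of flips."""
    return outcome.count("H")

print(prob(2, lambda o: heads(o) == 2))  # 1/4  all heads
print(prob(2, lambda o: heads(o) == 1))  # 1/2  only one heads
print(prob(2, lambda o: heads(o) >= 1))  # 3/4  at least one heads
print(prob(2, lambda o: heads(o) == 0))  # 1/4  no heads
print(prob(3, lambda o: heads(o) == 3))  # 1/8  all heads on 3 flips
```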
Odds in Poker. What are the odds of being dealt a “Royal Flush”? Prob. of A = p(A) = (total number of outcomes classified as A) / (total number of possible outcomes). p(Royal Flush) = 4/2,598,960 = 0.000001539, or ~1.5 hands out of every million hands.
Odds in Poker. What are the odds of being dealt a “Straight Flush”? Prob. of A = p(A) = (total number of outcomes classified as A) / (total number of possible outcomes). p(Straight Flush) = 40/2,598,960 = 0.00001539, or ~15 hands out of every million hands.
Odds in Poker. What are the odds of being dealt a …? Prob. of A = p(A) = (total number of outcomes classified as A) / (total number of possible outcomes).
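The poker figures follow the same recipe: count the qualifying hands and divide by the number of possible 5-card hands. In this sketch the helper `p_hand` is a made-up name, and the counts of 4 and 40 are the ones quoted in the slides.

```python
import math

TOTAL_HANDS = math.comb(52, 5)   # 2,598,960 possible 5-card hands

def p_hand(count):
    """Probability of a hand type, given how many hands qualify."""
    return count / TOTAL_HANDS

royal_flush = p_hand(4)       # one royal flush per suit
straight_flush = p_hand(40)   # count used in the slides; it includes the 4 royal flushes

print(f"{royal_flush:.9f}")      # 0.000001539
print(f"{straight_flush:.8f}")   # 0.00001539
print(round(royal_flush * 1e6, 1), round(straight_flush * 1e6, 1))  # 1.5 15.4 (per million hands)
```

Any other hand type can be handled the same way once its count is known.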
Inferential statistics: where does “probability” fit into statistics? Most research uses samples rather than populations. The predictability in the long run allows us to quantify the probable size of the sampling error. Inferential statistics use our estimates of sampling error to generalize from observations from samples to statements about the populations.
Wrap up. Today’s lab: try out sampling and probability. Questions? Breaking down probability sampling (~4 mins); Sampling: Simple Random, Convenience, systematic, cluster, stratified (~4 mins); Non-Probability Sampling (~4 mins); Basics Probability and Statistics | Khan Academy (~8 mins); Example 2 | Probability and Statistics | Khan Academy (~10 mins); Probability with playing cards | Khan Academy (~10 mins).