Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2015 Room 150 Harvill Building 8:00 - 8:50 Mondays, Wednesdays & Fridays.
Labs continue this week with Exam 2 review
Schedule of readings: Before the next exam (March 6th), please read Chapters 5, 6, & 8 in Ha & Ha and Chapters 10, 11, 12, and 14 in Plous. Chapter 10: The Representativeness Heuristic. Chapter 11: The Availability Heuristic. Chapter 12: Probability and Risk. Chapter 14: The Perception of Randomness.
Exam 2 – This Friday (3/6/15). The study guide is online now. Bring 2 calculators (simple calculators only; calculators with programming functions are not allowed). Bring 2 pencils (with good erasers). Bring ID.
By the end of lecture today (3/2/15). Use this as your study guide: connecting raw scores and z scores to probability, proportion, and area of curve; percentiles; Central Limit Theorem; Law of Large Numbers; Dan Gilbert readings.
Homework due Wednesday (March 4th). On class website: please print and complete homework worksheet #13. Dan Gilbert Reading - Law of Large Numbers.
Preview of Homework Worksheet just in case of questions
Homework review: Based on a priori probability (all options equally likely; not based on previous experience or data). Based on expert opinion (don’t have previous data for these two companies merging together). 2/5 = .40. Based on frequency data (percent of rockets that successfully launched).
Homework review: Based on a priori probability (all options equally likely; not based on previous experience or data). Based on frequency data (percent of pages that are “fake”). = .30. Based on frequency data (percent of times at bat that successfully resulted in hits).
Homework review: 5/50 = .10. Based on frequency data (percent of students who successfully chose to be Economics majors).
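As a minimal sketch of the two approaches reviewed above, the helpers below (hypothetical names, not part of the homework) compute a frequency-based probability from counts and an a priori probability from equally likely outcomes. The 5-out-of-50 figure comes from the slide; the die example is purely illustrative.

```python
def empirical_probability(event_count, total_count):
    """Relative frequency: observed events divided by total observations."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return event_count / total_count


def a_priori_probability(favorable_outcomes, possible_outcomes):
    """Classical probability, assuming every outcome is equally likely."""
    if possible_outcomes <= 0:
        raise ValueError("possible_outcomes must be positive")
    return favorable_outcomes / possible_outcomes


if __name__ == "__main__":
    # 5 of 50, as on the slide above
    print(f"{empirical_probability(5, 50):.2f}")
    # One face of a fair six-sided die (illustrative, not from the homework)
    print(f"{a_priori_probability(1, 6):.2f}")
```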
z of 1.5 and z of 1.25: area = .8276. z of .5 and z of 1.25: area = .2029. [Raw-score computations not recoverable]
Homework review: z of 0.45; z of 1.22; z of -.32 and z of 1.22. [Areas not recoverable]
Homework review: z of 1.43 = area of .4236; z of -.86 = area of .3051; .4236 – .3051 = .1185
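The table lookups in these problems can be checked with a short sketch. It assumes the areas are "mean to z" areas from a standard normal table, rounded to four places; the function names are hypothetical, and the raw score, mean, and standard deviation in the usage lines are made-up values, since the homework's own raw scores are not shown here.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_score(raw_score, mean, sd):
    """Convert a raw score to a z score."""
    return (raw_score - mean) / sd


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and z,
    rounded to four places like a printed z table."""
    return round(abs(STANDARD_NORMAL.cdf(z) - 0.5), 4)


if __name__ == "__main__":
    # Hypothetical raw score, mean, and SD (not from the homework)
    print(z_score(33, 30, 2))

    # The z scores that appear on the slide
    upper = area_mean_to_z(1.43)
    lower = area_mean_to_z(-0.86)
    print(upper, lower, round(upper - lower, 4))
```

Rounding each area before combining mirrors working from a printed table; combining unrounded areas can differ in the last digit.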
Comments on Dan Gilbert Reading
Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true (theoretical) probability. As the number of observations (n) increases, or the number of times the experiment is performed increases, the estimate becomes more accurate.
Law of large numbers: As the number of measurements increases, the data become more stable and a better approximation of the true signal (e.g. the mean). As the number of observations (n) increases, or the number of times the experiment is performed increases, the signal becomes clearer (static cancels out). With only a few people, any little error is noticed (it becomes exaggerated when we look at the whole group). With many people, any little error is corrected (it becomes minimized when we look at the whole group).
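A quick simulation can make the law of large numbers concrete. The sketch below assumes a fair coin (true probability of heads = .5) and a hypothetical function name; it tracks the proportion of heads as the number of flips grows, which should drift toward .5 while the early, small-n proportions bounce around.

```python
import random


def running_proportion_of_heads(checkpoints, seed=0):
    """Flip a simulated fair coin and record the proportion of heads
    each time the number of flips reaches a checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    flips = 0
    results = {}
    for n in sorted(checkpoints):
        while flips < n:
            if rng.random() < 0.5:
                heads += 1
            flips += 1
        results[n] = heads / flips
    return results


if __name__ == "__main__":
    checkpoints = [10, 100, 1_000, 10_000, 100_000]
    for n, proportion in running_proportion_of_heads(checkpoints).items():
        print(f"n = {n:>7}: proportion of heads = {proportion:.4f}")
```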
Sampling distributions of sample means versus frequency distributions of individual scores. Distribution of raw scores: an empirical probability distribution of the values from a sample of raw scores from a population. Frequency distributions of individual scores: derived empirically; we are plotting raw data; this is a single sample. [Figure: population, take a single score, repeat over and over; histogram of individual scores with points labeled Melvin, Eugene, and Preston]
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Sampling distributions of sample means: theoretical distribution; we are plotting means of samples. Important note: “fixed n”. [Figure: population, take sample and get mean, repeat over and over; first point labeled "Mean for 1st sample"]
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Sampling distributions of sample means: theoretical distribution; we are plotting means of samples. Important note: “fixed n”. [Figure: population, take sample and get mean, repeat over and over; resulting curve labeled "Distribution of means of samples"]
Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Sampling distributions of sample means: theoretical distribution; we are plotting means of samples. Frequency distributions of individual scores: derived empirically; we are plotting raw data; this is a single sample. [Figure: two histograms. Frequency distribution of individual scores, with scores labeled Melvin and Eugene; sampling distribution of sample means, with points labeled 2nd sample and 23rd sample]
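The "take sample, get mean, repeat over and over" procedure can be sketched in a few lines. This is only an approximation of the idea: a real sampling distribution is theoretical and assumes infinitely many samples, while the sketch draws a finite number. The population values, sample size, and function name are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples samples of a fixed size (with replacement)
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical population of 10,000 raw scores
    population = [rng.gauss(100, 15) for _ in range(10_000)]

    # One sample of raw scores: what a frequency distribution would plot
    single_sample = rng.choices(population, k=25)
    print("SD of raw scores in one sample:", round(statistics.stdev(single_sample), 2))

    # Many sample means with fixed n = 25: what a sampling distribution would plot
    means = simulate_sample_means(population, sample_size=25, num_samples=2_000)
    print("Mean of sample means:", round(statistics.fmean(means), 2))
    print("SD of sample means:", round(statistics.stdev(means), 2))
```

The sample means cluster much more tightly than the raw scores do, which is the contrast the two histograms on the slide are drawing.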
Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean. Sampling distribution for continuous distributions. [Figure: two histograms. Distribution of raw scores, with scores labeled Melvin and Eugene; sampling distribution of sample means, with points labeled 2nd sample and 23rd sample]
An example of a sampling distribution of sample means: µ = 100, σ = 3, SEM = 1. Sampling distribution: a theoretical probability distribution of the possible values of some sample statistic that would occur if we were to draw an infinite number of same-sized samples from a population. Distribution of raw scores: µ = 100, Standard Deviation = 3. Sampling distribution of sample means: Mean = 100, Standard Error of the Mean = 1. Notice: Standard Error of the Mean (SEM) is smaller than SD – especially as n increases. [Figure: two histograms, with raw scores labeled Melvin and Eugene and sample means labeled 2nd sample and 23rd sample]
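The numbers in this example fit the usual formula for the standard error of the mean. The slide does not state the sample size, so n = 9 below is an inference from σ = 3 and SEM = 1, not a value given in lecture.

```latex
\[
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
1 = \frac{3}{\sqrt{n}}
\;\Longrightarrow\; \sqrt{n} = 3
\;\Longrightarrow\; n = 9
\]
```

With the same σ = 3, a larger sample such as n = 36 would give SEM = 3/6 = 0.5, which is why the SEM shrinks as n increases.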
Central Limit Theorem
Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population. As n ↑, x̄ will approach µ.
Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population. As n ↑, the curve will approach normal shape.
Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases, SEM decreases. As n ↑, curve variability gets smaller.
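The three propositions can be checked informally with a simulation. The sketch below assumes a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen only for illustration) and hypothetical function names. For each sample size it reports the mean of the sample means (Proposition 1), their skewness as a rough stand-in for "normal shape" (Proposition 2), and their standard deviation next to the predicted σ/√n (Proposition 3).

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: about 0 for a symmetric, normal-looking shape."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])


def sample_means_from_exponential(sample_size, num_samples=5_000, seed=0):
    """Means of repeated fixed-size samples from an exponential population
    (population mean = 1, population SD = 1, heavily right-skewed)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


def summarize(sample_size):
    """Summary of the simulated sampling distribution for one sample size."""
    means = sample_means_from_exponential(sample_size)
    return {
        "mean_of_means": statistics.fmean(means),
        "sd_of_means": statistics.pstdev(means),
        "predicted_sem": 1.0 / math.sqrt(sample_size),  # sigma / sqrt(n), sigma = 1
        "skewness": skewness(means),
    }


if __name__ == "__main__":
    for n in (2, 10, 100):
        s = summarize(n)
        print(
            f"n = {n:>3}: mean of means = {s['mean_of_means']:.3f}, "
            f"SD of means = {s['sd_of_means']:.3f} "
            f"(predicted SEM = {s['predicted_sem']:.3f}), "
            f"skewness = {s['skewness']:.2f}"
        )
```

Skewness is only one crude indicator of shape; plotting a histogram of the means for each n would show the approach to a normal curve more directly.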