Welcome to Week 07 College Statistics

Slides:



Advertisements
Similar presentations
Population vs. Sample Population: A large group of people to which we are interested in generalizing. parameter Sample: A smaller group drawn from a population.
Advertisements

Chapter 6 – Normal Probability Distributions
Normal Distribution Sampling and Probability. Properties of a Normal Distribution Mean = median = mode There are the same number of scores below and.
Statistics and Quantitative Analysis U4320
Sampling Distributions (§ )
Central Limit Theorem.
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Review of Basics. REVIEW OF BASICS PART I Measurement Descriptive Statistics Frequency Distributions.
Chapter 13 Conducting & Reading Research Baumgartner et al Data Analysis.
Descriptive Statistics
HIM 3200 Normal Distribution Biostatistics Dr. Burton.
Sampling Distributions
Central Tendency and Variability
Inferential Statistics
Standard error of estimate & Confidence interval.
INFERENTIAL STATISTICS – Samples are only estimates of the population – Sample statistics will be slightly off from the true values of its population’s.
Review of normal distribution. Exercise Solution.
B AD 6243: Applied Univariate Statistics Understanding Data and Data Distributions Professor Laku Chidambaram Price College of Business University of Oklahoma.
Chapter 11: Estimation Estimation Defined Confidence Levels
A Sampling Distribution
AP Statistics Chapter 9 Notes.
The normal distribution Binomial distribution is discrete events, (infected, not infected) The normal distribution is a probability density function for.
Measures of Dispersion & The Standard Normal Distribution 2/5/07.
Measures of Dispersion & The Standard Normal Distribution 9/12/06.
Measures of central tendency are statistics that express the most typical or average scores in a distribution These measures are: The Mode The Median.
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
Determination of Sample Size: A Review of Statistical Theory
Sampling distributions rule of thumb…. Some important points about sample distributions… If we obtain a sample that meets the rules of thumb, then…
Confidence Intervals Target Goal: I can use normal calculations to construct confidence intervals. I can interpret a confidence interval in context. 8.1b.
Summarizing Risk Analysis Results To quantify the risk of an output variable, 3 properties must be estimated: A measure of central tendency (e.g. µ ) A.
Review Normal Distributions –Draw a picture. –Convert to standard normal (if necessary) –Use the binomial tables to look up the value. –In the case of.
Chapter 7: The Distribution of Sample Means. Frequency of Scores Scores Frequency.
Vocab Normal, Standard Normal, Uniform, t Point Estimate Sampling distribution of the means Confidence Interval Confidence Level / α.
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: In a recent poll, 70% of 1501 randomly selected adults said they believed.
Sampling Distributions Chapter 18. Sampling Distributions If we could take every possible sample of the same size (n) from a population, we would create.
Statistics for Business and Economics Module 1:Probability Theory and Statistical Inference Spring 2010 Lecture 4: Estimating parameters with confidence.
AP PSYCHOLOGY: UNIT I Introductory Psychology: Statistical Analysis The use of mathematics to organize, summarize and interpret numerical data.
Active Learning Lecture Slides For use with Classroom Response Systems
GOVT 201: Statistics for Political Science
And distribution of sample means
Figure 2-7 (p. 47) A bar graph showing the distribution of personality types in a sample of college students. Because personality type is a discrete variable.
INF397C Introduction to Research in Information Studies Spring, Day 12
Descriptive measures Capture the main 4 basic Ch.Ch. of the sample distribution: Central tendency Variability (variance) Skewness kurtosis.
Research Methods in Psychology PSY 311
Psychology Unit Research Methods - Statistics
Sampling Distributions and Estimation
Statistics.
Distribution of the Sample Means
Welcome to . Week 08 Thurs . MAT135 Statistics.
Statistical Reasoning in Everyday Life
Week 10 Chapter 16. Confidence Intervals for Proportions
Research Statistics Objective: Students will acquire knowledge related to research Statistics in order to identify how they are used to develop research.
Introduction to Summary Statistics
Confidence Interval Estimation and Statistical Inference
Introduction to Summary Statistics
Econ 3790: Business and Economics Statistics
Inferential Statistics
Statistics Confidence Intervals
Essential Statistics Introduction to Inference
Confidence Intervals: The Basics
Welcome to . Week 05 Tues . MAT135 Statistics.
Calculating Probabilities for Any Normal Variable
Chapter 7: The Distribution of Sample Means
Estimates and Sample Sizes Lecture – 7.4
Sampling Distributions (§ )
GENERALIZATION OF RESULTS OF A SAMPLE OVER POPULATION
Advanced Algebra Unit 1 Vocabulary
Chapter 5: Sampling Distributions
How Confident Are You?.
Presentation transcript:

Welcome to Week 07 College Statistics http://media.dcnews.ro/image/201109/w670/statistics.jpg

Non-normal Distributions Last class we studied a lot about the normal distribution Some distributions are not normal …

Non-normal Distributions Kurtosis - how tall or flat your curve is compared to a normal curve

Non-normal Distributions Curves taller than a normal curve are called “Leptokurtic” Curves that are flatter than a normal curve are called ”Platykurtic”

Non-normal Distributions Platykurtic Leptokurtic W.S. Gosset 1908

Sampling Distributions PROJECT QUESTION Which is platykurtic? Which is leptokurtic?

Non-normal Distributions Skewness – the data are “bunched” to one side vs a normal curve

Non-normal Distributions Scores that are "bunched" at the right or high end of the scale are said to have a “negative skew”

Non-normal Distributions In a “positive skew”, scores are bunched near the left or low end of a scale

Non-normal Distributions Note: this is exactly the opposite of how most people use the terms!

Sampling Distributions PROJECT QUESTION Which is positively skewed? Which is negatively skewed?

Normal Distributions We use normal distributions a lot in statistics because lots of things have graphs this shape! heights weights IQ test scores bull’s eyes

Normal Distributions Also, even data which are not normally distributed have averages which DO have normal distributions

Normal Distributions If you take a gazillion samples and find the means for each of the gazillion samples You would have a new population: the gazillion means

Normal Distributions

Normal Distributions If you plotted the frequency of the gazillion mean values, it is called a SAMPLING DISTRIBUTION

Sampling Distributions The shape of the plot of the gazillion sample means would have a normal-ish distribution NO MATTER WHAT THE ORIGINAL DATA LOOKED LIKE

Sampling Distributions But … the shape of the distribution of your gazillion means changes with the size “n” of the samples you took

Sampling Distributions Graphs of a gazillion means for different n values

Sampling Distributions As “n” increases, the distributions of the means become closer and closer to normal

Sampling Distributions This also works for discrete data

Sampling Distributions as “n” increases, variability (spread) also decreases

Sampling Distributions We usually say the sample mean will be normally distributed if n is ≥ 20 (the “good-enuff” value)

Sampling Distributions The statistical principle that allows us to conclude that sample means have a normal distribution if the sample size is 20 or more is called the Central Limit Theorem

Sampling Distributions If you can assume the distribution of the sample means is normal, you can use the normal distribution probabilities for making probability statements about µ

Sampling Distributions Sample means from platykurtic, leptokurtic, and bimodal distributions become “normal enough” when your sample size n is 20 or more

Sampling Distributions Means from samples of skewed populations do not become “normal enough” very easily You sometimes need a mega-huge sample size to “normalize” a badly skewed distribution

Sampling Distributions A wild outlier might indicate a badly skewed distribution

Sampling Distributions PROJECT QUESTION From which of these would you expect the distribution of the sample means to be normal? Original population normal Samples taken of size 10 Sample taken of size 50 Highly skewed population

Questions?

Graphs of 𝒙 Graph of 𝒙 values

Graphs of 𝒙 Averages (measures of central tendency) show where the data tend to pile up Graph of 𝒙 values

The place where 𝒙 tends to pile up is at μ Graphs of 𝒙 The place where 𝒙 tends to pile up is at μ Graph of 𝒙 values

Graphs of 𝒙 So, the most likely value for 𝒙 is μ Graph of 𝒙 values

Graphs of 𝒙 As you move away from μ on the graph, 𝒙 is less likely to have these values Graph of 𝒙 values

Estimation

Estimation PROJECT QUESTION Population Sample mean mean μ 𝒙 What is the best estimate we have for the unknown population mean µ ?

𝒙 is the best estimate we have for the unknown population mean µ Estimation PROJECT QUESTION 𝒙 is the best estimate we have for the unknown population mean µ

The mean of all of the gazillion 𝒙 values will be µ Estimation The mean of all of the gazillion 𝒙 values will be µ

Graph of likely values for µ: Estimation PROJECT QUESTION Graph of likely values for µ: ?

Graph of likely values for µ: Estimation PROJECT QUESTION Graph of likely values for µ: 𝒙

Estimation We will use the sample mean 𝒙 to estimate the unknown population mean µ

Estimation Using the sample mean 𝒙 to estimate the unknown population mean µ is called “making inferences”

Estimation The sample standard deviation “s” is the best estimate we have for the unknown population standard deviation “σ”

Using s to estimate σ is also an inference Estimation Using s to estimate σ is also an inference

Estimation 𝒙 -3s 𝒙 -2s 𝒙 -s 𝒙 𝒙 +s 𝒙 +2s 𝒙 +3s You would think, since we use 𝒙 to estimate µ and s to estimate σ that the graph of 𝒙 would be: 𝒙 -3s 𝒙 -2s 𝒙 -s 𝒙 𝒙 +s 𝒙 +2s 𝒙 +3s

Estimation It’s not…

Remember, as “n” increases, the variability decreases: Estimation Remember, as “n” increases, the variability decreases:

Estimation While s is a good estimate for the original population standard deviation σ, s IS NOT the measure of variability in the new population of 𝒙 s

It needs to be decreased to take sample size into account! Estimation It needs to be decreased to take sample size into account!

Estimation We use: s/ n for the measure of variability in the new population of 𝒙 s

Estimation The standard deviation of the 𝒙 s: s/ n is called the “standard error” abbreviated “se”

Estimation BTW: you now know ALL of the items on the Descriptive Statistics list in Excel

Estimation So our curve is: 𝒙 -3se 𝒙 -2se 𝒙 -se 𝒙 𝒙 +se 𝒙 +2se 𝒙 +3se

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 Can we assume the population of 𝒙 s forms a normal distribution?

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 Because the sample size 49 is above the usual “good-enuff” value of 20, unless the original distribution is very skewed, it will be normal

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our best estimate of the original population mean?

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our best estimate of the original population mean? 150

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our best estimate of the original population standard deviation?

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our best estimate of the original population standard deviation? 56

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our estimate of the standard error?

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What is our estimate of the standard error? 56/ 49 = 56/7 = 8

Estimation PROJECT QUESTION Suppose we have a population of 𝒙 s from samples of size 49 The mean of the 𝒙 s is 150 The standard deviation is 56 What will be the normal curve for the 𝒙 s

Estimation PROJECT QUESTION Our curve is: 126 134 142 150 158 166 174

Questions?

Inferences About μ Although our normal curve graphs the 𝒙 s, we will use it to make inferences about what value μ actually has

Inferences About μ About 95% of the possible values for μ will be within 2 SE of 𝒙

This allows us to create a “confidence interval” for values of μ Confidence Intervals This allows us to create a “confidence interval” for values of μ

Confidence Intervals Confidence interval formula: 𝒙 - 2s/ n ≤ μ ≤ 𝒙 + 2s/ n or 𝒙 - 2se ≤ μ ≤ 𝒙 + 2se With a confidence level of 95%

The “2” in the equations is called the “critical value” Confidence Intervals The “2” in the equations is called the “critical value”

It comes from the normal curve, which gives us the 95% Confidence Intervals It comes from the normal curve, which gives us the 95%

2s/ n or 2se is called the “margin of error” Confidence Intervals 2s/ n or 2se is called the “margin of error”

What if we wanted a confidence level of 99% Confidence Intervals PROJECT QUESTION What if we wanted a confidence level of 99%

Confidence Intervals PROJECT QUESTION What if we wanted a confidence level of 99% We’d use a value of “3” rather than 2

Confidence Intervals For most scientific purposes, 95% is “good-enuff” In the law, 98% is required for a criminal case In medicine, 99% is required

Confidence Intervals For a 95% confidence interval, 95% of the values of μ will be within 2se of 𝒙

Confidence Intervals If we use the confidence interval to estimate a likely range for true values of μ, we will be right 95% of the time

For a 95% confidence interval, we will be WRONG 5% of the time Confidence Intervals For a 95% confidence interval, we will be WRONG 5% of the time

For a 99% confidence interval, how much of the time will we be wrong? Confidence Intervals PROJECT QUESTION For a 99% confidence interval, how much of the time will we be wrong?

Confidence Intervals PROJECT QUESTION For a 99% confidence interval, how much of the time will we be wrong? we will be wrong 1% of the time

Confidence Intervals The percent of time we are willing to be wrong is called “α” (“alpha”) or “the α-level”

Confidence Intervals Everyday use of confidence intervals: You will frequently hear that a poll has a candidate ahead by 10 points with a margin of error of 3 points

Confidence Intervals This means: 10-3 ≤ true difference ≤ 10+3 Or, the true difference is between 7 and 13 points (with 95% likelihood)

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 Can we assume normality?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 Can we assume normality? yes, because n>20

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the α-level?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the α-level? 5%

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the critical value?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the critical value? 2, because we want a 95% confidence interval

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the standard error?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the standard error? s/ n = 5/ 25 = 5/5 = 1

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the margin of error?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the margin of error? 2se = 2(1) = 2

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the confidence interval?

Confidence Intervals PROJECT QUESTION Find a 95% confidence interval for μ given: 𝒙 = 7 s = 5 n = 25 What is the confidence interval? 𝒙 - 2s/ n ≤ μ ≤ 𝒙 + 2s/ n 7 – 2 ≤ μ ≤ 7 + 2 5 ≤ μ ≤ 9 with 95% confidence

Confidence Intervals PROJECT QUESTION Interpreting confidence intervals: If the 95% confidence interval is: 5 ≤ µ ≤ 9 Is it likely that µ = 10?

Confidence Intervals PROJECT QUESTION No, because it’s outside of the interval That would only happen 5% of the time

Find the 95% confidence interval for μ given: 𝒙 = 53 s = 14 n = 49 Confidence Intervals PROJECT QUESTION Find the 95% confidence interval for μ given: 𝒙 = 53 s = 14 n = 49

Confidence Intervals PROJECT QUESTION Find the 95% confidence interval for μ given: 𝒙 = 53 s = 14 n = 49 49 ≤ µ ≤ 57

Can you say with 95% confidence that µ ≠ 55? Confidence Intervals PROJECT QUESTION Can you say with 95% confidence that µ ≠ 55?

Confidence Intervals PROJECT QUESTION Can you say with 95% confidence that µ ≠ 55? Nope… it’s in the interval It IS a likely value for µ

Find the 95% confidence interval for μ given: 𝒙 = 481 s = 154 n = 121 Confidence Intervals PROJECT QUESTION Find the 95% confidence interval for μ given: 𝒙 = 481 s = 154 n = 121

Confidence Intervals PROJECT QUESTION Find the 95% confidence interval for μ given: 𝒙 = 481 s = 154 n = 121 453 ≤ µ ≤ 509

Can you say with 95% confidence that µ might be 450? Confidence Intervals PROJECT QUESTION Can you say with 95% confidence that µ might be 450?

Confidence Intervals PROJECT QUESTION Can you say with 95% confidence that µ might be 450? µ is unlikely to be 450 – that value is outside of the confidence interval and would only happen 5% of the time

What if you have a sample size smaller than 20??? Confidence Intervals What if you have a sample size smaller than 20???

Confidence Intervals What if you have a sample size smaller than 20??? You must use a different (bigger) critical value W.S. Gosset 1908

You will have a smaller interval if you have a larger value for n Confidence Intervals You will have a smaller interval if you have a larger value for n

So you want to take the LARGEST sample you can Confidence Intervals So you want to take the LARGEST sample you can

This is called the “LAW OF LARGE NUMBERS” Confidence Intervals This is called the “LAW OF LARGE NUMBERS”

Confidence Intervals LAW OF LARGE NUMBERS The larger your sample size, the better your estimate

Questions?