Econ 140 Lecture 3: Univariate Populations

Presentation transcript:

Econ 140 Lecture 3, Slide 1: Univariate Populations

Econ 140 Lecture 3, Slide 2: Today’s Plan
Univariate statistics: the distribution of a single variable
Making inferences about population parameters from sample statistics
  – (For future reference: how can we relate the ‘a’ and ‘b’ parameters from last lecture to sample data?)
Dealing with two types of probability
  – ‘A priori’ classical probability
  – Empirical classical probability

Econ 140 Lecture 3, Slide 3: A Priori Classical Probability
Characterized by a finite number of known outcomes
The expected value of Y can be defined as \( E(Y) = \sum_i Y_i \, P(Y_i) \)
The expected value will always be the mean value
  – \( \mu_Y \) is the population mean
  – \( \bar{Y} \) is the sample mean
The outcome of an experiment is a randomized trial

Econ 140 Lecture 3, Slide 4: Flipping Coins
Example: flipping 2 fair coins
  – Possible outcomes are: HH, TT, HT, TH
  – We know there are only 4 possible outcomes
  – We get discrete outcomes because there are a finite number of possible outcomes
  – We can represent known outcomes in a matrix

Econ 140 Lecture 3, Slide 5: Flipping Coins (2)
The probability of some event A is \( P(A) = m / n \)
  – where m is the number of outcomes consistent with event A and n is the total number of possible outcomes
  – If Y is the number of heads when flipping 2 coins, we can represent the probability distribution function like this:

Number of heads (Y)   P(Y)
0                     0.25
1                     0.50
2                     0.25

Econ 140 Lecture 3, Slide 6: Flipping Coins (3)
If we graph the PDF we get: [Figure: PDF of the number of heads]
The expected value is \( E(Y) = 0(0.25) + 1(0.5) + 2(0.25) = 1 \)
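The two-coin example is small enough to enumerate directly. The sketch below is only an illustration (the variable names are not from the lecture): it lists the four outcomes, applies P(A) = m/n to each possible number of heads, and then sums y times P(Y = y) to get the expected value.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 4 equally likely outcomes of flipping 2 fair coins.
outcomes = list(product("HT", repeat=2))
n = len(outcomes)

# P(A) = m / n, where m counts the outcomes consistent with the event
# "exactly `heads` heads" and n is the total number of outcomes.
pdf = {}
for heads in range(3):
    m = sum(1 for outcome in outcomes if outcome.count("H") == heads)
    pdf[heads] = Fraction(m, n)

# E(Y) = sum over y of y * P(Y = y)
expected_heads = sum(y * p for y, p in pdf.items())

print(pdf)
print(expected_heads)
```

Exact fractions are used here so that the probabilities sum to exactly 1 and the expected value comes out as exactly 1 head.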

Econ 140 Lecture 3, Slide 7: Empirical Classical Probability
Characterized by an infinite number of possible outcomes
With empirical classical probability, we use sample data to make inferences about underlying population parameters
  – Most of the time, we don’t know what the population values are, so we need to use a sample
Example: GPAs in the Econ 140 population
  – We can take a sample of every 5th person in the room
  – Assuming that our sample is random (that Econ 140 students do not sit in some systematic fashion), we’ll have a representative sample of the population

Econ 140 Lecture 3, Slide 8: Empirical Classical Probability
Statisticians and economists collect sample data for many other purposes
The CPS (Current Population Survey) is another example: sampling occurs at the household level
The CPS uses weights to correct the data for over-sampling
  – Over-sampling would be if we picked 1 in 3 people in the front of the room and only 1 in 5 in the back of the room. In that case we would over-sample the front
  – There’s a spreadsheet example on the course website (the weighted mean is our best guess of the population mean, whereas the unweighted mean is the sample mean)
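The classroom over-sampling example can be sketched in a few lines. The GPA values below are made up, and the weights are assumed to be the inverse of each sampling rate (3 for the front, 5 for the back); the lecture does not say how the CPS weights are actually built, so treat this as one simple weighting scheme rather than the CPS method.

```python
# Hypothetical GPA samples; the values are invented for illustration.
front_gpas = [3.6, 3.8]  # front of the room, sampled at a rate of 1 in 3
back_gpas = [3.0, 3.2]   # back of the room, sampled at a rate of 1 in 5

# Assumption: each weight is the inverse of the sampling rate, so a sampled
# person in the front stands for 3 people and one in the back stands for 5.
front_weight = 3.0
back_weight = 5.0

values = front_gpas + back_gpas
weights = [front_weight] * len(front_gpas) + [back_weight] * len(back_gpas)

# Unweighted mean: the plain sample mean.
unweighted_mean = sum(values) / len(values)

# Weighted mean: sum(value * weight) / sum(weight).
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

print(round(unweighted_mean, 3))
print(round(weighted_mean, 3))
```

With these made-up numbers the unweighted mean is 3.4 and the weighted mean is 3.325: the over-sampled front rows pull the plain sample mean upward, and the weights undo that.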

Econ 140 Lecture 3, Slide 9: Empirical Classical Probability
On the course website you’ll find an Excel spreadsheet that we will use to:
  – Calculate the expected value
  – Construct the PDF and CDF
  – Use weights to translate sample data into population estimates
  – Examine the difference between the sample (unweighted) mean and the estimated population (weighted) mean:
    Weighted mean = sum(EARNWKE*EARNWT)/sum(EARNWT)
    This approximates the population mean

Econ 140 Lecture 3, Slide 10: Empirical Classical Probability (3)
So how do we construct a PDF for our spreadsheet example?
  – Pick sensible earnings bands (i.e. 10 bands of $100)
  – We can pick as many bands as we want: the greater the number of bands, the more closely the shape of the PDF matches the ‘true population’. More bands = more calculation!

Econ 140 Lecture 3, Slide 11: Empirical Classical Probability (2)
Constructing PDFs:
  – Count the number of observations in each band to get an absolute frequency
  – Use weights to translate sample frequencies into estimates of the population frequencies
  – Calculate relative frequencies for each band by dividing the absolute frequency for the band by the total frequency

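Econ 140 Lecture 3, Slide 12: Empirical Classical Probability (4)
  – A weighted way to approximate the PDF: \( p_k = \frac{\sum_{i \in \text{band } k} w_i}{\sum_i w_i} \), where \( w_i \) is the weight on observation i
  – When we have K bands, always check that \( \sum_{k=1}^{K} p_k = 1 \): if the probabilities don’t sum to 1, we’ve made a mistake!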
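The band-by-band construction can be sketched as follows. The function name `band_pdf`, the sample earnings, and the weights are all hypothetical; the sketch assumes non-negative earnings, 10 bands of $100, and a top band that absorbs anything above the range.

```python
def band_pdf(earnings, weights, band_width=100.0, n_bands=10):
    """Return the weighted relative frequency of each earnings band."""
    totals = [0.0] * n_bands
    for y, w in zip(earnings, weights):
        # Band index 0 covers [0, 100), index 1 covers [100, 200), and so on.
        # Assumption: values above the last band are counted in the last band.
        k = min(int(y // band_width), n_bands - 1)
        totals[k] += w
    total_weight = sum(totals)
    return [t / total_weight for t in totals]


# Hypothetical weekly earnings and sampling weights.
earnings = [50.0, 120.0, 180.0, 310.0, 950.0]
weights = [1.0, 1.0, 2.0, 1.0, 1.0]

weighted_pdf = band_pdf(earnings, weights)
unweighted_pdf = band_pdf(earnings, [1.0] * len(earnings))

# The check from the slide: the probabilities must sum to 1.
assert abs(sum(weighted_pdf) - 1.0) < 1e-12
assert abs(sum(unweighted_pdf) - 1.0) < 1e-12

print(weighted_pdf)
print(unweighted_pdf)
```

Passing a weight of 1 for every observation gives the unweighted PDF, so the same function covers both cases.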

Econ 140 Lecture 3, Slide 13: Empirical Classical Probability (5)
Going back to our expected value…
The expected value of Y will be: \( E(Y) = \sum_k p_k Y_k \)
  – The \( p_k \) are relative frequencies and they can be weighted or not
  – The \( Y_k \) are the earnings band midpoints (50, 150, 250, and so on in the spreadsheet)
From our spreadsheet example our weighted mean was $ and the unweighted mean was $
  – Since the sample is so large, there is little difference between the sample (unweighted) mean and the population (weighted) mean
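As a rough illustration of the expected value computed from bands, the sketch below multiplies each band probability by its midpoint and sums. The probabilities are hypothetical (they are not the spreadsheet's), and the helper name `expected_value` is an assumption.

```python
def expected_value(probabilities, band_width=100.0):
    """E(Y) = sum over bands of p_k * Y_k, with Y_k the band midpoint."""
    midpoints = [band_width * (k + 0.5) for k in range(len(probabilities))]
    return sum(p * y for p, y in zip(probabilities, midpoints))


# Hypothetical relative frequencies for 5 bands of $100 (midpoints 50 to 450).
p = [0.1, 0.2, 0.4, 0.2, 0.1]
assert abs(sum(p) - 1.0) < 1e-12  # probabilities must sum to 1

e_y = expected_value(p)
print(round(e_y, 2))
```

Because every observation in a band is treated as if it sat at the midpoint, this value will generally differ a little from the mean of the raw data.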

Econ 140 Lecture 3, Slide 14: Empirical Classical Probability (6)
We can also calculate the weighted and unweighted expected values:
  – E(Weighted value): $
  – E(Unweighted value): $
Why are the expected values different from the means?
  – We lose some information (by grouping the wage data into bands) in calculating the expected values!
So why would we want to weight the observations?
  – With a small sample of what we think is a large population, we might not have sampled randomly. We use weights to make the sample more closely resemble the population.

Econ 140 Lecture 3, Slide 15: Empirical Classical Probability (7)
The mean is the first moment of the distribution of earnings
We may also want to consider how variable earnings are
  – We can do this by finding the variance, or the standard deviation
Calculate the variance: \( \sigma^2 = \sum_k p_k (Y_k - E(Y))^2 \)
  – In our example, the unweighted variance is:
  – The weighted variance is:
  – The difference between the two is:
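A minimal sketch of the variance calculation from a banded PDF, assuming the usual definition (the probability-weighted average squared deviation of the band midpoints from the expected value). The probabilities are hypothetical and the function name is illustrative.

```python
import math


def banded_mean_and_variance(probabilities, band_width=100.0):
    """Return (E(Y), Var(Y)) using band midpoints as the Y_k values."""
    midpoints = [band_width * (k + 0.5) for k in range(len(probabilities))]
    mean = sum(p * y for p, y in zip(probabilities, midpoints))
    variance = sum(p * (y - mean) ** 2 for p, y in zip(probabilities, midpoints))
    return mean, variance


# Hypothetical relative frequencies for 5 bands of $100.
p = [0.1, 0.2, 0.4, 0.2, 0.1]

mean, variance = banded_mean_and_variance(p)
std_dev = math.sqrt(variance)  # standard deviation = square root of variance

print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

Running the same function on the weighted and the unweighted band probabilities gives the two variances that the slide compares.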

Econ 140 Lecture 3, Slide 16: Empirical Classical Probability (8)
[Figure: weighted and unweighted PDFs of earnings]
The weighted PDF is pink
It’s tough to see, but the weighting scheme makes the population distribution tighter

Econ 140 Lecture 3, Slide 17: Empirical Classical Probability (9)
We can use our PDF to answer:
  – What is the probability that someone earns between $300 and $400?
But we can’t use this PDF to answer:
  – What is the probability that someone earns between $253 and $316?
Why?
  – The second question can’t be answered using our PDF because $253 and $316 fall somewhere within the earnings bands, not at the endpoints

Econ 140 Lecture 3, Slide 18: Standard Normal Curve
We need to calculate something other than our PDF, using the sample mean, the sample variance, and an assumption about the shape of the distribution function
  – We will examine the assumption later
The standard normal curve (tabulated in the Z table) will approximate the probability distribution of almost any continuous variable as the number of observations approaches infinity

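Econ 140 Lecture 3, Slide 19: Standard Normal Curve (2)
The standard deviation (which measures the distance from the mean) is the square root of the variance: \( \sigma = \sqrt{\sigma^2} \)
Area under the curve:
  – About 68% lies within 1 standard deviation of the mean
  – About 95% lies within 2 standard deviations of the mean
  – About 99.7% lies within 3 standard deviations of the mean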
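The 68%, 95%, and 99.7% figures can be checked numerically. This sketch uses the identity P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal variable; the function name is illustrative.

```python
import math


def area_within(k):
    """Area under the standard normal curve within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The computed areas are roughly 0.6827, 0.9545, and 0.9973, which is where the rounded 68%, 95%, and 99.7% come from.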

Econ 140 Lecture 3, Slide 20: Standard Normal Curve (3)
Properties of the standard normal curve
  – The curve is centered around \( \mu \)
  – The curve reaches its highest value at \( \mu \) and tails off symmetrically at both ends
  – The distribution is fully described by the expected value and the variance
You can convert any distribution for which you have estimates of \( \mu \) and \( \sigma^2 \) to a standard normal distribution

Econ 140 Lecture 3, Slide 21: Standard Normal Curve (4)
A distribution only needs to be approximately normal for us to convert it to the standardized normal.
The mass of the distribution must fall in the center, but the shape of the tails can be different
[Figure: example distributions]

Econ 140 Lecture 3, Slide 22: Standard Normal Curve (5)
If we want to know the probability that someone earns at most $C, we are asking: \( P(Y \le C) \)
We can rearrange terms to get: \( P\left( Z \le \frac{C - \mu}{\sigma} \right) \), where \( Z = \frac{Y - \mu}{\sigma} \)
Properties of the standard normal variate Z:
  – It is normally distributed with a mean of zero and a variance of 1, written in shorthand as Z~N(0,1)

Econ 140 Lecture 3, Slide 23: Standard Normal Curve (5)
If we have some variable Y we can assume that Y will be normally distributed, written in shorthand as Y~N(µ, σ²)
We can use Z to convert Y to a standard normal distribution
Look at the Z standardized normal distribution handout
  – You can calculate the area under the Z curve from the mean of zero to the value of interest
  – For example: read down the left-hand column to 1.6 and along the top row to .04, and you’ll find that the area under the curve between Z=0 and Z=1.64 is 0.4495
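The table lookup can be reproduced in code. A Z table of this kind reports the area between 0 and z, which equals 0.5 * erf(z / sqrt(2)); the sketch below assumes that table layout, and the function name is illustrative.

```python
import math


def area_zero_to(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))


# Row 1.6, column .04 of the table corresponds to z = 1.64.
area = area_zero_to(1.64)
print(round(area, 4))
```

Adding 0.5 to this area gives the cumulative probability P(Z <= 1.64), since half of the curve lies to the left of zero.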

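Econ 140 Lecture 3, Slide 24: Standard Normal Curve (6)
Going back to our earlier question: What is the probability that someone earns between $300 and $400, \( P(300 \le Y \le 400) \)?
\( P(300 \le Y \le 400) = P(Z_1 \le Z \le Z_2) \), where \( Z_1 = \frac{300 - \mu}{\sigma} \) and \( Z_2 = \frac{400 - \mu}{\sigma} \)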
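To see the standardization at work, here is a sketch with a made-up mean of $350 and standard deviation of $100. These are not the spreadsheet's values, which the transcript does not give. It converts both endpoints to Z values and takes the difference of the standard normal CDF, using `statistics.NormalDist` from the Python standard library (Python 3.8+).

```python
from statistics import NormalDist

# Hypothetical parameters; the lecture's actual estimates are not given.
mu = 350.0     # assumed mean weekly earnings
sigma = 100.0  # assumed standard deviation of weekly earnings

# Standardize the endpoints: Z = (Y - mu) / sigma
z1 = (300.0 - mu) / sigma
z2 = (400.0 - mu) / sigma

# P(300 <= Y <= 400) = P(z1 <= Z <= z2) = Phi(z2) - Phi(z1)
standard_normal = NormalDist(mu=0.0, sigma=1.0)
probability = standard_normal.cdf(z2) - standard_normal.cdf(z1)

print(z1, z2, round(probability, 4))
```

The same steps answer the $253 to $316 question that the banded PDF could not, because the endpoints no longer need to line up with band boundaries.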

Econ 140 Lecture 3, Slide 25: What we’ve done
‘A priori’ classical probability
  – There are a finite number of possible outcomes
  – Flipping coins example
Empirical classical probability
  – There are an infinite number of possible outcomes
  – Difference between sample and population means
  – Difference between sample and population expected values
  – Difference in calculating PDFs of a univariate population
Use of the standard normal distribution