Econ 3790: Business and Economics Statistics

Instructor: Yogesh Uppal

Normal Probability Distribution
Characteristics: Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).
[Figure: normal curve over x, centered at the mean μ, with an area of 0.5 on each side]

How to find probabilities of a random variable (x) which has a normal distribution.
Convert the x values into the z scores or more formally, standardize x. After the conversion, we can use the z-scores to find probabilities from a table (called table of standard normal probabilities).

Standardizing the Normal Values or the z-scores
Z-scores can be calculated as follows: $z = \frac{x - \mu}{\sigma}$. We can think of z as a measure of the number of standard deviations x is from μ.

Standard Normal Probability Distribution
A standard normal distribution is a normal distribution with a mean of 0 and a variance of 1. If x has a normal distribution with mean μ and variance σ², then $z = \frac{x - \mu}{\sigma}$ is said to have a standard normal distribution.
[Figure: standard normal curve over z, centered at 0, with σ = 1]

Characteristics of Standard Normal Distribution
It is a type of normal distribution. Its mean is zero and its variance is one. Z-values to the left of the mean are negative and z-values to the right of the mean are positive. Important points: What does symmetry mean in this kind of distribution? How do you interpret the values in the Standard Normal Table?

Example: Air Quality
I collected this data on the air quality of various cities as measured by the particulate matter index (PMI). A PMI of less than 50 is said to represent good air quality. The data is available on the class website. Suppose the distribution of PMI is approximately normal.

Example: Air Quality
Suppose I want to find the probability of air quality being good (PMI less than 50). What is the probability that PMI is greater than 80? What is the probability that PMI is within 2 standard deviations of the mean?
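
The three questions above can be sketched in Python. This is only an illustration: it assumes the PMI distribution is exactly normal with μ = 40.9 and σ = 20.5 (the population values quoted later in these slides), and it uses Python's `statistics.NormalDist` in place of the printed table of standard normal probabilities.

```python
from statistics import NormalDist

# Assumed parameters: the population mean and standard deviation quoted later in the slides.
MU, SIGMA = 40.9, 20.5
pmi = NormalDist(mu=MU, sigma=SIGMA)

def z_score(x, mu=MU, sigma=SIGMA):
    """Standardize x: number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

# P(PMI < 50): probability that air quality is good.
p_good = pmi.cdf(50)

# P(PMI > 80): area in the upper tail beyond 80.
p_above_80 = 1 - pmi.cdf(80)

# P(mu - 2*sigma < PMI < mu + 2*sigma): area within 2 standard deviations of the mean.
p_within_2sd = pmi.cdf(MU + 2 * SIGMA) - pmi.cdf(MU - 2 * SIGMA)

print(f"z for x = 50: {z_score(50):.2f}, P(PMI < 50) = {p_good:.4f}")
print(f"z for x = 80: {z_score(80):.2f}, P(PMI > 80) = {p_above_80:.4f}")
print(f"P(within 2 standard deviations) = {p_within_2sd:.4f}")
```

The same answers come from the table: look up the z-score for each x value and read off the cumulative area.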

Computing x from a given z-score:
Suppose I tell you that in our air quality example, the probability is 40% that the standardized value of the PMI is between -z and +z. What is z, and what are the corresponding x values?
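
A possible way to work this backwards, again assuming μ = 40.9 and σ = 20.5 for the PMI (values taken from later slides): a central area of 0.40 leaves 0.30 in each tail, so +z is the point with cumulative area 0.70, and x = μ ± zσ.

```python
from statistics import NormalDist

# Assumed population parameters for the PMI (quoted later in the slides).
MU, SIGMA = 40.9, 20.5

central_prob = 0.40                      # P(-z < Z < +z)
upper_cum_area = 0.5 + central_prob / 2  # cumulative area to the left of +z

# Inverse of the standard normal CDF gives the z value for that area.
z = NormalDist().inv_cdf(upper_cum_area)

# Convert back from z-scores to x values: x = mu + z * sigma.
x_low = MU - z * SIGMA
x_high = MU + z * SIGMA

print(f"z = {z:.4f}")
print(f"x values: {x_low:.2f} and {x_high:.2f}")
```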

Chapter 7, Part A: Sampling and Sampling Distributions
Simple Random Sampling; Point Estimation; Introduction to Sampling Distributions; Sampling Distribution of $\bar{x}$

Statistical Inference
The sample results provide only estimates of the values of the population characteristics. With proper sampling methods, the sample results can provide “good” estimates of the population characteristics. A parameter is a numerical characteristic of a population.

Statistical Inference
The purpose of statistical inference is to obtain information about a population from information contained in a sample. A population is the set of all the elements of interest. A sample is a subset of the population.

Simple Random Sampling: Finite Population
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.

Simple Random Sampling: Finite Population
Replacing each sampled element before selecting subsequent elements is called sampling with replacement. Sampling without replacement is the procedure used most often. In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.

Point Estimation
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter. We refer to $\bar{x}$ as the point estimator of the population mean μ. s is the point estimator of the population standard deviation σ. $\bar{p}$ is the point estimator of the population proportion p.

Sampling Error
When the expected value of a point estimator is equal to the population parameter, the point estimator is said to be unbiased. The absolute value of the difference between an unbiased point estimate and the corresponding population parameter is called the sampling error. Sampling error is the result of using a subset of the population (the sample), and not the entire population. Statistical methods can be used to make probability statements about the size of the sampling error.

Sampling Error
The sampling errors are: $|\bar{x} - \mu|$ for the sample mean, $|s - \sigma|$ for the sample standard deviation, and $|\bar{p} - p|$ for the sample proportion.

Air Quality Example
Let us suppose that the population of air quality data consists of 191 observations. How would you determine the following population parameters: mean, standard deviation, and proportion of cities with good air quality?

Air Quality Example
How about picking a random sample from this population representing the air quality? We shall use SPSS to do this random sampling for us. How would you use this sample to provide point estimates of the population parameters?
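
The class uses SPSS for the sampling step; purely as an illustration, the same idea can be sketched in Python. The population below is a made-up stand-in (191 simulated values), not the class data set, and the names `population`, `sample`, `x_bar`, `s`, and `p_bar` are chosen here for the example.

```python
import random
from statistics import mean, stdev

rng = random.Random(3790)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL population: 191 simulated PMI values, NOT the real class data.
population = [max(0.0, rng.gauss(40.9, 20.5)) for _ in range(191)]

# Simple random sample of size n = 5, drawn without replacement.
n = 5
sample = rng.sample(population, n)

# Point estimates computed from the sample.
x_bar = mean(sample)                        # estimates the population mean
s = stdev(sample)                           # sample std. deviation (n - 1 denominator)
p_bar = sum(v < 50 for v in sample) / n     # proportion of sampled cities with PMI < 50

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}, p_bar = {p_bar:.2f}")
```

Each rerun with a different seed gives a different sample and therefore different point estimates, which is exactly the variation the sampling distribution describes.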

Summary of Point Estimates Obtained from a Simple Random Sample
Population Parameter | Parameter Value | Point Estimator | Point Estimate
μ = Population mean | 40.9 | $\bar{x}$ = Sample mean | ….
σ = Population std. deviation | 20.5 | s = Sample std. deviation | …..
p = Population proportion | .62 | $\bar{p}$ = Sample proportion | ….

Process of Statistical Inference
Population with mean μ = ? A simple random sample of n elements is selected from the population. The sample data provide a value for the sample mean $\bar{x}$. The value of $\bar{x}$ is used to make inferences about the value of μ.

Sampling Distribution of $\bar{x}$
The sampling distribution of $\bar{x}$ is the probability distribution of all possible values of the sample mean $\bar{x}$. Expected value of $\bar{x}$: $E(\bar{x}) = \mu$, where μ = the population mean.

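Standard Deviation of $\bar{x}$
Finite population: $\sigma_{\bar{x}} = \sqrt{\frac{N - n}{N - 1}} \left( \frac{\sigma}{\sqrt{n}} \right)$
Infinite population: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
A finite population is treated as being infinite if n/N < .05. $\sqrt{(N - n)/(N - 1)}$ is the finite correction factor. $\sigma_{\bar{x}}$ is also referred to as the standard error of the mean.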
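
A minimal sketch of the two formulas, using σ = 20.5 and N = 191 from the air quality example. The function name `standard_error_mean` is made up for this illustration; it applies the finite correction factor only when n/N is not below .05, following the rule stated above.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size (None means the population is treated as infinite)
    """
    se = sigma / math.sqrt(n)
    # Apply the finite correction factor unless n/N < .05.
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# n = 5 out of N = 191: n/N is about .026, so the population is treated as infinite.
se_n5 = standard_error_mean(20.5, 5, N=191)

# n = 100 out of N = 191: n/N is about .52, so the correction factor applies.
se_n100 = standard_error_mean(20.5, 100, N=191)

print(f"n = 5:   standard error = {se_n5:.3f}")
print(f"n = 100: standard error = {se_n100:.3f}")
```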

The Shape of Sampling Distribution of $\bar{x}$
If the shape of the distribution of x in the population is normal, the shape of the sampling distribution of $\bar{x}$ is normal as well.
If the shape of the distribution of x in the population is approximately normal, the shape of the sampling distribution of $\bar{x}$ is approximately normal as well.
If the shape of the population is not approximately normal, then:
If n is small, the shape of the sampling distribution of $\bar{x}$ is unpredictable.
If n is large (n ≥ 30), the shape of the sampling distribution of $\bar{x}$ can be assumed to be approximately normal.

[Figure: sampling distribution of $\bar{x}$ for the air quality example when the population is (almost) infinite]

[Figure: sampling distribution of $\bar{x}$ for the air quality example when the population is finite]

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$
$E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 40.9. Whenever the sample size is increased, the standard error of the mean $\sigma_{\bar{x}}$ is decreased. With the increase in the sample size to n = 100, the standard error of the mean decreases.

If we use a large random sample (n ≥ 30), then the sampling distribution of $\bar{x}$ can be approximated by the normal distribution. If the sample is small, then the sampling distribution of $\bar{x}$ can be treated as normal only if we assume that our population has a normal distribution.

Sampling Distribution of $\bar{x}$ for the air quality index when n = 5
What is the probability that a simple random sample of 5 cities will provide an estimate of the population mean air quality index that is within +/-2 of the actual population mean, μ? In other words, what is the probability that $\bar{x}$ will be between 38.9 and 42.9?
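
One way to sketch this calculation, assuming the population is normal with μ = 40.9 and σ = 20.5 and is treated as infinite, so that the sample mean is normal with standard error σ/√n:

```python
from math import sqrt
from statistics import NormalDist

# Assumed population values from the air quality example.
MU, SIGMA, N_SAMPLE = 40.9, 20.5, 5

# Standard error of the mean for n = 5 (infinite-population formula).
standard_error = SIGMA / sqrt(N_SAMPLE)

# Sampling distribution of the sample mean: normal, centered at MU.
x_bar_dist = NormalDist(mu=MU, sigma=standard_error)

# P(38.9 <= x_bar <= 42.9): area within +/-2 of the population mean.
p_within_2 = x_bar_dist.cdf(42.9) - x_bar_dist.cdf(38.9)

print(f"standard error = {standard_error:.3f}")
print(f"P(38.9 <= x_bar <= 42.9) = {p_within_2:.4f}")
```

With only 5 observations the standard error is large relative to the ±2 interval, so the probability comes out small.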

Sampling Distribution of $\bar{x}$ for the air quality index when n = 100
What is the probability that a simple random sample of 100 cities will provide an estimate of the population mean air quality index that is within +/-2 of the actual population mean, μ?

Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$
Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 5. Basically, a given interval around the mean covers more area under the normal curve when the standard error is smaller (larger n) than when the standard error is larger (smaller n).
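
The effect of the sample size can be illustrated with a short sketch. It assumes σ = 20.5 and treats the population as infinite for every n (so no finite correction factor is applied, even for n = 100); the helper name `prob_within` is chosen here for the example.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 20.5   # assumed population standard deviation
MARGIN = 2.0   # half-width of the interval around the population mean

def prob_within(margin, sigma, n):
    """P(|x_bar - mu| <= margin) when x_bar is normal with standard error sigma/sqrt(n)."""
    standard_error = sigma / sqrt(n)
    z = margin / standard_error
    return 2 * NormalDist().cdf(z) - 1

# Same interval, three sample sizes: the probability grows as n grows.
probs = {n: prob_within(MARGIN, SIGMA, n) for n in (5, 30, 100)}

for n, p in probs.items():
    print(f"n = {n:3d}: standard error = {SIGMA / sqrt(n):.3f}, P(within +/-{MARGIN}) = {p:.4f}")
```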

Chapter 7, Part B: Sampling and Sampling Distributions
Sampling Distribution of $\bar{p}$

Sampling Distribution of $\bar{p}$
Making Inferences about a Population Proportion: Population with proportion p = ? A simple random sample of n elements is selected from the population. The sample data provide a value for the sample proportion $\bar{p}$. The value of $\bar{p}$ is used to make inferences about the value of p.

The sampling distribution of $\bar{p}$ is the probability distribution of all possible values of the sample proportion $\bar{p}$. Expected value of $\bar{p}$: $E(\bar{p}) = p$, where p = the population proportion.

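Standard Deviation of $\bar{p}$
Finite population: $\sigma_{\bar{p}} = \sqrt{\frac{N - n}{N - 1}} \sqrt{\frac{p(1 - p)}{n}}$
Infinite population: $\sigma_{\bar{p}} = \sqrt{\frac{p(1 - p)}{n}}$
$\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.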

Form of Sampling Distribution of $\bar{p}$
The sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever the sample size is large: Central Limit Theorem (CLT). The sample size is considered large whenever these conditions are satisfied: np > 5 and n(1 – p) > 5
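
As a rough illustration of the standard error of the proportion and the large-sample conditions, here is a sketch using p = .62 from the air quality summary table. The function names are made up for this example, and the infinite-population formula is assumed.

```python
import math

def standard_error_proportion(p, n):
    """Standard error of the sample proportion (infinite-population formula)."""
    return math.sqrt(p * (1 - p) / n)

def is_large_sample(p, n):
    """Large-sample conditions from the slide: np > 5 and n(1 - p) > 5."""
    return n * p > 5 and n * (1 - p) > 5

P = 0.62  # population proportion from the air quality summary table

# With n = 5, np = 3.1, so the normal approximation is not justified.
ok_n5 = is_large_sample(P, 5)

# With n = 30, np = 18.6 and n(1 - p) = 11.4, so both conditions hold.
ok_n30 = is_large_sample(P, 30)
se_n30 = standard_error_proportion(P, 30)

print(f"n = 5:  large sample? {ok_n5}")
print(f"n = 30: large sample? {ok_n30}, standard error = {se_n30:.4f}")
```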

Chapter 8: Interval Estimation
Population Mean: σ Known; Population Mean: σ Unknown

Margin of Error and the Interval Estimate
A point estimator cannot be expected to provide the exact value of the population parameter. An interval estimate can be computed by adding and subtracting a margin of error to the point estimate. Point Estimate +/- Margin of Error The purpose of an interval estimate is to provide information about how close the point estimate is to the value of the parameter.

Margin of Error and the Interval Estimate
The general form of an interval estimate of a population mean is $\bar{x} \pm$ Margin of Error. In order to develop an interval estimate of a population mean, the margin of error must be computed using either: the population standard deviation σ, or the sample standard deviation s. These interval estimates are also called confidence intervals.

Interval Estimate of a Population Mean: σ Known
Interval estimate of μ: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
where:
$\bar{x}$ is the sample mean
1 - α is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
σ is the population standard deviation
n is the sample size

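Interval Estimation of a Population Mean: σ Known
There is a 1 - α probability that the value of a sample mean will provide a margin of error of $z_{\alpha/2}\,\sigma_{\bar{x}}$ or less.
[Figure: sampling distribution of $\bar{x}$, centered at μ, with 1 - α of all $\bar{x}$ values in the middle and an area of α/2 in each tail]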

Summary of Point Estimates Obtained from a Simple Random Sample
Population Parameter | Parameter Value | Point Estimator | Point Estimate
μ = Population mean | 40.9 | $\bar{x}$ = Sample mean |
σ = Population std. deviation | 20.5 | s = Sample std. deviation | …….
p = Population proportion | .62 | $\bar{p}$ = Sample proportion |

Example: Air Quality
Consider our air quality example. Suppose the population is approximately normal with μ = 40.9 and σ = 20.5. This is the σ known case. If you remember, we picked a sample of size 5 (n = 5). Given all this information, what is the margin of error at the 95% confidence level?

Example: Air Quality
What is the margin of error at the 95% confidence level? We can say with 95% confidence that the population mean (μ) is within ±18 of the sample mean. With 95% confidence, μ is between …. and …...
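
The ±18 figure can be checked with a short sketch, assuming σ = 20.5 (the population standard deviation from the summary table) and n = 5. The interval endpoints depend on the sample mean from the class sample, so they are not computed here.

```python
from math import sqrt
from statistics import NormalDist

SIGMA = 20.5        # assumed known population standard deviation
N_SAMPLE = 5        # sample size
CONFIDENCE = 0.95   # confidence coefficient, 1 - alpha

alpha = 1 - CONFIDENCE

# z value with an area of alpha/2 in the upper tail of the standard normal distribution.
z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error for the sigma-known case: z * sigma / sqrt(n).
margin_of_error = z_half_alpha * SIGMA / sqrt(N_SAMPLE)

print(f"z = {z_half_alpha:.3f}")
print(f"margin of error = {margin_of_error:.2f}")
```

Adding and subtracting this margin from whatever sample mean was obtained gives the 95% confidence interval.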

Interval Estimation of a Population Mean: σ Unknown
If an estimate of the population standard deviation σ cannot be developed prior to sampling, we use the sample standard deviation s to estimate σ. This is the σ unknown case. In this case, the interval estimate for μ is based on the t distribution. (We’ll assume for now that the population is normally distributed.)

t Distribution
The t distribution is a family of similar probability distributions. A specific t distribution depends on a parameter known as the degrees of freedom. Degrees of freedom refer to the number of independent pieces of information that go into the computation of s.

t Distribution
A t distribution with more degrees of freedom has less dispersion. As the number of degrees of freedom increases, the difference between the t distribution and the standard normal probability distribution becomes smaller and smaller.

t Distribution
[Figure: standard normal distribution compared with a t distribution (20 degrees of freedom); horizontal axis: z, t]

t Distribution
For more than 100 degrees of freedom, the standard normal z value provides a good approximation to the t value. The standard normal z values can be found in the infinite degrees (∞) row of the t distribution table.

t Distribution
[Table: t distribution table; the infinite degrees of freedom row gives the standard normal z values]

Interval Estimation of a Population Mean: σ Unknown
Interval estimate: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$
where:
1 - α = the confidence coefficient
$t_{\alpha/2}$ = the t value providing an area of α/2 in the upper tail of a t distribution with n - 1 degrees of freedom
s = the sample standard deviation

Example: Air quality when σ is unknown
Now suppose that you did not know what σ is. You can estimate it using the sample and then use the t distribution to find the margin of error. What is the 95% confidence interval in this case? The sample size is n = 5, so the degrees of freedom for the t distribution are 4. The level of significance (α) is 0.05. s = ……
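
A sketch of the σ unknown calculation follows. The five sample values are hypothetical placeholders, not the class sample, so the resulting numbers only show the mechanics. Python's standard library has no t distribution, so the critical value 2.776 (upper-tail area .025, 4 degrees of freedom) is typed in from a t table.

```python
from math import sqrt
from statistics import mean, stdev

# HYPOTHETICAL sample of n = 5 PMI values; replace with the sample drawn in class.
sample = [35, 52, 28, 61, 44]
n = len(sample)

# t value with an area of .025 in the upper tail, 4 degrees of freedom (from a t table).
T_CRIT_DF4_95 = 2.776

x_bar = mean(sample)   # point estimate of the population mean
s = stdev(sample)      # sample standard deviation (n - 1 denominator)

# Margin of error for the sigma-unknown case: t * s / sqrt(n).
margin_of_error = T_CRIT_DF4_95 * s / sqrt(n)
interval = (x_bar - margin_of_error, x_bar + margin_of_error)

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
print(f"margin of error = {margin_of_error:.2f}")
print(f"95% confidence interval: {interval[0]:.2f} to {interval[1]:.2f}")
```

The t value (2.776) is larger than the z value (1.96) used in the σ known case, which widens the interval to reflect the extra uncertainty from estimating σ.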

Summary of Interval Estimation Procedures for a Population Mean
Can the population standard deviation σ be assumed known?
Yes: σ Known Case. Use $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
No: Use the sample standard deviation s to estimate σ. σ Unknown Case. Use $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$

Interval Estimation of a Population Proportion
The general form of an interval estimate of a population proportion is $\bar{p} \pm$ Margin of Error.

Interval Estimation of a Population Proportion
Interval estimate: $\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$
where:
1 - α is the confidence coefficient
$z_{\alpha/2}$ is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
$\bar{p}$ is the sample proportion
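
The formula can be sketched as a small function. The inputs used below (a sample proportion of 0.60 from n = 100) are hypothetical values chosen only to exercise the formula, and the function name `proportion_interval` is invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_bar, n, confidence=0.95):
    """Interval estimate of a population proportion: p_bar +/- z * sqrt(p_bar(1 - p_bar)/n)."""
    alpha = 1 - confidence
    z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_half_alpha * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin

# HYPOTHETICAL inputs: 60 of 100 sampled cities have good air quality.
lower, upper = proportion_interval(p_bar=0.60, n=100)

print(f"95% confidence interval for p: {lower:.3f} to {upper:.3f}")
```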
