26134 Business Statistics, Autumn 2017
Tutorial 6: Sampling Distribution & Confidence Intervals
bstats@uts.edu.au
B MathFin (Hons), M Stat (UNSW), PhD (UTS)
mahritaharahap.wordpress.com/teaching-areas
business.uts.edu.au
Revision: Continuous Distributions
In statistics we usually want to analyse a population, but collecting data for the whole population is usually impractical, expensive or impossible. That is why we collect samples from the population (sampling) and make inferences about the population parameters using the statistics of the sample (inferencing) with some level of accuracy (confidence level).
A population is a collection of all possible individuals, objects, or measurements of interest. A sample is a subset of the population of interest. A parameter is a number that describes some aspect of a population. A statistic is a number that is computed from data in a sample.
Notation (population parameter, sample statistic):
Mean: $\mu$, $\bar{x}$
Std Deviation: $\sigma$, $s$
Sample Size: $N$, $n$
Proportion: $p$, $\hat{p}$
The sampling distribution of a statistic tells us how close the statistic is likely to be to the parameter. A sampling distribution is created, as the name suggests, by sampling: drawing repeated samples of the same size from the population and computing the statistic for each one. Sampling introduces the potential for error in inference, since a sample cannot be fully representative of the population. https://youtu.be/FXZ2O1Lv-KE?t=8m34s
Sampling Distributions
Central Limit Theorem
Under the CLT, the sampling distribution of the sample mean will be (at least approximately) normally distributed if:
(1) the original distribution is normally distributed (sample size is irrelevant), or
(2) the original distribution is NOT normally distributed but the sample size is greater than 30.
Note that only ONE of these conditions needs to be satisfied for this conclusion to be reached.
In the case of the sampling distribution of the sample proportion, the sample size must be greater than both 5/p and 5/q, where q = 1 - p (equivalently, np > 5 and nq > 5).
1. If a random sample is drawn from a normal population, then the sampling distribution of the sample mean $\bar{X}$ is normally distributed for all values of n (sample size).
2. If a random sample is drawn from any population, the sampling distribution of the sample mean $\bar{X}$ is approximately normal for a sufficiently large sample size (usually n > 30).
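The second statement can be checked with a small simulation. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the sample size of 40, the number of repetitions and the helper name `sample_means` are all assumptions chosen for the demonstration, not part of the tutorial.

```python
import random
import statistics

def sample_means(draw, n, reps, rng):
    """Return `reps` sample means, each computed from a fresh sample of size n."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

def draw_exponential(rng):
    # Assumed skewed (non-normal) population: exponential, mean 1, sd 1.
    return rng.expovariate(1.0)

rng = random.Random(2017)  # fixed seed so the run is repeatable
means = sample_means(draw_exponential, n=40, reps=5000, rng=rng)

centre = statistics.fmean(means)   # should sit close to the population mean, 1
spread = statistics.stdev(means)   # should sit close to sigma / sqrt(n)
print(f"mean of sample means: {centre:.3f}")
print(f"sd of sample means:   {spread:.3f} (theory: {1 / 40 ** 0.5:.3f})")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.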
Activity 1: Sampling Distribution
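Standard Error
Standard error of the sample mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
When $\sigma$ is unknown, it is estimated by the sample standard deviation: $s_{\bar{x}} = \frac{s}{\sqrt{n}}$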
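As a quick illustration, the standard error of the sample mean is the standard deviation divided by the square root of the sample size. The function name `standard_error` and the numbers used (a standard deviation of 12 and a sample of 36) are hypothetical values for the sketch.

```python
import math

def standard_error(sd, n):
    """Standard error of the sample mean: sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Hypothetical example: sd = 12, n = 36, so the standard error is 12 / 6.
se = standard_error(12.0, 36)
print(se)  # 2.0
```

Quadrupling the sample size halves the standard error, which is why larger samples give more precise estimates.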
Activity 2: Standard Error
Motivation: Point estimator
The objective of estimation is to determine the value of a population parameter on the basis of a sample statistic.
EXAMPLE: "What is the average time taken by customers on a single shopping trip?"
The parameter of interest is the mean ($\mu$) of the population distribution of $X$. We can use sample data to calculate the sample mean ($\bar{x}$), which is one value from the sampling distribution of $\bar{X}$. Using this single value, we can make an inference about the population parameter $\mu$. The sample mean, a single value, is referred to as the point estimate.
Compared with a point estimate, we can state our inference with greater confidence when we use a range of values.
@ Dr. Sonika Singh, BSTATS, UTS
Motivation: Interval estimator
To make statements about an unknown population parameter with greater confidence, we can develop an interval estimator. An interval estimate draws inferences about a population by estimating the value of an unknown population parameter using an interval built around the point estimate from the sampling distribution. This interval is called the Confidence Interval (CI).
Confidence Intervals
Interval Estimates - Interpretation
Confidence Interval: a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence. Common levels of confidence used by analysts are 90%, 95%, and 99%.
The interval is built around the value of a statistic (such as the mean) observed in the sample, so different samples give different intervals. A 95% confidence level means that if we drew 100 samples and constructed a confidence interval from each, approximately 95 of the 100 intervals would contain the population mean.
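The "95 out of 100 intervals" reading can be illustrated by simulation. The sketch below assumes a normal population with known standard deviation and uses the interval "sample mean plus or minus z times sigma over root n"; the population values (mean 50, standard deviation 10), the sample size, and the function name `coverage` are assumptions made for the demonstration.

```python
import random
import statistics
from statistics import NormalDist

def coverage(mu, sigma, n, level, reps, rng):
    """Share of `reps` simulated confidence intervals that contain mu."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

# Assumed population: mean 50, sd 10; 2000 samples of size 25.
rate = coverage(mu=50.0, sigma=10.0, n=25, level=0.95, reps=2000,
                rng=random.Random(6))
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains the population mean or it does not; the 95% describes the long-run behaviour of the procedure.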
Confidence Intervals
Mean (if σ known): $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
Mean (if σ unknown): $\bar{x} \pm t_{\alpha/2,\,n-1} \cdot \frac{s}{\sqrt{n}}$
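Both intervals have the same shape: the sample mean plus or minus a critical value times the standard error. A minimal sketch follows, in which the helper name `mean_ci` and the sample figures (mean 50, standard deviation 10, n = 25) are hypothetical. The z critical value is computed with Python's standard library; the t critical value is assumed to be read from a t-table (2.064 for a 95% interval with 24 degrees of freedom).

```python
import math
from statistics import NormalDist

def mean_ci(xbar, sd, n, critical):
    """Confidence interval for a mean: xbar +/- critical * sd / sqrt(n)."""
    margin = critical * sd / math.sqrt(n)
    return xbar - margin, xbar + margin

# Sigma known: use z_{alpha/2}. For 95% confidence, alpha/2 = 0.025.
z = NormalDist().inv_cdf(0.975)
z_low, z_high = mean_ci(xbar=50.0, sd=10.0, n=25, critical=z)

# Sigma unknown: use t_{alpha/2, n-1}, here taken from a t-table
# (95% confidence, 24 degrees of freedom).
t_low, t_high = mean_ci(xbar=50.0, sd=10.0, n=25, critical=2.064)

print(f"z interval: ({z_low:.2f}, {z_high:.2f})")
print(f"t interval: ({t_low:.2f}, {t_high:.2f})")
```

The t interval is slightly wider than the z interval for the same data, reflecting the extra uncertainty from estimating σ with s.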
Activity 3: Confidence Intervals
Activity 4: Confidence Intervals
Activity 5: Confidence Intervals
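Estimating Sample Sizes
MEAN: $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$
PROPORTION: $n = \frac{z_{\alpha/2}^2\, p\, q}{E^2}$
Here $E$ denotes the desired margin of error and $q = 1 - p$. Always round $n$ up to the next whole number.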
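The required sample size comes from solving the margin-of-error expression for n and rounding up. The sketch below assumes the standard formulas n = (z·σ/E)² for a mean and n = z²·p·q/E² for a proportion, where E is the desired margin of error; the function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def z_critical(level):
    """Two-sided z critical value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def sample_size_mean(sigma, margin, level=0.95):
    """Smallest n so that z * sigma / sqrt(n) does not exceed the margin."""
    return math.ceil((z_critical(level) * sigma / margin) ** 2)

def sample_size_proportion(p, margin, level=0.95):
    """Smallest n so that z * sqrt(p * q / n) does not exceed the margin."""
    q = 1 - p
    return math.ceil(z_critical(level) ** 2 * p * q / margin ** 2)

# Hypothetical inputs: sigma = 15 with a margin of 3; p = 0.5 with a margin of 0.05.
n_mean = sample_size_mean(sigma=15.0, margin=3.0)
n_prop = sample_size_proportion(p=0.5, margin=0.05)
print(n_mean, n_prop)  # 97 385
```

When no prior estimate of p is available, p = 0.5 is the conservative choice because it maximises p·q and therefore the required sample size.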
Activity 6: Estimating Sample Sizes
SEE YOU ALL NEXT WEEK!