Review: What You Have Learned in QA 128 Business Statistics I
Key Definitions: A population (universe) is the collection of all things under consideration. A sample is a portion of the population selected for analysis. A parameter is a summary measure computed to describe a characteristic of the population. A statistic is a summary measure computed to describe a characteristic of the sample.
Population and Sample: Population: use parameters to summarize features. Sample: use statistics to summarize features. Inference on the population is made from the sample.
Reasons for Drawing a Sample: Less time-consuming than a census. Less costly to administer than a census. Less cumbersome and more practical to administer than a census of the targeted population.
Graphing Numerical Data: The Histogram. Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58. [Figure: histogram with no gaps between bars, annotated with class midpoints and class boundaries.]
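The class counts behind a histogram of this ordered array can be sketched as follows. The class width of 10 and the first boundary at 10 are assumptions chosen for illustration; the slide does not state which classes were used.

```python
from collections import Counter

# Ordered array from the slide.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

CLASS_WIDTH = 10  # assumed; not stated on the slide
LOWER_BOUND = 10  # assumed lower boundary of the first class


def class_frequencies(values, lower=LOWER_BOUND, width=CLASS_WIDTH):
    """Count the values falling in each class [boundary, boundary + width)."""
    counts = Counter((v - lower) // width for v in values)
    return {
        (lower + k * width, lower + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }


frequencies = class_frequencies(data)

# Text rendering of the histogram: one row per class, one '#' per observation.
for (low, high), count in frequencies.items():
    midpoint = (low + high) / 2
    print(f"{low}-{high} (midpoint {midpoint:g}): {'#' * count} {count}")
```

Each class is identified by its two boundaries, and the midpoint is the average of the two. Changing the assumed width changes the shape of the histogram, which is why the class choice should be reported with the chart.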
Bar Chart (for an Investor’s Portfolio)
Measures of central tendency: mean, median, mode, geometric mean, midrange. Quartiles. Measures of variation: range, interquartile range, variance and standard deviation, coefficient of variation. Shape: symmetric or skewed, assessed using box-and-whisker plots.
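As an illustration, several of these measures can be computed with Python's standard `statistics` module. The data reuse the ordered array from the histogram slide; treating those values as a sample (so the variance divides by n - 1) is an assumption, and the quartile rule used by `statistics.quantiles` may differ slightly from the rule in a given textbook.

```python
import statistics

# Ordered array from the histogram slide, treated here as a sample.
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)          # all most-frequent values
midrange = (min(data) + max(data)) / 2

# Quartiles (default "exclusive" method of statistics.quantiles)
q1, q2, q3 = statistics.quantiles(data, n=4)

# Variation
data_range = max(data) - min(data)
iqr = q3 - q1
variance = statistics.variance(data)        # sample variance, divisor n - 1
std_dev = statistics.stdev(data)            # sample standard deviation
cv = std_dev / mean * 100                   # coefficient of variation, in percent

print(f"mean={mean}, median={median}, modes={modes}, midrange={midrange}")
print(f"Q1={q1}, Q3={q3}, range={data_range}, IQR={iqr}")
print(f"variance={variance:.2f}, std dev={std_dev:.2f}, CV={cv:.1f}%")
```

A mean slightly above the median, as here, is a first hint of mild right skew; a box-and-whisker plot built from the minimum, Q1, median, Q3, and maximum would show the same thing graphically.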
Probability: Probability is the numerical measure of the likelihood that an event will occur. Its value is between 0 and 1. The sum of the probabilities of all mutually exclusive and collectively exhaustive events is 1. [Figure: probability scale running from 0 (impossible) through .5 to 1 (certain).]
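The Normal Distribution: "Bell shaped". Symmetrical. Mean, median and mode are equal. Interquartile range equals about 1.33 σ (more precisely, about 1.35 σ). Random variable has infinite range. [Figure: bell curve f(X) against X, with mean, median and mode at the center.]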
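The interquartile-range property can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `normal_iqr` and the example parameters are illustrative choices, not part of the course material.

```python
from statistics import NormalDist


def normal_iqr(mu, sigma):
    """Interquartile range (Q3 - Q1) of a normal distribution."""
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(0.75) - dist.inv_cdf(0.25)


# For the standardized normal (mu = 0, sigma = 1) the IQR is the multiple of
# sigma that applies to every normal distribution.
ratio = normal_iqr(0, 1)
print(f"IQR of the standard normal: {ratio:.3f} standard deviations")

# The same multiple holds for any other (hypothetical) mean and sigma.
print(f"IQR / sigma for mu=100, sigma=15: {normal_iqr(100, 15) / 15:.3f}")
```

The computed multiple is close to 1.35, so the rounded figure of 1.33 often quoted on slides is a convenient approximation rather than an exact value.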
Many Normal Distributions: By varying the parameters μ and σ, we obtain different normal distributions. There are an infinite number of normal distributions.
Finding Probabilities: Probability is the area under the curve! [Figure: area under f(X) between X = c and X = d.]
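The area between two points c and d is the difference of two cumulative probabilities, P(c ≤ X ≤ d) = F(d) - F(c). A minimal sketch, assuming hypothetical values μ = 50 and σ = 10 that do not come from the slides:

```python
from statistics import NormalDist


def prob_between(c, d, mu, sigma):
    """P(c <= X <= d) for X ~ Normal(mu, sigma): area under f(X) from c to d."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(d) - dist.cdf(c)


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 50, 10

# Area within one standard deviation of the mean.
p_one_sigma = prob_between(mu - sigma, mu + sigma, mu, sigma)
print(f"P({mu - sigma} <= X <= {mu + sigma}) = {p_one_sigma:.4f}")
```

Because the distribution is continuous, including or excluding the endpoints c and d does not change the probability.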
Which Table to Use? An infinite number of normal distributions means an infinite number of tables to look up!
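Solution: The Cumulative Standardized Normal Distribution. Only one table is needed. Cumulative Standardized Normal Distribution Table (Portion): the row and column for Z = 0.12 give the cumulative probability .5478. [Figure: standardized normal curve with the area to the left of Z = 0.12 shaded; shaded area exaggerated.]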
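A portion of such a table can be regenerated to see how it is organized: rows carry the first decimal of Z and columns the second. This is a sketch using `statistics.NormalDist`; the particular rows and columns printed are an arbitrary choice.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def cumulative_table(rows, cols):
    """Map each row value of Z to its cumulative probabilities across the columns."""
    return {
        row: [round(std_normal.cdf(row + col), 4) for col in cols]
        for row in rows
    }


rows = [0.0, 0.1, 0.2, 0.3]   # Z to one decimal place
cols = [0.00, 0.01, 0.02]     # second decimal place of Z
table = cumulative_table(rows, cols)

print("  Z   " + "  ".join(f"{c:.2f} " for c in cols))
for row, probs in table.items():
    print(f"{row:4.1f}  " + "  ".join(f"{p:.4f}" for p in probs))

# Looking up Z = 0.12: row 0.1, column .02.
p_012 = table[0.1][2]
```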
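Standardizing Example: Z = (X - μ)/σ converts a value from a normal distribution into a value from the standardized normal distribution (mean 0, standard deviation 1). [Figure: normal distribution and standardized normal distribution; shaded area exaggerated.]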
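Standardizing subtracts the mean and divides by the standard deviation, Z = (X - μ)/σ. In the sketch below, the values μ = 5, σ = 10 and X = 6.2 are hypothetical, picked only so that the result is a Z value of 0.12; the numbers on the original slide did not survive.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Convert X from Normal(mu, sigma) to a standardized Z value."""
    return (x - mu) / sigma


# Hypothetical values for illustration only.
mu, sigma, x = 5, 10, 6.2

z = standardize(x, mu, sigma)

# The cumulative probability is the same whether computed on the X scale
# or on the Z scale, which is why a single Z table is enough.
p_from_x = NormalDist(mu, sigma).cdf(x)
p_from_z = NormalDist().cdf(z)

print(f"Z = {z:.2f}")
print(f"P(X <= {x}) = {p_from_x:.4f}, P(Z <= {z:.2f}) = {p_from_z:.4f}")
```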
Example: [Figure: normal distribution and the corresponding standardized normal distribution; shaded area exaggerated.]
Example (continued): Cumulative Standardized Normal Distribution Table (Portion), read at Z = 0.21, which gives the cumulative probability .5832. [Figure: shaded area exaggerated.]
Example (continued): Cumulative Standardized Normal Distribution Table (Portion). Z = [value missing]. [Figure: shaded area exaggerated.]
Finding Z Values for Known Probabilities: What is Z given probability = .6217? Cumulative Standardized Normal Distribution Table (Portion): the entry .6217 sits at Z = 0.31. [Figure: shaded area of .6217; shaded area exaggerated.]
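Reading the table in reverse corresponds to the inverse of the cumulative distribution function. A minimal sketch with `statistics.NormalDist.inv_cdf`, using the probability .6217 from the slide:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standardized normal: mean 0, standard deviation 1

probability = 0.6217  # cumulative probability given on the slide

# inv_cdf returns the Z value whose cumulative probability is `probability`.
z = std_normal.inv_cdf(probability)
print(f"Z for cumulative probability {probability}: {z:.2f}")

# Round trip: looking Z back up in the cumulative table recovers the probability.
print(f"Check: P(Z <= {z:.2f}) = {std_normal.cdf(round(z, 2)):.4f}")
```

The printed table only resolves Z to two decimals, so a table lookup gives the nearest tabulated value, while the function returns Z to full floating-point precision.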
Life will be Easier with Excel Statistical Functions
Why Study Sampling Distributions: Sample statistics are used to estimate population parameters; e.g., the sample mean X̄ estimates the population mean μ. Problems: different samples provide different estimates; large samples give better estimates, but large samples cost more; how good is the estimate? Approach to solution: the theoretical basis is the sampling distribution.
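Effect of Large Sample: The sampling distribution of the mean is narrower for a larger sample size than for a smaller one. [Figure: two sampling distributions, P(X), labeled larger sample size and smaller sample size.]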
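The effect of sample size can be seen in a small simulation: draw many samples, compute each sample mean, and measure how much those means vary. The population parameters (μ = 50, σ = 10), the sample sizes, the number of trials and the seed are all assumptions made for this sketch.

```python
import random
import statistics


def sample_mean_spread(n, trials=2000, mu=50.0, sigma=10.0, seed=0):
    """Standard deviation of the sample mean across many simulated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


small = sample_mean_spread(n=4)    # theory: sigma / sqrt(4)  = 5.0
large = sample_mean_spread(n=64)   # theory: sigma / sqrt(64) = 1.25

print(f"spread of sample means, n=4:  {small:.2f}")
print(f"spread of sample means, n=64: {large:.2f}")
```

The simulated spreads should land near σ/√n, the standard error of the mean, which is why larger samples give tighter estimates at a higher sampling cost.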
Estimation Process: The population mean, μ, is unknown. A random sample is drawn from the population, and the sample mean is X̄ = 50. Conclusion: "I am 95% confident that μ is between 40 and 60."
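Point Estimates: Estimate population parameters with sample statistics. Mean: μ is estimated by X̄. Proportion: the population proportion is estimated by the sample proportion. Variance: σ² is estimated by S². Difference: μ1 - μ2 is estimated by X̄1 - X̄2.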
Interval Estimates: Provide a range of values. Take into consideration variation in sample statistics from sample to sample. Are based on observations from one sample. Give information about closeness to unknown population parameters. Are stated in terms of level of confidence; never 100% sure.
General Formula: The general formula for all confidence intervals is: Point Estimate ± (Critical Value)(Standard Error).
Interval and Level of Confidence: Sampling Distribution of the Mean. Confidence intervals extend from X̄ - Zσ_X̄ to X̄ + Zσ_X̄. (1 - α)100% of intervals constructed contain μ; α100% do not.
Factors Affecting Interval Width (Precision): Data variation, measured by σ. Sample size, through σ_X̄ = σ/√n. Level of confidence, 100(1 - α)%. Intervals extend from X̄ - Zσ_X̄ to X̄ + Zσ_X̄.
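The three factors can be explored with a short sketch of the interval X̄ ± Z·σ/√n for a known population standard deviation. The function name `z_interval` and the inputs (X̄ = 50, σ = 10, n = 25) are hypothetical and serve only to show how the width responds.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known: x_bar ± Z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, e.g. about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)         # critical value times standard error
    return x_bar - margin, x_bar + margin


def width(interval):
    low, high = interval
    return high - low


# Hypothetical inputs for illustration only.
base = z_interval(x_bar=50, sigma=10, n=25)
more_data = z_interval(x_bar=50, sigma=10, n=100)                    # larger sample
more_confident = z_interval(x_bar=50, sigma=10, n=25, confidence=0.99)
more_variation = z_interval(x_bar=50, sigma=20, n=25)                # larger sigma

print(f"95%, n=25:   ({base[0]:.2f}, {base[1]:.2f})")
print(f"95%, n=100:  width {width(more_data):.2f}")
print(f"99%, n=25:   width {width(more_confident):.2f}")
print(f"sigma=20:    width {width(more_variation):.2f}")
```

Quadrupling the sample size halves the width, while raising the confidence level or the data variation widens the interval.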
Student’s t Distribution: Bell-shaped and symmetric, with ‘fatter’ tails than the standard normal. [Figure: standard normal (Z) curve compared with t curves for df = 5 and df = 13, centered at 0.]
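Student’s t Table: Rows give df, columns give upper-tail areas, and the entries are t values. Let n = 3, so df = n - 1 = 2; α = .10, so α/2 = .05. The entry for df = 2 and upper-tail area .05 is t = 2.920.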
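Python's standard library has no t distribution, so the sketch below rebuilds a table entry from first principles: it integrates the t density numerically (Simpson's rule) and searches for the critical value by bisection. The function names, step count and iteration count are choices made for this illustration; a statistics package would normally be used instead.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the area from 0 to t (Simpson's rule, even step count)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(upper_tail_area, df):
    """t value that leaves `upper_tail_area` in the upper tail, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, df) > upper_tail_area:   # expand until the tail is small enough
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail_area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Slide example: n = 3, df = n - 1 = 2, alpha = .10, alpha/2 = .05
n = 3
t_value = t_critical(upper_tail_area=0.05, df=n - 1)
print(f"t for df={n - 1}, upper-tail area .05: {t_value:.3f}")
```

With larger df the critical value shrinks toward the corresponding Z value, which reflects the fatter tails of the t distribution at small sample sizes.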
Welcome to the New World: Business Statistics II. Hypothesis testing (one sample). Hypothesis testing (two samples). Analysis of variance (ANOVA). Chi-square test. Linear regression. Time-series analysis.
Homework: Get yourself familiar with Excel. Play with four Excel statistical functions: NORMSDIST(), NORMSINV(), TDIST(), TINV(). Compare the results with the statistical tables.