Measures of Location: Population mean. The population mean of a data set is the average of all the data values: $\mu = \frac{\sum x_i}{N}$, where $\sum x_i$ is the sum of the values of the $N$ observations and $N$ is the number of observations in the population.
Measures of Location: Sample mean. The sample mean of a data set is the average of all the data values in the sample: $\bar{x} = \frac{\sum x_i}{n}$, where $\sum x_i$ is the sum of the values of the $n$ observations and $n$ is the number of observations in the sample. The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.
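As a minimal sketch of the two formulas above, the mean is the sum of the values divided by their count; the same computation serves for $\mu$ (all $N$ population values) and for $\bar{x}$ (the $n$ sampled values). The function name `mean` and the data are illustrative assumptions, not the Hudson Auto invoices.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical parts costs in dollars (not the slide data).
costs = [62, 71, 75, 80, 97]
print(mean(costs))  # 77.0
```

Whether the result is read as $\mu$ or as $\bar{x}$ depends only on whether `costs` holds the whole population or a sample drawn from it.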
Measures of Location: Example: Recall the Hudson Auto Repair example. The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are the data used in the examples that follow.
Measures of Location: Median. With the observations arranged in ascending order:
For an odd number of observations, the median is the middle value.
For an even number of observations, the median is the average of the middle two values. In the slide's example, Median = (sum of the middle two values)/2 = 22.5.
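The odd and even cases can be sketched as follows. This is an illustrative implementation under the definition given on the slides; the function name `median` and the sample values are assumptions, not the slide data.

```python
def median(values):
    """Median: middle value (odd n) or average of the two middle values (even n)."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle two


# Hypothetical data.
print(median([3, 1, 7, 5, 9]))  # 5
print(median([3, 1, 7, 5]))     # 4.0
```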
Measures of Location: Median. Example: Hudson Auto Repair. With $n = 50$ (an even number), the median is the average of the 25th and 26th data values: Median = (25th value + 26th value)/2 = 75.5. Note: Data is in ascending order.
Measures of Location: Mode. Example: Hudson Auto Repair. The mode is the value that occurs with the greatest frequency: Mode = 62. Note: Data is in ascending order.
Measures of Location: Percentiles. Example: Hudson Auto Repair, first quartile. First quartile = 25th percentile. $i = (p/100)n = (25/100)50 = 12.5$. Because $i$ is not an integer, round up to 13: the first quartile is the 13th data value, 69. Note: Data is in ascending order.
Measures of Location: Percentiles. Example: Hudson Auto Repair, 80th percentile. $i = (p/100)n = (80/100)50 = 40$. Because $i$ is an integer, average the 40th and 41st data values: 80th percentile = (40th value + 41st value)/2 = 95. Note: Data is in ascending order.
Measures of Location: Example: Hudson Auto Repair. 80th percentile = 95. Note: Data is in ascending order.
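The index rule used in the two percentile examples can be sketched as below: compute $i = (p/100)n$, round up when $i$ is not an integer, and average the $i$th and $(i+1)$th values when it is. The function name `percentile` is an assumption, and the data are the integers 1 to 50 rather than the invoice costs, so that each result equals the position it came from. Statistical libraries often use other interpolation rules, so their results may differ from this one.

```python
import math


def percentile(values, p):
    """p-th percentile (0 < p < 100) using the index rule i = (p/100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    i = p * n / 100
    if i != int(i):
        # i is not an integer: round up; positions are 1-based on the slides
        return ordered[math.ceil(i) - 1]
    i = int(i)
    # i is an integer: average the i-th and (i+1)-th values
    return (ordered[i - 1] + ordered[i]) / 2


data = list(range(1, 51))     # 50 hypothetical values, already in ascending order
print(percentile(data, 25))   # 13   (i = 12.5, rounded up to the 13th value)
print(percentile(data, 80))   # 40.5 (i = 40, average of the 40th and 41st values)
```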
Pelican Stores -- continued (data file: data_pelican.xls)
Pelican Stores is a chain of women’s apparel stores. It recently ran a promotion in which discount coupons were sent to customers of other National Clothing stores. Data collected for a sample of 100 in-store credit card transactions at Pelican Stores during one day while the promotion was running are shown in the accompanying table. Customers who made a purchase using a discount coupon are referred to as promotional customers, and customers who made a purchase but did not use a discount coupon are referred to as regular customers. Because the promotional coupons were not sent to regular Pelican Stores customers, management considers the sales made to people presenting the promotional coupons as sales it would not otherwise make. Pelican’s management would like to use this sample data to learn about its customer base and to evaluate the promotion involving discounts.
Managerial Report
1. Using graphs and tables, summarize the qualitative variables.
2. Using graphs and tables, summarize the quantitative variables.
3. Using pivot tables and scatter plots, summarize the variables.
4. Compute the mean, mode, median, and the 25th and 75th percentiles.
Measures of Variability: Range. Example: Hudson Auto Repair. Range = maximum - minimum = 109 - 52 = 57. Note: Data is in ascending order.
Measures of Variability: Interquartile range. Example: Hudson Auto Repair. 3rd Quartile (Q3) = 89 and 1st Quartile (Q1) = 69, so Interquartile Range = Q3 - Q1 = 89 - 69 = 20. Note: Data is in ascending order.
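A short sketch of both measures follows, with the quartiles computed by the $i = (p/100)n$ index rule. The helper names and the eight data values are illustrative assumptions, not the 50 invoice costs.

```python
import math


def percentile(ordered, p):
    """p-th percentile of already-sorted data via the i = (p/100) * n rule."""
    i = p * len(ordered) / 100
    if i != int(i):
        return ordered[math.ceil(i) - 1]         # round up to the next position
    i = int(i)
    return (ordered[i - 1] + ordered[i]) / 2     # average the i-th and (i+1)-th values


def data_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


def interquartile_range(values):
    """IQR = Q3 - Q1, the spread of the middle 50% of the data."""
    ordered = sorted(values)
    return percentile(ordered, 75) - percentile(ordered, 25)


costs = [52, 60, 69, 75, 80, 89, 95, 109]   # hypothetical values
print(data_range(costs))                    # 57
print(interquartile_range(costs))           # 27.5
```

The range depends only on the two extreme values, while the IQR ignores them, which is why the IQR is less sensitive to unusually large or small observations.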
Measures of Variability: Population variance. The population variance is the average variation: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$, where
$\mu$ is the population mean,
$(x_i - \mu)$ is the $i$th deviation from the population mean,
$(x_i - \mu)^2$ is the $i$th squared deviation from the population mean,
$\sum (x_i - \mu)^2$ is the sum of squared deviations from the population mean (the total variation of $x$), and
$N$ is the number of observations in the population.
Measures of Variability: Sample variance. The sample variance is an unbiased estimator of the population variance $\sigma^2$: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$, where $n$ is the number of observations in the sample and $n - 1$ is the degrees of freedom.
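The two variance formulas differ only in the divisor: $N$ for a population and $n - 1$ (the degrees of freedom) for a sample. The sketch below makes that explicit. The function names and data are illustrative assumptions.

```python
def sum_squared_deviations(values, center):
    """Total variation: the sum of squared deviations from `center`."""
    return sum((x - center) ** 2 for x in values)


def population_variance(values):
    """sigma^2: total variation divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum_squared_deviations(values, mu) / n


def sample_variance(values):
    """s^2: total variation divided by the degrees of freedom, n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two values")
    x_bar = sum(values) / n
    return sum_squared_deviations(values, x_bar) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical values; mean 5, total variation 32
print(population_variance(data))    # 4.0
print(sample_variance(data))        # 32/7, about 4.571
```

Python's standard library offers `statistics.pvariance` and `statistics.variance` for the same two quantities.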
Measures of Variability: Table: sorted invoices, with columns "Observed value" and "Squared deviation from the mean", a final "Sum" row, and the sample mean $\bar{x}$.
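Measures of Variability: Example: Hudson Auto Repair. Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$. Standard deviation: $s = \sqrt{s^2}$. Coefficient of variation: $\left(\frac{s}{\bar{x}} \times 100\right)\%$.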
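A minimal sketch of the standard deviation and the coefficient of variation, assuming sample formulas with the $n - 1$ divisor. The function names and data are illustrative, not the invoice costs.

```python
import math


def sample_std(values):
    """s: the square root of the sample variance (n - 1 divisor)."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation requires at least two values")
    x_bar = sum(values) / n
    s_squared = sum((x - x_bar) ** 2 for x in values) / (n - 1)
    return math.sqrt(s_squared)


def coefficient_of_variation(values):
    """CV: the standard deviation as a percentage of the mean."""
    x_bar = sum(values) / len(values)
    return sample_std(values) / x_bar * 100


data = [2, 4, 4, 4, 5, 5, 7, 9]                  # hypothetical values, mean 5
print(round(sample_std(data), 3))                # 2.138
print(round(coefficient_of_variation(data), 1))  # 42.8
```

Because the coefficient of variation is a ratio, it has no units, which makes it suitable for comparing the variability of data sets measured on different scales.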
Pelican Stores -- continued (data file: data_pelican.xls)
Managerial Report, continued
5. Compute the range, IQR, variance, and standard deviations.
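Measures of Shape: z-scores. Example: Hudson Auto Repair. The z-score of an observation is $z_i = \frac{x_i - \bar{x}}{s}$. z-score of the smallest value: $z = \frac{52 - \bar{x}}{s} = -1.93$. Note: Data is in ascending order.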
Measures of Shape: Table: columns "Observed value", "Deviation from the mean", and "z-score", with the sample mean $\bar{x}$ and the sample standard deviation $s$.
Measures of Shape: Skewness. An important measure of the shape of a distribution is called skewness. When $n$ is "large", it is just the average of the $n$ cubed z-scores: $\text{skewness} \approx \frac{1}{n} \sum z_i^3$.
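The large-$n$ description above can be sketched directly. For comparison, the sketch also includes the small-sample adjustment $\frac{n}{(n-1)(n-2)} \sum z_i^3$, which is the form used by spreadsheet SKEW functions; the slides do not state which form produced their values, so treat that pairing as an assumption. Function names and data are illustrative.

```python
import math


def z_scores(values):
    """z_i = (x_i - x_bar) / s, with s the sample standard deviation."""
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - x_bar) / s for x in values]


def skew_large_n(values):
    """Average of the cubed z-scores (the large-n description)."""
    z = z_scores(values)
    return sum(zi ** 3 for zi in z) / len(z)


def skew_adjusted(values):
    """Small-sample adjustment: n / ((n - 1)(n - 2)) times the sum of cubed z-scores."""
    n = len(values)
    if n < 3:
        raise ValueError("adjusted skewness requires at least three values")
    return n / ((n - 1) * (n - 2)) * sum(zi ** 3 for zi in z_scores(values))


symmetric = [1, 2, 3, 4, 5]       # hypothetical: skewness is essentially 0
right_tail = [1, 1, 1, 1, 10]     # hypothetical: long right tail, positive skewness
print(skew_large_n(symmetric), skew_large_n(right_tail))
```

The ratio of the two versions is $n^2 / ((n-1)(n-2))$, which approaches 1 as $n$ grows; this is the sense in which the simple average is adequate for large $n$.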
Measures of Shape: Table: columns "Observed value", "z-score", and "Cubed z-score".
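Measures of Shape: Figure: histogram of Tune-up Parts Cost (horizontal axis: Parts Cost ($); vertical axis: Frequency), marking the mean ($78.98), the median ($75.50), and the mode ($62).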
Measures of Shape: Figure: three example distributions. Moderately Skewed Left: skew = -0.31. Symmetric: skew = 0. Highly Skewed Right: skew = 1.25.
Measures of Shape: Chebyshev's Theorem: At least $(1 - 1/z^2)$ of the data values are within $z$ standard deviations of the mean.
At least 0% of the data values are within 1 standard deviation of the mean.
At least 75% of the data values are within 2 standard deviations of the mean.
At least 89% of the data values are within 3 standard deviations of the mean.
At least 94% of the data values are within 4 standard deviations of the mean.
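The four percentages listed above come from evaluating $1 - 1/z^2$ at $z = 1, 2, 3, 4$. A small sketch, with a function name that is an assumption of this example:

```python
def chebyshev_lower_bound(z):
    """Minimum share of data within z standard deviations of the mean: 1 - 1/z^2."""
    if z <= 0:
        raise ValueError("z must be positive")
    # For z <= 1 the formula gives 0 or less, so the bound carries no information.
    return max(0.0, 1 - 1 / z ** 2)


for z in (1, 2, 3, 4):
    print(z, f"{chebyshev_lower_bound(z):.0%}")
# 1 0%
# 2 75%
# 3 89%
# 4 94%
```

The bound holds for any data set regardless of the shape of its distribution, which is why it is much looser than the empirical rule.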
Measures of Shape: Empirical Rule: For data with a bell-shaped distribution, approximately:
68.26% of the data values are within 1 standard deviation of the mean.
95.44% of the data values are within 2 standard deviations of the mean.
99.74% of the data values are within 3 standard deviations of the mean.
99.99% of the data values are within 4 standard deviations of the mean.
Measures of Shape: Example: Hudson Auto Repair. Table: z-score of each observation and whether the observation is within 2 standard deviations of the mean.
z-score    Within 2 std dev?
-1.93      Yes
-1.57      Yes
-1.21      Yes
-1.21      Yes
-1.21      Yes
-1.21      Yes
...        Yes
1.86       Yes
2.15       No
49 of the 50 data values are within 2s of the mean = 98%.
50 of the 50 data values are within 3s of the mean = 100%.
None of the values are outliers.
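The counting in this example can be sketched as follows, assuming that an outlier is an observation whose z-score is more than 3 in absolute value (the threshold implied by the conclusion above). Function names and data are illustrative assumptions, not the 50 invoice costs.

```python
import math


def z_scores(values):
    """z_i = (x_i - x_bar) / s, with s the sample standard deviation."""
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))
    return [(x - x_bar) / s for x in values]


def share_within(values, k):
    """Fraction of observations within k standard deviations of the mean."""
    z = z_scores(values)
    return sum(1 for zi in z if abs(zi) <= k) / len(z)


def outliers(values, threshold=3):
    """Observations whose |z-score| exceeds the threshold (assumed to be 3)."""
    return [x for x, zi in zip(values, z_scores(values)) if abs(zi) > threshold]


costs = [52, 62, 62, 71, 75, 76, 80, 97, 109]   # hypothetical values
for k in (1, 2, 3):
    print(k, share_within(costs, k))
print(outliers(costs))
```

Comparing `share_within(costs, k)` with the Chebyshev bound and with the empirical-rule percentages is a quick check on how bell-shaped the data are.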
Pelican Stores -- continued (data file: data_pelican.xls)
Managerial Report, continued
6. Compute the z-scores and skew, find the outliers, and count the observations that are within 1, 2, and 3 standard deviations of the mean.
Measures of the relationship between 2 variables: Covariance. The covariance is computed as follows:
$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$ (for samples)
$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$ (for populations)
where $(x_i - \bar{x})$ and $(x_i - \mu_x)$ are the $i$th deviations from x's means, $(y_i - \bar{y})$ and $(y_i - \mu_y)$ are the $i$th deviations from y's means, $n$ and $N$ are the sizes of the sample and population, and $n - 1$ is the degrees of freedom.
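A minimal sketch of the sample and population covariance formulas, which differ only in the divisor. The function names and data are illustrative assumptions.

```python
def _sum_cross_deviations(xs, ys):
    """Sum of (x_i - mean of x) * (y_i - mean of y) over paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))


def sample_covariance(xs, ys):
    """s_xy: cross-deviation sum divided by the degrees of freedom, n - 1."""
    if len(xs) < 2:
        raise ValueError("sample covariance requires at least two pairs")
    return _sum_cross_deviations(xs, ys) / (len(xs) - 1)


def population_covariance(xs, ys):
    """sigma_xy: cross-deviation sum divided by N."""
    return _sum_cross_deviations(xs, ys) / len(xs)


x = [1, 2, 3, 4]      # hypothetical paired data
y = [2, 4, 6, 8]
print(sample_covariance(x, y))      # 10/3, about 3.333
print(population_covariance(x, y))  # 2.5
```

A positive covariance indicates that the two variables tend to move together; its magnitude depends on the units of both variables.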
Measures of the relationship between 2 variables: Example: Reed Auto Sales. Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below. Table: columns "Number of TV Ads (x)" and "Number of Cars Sold (y)".
Measures of the relationship between 2 variables: Example: Reed Auto Sales. Figure: TV Ads versus Cars sold.
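Measures of the relationship between 2 variables: Example: Reed Auto Sales. Table: columns $x$, $y$, $x - \bar{x}$, $(x - \bar{x})^2$, $y - \bar{y}$, $(y - \bar{y})^2$, and $(y - \bar{y})(x - \bar{x})$. Summary values:
$\bar{x} = 2$ (ads), $\bar{y} = 20$ (cars)
$s_{xx} = 1$ (ads squared), $s_{yy} = 28.5$ (cars squared), $s_{xy} = 5$ (ads-cars)
$s_x = 1$ (ads), $s_y = 5.34$ (cars)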
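Measures of the relationship between 2 variables: Correlation. Example: Reed Auto Sales. With $s_{xy} = 5$ (ads-cars), $s_x = 1$ (ads), and $s_y = 5.34$ (cars): $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{5}{(1)(5.34)} \approx 0.94$. The units (ads-cars) in the numerator cancel against (ads)(cars) in the denominator, so the correlation coefficient has no units.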
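The correlation calculation can be sketched end to end. The values in `ads` and `cars` are illustrative: they were chosen so that the summary statistics match those reported on the slides ($\bar{x} = 2$, $\bar{y} = 20$, $s_{xy} = 5$, $s_x = 1$, $s_y = 5.34$), and they are an assumption rather than data recovered from the slide's table. The function names are also assumptions.

```python
import math


def sample_covariance(xs, ys):
    """s_xy = sum((x_i - x_bar) * (y_i - y_bar)) / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """s = sqrt(s_xx); the covariance of a variable with itself is its variance."""
    return math.sqrt(sample_covariance(values, values))


def correlation(xs, ys):
    """r_xy = s_xy / (s_x * s_y); unitless and always between -1 and 1."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


ads = [1, 3, 2, 1, 3]          # illustrative, consistent with x_bar = 2 and s_x = 1
cars = [14, 24, 18, 17, 27]    # illustrative, consistent with y_bar = 20 and s_y = 5.34

print(sample_covariance(ads, cars))       # 5.0
print(round(sample_std(cars), 2))         # 5.34
print(round(correlation(ads, cars), 2))   # 0.94
```

Dividing by both standard deviations removes the units, so unlike the covariance the result can be compared across pairs of variables measured on different scales.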
Pelican Stores -- continued (data file: data_pelican.xls)
Managerial Report, continued
7. Compute the covariances and correlations.