1
1 Business 260: Managerial Decision Analysis. Professor David Mease. Lecture 1 Agenda: 1) Course web page 2) Greensheet 3) Numerical Descriptive Measures (Stats Book P. 107) 4) Simple Linear Regression (Stats Book P. 387)
2
2 Course web page: http://www.cob.sjsu.edu/mease_d/bus260 It is linked from my San Jose State web page: http://www.cob.sjsu.edu/mease_d/ It is also linked from my personal page, which is easily found by querying “David Mease” or simply “Mease” on Google.
3
3 Greensheet: You should have a hard copy of the greensheet. It can also be found on the course web page: http://www.cob.sjsu.edu/mease_d/bus260
4
4 Numerical Descriptive Measures (P. 107). Statistics for Managers Using Microsoft® Excel, 4th Edition
5
5 Chapter Topics: Measures of central tendency, variation, and shape; mean, median, mode, geometric mean; quartiles; range, interquartile range, variance and standard deviation, coefficient of variation; symmetric and skewed distributions; population summary measures (mean, variance, and standard deviation); the empirical rule; five number summary and box-and-whisker plots; coefficient of correlation.
6
6 Summary Measures (Describing Data Numerically). Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean. Quartiles. Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation. Shape: Skewness.
7
7 In class exercise #1: A sample of n=9 runners were asked how many miles they ran last week. Here is the data: 43 17 21 3 32 37 10 26 28 Describe the center of this data. (What are the mean, median and mode?)
8
8 In class exercise #2: How would your answer change for ICE #1 if the first runner actually ran 143 miles instead of 43? Here is the data: 143 17 21 3 32 37 10 26 28
10
10 In class exercise #3: How would your answer change for ICE #1 if there were a 10th runner who also ran 17 miles? Here is the data: 43 17 21 3 32 37 10 26 28 17
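For checking ICE #1 through ICE #3, here is a minimal Python sketch. It is not part of the original slides; the helper name `modes` is hypothetical, and it assumes the convention that a data set in which no value repeats has no mode.

```python
from collections import Counter
from statistics import mean, median


def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumed convention: every value occurs once, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


ice1 = [43, 17, 21, 3, 32, 37, 10, 26, 28]   # original n=9 runners
ice2 = [143, 17, 21, 3, 32, 37, 10, 26, 28]  # first runner changed to 143
ice3 = ice1 + [17]                           # a 10th runner who ran 17 miles

for name, data in [("ICE #1", ice1), ("ICE #2", ice2), ("ICE #3", ice3)]:
    print(name, "mean:", round(mean(data), 2),
          "median:", median(data), "mode(s):", modes(data))
```

Comparing the three printed lines shows which measures react to the outlier in ICE #2 and which react to the repeated value in ICE #3.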
12
12 Quartiles: Quartiles split the ranked data into 4 segments with an equal number of values per segment (25% each). The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger. Q2 is the same as the median (50% are smaller, 50% are larger). Only 25% of the observations are greater than the third quartile, Q3.
13
13 In class exercise #4: Compute the quartiles for the n=9 runners. 43 17 21 3 32 37 10 26 28
14
14 In class exercise #5: Compute the quartiles for the n=10 runners. 43 17 21 3 32 37 10 26 28 17
15
15 Quartile Formulas: Find a quartile by determining the value in the appropriate position in the ranked data, where n is the number of observed values. First quartile position: Q1 at (n+1)/4. Second quartile position: Q2 at (n+1)/2 (the median). Third quartile position: Q3 at 3(n+1)/4.
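The position formulas can be tried out with a short Python sketch. The slide gives only the positions, so the handling of fractional positions below is an assumption (a commonly taught rule): a whole position picks that ranked value, a position ending in .5 averages the two neighbouring values, and any other position is rounded to the nearest whole position. The function names are hypothetical.

```python
def value_at_position(ranked, pos):
    """Value at 1-based position `pos` in already-ranked data.

    Assumed rule (not stated on the slide): whole position -> that value;
    position ending in .5 -> average of the two neighbours;
    otherwise -> round to the nearest whole position.
    """
    whole = int(pos)
    frac = pos - whole
    if frac == 0:
        return ranked[whole - 1]
    if frac == 0.5:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[round(pos) - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) using positions (n+1)/4, (n+1)/2, 3(n+1)/4."""
    ranked = sorted(values)
    n = len(ranked)  # assumes at least 2 observations
    return (
        value_at_position(ranked, (n + 1) / 4),
        value_at_position(ranked, (n + 1) / 2),
        value_at_position(ranked, 3 * (n + 1) / 4),
    )


nine_runners = [43, 17, 21, 3, 32, 37, 10, 26, 28]
ten_runners = nine_runners + [17]

print(quartiles(nine_runners))
print(quartiles(ten_runners))
```

Other software may interpolate differently, so results from this sketch need not match every package.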
16
16 In class exercise #6: Redo ICE #4 and ICE #5 using these formulas and check that the answers are the same.
17
17 Five Number Summary: 1) Minimum 2) Q1 3) Q2 (median) 4) Q3 5) Maximum. A plot of the 5 number summary is called a “box-and-whisker plot”. [Figure: box-and-whisker plot marking X minimum, Q1, median (Q2), Q3, and X maximum]
18
18 In class exercise #7: Compute the five number summary for the sample of n=9 runners and draw the box-and-whisker plot. 43 17 21 3 32 37 10 26 28
20
20 Measures of Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation. Measures of variation give information on the spread or variability of the data values. [Figure: distributions with the same center but different variation]
21
21 Range: The simplest measure of variation. It is the difference between the largest and the smallest observations: Range = X largest - X smallest. Disadvantages: it ignores the distribution of the data and is sensitive to outliers.
22
22 In class exercise #8: Compute the range for the sample of n=9 runners. 43 17 21 3 32 37 10 26 28
23
23 Interquartile Range: Can eliminate some outlier problems by using the interquartile range. Eliminate some high- and low-valued observations and calculate the range from the remaining values. Interquartile range = 3rd quartile - 1st quartile = Q3 - Q1
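A small Python sketch of the range and the interquartile range for the n=9 runners follows. It is only an illustration: the helper `quartile_at` is a hypothetical name, and its treatment of fractional positions (average the two neighbours at .5, otherwise round to the nearest position) is an assumption layered on the (n+1)/4 and 3(n+1)/4 position formulas.

```python
def quartile_at(ranked, pos):
    """Value at 1-based position `pos`; fractional handling is an assumption."""
    whole = int(pos)
    frac = pos - whole
    if frac == 0:
        return ranked[whole - 1]
    if frac == 0.5:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[round(pos) - 1]


miles = [43, 17, 21, 3, 32, 37, 10, 26, 28]
ranked = sorted(miles)
n = len(ranked)

# Range = largest observation minus smallest observation.
data_range = ranked[-1] - ranked[0]

# Interquartile range = Q3 - Q1.
q1 = quartile_at(ranked, (n + 1) / 4)
q3 = quartile_at(ranked, 3 * (n + 1) / 4)
iqr = q3 - q1

print("range:", data_range, "Q1:", q1, "Q3:", q3, "IQR:", iqr)
```

Replacing 43 with 143 in `miles` changes the range a great deal but leaves the IQR unchanged, which illustrates the outlier point made on the slide.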
24
24 In class exercise #9: Compute the IQR (interquartile range) for the sample of n=9 runners. 43 17 21 3 32 37 10 26 28
25
25 Variance: The average (approximately) of the squared deviations of the values from the mean. Advantages: each value in the data set is used in the calculation, and values far from the mean are given extra weight (because the deviations are squared). Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$, where $\bar{X}$ = arithmetic mean, n = sample size, and $X_i$ = i-th value of the variable X.
26
26 In class exercise #10: Compute the variance for this sample of n=5. 10 20 30 40 50
27
27 Standard Deviation: The most commonly used measure of variation in business applications. It is simply the square root of the variance. It shows variation about the mean and has the same units as the original data. Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
28
28 In class exercise #11: Compute the standard deviation for this sample of n=5. 10 20 30 40 50
29
29 Coefficient of Variation: Measures relative variation. It is always expressed as a percentage (%). It shows variation relative to the mean, and it can be used to compare two or more sets of data measured in different units. $CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
30
30 In class exercise #12: Compute the coefficient of variation for this sample of n=5. 10 20 30 40 50
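ICE #10 through ICE #12 can be checked with a short Python sketch. It assumes the usual sample definitions: divide the sum of squared deviations by n - 1 for the variance, take the square root for the standard deviation, and divide the standard deviation by the mean (times 100) for the coefficient of variation. The variable names are illustrative only.

```python
from math import sqrt

data = [10, 20, 30, 40, 50]
n = len(data)

# Arithmetic mean.
x_bar = sum(data) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = sqrt(sample_variance)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = sample_sd / x_bar * 100

print("variance:", sample_variance)
print("standard deviation:", round(sample_sd, 4))
print("coefficient of variation (%):", round(cv_percent, 2))
```

The standard deviation comes out in the same units as the data, while the coefficient of variation is unit free.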
31
31 Using Excel Many of these are available by doing insert > function > statistical
32
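32 Using Excel: Many of these are available by doing insert > function > statistical. Examples: sample mean = AVERAGE; minimum = MIN; maximum = MAX; median = MEDIAN; sample standard deviation = STDEV; sample variance = VAR; quartiles = QUARTILE *doesn’t really work* (its results can differ from the position formulas used in this lecture); mode = MODE *doesn’t really work* (for example, when no value repeats or when several values tie).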
33
33 In class exercise #13: Check the mean and median for the sample of n=9 runners using Excel. 43 17 21 3 32 37 10 26 28
34
34 In class exercise #14: Check the variance and standard deviation for the sample of n=5 using Excel. 10 20 30 40 50
36
36 Shape of a Distribution: Describes how data are distributed. Measures of shape: symmetric or skewed. Left-skewed: Mean < Median. Symmetric: Mean = Median. Right-skewed: Median < Mean.
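The mean-versus-median comparison can be written as a tiny Python sketch. This is only the rule of thumb from the slide, not a formal skewness statistic, and the function name `describe_shape` is hypothetical. The two data sets are taken from the earlier in class exercises.

```python
from statistics import mean, median


def describe_shape(values):
    """Classify shape by comparing mean and median (rule of thumb only)."""
    m, med = mean(values), median(values)
    if m < med:
        return "left-skewed"
    if m > med:
        return "right-skewed"
    return "symmetric"


runners_with_outlier = [143, 17, 21, 3, 32, 37, 10, 26, 28]  # data from ICE #2
evenly_spaced = [10, 20, 30, 40, 50]                         # data from ICE #10

print(describe_shape(runners_with_outlier))
print(describe_shape(evenly_spaced))
```

With real data the mean and median are rarely exactly equal, so in practice a small difference is usually read as roughly symmetric; the strict equality test here is a simplification.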
37
37 Distribution Shape and Box-and-Whisker Plot [Figure: box-and-whisker plots for left-skewed, symmetric, and right-skewed distributions, each marking Q1, Q2, and Q3]
38
38 In class exercise #15: Below is the histogram for the 1500 California house prices at http://www.cob.sjsu.edu/mease_d/bus260/houses.xls. How would you describe the shape of the data based on the histogram? Confirm this by A) comparing the mean and median B) making the box-and-whisker plot
39
39 The Empirical Rule: If the data distribution is close to being bell-shaped, then the interval mean ± 1 standard deviation ($\mu \pm 1\sigma$ for a population) contains about 68% of the values in the population or the sample.
40
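40 The Empirical Rule (continued): The interval mean ± 2 standard deviations ($\mu \pm 2\sigma$ for a population) contains about 95% of the values in the population or the sample. The interval mean ± 3 standard deviations ($\mu \pm 3\sigma$ for a population) contains about 99.7% of the values in the population or the sample.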
41
41 In class exercise #16: Give the empirical rule for a population for which the mean is 100 and the standard deviation is 10.
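A minimal Python sketch for ICE #16, assuming the usual statement of the empirical rule (about 68%, 95%, and 99.7% of values within 1, 2, and 3 standard deviations of the mean for bell-shaped data). The function name `empirical_rule` is hypothetical.

```python
def empirical_rule(mu, sigma):
    """Return (lower, upper, approximate coverage) for mu +/- k*sigma, k = 1, 2, 3."""
    coverage = {1: "68%", 2: "95%", 3: "99.7%"}
    return [(mu - k * sigma, mu + k * sigma, pct) for k, pct in coverage.items()]


intervals = empirical_rule(mu=100, sigma=10)
for lower, upper, pct in intervals:
    print(f"about {pct} of values between {lower} and {upper}")
```

The percentages are approximations that apply only when the distribution is close to bell-shaped.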
42
42 In class exercise #17: Compare the empirical rule to the observed percentages for the 1500 house prices (houses.xls).
43
43 Coefficient of Correlation (r): To understand the coefficient of correlation, it helps to start with scatter diagrams… [Figure: four scatter diagrams of Y against X, with r = -1, r = -.6, r = +.3, and r = +1]
44
44 Scatter Diagrams: Scatter diagrams are used for bivariate numerical data. Bivariate data consist of paired observations taken from two numerical variables. The scatter diagram: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis.
45
45 Scatter Diagram Example
Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200
46
46 Scatter Diagrams in Excel: 1) Select “Insert” > “Chart” 2) Select the XY (Scatter) option, then click “Next” 3) The data range is the y values, and the x values go under the “Series” tab. Important: Don’t include column names.
47
47 In class exercise #18: The file http://www.cob.sjsu.edu/mease_d/football.xls gives the total number of wins for each of the 117 Division 1A college football teams for the 2003 and 2004 seasons. Use Excel to make a scatter diagram for this data. Put 2003 wins on the x-axis.
49
49 Coefficient of Correlation (r): Measures the relative strength and direction of the linear relationship between two variables. It is equal to the square root of R-squared, but will be negative if the relationship is negative. I will NOT make you compute this by hand, but this is the formula if you are curious: $r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$
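For the curious, here is a small Python sketch of the coefficient of correlation using the standard Pearson formula (sum of cross-deviations divided by the square root of the product of the two sums of squared deviations). The data are the volume and cost values from the scatter diagram example, read here as nine (volume, cost) pairs, which is an assumption about how that table was laid out. The function name `correlation` is illustrative.

```python
from math import sqrt

# Assumed pairing of the scatter diagram example: (volume per day, cost per day).
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def correlation(xs, ys):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(volume, cost)
print("r =", round(r, 3))
```

Because r is unit free, rescaling either variable (for example, cost in cents instead of dollars) leaves it unchanged.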
50
50 Features of Correlation Coefficient, r: Unit free. Ranges between -1 and 1. The closer to -1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker any linear relationship.
51
51 In class exercise #19: Match each plot with its correct coefficient of correlation. Choices: r=-3.20, r=-0.98, r=0.86, r=0.95, r=1.20, r=-0.96, r=-0.40 [Figure: five scatter plots labeled A) through E)]
52
52 Simple Linear Regression (P. 387). Statistics for Managers Using Microsoft® Excel, 4th Edition
53
53 The Least Squares Regression Line (described on pages 387-398): It is the line that fits the data best, as determined by minimizing the sum of the squared vertical differences between the points and the line. The coefficient of correlation (r) measures the strength and direction of the linear relationship (positive = up, negative = down). R-squared also measures the strength of the linear relationship, but not the direction.
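As a rough illustration outside Excel, the least squares line can be computed in a few lines of Python with the standard closed-form formulas (slope = sum of cross-deviations over sum of squared x deviations; intercept = mean of y minus slope times mean of x). The data are the volume and cost values from the scatter diagram example, read as nine (volume, cost) pairs, which is an assumption about that table. The prediction at a volume of 45 is a hypothetical value chosen only to show how the line is used.

```python
# Assumed pairing of the scatter diagram example: (volume per day, cost per day).
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def least_squares(xs, ys):
    """Return (slope, intercept) of the line minimizing squared vertical differences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = least_squares(volume, cost)

# R-squared = 1 - (sum of squared residuals) / (total sum of squares).
y_bar = sum(cost) / len(cost)
residuals = [y - (intercept + slope * x) for x, y in zip(volume, cost)]
sse = sum(e ** 2 for e in residuals)
sst = sum((y - y_bar) ** 2 for y in cost)
r_squared = 1 - sse / sst

predicted_cost = intercept + slope * 45  # hypothetical volume of 45

print("slope:", round(slope, 3), "intercept:", round(intercept, 2))
print("R-squared:", round(r_squared, 3))
print("predicted cost at volume 45:", round(predicted_cost, 1))
```

The slope is read as the estimated change in cost per day for each one-unit increase in volume per day.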
54
54 Adding the Least Squares Regression Line Using Excel, step 1: Click once on the graph.
55
55 Adding the Least Squares Regression Line Using Excel, step 2: From the “Chart” menu select “Add Trendline”.
56
56 Adding the Least Squares Regression Line Using Excel, step 3: Choose the first choice (“Linear”) and press “OK”.
57
57 Adding the Least Squares Regression Line Using Excel, step 4: The line should now appear on your scatter diagram. Double click on the line, then under the “Options” tab check the last two boxes.
58
58 In class exercise #20: A) Graph the least squares regression line for the football data on the scatter diagram using Excel. B) Give the equation of the least squares regression line using Excel. C) What is the slope of the least squares regression line? D) Interpret the slope of the least squares regression line. E) What is the coefficient of correlation? F) What is the value of R-squared? G) Use the least squares regression line to predict the number of 2004 wins for a team that won 12 games in 2003.