Statistics Lecture Notes Dr. Halil İbrahim CEBECİ Chapter 03 Numerical Descriptive Techniques
Numerical Descriptive Techniques
Measures of Central Location: Mean, Median, Mode
Measures of Variability: Range, Standard Deviation, Variance, Coefficient of Variation
Measures of Relative Standing: Percentiles, Quartiles
Measures of Linear Relationship: Covariance, Correlation, Least Squares Line
Measures of Central Location
Arithmetic Mean: It is computed by adding up all the observations and dividing by the total number of observations. The mean is seriously affected by extreme values, called “outliers”. For example, as soon as a billionaire moves into a neighborhood, the average household income rises above its previous level.
Ex3.1 – The weights of the students in an elementary school classroom are given below. Calculate the arithmetic mean. If a new student weighing 140 lbs joins the class, what would the new mean of the classroom be? Discuss the validity of the results.
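The effect asked about in Ex3.1 can be sketched in Python. The exercise's weight table is not reproduced in these notes, so the `weights` list below is hypothetical and serves only to show how a single 140 lb observation pulls the mean upward.

```python
from statistics import mean

# Hypothetical weights (lbs) for an elementary school class; the
# exercise's actual table is not available, so these are assumed values.
weights = [60, 62, 65, 58, 70, 66, 63, 60]

mean_before = mean(weights)

# A new student weighing 140 lbs joins the class (value taken from Ex3.1).
mean_after = mean(weights + [140])

print(f"mean before: {mean_before:.2f} lbs")
print(f"mean after:  {mean_after:.2f} lbs")
```

With these assumed values the new mean is higher than the weight of every original student, so it no longer describes a typical member of the class.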
Median: The value that falls in the middle of the ordered (ascending or descending) list of observations. When there is an even number of observations, the median is determined by averaging the two observations in the middle.
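A minimal sketch of the median rule described above, covering both the odd and the even case. The function name and the sample values are illustrative, not taken from the notes.

```python
def median(values):
    """Return the middle value of the sorted data, or the average of
    the two middle values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical observations
odd_sample = [7, 3, 9, 1, 5]
even_sample = [7, 3, 9, 1, 5, 11]

print(median(odd_sample))   # 5
print(median(even_sample))  # 6.0
```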
Mode: The observation that occurs with the greatest frequency. There are several problems with using the mode as a measure of central location: in a small sample it may not be a very good measure, and it may not be unique.
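The non-uniqueness problem can be made concrete with a short sketch. The helper `modes` is a hypothetical name; it returns every value that attains the highest frequency, so a tie shows up as more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(modes([2, 3, 3, 5, 7]))   # [3]
print(modes([2, 2, 3, 3, 5]))   # [2, 3]
print(modes([4, 8, 15]))        # [4, 8, 15]
```

In the last case every value occurs once, so every value is a mode, which illustrates why the mode can be uninformative in a small sample.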
Geometric Mean
Measures of Variability
Measures of central location fail to tell the whole story about the distribution; that is, how much are the observations spread out around the mean value? For example, two sets of class grades are shown. The mean (=50) is the same in each case, but the red class has greater variability than the blue class.
Variance: A measure of the average squared distance between each of a set of data points and their mean value. In budgeting, the same word has a different meaning: the difference between what is expected (budget) and the actuals (expenditure), that is, between "should take" and "did take".
Ex3.6 – The following are the numbers of summer jobs that a sample of six students applied for. Find the mean and variance of these data.
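A sketch of the calculation Ex3.6 asks for. The data table is not reproduced in these notes, so the six values in `jobs` are assumed for illustration. Because the data are a sample, the sum of squared deviations is divided by n − 1.

```python
def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean,
    divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

# Assumed values for six students (not the exercise's actual table).
jobs = [17, 15, 23, 7, 9, 13]

jobs_mean = sum(jobs) / len(jobs)
jobs_variance = sample_variance(jobs)

print(f"mean = {jobs_mean}")               # mean = 14.0
print(f"variance = {jobs_variance:.1f}")   # variance = 33.2
```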
Standard Deviation: Shows how much variation or "dispersion" exists from the average (mean, or expected value). A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data points are spread out over a large range of values.
Ex3.7 – Find the standard deviation of the values in Ex3.6.
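The standard deviation is the square root of the variance, which the sketch below checks with Python's `statistics` module. The `jobs` values are assumed, since the Ex3.6 table is not reproduced here. The coefficient of variation, listed in the chapter outline but not defined in these notes, is taken here in its usual form: standard deviation divided by mean.

```python
import math
from statistics import mean, stdev, variance

# Assumed values for six students (not the exercise's actual table).
jobs = [17, 15, 23, 7, 9, 13]

s = stdev(jobs)        # sample standard deviation (divides by n - 1)
s2 = variance(jobs)    # sample variance

# The standard deviation is the square root of the variance.
assert math.isclose(s, math.sqrt(s2))

# Coefficient of variation: spread relative to the size of the mean.
cv = s / mean(jobs)

print(f"standard deviation = {s:.2f}")
print(f"coefficient of variation = {cv:.2f}")
```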
Interpreting the Standard Deviation: If the histogram is bell-shaped:
Approximately 68% of all observations fall within one standard deviation of the mean.
Approximately 95% of all observations fall within two standard deviations of the mean.
Approximately 99.7% of all observations fall within three standard deviations of the mean.
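These three percentages can be checked numerically. The sketch assumes an exactly normal distribution, the idealized bell shape, for which the proportion within k standard deviations equals erf(k/√2). The function name is illustrative.

```python
import math

def normal_within(k):
    """Probability that a normal variable lies within k standard
    deviations of its mean: P(|Z| < k) = erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sd: {normal_within(k):.1%}")
# within 1 sd: 68.3%
# within 2 sd: 95.4%
# within 3 sd: 99.7%
```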
Chebysheff’s Theorem
Ex3.9 – If we know that the distribution of the values in Ex3.8 is bell-shaped, the new answers are:
a. 66 and 78? Approximately 68% of the grades fall within this interval (1 standard deviation).
b. 60 and 84? Approximately 95% of the grades fall within this interval (2 standard deviations).
c. 54 and 90? Approximately 99.7% of the grades fall within this interval (3 standard deviations).
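For comparison, the same intervals can be evaluated with Chebysheff's Theorem in its standard form: for any distribution and any k > 1, at least 1 − 1/k² of the observations lie within k standard deviations of the mean. The mean of 72 and standard deviation of 6 used below are inferred from the interval endpoints in Ex3.9; they are not stated in the text.

```python
def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations
    of the mean, valid for any distribution when k > 1."""
    if k <= 1:
        return 0.0  # the theorem gives no useful bound for k <= 1
    return 1 - 1 / k**2

# Inferred from the Ex3.9 intervals (assumption, not given in the notes).
mean, sd = 72, 6

for low, high in [(66, 78), (60, 84), (54, 90)]:
    k = (high - mean) / sd
    bound = chebysheff_lower_bound(k)
    print(f"{low} to {high}: k = {k:g}, at least {bound:.1%}")
```

For k = 2 and k = 3 the bounds are 75% and about 88.9%, lower than the 95% and 99.7% of the bell-shaped case, because the theorem assumes nothing about the shape of the distribution.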
Measures of Relative Standing
Ex3.10 – The weights in pounds of a group of workers are as follows:
a. Find the 25th percentile of the weights.
b. Find the 50th percentile of the weights.
c. Find the 75th percentile of the weights.
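One common way to locate a percentile is sketched below. It assumes the location rule L = (n + 1)·P/100 with linear interpolation between neighbouring ranks; the notes do not show which rule was used, and other conventions give slightly different answers. The `weights` list is hypothetical because the exercise's data are not reproduced here.

```python
def percentile(values, p):
    """p-th percentile using location L = (n + 1) * p / 100, with
    linear interpolation between the neighbouring ranked values."""
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100   # 1-based position in the ranked data
    lower = int(location)          # rank at or just below the location
    fraction = location - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

# Hypothetical worker weights in pounds (assumed values).
weights = [170, 150, 185, 162, 155, 200, 165, 178, 160, 172]

print(percentile(weights, 25))  # 158.75
print(percentile(weights, 50))  # 167.5
print(percentile(weights, 75))  # 179.75
```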
Box Plots
Ex3.11 – Use the weights in pounds of the group of workers in Ex3.10 to construct a box plot. Compute the interquartile range and identify any outliers.
A3.11 – The first step in constructing a box plot is to rank the data in order to determine the numerical values of the five descriptive statistics to be plotted.
Descriptive Statistics / Numerical Value: … and 203,5
There are no outliers.
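The numbers behind a box plot can be sketched as follows. Two assumptions are made: quartiles are computed with `statistics.quantiles` using its "exclusive" method, and an outlier is any value more than 1.5 times the interquartile range beyond a quartile, which is the usual convention but is not stated in these notes. The `weights` list is hypothetical.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Five-number summary, interquartile range, and outliers under the
    1.5 * IQR convention (an assumed rule, see the text above)."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {
        "min": ordered[0], "q1": q1, "median": q2, "q3": q3,
        "max": ordered[-1], "iqr": iqr,
        "fences": (low_fence, high_fence), "outliers": outliers,
    }

# Hypothetical worker weights in pounds (assumed values).
weights = [170, 150, 185, 162, 155, 200, 165, 178, 160, 172]

summary = box_plot_summary(weights)
print(summary)

# Adding one extreme value shows how an outlier is flagged.
print(box_plot_summary(weights + [260])["outliers"])  # [260]
```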
Wendy’s service time is the shortest and least variable. Hardee’s has the greatest variability, and Jack-in-the-Box has the longest service times.
Measures of Linear Relationship
Covariance: A measure of how much two variables change together.
Coefficient of Correlation: Correlation is a scaled version of covariance that takes on values in [−1,1], with a correlation of ±1 indicating perfect linear association and 0 indicating no linear relationship.
Ex3.12 – Based on the sample of data shown below, measure how these two variables are related by computing their covariance and coefficient of correlation.
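A minimal sketch of both calculations, using the sample (n − 1) form of the covariance and dividing by the two sample standard deviations to obtain the correlation. The `x` and `y` values are hypothetical; the exercise's data table is not reproduced in these notes.

```python
import math

def sample_covariance(x, y):
    """Sample covariance: average product of paired deviations from
    the means, with n - 1 in the denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Coefficient of correlation: covariance scaled by both standard
    deviations, so the result always lies in [-1, 1]."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical paired observations (assumed values).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

print(f"covariance  = {sample_covariance(x, y):.2f}")   # covariance  = 2.00
print(f"correlation = {correlation(x, y):.3f}")         # correlation = 0.853
```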
Least Squares Method: Produces a straight line drawn through the points so that the sum of squared deviations between the points and the line is minimized.
Ex – Find the least squares (regression) line for the data in Ex3.12 using the shortcut formulas for the coefficients.
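The slide's own shortcut formulas are not reproduced in these notes. The sketch below assumes the commonly taught ones: s_xy = [Σxy − (Σx)(Σy)/n]/(n − 1) and s_x² = [Σx² − (Σx)²/n]/(n − 1), with slope b1 = s_xy / s_x² and intercept b0 = ȳ − b1·x̄. The data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (b0, b1) for the line y = b0 + b1 * x, computed from the
    shortcut sums rather than from individual deviations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)

    s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)   # sample covariance
    s_x2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)      # sample variance of x

    b1 = s_xy / s_x2                  # slope
    b0 = sum_y / n - b1 * sum_x / n   # intercept
    return b0, b1

# Hypothetical paired observations (assumed values).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]

b0, b1 = least_squares_line(x, y)
print(f"y = {b0:.1f} + {b1:.1f}x")   # y = 1.8 + 0.8x
```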
Exercises
Q3.1 – Consider the following sample of measurements. Compute each of the following:
a. the mean
b. the median
c. the mode
Q3.2 – Consider the following sample of data. Compute each of the following for this sample:
a. the mean
b. the range
c. the variance
d. the standard deviation
Q3.3 – Consider the following sample of data.
a. Construct a box plot for these data.
b. Compute the interquartile range and identify any outliers.
Q3.4 – Based on the sample of data shown below:
a. Calculate the covariance.
b. Calculate the coefficient of correlation.
c. Find the equation of the least squares line.