E370 Statistical Analysis for Bus & Econ Chapter 3: Summary Statistics.


1 E370 Statistical Analysis for Bus & Econ Chapter 3: Summary Statistics

2 Objectives:
- Be able to import data into Excel.
- Be able to generate useful summary statistics from the data using Excel. (Data Crunch)
- Be able to interpret summary statistics and use them to describe a dataset. (Value Added)

3 Why do we care? These days the problem is not a lack of data but an overwhelming amount of it. We need to extract information from the datasets we have. To do that, we:
- condense data into tables and graphs
- summarize data into descriptive statistics

4 Overview: Summary of a dataset (3 dimensions)
- Center: Where are the data values concentrated? What seem to be typical or middle data values?
- Dispersion: The scattering or spread of data around its center. How much variation is there in the data?
- Shape: Are the data values distributed symmetrically? Skewed?

5 Measures of Center:
Statistic | Formula | Excel Command | Pros and Cons
Mean | Sum of all values divided by the number of values | =AVERAGE(Array) | Pros: uses all the information. Cons: sensitive to extreme values.
Median | Middle value in the sorted array | =MEDIAN(Array) | Pros: robust to extreme values. Cons: does not use all the information.
Mode | Most frequently occurring data value | =MODE.MULT(Array) | Pros: the only measure of center for nominal data. Cons: unreliable (there may be several modes, or none).
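As a rough Python analogue of the Excel commands above (an illustration, not part of the course material), the standard-library `statistics` module provides `mean`, `median`, and `multimode`. The salary figures below are made up; they are chosen so that one extreme value pulls the mean far more than the median.

```python
import statistics

# Hypothetical salaries in thousands of dollars; the last value is an outlier.
salaries = [40, 45, 45, 50, 55, 60, 400]

mean_value = statistics.mean(salaries)      # comparable to =AVERAGE(Array)
median_value = statistics.median(salaries)  # comparable to =MEDIAN(Array)
modes = statistics.multimode(salaries)      # roughly comparable to =MODE.MULT(Array)

# Recompute without the outlier to see which measure is more robust.
trimmed = salaries[:-1]
mean_without_outlier = statistics.mean(trimmed)
median_without_outlier = statistics.median(trimmed)

print("with outlier:   ", mean_value, median_value, modes)
print("without outlier:", mean_without_outlier, median_without_outlier)
```

Removing the single extreme value roughly halves the mean, while the median moves only from 50 to 47.5.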

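6 Measures of Dispersion:
Statistic | Formula | Excel Command
Range | Largest value minus smallest value | =MAX(Array)-MIN(Array)
Variance | Population: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2$; Sample: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$ | Population: =VAR.P(Array); Sample: =VAR.S(Array)
Standard Deviation | Population: $\sigma = \sqrt{\sigma^2}$; Sample: $s = \sqrt{s^2}$ | Population: =STDEV.P(Array); Sample: =STDEV.S(Array)
Coefficient of Variation (CV) | Population: $\frac{\sigma}{\mu}\times 100$; Sample: $\frac{s}{\bar{x}}\times 100$ | Population: =STDEV.P(Array)/AVERAGE(Array)*100; Sample: =STDEV.S(Array)/AVERAGE(Array)*100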

7 Measures of Dispersion:
1. Range: the distance between the largest value and the smallest value in the dataset.
2. Variance: the average squared distance of observations from their mean. Its "squared" units are difficult to interpret.
3. Standard Deviation: a type of average distance of observations from their mean, calculated by taking the square root of the variance.
4. Coefficient of Variation (CV): a measure of "relative" dispersion (unit-free). It is useful for comparing the dispersion of variables measured in different units or with different means.
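The dispersion measures can be sketched in Python with the standard-library `statistics` module, whose `pvariance`/`pstdev` and `variance`/`stdev` functions correspond to the population and sample versions. The height data and the helper `coefficient_of_variation` are hypothetical, introduced only to show that the CV does not depend on the unit of measurement.

```python
import statistics

# Hypothetical data: the same five heights in centimetres and in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

data_range = max(heights_cm) - min(heights_cm)     # like =MAX(Array)-MIN(Array)
pop_variance = statistics.pvariance(heights_cm)    # like =VAR.P(Array), divides by N
sample_variance = statistics.variance(heights_cm)  # like =VAR.S(Array), divides by n-1
pop_sd = statistics.pstdev(heights_cm)             # like =STDEV.P(Array)
sample_sd = statistics.stdev(heights_cm)           # like =STDEV.S(Array)

# The standard deviation changes with the unit; the CV does not.
cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(data_range, pop_variance, sample_variance)
print(round(pop_sd, 3), round(sample_sd, 3))
print(round(cv_cm, 3), round(cv_m, 3))
```

Because the sample formulas divide by n-1 rather than N, the sample variance is always the larger of the two for the same data.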

8 Shape of Distribution: By comparing the three measures of center:
a. Mean > Median (> Mode): positively or right-skewed
b. Mean = Median (= Mode): symmetric
c. Mean < Median (< Mode): negatively or left-skewed
The tail points in the direction of skewness.

9 Shape of Distribution (cont'd): By using Pearson's Second Skewness Coefficient:
a. Pearson's Second Skewness Coefficient > 0: positively or right-skewed
b. Pearson's Second Skewness Coefficient = 0: symmetric
c. Pearson's Second Skewness Coefficient < 0: negatively or left-skewed
Pearson's Second Skewness Coefficient = 3 * (mean - median) / standard deviation
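A minimal Python sketch of the skewness rule follows. The slide does not say which standard deviation to use, so the sample standard deviation is an assumption here; the function names, the tolerance, and the three small datasets are likewise hypothetical.

```python
import statistics

def pearson_second_skewness(data):
    """3 * (mean - median) / standard deviation.

    Assumption: the sample standard deviation is used. Data with zero
    spread would cause a division by zero and is not handled here.
    """
    spread = statistics.stdev(data)
    return 3 * (statistics.mean(data) - statistics.median(data)) / spread

def describe_shape(data, tolerance=1e-9):
    """Classify the shape from the sign of the coefficient."""
    coefficient = pearson_second_skewness(data)
    if coefficient > tolerance:
        return "right-skewed"
    if coefficient < -tolerance:
        return "left-skewed"
    return "symmetric"

# Made-up datasets with a long right tail, no tail, and a long left tail.
right_tail = [1, 2, 2, 3, 3, 3, 20]
balanced = [1, 2, 3, 4, 5]
left_tail = [1, 18, 19, 19, 20, 20, 20]

for sample in (right_tail, balanced, left_tail):
    print(sample, describe_shape(sample))
```

With real data the coefficient is rarely exactly zero, so in practice a value close to zero is read as approximately symmetric.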

10 Summary Statistics

11 Excel Commands:
Excel Command | Output
=AVERAGE(Array) | Mean of the data
=MEDIAN(Array) | Median of the data (the average of the two middle values when the count is even)
=MODE.MULT(Array) | Mode(s) of the data
=VAR.P(Array) | Population variance
=STDEV.P(Array) | Population standard deviation
=VAR.S(Array) | Sample variance
=STDEV.S(Array) | Sample standard deviation
=MAX(Array) | Largest number in the data
=MIN(Array) | Smallest number in the data
=MAX(Array)-MIN(Array) | Range of the data
Data/Data Analysis/Descriptive Statistics | Table of selected descriptive statistics
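For readers working outside Excel, the commands in this table can be gathered into one small Python helper built on the standard-library `statistics` module. The function `summarize` and the exam scores are hypothetical, and the result is only a sketch; it is not meant to reproduce the exact layout or contents of Excel's Descriptive Statistics output.

```python
import statistics

def summarize(data):
    """Return the summary statistics listed in the table as a dictionary."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "modes": statistics.multimode(data),
        "population variance": statistics.pvariance(data),
        "population sd": statistics.pstdev(data),
        "sample variance": statistics.variance(data),
        "sample sd": statistics.stdev(data),
        "max": max(data),
        "min": min(data),
        "range": max(data) - min(data),
    }

# Made-up exam scores used only to exercise the helper.
scores = [62, 70, 70, 75, 81, 88, 95]
summary = summarize(scores)

for name, value in summary.items():
    print(f"{name:>20}: {value}")
```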

