Descriptive Statistics Prerequisite Material MGS 8110 Regression & Forecasting.

Presentation transcript:

Descriptive Statistics Prerequisite Material MGS 8110 Regression & Forecasting

L00D MGS Descriptive Statistics 2 Descriptive Analysis of Data. Given a bunch of data (simple data for only one variable), what is the best way to summarize the data?

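3 Measures of Central Tendency
Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Median: $x_{((n+1)/2)}$ if n is odd; $\left(x_{(n/2)} + x_{(n/2+1)}\right)/2$ if n is even, where $x_{(k)}$ is the k-th smallest value
Mode: the $x_i$ value that occurs most frequently
Mid Range: $(x_{\max} + x_{\min})/2$
Mid InterQuartile: $(Q_1 + Q_3)/2$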
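The five measures on this slide can be sketched in Python with the standard library. This is only an illustration: the sample values are hypothetical, and the quartiles use the "inclusive" interpolation method on the assumption that it corresponds to the Excel Quartile function used later in the deck.

```python
import statistics

def central_tendency(values):
    """Return the five central-tendency measures named on the slide."""
    # Assumption: "inclusive" quartiles stand in for Excel's Quartile().
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "mid_range": (max(values) + min(values)) / 2,
        "mid_interquartile": (q1 + q3) / 2,
    }

# Hypothetical sample, not taken from the slides.
sample = [2, 4, 4, 5, 7, 9, 10, 12]
for name, value in central_tendency(sample).items():
    print(name, value)
```

For this sample the mean and the mid interquartile happen to agree, while the mode sits well below both, which is one reason the mode is reserved for categorical data.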

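4 Measures of Variability
Av Deviate: $\frac{1}{n}\sum (x_i - \bar{x})$ (always zero)
Av Absolute Deviate: $\frac{1}{n}\sum |x_i - \bar{x}|$
Variance: $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$
Standard Deviation: $s = \sqrt{s^2}$
Range: $x_{\max} - x_{\min}$
Inter Quartile Range (IQR): $Q_3 - Q_1$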
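A minimal Python sketch of these variability measures follows. The function name and sample are hypothetical; the variance and standard deviation divide by n-1, and the quartiles again assume the "inclusive" interpolation method.

```python
import statistics

def variability(values):
    """Return the variability measures listed on the slide."""
    n = len(values)
    mean = statistics.mean(values)
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        # Deviations from the mean cancel out, so this is always ~0.
        "av_deviate": sum(x - mean for x in values) / n,
        "av_absolute_deviate": sum(abs(x - mean) for x in values) / n,
        "variance": statistics.variance(values),        # divides by n-1
        "standard_deviation": statistics.stdev(values),  # square root of the variance
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

# Hypothetical sample, not taken from the slides.
print(variability([1, 2, 3, 4, 5]))
```

The average deviate comes out as zero for any data set, which is why it is not a usable measure of spread.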

5 Let's Review - Statistical Precepts. Use the Mean for quantitative, symmetric data. Use the Median for quantitative, non-symmetric data. Use the Mode for categorical data. Use the Variance when doing calculations. Use the Standard Deviation when presenting the results of the calculations. (Major Teaching Points are frequently shown in green boxes.)

6 Appropriate Statistics discussed in chapter 2

7 More Review – Need to memorize formulas. Variance: used for calculations, but not for presentations; units are squared (e.g., inches squared). Standard Deviation: used for presentations; common units (e.g., inches). Divide by n-1 instead of n in order to get an unbiased estimate of the variance.

8 Interpretation of Standard Deviation (1 of 2). Statistical Precepts, if the data is normally distributed: two-thirds of the data is contained within ± one sigma of the mean; 95% of the data is contained within ± two sigma; almost all of the data is contained within ± three sigma.
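These three precepts can be checked against the Normal curve itself. The sketch below uses Python's `statistics.NormalDist`; the helper name `coverage_within` is made up for illustration.

```python
from statistics import NormalDist

def coverage_within(k):
    """Share of a Normal distribution within +/- k standard deviations of the mean."""
    z = NormalDist()  # standard Normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/- {k} sigma: {coverage_within(k):.4f}")
```

The one-sigma share is about 68%, which the slides round to "two-thirds".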

9 Interpretation of Standard Deviation (2 of 2). If ever asked to explain what the Standard Deviation means, say “two-thirds of the data will be within plus or minus one Standard Deviation from the mean”. If ever asked for the “worst case” or “best case” outcome, calculate “mean – (2)sigma” and/or “mean + (2)sigma”. Statistical Precepts: Definition of Standard Deviation - two-thirds of the data is contained in a range of values that is two sigma wide. Worst case outcome is μ – 2σ. Best case outcome is μ + 2σ.

10 Other Measures. Percentiles (measure of tails): $x_p$ is the $x_i$ such that p% of the $x_i < x_p$. Quartiles (measure of tails): percentiles where p = 25, 50 or 75, giving the lower, middle or upper quartile.
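One common way to compute a percentile from a sample is linear interpolation between the sorted values. The sketch below assumes the "(n-1)·p" position rule, which is believed to be what Excel's PERCENTILE function uses; the function name and the height values are hypothetical.

```python
def percentile_inclusive(values, p):
    """Interpolated percentile; p is a fraction between 0 and 1."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(values)
    position = (len(ordered) - 1) * p   # fractional index into the sorted data
    lower = int(position)
    fraction = position - lower
    if lower + 1 >= len(ordered):       # p == 1 lands on the largest value
        return float(ordered[-1])
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])

# Hypothetical heights in inches, not the data behind the slides.
heights = [66, 67, 68, 69, 70, 71, 72, 73, 74, 75]
print(percentile_inclusive(heights, 0.01))  # falls between the two smallest values
print(percentile_inclusive(heights, 0.25))  # lower quartile
print(percentile_inclusive(heights, 0.50))  # median
```

With only ten observations, the 1% percentile is an interpolation between the two smallest values, so it can never fall below the sample minimum.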

11 Other Measures. Coefficient of Variation (percentage measure of variability): $CV = s / \bar{x}$. Correlation Coefficient (measure of linear association): $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y}$.
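Both measures are short enough to write out by hand. The following is a sketch with hypothetical function names and toy data; the correlation is the usual Pearson formula, written so that the (n-1) factors cancel.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy data, not taken from the slides.
print(coefficient_of_variation([2, 4, 6]))
print(correlation([1, 2, 3], [2, 4, 6]))   # perfectly increasing line
print(correlation([1, 2, 3], [6, 4, 2]))   # perfectly decreasing line
```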

12 Interpretation of Correlation Coefficient. [Scatter plots, n=25: ρ=0, ρ=.8, ρ=1]

13 Interpretation of Correlation Coefficient. [Scatter plots, n=25: ρ=0, ρ=-.8, ρ=-1]

14 Interpretation of Correlation Coefficient

15 Interpretation of Correlation Coefficient (1 of 2). Statistical tests of correlation coefficients are relatively meaningless. These tests are based on the hypothesis that “ρ = 0”. Based on the previous graph, knowing that a correlation coefficient is greater than zero is not necessarily a valuable piece of information. In terms of “Practical Significance” (compared to “Statistical Significance”), the correlation coefficient has to be at least greater than .5. From the previous graph it can be seen that ρ = .5 only explains 25% of the variability in the data. Statistical Precept: ρ must be greater than .5 to be of Practical Significance.

16 Other Measures (page 3 of 3). Skew (measure of symmetry). Kurtosis (measure of peakedness).

17 Verifying Bell Shape (Normal Distribution). Negative Skew if the distribution has a ‘long’ tail to the left, measured as skewness < -1. Positive Skew if the distribution has a ‘long’ tail to the right (more common situation), measured as skewness > +1. Symmetric if -1 < skewness < +1. Peaked Distribution if Kurtosis is a large positive number (> +1). Flat Distribution if Kurtosis is a large negative number (< -1). Normal shape (proportionally S-shaped sides) if Kurtosis is near zero.

18 Verifying Bell Shape (Normal Distribution). Statistical Precept: Bell shaped (Normally distributed) if -1 < skewness < +1 and if -1 < kurtosis < +1.
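The rule of thumb can be turned into a small check. The sketch below uses the bias-adjusted sample formulas for skewness and excess kurtosis, on the assumption that these are the formulas behind Excel's SKEW and KURT functions; all function names are hypothetical.

```python
import math

def _mean_and_stdev(values):
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n, mean, s

def sample_skewness(values):
    """Bias-adjusted sample skewness (needs at least 3 values)."""
    n, mean, s = _mean_and_stdev(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

def sample_kurtosis(values):
    """Bias-adjusted excess kurtosis (needs at least 4 values); 0 for a Normal shape."""
    n, mean, s = _mean_and_stdev(values)
    if n < 4:
        raise ValueError("need at least 4 values")
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def looks_bell_shaped(values):
    """Rule of thumb from the slide: both measures strictly between -1 and +1."""
    return -1 < sample_skewness(values) < 1 and -1 < sample_kurtosis(values) < 1

# Evenly spaced toy data: symmetric (skew 0) but flat (kurtosis -1.2).
flat = [1, 2, 3, 4, 5]
print(sample_skewness(flat), sample_kurtosis(flat), looks_bell_shaped(flat))
```

Evenly spaced data is perfectly symmetric yet fails the test on kurtosis alone, which shows that the two conditions are checked independently.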

19 What is the best way to summarize data? (our original question) Central Tendency (Mean, Median & Mode); Variability (Variance & Standard Deviation); Shape (Percentiles, Skewness & Kurtosis); Association (Correlation).

20 Notations

21 Standard Deviation of Sample Mean. Called the “Standard Error” of the Mean: $s_{\bar{x}} = s / \sqrt{n}$.

22 Insert / Function examples (1 of 3)
Mean: Average(A1:A10)
Median: Median(A1:A10)
Mode: Mode(A1:A10)
Mid Range: ( MAX(A1:A10) + MIN(A1:A10) ) / 2
Mid InterQuartile: ( Quartile(A1:A10,1) + Quartile(A1:A10,3) ) / 2

23 Insert / Function examples (2 of 3)
Av Deviate: NA
Av Absolute Deviate: AveDev(A1:A10)
Variance: Var(A1:A10)
Standard Deviation: StDev(A1:A10)
Range: MAX(A1:A10) - MIN(A1:A10)
Inter Quartile Range (IQR): Quartile(A1:A10,3) - Quartile(A1:A10,1)

24 Insert / Function examples (3 of 3)
Percentiles: Percentile(A1:A10,.05)
Quartiles: Quartile(A1:A10,q) where q=0,1,2,3,4
Coef. of Variation: StDev(A1:A10)/Average(A1:A10)
Correlation: Correl(A1:A10,B1:B10)
Skew: Skew(A1:A10)
Kurtosis: Kurt(A1:A10)

25 Example Calculations. Q. Should I use the Mean or the Median to state the Central value of this data? -0.31 = SKEW(Height); -1.27 = KURT(Height); 0.17 = SKEW(Weight); -1.03 = KURT(Weight). Answer – Both variables have somewhat flat distributions (Kurtosis less than -1), but both variables have very symmetric distributions (non-skewed distributions); hence, use the Mean.

26 Example Calculations. Q. The Standard Deviation for Height is almost 2 inches; what is the practical interpretation of this value? Answer – The heights of 2/3 of the population will fall within a range less than 4 inches wide (3.87”). Spreadsheet calculations: 67.1 = mean – one Standard Deviation (=B14-B…); upper value = mean + one Standard Deviation (=B14+B…); 3.87 = width of the range (=F10-F9).

27 Example Calculations. Q. What is the height of the shortest person and the tallest person that I may meet today (worst case and best case)? Answer – The shortest person will be 5’-5” (65.2”) and the tallest person will be 6’-1” (72.9”). 65.2 = AvHt-2*StDevHt; 72.9 = AvHt+2*StDevHt.

28 Example Calculations. Q. What is the height of the shortest person that I may meet over the next year? Answer – The shortest person that I am likely to meet in the foreseeable future will be 5’-6” (66.0”). 66.0 = PERCENTILE(Height,0.01).

29 Example Calculations. Q. The answers to the two previous questions are not consistent: the 5% value calculated as Mean – 2(Sigma), which leaves about 2.5% in each tail, was 5’-5”, whereas the 1% value calculated as a Percentile was 5’-6”. Why? Answer – These types of inconsistencies (i.e., errors) will occur with small samples. The procedure used by the PERCENTILE function is based on an interpolated calculation with the two smallest values in the sample.

30 Example Calculations. Q. Which variable, Height or Weight, has the greatest relative variability? Answer – In agreement with our intuition, Weight is 3 to 4 times more variable than Height (11/3 = 3.67). Coef of Variation: 3% = StDevHt/AvHt; 11% = StDevWt/AvWt.

31 Example Calculations. Q. Is there a relationship between Height and Weight and, if so, how large is the relationship? Answer – The correlation between Height and Weight is .78, which means that about 60% (.61) of the variability in Weight is due to differences in Height. 0.783 = CORREL(Height,Weight); 0.61 = G23^2.
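The step from correlation to "variability explained" is just squaring the coefficient. A small sketch (the function name is hypothetical) reproduces the deck's two figures, the .61 here and the 25% quoted for a correlation of .5:

```python
def share_of_variability_explained(r):
    """Square of the correlation coefficient (r squared)."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation must lie between -1 and +1")
    return r ** 2

print(round(share_of_variability_explained(0.783), 2))  # Height vs. Weight
print(share_of_variability_explained(0.5))              # the practical-significance threshold
```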

32 Example Calculations. Q. Given that there is a relationship between Height and Weight, is the relationship linear or non-linear? Answer – Simple statistics cannot be used to determine linear versus non-linear; you would need to plot the data. The correlation indicates that there is a relatively strong linear relationship, but a plot of the data (Weight vs. Height) may indicate that there is an even stronger non-linear relationship. 0.783 = CORREL(Height,Weight); 0.61 = G23^2.

33 Example Calculations. Q. Are Height and Weight Normally distributed? Answer – Based on our Rule-of-Thumb test (-1 < Skew < +1 and -1 < Kurt < +1), neither of these variables is normally distributed. -0.31 = SKEW(Height); -1.27 = KURT(Height); 0.17 = SKEW(Weight); -1.03 = KURT(Weight).

34 Example Calculations. Q. Given that the variables are NOT Normally distributed, why do I care? Answer – Your previous interpretation of the Standard Deviation may be somewhat inaccurate (“The height of 2/3 of the population will vary by less than 4 inches”). Also, your previous interpretation of Worst Case and Best Case may be somewhat inaccurate (“The shortest person will be 5’-5” and the tallest person will be 6’-1”).

35 Example Calculations. Q. The average Height is estimated to be 69.1”; how good is that estimate? Answer – The true average height could be anywhere between 67.8 and 70.3 inches. A better estimate could be obtained if a larger sample were available. 67.8 = AvHt-2*StErrorHt; 70.3 = AvHt+2*StErrorHt.
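The "mean plus or minus two standard errors" range can be sketched as follows. The function names and the toy sample are hypothetical, and the standard error is taken to be the sample standard deviation divided by the square root of n.

```python
import math
import statistics

def standard_error(values):
    """Standard deviation of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def mean_interval(values, k=2):
    """Range mean - k*SE to mean + k*SE; k=2 mirrors the slide's calculation."""
    mean = statistics.mean(values)
    se = standard_error(values)
    return mean - k * se, mean + k * se

# Toy sample, not the Height data behind the slides.
low, high = mean_interval([2, 4, 6, 8])
print(round(low, 3), round(high, 3))
```

Because the standard error shrinks with the square root of n, quadrupling the sample size only halves the width of this range.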

36 More about Variability. Sample formulation (divide by n-1): use StDev (or Var) in Excel. Alternative formulation (divide by n): use StDevP (or VarP) in Excel. Use the alternative formulation 1) if every item in the Universe is included in the Sample or 2) if the Mean is known with certainty.
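Python's standard library draws the same distinction, which makes for a quick side-by-side comparison. In this sketch `statistics.stdev` and `statistics.variance` divide by n-1, while `statistics.pstdev` and `statistics.pvariance` divide by n; the wrapper name and the data are hypothetical.

```python
import statistics

def compare_spread(values):
    """Sample (n-1) versus population (n) versions of variance and standard deviation."""
    return {
        "sample_variance": statistics.variance(values),        # like Var
        "population_variance": statistics.pvariance(values),   # like VarP
        "sample_stdev": statistics.stdev(values),              # like StDev
        "population_stdev": statistics.pstdev(values),         # like StDevP
    }

print(compare_spread([1, 2, 3, 4, 5]))
```

The population versions are always the smaller of each pair, and the gap closes as n grows.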

37 Normal Calculations. NORMINV(0.95,68.8,2.6)=73.08

38 Normal Calculations. NORMDIST(67,68.8,2.6,TRUE)=.244
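The two Normal calculations (an inverse lookup with mean 68.8 and standard deviation 2.6, and a cumulative probability at 67) can be reproduced outside Excel. This sketch uses Python's `statistics.NormalDist`; the variable names are made up.

```python
from statistics import NormalDist

# Mean 68.8 and standard deviation 2.6, as in the two Excel calls.
heights = NormalDist(mu=68.8, sigma=2.6)

value_at_95th = heights.inv_cdf(0.95)  # counterpart of NORMINV(0.95,68.8,2.6)
share_below_67 = heights.cdf(67)       # counterpart of NORMDIST(67,68.8,2.6,TRUE)

print(round(value_at_95th, 2))
print(round(share_below_67, 3))
```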

39 Standardized Normal Calculations. NORMSINV(0.95)=1.645; NORMSINV(.05)=-1.645. The X variable has mean and StDev of μ and σ, which are estimated by x̄ and s. The Z variable has mean=0 and StDev=1; Z is a “standardized normal”.

40 Standardized Normal Calculations. NORMSDIST(-1)=.159; NORMSDIST(+1)=.841
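Standardizing means subtracting the mean and dividing by the standard deviation, after which any Normal question becomes a question about Z. A sketch with a hypothetical `standardize` helper:

```python
from statistics import NormalDist

def standardize(x, mean, stdev):
    """Convert an X value into its Z score."""
    return (x - mean) / stdev

z = NormalDist()  # mean 0, standard deviation 1

print(round(z.inv_cdf(0.95), 3))  # counterpart of NORMSINV(0.95)
print(round(z.inv_cdf(0.05), 3))  # counterpart of NORMSINV(.05)
print(round(z.cdf(-1), 3))        # counterpart of NORMSDIST(-1)
print(round(z.cdf(1), 3))         # counterpart of NORMSDIST(+1)

# The same probability, computed on the X scale and on the Z scale.
on_x_scale = NormalDist(68.8, 2.6).cdf(67)
on_z_scale = z.cdf(standardize(67, 68.8, 2.6))
print(round(on_x_scale, 3), round(on_z_scale, 3))
```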

41 t-distribution. The t-distribution is needed if σ is not known and is estimated by s, and n<30.

42 t-distribution Calculations, one-tail. TDIST(X, d.f., # tails): TDIST(2,4,1)=.058. “t” with tails=1 sums from + infinity; “Z” and “Normal” sum from – infinity.

43 t-distribution Calculations, two-tail. TDIST(X, d.f., # tails): TDIST(2,4,2)=.116. “t (2 tail)” sums simultaneously from both – infinity and + infinity. Undefined for negative values of t.
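Python's standard library has no t-distribution, so the sketch below integrates the Student t density numerically with Simpson's rule to obtain the upper-tail area. It is an illustration of what the one-tail and two-tail values mean, not a description of how Excel computes TDIST; the function names and the step count are arbitrary choices.

```python
import math

def t_density(x, df):
    """Student t probability density with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_tail(x, df, tails, steps=2000):
    """Tail area beyond x (one tail) or beyond +/- x (two tails), for x >= 0."""
    if x < 0:
        raise ValueError("x must be non-negative, mirroring the slide's note")
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    if x == 0:
        return 0.5 * tails
    # Simpson's rule for the area between 0 and x (steps must be even).
    h = x / steps
    total = t_density(0, df) + t_density(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_half = total * h / 3
    # Half of the distribution lies above 0; remove the part between 0 and x.
    return tails * (0.5 - central_half)

print(round(t_tail(2, 4, 1), 3))  # one-tail
print(round(t_tail(2, 4, 2), 3))  # two-tail
```

The two-tail value is exactly twice the one-tail value because the t-distribution is symmetric about zero.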

44 Loading “Data Analysis” in Office 2003. Tools / Add-Ins. Will need to have the original MS Office CD.

45 Example of Tools / Data Analysis / Descriptive Statistics

46 Example Output of Tools / Data Analysis / Descriptive Statistics

47 Loading “Data Analysis” in Office 2007. 1) Click the Office button in the upper left-hand corner of Excel. 2) Click the “Excel Options” tab in the bottom right-hand corner of the drop-down menu opened in step 1.

48 Loading “Data Analysis” in Office 2007 (continued). 3) Click Add-Ins in the left banner of the Excel Options menu. 4) Click “Analysis ToolPak” in the Add-ins menu; then select BOTH “Analysis ToolPak” and “Analysis ToolPak – VBA”. 5) Click the Go button at the bottom right-hand corner of the Excel Options menu. Don’t click the “OK” button.

49 Loading “Data Analysis” in Office 2007 (continued). [Screenshot: next item to click]

50 Example of Tools / Data Analysis / Descriptive Statistics

51 Precision of numerical results – state “3 Significant Digits”

52 Precision of numerical results – state “3 Significant Digits” (continued)

53 Let’s Review: Some Great Rules of Thumb. Data is a potential outlier if: Symmetric distribution, $x_i > \text{mean} + 3s$; Skewed distribution, $x_i > Q_3 + (1.5)R_Q$. Data is Normally distributed (Bell shaped) if -1 < skewness < +1 and if -1 < kurtosis < +1.
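The two outlier rules can be sketched as small filters. Several things here are assumptions rather than content of the slide: R_Q is read as the interquartile range (Q3 - Q1), the rules are applied on the low side as well as the high side, quartiles use the "inclusive" interpolation method, and the function names and data are hypothetical.

```python
import statistics

def outliers_symmetric(values, k=3):
    """Values more than k standard deviations from the mean (assumed two-sided)."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [x for x in values if abs(x - mean) > k * s]

def outliers_skewed(values, k=1.5):
    """Values beyond the quartiles by more than k times the interquartile range."""
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # assumption: this is what the slide calls R_Q
    return [x for x in values if x > q3 + k * iqr or x < q1 - k * iqr]

# Hypothetical data with one extreme value.
small = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
larger = [10] * 20 + [100]

print(outliers_skewed(small))      # the quartile rule flags 100
print(outliers_symmetric(small))   # the 3s rule flags nothing in this small sample
print(outliers_symmetric(larger))  # with more data the 3s rule flags 100
```

In the ten-value sample the extreme value inflates s so much that it stays inside mean + 3s, which is one practical reason to prefer the quartile rule for small or skewed samples.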

54 Prerequisite Spreadsheet Skills
Cut, Copy, Paste & Paste Special
Cell corner Copy
Add or delete Rows or Columns
Change width/height of row/column
Font, alignment, border & number of cell
Referencing and calculations with cells
Data / Sort
Naming cell or range of cells
Insert / Function / Average, Sum, Max, Min, Count, Small and Large
(Tools / Add-ins / Data analysis) Tools / Data Analysis / Descriptive statistics
Single quote for equation statement
REPLACE command
DATA / Group & FORMAT: Column, Hide
Grab an entire column of data (CTRL+SHIFT, down arrow)
See also “L99A MBA7025.ppt” in folder “L00A MGS8110”