MBA7025_04.ppt/Jan 27, 2015/Page 1 Georgia State University - Confidential MBA 7025 Statistical Business Analysis Descriptive Statistics Jan 27, 2015.

Presentation transcript:

Agenda
Descriptive Summary Measures
1. Measures of Central Location: Mean, Median, Mode
2. Measures of Variation: The Range, Percentile, Variance and Standard Deviation
3. Measures of Association: Coefficient of Variation
Central Limit Theorem
Confidence Interval

Mean
1. It is the arithmetic average of the data values. Sample mean: \( \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \)
2. The most common measure of central tendency.
3. Affected by extreme values (outliers).
[Figure: two example data sets, one with mean = 5 and one with mean = 6.]

Median
1. Important measure of central tendency.
2. In an ordered array, the median is the "middle" number. If n is odd, the median is the middle number. If n is even, the median is the average of the 2 middle numbers.
3. Not affected by extreme values.
[Figure: example data set with median = 5.]

Mode
1. A measure of central tendency.
2. The value that occurs most often.
3. Not affected by extreme values.
4. There may not be a mode.
5. There may be several modes.
6. Used for either numerical or categorical data.
[Figure: one example data set with mode = 9 and one with no mode.]
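The three measures of central location above can be compared on a small data set. The sketch below uses Python's standard `statistics` module; the data values are hypothetical and chosen only to illustrate how a single extreme value moves the mean while leaving the median and mode almost unchanged.

```python
from statistics import mean, median, multimode

# Hypothetical data (not from the slides), with and without one extreme value.
data = [1, 3, 5, 7, 9, 9]
with_outlier = data + [100]

# Mean is pulled strongly toward the outlier; median and mode barely move.
mean_plain, mean_outlier = mean(data), mean(with_outlier)
median_plain, median_outlier = median(data), median(with_outlier)
modes_plain, modes_outlier = multimode(data), multimode(with_outlier)

print(f"mean:   {mean_plain:.2f} -> {mean_outlier:.2f}")
print(f"median: {median_plain} -> {median_outlier}")
print(f"mode:   {modes_plain} -> {modes_outlier}")
```

`multimode` returns a list, so it also covers the "several modes" case; when every value occurs once it returns all values, which corresponds to the "no mode" case on the slide.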

Shape
Describes how data are distributed.
Measures of shape: symmetric or skewed.
Left-skewed: typically Mean < Median < Mode
Symmetric: Mean = Median = Mode
Right-skewed: typically Mode < Median < Mean

The Range
Measure of variation.
Difference between the largest and smallest observations: \( \text{Range} = x_{\text{largest}} - x_{\text{smallest}} \)
Ignores how the data are distributed.
[Figure: example data sets with their ranges; the only value that survives extraction is Range = 5.]

Percentile
1. Arrange the data in ascending order.
2. The middle number is the median.
3. The number halfway to the median is the first quartile.
4. The number halfway past the median is the 3rd quartile.
5. A number with (no more than) 66% of the values less than it is the 66th percentile, and so forth.

Percentile
Olympic medal tally for the top 55 nations.
[Table: observation number (Obs) and medal count (Medals) for each nation; the values were not recovered.]
What is the percentile score for a country with 9 medals? What is the 50th percentile?

Percentile Solutions
Order all data (ascending or descending).
1. The country with 9 medals ranks 24th out of 55. There are 31 nations (56.36%) below it and 23 nations (41.82%) above it. Hence it can be considered a 57th or 58th percentile score.
2. The medal tally that corresponds to the 50th percentile is the one in the middle of the group, or the 28th country, with 7 medals. Hence the 50th percentile (median) is 7.
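The arithmetic in the percentile solution can be checked with a few lines of Python. This is only a sketch: the helper name `percent_of_total` is hypothetical, and the counts (31 below, 23 above, 55 nations) are taken from the solution text because the medal table itself is not available.

```python
def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total

total_nations = 55
below = percent_of_total(31, total_nations)   # nations with fewer medals
above = percent_of_total(23, total_nations)   # nations with more medals

# With an odd number of ordered observations, the median is the middle one.
median_position = (total_nations + 1) // 2

print(f"below: {below:.2f}%, above: {above:.2f}%")
print(f"median is observation number {median_position}")
```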

Box Plot
[Figure: box plot marking the smallest value, Q1, the median, Q3, and the largest value.]

Variance
Important measure of variation. Shows variation about the mean.
For the population (use N in the denominator): \( \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \)
For the sample (use n - 1 in the denominator): \( s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \)

Standard Deviation
Most important measure of variation. Shows variation about the mean.
For the population (use N in the denominator): \( \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \)
For the sample (use n - 1 in the denominator): \( s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \)

Computing Standard Deviation
Computing the sample variance and standard deviation, with mean of X = 6.
[Table: X, deviation from mean, squared deviation, and the sum of squares (SS); the values were not recovered apart from a single entry, 6.50.]
Variance = SS/(n - 1)
Stdev = Sqrt(Variance)
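The computation steps on this slide (deviations from the mean, squared deviations, sum of squares, then variance and standard deviation) can be sketched as follows. The data values are hypothetical, picked so that the mean is 6 as on the slide; they are not the slide's original table. The result is cross-checked against `statistics.variance` and `statistics.stdev`, which also use n - 1 in the denominator.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical sample with mean 6 (the slide's own data were not recovered).
x = [2, 4, 6, 8, 10]

mean_x = sum(x) / len(x)
deviations = [xi - mean_x for xi in x]        # deviation from mean
squared = [d ** 2 for d in deviations]        # squared deviation
ss = sum(squared)                             # sum of squares (SS)

sample_variance = ss / (len(x) - 1)           # sample variance uses n - 1
sample_sd = sqrt(sample_variance)

print(f"mean = {mean_x}, SS = {ss}")
print(f"variance = {sample_variance}, stdev = {sample_sd:.4f}")
```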

The Normal Distribution
A property of normally distributed data is as follows:
Distance from mean          Percent of observations included in that range
± 1 standard deviation      Approximately 68%
± 2 standard deviations     Approximately 95%
± 3 standard deviations     Approximately 99.73%
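The percentages in this table follow from the standard normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `coverage` is hypothetical):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} sd: {coverage(k):.2%}")
```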

Comparing Standard Deviations
[Figure: three data sets, Data A, Data B, and Data C, each with mean = 15.5 but a different standard deviation. The only value of s that survives extraction is s = 4.57 for Data C.]

Outliers
Typically, a number beyond a certain number of standard deviations from the mean is considered an outlier.
In many cases, a number beyond 3 standard deviations (about a 0.27% chance of occurring in normally distributed data) is considered an outlier.
If identifying an outlier is more critical, one can make the rule more stringent and consider 2 standard deviations as the limit.
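The standard-deviation rule for outliers can be sketched as a small function. The function name `find_outliers`, the threshold parameter `k`, and the data are all hypothetical; the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev

def find_outliers(values, k=3.0):
    """Return the values lying more than k sample standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)
    return [v for v in values if abs(v - m) > k * s]

# Hypothetical data: nineteen values near 10 and one extreme value.
values = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10,
          10, 11, 9, 10, 12, 8, 10, 11, 9, 50]

print(find_outliers(values))         # 3-standard-deviation rule
print(find_outliers(values, k=2.0))  # more stringent 2-standard-deviation rule
```

Note that in very small samples no value can be 3 sample standard deviations from the mean, so the rule is mainly useful for larger data sets.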

Coefficient of Variation
Measure of relative variation.
Always a %.
Shows variation relative to the mean.
Used to compare 2 or more groups.
Formula (for sample): \( CV = \frac{s}{\bar{x}} \cdot 100\% \)

Computing Coefficient of Variation
Stock A: average price last year = $50, standard deviation = $5
Stock B: average price last year = $100, standard deviation = $5
Coefficient of variation:
Stock A: CV = $5 / $50 = 10%
Stock B: CV = $5 / $100 = 5%
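The stock comparison can be reproduced directly. The function name below is hypothetical; the inputs are the means and standard deviations given for Stock A and Stock B.

```python
def coefficient_of_variation(sd: float, mean: float) -> float:
    """Return the coefficient of variation as a percentage of the mean."""
    return 100.0 * sd / mean

cv_a = coefficient_of_variation(sd=5, mean=50)    # Stock A
cv_b = coefficient_of_variation(sd=5, mean=100)   # Stock B

print(f"Stock A: CV = {cv_a:.0f}%")
print(f"Stock B: CV = {cv_b:.0f}%")
```

Both stocks have the same standard deviation in dollars, but relative to its price Stock A varies twice as much as Stock B.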

Agenda
Descriptive Summary Measures
Central Limit Theorem
Confidence Interval

Central Limit Theorem
Regardless of the population distribution, the distribution of the sample means is approximately normal for sufficiently large sample sizes (n >= 30), with:
- mean of sample means = population mean: \( \mu_{\bar{x}} = \mu \)
- standard error = [population standard deviation] / [sqrt(n)]: \( \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \)
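A small simulation can illustrate the two properties stated in the theorem. This is a sketch under assumptions that are not in the slides: the population is exponential with rate 1 (mean 1, standard deviation 1, strongly right-skewed), the sample size is 30, and 5,000 samples are drawn with a fixed random seed.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)

n = 30                 # sample size
num_samples = 5000     # number of repeated samples
pop_mean = 1.0         # mean of an exponential population with rate 1
pop_sd = 1.0           # standard deviation of the same population

# Draw many samples and record each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
observed_se = stdev(sample_means)
theoretical_se = pop_sd / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {pop_mean})")
print(f"observed standard error: {observed_se:.3f} (theory {theoretical_se:.3f})")
```

Even though the population is far from normal, the sample means cluster around the population mean with a spread close to the population standard deviation divided by sqrt(n).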

Level of Significance & Level of Confidence
Level of Significance: α (alpha) equals the maximum allowed percent of error. If the maximum allowed error is 5%, then α = 0.05.
Level of Confidence is the desired degree of certainty. A 95% Confidence Level is the most common.
A 95% Confidence Level corresponds to a 95% Confidence Interval of the Mean: about 95% of intervals calculated this way would contain the actual population mean.
A 95% Confidence Level corresponds to a 5% level of significance, or α = 0.05. The Confidence Level therefore equals 1 - α.

Why Does Central Limit Theorem Work?
As sample size increases:
1. Most sample means will be close to the population mean.
2. Some sample means will be either relatively far above or below the population mean.
3. A few sample means will be either very far above or below the population mean.

Agenda
Descriptive Summary Measures
Central Limit Theorem
Confidence Interval

Confidence Intervals
The population mean \( \mu \) is within 2 Standard Errors (SE) of the sample mean \( \bar{x} \) about 95% of the time.
Thus, \( \mu \) is in the range defined by \( \bar{x} \pm 2 \cdot SE \) about 95% of the time.
(2 * SE) is also called the Margin of Error (MOE). 95% is called the confidence level.
Sample Mean ± Margin of Error (MOE) is called a Confidence Interval.

The Standard Normal Distribution
[Figure: standard normal curve with about 68%, 95%, and 99.7% of the area within 1, 2, and 3 standard deviations of the mean.]

Confidence Interval for Mean
In general, the confidence interval for \( \mu \) is given by \( \bar{x} \pm z \cdot \sigma_{\bar{x}} \), where:
\( \bar{x} \) is the sample mean.
z is the confidence factor. It is the number of standard errors one has to go from the mean in order to include a certain percent of observations. For 95% confidence the value is 1.96 (approximately 2.00).
\( \sigma_{\bar{x}} \) is the standard error of the sample means.
In Excel, compute z with 95% confidence level (i.e. level of significance = 0.05): z score = normsinv(1-0.05/2) = 1.96
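For readers working outside Excel, the same z value can be obtained from the inverse cumulative distribution function of the standard normal. The sketch below uses `statistics.NormalDist.inv_cdf` in place of `normsinv`; the function name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(confidence: float) -> float:
    """Return the two-sided z value for a confidence level such as 0.95."""
    alpha = 1.0 - confidence                      # level of significance
    return NormalDist().inv_cdf(1.0 - alpha / 2)  # same idea as normsinv(1 - alpha/2)

z_95 = z_for_confidence(0.95)
z_90 = z_for_confidence(0.90)

print(f"z for 95% confidence: {z_95:.2f}")
print(f"z for 90% confidence: {z_90:.3f}")
```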

Confidence Interval for Mean
Since \( \sigma \) is generally not known, we substitute the sample standard deviation, s. This changes the distribution of the sample means from z (standard normal) to a t-distribution, a close relative. The confidence interval becomes \( \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \).
The t value is slightly larger than the z for a given confidence level, thereby increasing the margin of error. That is the price of using s in place of \( \sigma \).

Confidence Interval for Mean (Example 1): Gas Price
A sample of 49 gas stations nationwide shows an average price of unleaded of $3.87 and a standard deviation of $0.15. Estimate the mean price of gas nationwide with 95% confidence.
In Excel, compute t with 5% error and n - 1 = 48 degrees of freedom: =tinv(0.05,48), rounded to 2.01
The 95% CI for the mean is: \( \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \) = 3.87 ± [2.01 * (0.15/√49)] = $3.87 ± $0.043
Thus, $3.827 < \( \mu \) < $3.913
Interpret the result!
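The gas price interval can be reproduced with a short function. This is a sketch: the function name is hypothetical, and the critical value 2.01 is passed in as given on the slide rather than computed, because the Python standard library has no t-distribution (a library such as SciPy would be needed for that).

```python
from math import sqrt

def mean_confidence_interval(sample_mean, sample_sd, n, critical_value):
    """Return (lower, upper) for sample_mean ± critical_value * sample_sd / sqrt(n)."""
    margin_of_error = critical_value * sample_sd / sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Values from the gas price example; t = 2.01 for 48 degrees of freedom.
low, high = mean_confidence_interval(3.87, 0.15, 49, critical_value=2.01)

print(f"95% CI for the mean price: ${low:.3f} to ${high:.3f}")
```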

Confidence Interval for Mean (Example 2): Federal Aid Problem
Suppose a census tract with 5000 families is eligible for aid under program HR-247 if the average income of families of 4 is between $7,500 and $8,500 (those lower than $7,500 are eligible in a different program). A random sample of 12 families yields the data below.
Representative sample: 7,300 7,700 8,100 8,400 7,800 8,300 8,500 7,600 7,400 7,800 8,300 8,600

Confidence Interval for Mean (Example 2): Federal Aid Problem
Representative sample: 7,300 7,700 8,100 8,400 7,800 8,300 8,500 7,600 7,400 7,800 8,300 8,600
In Excel, compute t with 5% error and n - 1 = 11 degrees of freedom: =tinv(0.05,11) = 2.201

Confidence Interval for Mean (Example 2): Federal Aid Problem
In Excel, compute t with 5% error and n - 1 = 11 degrees of freedom: =tinv(0.05,11) = 2.201
The 95% CI for the mean is: \( \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \) = 7,983 ± MOE = 7,983 ± [2.201 * (441/√12)] = 7,983 ± 280
Thus, $7,703 < \( \mu \) < $8,263
Interpretation of Confidence Interval
We are 95% confident that the interval $7,983 ± $280 contains the unknown population (not sample) mean income.
If we selected 1,000 samples of size 12 and constructed 1,000 confidence intervals, about 950 would contain the unknown population mean and 50 would not.
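The figures in the federal aid example (mean 7,983, standard deviation 441, margin of error 280) can be recomputed from the 12 sampled incomes. In this sketch the t value 2.201 is taken from the slide rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# The 12 family incomes from the representative sample.
incomes = [7300, 7700, 8100, 8400, 7800, 8300,
           8500, 7600, 7400, 7800, 8300, 8600]

t_critical = 2.201                     # from the slide: tinv(0.05, 11)

x_bar = mean(incomes)                  # sample mean
s = stdev(incomes)                     # sample standard deviation (n - 1)
moe = t_critical * s / sqrt(len(incomes))

low, high = x_bar - moe, x_bar + moe

print(f"mean = {x_bar:.0f}, s = {s:.0f}, MOE = {moe:.0f}")
print(f"95% CI: ${low:,.0f} to ${high:,.0f}")
```

The whole interval lies between $7,500 and $8,500, the eligibility range stated in the problem.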

Confidence Interval for Proportions
For proportions, p = population proportion and \( \hat{p} \) = sample proportion.
The confidence interval for p is given by \( \hat{p} \pm z \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \)

Confidence Interval for Proportions (Example 1): Presidential Election
The Wall Street Journal for Sept 10, 2008 reports that a poll of 860 people shows 46% support for Sen. Obama as President. Find the 95% CI for the proportion of the population that supports him.
In Excel, compute z with 95% confidence level (i.e. level of significance = 0.05): z score = normsinv(1-0.05/2) = 1.96
The 95% CI for the proportion is: \( \hat{p} \pm z \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \) = 0.46 ± 0.033
Thus, .427 < p < .493

Confidence Interval for Proportions (Example 2): Japan Business Survey
Is Japan the foremost economic power today?
n = 200 Californians
Yes = 116
No = 84

Confidence Interval for Proportions (Example 2): Japan Business Survey
In Excel, compute z with 95% confidence level (i.e. level of significance = 0.05): z score = normsinv(1-0.05/2) = 1.96
The 95% CI for the proportion is: 0.58 ± MOE = 0.58 ± 0.068
Thus, .512 < p < .648
In Excel, compute z with 90% confidence level (i.e. level of significance = 0.10): z score = normsinv(1-0.10/2) = 1.645
The 90% CI for the proportion is: 0.58 ± MOE = 0.58 ± 0.057
Thus, .523 < p < .637
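Both intervals for the Japan survey can be computed with one function. This sketch uses the normal-approximation interval for a proportion; the function name and parameter names are hypothetical, and `NormalDist().inv_cdf` plays the role of Excel's `normsinv`.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    margin_of_error = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin_of_error, p_hat + margin_of_error

# Japan Business Survey: 116 "yes" answers out of 200 respondents.
ci_95 = proportion_confidence_interval(116, 200, confidence=0.95)
ci_90 = proportion_confidence_interval(116, 200, confidence=0.90)

print(f"95% CI: {ci_95[0]:.3f} to {ci_95[1]:.3f}")
print(f"90% CI: {ci_90[0]:.3f} to {ci_90[1]:.3f}")
```

Lowering the confidence level from 95% to 90% gives a smaller z and therefore a narrower interval.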

Sample Means versus Sample Proportion
Mean:
- Income/Loss
- Time to Complete Loan Papers
- Number of Fat Calories in Burger
- Breaking Strength of Cellular Phone Housing
Proportion:
- Americans Who Believe that Japan is #1 Economic Power
- Circuit Boards with One or More Failed Solder Connections
- African-Americans Who Pass CPA
Means and Proportions Are Not the Same!

Similarities and Differences Between Sample Means and Proportions
Sample Means (Measured)
- Computed from data that are measured.
- Estimate population means.
Sample Proportions (Counted)
- Computed from data that are counted.
- Estimate population proportions.