The mean for quantitative data is obtained by dividing the sum of all values by the number of values in the data set.


The following are the ages of all eight employees of a small company: Find the mean age of these employees.

Thus, the mean age of all eight employees of this company is 45.25 years, or 45 years and 3 months.
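The calculation can be sketched in a few lines of Python. The eight ages below are an assumption: the values from the slide are not preserved here, so these were chosen only to be consistent with the stated mean of 45 years and 3 months.

```python
# Hypothetical ages (assumed for illustration; the slide's own values are not available).
ages = [53, 32, 61, 27, 39, 44, 49, 57]

def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

mean_age = mean(ages)

# Split the result into whole years and remaining months.
years = int(mean_age)
months = round((mean_age - years) * 12)

print(mean_age)       # 45.25
print(years, months)  # 45 3
```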

The median is the value of the middle term in a data set that has been ranked in increasing order. The calculation of the median consists of the following two steps:
1. Rank the data set in increasing order.
2. Find the middle term in a data set with n values. The value of this term is the median. When n is even, there are two middle terms, and the median is their average.

The following data give the weight lost (in pounds) by a sample of five members of a health club at the end of two months of membership: Find the median.

First, we rank the given data in increasing order. The median is the middle (third) of the five ranked values, which is 8. The median weight loss for this sample of five members of this health club is 8 pounds.
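The two steps can be written out as a small function. This is a minimal sketch; the five weight-loss values are an assumption chosen to agree with the stated median of 8 pounds, since the slide's own values are not available.

```python
def median(values):
    """Rank the data, then take the middle term (or average the two middle terms)."""
    ranked = sorted(values)          # Step 1: rank in increasing order
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:                   # Step 2, odd n: a single middle term
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2   # even n: average of the two middle terms

# Hypothetical weight losses in pounds (assumed for illustration).
weight_loss = [10, 5, 19, 8, 3]

print(sorted(weight_loss))   # [3, 5, 8, 10, 19]
print(median(weight_loss))   # 8
```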

The median gives the “center” of the data, with half the data values to the left of the median and half to the right. The advantage of using the median as a measure of central tendency is that it is less influenced by outliers and skewness than the mean. Consequently, the median is preferred over the mean as a measure of central tendency for data sets that contain outliers or are skewed.
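A quick comparison illustrates the point. The data are hypothetical: the second list is the first with its largest value replaced by an extreme one.

```python
from statistics import mean, median

# Hypothetical values (assumed for illustration).
typical = [3, 5, 8, 10, 19]
with_outlier = [3, 5, 8, 10, 190]   # largest value replaced by an outlier

mean_typical, median_typical = mean(typical), median(typical)
mean_outlier, median_outlier = mean(with_outlier), median(with_outlier)

print(mean_typical, median_typical)   # mean 9, median 8
print(mean_outlier, median_outlier)   # mean 43.2, median still 8
```

The outlier pulls the mean far from most of the data, while the median does not move.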

The mode is the value that occurs with the highest frequency in a data set.
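As a small sketch, Python's standard library can report the most frequent value or values. The data below are hypothetical. `statistics.multimode` returns a list because a data set can have more than one mode.

```python
from statistics import multimode

# Hypothetical data sets (assumed for illustration).
speeds = [77, 82, 74, 81, 79, 84, 74, 78]
two_modes = [1, 1, 2, 2, 3]

print(multimode(speeds))     # [74]  (74 occurs twice, every other value once)
print(multimode(two_modes))  # [1, 2]  (both occur twice)
```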

Range = Largest value − Smallest value

The range, like the mean, has the disadvantage of being influenced by outliers. Its calculation is based on two values only: the largest and the smallest.
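A short sketch with hypothetical data shows how a single extreme value changes the range.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical incomes in thousands of dollars (assumed for illustration).
incomes = [29, 34, 35, 39, 41, 44]
incomes_with_outlier = incomes + [104]

print(value_range(incomes))               # 15
print(value_range(incomes_with_outlier))  # 75
```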

The standard deviation is the most used measure of dispersion. The value of the standard deviation tells how closely the values of a data set are clustered around the mean.

Table: each value x and its deviation from the mean, x − 84. The last deviation is +8, and the deviations sum to zero: ∑(x − 84) = 0.

Standard deviation: the square root of the sum of squared deviations divided by n − 1.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Example: $s = \sqrt{478 / 3} = \sqrt{159.3} = 12.62$
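The steps of this calculation can be traced in code. The four values below are an assumption: they were chosen to be consistent with the figures quoted in the example (a mean of 84, a final deviation of +8, and a sum of squared deviations of 478).

```python
import math

# Hypothetical values (assumed; chosen to match the mean of 84 and the sum of squares of 478).
scores = [82, 95, 67, 92]

n = len(scores)
mean = sum(scores) / n                      # 84.0
deviations = [x - mean for x in scores]     # each value minus the mean
sum_sq = sum(d ** 2 for d in deviations)    # sum of squared deviations
s = math.sqrt(sum_sq / (n - 1))             # sample standard deviation

print(deviations)       # [-2.0, 11.0, -17.0, 8.0]
print(sum(deviations))  # 0.0
print(sum_sq)           # 478.0
print(round(s, 2))      # 12.62
```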

A numerical measure such as the mean, median, mode, range, variance, or standard deviation calculated for a population data set is called a population parameter, or simply a parameter. A summary measure calculated for a sample data set is called a sample statistic, or simply a statistic.
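The distinction matters in practice for the variance and standard deviation. The usual population formula divides by the number of values N, while the sample formula divides by n − 1. A minimal sketch with hypothetical data, using the standard library's `pvariance` (population) and `variance` (sample):

```python
from statistics import pvariance, variance

# Hypothetical values (assumed for illustration).
scores = [82, 95, 67, 92]

# Treating the four values as the whole population: divide by N = 4.
population_variance = pvariance(scores)

# Treating the four values as a sample: divide by n - 1 = 3.
sample_variance = variance(scores)

print(population_variance)
print(sample_variance)
```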

For a bell-shaped distribution, approximately
1. 68% of the observations lie within one standard deviation of the mean,
2. 95% of the observations lie within two standard deviations of the mean, and
3. 99.7% of the observations lie within three standard deviations of the mean.

The age distribution of a sample of 5000 persons is bell-shaped with a mean of 40 years and a standard deviation of 12 years. Determine the approximate percentage of people who are 16 to 64 years old.
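One way to work this problem is to express each endpoint as a number of standard deviations from the mean and then look up the matching percentage. The sketch below assumes the interval is symmetric about the mean and lies exactly 1, 2, or 3 standard deviations out. The function name is hypothetical.

```python
# Approximate percentages for a bell-shaped distribution (the rule stated above).
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_percentage(mean, sd, low, high):
    """Approximate percentage of observations between low and high."""
    k_low = (mean - low) / sd     # standard deviations below the mean
    k_high = (high - mean) / sd   # standard deviations above the mean
    if k_low != k_high or k_low not in EMPIRICAL_RULE:
        raise ValueError("interval must be mean ± 1, 2, or 3 standard deviations")
    return EMPIRICAL_RULE[k_low]

pct = empirical_percentage(mean=40, sd=12, low=16, high=64)
approx_count = pct * 5000 / 100

print(pct)           # 95.0, because 16 and 64 are each 2 standard deviations from 40
print(approx_count)  # 4750.0 of the 5000 persons
```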

Quartiles are three summary measures that divide a ranked data set into four equal parts. The second quartile is the same as the median of a data set. The first quartile is the value of the middle term among the observations in the lower half, and the third quartile is the value of the middle term among the observations in the upper half.

Figure: the quartiles Q1, Q2, and Q3 divide a data set arranged in increasing order into four portions. Each of these portions contains 25% of the observations.

Calculating Interquartile Range
The difference between the third and first quartiles gives the interquartile range; that is,
IQR = Interquartile range = Q3 − Q1

The following are the ages of nine employees of an insurance company:
a) Find the values of the three quartiles. Where does the age of 28 fall in relation to the ages of the employees?
b) Find the interquartile range.
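The quartile definition used in this presentation (the median of the lower half and the median of the upper half) can be sketched as follows. The nine ages are an assumption, since the slide's values are not available, and the results below apply to this assumed data only.

```python
def median(ranked):
    """Median of an already ranked list."""
    n = len(ranked)
    mid = n // 2
    return ranked[mid] if n % 2 else (ranked[mid - 1] + ranked[mid]) / 2

def quartiles(values):
    """Q1, Q2, Q3 as medians of the lower half, the whole set, and the upper half."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]
    # For odd n the median itself belongs to neither half.
    upper = ranked[half + 1:] if n % 2 else ranked[half:]
    return median(lower), median(ranked), median(upper)

# Hypothetical ages (assumed for illustration).
ages = [47, 28, 39, 51, 33, 37, 59, 24, 33]

q1, q2, q3 = quartiles(ages)
iqr = q3 - q1

print(q1, q2, q3)  # 30.5 37 49.0
print(iqr)         # 18.5
print(28 < q1)     # True: in this data, 28 falls in the lowest 25% of the ages
```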

The following data are the incomes (in thousands of dollars) for a sample of 12 households. Construct a box-and-whisker plot for these data.

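Step 1. Rank the data in increasing order and find the median, the first and third quartiles, and the interquartile range. With 12 values, the median is the average of the two middle ranked values, and each quartile is the average of the two middle values of its half.
Median = 47
Q1 = 37
Q3 = 61
IQR = Q3 − Q1 = 61 − 37 = 24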

Step 2. Find the two inner fences, which lie 1.5 × IQR below Q1 and above Q3.
1.5 × IQR = 1.5 × 24 = 36
Lower inner fence = Q1 − 36 = 37 − 36 = 1
Upper inner fence = Q3 + 36 = 61 + 36 = 97

Step 3. Find the smallest and largest values within the two inner fences.
Smallest value within the two inner fences = 29
Largest value within the two inner fences = 72
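The three steps can be combined into one short script. The twelve incomes below are an assumption: they were chosen to be consistent with the median, quartiles, and whisker ends stated in the steps above.

```python
def median(ranked):
    """Median of an already ranked list."""
    n = len(ranked)
    mid = n // 2
    return ranked[mid] if n % 2 else (ranked[mid - 1] + ranked[mid]) / 2

# Hypothetical incomes in thousands of dollars (assumed; chosen to match the stated results).
incomes = [35, 29, 44, 72, 34, 64, 41, 50, 54, 104, 39, 58]

# Step 1: rank the data, then find the median, quartiles, and IQR.
ranked = sorted(incomes)
half = len(ranked) // 2          # 12 values, so two halves of 6
med = median(ranked)
q1 = median(ranked[:half])
q3 = median(ranked[half:])
iqr = q3 - q1

# Step 2: inner fences at 1.5 x IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Step 3: whisker ends are the extreme values inside the fences; the rest are outliers.
inside = [x for x in ranked if lower_fence <= x <= upper_fence]
outliers = [x for x in ranked if x < lower_fence or x > upper_fence]

print(med, q1, q3, iqr)          # 47.0 37.0 61.0 24.0
print(lower_fence, upper_fence)  # 1.0 97.0
print(min(inside), max(inside))  # 29 72
print(outliers)                  # [104]
```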

Figure: box drawn on the income axis, marking the first quartile, the median, and the third quartile.

Figure: completed box-and-whisker plot on the income axis, marking the first quartile, the median, the third quartile, the smallest and largest values within the two inner fences, and an outlier.

What Can Go Wrong? Do a reality check: don’t let technology do your thinking for you. Don’t forget to sort the values before finding the median or quartiles. Don’t compute numerical summaries of a categorical variable. Watch out for multiple peaks, which might indicate multiple groups in your data.

What Can Go Wrong? Be aware of slightly different methods: different statistics packages and calculators may give you different answers for the same data. Beware of outliers. Make a picture (make a picture, make a picture). Be careful when comparing groups that have very different spreads.
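The point about differing methods is easy to reproduce. Python's standard library function `statistics.quantiles` offers two interpolation methods, and on the same hypothetical data they return different first and third quartiles.

```python
from statistics import quantiles

# Hypothetical ages (assumed for illustration).
ages = [47, 28, 39, 51, 33, 37, 59, 24, 33]

# n=4 asks for the three cut points that split the data into quarters.
exclusive = quantiles(ages, n=4, method="exclusive")
inclusive = quantiles(ages, n=4, method="inclusive")

print(exclusive)  # [30.5, 37.0, 49.0]
print(inclusive)  # [33.0, 37.0, 47.0]
```

Both methods agree on the median, but the quartiles, and therefore the interquartile range, differ. Neither answer is wrong; the methods simply follow different conventions.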

So What Do We Know? We describe distributions in terms of L.O.S.S. For symmetric distributions, it’s safe to use the mean and standard deviation; for skewed distributions, it’s better to use the median and interquartile range. Always make a picture—don’t make judgments about which measures of center and spread to use by just looking at the data.