Numerical Measures: Measures of Central Tendency (Location), Measures of Non-Central Location, Measures of Variability (Dispersion, Spread), Measures of Shape.

Presentation transcript:

Numerical Measures

Measures of Central Tendency (Location)
Measures of Non-Central Location
Measures of Variability (Dispersion, Spread)
Measures of Shape

Measures of Central Tendency (Location): Mean, Median, Mode

Measures of Non-Central Location: Quartiles (Mid-Hinges), Percentiles

Measures of Variability (Dispersion, Spread): Variance, Standard Deviation, Range, Inter-Quartile Range

Measures of Shape: Skewness, Kurtosis

Summation Notation

Let $x_1, x_2, x_3, \dots, x_n$ denote a set of n numbers. Then the symbol $\sum_{i=1}^{n} x_i$ denotes the sum of these n numbers: $x_1 + x_2 + x_3 + \dots + x_n$.

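Example: Let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

i:     1   2   3   4   5
x_i:  10  15  21   7  13

Then the symbol $\sum_{i=1}^{5} x_i$ denotes the sum of these 5 numbers: $x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$.

Meaning of the parts of summation notation, $\sum_{i=a}^{b} (\text{term})$: the index i is the quantity changing in each term of the sum; the value a below the $\Sigma$ is the starting value for i; the value b above the $\Sigma$ is the final value for i; the expression to the right of the $\Sigma$ is each term of the sum.

Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the table above.

Then the symbol $\sum_{i=2}^{4} x_i^3$ denotes the sum of these 3 numbers: $x_2^3 + x_3^3 + x_4^3 = 15^3 + 21^3 + 7^3 = 3375 + 9261 + 343 = 12979$.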
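The summation notation above can be sketched in Python. This is only an illustration: the helper name `sigma` is hypothetical, and the list values are chosen to agree with the totals quoted in the slides (66 and 12979); the ordering of the values within the list is an assumption.

```python
# Illustrative values; the ordering x_1..x_5 is an assumption.
x = [10, 15, 21, 7, 13]

def sigma(values, start, stop, term=lambda v: v):
    """Sum term(x_i) for i = start..stop, using 1-based indices as in the slides."""
    return sum(term(values[i - 1]) for i in range(start, stop + 1))

total = sigma(x, 1, len(x))                      # x_1 + x_2 + ... + x_5
cubes = sigma(x, 2, 4, term=lambda v: v ** 3)    # x_2^3 + x_3^3 + x_4^3

print(total)   # 66
print(cubes)   # 12979
```

The starting value, final value and term of the sum map directly onto the `start`, `stop` and `term` arguments.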

Measures of Central Location (Mean)

Mean: Let $x_1, x_2, x_3, \dots, x_n$ denote a set of n numbers. Then the mean of the n numbers is defined as: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$

Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the table above.

Then the mean of the 5 numbers is: $\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{66}{5} = 13.2$

Interpretation of the Mean: Let $x_1, x_2, x_3, \dots, x_n$ denote a set of n numbers. Then the mean, $\bar{x}$, is the centre of gravity of those n numbers. That is, if we drew a horizontal line and placed a weight of one at each value of $x_i$, then the balancing point of that system of masses is at the point $\bar{x}$.

[Figure: unit weights placed at $x_1, x_2, x_3, x_4, \dots, x_n$ along a horizontal line, balancing at $\bar{x}$]

In the example, the balancing point is $\bar{x} = 13.2$.

The mean, $\bar{x}$, is also approximately the centre of gravity of a histogram.
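The centre-of-gravity interpretation can be checked numerically: with a unit weight at each value, the signed distances from the mean cancel. A minimal sketch, assuming the five example values 7, 10, 13, 15, 21 (the function name `mean` is ours):

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)

x = [7, 10, 13, 15, 21]
x_bar = mean(x)

# With a unit weight at each x_i, the net moment about the mean is zero
# (up to floating-point rounding), which is what "balancing point" means.
net_moment = sum(v - x_bar for v in x)

print(x_bar)
print(abs(net_moment) < 1e-9)
```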

Measures of Central Location (Median)

The Median Let x 1, x 2, x 3, … x n denote a set of n numbers. Then the median of the n numbers is defined as the number that splits the numbers into two equal parts. To evaluate the median we arrange the numbers in increasing order.

If the number of observations is odd there will be one observation in the middle. This number is the median. If the number of observations is even there will be two middle observations. The median is the average of these two observations

Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the table above.

The numbers arranged in order are: 7, 10, 13, 15, 21. The unique “middle” observation, 13, is the median.

Example 2: Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote 6 numbers. Arranged in increasing order, these observations have two “middle” observations: the 3rd and the 4th in size.

Median = average of the two “middle” observations = $\frac{x_{(3)} + x_{(4)}}{2}$, where $x_{(j)}$ denotes the j-th smallest observation.
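The odd and even cases of the median rule can be sketched as follows. The six-value list is hypothetical and only serves to exercise the even case.

```python
def median(values):
    """Median: middle observation (odd n) or average of the two middle ones (even n)."""
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # one observation in the middle
    return (ordered[mid - 1] + ordered[mid]) / 2   # average of the two middle observations

print(median([10, 15, 21, 7, 13]))        # 13
print(median([8, 12, 19, 23, 41, 64]))    # 21.0 (hypothetical data)
```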

Example: The data on N = 23 students. Variables:
– Verbal IQ
– Math IQ
– Initial Reading Achievement Score
– Final Reading Achievement Score

Data Set #3: The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program. Table columns: Student, Verbal IQ, Math IQ, Initial Reading Achievement, Final Reading Achievement.

Computing the Median (Stem-Leaf Diagrams): with 23 observations, the median = middle observation = 12th observation.

Summary


Some Comments: The mean is the centre of gravity of a set of observations: the balancing point. The median splits the observations into two parts of approximately 50% each.

The median splits the area under a histogram into two parts of 50% each. The mean is the balancing point of a histogram.

For symmetric distributions, the mean and the median will be approximately the same value.

For positively skewed distributions, the mean exceeds the median. For negatively skewed distributions, the median exceeds the mean.
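The relationship between skewness and the ordering of the mean and median can be illustrated with a small made-up sample. The data below are hypothetical: a long right tail gives positive skew, and mirroring the values gives negative skew.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, one is large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirroring the values produces a left-skewed sample.
left_skewed = [-v for v in right_skewed]

for name, sample in [("positive skew", right_skewed), ("negative skew", left_skewed)]:
    print(name, "mean =", statistics.mean(sample), "median =", statistics.median(sample))
```

In the first sample the mean is pulled toward the long tail and exceeds the median; in the mirrored sample the order is reversed.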

An outlier is a “wild” observation in the data. Outliers occur because of:
– errors (typographical and computational)
– extreme cases in the population

The mean is altered to a significant degree by the presence of outliers, whereas outliers have little effect on the value of the median. This is a reason for using the median in place of the mean as a measure of central location. On the other hand, the mean is the best measure of central location when the data are Normally distributed (bell-shaped).
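The sensitivity of the mean to outliers can be seen by corrupting one value. In this sketch the "typing error" (21 entered as 210) is an invented scenario, not data from the slides.

```python
import statistics

clean = [7, 10, 13, 15, 21]
# Hypothetical typographical error: 21 entered as 210.
with_outlier = [7, 10, 13, 15, 210]

mean_shift = statistics.mean(with_outlier) - statistics.mean(clean)
median_shift = statistics.median(with_outlier) - statistics.median(clean)

print("change in mean:  ", mean_shift)
print("change in median:", median_shift)
```

A single wild observation moves the mean a long way while the median stays where it was.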

Review

Summarizing Data: Graphical Methods

Histogram, Stem-Leaf Diagram, Grouped Frequency Table

Numerical Measures:
– Measures of Central Tendency (Location)
– Measures of Non-Central Location
– Measures of Variability (Dispersion, Spread)
– Measures of Shape
The objective is to reduce the data to a small number of values that describe the data, or certain aspects of the data.


Measures of Non-Central Location: Percentiles, Quartiles (Hinges, Mid-hinges)

Definition: The P×100 percentile is a point, $x_P$, underneath a distribution that has a fixed proportion P of the population (or sample) below that value.

Definition (Quartiles): The first quartile, $Q_1$, is the 25th percentile, $x_{0.25}$.

The second quartile, $Q_2$, is the 50th percentile, $x_{0.50}$. The second quartile is also the median.

The third quartile, $Q_3$, is the 75th percentile, $x_{0.75}$.

The quartiles $Q_1$, $Q_2$, $Q_3$ divide the population into 4 equal parts of 25% each.

Computing Percentiles and Quartiles – Method 1: The first step is to arrange the observations in increasing order. We then compute the position, k, of the P×100 percentile: $k = P \times (n+1)$, where n = the number of observations.

Example: The data on n = 23 students. Variables: Verbal IQ, Math IQ, Initial Reading Achievement Score, Final Reading Achievement Score. We want to compute the 75th percentile and the 90th percentile.

The position, k, of the 75th percentile: $k = P \times (n+1) = 0.75 \times (23+1) = 18$. The position, k, of the 90th percentile: $k = P \times (n+1) = 0.90 \times (23+1) = 21.6$. When the position k is an integer, the percentile is the k-th observation (in order of magnitude) in the data set. For example, the 75th percentile is the 18th (in size) observation.

When the position k is not an integer but an integer (m) plus a fraction (f), i.e. k = m + f, then the percentile is $x_P = (1-f) \times (m\text{-th observation in size}) + f \times ((m+1)\text{-st observation in size})$. In the example the position of the 90th percentile is k = 21.6, so m = 21 and f = 0.6. Then $x_{0.90} = 0.4 \times (21\text{st observation in size}) + 0.6 \times (22\text{nd observation in size})$.

Thus the position of $x_P$ is 100f% of the way through the interval between the m-th observation and the (m+1)-st observation.

Example: Using the Verbal IQ data on n = 23 students, arranged in increasing order:

$x_{0.75}$ = 75th percentile = 18th observation in size = 105 (position k = 18)
$x_{0.90}$ = 90th percentile = 0.4(21st observation in size) + 0.6(22nd observation in size) = 0.4(111) + 0.6(118) = 115.2 (position k = 21.6)
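Method 1 can be sketched as a short function. The name `percentile_method1` is ours, and the clamping at the two ends is an added assumption, since the slides do not say what to do when the position falls before the first or after the last observation. The full Verbal IQ list is not reproduced here, so the check uses the five-number example plus the final interpolation step quoted above.

```python
def percentile_method1(values, p):
    """P*100 percentile at position k = p * (n + 1), with linear interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    k = p * (n + 1)
    m = int(k)      # integer part of the position
    f = k - m       # fractional part of the position
    # Assumption (not in the slides): clamp positions outside 1..n.
    if m < 1:
        return ordered[0]
    if m >= n:
        return ordered[-1]
    # (1 - f) * (m-th observation) + f * ((m+1)-st observation), 1-based.
    return (1 - f) * ordered[m - 1] + f * ordered[m]

data = [7, 10, 13, 15, 21]
print(percentile_method1(data, 0.25))   # 8.5
print(percentile_method1(data, 0.50))   # 13.0
print(percentile_method1(data, 0.75))   # 18.0

# Last step of the 90th-percentile example: k = 21.6, so m = 21 and f = 0.6.
x_90 = 0.4 * 111 + 0.6 * 118
print(round(x_90, 1))                   # 115.2
```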

An Alternative Method for Computing Quartiles – Method 2: Sometimes this method will result in the same values for the quartiles as Method 1, and sometimes in different values. For large samples the two methods will result in approximately the same answer.

Let $x_1, x_2, x_3, \dots, x_n$ denote a set of n numbers. The first step in Method 2 is to arrange the numbers in increasing order. From the arranged numbers we compute the median. This is also called the hinge.

Example: Consider 5 numbers which, arranged in increasing order, are: 7, 10, 13, 15, 21. The median (or hinge), 13, splits the observations in half.

The lower mid-hinge (the first quartile) is the “median” of the lower half of the observations (excluding the median). The upper mid-hinge (the third quartile) is the “median” of the upper half of the observations (excluding the median).

Consider the five numbers in increasing order: 7, 10, 13, 15, 21.
Median (Hinge): 13
Lower half: 7, 10. Lower Mid-Hinge (First Quartile): (7+10)/2 = 8.5
Upper half: 15, 21. Upper Mid-Hinge (Third Quartile): (15+21)/2 = 18

Computing the median and the quartiles using the first method:
Position of the median: k = 0.5(5+1) = 3, so $Q_2$ = 13
Position of the first quartile: k = 0.25(5+1) = 1.5, so $Q_1$ = 0.5(7) + 0.5(10) = 8.5
Position of the third quartile: k = 0.75(5+1) = 4.5, so $Q_3$ = 0.5(15) + 0.5(21) = 18

Both methods result in the same values. This is not always true.
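Method 2 (hinge and mid-hinges) can be sketched as below. The function names are ours. For an even number of observations no single observation is the median, so the sketch simply splits the ordered data into two halves; that handling is an assumption, as the slides only show an odd-sized example. At least two observations are assumed.

```python
def median_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles_method2(values):
    """Return (Q1, Q2, Q3) as lower mid-hinge, hinge, upper mid-hinge."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the median itself is excluded from both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median_sorted(lower), median_sorted(ordered), median_sorted(upper)

print(quartiles_method2([7, 10, 13, 15, 21]))   # (8.5, 13, 18.0)
print(quartiles_method2([1, 2, 3, 4, 5, 6]))    # (2, 3.5, 5)
```

For the six values 1 to 6, Method 1 places the first quartile at k = 0.25(6+1) = 1.75, giving 0.25(1) + 0.75(2) = 1.75 rather than 2, which is one case where the two methods disagree.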

Example: For the Verbal IQ data on n = 23 students, arranged in increasing order, Method 2 gives:
Median (Hinge): 96
Lower Mid-Hinge (First Quartile): 89
Upper Mid-Hinge (Third Quartile): 105

Computing the median and the quartiles using the first method:
Position of the median: k = 0.5(23+1) = 12, so $Q_2$ = 96
Position of the first quartile: k = 0.25(23+1) = 6, so $Q_1$ = 89
Position of the third quartile: k = 0.75(23+1) = 18, so $Q_3$ = 105

Many programs compute percentiles, quartiles etc. Each may use different methods. It is important to know which method is being used. The different methods result in answers that are close when the sample size is large.
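As one illustration of software offering more than one rule, Python's standard `statistics.quantiles` (Python 3.8 or later) takes a `method` argument. On the five-number example its `"exclusive"` option reproduces the Method 1 positions k = P(n+1), while `"inclusive"` gives different quartiles. This is only a sketch of the point being made, not a survey of statistical packages.

```python
import statistics

data = [7, 10, 13, 15, 21]

# n=4 asks for the three cut points that split the data into quarters.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)   # [8.5, 13.0, 18.0]
print(inclusive)   # [10.0, 13.0, 15.0]
```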


Box Plots (Box-Whisker Plots): A graphical method of displaying data. An alternative to the histogram and stem-leaf diagram.

To Draw a Box Plot: Compute the hinge (median, $Q_2$) and the mid-hinges (first and third quartiles, $Q_1$ and $Q_3$). We also compute the largest and smallest of the observations: the max and the min.

Example: For the Verbal IQ data on n = 23 students: min = 80, $Q_1$ = 89, $Q_2$ = 96, $Q_3$ = 105, max = 119.

The box plot is then drawn by:
– drawing above an axis a “box” from $Q_1$ to $Q_3$;
– drawing a vertical line in the box at the median, $Q_2$;
– drawing whiskers at the lower and upper ends of the box, going down to the min and up to the max.
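The five values needed to draw the box plot can be collected in one step. A minimal sketch, assuming the Method 1 quartile rule (via the `"exclusive"` option of Python's `statistics.quantiles`); the function name `five_number_summary` is ours, and the five-number example stands in for the Verbal IQ data, which is not reproduced here.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max): the values a box plot is drawn from."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)

lo, q1, q2, q3, hi = five_number_summary([7, 10, 13, 15, 21])
print("box from", q1, "to", q3, "with median line at", q2)
print("whiskers down to", lo, "and up to", hi)
```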

[Figure: box plot schematic labelling the box, the lower whisker, the upper whisker, and the positions of min, $Q_1$, $Q_2$, $Q_3$ and max]

[Figure: Box plot of Verbal IQ]

A box plot can also be drawn vertically.

[Figure: Box-whisker plots (Verbal IQ, Math IQ)]

[Figure: Box-whisker plots (Initial RA, Final RA)]

Summary: Information contained in the box plot: the box contains the middle 50% of the population, and each whisker spans a further 25%.

Next topic: Numerical Measures of Variability