Descriptive Statistics: Numerical Methods

Chapter 3 Descriptive Statistics: Numerical Methods

Descriptive Statistics
3.1 Describing Central Tendency
3.2 Measures of Variation
3.3 Percentiles, Quartiles, and Box-and-Whiskers Displays
3.4 Covariance, Correlation, and the Least Squares Line
3.5 Weighted Means and Grouped Data (Optional)
3.6 The Geometric Mean (Optional)

Describing Central Tendency
In addition to describing the shape of a distribution, we want to describe the data set’s central tendency.
A measure of central tendency represents the center or middle of the data.

Parameters and Statistics
A population parameter is a number calculated from all the population measurements that describes some aspect of the population.
A sample statistic is a number calculated using the sample measurements that describes some aspect of the sample.

Measures of Central Tendency
Mean, µ: The average or expected value
Median, Md: The value of the middle point of the ordered measurements
Mode, Mo: The most frequent value

The Mean
Population: X1, X2, …, XN, with population mean µ
Sample: x1, x2, …, xn, with sample mean x̄

The Sample Mean
For a sample of size n, the sample mean is defined as
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$
and is a point estimate of the population mean µ.
It is the value to expect, on average and in the long run.

Example 3.1: The Car Mileage Case
Sample mean for the first five car mileages from Table 3.1 (30.8, 31.7, 30.1, 31.6, 32.1):
$\bar{x} = (30.8 + 31.7 + 30.1 + 31.6 + 32.1)/5 = 156.3/5 = 31.26$
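The sample mean in Example 3.1 can be checked directly from its definition. The following is a minimal Python sketch; the variable names are illustrative and do not come from the slides.

```python
# First five car mileages from Table 3.1 (mpg)
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

# Sample mean: sum of the measurements divided by the sample size n
n = len(mileages)
sample_mean = sum(mileages) / n

print(round(sample_mean, 2))  # 31.26
```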

The Median
The median Md is a value such that 50% of all measurements, after having been arranged in numerical order, lie above (or below) it.
If the number of measurements is odd, the median is the middlemost measurement in the ordering.
If the number of measurements is even, the median is the average of the two middlemost measurements in the ordering.

Example: Car Mileage Case
Example 3.1: First five observations from Table 3.1: 30.8, 31.7, 30.1, 31.6, 32.1
In order: 30.1, 30.8, 31.6, 31.7, 32.1
The number of observations is odd, so the median is the middle one, 31.6
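The odd/even rule for the median can be sketched in a few lines of Python. The function name `median` is an illustrative choice, not something defined in the slides.

```python
def median(values):
    """Return the median using the odd/even rule for ordered measurements."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the middlemost measurement
        return ordered[mid]
    # Even count: average of the two middlemost measurements
    return (ordered[mid - 1] + ordered[mid]) / 2

mileages = [30.8, 31.7, 30.1, 31.6, 32.1]
print(median(mileages))  # 31.6
```

With an even number of values, for instance `median([1, 2, 3, 4])`, the function averages the two middle values and returns 2.5.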

The Mode
The mode Mo of a population or sample of measurements is the measurement that occurs most frequently.
Modes are the values that are observed “most typically”.
Sometimes the highest frequency occurs at two or more values.
If there are two modes, the data are bimodal.
If there are more than two modes, the data are multimodal.
When data are in classes, the class with the highest frequency is the modal class; it is the tallest box in the histogram.
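A short Python sketch of finding the mode or modes by counting frequencies follows. The function name `modes` and both small data sets are hypothetical, chosen only to show the unimodal and bimodal cases.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

# Hypothetical data, for illustration only
one_mode = [3, 4, 4, 5, 6]
two_modes = [3, 4, 4, 5, 6, 6]

print(modes(one_mode))   # [4]
print(modes(two_modes))  # [4, 6]  (bimodal)
```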

Histogram Describing the 50 Mileages

Relationships Among Mean, Median and Mode

Measures of Variation
Knowing the measures of central tendency is not enough.
Two distributions can have identical measures of central tendency and still differ in variation.

Measures of Variation
Range: Largest minus the smallest measurement
Variance: The average of the squared deviations of all the population measurements from the population mean
Standard Deviation: The square root of the variance

The Range
Largest minus smallest
Measures the interval spanned by all the data
For Figure 3.13, the largest repair time is 5 and the smallest is 3
Range is 5 – 3 = 2 days

Population Variance and Standard Deviation
The population variance (σ²) is the average of the squared deviations of the individual population measurements from the population mean (µ).
The population standard deviation (σ) is the positive square root of the population variance.

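Variance
For a population of size N, the population variance σ² is:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
For a sample of size n, the sample variance s² is:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$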

Standard Deviation
Population standard deviation (σ):
$\sigma = \sqrt{\sigma^2}$
Sample standard deviation (s):
$s = \sqrt{s^2}$

Example: Chris’s Class Sizes This Semester
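Data points are: 60, 41, 15, 30, 34
Mean is 36
Variance is (treating the five classes as a population): $\sigma^2 = [(60-36)^2 + (41-36)^2 + (15-36)^2 + (30-36)^2 + (34-36)^2]/5 = 1082/5 = 216.4$
Standard deviation is: $\sigma = \sqrt{216.4} \approx 14.71$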
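The class-size example can be worked through in Python. This sketch assumes the five classes are the whole population of interest, so it divides by N; that reading is an assumption, as are the variable names.

```python
import math

# Chris's class sizes this semester
class_sizes = [60, 41, 15, 30, 34]

N = len(class_sizes)
mu = sum(class_sizes) / N  # population mean

# Population variance: average squared deviation from the mean (divide by N)
pop_variance = sum((x - mu) ** 2 for x in class_sizes) / N

# Population standard deviation: positive square root of the variance
pop_std = math.sqrt(pop_variance)

print(mu)                      # 36.0
print(round(pop_variance, 1))  # 216.4
print(round(pop_std, 2))       # 14.71
```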

Example: Sample Variance and Standard Deviation
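Example 3.7: Data for the first five car mileages from Table 3.1 are 30.8, 31.7, 30.1, 31.6, 32.1
The sample mean is 31.26
The sample variance is $s^2 = \frac{\sum_{i=1}^{5} (x_i - 31.26)^2}{5 - 1} = \frac{2.572}{4} = 0.643$
The sample standard deviation is $s = \sqrt{0.643} \approx 0.8019$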
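The sample variance and standard deviation for these five mileages can be checked with a short Python sketch. It divides by n - 1, as in the sample variance definition; the variable names are illustrative.

```python
import math

# First five car mileages from Table 3.1 (mpg)
mileages = [30.8, 31.7, 30.1, 31.6, 32.1]

n = len(mileages)
x_bar = sum(mileages) / n  # sample mean

# Sample variance: sum of squared deviations divided by n - 1
sample_variance = sum((x - x_bar) ** 2 for x in mileages) / (n - 1)

# Sample standard deviation: positive square root of the sample variance
sample_std = math.sqrt(sample_variance)

print(round(x_bar, 2))            # 31.26
print(round(sample_variance, 3))  # 0.643
print(round(sample_std, 4))       # 0.8019
```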

The Empirical Rule for Normal Populations
If a population has mean µ and standard deviation σ and is described by a normal curve, then:
68.26% of the population measurements lie within one standard deviation of the mean: [µ-σ, µ+σ]
95.44% of the population measurements lie within two standard deviations of the mean: [µ-2σ, µ+2σ]
99.73% of the population measurements lie within three standard deviations of the mean: [µ-3σ, µ+3σ]

The Empirical Rule and Tolerance Intervals

Example 3.9: The Car Mileage Case Continued
An estimated 68.26% of all individual cars will have mileages in the range [x̄ ± s] = [31.6 ± 0.8] = [30.8, 32.4] mpg
An estimated 95.44% of all individual cars will have mileages in the range [x̄ ± 2s] = [31.6 ± 1.6] = [30.0, 33.2] mpg
An estimated 99.73% of all individual cars will have mileages in the range [x̄ ± 3s] = [31.6 ± 2.4] = [29.2, 34.0] mpg
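The three intervals in Example 3.9 follow from the rounded values 31.6 and 0.8 given above. A minimal Python sketch, with a hypothetical helper named `tolerance_interval`, reproduces them:

```python
def tolerance_interval(mean, std, k):
    """Return the interval [mean - k*std, mean + k*std], rounded to one decimal."""
    return (round(mean - k * std, 1), round(mean + k * std, 1))

# Rounded sample mean and standard deviation from Example 3.9 (mpg)
x_bar = 31.6
s = 0.8

# Empirical-rule percentages for 1, 2, and 3 standard deviations
coverage = {1: 68.26, 2: 95.44, 3: 99.73}

intervals = {k: tolerance_interval(x_bar, s, k) for k in coverage}
for k, pct in coverage.items():
    low, high = intervals[k]
    print(f"{pct}%: [{low}, {high}] mpg")
```

These intervals are only estimates, and they rely on the mileage population being approximately normal.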

Estimated Tolerance Intervals in the Car Mileage Case

Chebyshev’s Theorem
Let µ and σ be a population’s mean and standard deviation. Then, for any value k > 1, at least 100(1 - 1/k²)% of the population measurements lie in the interval [µ-kσ, µ+kσ].
Only practical for a non-mound-shaped population distribution that is not very skewed.
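To get a feel for the bound, the percentage 100(1 - 1/k²) can be evaluated for a few values of k. This Python sketch uses a hypothetical function name, `chebyshev_lower_bound`:

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 100 * (1 - 1 / k ** 2)

print(chebyshev_lower_bound(2))            # 75.0
print(round(chebyshev_lower_bound(3), 2))  # 88.89
```

For k = 2 and k = 3 the guaranteed percentages (75% and about 88.89%) are well below the 95.44% and 99.73% that the Empirical Rule gives for normal populations, which is the price of holding for any distribution shape.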

z Scores
For any x in a population or sample, the associated z score is
$z = \frac{x - \text{mean}}{\text{standard deviation}}$
The z score is the number of standard deviations that x is from the mean.
A positive z score is for x above (greater than) the mean.
A negative z score is for x below (less than) the mean.

Example: z Score
Population of profit margins for five American companies: 8%, 10%, 15%, 12%, 5%
µ = 10%, σ = 3.406%
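The z scores for the five profit margins can be computed in Python as a check on the stated µ and σ. This is a sketch with illustrative variable names; it treats the five companies as a population, as the example does.

```python
import math

# Profit margins (%) for the five companies
margins = [8, 10, 15, 12, 5]

N = len(margins)
mu = sum(margins) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in margins) / N)

# z score: number of standard deviations each value lies from the mean
z_scores = [(x - mu) / sigma for x in margins]

print(mu)               # 10.0
print(round(sigma, 3))  # 3.406
print([round(z, 3) for z in z_scores])  # [-0.587, 0.0, 1.468, 0.587, -1.468]
```

The 10% margin sits exactly at the mean (z = 0), while 15% and 5% are each about 1.47 standard deviations away, on opposite sides.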

Coefficient of Variation
Measures the size of the standard deviation relative to the size of the mean
Coefficient of variation = (standard deviation / mean) × 100%
Used to:
Compare the relative variabilities of values about the mean
Compare the relative variability of populations or samples with different means and different standard deviations
Measure risk
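As an illustration, the coefficient of variation can be computed for two data sets that appear earlier in this chapter: the profit margins (µ = 10, σ = 3.406) and the car mileages (mean 31.6, standard deviation 0.8). The function name is hypothetical, and pairing these two data sets is only for illustration.

```python
def coefficient_of_variation(std, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std / mean * 100

# Values taken from the z score and car mileage examples
cv_margins = coefficient_of_variation(3.406, 10)
cv_mileages = coefficient_of_variation(0.8, 31.6)

print(round(cv_margins, 2))   # 34.06
print(round(cv_mileages, 2))  # 2.53
```

Relative to their means, the profit margins vary far more than the mileages, even though the two are measured in different units.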
