1
Stat 2411 Statistical Methods
Chapter 3 Measures of Location
2
3.1 Populations and Samples
Population: all conceivably or hypothetically possible observations.
Sample: the particular observations actually taken.
3
Population Example: Temperatures of patients with meningitis.
There is an unlimited (infinite) number of potential observations, such as 100.2, 101.5, 100.3.
[Figure: population of potential measurements]
4
Sample (n = 10), values shown: 104.0, 100.2, 100.8, 108.0, 104.8, 102.4
Notation: $x_i$ = value of the $i$-th observation
5
3.2 The mean
Sample mean $= \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ = average = center of gravity
6
Summation notation
7
Population Descriptions
The population mean is the average of all values in the population of potential values.
Population mean = $\mu$
Population descriptions are denoted by Greek letters like $\mu$.
Meningitis example: $\mu$ = average of all potential temperature measurements of all meningitis cases.
8
Parameter and Statistic
Population descriptions: parameters
Sample descriptions: statistics
Sample statistics are usually used to estimate the corresponding population parameters.
9
3.3 Weighted mean
            Weight   X
Homework    20       90
Exam 1      8        82
Exam 2      11       87
Final
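A weighted mean divides the sum of weight times value by the sum of the weights, $\bar{x}_w = \sum w_i x_i / \sum w_i$. The following is a minimal sketch of that calculation. It uses only the three rows whose weight and score are shown (Homework, Exam 1, Exam 2), so the result is a partial figure and not the course grade; the function name `weighted_mean` is a hypothetical helper.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Only the rows with both numbers shown; the Final row is left out
# because its weight and score are not given.
weights = [20, 8, 11]   # Homework, Exam 1, Exam 2
scores = [90, 82, 87]

partial = weighted_mean(scores, weights)
print(round(partial, 2))
```

Because Homework carries the largest weight, the result sits closer to 90 than the plain average of the three scores would.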
10
Geometric Mean (problem 3.15)
Sometimes data are analyzed in the log scale (for reasons discussed later).
Geometric mean = back-transformed mean of the logs:
transform $y = \log_{10} x$, take the mean $\bar{y}$, then back-transform to get $10^{\bar{y}}$.
11
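Geometric mean
Example:
x: 1, 10, 100
y: 0, 1, 2
$\bar{y} = 1$, so the geometric mean is $10^{1} = 10$.
Algebraically equivalent formula: $\left(x_1 x_2 \cdots x_n\right)^{1/n}$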
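As a quick check, the sketch below computes the geometric mean of the example values 1, 10, 100 in two ways: by back-transforming the mean of the base-10 logs, and by taking the n-th root of the product. The variable names are illustrative only.

```python
import math

x = [1, 10, 100]

# Route 1: mean of the logs, then back-transform.
logs = [math.log10(v) for v in x]
mean_log = sum(logs) / len(logs)
gm_from_logs = 10 ** mean_log

# Route 2: n-th root of the product of the values.
gm_from_product = math.prod(x) ** (1 / len(x))

# The two routes agree up to floating-point rounding.
print(gm_from_logs, gm_from_product)
print(math.isclose(gm_from_logs, gm_from_product))
```

Both routes give 10 (up to rounding), well below the arithmetic mean of 37 for the same three values.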
12
Harmonic Mean
Back-transformed mean of 1/x
Example: x = time, y = rate
X     Y
1     1
10    0.1
Boat example: current 1 mph, 15 miles each way; 3 mph upstream, 5 mph downstream.
Harmonic mean = 30 miles / (5 hours up + 3 hours down) = 3.75 mph
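The boat example can be checked numerically. The sketch below, with illustrative variable names, computes the overall speed as total distance over total time and compares it with the harmonic mean of the two speeds, using `statistics.harmonic_mean` from the Python standard library.

```python
from statistics import harmonic_mean

distance_each_way = 15      # miles
speeds = [3, 5]             # mph upstream, mph downstream

# Total distance divided by total time (5 hours up + 3 hours down).
total_time = sum(distance_each_way / s for s in speeds)
average_speed = 2 * distance_each_way / total_time

# Harmonic mean: back-transform the mean of the reciprocals 1/x.
mean_reciprocal = sum(1 / s for s in speeds) / len(speeds)
hm_by_hand = 1 / mean_reciprocal
hm_library = harmonic_mean(speeds)

print(average_speed, hm_by_hand, hm_library)
```

All three numbers come out at 3.75 mph, lower than the arithmetic mean of 4 mph, because the boat spends more time at the slower speed.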
13
3.4 The Median
The median M is the midpoint of a data set. When the observations are ordered from smallest to largest, M is in the middle, with half the observations smaller and half larger. Example: M = 7.
If there is an even number of observations (no single midpoint), the median lies between the two middle points, so the average of the two is used as the estimate of the median value. Example: M = (sum of the two middle values)/2 = 6.
14
Means vs. Medians
Data: 3, 5, 7, 9, 38
M = 7
$\bar{x} = \frac{3 + 5 + 7 + 9 + 38}{5} = \frac{62}{5} = 12.4$
The two values can behave very differently, because the median (M) is resistant to the magnitude of possible outliers, but the mean ($\bar{x}$) is not, so it can be drawn toward them. Here the median remains unaffected, but the mean is drawn toward the outlier at 38.
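The effect of the outlier can be seen by computing both summaries with and without it. This is a small sketch using Python's standard `statistics` module; the comparison list, in which 38 is swapped for 11, is a hypothetical data set added only for contrast.

```python
from statistics import mean, median

with_outlier = [3, 5, 7, 9, 38]     # data from the slide
without_outlier = [3, 5, 7, 9, 11]  # hypothetical: 38 replaced by 11

for label, data in [("with 38", with_outlier), ("with 11", without_outlier)]:
    # The median depends only on the middle position; the mean uses every value.
    print(label, "median =", median(data), "mean =", mean(data))
```

The median is 7 in both lists, while the mean moves from 7 to 12.4 once the value 38 is present.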
15
Mode
The value that occurs most frequently.
Mode = 108
[Figure: stem-and-leaf plot; stems 9, 10, 11, 12, 13; surviving leaves: 06, 02448, 04]
16
Fractiles
Quartiles: divide data into 4 parts.
Deciles: divide data into 10 parts.
Percentiles: divide data into 100 parts.
Among the many fractiles, quartiles are used very often in describing data. Quartiles are the values at or below which 25% (Q1), 50% (Q2 = median) and 75% (Q3) of the observations fall, and they can be used to describe the internal variability.
Just as we defined the center, we now need a definition of the spread (variability) of the distribution. That is where quartiles come in.
17
Defining the Quartiles
To calculate the quartiles:
1. Arrange the observations in increasing order and locate the median M in the ordered list of observations.
2. The first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the location of the overall median.
3. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the location of the overall median.
Quartiles are defined following the same rules as we followed for identifying the median (M), which makes sense since the median is the same thing as the second quartile (50th percentile).
18
Calculating (Identifying) the Quartiles
26 systolic blood pressure measurements: Q1 = 108, M = 112, Q3 = 120
M = (112 + 112)/2 = 112
With an even number of measures, there is no single value at the midpoint of the distribution. The median is really between the two 112's, with a value estimated by their average, 112.
To the left of the median there are 13 points: 90, 96, 100, …, 112. The median of those values (Q1) is the midpoint of the 13 numbers, which is the 7th number, 108.
To the right of the median there are 13 points: 112, 114, 114, …, 134. The median of those values (Q3) is the midpoint of these 13 numbers, the 7th number counting backwards from the largest: 120.
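The median-of-halves rule used here can be sketched in a few lines. The function name `quartiles` and the two small data sets are hypothetical (the full list of 26 blood pressures is not reproduced), and other software may use a different quartile convention and give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, M, Q3) by the median-of-halves rule; needs at least 2 values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # positions left of the median location
    upper = ordered[half + (n % 2):]    # positions right of it; skips the middle value when n is odd
    return median(lower), median(ordered), median(upper)

# Hypothetical data, not the blood pressure measurements.
even_case = [1, 2, 3, 4, 5, 6, 7, 8]
odd_case = [1, 2, 3, 4, 5, 6, 7]

print(quartiles(even_case))
print(quartiles(odd_case))
```

With 26 ordered observations the same split yields two halves of 13 values each, so Q1 and Q3 are the 7th value counted from either end.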
19
Graphing the Five-Number Summary
The Box Plot: Graphing the Five-Number Summary (Min, Q1, Median, Q3, Max)
[Figure: box plot; vertical axis: values of the variable; marked from top to bottom: maximum (largest observation), Q3 (75th percentile), median M (50th percentile), Q1 (25th percentile), minimum (smallest observation)]
A box plot is a method of plotting the values of the five-number summary, from lowest to highest.
ALWAYS PLOT YOUR DATA
Box plots can show very large datasets and highlight skewness. Because they show less detail than histograms or stemplots, they are best used for side-by-side comparison of more than one dataset.
20
Read section 3.8 on summation notation.