STA 291 Lecture 11: Describing Quantitative Data, Measures of Central Location. Examples of mean and median; review of Chapter 5.

Presentation transcript:

STA 291 Lecture 11. Describing Quantitative Data: Measures of Central Location. Examples of mean and median. Review of Chapter 5, using the probability rules.

You need a calculator for the exam, but no laptop, cellphone, BlackBerry, iPhone, etc. (anything that can transmit a wireless signal is not allowed). Location: Memorial Hall. Time: Tuesday 5-7pm. Talk to me if you have a conflict.

A formula sheet, with the probability rules, the sample mean, etc., will be available. Memorial Hall

Feb pm Covers up to the mean and median of a sample (beginning of Chapter 6), but not any measure of spread (i.e. standard deviation, inter-quartile range, etc.). Chapters 1-5, 6 (first 3 sections) + 23 (first 5 sections).

Summarizing Data Numerically
Center of the data
– Mean (average)
– Median
– Mode (…will not cover)
Spread of the data
– Variance, standard deviation
– Inter-quartile range
– Range

Mathematical Notation: Sample Mean. Sample size: $n$. Observations: $x_1, x_2, \ldots, x_n$. Sample mean "x-bar": $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, a statistic.

Mathematical Notation: Population Mean for a finite population of size N. Population size (finite): $N$. Observations: $x_1, x_2, \ldots, x_N$. Population mean "mu": $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, a parameter.

Infinite populations. Imagine the population mean for an infinite population. It is also denoted by mu, or $\mu$. We cannot compute it (since the population size is infinite), but such a number exists in the limit. It carries the same information.

Infinite population. When the population consists of values that can be ordered, the median for a population also makes sense: it is the number in the middle. Half of the population values will be below it, half will be above.

Mean. If the distribution is highly skewed, then the mean is not representative of a typical observation. Example: monthly income for five persons: 1,000; 2,000; 3,000; 4,000; 100,000. Average monthly income: (1,000 + 2,000 + 3,000 + 4,000 + 100,000)/5 = 22,000. Not representative of a typical observation.

Median = 3,000
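The income example can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the fifth income is taken to be 100,000, the value implied by the stated average of 22,000 for five persons, and the variable name `incomes` is only illustrative.

```python
from statistics import mean, median

# Monthly incomes of the five persons; 100,000 is the extreme observation
# (assumed here because it is the value consistent with a mean of 22,000).
incomes = [1_000, 2_000, 3_000, 4_000, 100_000]

income_mean = mean(incomes)      # pulled upward by the single large income
income_median = median(incomes)  # the middle of the ordered incomes

print(income_mean, income_median)
```

Four of the five persons earn far less than the mean, which is why the median is the better description of a typical income here.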

Median. The median is the measurement that falls in the middle of the ordered sample. When the sample size n is odd, there is a middle value. It has the ordered index (n+1)/2. Example: 1.1, 2.3, 4.6, 7.9, 8.1. Here n=5 and (n+1)/2 = 6/2 = 3, so index = 3. Median = 3rd smallest observation = 4.6.

Median. When the sample size n is even, average the two middle values. Example: 3, 7, 8, 9. Here n=4 and (n+1)/2 = 5/2 = 2.5, so index = 2.5. Median = midpoint between the 2nd and 3rd smallest observations = (7+8)/2 = 7.5.
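The (n+1)/2 index rule for odd and even sample sizes can be sketched in Python as follows. The function name `sample_median` is hypothetical; in practice the standard library's `statistics.median` does the same job.

```python
def sample_median(values):
    """Median via the 1-based ordered index (n + 1) / 2."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    middle = (n + 1) / 2   # ordered index; ends in .5 when n is even
    lower = int(middle)    # whole part of the index
    if middle == lower:    # n odd: there is a single middle value
        return ordered[lower - 1]
    # n even: average the two observations on either side of the index
    return (ordered[lower - 1] + ordered[lower]) / 2


odd_median = sample_median([1.1, 2.3, 4.6, 7.9, 8.1])  # index 3
even_median = sample_median([3, 7, 8, 9])              # index 2.5
print(odd_median, even_median)
```

Sorting first matters: the index refers to the position in the ordered sample, not in the order the data were recorded.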

Summary: Measures of Location. Mean: arithmetic average. Median: midpoint of the observations when they are arranged in increasing order. Mode…. Notation: subscripted variables. n = # of units in the sample; N = # of units in the population; x = variable to be measured; x_i = measurement of the ith unit.

Mean vs. Median
Observations | Median | Mean
1, 2, 3, 4, 5 | 3 | 3
1, 2, 3, 4, | |
3, 3, 3, 3, 3 | 3 | 3
1, 2, 3, 100, | |

Mean vs. Median. If the distribution is symmetric, then mean = median. If the distribution is skewed, then the mean lies more toward the direction of skew. Mean and Median Online Applet

Example: the sample consists of 5 numbers: 3.6, 4.4, 5.9, 2.1, and a last number that is over 10 (sometimes we write it as 10+). Median = 4.4. Can we find the mean here? No.
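A short Python sketch shows why the median survives here while the mean does not. Representing the "10+" observation as positive infinity is an illustrative assumption, chosen only because it sorts after every known value; the lower bound on the mean simply plugs in 10 for the unknown observation.

```python
import math
from statistics import median

# The fifth observation is only known to exceed 10 ("10+").
# math.inf is a stand-in for "some value larger than all the others".
sample = [3.6, 4.4, 5.9, 2.1, math.inf]

# The median only needs the ordering, so the exact size of 10+ is irrelevant.
censored_median = median(sample)

# The mean depends on the actual value; the data only give a lower bound.
mean_lower_bound = (3.6 + 4.4 + 5.9 + 2.1 + 10) / 5

print(censored_median, mean_lower_bound)
```

Whatever the true fifth value is, the ordered sample is 2.1, 3.6, 4.4, 5.9, 10+, so the middle observation does not change.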

Example: Mean and Median. Weights of forty-year-old men: 158, 154, 148, 160, 161, 182, 166, 170, 236, 195, 162. Mean = 172. Ordered weights (ordering a large dataset can take a long time): 148, 154, 158, 160, 161, 162, 166, 170, 182, 195, 236. Median = 162.
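The weights example can be reproduced with the standard `statistics` module. This is a minimal sketch; the eleven weights sum to 1,892, so the mean works out to 1,892/11 = 172.

```python
from statistics import mean, median

weights = [158, 154, 148, 160, 161, 182, 166, 170, 236, 195, 162]

ordered_weights = sorted(weights)   # the ordering step the slide mentions
weights_mean = mean(weights)        # 1892 / 11
weights_median = median(weights)    # 6th of the 11 ordered values

print(ordered_weights)
print(weights_mean, weights_median)
```

The mean sits above the median because the single weight of 236 pulls it upward.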

Eyeball the plot to find the mean/median.

Extreme-valued observations pull the mean, but not the median. For data with a symmetric histogram, mean = median.


Using the probability rules. On a typical weekday, a restaurant sells ? gallons of house soup. Given that P(sell more than 5 gallons) = 0.8; P(sell less than 10 gallons) = 0.7; P(sell between 5 and 10 gallons) = 0.5.
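The three probabilities fit together through the complement rule. The sketch below assumes the amount sold is continuous, so the boundary values of exactly 5 or exactly 10 gallons carry no probability; `Fraction` is used only to keep the arithmetic exact, and all variable names are illustrative.

```python
from fractions import Fraction

p_more_than_5 = Fraction(8, 10)    # P(sell more than 5 gallons)
p_less_than_10 = Fraction(7, 10)   # P(sell less than 10 gallons)

# Complement rule: the two "outside" events.
p_at_most_5 = 1 - p_more_than_5      # P(sell 5 gallons or less)
p_at_least_10 = 1 - p_less_than_10   # P(sell 10 gallons or more)

# The outside events are disjoint, so what remains lies between 5 and 10.
p_between = 1 - p_at_most_5 - p_at_least_10

print(float(p_between))
```

Equivalently, P(between 5 and 10) = 0.8 + 0.7 - 1, since the two given events together cover every possible outcome.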

Why not always the median? Disadvantage: it is insensitive to changes within the lower or upper half of the data. Example: 1, 2, 3, 4, 5, 6, 7 vs. 1, 2, 3, 4, 100, 100, 100. For symmetric, bell-shaped distributions, the mean is more informative. The mean is easy to work with. Ordering can take a long time. Sometimes, the mean is more informative even when the distribution is slightly skewed.
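The two samples in this example can be compared directly. A minimal sketch with the standard `statistics` module; the names `sample_a` and `sample_b` are illustrative.

```python
from statistics import mean, median

sample_a = [1, 2, 3, 4, 5, 6, 7]
sample_b = [1, 2, 3, 4, 100, 100, 100]

# Both samples share the same middle (4th) ordered value ...
median_a, median_b = median(sample_a), median(sample_b)

# ... but the mean reacts to the change in the upper half of the data.
mean_a, mean_b = mean(sample_a), mean(sample_b)

print(median_a, median_b)
print(mean_a, round(mean_b, 2))
```

The median reports the same value for two very different samples, which is the insensitivity described above.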

Census Data | Lexington | Fayette County | Kentucky | United States
Population | 261,545 | | 4,069,734 | 281,422,131
Area in square miles | 306 | | 40,131 | 3,554,141
People per sq. mi | | | |
Median Age | | | |
Median Family Income | $42,500 | $39,500 | $32,101 | $40,591

Real Estate Market Data | Lexington | Fayette County | Kentucky | United States
Total Housing Units | 54, | | ,524 | 115,904,743
Average Home Price | $151,776 | | $115,545 | $173,585
Median Rental Price | $383 | | $257 | $471
Owner Occupied | 52% | | 64% | 60%

Given a histogram, find the approximate mean and median.



Five-Number Summary: Maximum, Upper Quartile, Median, Lower Quartile, Minimum. Statistical software: SAS output (Murder Rate Data)
Quantile | Estimate
100% Max |
75% Q3 |
50% Median |
25% Q1 |
0% Min | 1.60
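Since the murder rate data are not reproduced in this transcript, the sketch below computes a five-number summary for the eleven weights from the earlier example instead. It uses Python's `statistics.quantiles` with its default "exclusive" method; quartile definitions differ between software packages, so SAS may report slightly different quartiles for the same data. The function name `five_number_summary` is hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sample."""
    ordered = sorted(values)
    # n=4 cuts the data into quarters and returns the three cut points.
    q1, med, q3 = quantiles(ordered, n=4)
    return ordered[0], q1, med, q3, ordered[-1]


weights = [158, 154, 148, 160, 161, 182, 166, 170, 236, 195, 162]
summary = five_number_summary(weights)
print(summary)
```

With eleven observations, this method places the quartiles at the 3rd, 6th, and 9th ordered values, so no interpolation is needed.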

Five-Number Summary: Maximum, Upper Quartile, Median, Lower Quartile, Minimum. Example: the five-number summary for a data set is min=4, Q1=256, median=530, Q3=1105, max=320,000. What does this suggest about the shape of the distribution?

Box plot. A box plot is a graphic representation of the five-number summary, provided the max is within 1.5 IQR of Q3 (and the min is within 1.5 IQR of Q1).
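The 1.5 IQR condition can be checked numerically. This sketch applies it to the five-number summary from the earlier example (min=4, Q1=256, Q3=1105, max=320,000); the names `lower_fence` and `upper_fence` are conventional labels for the cut-offs, not terms used on the slides.

```python
# Five-number summary values from the earlier example.
minimum, q1, q3, maximum = 4, 256, 1105, 320_000

iqr = q3 - q1                   # inter-quartile range
lower_fence = q1 - 1.5 * iqr    # min must not fall below this
upper_fence = q3 + 1.5 * iqr    # max must not rise above this

max_within_fence = maximum <= upper_fence
min_within_fence = minimum >= lower_fence

print(iqr, lower_fence, upper_fence)
print(max_within_fence, min_within_fence)
```

Here the maximum lies far beyond Q3 + 1.5 IQR, so for this data set the whisker would not extend to the max and the largest values would be drawn as separate points.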

Attendance Survey Question. On a 4”x6” index card, write down your name and section number. Question: pick one, mean or median. _______ is a measure more resistant to extreme-valued observations in the sample.