
Lecture 5 Dustin Lueker

• Mode - Most frequent value
• Notation: Subscripted variables
  ◦ n = # of units in the sample
  ◦ N = # of units in the population
  ◦ x = Variable to be measured
  ◦ x_i = Measurement of the i-th unit
• Mean - Arithmetic average
• Median - Midpoint of the observations when they are arranged in increasing order

STA 291 Summer 2010 Lecture 5

• Sample
  ◦ Variance
  ◦ Standard Deviation
• Population
  ◦ Variance
  ◦ Standard Deviation

1. Calculate the mean
2. For each observation, calculate the deviation (the observation minus the mean)
3. For each observation, calculate the squared deviation
4. Add up all the squared deviations
5. Divide the result by (n-1), or by N if you are finding the population variance

(To get the standard deviation, take the square root of the result)
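
The five steps above can be sketched in Python as follows. This is only an illustration: the function names and the sample data are hypothetical and do not come from the lecture.

```python
import math


def variance(values, population=False):
    """Variance computed by following the five steps one at a time."""
    n = len(values)
    mean = sum(values) / n                        # step 1: the mean
    deviations = [x - mean for x in values]       # step 2: deviations
    squared = [d * d for d in deviations]         # step 3: squared deviations
    total = sum(squared)                          # step 4: add them up
    divisor = n if population else n - 1          # step 5: N or (n-1)
    return total / divisor


def standard_deviation(values, population=False):
    """Square root of the variance."""
    return math.sqrt(variance(values, population))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
print(variance(data))                             # sample variance, divides by n-1
print(variance(data, population=True))            # population variance, divides by N
print(standard_deviation(data, population=True))  # population standard deviation
```

For this made-up data the mean is 5 and the squared deviations add up to 32, so the population variance is 32/8 and the sample variance is 32/7.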

• If the data is approximately symmetric and bell-shaped then
  ◦ About 68% of the observations are within one standard deviation of the mean
  ◦ About 95% of the observations are within two standard deviations of the mean
  ◦ About 99.7% of the observations are within three standard deviations of the mean
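
As a rough check of where these percentages come from, the sketch below assumes the data follows an exact normal distribution (real bell-shaped data will only be close to this) and uses the standard library's `statistics.NormalDist`. The helper name `share_within` is made up for the example.

```python
from statistics import NormalDist

# Assumption: an exactly normal distribution stands in for "bell-shaped" data.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1), "%")
```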

• The p-th percentile (X_p) is a number such that p% of the observations take values below it, and (100-p)% take values above it
  ◦ 50th percentile = median
  ◦ 25th percentile = lower quartile
  ◦ 75th percentile = upper quartile
• The index of X_p
  ◦ (n+1)p/100

• 25th percentile
  ◦ lower quartile
  ◦ Q1
  ◦ (approximately) median of the observations below the median
• 75th percentile
  ◦ upper quartile
  ◦ Q3
  ◦ (approximately) median of the observations above the median

• Find the 25th percentile of this data set
  ◦ {3, 7, 12, 13, 15, 19, 24}

• Interpolation: use when the index is not a whole number
• Start with the closest index lower than the number found, then go the distance of the decimal towards the next value
• If the index is found to be 5.4, go to the 5th value, then add 0.4 of the difference between the 5th value and the 6th value
  ◦ In essence we are going to the 5.4th value

• Find the 40th percentile of the same data set
  ◦ {3, 7, 12, 13, 15, 19, 24}
• Must use interpolation
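
A minimal sketch of the index rule (n+1)p/100 with interpolation, applied to the data set from the two percentile exercises. The function name is hypothetical, and clamping an index that falls outside 1..n to the nearest end of the data is an assumption of this sketch, not something the slides specify.

```python
def percentile(sorted_values, p):
    """p-th percentile using the index (n+1)p/100 and linear interpolation."""
    n = len(sorted_values)
    index = (n + 1) * p / 100        # 1-based position in the sorted data
    lower = int(index)               # closest whole index at or below the position
    fraction = index - lower         # decimal part of the index
    if lower < 1:                    # assumption: clamp below the first value
        return sorted_values[0]
    if lower >= n:                   # assumption: clamp above the last value
        return sorted_values[-1]
    low_value = sorted_values[lower - 1]
    high_value = sorted_values[lower]
    # Go the decimal part of the way from the lower value to the next one.
    return low_value + fraction * (high_value - low_value)


data = [3, 7, 12, 13, 15, 19, 24]
print(percentile(data, 25))  # index is 2, a whole number, so no interpolation
print(percentile(data, 40))  # index is 3.2, so interpolate between 3rd and 4th values
```

With n = 7, the 25th percentile has index 8(25)/100 = 2 and is the 2nd value. The 40th percentile has index 3.2, so it lies 0.2 of the way from the 3rd value (12) to the 4th value (13).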

• Five Number Summary
  ◦ Minimum
  ◦ Lower Quartile
  ◦ Median
  ◦ Upper Quartile
  ◦ Maximum
• Example
  ◦ minimum = 4
  ◦ Q1 = 256
  ◦ median = 530
  ◦ Q3 = 1105
  ◦ maximum = 320,000
• What does this suggest about the shape of the distribution?

• The Interquartile Range (IQR) is the difference between the upper and lower quartile
  ◦ IQR = Q3 - Q1
  ◦ IQR = Range of values that contains the middle 50% of the data
  ◦ IQR increases as variability increases
• Murder Rate Data
  ◦ Q1 = 3.9
  ◦ Q3 = 10.3
  ◦ IQR =
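
The IQR arithmetic for the murder rate quartiles can be checked with a short sketch. The function name is hypothetical, and the result is rounded only to hide floating-point noise.

```python
def interquartile_range(q1, q3):
    """IQR = Q3 - Q1, the width of the middle 50% of the data."""
    return q3 - q1


# Quartiles of the murder rate data quoted on the slide.
iqr = interquartile_range(q1=3.9, q3=10.3)
print(round(iqr, 1))
```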

• A box plot displays the five number summary (and more) graphically
• Consists of a box that contains the central 50% of the distribution (from lower quartile to upper quartile),
• A line within the box that marks the median,
• And whiskers that extend to the maximum and minimum values
• This is assuming there are no outliers in the data set

• An observation is an outlier if it falls
  ◦ more than 1.5 IQR above the upper quartile or
  ◦ more than 1.5 IQR below the lower quartile
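
The 1.5 IQR rule can be sketched as below, using the quartiles from the earlier five number summary example (Q1 = 256, Q3 = 1105). The function names, and the word "fences" for the two cut-off values, are assumptions of this sketch and do not appear in the slides.

```python
def outlier_fences(q1, q3, k=1.5):
    """Cut-offs k*IQR below the lower quartile and above the upper quartile."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1, q3):
    """True if x lies more than 1.5 IQR beyond either quartile."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


q1, q3 = 256, 1105
print(outlier_fences(q1, q3))        # lower and upper cut-offs
print(is_outlier(320_000, q1, q3))   # the maximum from the example
print(is_outlier(4, q1, q3))         # the minimum from the example
```

Here the IQR is 849, so the cut-offs are 256 - 1273.5 and 1105 + 1273.5. The maximum of 320,000 lies far above the upper cut-off, while the minimum of 4 does not fall below the lower one.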

• Whiskers only extend to the most extreme observations within 1.5 IQR beyond the quartiles
• If an observation is an outlier, it is marked by an x, +, or some other identifier

• Values
  ◦ Min = 148
  ◦ Q1 = 158
  ◦ Median = Q2 = 162
  ◦ Q3 = 182
  ◦ Max = 204
• Create a box plot

• On right-skewed distributions, minimum, Q1, and median will be “bunched up”, while Q3 and the maximum will be farther away.
• For left-skewed distributions, the “mirror” is true: the maximum, Q3, and the median will be relatively close compared to the corresponding distances to Q1 and the minimum.
• Symmetric distributions?
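
One rough way to turn this "bunched up" description into a check is to compare distances on each side of the median. The rule below is a simplified heuristic assumed for illustration, not a formal test from the lecture, and the function name is made up.

```python
def shape_hint(minimum, q1, median, q3, maximum):
    """Guess the skew direction from a five number summary (heuristic only)."""
    lower_box = median - q1          # lower half of the box
    upper_box = q3 - median          # upper half of the box
    lower_tail = median - minimum    # median down to the minimum
    upper_tail = maximum - median    # median up to the maximum
    if upper_box > lower_box and upper_tail > lower_tail:
        return "right-skewed"
    if upper_box < lower_box and upper_tail < lower_tail:
        return "left-skewed"
    return "roughly symmetric or unclear"


# Five number summaries quoted earlier in the lecture.
print(shape_hint(148, 158, 162, 182, 204))
print(shape_hint(4, 256, 530, 1105, 320_000))
```

Both summaries have their minimum, Q1, and median closer together than the median, Q3, and maximum, which is the right-skewed pattern described above.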

• Mode: value that occurs most frequently
  ◦ Does not need to be near the center of the distribution
• Not really a measure of central tendency
  ◦ Can be used for all types of data (nominal, ordinal, interval)
• Special Cases
  ◦ Data Set {2, 2, 4, 5, 5, 6, 10, 11}
    Mode =
  ◦ Data Set {2, 6, 7, 10, 13}
    Mode =
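
The two special cases can be explored with a small sketch. It assumes one common convention: when every value occurs exactly once, the data set is treated as having no mode. The function name is hypothetical.

```python
from collections import Counter


def modes(values):
    """All most-frequent values; empty list if every value occurs only once."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []  # assumption: no value repeats, so report no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 2, 4, 5, 5, 6, 10, 11]))  # two values tie for most frequent
print(modes([2, 6, 7, 10, 13]))           # no value repeats
print(modes(["red", "blue", "red"]))      # also works for nominal data
```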

• Mean
  ◦ Interval data with an approximately symmetric distribution
• Median
  ◦ Interval or ordinal data
• Mode
  ◦ All types of data

• Mean is sensitive to outliers
  ◦ Median and mode are not
  ◦ Why?
• In general, the median is more appropriate for skewed data than the mean
  ◦ Why?
• In some situations, the median may be too insensitive to changes in the data
• The mode may not be unique
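
A small sketch of the first point, reusing the percentile data set and a hypothetical misrecorded value (24 typed as 240, an assumption made up for this illustration).

```python
from statistics import mean, median

data = [3, 7, 12, 13, 15, 19, 24]
with_outlier = [3, 7, 12, 13, 15, 19, 240]  # hypothetical: 24 misrecorded as 240

# The mean moves a long way; the median does not move at all.
print(mean(data), median(data))
print(mean(with_outlier), median(with_outlier))
```

Only the largest value changed, so the middle observation is still 13, while the sum (and therefore the mean) changes with every unit added to the outlier.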

• “How often do you read the newspaper?”

Response                 Frequency
every day                969
a few times a week       452
once a week              261
less than once a week    196
Never                    76
TOTAL                    1954

• Identify the mode
• Identify the median response
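
For ordinal data given as a frequency table, the mode and median can be found from the counts alone. The sketch below uses the frequencies from the table; the function names are hypothetical, and the categories are assumed to be listed in their natural order.

```python
# Frequencies from the newspaper table, in ordinal order.
responses = [
    ("every day", 969),
    ("a few times a week", 452),
    ("once a week", 261),
    ("less than once a week", 196),
    ("Never", 76),
]


def modal_response(table):
    """Category with the highest frequency."""
    return max(table, key=lambda row: row[1])[0]


def category_at(table, position):
    """Category containing the observation at a 1-based ordered position."""
    running_total = 0
    for label, frequency in table:
        running_total += frequency
        if position <= running_total:
            return label
    raise ValueError("position is larger than the number of observations")


def median_response(table):
    """Category (or two categories) holding the middle observation(s)."""
    total = sum(frequency for _, frequency in table)
    low = category_at(table, (total + 1) // 2)
    high = category_at(table, (total + 2) // 2)
    return [low] if low == high else [low, high]


print(modal_response(responses))
print(median_response(responses))
```

With 1954 responses, the middle observations are the 977th and 978th. The first category only covers positions 1 to 969, so both fall in the second category.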

• Statistics that describe variability
  ◦ Range
  ◦ Variance
  ◦ Standard Deviation
  ◦ Interquartile Range
• Two distributions may have the same mean and/or median but different variability
  ◦ Mean and median only describe a typical value, but not the spread of the data
• All of these can be computed for the sample or population

• Range: difference between the largest and smallest observation
  ◦ Very much affected by outliers
• A misrecorded observation may lead to an outlier, and affect the range
• The range does not always reveal different variation about the mean
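
The last point can be illustrated with two hypothetical samples (the numbers are made up for this sketch) that share the same range and mean but differ in how tightly the values cluster around the mean.

```python
from statistics import mean, stdev


def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


sample_a = [10, 48, 50, 52, 90]  # hypothetical: most values close to the mean
sample_b = [10, 12, 50, 88, 90]  # hypothetical: most values far from the mean

for sample in (sample_a, sample_b):
    print(value_range(sample), mean(sample), round(stdev(sample), 2))
```

Both samples have the same range and the same mean, yet the sample standard deviation of the second is clearly larger, which the range alone cannot show.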

• Sample 1
  ◦ Smallest Observation: 112
  ◦ Largest Observation: 797
  ◦ Range =
• Sample 2
  ◦ Smallest Observation:
  ◦ Largest Observation:
  ◦ Range =