Welcome to Week 03 Tues MAT135 Statistics



Review

Types of Statistics Descriptive statistics – describe our sample – we’ll use this to make inferences about the population Inferential statistics – make inferences about the population with a level of probability attached

Descriptive Statistics graphs, n, max, min, each observation, frequencies

Descriptive Statistics And… averages!

Measures of Central Tendency In statistics class, averages are called “measures of central tendency” (where the data “tend to center”)

Measures of Central Tendency Arithmetic mean (sample $\bar{x}$, population μ): $\bar{x}$ is your best estimate of the population mean μ

Measures of Central Tendency The arithmetic mean is the balance point for a data set

Measures of Central Tendency The median (50th percentile) is also a measure of central tendency (aka: average). Example: for the data 3 4 5 8 9, the median is 5

Measures of Central Tendency The mode (the most commonly-occurring observation) is another!
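The three averages reviewed above can be checked with a minimal Python sketch that uses the standard-library `statistics` module. The first list is the data from the median slide. The second list is made up for illustration (it is not from the slides), because a mode only makes sense when some value repeats.

```python
import statistics

# Data shown on the median slide.
slide_data = [3, 4, 5, 8, 9]

mean_value = statistics.mean(slide_data)      # arithmetic mean (balance point)
median_value = statistics.median(slide_data)  # middle value (50th percentile)

# Hypothetical data with one repeated value, so that a mode exists.
repeat_data = [3, 4, 4, 8, 9]
mode_value = statistics.mode(repeat_data)     # most commonly occurring value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Note that the mean (5.8) and the median (5) of the same data need not agree. They are different ways of locating the center.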

Statistics vs Parameters
Statistic   Parameter
n           N
$\bar{x}$   μ

Questions?

Descriptive Statistics Averages tell where the data tends to pile up

Descriptive Statistics Another good way to describe data is how spread out it is

Descriptive Statistics Suppose you are using the mean “5” to describe each of the observations in your sample

VARIABILITY IN-CLASS PROBLEM 5 For which sample would “5” be closer to the actual data values?

VARIABILITY IN-CLASS PROBLEM 5 In other words, for which of the two sets of data would the mean be a better descriptor?

VARIABILITY IN-CLASS PROBLEM 6 For which of the two sets of data would the mean be a better descriptor?

Variability Numbers that describe the spread of our data values are called “Measures of Variability”

Variability The variability tells how close to the “average” the sample data tend to be

Variability Just like measures of central tendency, there are several measures of variability

Variability Range = R = max – min

Variability Variance (symbolized $s^2$): $s^2 = \frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$

Variability An observation “x” minus the mean $\bar{x}$ is called a “deviation”. The variance is sort of an average (arithmetic mean) of the squared deviations

Variability In algebra, the absolute values of “deviations” are a measure of distance

Variability We square them because it gets rid of the “+” “-” problem and has mathematical advantages over taking absolute values

Variability Sums of squared deviations are used in the formula for a circle: $r^2 = (x-h)^2 + (y-k)^2$ where r is the radius of the circle and (h,k) is its center

Variability OK… so if it’s sort of an arithmetic mean, how come it is divided by “n-1” and not “n”?

Variability Every time we estimate something in the population using our sample we have used up a bit of the “luck” that we had in getting a (hopefully) representative sample

Variability To make up for that, we give a little edge to the opposing side of the story

Variability Since a small variability means our sample arithmetic mean is a better estimate of the population mean than a large variability is, we bump up our estimate of variability a tad to make up for it

Variability Dividing by “n” would give us a smaller variance than dividing by “n-1”, so we divide by “n-1”

Variability Why not “n-2”? Because we have used only one estimate to calculate the variance: $\bar{x}$

Variability So, the variance is sort of an average (arithmetic mean) of the squared deviations, bumped up a tad to make up for using an estimate ($\bar{x}$) of the population mean (μ)
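One way to see what the “bump” accomplishes is to try it on a population small enough to list every possible sample. The sketch below is only an illustration under stated assumptions: the population 1, 2, 3 is invented (it is not from the slides), and samples of size 2 are drawn with replacement. It averages the two candidate variance formulas over all nine possible samples and compares each average with the true population variance.

```python
from fractions import Fraction
from itertools import product

# Hypothetical tiny population, chosen so every sample can be listed.
population = [1, 2, 3]

def mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))

def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

# True population variance: divide by N.
pop_var = sum_sq_dev(population) / len(population)

# Every possible sample of size 2, drawn with replacement (9 samples).
samples = list(product(population, repeat=2))

# Average of each estimator over all possible samples.
avg_div_n = mean([sum_sq_dev(s) / len(s) for s in samples])
avg_div_n_minus_1 = mean([sum_sq_dev(s) / (len(s) - 1) for s in samples])

print("population variance:     ", pop_var)
print("average when dividing n: ", avg_div_n)
print("average when dividing n-1:", avg_div_n_minus_1)
```

In this small case, dividing by “n” comes out too small on average, while dividing by “n-1” matches the population variance on average.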

Variability Trust me…

Variability Standard deviation (symbolized “s” or “std”): $s = \sqrt{\text{variance}}$

Variability The standard deviation is the square root of a sort-of-average of squared deviations. We’ve used something similar in algebra class for distance calculations: $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
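The link to the distance formula can be made concrete. The sketch below is one possible reading of the analogy, not something stated on the slides: it treats the whole data set as one point, treats “every value equals the mean” as a second point, measures the straight-line distance between them, and then scales by $\sqrt{n - 1}$. The data values 1 1 2 2 3 3 are borrowed from the in-class problems.

```python
import math
import statistics

data = [1, 1, 2, 2, 3, 3]
n = len(data)
x_bar = sum(data) / n

# The "center" point: every coordinate is the mean.
center = [x_bar] * n

# Straight-line (Euclidean) distance between the data and the center,
# computed the same way as d = sqrt((x1-x2)^2 + (y1-y2)^2), but in n dimensions.
distance = math.sqrt(sum((x - c) ** 2 for x, c in zip(data, center)))

# Scaling by sqrt(n - 1) turns that distance into the sample standard deviation.
s_from_distance = distance / math.sqrt(n - 1)

print("distance from center:", distance)
print("s from distance:     ", s_from_distance)
print("statistics.stdev:    ", statistics.stdev(data))
```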

Variability The range and standard deviation are in the same units as the original data (a good thing) The variance is in squared units (which can be confusing…)


Variability Naturally, the measure of variability used most often is the hard-to-calculate one… … the standard deviation

Variability Statisticians like it because it is roughly a typical distance of the data from the center (the arithmetic mean)

Variability Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$

Questions?


VARIABILITY IN-CLASS PROBLEM 7 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. What is the range?

VARIABILITY IN-CLASS PROBLEM 7 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. Range = 3 (max) – 1 (min) = 2

VARIABILITY IN-CLASS PROBLEMS Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. What is the variance?

VARIABILITY IN-CLASS PROBLEM 8 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. First find $\bar{x}$!

VARIABILITY IN-CLASS PROBLEM 8 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. $\bar{x} = \frac{3+3+2+2+1+1}{6} = \frac{12}{6} = 2$

VARIABILITY IN-CLASS PROBLEM 9 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. Now calculate the deviations!

VARIABILITY IN-CLASS PROBLEM 9 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. Dev: 1-2=-1, 1-2=-1, 2-2=0, 2-2=0, 3-2=1, 3-2=1

Variability What do you get if you add up all of the deviations? Data: 1 1 2 2 3 3 Dev: 1-2=-1 1-2=-1 2-2=0 2-2=0 3-2=1 3-2=1

Variability Zero! That’s true for the deviations from the mean in EVERY data set! That’s why they are squared in the sum of squares!
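A short derivation shows why the deviations always cancel. The subscript notation $x_i$ for the individual observations is an assumption introduced here for the derivation; the slides write “obs” instead. Because $\bar{x}$ is defined as the sum of the observations divided by $n$, the sum of the observations equals $n\bar{x}$:

```latex
\[
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
\]
```

The negative deviations exactly offset the positive ones, so the plain sum carries no information about spread. Squaring makes every term non-negative, so nothing cancels.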

VARIABILITY IN-CLASS PROBLEM 10 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. Squared dev: $(-1)^2=1$, $(-1)^2=1$, $0^2=0$, $0^2=0$, $1^2=1$, $1^2=1$

VARIABILITY IN-CLASS PROBLEM 11 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. $\sum (\text{obs} - \bar{x})^2 = 1+1+0+0+1+1 = 4$

VARIABILITY IN-CLASS PROBLEM 12a,b Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. Variance: 4/(6-1) = 4/5 = 0.8

YAY!

VARIABILITY IN-CLASS PROBLEM 13 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. What is s?

VARIABILITY IN-CLASS PROBLEM 13 Range = max – min; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1}$; $s = \sqrt{\text{variance}}$. Data: 1 1 2 2 3 3. $s = \sqrt{0.8} \approx 0.89$

VARIABILITY IN-CLASS PROBLEMS So, for Data: 1 1 2 2 3 3: Range = max – min = 2; Variance = $\frac{\sum (\text{obs} - \bar{x})^2}{n - 1} = 0.8$; $s = \sqrt{\text{variance}} \approx 0.89$
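The hand calculation in Problems 7 through 13 can be retraced step by step. The following is a minimal Python sketch that follows the same order as the slides (range, mean, deviations, squared deviations, variance, standard deviation). The variable names are chosen here for readability and are not from the slides.

```python
import math

data = [1, 1, 2, 2, 3, 3]
n = len(data)

# Problem 7: range = max - min
data_range = max(data) - min(data)

# Problem 8: the sample mean
x_bar = sum(data) / n

# Problem 9: deviations (observation minus mean)
deviations = [obs - x_bar for obs in data]

# Problems 10 and 11: square the deviations and add them up
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

# Problem 12: divide by n - 1 to get the sample variance
variance = sum_of_squares / (n - 1)

# Problem 13: the standard deviation is the square root of the variance
s = math.sqrt(variance)

print("range:", data_range)
print("mean:", x_bar)
print("sum of deviations:", sum(deviations))
print("sum of squared deviations:", sum_of_squares)
print("variance:", variance)
print("s:", round(s, 2))
```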

Variability Aren’t you glad Excel does all this for you???

Questions?

Variability Just like for n and N and $\bar{x}$ and μ, there are population variability symbols, too!

Variability Naturally, these are going to have funny Greek-y symbols just like the averages …

Variability The population variance is “$\sigma^2$”, called “sigma-squared”. The population standard deviation is “σ”, called “sigma”

Variability Again, the sample statistics $s^2$ and s estimate the population parameters $\sigma^2$ and σ (which are unknown)

Variability Some calculators can find $\bar{x}$, s, and σ for you (Not recommended for large data sets – use EXCEL)

Variability s-squared “$s^2$” vs sigma-squared “$\sigma^2$”

Variability For $s^2$ the sum of squared deviations is divided by “n-1”; for $\sigma^2$ it is divided by “N”
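The two divisors correspond to two different functions in most software. As a hedged illustration, Python's standard-library `statistics` module provides `variance` and `stdev` (sample versions, dividing by n-1) and `pvariance` and `pstdev` (population versions, dividing by N). The sketch below feeds the same six values from the in-class problems to both, treating them first as a sample and then, purely for comparison, as if they were an entire population.

```python
import statistics

data = [1, 1, 2, 2, 3, 3]

# Treating the values as a sample: divide by n - 1.
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

# Treating the same values as a whole population: divide by N.
population_variance = statistics.pvariance(data)
population_sd = statistics.pstdev(data)

print("s^2     =", round(sample_variance, 4))
print("s       =", round(sample_sd, 4))
print("sigma^2 =", round(population_variance, 4))
print("sigma   =", round(population_sd, 4))
```

In Excel, the corresponding functions are VAR.S and STDEV.S for a sample, and VAR.P and STDEV.P for a population.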

Questions?

Standard Deviation What does standard deviation mean?

STANDARD DEVIATION IN-CLASS PROBLEM 14 Suppose we have two pizza delivery drivers We’re going to give one a raise But who?

STANDARD DEVIATION IN-CLASS PROBLEM 14 Both have the same mean delivery time of 15 minutes but Amanda’s standard deviation of delivery times = 2.6 minutes while Bethany’s standard deviation of delivery times = 8.4 minutes.

STANDARD DEVIATION IN-CLASS PROBLEM 14 Who should get the raise?

STANDARD DEVIATION IN-CLASS PROBLEM 15 What are the advantages of having a data set that has a small standard deviation?

Questions?

Variability Outliers! They can really affect your statistics!

OUTLIERS IN-CLASS PROBLEM 16 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Is the mode affected?

OUTLIERS IN-CLASS PROBLEM 16 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Original mode: 1 New mode: 1

OUTLIERS IN-CLASS PROBLEM 17 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Is the median affected?

OUTLIERS IN-CLASS PROBLEM 17 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Original median: 2 New median: 2

OUTLIERS IN-CLASS PROBLEM 18 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Is the mean affected?

OUTLIERS IN-CLASS PROBLEM 18 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Original mean: 2.4 New mean: 149.6

Outliers! How about measures of variability?

OUTLIERS IN-CLASS PROBLEM 19 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Is the range affected?

OUTLIERS IN-CLASS PROBLEM 19 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Original range: 4 New range: 740

OUTLIERS IN-CLASS PROBLEM 20 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Is the standard deviation affected?

OUTLIERS IN-CLASS PROBLEM 20 Suppose we originally had data: 1 1 2 3 5 Suppose we now have data: 1 1 2 3 741 Original s: ≈1.7 New s: ≈330.6
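The answers to Problems 16 through 20 can be reproduced side by side. This is a minimal sketch using Python's standard-library `statistics` module. The helper name `summarize` is made up for this example, and the two data sets are the ones given in the problems.

```python
import statistics

original = [1, 1, 2, 3, 5]
with_outlier = [1, 1, 2, 3, 741]

def summarize(values):
    """Return the five descriptive statistics discussed in the problems."""
    return {
        "mode": statistics.mode(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "s": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

before = summarize(original)
after = summarize(with_outlier)

# Print each statistic before and after the outlier replaces the value 5.
for name in before:
    print(f"{name:>6}: {before[name]:8.1f} -> {after[name]:8.1f}")
```

The mode and median stay put, while the mean, range, and standard deviation all jump because each of them uses the extreme value directly.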

IN-CLASS PROBLEM 21 What advantages does the standard deviation have over the range?

In-class Project Turn in your classwork! Don’t forget your homework due next class! See you Thursday!