Centrality revisited

We have already seen how to compute the median. If we use the median as an axe, we cut the data into two halves, each with an equal number of entries. (Careful: if n is odd, the two halves do not contain the median; if n is even, they may.) We compute the “median” of each half, and …

call the two resulting numbers (one to the left of the median, one to the right) the lower quartile and the upper quartile. We get this nice “breakdown” of the data: the lower quartile, the median and the upper quartile cut the data into four parts, each holding about a quarter of the entries. That justifies the name quartiles (duh!). We could go on cutting sets of data in half, but not now. Instead we look at
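The median-of-halves recipe for quartiles can be sketched in a few lines of Python. This is only an illustration: the function name `quartiles_by_halves` and the data are made up here, and other quartile conventions (for example, interpolation) give slightly different values.

```python
from statistics import median

def quartiles_by_halves(data):
    """Return (lower quartile, median, upper quartile) by cutting at the median.

    For odd n the middle entry is left out of both halves;
    for even n the sorted data split cleanly into two halves.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # entries to the left of the median
    upper = xs[half + (n % 2):]  # entries to the right of the median
    return median(lower), median(xs), median(upper)

# Hypothetical data set, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 10]
print(quartiles_by_halves(sample))  # (4, 5, 9)
```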

Measures of “Spread”

One measure of spread comes immediately to mind, the range (largest value minus smallest value), but a quick look at some examples shows right away that it isn’t precise enough: wildly different sets of data can have the same range. Now what? Another way to look at spread is to ask how “spread out” the data are, that is, how far they wander away from the middle.

Unfortunately, we first have to decide: which middle? Say we have a finite set of data $x_1, x_2, x_3, \dots, x_n$. Intuitively we would like to take the median, but for computational ease we’ll choose the average ($\bar{x}$ for a sample, $\mu$ for a population). So … we write the distance between $\bar{x}$ and $x_i$ for each datum, add the distances and divide by n. We can write a longhand formula for this as follows:

$$\frac{|x_1 - \bar{x}| + |x_2 - \bar{x}| + \dots + |x_n - \bar{x}|}{n}$$

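Or an even prettier shorthand formula:

$$\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$$

(BUT forget about pretty long or pretty short, learn the method!) This is very nice, except that absolute values are awkward to compute with! Much nicer (computationally) is to square each distance instead:

$$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$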
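To see the average-distance idea at work, here is a small Python sketch. The function name and both data sets are hypothetical, picked so that the two sets share the same range (and the same mean) yet differ in how far the entries wander from the middle.

```python
def mean_absolute_deviation(data):
    """Average distance of the entries from their mean."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n

# Two made-up data sets with the same range (9 - 1 = 8) and the same mean (5).
clustered = [1, 5, 5, 5, 9]
scattered = [1, 1, 5, 9, 9]

print(max(clustered) - min(clustered), max(scattered) - min(scattered))  # 8 8
print(mean_absolute_deviation(clustered))  # 1.6
print(mean_absolute_deviation(scattered))  # 3.2
```

The range cannot tell the two sets apart, while the average distance from the mean can.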

The average of the squared distances is called the variance (denoted by Var). So we have the baptism (definition):

$$\mathrm{Var} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

(This formula may lead to fairly difficult computations; we’ll learn a short-cut soon.) If the data are from the entire population of interest, life is good. If, however, the data are from just a sample of the population, it turns out that

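the value we get from Var tends to underestimate the true value of Var (from the entire population; such is life!). We compensate for this slight underestimation by a slight increase in the value of Var: we just multiply by the fraction $\frac{n}{n-1}$ (why is this an increase?). In summary:

population Var $= \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2$

sample Var $= \frac{n}{n-1} \cdot$ (population Var, computed on the sample with $\bar{x}$ in place of $\mu$) $= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$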
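The two versions of Var can be compared directly in Python. The sketch below uses made-up data and hypothetical function names; as a cross-check it also calls `pvariance` and `variance` from Python's standard `statistics` module, which compute the population and sample variance respectively.

```python
from statistics import pvariance, variance

def population_var(data):
    """Average squared distance from the mean (divide by n)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def sample_var(data):
    """Population formula multiplied by the correction factor n / (n - 1)."""
    n = len(data)
    return population_var(data) * n / (n - 1)

# Hypothetical data, chosen so that the mean is a whole number (5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_var(data))  # 4.0
print(sample_var(data))      # slightly larger, since n / (n - 1) > 1

# Cross-check against the standard library (allowing for rounding).
assert abs(population_var(data) - pvariance(data)) < 1e-12
assert abs(sample_var(data) - variance(data)) < 1e-12
```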

We have stated before that the formula can lead to some seriously difficult computations. Try applying it to the set of numbers … There is, however, a short-cut. As a formula it looks worse, but in words (and in use) it is much easier:

$$\mathrm{Var} = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)^2$$

In words it says:
1. First compute the mean.
2. Then compute the mean of the squares.
3. Then subtract the square of the result of step 1 from the result of step 2.

Let's try the short-cut on the set of numbers …

Step 1, squared, gives … Step 2 gives … We get Var = …
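The three steps of the short-cut translate almost word for word into Python. The seven numbers below are hypothetical stand-ins, not the data set used on the slide, and the function name is made up for this sketch.

```python
def var_shortcut(data):
    """Population variance as: mean of the squares minus square of the mean."""
    n = len(data)
    mean = sum(data) / n                            # step 1: the mean
    mean_of_squares = sum(x * x for x in data) / n  # step 2: mean of the squares
    return mean_of_squares - mean ** 2              # step 3: subtract (step 1)^2

# Hypothetical set of seven numbers.
data = [1, 2, 3, 4, 5, 6, 7]

# Step 1 gives 4, so its square is 16; step 2 gives 140 / 7 = 20.
print(var_shortcut(data))  # 4.0
```

The result agrees with the definition: the squared distances from the mean 4 are 9, 4, 1, 0, 1, 4, 9, which add up to 28, and 28 / 7 = 4.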

When the number of data is small, there is an even easier (visually) way to proceed. We apply it to the same set of seven data points:

Final Remarks

The variance we have computed is a population variance. If the data come from a sample, we must remember to correct our answer by multiplying by the correction factor $\frac{n}{n-1}$. Then we take a square root and obtain the corresponding standard deviation (population or sample).
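As a closing illustration, the whole chain (population variance, correction factor, square root) might look like this in Python. The data and the function name are hypothetical; `pstdev` and `stdev` from the standard `statistics` module serve only as a cross-check.

```python
import math
from statistics import pstdev, stdev

def standard_deviations(data):
    """Return (population SD, sample SD) for the given data."""
    n = len(data)
    mean = sum(data) / n
    pop_var = sum((x - mean) ** 2 for x in data) / n
    samp_var = pop_var * n / (n - 1)  # apply the correction factor n / (n - 1)
    return math.sqrt(pop_var), math.sqrt(samp_var)

# Hypothetical set of seven numbers (population variance 4).
data = [1, 2, 3, 4, 5, 6, 7]
pop_sd, samp_sd = standard_deviations(data)

print(pop_sd)   # 2.0
print(samp_sd)  # a bit larger than the population value

# Cross-check against the standard library (allowing for rounding).
assert abs(pop_sd - pstdev(data)) < 1e-12
assert abs(samp_sd - stdev(data)) < 1e-12
```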