Central Tendency and Variability Chapter 4. Variability In reality – all of statistics can be summed into one statement: – Variability matters. – (and.

Slides:



Advertisements
Similar presentations
Population vs. Sample Population: A large group of people to which we are interested in generalizing. parameter Sample: A smaller group drawn from a population.
Advertisements

Quantitative Methods in HPELS 440:210
Brought to you by Tutorial Support Services The Math Center.
Descriptive (Univariate) Statistics Percentages (frequencies) Ratios and Rates Measures of Central Tendency Measures of Variability Descriptive statistics.
Calculating & Reporting Healthcare Statistics
Descriptive Statistics Statistical Notation Measures of Central Tendency Measures of Variability Estimating Population Values.
Variability Measures of spread of scores range: highest - lowest standard deviation: average difference from mean variance: average squared difference.
Descriptive Statistics
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION.
Intro to Descriptive Statistics
Central Tendency and Variability Chapter 4. Central Tendency >Mean: arithmetic average Add up all scores, divide by number of scores >Median: middle score.
1 Measures of Central Tendency Greg C Elvers, Ph.D.
Measures of Central Tendency
Descriptive Statistics Healey Chapters 3 and 4 (1e) or Ch. 3 (2/3e)
AP Statistics Chapters 0 & 1 Review. Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of.
Summarizing Scores With Measures of Central Tendency
Statistics for the Behavioral Sciences Second Edition Chapter 4: Central Tendency and Variability iClicker Questions Copyright © 2012 by Worth Publishers.
Part II Sigma Freud & Descriptive Statistics
Chapters 1 & 2 Displaying Order; Central Tendency & Variability Thurs. Aug 21, 2014.
Chapter 3 Averages and Variations
© Copyright McGraw-Hill CHAPTER 3 Data Description.
Statistics Recording the results from our studies.
Variability The goal for variability is to obtain a measure of how spread out the scores are in a distribution. A measure of variability usually accompanies.
Tuesday August 27, 2013 Distributions: Measures of Central Tendency & Variability.
Central Tendency and Variability Chapter 4. Variability In reality – all of statistics can be summed into one statement: – Variability matters. – (and.
Page 1 Chapter 3 Variability. Page 2 Central tendency tells us about the similarity between scores Variability tells us about the differences between.
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 3 Descriptive Statistics: Numerical Methods.
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
1 Univariate Descriptive Statistics Heibatollah Baghi, and Mastee Badii George Mason University.
Statistics 11 The mean The arithmetic average: The “balance point” of the distribution: X=2 -3 X=6+1 X= An error or deviation is the distance from.
DATA ANALYSIS n Measures of Central Tendency F MEAN F MODE F MEDIAN.
INVESTIGATION 1.
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION h458 student
Chapter 3 For Explaining Psychological Statistics, 4th ed. by B. Cohen 1 Chapter 3: Measures of Central Tendency and Variability Imagine that a researcher.
Practice Page 65 –2.1 Positive Skew Note Slides online.
Measures of Dispersion. Introduction Measures of central tendency are incomplete and need to be paired with measures of dispersion Measures of dispersion.
1 Descriptive Statistics 2-1 Overview 2-2 Summarizing Data with Frequency Tables 2-3 Pictures of Data 2-4 Measures of Center 2-5 Measures of Variation.
© 2011 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license.
Statistics 1: Introduction to Probability and Statistics Section 3-2.
Chapter 5: Measures of Dispersion. Dispersion or variation in statistics is the degree to which the responses or values obtained from the respondents.
Descriptive Statistics. Outline of Today’s Discussion 1.Central Tendency 2.Dispersion 3.Graphs 4.Excel Practice: Computing the S.D. 5.SPSS: Existing Files.
LIS 570 Summarising and presenting data - Univariate analysis.
Aron, Aron, & Coups, Statistics for the Behavioral and Social Sciences: A Brief Course (3e), © 2005 Prentice Hall Chapter 2 The Mean, Variance, Standard.
Chapter 2 Describing and Presenting a Distribution of Scores.
Measures of Central Tendency (MCT) 1. Describe how MCT describe data 2. Explain mean, median & mode 3. Explain sample means 4. Explain “deviations around.
Descriptive Statistics(Summary and Variability measures)
Data Description Chapter 3. The Focus of Chapter 3  Chapter 2 showed you how to organize and present data.  Chapter 3 will show you how to summarize.
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION h458 student
©2013, The McGraw-Hill Companies, Inc. All Rights Reserved Chapter 2 Describing and Presenting a Distribution of Scores.
Chapter 2 The Mean, Variance, Standard Deviation, and Z Scores.
Lecture 8 Data Analysis: Univariate Analysis and Data Description Research Methods and Statistics 1.
An Introduction to Statistics
SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION
One-Variable Statistics
Chapter 6 ENGR 201: Statistics for Engineers
Central Tendency and Variability
Summarizing Scores With Measures of Central Tendency
Do-Now-Day 2 Section 2.2 Find the mean, median, mode, and IQR from the following set of data values: 60, 64, 69, 73, 76, 122 Mean- Median- Mode- InterQuartile.
IB Psychology Today’s Agenda: Turn in:
IB Psychology Today’s Agenda: Turn in:
Description of Data (Summary and Variability measures)
Summary descriptive statistics: means and standard deviations:
Chapter 2 The Mean, Variance, Standard Deviation, and Z Scores
Descriptive Statistics
Summary descriptive statistics: means and standard deviations:
Statistics 1: Introduction to Probability and Statistics
Summary (Week 1) Categorical vs. Quantitative Variables
Lecture 4 Psyc 300A.
The Mean Variance Standard Deviation and Z-Scores
Presentation transcript:

Central Tendency and Variability Chapter 4

Variability In reality – all of statistics can be summed into one statement: – Variability matters. – (and less is more!, depending). – (and error happens).

Central Tendency Definition: descriptive stat that best represents the center of a distribution of data. Mean: arithmetic average – “Typical score” – Often described as the “middle” of the scores, so don’t confuse this with medians.

Calculating the Mean Add up all scores Divide by number of scores Traditionally, we use M as the symbol for sample means.

Note on Symbols Usually Latin letters (normal alphabet) are used for samples – M, SD – Sample statistics Greek letters are used for populations – μ, σ – Population parameters. All statistical letters are italicized.

Get some data! Go to al-logos al-logos Take the quiz! Give your score!

How-To R Entering raw data. – You can enter the data from the board by creating your own individual columns of data. – mycolumn = c(#, #, #) – How to reference that column? You do NOT need the $ operator. Why not?

How-To R How to calculate the mean: – Two ways: – summary(column name) – mean(column name, na.rm = T)

Central Tendency Median: middle score when ordered from lowest to highest – No real symbol, but you can abbreviate mdn

Calculating the Median Line up the scores in ascending order Find the middle number – For an odd number of scores, just find the middle value. – For an even number of scores, divide number of scores by two. – Take the average of the scores around this position.

How-To R You will get the median with the summary() function. Or you can use: – median(column, na.rm = T)

Central Tendency Mode: most common score – It’s the value: With the largest frequency (or percent on a table). The highest bar on a histogram depending on binwidth. The highest point on a frequency polygon. Note…sometimes there are multiple modes.

Calculating the Mode Line up the scores in ascending order. Find the most frequent score. That’s the Mode! Aka, book notes can be silly sometimes.

Mode + Distributions We talked about this before but: – Unimodal = one hump distributions with one mode. – Bimodal = distributions with two modes. – Multimodal = distributions with three+ mode. Remember we talked about traditionally how if there are 10 5s and 10 6s (that is technically two modes) that people consider that unimodal because they are so close together.

How-To R Not as easy  temp <- table(as.vector(column)) names(temp)[temp == max(temp)]

Figure 4-4: Bipolar Disorder and the Modal Mood Why central tendency is not always the best answer:

Outliers and the Mean An early lesson in lying with statistics – Which central tendency is “best”: mean, median, or mode? – Depends!

Figure 4-6: The Mean without the Outlier

Let’s try it! Add an outlier to our data. outlier = c(mycolumn, #) Rerun the mean, median, mode. What happened?

Test with Outliers So what happens if we delete our outliers? Summary: – Mean is most affected by outliers (moved up or down, can be by a lot). Best for symmetric distributions. – Median may change slightly one number up or down. Best for skewed distributions or with outliers. – Generally the mode will not change. Uses: One particular score dominates a distribution. Distribution is bi or multi modal Data are nominal.

Measures of Variability Variability: a measure of how much spread there is in a distribution Range – From the lowest to the highest score

Calculating the Range Determine the highest score Determine the lowest score Subtract the lowest score from the highest score

How to Range Use the summary() function to get the min and max and subtract. Or do this: – max(column, na.rm = T) – min(column, na.rm = T)

Measures of Variability Variance – Average squared deviation from the mean – How much, on average, do people vary from the middle?

Calculating the Variance Subtract the mean from each score Square every deviation from the mean Sum the squared deviations Divide the sum of squares by N Super special notes right here about N versus N-1.

Measures of Variability Standard deviation – (square root of variance) – Typical amount that each score deviates from the mean. – Most commonly used statistic with the mean. – Why use this when variance says the same thing? Standardized – brings the numbers back to the original scaling (since they were squared before). Still biased by scale.

Calculating the Standard Deviation Typical amount the scores vary or deviate from the sample mean – This is the square root of variance

Quick Notes about Formulas Samples usually use N – 1 as the denominator – UNBIASED – var(column name, na.rm = T) – sd(column name, na.rm = T)

Quick Notes about Formulas Populations usually use N as the denominator – BIASED Run this code as is: – pop.var <- function(x) var(x) * (length(x)-1) / length(x) – pop.sd <- function(x) sqrt(pop.var(x))

Quick Notes about Formulas Populations usually use N as the denominator – BIASED Run this code as is: – pop.var(column name) – pop.sd(column name) We created our own functions! – So, you will have to run it every time you open R and want to use it. Maybe start a “I need this” file with libraries, themes, and special functions?

Interquartile Range Measure of the distance between the 1st and 3rd quartiles. 1st quartile: 25th percentile of a data set The median marks the 50th percentile of a data set. 3rd quartile: marks the 75th percentile of a data set

Calculating the Interquartile Range Subtract: 75th percentile – 25th percentile. – You can look at these numbers in the summary() function. IQR(column name, na.rm = T)