Important Properties of Distributions:

Presentation transcript:

Important Properties of Distributions:
The focus is on summarizing the distribution as a whole, rather than individual values.
The distribution of points represents a combination of:
  Common patterns or group conditions
  Unique features or individual conditions
How to summarize or describe these:
  Numerically (in statistical indexes)
  Visually (in images and pictures)
  Verbally (in words and phrases)

(Parenthetical Note)
Note: the focus here is on summarizing the distribution of one variable at a time. This is called univariate analysis or univariate statistics.
If we instead consider the combined (joint) distribution of multiple variables, to see how they are interrelated:
  Analyzing two variables jointly is called bivariate analysis
  Analyzing three or more variables jointly is called multivariate analysis

Important Properties of Distributions:
Central Tendency: What is the typical or average value of the distribution? Where is the middle of the data?
Variation: How widely are the data points spread out? (range) How concentrated are the data points within the distribution? (variance)
Size: How numerous are the data points in the distribution?
Symmetry (its absence is called skew): How lopsided is the distribution across its range?
Peakiness (also called modality): Are the data points spread smoothly over the values? Are there notable peaks or lumps in the distribution? How many peaks are there, and how sharp are they?

Central Tendency (3 common measures):
Mode: The most common, popular, or “typical” value. Applies to all levels of variables (nominal & up).
Median: The “midpoint” (50/50) of the ordered distribution. Divides the distribution into upper and lower halves. The variable must be at least ordinal level (ordered).
Mean: The “average” (“center of gravity”) of the values. Weighted by the size or value of the data points. The variable needs to be at least interval level.
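As a rough illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The score and color lists are made-up example data, not values from the slides.

```python
import statistics

# Hypothetical exam scores (illustrative data only).
scores = [70, 75, 75, 80, 85, 90, 95]

mode_value = statistics.mode(scores)      # most frequent value
median_value = statistics.median(scores)  # middle value of the ordered list
mean_value = statistics.mean(scores)      # sum of the values divided by their count

# The mode is the only one of the three that also applies to nominal data.
colors = ["red", "blue", "blue", "green"]
mode_color = statistics.mode(colors)

print(mode_value, median_value, mean_value, mode_color)
```

Taking a median or mean of the color labels would not be meaningful, which is the point of the level-of-measurement requirements listed above.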

Which one is the correct measure of Central Tendency?
It depends on the type of data:
  Nominal = mode
  Ordinal = median
  Interval/Ratio = mean (quasi-interval?)
It depends on the distribution of the variable:
  Highly skewed or weirdly distributed variables
  Unusual or extreme outliers (AKA the “Bill Gates effect” or the “New York City effect”)
  Variables with infinitely many “unique” values
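The outlier problem can be sketched with a small made-up income list: adding one extreme value moves the mean a long way while the median barely changes. The figures below are hypothetical and chosen only to make the effect visible.

```python
import statistics

# Hypothetical annual incomes for five people (illustrative data only).
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]

# The same group after one extremely wealthy person joins it.
with_outlier = incomes + [1_000_000_000]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("before:", mean_before, median_before)
print("after: ", mean_after, median_after)
```

In this sketch the median stays close to a typical income, which is why it is often preferred for highly skewed variables.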

How to compute measures of Central Tendency?
By hand (& calculator)?
  See the textbook and the handouts
  Notice the difference between the formulas for: (a) data list, (b) frequency table, (c) grouped distribution
By SPSS? Use one of 3 procedures:
  Frequencies command: computes more kinds of statistics and an accompanying chart; more detailed output
  Descriptives command: quickly computes the most common statistics, but no median and no charts
  Explore command: wider array of information
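The textbook formulas are not reproduced in this transcript, so the following is only a sketch of the usual relationship between cases (a) and (b): the mean of a data list and the mean computed from its frequency table should agree. The data values are invented for illustration, and the grouped-distribution case (c) is not covered here.

```python
from collections import Counter

# Hypothetical raw data list (illustrative values only).
data = [1, 2, 2, 3, 3, 3, 4]

# (a) Data list: add up every score and divide by the number of scores.
mean_from_list = sum(data) / len(data)

# (b) Frequency table: multiply each distinct value by its frequency,
# add the products, and divide by the total frequency.
freq_table = Counter(data)  # maps value -> frequency
n = sum(freq_table.values())
mean_from_table = sum(value * count for value, count in freq_table.items()) / n

print(mean_from_list, mean_from_table)
```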

Shape of Distribution: properties
Symmetry:
  “Lopsidedness”: unevenness around the center
  “Skew” = the technical name for asymmetry
  Skew = direction of the longer tail
  Left-Skew = negative; Right-Skew = positive
  Some statistics assume a symmetric distribution
  If symmetric, mean & median are the same
Peakiness:
  Multi-modality: number of peaks
  “Kurtosis”: sharpness of peaks
Truncation: Some values are excluded or “censored”

How to tell Shape of a Distribution?
Look at the frequency table (if # values = small)
Look at a frequency graph:
  Bar chart or line graph (if # values = small)
  Histogram (if # values = large)
Compare the values of the median and mean:
  A difference between Mean & Median indicates skew
  If Mean > Median: skewed to the right
  If Mean < Median: skewed to the left
Box Plots
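The mean-versus-median comparison can be sketched as a small helper. The function name `skew_direction` and the sample lists are hypothetical, and the comparison is a rule of thumb rather than a formal skewness statistic.

```python
import statistics

def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"  # longer tail toward high values (positive skew)
    if mean < median:
        return "left"   # longer tail toward low values (negative skew)
    return "none"       # mean equals median, consistent with symmetry

print(skew_direction([1, 2, 3, 4, 20]))     # one high value pulls the mean up
print(skew_direction([1, 17, 18, 19, 20]))  # one low value pulls the mean down
print(skew_direction([1, 2, 3, 4, 5]))      # symmetric around 3
```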

[Figures: Bar Chart, Histogram]

How to tell Shape of a Distribution? Box Plots:

Variation (the spread of the data):
Range: The difference between the highest and lowest values in the distribution.
Inter-Quartile Range: The difference (range) between the 25th & 75th percentiles of the distribution, which mark off the lowest & highest quarters. (span of the middle 50%)
Variance (& standard deviation): The average amount of squared deviation around the mean. Counts the amount but not the direction of deviation. Weights large deviations more heavily.
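A minimal sketch of the range and inter-quartile range, using made-up data and Python's `statistics.quantiles`. Textbooks and software packages use slightly different quartile conventions, so the exact quartile values may differ from a hand calculation.

```python
import statistics

# Hypothetical scores (illustrative data only).
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Range: highest value minus lowest value.
data_range = max(data) - min(data)

# quantiles(..., n=4) returns the three cut points that split the data
# into quarters: the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Inter-quartile range: the span of the middle 50% of the data.
iqr = q3 - q1

print("range:", data_range)
print("quartiles:", q1, q2, q3)
print("IQR:", iqr)
```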

How to compute the Variance:
  Compute the Mean of the distribution
  Compute the deviation of each score from the Mean
  Square the deviations from the Mean
  Add all the squared deviations together
  Divide by the total number of scores
To compute the Standard Deviation:
  Take the square root of the variance
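The steps above can be followed literally in a few lines of Python. The scores are made-up example data, and dividing by the total number of scores gives the population (descriptive) version of the variance.

```python
import math

# Hypothetical scores (illustrative data only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: compute the mean of the distribution.
mean = sum(scores) / len(scores)

# Step 2: compute the deviation of each score from the mean.
deviations = [x - mean for x in scores]

# Step 3: square the deviations.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: add all the squared deviations together.
sum_of_squares = sum(squared_deviations)

# Step 5: divide by the total number of scores.
variance = sum_of_squares / len(scores)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)  # 5.0 4.0 2.0
```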

2 Measures of Variance? Note 2 slightly different formulas: Population/Description formula: Sample/Estimation formula:
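The slide's own formulas are not captured in this transcript. As a hedged reference, the standard textbook forms are given below: the population (descriptive) variance divides by N, while the sample (estimation) variance divides by n - 1. The symbols are the conventional ones and are assumed here: mu and N for the population mean and size, x-bar and n for the sample mean and size.

```latex
% Population / description formula
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}

% Sample / estimation formula
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
```

In each case the corresponding standard deviation is the square root of the variance.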

How to compute the Variance:
Note two different computing strategies that yield the same answers:
Definitional Formula:
  Requires computing the mean first & then the deviations
  Uses deviation scores and decimal fractions
  Messier computations (with decimal fractions)
Computational Formula:
  Computations occur in the same step
  Does not compute deviations
  Simpler computations (decimals only at the end)
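The transcript does not show either formula, so this sketch assumes the common textbook versions: the definitional sum of squares is the sum of squared deviations from the mean, and the computational sum of squares is the sum of the squared scores minus the squared sum of the scores divided by N. The scores are made-up example data.

```python
# Hypothetical scores (illustrative data only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

# Definitional approach: compute the mean first, then the deviations.
mean = sum(scores) / n
ss_definitional = sum((x - mean) ** 2 for x in scores)

# Computational approach: no deviations are computed.
# SS = (sum of x squared) - (sum of x) squared / N  (assumed textbook form)
sum_x = sum(scores)
sum_x_squared = sum(x * x for x in scores)
ss_computational = sum_x_squared - sum_x ** 2 / n

# Both sums of squares lead to the same population variance.
variance_definitional = ss_definitional / n
variance_computational = ss_computational / n

print(variance_definitional, variance_computational)
```

With integer scores, the computational approach works with whole numbers until the final division, which matches the "decimals only at the end" point above.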