Numeric Summaries and Descriptive Statistics

populations vs. samples
– we want to describe both samples and populations
– describing a population from a sample is a matter of inference…

“outliers”
– minority cases, so different from the majority that they merit separate consideration
  – are they errors?
  – are they indicative of a different pattern?
– think about possible outliers with care, but beware of mechanical treatments…
– the significance of outliers depends on your research interests

summaries of distributions
– graphic vs. numeric
  – graphic summaries may be better for visualization
  – numeric summaries are better for statistical/inferential purposes
– resistance to outliers is usually an advantage in either case

general characteristics: kurtosis [“peakedness”]
– ‘leptokurtic’: more sharply peaked
– ‘platykurtic’: flatter

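general characteristics: skew (skewness)
– right (positive) skew: the longer tail extends toward high values
– left (negative) skew: the longer tail extends toward low values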
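Skew and kurtosis can also be expressed as numbers. The sketch below is one common moment-based version, not necessarily the one the slides had in mind. It is computed by hand because base R has no ready-made skewness function, and the data values are invented for illustration.

```r
# Hypothetical measurements (mm); the last value is an invented high outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)

m  <- mean(x)
m2 <- mean((x - m)^2)  # second central moment
m3 <- mean((x - m)^3)  # third central moment
m4 <- mean((x - m)^4)  # fourth central moment

# Moment-based skewness: positive for a long right tail, negative for a long left tail.
skewness <- m3 / m2^1.5

# Excess kurtosis: about 0 for a normal distribution, positive when the batch
# is more peaked / heavier-tailed, negative when it is flatter.
excess_kurtosis <- m4 / m2^2 - 3

print(skewness)
print(excess_kurtosis)
```

With a single high outlier the skewness comes out positive, which matches the "right (positive) skew" label.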

central tendency
– measures of central tendency provide a sense of the value expressed by multiple cases, over all…
  – mean
  – median
  – mode

mean
– center of gravity
– evenly partitions the sum of all measurements among all cases; the average of all measures

mean – pro and con
– crucial for inferential statistics
– the mean is not very resistant to outliers
– a “trimmed mean” may be better for descriptive purposes

mean R: mean(x)

trimmed mean R: mean(x, trim=.1)
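A small illustration of the two calls above, using invented values rather than data from the slides. `trim = 0.1` drops 10% of the cases from each tail before averaging.

```r
# Hypothetical measurements (mm); the last value is an invented high outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)

plain_mean   <- mean(x)              # every case contributes, including the outlier
trimmed_mean <- mean(x, trim = 0.1)  # lowest and highest case dropped first (n = 10)

print(plain_mean)    # 28.6
print(trimmed_mean)  # 23.5
```

The single outlier pulls the plain mean above nine of the ten cases, while the trimmed mean stays in the middle of the main cluster.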

median
– 50th percentile…
– less useful for inferential purposes
– more resistant to effects of outliers…

median
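To see the resistance of the median, the sketch below compares how far the mean and the median move when one outlier is removed. The values are invented for illustration.

```r
# Hypothetical measurements (mm); the last value is an invented high outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)
x_clean <- x[x < 80]  # the same batch without the outlier

median_shift <- median(x) - median(x_clean)  # 23.5 - 23
mean_shift   <- mean(x) - mean(x_clean)      # 28.6 - about 22.9

print(median_shift)
print(mean_shift)
```

The median moves by half a millimetre, while the mean moves by more than five.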

mode
– the most numerous category
– for ratio data, often implies that the data have been grouped in some way
– can be more or less created by the grouping procedure
– for theoretical distributions: simply the location of the peak on the frequency distribution

mode (example)
– categories: isolated scatters, hamlets, villages, regional centers
– modal class = ‘hamlets’
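Finding the modal class amounts to picking the category with the largest count. In the sketch below the category names come from the slide, but the counts are invented, since the original chart did not survive.

```r
# Site counts per category (hypothetical numbers, chosen so 'hamlets' is largest).
site_counts <- c("isolated scatters" = 12,
                 "hamlets"           = 31,
                 "villages"          = 9,
                 "regional centers"  = 2)

# The modal class is the name of the category with the highest count.
modal_class <- names(site_counts)[which.max(site_counts)]

print(modal_class)
```

For ungrouped observations, `table()` would produce the counts first. The modal class then depends on how the grouping was done, as the slide warns.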

dispersion
– measures of dispersion summarize the degree of clustering of cases, esp. with respect to central tendency…
  – range
  – variance
  – standard deviation

range
– would be better to use the midspread (interquartile range)…
– R: range(x)
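A brief comparison of the range and the midspread on invented values. `IQR()` is used here for the midspread with R's default quantile definition; other quartile conventions give slightly different numbers.

```r
# Hypothetical measurements (mm); the last value is an invented high outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)
x_clean <- x[x < 80]  # the same batch without the outlier

full_range <- diff(range(x))  # range() returns c(min, max); diff() gives the width
midspread  <- IQR(x)          # distance between the 25th and 75th percentiles

print(c(with_outlier = full_range, without = diff(range(x_clean))))  # 62 vs 9
print(c(with_outlier = midspread,  without = IQR(x_clean)))          # 4.5 vs 4
```

The range is set entirely by the two most extreme cases, so one outlier changes it drastically, whereas the midspread barely moves.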

variance
– analogous to the average deviation of cases from the mean
– in fact, based on the sum of squared deviations from the mean: the “sum-of-squares”
– R: var(x)

variance
– computational form: $s^2 = \dfrac{\sum x^2 - (\sum x)^2 / n}{n - 1}$
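The slide's formula image is not preserved, so the sketch below assumes the usual sample-variance convention (divisor n - 1, which is what `var()` uses). It checks that the definitional form (the sum of squared deviations from the mean) and the computational form (the sum of squares minus the squared sum over n) agree. The data are invented.

```r
# Hypothetical measurements (mm).
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)
n <- length(x)

# Definitional sum-of-squares: squared deviations from the mean.
ss_definitional <- sum((x - mean(x))^2)

# Computational sum-of-squares: no need to compute the mean first.
ss_computational <- sum(x^2) - sum(x)^2 / n

# Assumed divisor: n - 1 (sample variance), matching var().
var_definitional  <- ss_definitional  / (n - 1)
var_computational <- ss_computational / (n - 1)

print(c(var_definitional, var_computational, var(x)))
```

All three values agree up to floating-point rounding.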

variance
– note: units of variance are squared… this makes variance hard to interpret
– ex.: projectile point sample: mean = 22.6 mm, variance = 38 mm²
– what does this mean???

standard deviation
– square root of variance: $s = \sqrt{s^2}$

standard deviation
– units are the same as those of the base measurements
– ex.: projectile point sample: mean = 22.6 mm, standard deviation = 6.2 mm
– mean +/- sd (16.4–28.8 mm) should give at least some intuitive sense of where most of the cases lie, barring major effects of outliers
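The arithmetic of the projectile point example can be retraced from the two summary figures on the slides (mean = 22.6 mm, variance = 38 mm²). The variable names are only illustrative.

```r
mean_len <- 22.6  # mm, from the slide
var_len  <- 38    # mm^2, from the slide

sd_len <- sqrt(var_len)  # back in mm; about 6.16, reported as 6.2

# One standard deviation either side of the mean, using the rounded sd.
interval <- mean_len + c(-1, 1) * round(sd_len, 1)

print(round(sd_len, 1))  # 6.2
print(interval)          # 16.4 28.8
```

Taking the square root returns the measure to millimetres, which is why the interval can be read directly against the measurements.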

trimmed dispersion measures
– variance and sd are even more sensitive to extreme values (outliers) than the mean… why??
– you can calculate a trimmed version of the variance simply by eliminating cases from the tails and calculating the variance in the normal way…

trimmed standard deviation
– trimmed sd is calculated differently
– s_T = trimmed standard deviation
– n = number of cases in untrimmed batch
– s²_w = variance of trimmed (winsorized) batch
– n_T = number of cases in the trimmed batch
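The formula itself did not survive in the transcript. One form consistent with the symbols listed is s_T = sqrt((n - 1) * s²_w / (n_T - 1)), and the sketch below assumes that form. It also assumes the usual meaning of "winsorized": each trimmed case is replaced by the nearest remaining value rather than dropped. The function name `trimmed_sd` and the data are hypothetical.

```r
# Trimmed standard deviation, assuming s_T = sqrt((n - 1) * s2_w / (n_T - 1)).
trimmed_sd <- function(x, trim = 0.1) {
  x   <- sort(x)
  n   <- length(x)        # cases in the untrimmed batch
  k   <- floor(n * trim)  # cases trimmed from each tail
  n_T <- n - 2 * k        # cases in the trimmed batch
  stopifnot(n_T > 1)

  # Winsorize: replace trimmed cases with the nearest remaining value.
  w <- x
  if (k > 0) {
    w[seq_len(k)]      <- x[k + 1]
    w[(n - k + 1):n]   <- x[n - k]
  }

  s2_w <- var(w)  # variance of the winsorized batch
  sqrt((n - 1) * s2_w / (n_T - 1))
}

# Hypothetical measurements (mm); the last value is an invented high outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 80)

print(sd(x))          # ordinary sd, inflated by the outlier
print(trimmed_sd(x))  # trimmed sd under the assumed formula
```

With `trim = 0.1` and ten cases, one case is winsorized in each tail, and the trimmed sd comes out far smaller than the ordinary sd.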