Descriptive statistics I Distributions, summary statistics.



Frequency distributions
Frequency means the number of cases at a single value of a variable
A “distribution” depicts the frequency (number of cases) at every value of a variable
– Frequency distributions illustrate how values disperse
– For categorical variables use a BAR graph
– For continuous variables use a HISTOGRAM (also try AREA)
Open DEMO PLUS.SAV
– For categorical choose variable SEX (1=Male, 2=Female)
– For continuous choose variable AGE
Open Height weight gender age.sav (or .xls), choose a categorical and a continuous variable, display their distributions as above
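Outside SPSS, the same idea can be sketched in a few lines of Python. This is only an illustration: the `sex` and `age` values below are made up (they are not taken from DEMO PLUS.SAV), and the function names and the 10-year bin width are assumptions chosen for the example.

```python
from collections import Counter

def frequency_distribution(values):
    """Count the number of cases at each value, sorted by value."""
    return dict(sorted(Counter(values).items()))

def binned_distribution(values, width):
    """Group a continuous variable into equal-width bins (histogram counts)."""
    # Each value goes to the bin starting at the nearest lower multiple of `width`.
    return frequency_distribution((v // width) * width for v in values)

# Hypothetical data, for illustration only (1=Male, 2=Female).
sex = [1, 2, 2, 1, 2, 2, 1, 2]
age = [19, 22, 25, 31, 34, 38, 41, 47]

sex_freq = frequency_distribution(sex)        # categorical: one bar per category
age_freq = binned_distribution(age, width=10)  # continuous: one bar per bin

# Print each distribution as a simple text bar graph.
for label, freq in (("SEX", sex_freq), ("AGE (bin start)", age_freq)):
    print(label)
    for value, count in freq.items():
        print(f"  {value:>3}: {'#' * count} ({count})")
```

The categorical variable gets one count per category, while the continuous variable is first grouped into bins, which is what a histogram does.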

Summarizing distributions
Producing a single statistic that best depicts a distribution
For categorical variables, use the statistic “proportion”
– Proportions with a base of 100 are called “percentages” (per 100)
For continuous variables, use a measure of central tendency
– The statistic “mean” (arithmetic average)
– The statistic “median” (midpoint value – half of cases above, half below)
– The statistic “mode” (most frequent value – can be more than one)
Open DEMO PLUS.SAV
– For categorical choose variable SEX (1=Male, 2=Female)
  Analyze|Descriptive Statistics|Frequencies
  Ask for a Bar Chart
– For continuous choose variable AGE
  Analyze|Descriptive Statistics|Frequencies
  Ask for a Histogram
Open Height weight gender age.sav (or .xls), choose a categorical and a continuous variable, proceed as above

Categorical variables
“Percent” is a summary statistic – it summarizes a distribution
“Percent” – per cent – per hundred. 100 is always the denominator
Increases in percentage are computed off the base amount. Example: increase in a jail population of 100 prisoners
– 100 percent increase: 100 percent of 100 is 100; 100 plus 100 = 200
– 150 percent increase: 150 percent of 100 is 150; 150 plus 100 = 250
– 200 percent increase: 200 percent of 100 is 200; 200 plus 100 = 300 (3 times the base amount)
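The jail population arithmetic can be checked with a short Python sketch. The function name `apply_percent_increase` is an assumption made for this illustration.

```python
def apply_percent_increase(base, percent):
    """Return the new total after increasing `base` by `percent` percent."""
    increase = base * percent / 100  # the increase is computed off the base amount
    return base + increase

base = 100  # jail population of 100 prisoners
for percent in (100, 150, 200):
    new_total = apply_percent_increase(base, percent)
    print(f"{percent} percent increase: {base} -> {new_total:.0f} "
          f"({new_total / base:g} times the base amount)")
```

Note that a 200 percent increase triples the base; it does not double it.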

Percentages of less than 1 percent are described as a fraction
– Example: .20 percent is 2/10th of 1 percent
– Do not confuse decimals and percentages
  Decimal .20 = 20/100 = 20 percent
  Decimal .0020 = 20/10,000 = .20 percent
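A small Python sketch of the decimal-to-percent conversion, using exact fractions so that no rounding noise creeps in. The function name `decimal_to_percent` is an assumption for this illustration.

```python
from fractions import Fraction

def decimal_to_percent(decimal_text):
    """Convert a decimal written as text (e.g. "0.20") to a percentage."""
    return Fraction(decimal_text) * 100  # percent means "per hundred"

twenty_percent = decimal_to_percent("0.20")        # 20/100
fifth_of_a_percent = decimal_to_percent("0.0020")  # 20/10,000

print(f"Decimal .20   = {float(twenty_percent):g} percent")
print(f"Decimal .0020 = {float(fifth_of_a_percent):g} percent")
```

The two decimals differ by a factor of 100, which is exactly the mistake the slide warns about.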

Percentages (proportions) are usually the best way to summarize datasets using categorical variables
– 70 percent of students are employed
– 60 percent of parolees recidivate
Percentages can be used to summarize findings when large numbers are involved
– 50,000 persons were asked whether crime is a serious problem: 32,700 said “yes”
Compute…

Divide 32,700 by 50,000 and multiply by 100
(32,700 / 50,000) X 100 = 65.4%, or about 65%

Percentages can be used to compare datasets
– This year, 65% of 10,000 people polled said crime is a serious problem
– Last year, 12,000 people were polled and 9,000 said crime is a serious problem
Compute…

(9,000 / 12,000) X 100 = 75%
Because both samples were standardized (responses per 100 persons) they are directly comparable even though different numbers of persons were polled
– 65% v. 75%
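The two poll calculations can be reproduced with one small Python helper. The function name `percentage` is an assumption for this sketch; the counts are the ones given in the slides.

```python
def percentage(count, total):
    """Express `count` as a number per 100 of `total`."""
    return count / total * 100

# 32,700 of 50,000 persons said crime is a serious problem.
serious_problem = percentage(32_700, 50_000)
# Last year's poll: 9,000 of 12,000 persons.
last_year = percentage(9_000, 12_000)

print(f"32,700 of 50,000 = {serious_problem:.1f}% (about {round(serious_problem)}%)")
print(f" 9,000 of 12,000 = {last_year:.0f}%")
```

Because both results are expressed per 100 persons, they can be compared directly even though the polls differ in size.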

Percentages can magnify differences when raw numbers are small
Percentages can deflate differences when numbers are large
– Increase from 1 to 3 convictions is …
– Increase from 5,000 to 6,000 convictions is …
Compute both...

Increase from 1 to 3 convictions is 200 percent
– 3 - 1 = 2
– 2/1 (base) X 100 = 200%
Increase from 5,000 to 6,000 convictions is 20 percent
– 6,000 - 5,000 = 1,000
– 1,000/5,000 (base) X 100 = 20%
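The same two answers, as a Python sketch. The function name `percent_change` is an assumption for this illustration.

```python
def percent_change(old, new):
    """Percent increase (or decrease, if negative) from `old` to `new`."""
    difference = new - old
    return difference * 100 / old  # the old value is the base

small_numbers = percent_change(1, 3)          # raw increase of 2 convictions
large_numbers = percent_change(5_000, 6_000)  # raw increase of 1,000 convictions

print(f"1 -> 3 convictions:         {small_numbers:.0f}%")
print(f"5,000 -> 6,000 convictions: {large_numbers:.0f}%")
```

A raw increase of 2 shows up as 200 percent, while a raw increase of 1,000 shows up as only 20 percent, so it helps to report the raw numbers alongside the percentages.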

Summarizing a distribution for ordinal variables
Ordinal variables: categorical variables whose categories reflect an inherent rank or order
Can summarize the distribution of an ordinal variable two ways:
– As a categorical variable, using proportions / percentages
– As a continuous variable, treating categories as points on a scale
  Assign a numerical value to each category and calculate a mean
Open DEMO PLUS.SAV
– Variable “class” is ordinal
– Display and summarize the distribution both ways...
  As a categorical/ordinal variable
  As a continuous variable
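A Python sketch of both approaches. Everything about the data here is hypothetical: the category labels ("lower", "middle", "upper"), their numeric codes, and the eight responses are invented for illustration and are not the actual coding of “class” in DEMO PLUS.SAV.

```python
from collections import Counter
from statistics import mean

# Hypothetical ordinal coding: each category is a point on a 1-3 scale.
CLASS_CODES = {"lower": 1, "middle": 2, "upper": 3}

# Hypothetical responses, for illustration only.
responses = ["lower", "middle", "middle", "upper",
             "middle", "lower", "upper", "middle"]

# Way 1: treat as categorical and report the percentage in each category.
counts = Counter(responses)
percentages = {label: counts[label] * 100 / len(responses)
               for label in CLASS_CODES}

# Way 2: treat as continuous by assigning each category its code,
# then calculate a mean.
mean_class = mean(CLASS_CODES[label] for label in responses)

for label, pct in percentages.items():
    print(f"{label:>6}: {pct:.0f}%")
print(f"Mean class score: {mean_class}")
```

The mean is only as meaningful as the assumption that the categories are evenly spaced points on a scale.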

Continuous variables
If variables are continuous, can summarize a distribution with one or more measures of “central tendency”
– Mean, median, mode
Mean: arithmetic average of scores
– Pulled in the direction of extreme scores
– Experiment with Height weight gender age.sav
Median: Middle score – half higher, half lower
– If there is an even number of scores, average the two center scores
– If there is an odd number of scores, use the center score
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21

Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Answer: 8
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21
Answer: the two center scores are 8 and 12. 12 - 8 = 4; 4/2 = 2; 8 + 2 = 10 or 12 - 2 = 10
Median is a useful summary statistic when there are extreme scores
– Extreme scores make the mean a misleading summary measure of a distribution
Median can be used with continuous or ordinal variables
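The median rule and the effect of an extreme score can be checked with a short Python sketch. The function name `median_score` is an assumption, and the third list (Exercise 1 with the top score changed from 21 to 210) is a made-up example added here to show how an extreme score moves the mean but not the median.

```python
from statistics import mean

def median_score(scores):
    """Middle score of a list; average the two center scores if the count is even."""
    ordered = sorted(scores)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]                          # odd: use the center score
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: average the two

exercise_1 = [2, 3, 5, 5, 8, 12, 17, 19, 21]
exercise_2 = [2, 3, 5, 5, 8, 12, 17, 19, 21, 21]
# Hypothetical variation: same as Exercise 1, but with one extreme score.
with_extreme = [2, 3, 5, 5, 8, 12, 17, 19, 210]

for name, scores in (("Exercise 1", exercise_1),
                     ("Exercise 2", exercise_2),
                     ("With extreme score", with_extreme)):
    print(f"{name}: median = {median_score(scores):g}, mean = {mean(scores):.2f}")
```

Python's standard library also provides `statistics.median`, which follows the same rule.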

Mode: Score that occurs most often (with the greatest frequency)
– There can be more than one mode (bi-modal, tri-modal, etc.)
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21

Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Mode = 5 (uni-modal)
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21
Modes = 5, 21 (bi-modal)
Modes are a useful summary statistic for distributions where cases cluster at particular scores – an interesting condition that would be missed by the mean or median
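A Python sketch for finding every mode, using `statistics.multimode` from the standard library (available in Python 3.8 and later). The variable names are assumptions for this illustration.

```python
from statistics import multimode

exercise_1 = [2, 3, 5, 5, 8, 12, 17, 19, 21]
exercise_2 = [2, 3, 5, 5, 8, 12, 17, 19, 21, 21]

# multimode returns a list of all scores tied for the greatest frequency.
modes_1 = multimode(exercise_1)
modes_2 = multimode(exercise_2)

print(f"Exercise 1 modes: {modes_1} ({len(modes_1)} mode)")
print(f"Exercise 2 modes: {modes_2} ({len(modes_2)} modes)")
```

Returning a list rather than a single value keeps a bi-modal distribution from being reported as if it had only one peak.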

Range
Another way to describe a distribution of a continuous variable
– Not a measure of central tendency
Range depicts the lowest and highest scores in a distribution
2, 3, 5, 5, 8, 12, 17, 19, 21
Range is 2 to 21, or 19 (21 - 2)
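The range can be reported either way with a short Python sketch. The function name `score_range` is an assumption for this illustration.

```python
def score_range(scores):
    """Return (lowest score, highest score, highest minus lowest)."""
    lowest, highest = min(scores), max(scores)
    return lowest, highest, highest - lowest

scores = [2, 3, 5, 5, 8, 12, 17, 19, 21]
lowest, highest, spread = score_range(scores)

print(f"Range is {lowest} to {highest}, or {spread} ({highest} - {lowest})")
```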