DEPICTING DISTRIBUTIONS. Y axis: how many at each value/score. X axis: value or score of variable.

Presentation transcript:

DEPICTING DISTRIBUTIONS

Y axis: how many at each value/score. X axis: value or score of variable.

What is a “distribution”?
An arrangement of cases in a sample or population according to their values or scores on one or more variables. (A case is a single unit that “contains” all the variables of interest.)
Examples:
- One distribution for a single variable. Each youth homicide is a case. There is one variable: the number each month.
- Two distributions, each for a single variable: violent crime or imprisonment. Each violent crime is a case; the variable is their number each year (divided by 100,000). Each prisoner is a case; the variable is their number each year (divided by 100,000).
- One distribution for TWO variables: youth’s demeanor (two categories) and officer disposition (four categories). Each police encounter with a youth is a case.
Distributions can be visually depicted. How that is done depends on the kind of variable, categorical or continuous.

Depicting the distribution of categorical variables: the bar graph
Distributions depict the frequency (number of cases) at each value of a variable. Here there is one variable: gender.
A case is a single unit that “contains” all the variables of interest. Here each student is a case.
Frequency means the number of cases (students) at a single value of a variable.
Frequencies are always on the Y axis. Values of the variable are always on the X axis.
Distributions illustrate how cases cluster or spread out according to the value or score of the variable. Here the proportions of men and women seem about equal (n=15 and n=17).

Depicting the distribution of continuous variables: the histogram
Distributions depict the frequency (number of cases) at each value of a variable.
Frequency means the number of cases at a single value of a variable.
Frequencies (“counts”) are always on the Y axis. Values of the variable are always on the X axis.
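The counting behind a histogram can be sketched in a few lines of Python. This is only an illustration: it assumes the twenty precinct arrest scores listed on the median slide later in this transcript, and the function name `text_histogram` is made up for the example. Because the output is plain text, the values run down the side and the bars extend to the right, rather than values sitting on the X axis.

```python
from collections import Counter

# Arrest counts for twenty precincts (scores listed on the median slide).
arrests = [0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6]

def text_histogram(scores):
    """Return one line per value: the value, a bar whose length is the
    frequency (number of cases at that value), and the frequency itself."""
    freq = Counter(scores)  # missing values count as 0
    lines = []
    for value in range(min(scores), max(scores) + 1):
        lines.append(f"{value:>2} | {'#' * freq[value]} ({freq[value]})")
    return lines

for line in text_histogram(arrests):
    print(line)
```

The longest bar marks the value where cases cluster most heavily.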

Summarizing the distribution of CATEGORICAL VARIABLES

Summarizing the distribution of categorical variables using percentage
Percentage is a “statistic.” It’s a proportion with a denominator of 100.
Percentages are used to summarize categorical data:
- 70 percent of students are employed; 60 percent of parolees recidivate
Since per cent means per 100, any decimal can be converted to a percentage by multiplying it by 100 (moving the decimal point two places to the right):
- .20 = .20 X 100 = 20 percent (twenty per hundred)
- .368 = .368 X 100 = 36.8 percent (thirty-six point eight per hundred)
When converting, remember that there can be fractions of one percent:
- .0020 = .0020 X 100 = .20 percent (two tenths of one percent)
To obtain a percentage for a category, divide the number of cases in the category by the total number of cases in the sample, then multiply by 100.
50,000 persons were asked whether crime is a serious problem: 32,700 said “yes.” What percentage said “yes”?
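The two conversions on this slide can be checked with a short Python sketch. The function names `decimal_to_percent` and `percentage` are illustrative, not part of the presentation; the numbers are the slide's own.

```python
def decimal_to_percent(proportion):
    """Convert a decimal proportion to a percentage (multiply by 100)."""
    return proportion * 100

def percentage(count, total):
    """Cases in one category as a percentage of all cases in the sample."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total

print(round(decimal_to_percent(0.368), 1))   # 36.8
print(round(decimal_to_percent(0.0020), 2))  # 0.2
print(round(percentage(32_700, 50_000), 1))  # 65.4
```

The last line works through the poll question: 32,700 of 50,000 is 65.4 percent.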

Using percentages to compare datasets
Percentages are “normalized” numbers (e.g., per 100), so they can be used to compare datasets of different size:
- Last year, 10,000 people were polled. Eight thousand said crime is a serious problem.
- This year, 12,000 people were polled. Nine thousand said crime is a serious problem.
Calculate the second percentage and compare it to the first.
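A minimal Python sketch of the comparison, using the two polls from this slide. The dictionary layout and the name `share_saying_yes` are assumptions made for the example.

```python
# (said "serious problem", number polled) for each year, from the slide.
polls = {
    "last year": (8_000, 10_000),
    "this year": (9_000, 12_000),
}

def share_saying_yes(yes, polled):
    """Percentage of those polled who said crime is a serious problem."""
    return 100 * yes / polled

shares = {label: share_saying_yes(yes, n) for label, (yes, n) in polls.items()}
for label, pct in shares.items():
    print(f"{label}: {pct:.0f} percent")

change = shares["this year"] - shares["last year"]
print(f"change: {change:+.0f} percentage points")
```

The raw count rose from 8,000 to 9,000, yet the percentage fell from 80 to 75, which is why normalized numbers are the fairer basis for comparing samples of different size.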

Practical exercise
Draw two bar graphs, one for each class (Class 1 and Class 2), depicting proportions for gender.

Wed. class
15 Females: 15/31 = .484 X 100 = 48%
16 Males: 16/31 = .516 X 100 = 52%
Total: 100%
Thurs. class
20 Females: 20/31 = .645 X 100 = 65%
11 Males: 11/31 = .355 X 100 = 35%
Total: 100%
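The bar graphs for the two classes could be roughed out in text form as follows. This is a sketch only: the function names and the 50-character bar width are arbitrary choices, and the bars run horizontally because the output is plain text.

```python
# Gender counts for each class, from the slide.
classes = {
    "Wed. class": {"Females": 15, "Males": 16},
    "Thurs. class": {"Females": 20, "Males": 11},
}

def gender_percentages(counts):
    """Whole-number percentage of cases in each category."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}

def bar_graph(counts, width=50):
    """One text bar per category, scaled so 100% spans `width` characters."""
    lines = []
    for category, pct in gender_percentages(counts).items():
        bar = "#" * round(pct * width / 100)
        lines.append(f"{category:<8}{bar} {pct}%")
    return lines

for name, counts in classes.items():
    print(name)
    for line in bar_graph(counts):
        print("  " + line)
```

Plotting percentages rather than raw counts keeps the two graphs comparable even when class sizes differ.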

Calculating increases in percentage
Increases in percentage are computed off the base amount.
Example: Jail with 120 prisoners. How many prisoners will there be with a...
100 percent increase?
- 100 percent of the base amount, 120, is 120 (120 X 100 / 100)
- 120 base plus 120 increase = 240 (2 times the base amount)
150 percent increase?
- 150 percent of 120 is 180 (120 X 150 / 100)
- 120 base plus 180 increase = 300 (2 ½ times the base amount)
How many will there be with a 200 percent increase?
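The two worked cases can be reproduced with a small Python sketch; the function name `after_percent_increase` is hypothetical. It follows the slide's two steps: find the increase as a percentage of the base, then add it to the base.

```python
def after_percent_increase(base, percent):
    """Return (increase, new_total) when `base` grows by `percent` percent
    of the base amount."""
    increase = base * percent / 100
    return increase, base + increase

for pct in (100, 150):
    increase, total = after_percent_increase(120, pct)
    print(f"{pct} percent increase: 120 + {increase:.0f} = {total:.0f}")
```

Calling the same function with another percentage is one way to check an answer to the closing question.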

Percentage changes can mislead
Answer to preceding slide (jail with 120 prisoners), 200 percent increase:
- 200 percent of 120 is 240 (120 X 200 / 100)
- 120 base plus 240 = 360 (3 times the base amount)
Percentages can make changes seem large when bases are small.
Example: Increase from 1 to 3 convictions is 200 (two-hundred) percent
- 3 - 1 = 2; 2/base = 2/1 = 2; 2 X 100 = 200%
Percentages can make changes seem small when bases are large.
Example: Increase from 5,000 to 6,000 convictions is 20 (twenty) percent
- 6,000 - 5,000 = 1,000; 1,000/base = 1,000/5,000 = .20; .20 X 100 = 20%
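A short Python sketch of the percentage-change calculation used in both examples. The name `percent_change` is illustrative; the zero-base check is an added safeguard, not something the slide discusses.

```python
def percent_change(old, new):
    """Change from `old` to `new`, expressed as a percentage of `old` (the base)."""
    if old == 0:
        raise ValueError("percentage change is undefined for a base of zero")
    return 100 * (new - old) / old

print(percent_change(1, 3))          # 200.0
print(percent_change(5_000, 6_000))  # 20.0
```

The small-base case adds only 2 convictions yet shows 200 percent, while the large-base case adds 1,000 convictions and shows 20 percent, so it helps to report the raw numbers alongside the percentage.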

Summarizing the distribution of CONTINUOUS VARIABLES

Four summary statistics
Continuous variables (review):
- Can take on an infinite number of values (e.g., age, height, weight, sentence length)
- Precise differences between cases
- Equivalent differences: the distance between any two consecutive years is the same
Summary statistics for continuous variables:
- Mean: arithmetic average of scores
- Median: midpoint of scores (half higher, half lower)
- Mode: most frequent score (or scores, if tied)
- Range: difference between low and high scores

Summarizing the distribution of continuous variables - the mean
Arithmetic average of scores:
- Add up all the scores
- Divide the result by the number of scores
Example: Compare numbers of arrests for twenty police precincts during a certain shift
Method: Use mean to summarize arrests at each precinct, then compare the means
Variable: number of arrests
Unit of analysis: police precincts
Case: one precinct
Graph labels: Mean 3.0; Mean 3.5 (X axis: arrests)
Issue: Means are pulled in the direction of extreme scores, possibly misleading the comparison
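The pull of an extreme score can be seen in a brief Python sketch. It rests on two assumptions: that the twenty scores are the ones listed on the median slide (their mean is 3.0, matching this slide), and that the second distribution replaces the highest score, 6, with the outlier of 16 mentioned there. The function name `mean_score` is made up for the example.

```python
def mean_score(scores):
    """Add up all the scores, then divide by the number of scores."""
    return sum(scores) / len(scores)

arrests = [0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6]
# Assumption: the outlier version swaps the highest score (6) for 16.
with_outlier = arrests[:-1] + [16]

print(mean_score(arrests))       # 3.0
print(mean_score(with_outlier))  # 3.5
```

Changing one case out of twenty moves the mean by half an arrest, even though the other nineteen precincts are identical.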

Using the mean for ordinal variables
Ordinal variables are categorical variables with an inherent order:
- Small, medium, large
- Cooperative, uncooperative
Can summarize in the ordinary way: proportions / percentages.
Or, treat categories as points on a continuous scale and calculate a mean. Not always recommended because “distances” between points on the scale may not be equal, causing misleading results.
Is the distance between “Admonished” and “Informal” the same as between “Informal” and “Citation”? “Citation” and “Arrest”?

Rank  Severity of disposition          Youths (Freq.)  %
4     Arrested                         16              24
3     Citation or official reprimand   9               14
2     Informal reprimand               16              24
1     Admonished & release             25              38
      Total                            66              100

Severity of disposition mean = [(25 X 1) + (16 X 2) + (9 X 3) + (16 X 4)] / 66 = 148 / 66 = 2.24
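The mean severity can be recomputed from the frequencies with a short Python sketch. The rank-to-frequency pairs come from the slide's own formula; the names `dispositions` and `mean_rank` are illustrative.

```python
# rank -> number of youths, as used in the slide's formula
dispositions = {1: 25, 2: 16, 3: 9, 4: 16}

def mean_rank(freq_by_rank):
    """Weighted mean: sum of (rank X frequency) divided by the number of cases."""
    total_cases = sum(freq_by_rank.values())
    weighted_sum = sum(rank * n for rank, n in freq_by_rank.items())
    return weighted_sum / total_cases

print(sum(dispositions.values()))         # 66
print(round(mean_rank(dispositions), 2))  # 2.24
```

The arithmetic treats each step in rank as the same size, which is exactly the assumption the slide warns may not hold for ordinal categories.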

Summarizing the distribution of continuous variables - the median
Median can be used with continuous or ordinal variables.
Median is a useful summary statistic when there are extreme scores, making the mean misleading.
0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6: median = (3 + 3) / 2 = 3 arrests
In this example, which is identical to the preceding page except for one outlier (16 in place of the 6), the mean is 3.5, which is .5 higher. But the medians (3.0) are the same.
Compute the median:
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21

Answers to preceding slide
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21. Answer: 8
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21. Answer: 10 ((8 + 12) / 2)
Median can be used with continuous or ordinal variables.
Median is a useful summary statistic when there are extreme scores, making the mean misleading.
0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6: median = (3 + 3) / 2 = 3 arrests
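The median rule behind these answers can be sketched in Python as follows. The function name `median_score` is illustrative: with an odd number of scores it returns the middle one, and with an even number it averages the two middle scores.

```python
def median_score(scores):
    """Midpoint of the sorted scores (half higher, half lower)."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two

print(median_score([2, 3, 5, 5, 8, 12, 17, 19, 21]))      # 8
print(median_score([2, 3, 5, 5, 8, 12, 17, 19, 21, 21]))  # 10.0
```

Because only the middle of the sorted list matters, replacing the highest score with an extreme value leaves the median unchanged.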

Summarizing the distribution of continuous variables - the mode
Score that occurs most often (with the greatest frequency). Here the mode is 3 (X axis: arrests).
Modes are a useful summary statistic when cases cluster at particular scores, an interesting condition that might otherwise be overlooked.
Symmetrical, bell-shaped distributions, like this one, are called “normal” distributions. In such distributions the mean, mode and median are the same. Near-normal distributions are common.
There can be more than one mode (bi-modal, tri-modal, etc.).
Identify the modes:
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21

A final way to depict the distribution of continuous variables - the range
Answers to preceding slide:
Exercise 1: 2, 3, 5, 5, 8, 12, 17, 19, 21. Mode = 5 (unimodal)
Exercise 2: 2, 3, 5, 5, 8, 12, 17, 19, 21, 21. Modes = 5, 21 (bimodal)
Range: a simple way to convey the distribution of a continuous variable
- Depicts the lowest and highest scores in a distribution. For 2, 3, 5, 5, 8, 12, 17, 19, 21 the range is “2 to 21”
- Range can also be defined as the difference between the scores (21 - 2 = 19). If so, minimum and maximum scores should also be given.
- Useful to cite range if there are outliers (extreme scores) that misleadingly distort the shape of the distribution
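Both the mode answers and the range example can be verified with a small Python sketch. The function names `modes` and `score_range` are made up for the illustration; `modes` returns every score tied for the highest frequency, so it handles the bimodal case.

```python
from collections import Counter

def modes(scores):
    """All scores that occur with the greatest frequency, in ascending order."""
    freq = Counter(scores)
    top = max(freq.values())
    return sorted(value for value, n in freq.items() if n == top)

def score_range(scores):
    """Return (lowest, highest, difference between them)."""
    low, high = min(scores), max(scores)
    return low, high, high - low

exercise_1 = [2, 3, 5, 5, 8, 12, 17, 19, 21]
exercise_2 = [2, 3, 5, 5, 8, 12, 17, 19, 21, 21]

print(modes(exercise_1))        # [5]
print(modes(exercise_2))        # [5, 21]
print(score_range(exercise_1))  # (2, 21, 19)
```

Returning the minimum and maximum together with the difference follows the slide's advice to give both whenever the range is reported as a single number.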

In-class exercise
Calculate your class summary statistics for age and height: mean, median, mode and range.
Pictorially depict the distributions for age and height, placing the variables and frequencies on the correct axes.

Next week – Every week: Without fail – bring an approved calculator – the same one you will use for the exam. It must be a basic calculator with a square root key. NOT a scientific or graphing calculator. NOT a cell phone, etc.

HOMEWORK EXERCISE (link on weekly schedule)
Data table columns: Case No., Income, No. of arrests, Gender
Gender values, in case order: M F F M F M F F M F F M F F M M F M F F
1. Calculate all appropriate summary statistics for each distribution
2. Pictorially depict the distribution of arrests
3. Pictorially depict the distribution of gender