APPLIED DATA ANALYSIS IN CRIMINAL JUSTICE CJ 525 MONMOUTH UNIVERSITY Juan P. Rodriguez
Lecture 5 Assessing Associations Bivariate Analysis
Juan P. Rodriguez - Fall
Perspective
Research Techniques
- Accessing, Examining and Saving Data
- Univariate Analysis: Descriptive Statistics
- Constructing (Manipulating) Variables
- Association: Bivariate Analysis
- Association: Multivariate Analysis
- Comparing Group Means: Bivariate
- Multivariate Analysis: Regression
Assessing Association: Bivariate Analysis
- Why do we need significance tests?
- Analyzing bivariate relationships
  - Categorical variables: cross tabulations, bar charts
  - Numerical variables: correlations, scatter plots
Variable Relationships
Questions addressed by social science (SS) examine relationships between variables:
- Is the death penalty associated with lower crime?
- Is school funding related to educational success?
- Are religious people more conservative?
We need to know whether an observed relationship is real or the result of chance. Significance tests address that question.
Significance Tests
Patterns in data are due to either:
- Random chance: the null hypothesis
- Real relationships: the alternate hypothesis
Significance tests rely on significance levels (SL): estimates of the probability that chance alone explains the observed pattern.
Probability: a mathematical measurement of the likelihood that an event has occurred or will occur. It ranges from 0 to 1.
Significance Tests
- A high significance level (SL) indicates a strong possibility that the observed pattern is due to chance.
- A low SL indicates that chance alone is unlikely to explain the observed pattern, so the alternate hypothesis (AH) should be considered.
- Strictly, the SL is the probability of observing a pattern at least this strong if the null hypothesis were true.
Significance Tests
Statistically significant: the significance level is low enough to rule out chance as a likely explanation, usually below 0.05 or 0.01.
Significance Tests
Significance depends on:
- Strength of the association
- Sample size
  - Small samples need a strong association to reach significance
  - Large samples can reach significance with a weaker association
Significance tests do not indicate that:
- The relationship is important
- The relationship is causal
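The effect of sample size can be illustrated with a small sketch. It uses a two-proportion z-test, which is a stand-in chosen for brevity and not the test used in this lecture, and the proportions and group sizes are hypothetical. The same 10-point difference is tested once with 50 cases per group and once with 1000.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value for a difference between two proportions (z-test)."""
    # Both groups have the same size, so the pooled rate is the simple mean.
    pooled = (p1 + p2) / 2
    std_err = sqrt(pooled * (1 - pooled) * (2 / n_per_group))
    z = (p1 - p2) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical proportions: 60% of one group vs 50% of the other.
p_small = two_proportion_p_value(0.60, 0.50, n_per_group=50)
p_large = two_proportion_p_value(0.60, 0.50, n_per_group=1000)

print(f"n = 50 per group:   p = {p_small:.3f}")
print(f"n = 1000 per group: p = {p_large:.6f}")
```

The strength of the association is identical in both runs. Only the larger sample falls below the usual 0.05 cutoff, which is why a significant result says nothing by itself about importance.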
Bivariate Relationships: Categorical Variables
Cross tabulations: grids of all possible combinations of the values of 2 categorical variables.
Example: full-time and part-time work preferences by gender.
Load dataset GSS98.
Bivariate Relationships: Categorical Variables
Null hypothesis: there is no association between gender and job preference, i.e., women and men do not differ in their preferences for full-time and part-time work.
Bivariate Relationships: Categorical Variables
Group sample sizes are not equal (they rarely are), so raw counts cannot be compared directly.
A solution is to convert the counts to percentages within each group.
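A minimal sketch of the conversion, outside SPSS. The counts are invented for illustration and are not the GSS98 figures; the group labels and the helper name `row_percentages` are likewise assumptions.

```python
# Hypothetical counts (NOT the GSS98 figures): one row per group.
counts = {
    "Men": {"Full time": 284, "Part time": 116},
    "Women": {"Full time": 280, "Part time": 320},
}


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for group, cells in table.items():
        total = sum(cells.values())
        result[group] = {label: 100 * n / total for label, n in cells.items()}
    return result


percentages = row_percentages(counts)
for group, cells in percentages.items():
    summary = ", ".join(f"{label}: {pct:.1f}%" for label, pct in cells.items())
    print(f"{group} (n = {sum(counts[group].values())}): {summary}")
```

In these made-up counts almost the same number of men and women choose full-time work (284 vs 280), yet the groups differ in size, so the percentages tell a very different story from the raw counts.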
Bivariate Relationships: Categorical Variables
There seems to be a relationship between sex and work preferences:
- Men (71%) are more likely than women (46.6%) to say they prefer full-time work
- Women are more likely than men to say they prefer part-time work
It would appear that the null hypothesis is false. Before drawing that conclusion, we need a test of significance.
Bivariate Relationships: Categorical Variables
SPSS offers several tests of significance. We'll use chi-square.
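To show what the chi-square test computes, here is a sketch for a 2x2 table using only the Python standard library. The observed counts are hypothetical, not the GSS98 values, and the function name `chi_square_2x2` is an assumption. It computes the Pearson chi-square without a continuity correction, so it need not match every line of the SPSS output.

```python
from math import erfc, sqrt

# Hypothetical 2x2 table: rows = men, women; columns = full time, part time.
observed = [[284, 116],
            [280, 320]]


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3g}")
```

The statistic grows as the observed counts move away from the counts expected under the null hypothesis. The `erfc` shortcut is valid only for 2x2 tables; larger tables have more degrees of freedom and need the general chi-square distribution.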
Bivariate Relationships: Categorical Variables
Results of the significance test: the relationship is statistically significant because the significance level is very low. The probability of observing a relationship this strong between sex and work preference by chance alone is less than 1/1000.
Bivariate Relationships: Bar Charts
Bar charts display bivariate relationships between 2 categorical variables.
We'll graph the relationship from the previous example.
Bivariate Relationships: Bar Charts
The bar chart shows quite clearly how women outnumber men in their preference for part-time jobs.
Bivariate Relationships: Numerical Variables
Numerical variables take a range of values: age, income, miles driven to work.
Two ways to analyze them:
- Recode them into categorical variables and run a cross tabulation
- Run a correlation analysis on the numerical values
Bivariate Analysis - Numerical Variables: Correlations
Correlation is a measure of the degree to which the values of 2 variables correspond to each other.
Pearson's correlation coefficient measures the strength of the linear relationship between 2 variables.
It does not capture other types of relationships: curvilinear, U-shaped, inverted U.
Bivariate Analysis - Numerical Variables: Correlations
Linear relationships: a change in one variable is associated with a consistent change in another variable.
Correlation coefficients range from -1 (perfect negative relationship) through 0 (no linear relationship) to 1 (perfect positive relationship).
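A short sketch of how Pearson's coefficient is computed, written with the standard library so each step is visible. The data points are made up for illustration and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up data: a perfect positive line, a perfect negative line, a U shape.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
r_u_shaped = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_positive, r_negative, r_u_shaped)
```

The U-shaped example has a coefficient of zero even though y is completely determined by x, which is why a low correlation rules out only a linear relationship.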
Bivariate Analysis - Numerical Variables: Correlations
We'll examine the relationship between “social disorganization” and suicide.
Social disorganization indicators: crime, divorce, substance abuse.
Null hypothesis: there is no real linear relationship between suicide rates and divorce rates.
Alternate hypothesis: there is a positive linear relationship.
Bivariate Analysis - Numerical Variables: Correlations
Use the States dataset.
Bivariate Analysis - Numerical Variables: Correlations
There is a positive relationship between divorce and suicide rates (correlation coefficient 0.683).
This relationship is statistically significant (p < 0.001).
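The significance of a correlation is usually judged by converting the coefficient to a t statistic with n - 2 degrees of freedom. The sketch below does that for the reported coefficient. The sample size is an assumption: the slides do not state it, and 50 is used here on the guess that the States dataset has one row per state.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from 0."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = 0.683        # coefficient reported in the lecture
n_assumed = 50   # ASSUMPTION: one case per state; not stated in the slides

t = correlation_t_statistic(r, n_assumed)
print(f"t = {t:.2f} with {n_assumed - 2} degrees of freedom")
```

Under that assumption t comes out well above the value of roughly 3.5 needed for a two-tailed p < 0.001 at 48 degrees of freedom, which agrees with the reported significance. A different sample size would change t but, for any n near 50, not the conclusion.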
Bivariate Analysis - Numerical Variables: Scatter Plots
Scatter plots graph relationships between 2 numerical variables.
The “independent” variable is placed on the X axis and the dependent variable on the Y axis.