
Introduction to Inferential Statistics

Taking out the “loosey-goosey”

So far we’ve assessed relationships between variables two ways:
– Categorical variables: tables and proportions (percentages)
– Continuous variables: scattergrams and simple correlation (r)

Slide examples: Higher rank → more stress. Higher income → less crime (r = -.6, r² = .36).

Alas, results are usually less extreme than those above.
– What if 55 percent of officers are high stress and 45 percent low stress?
– What if the correlation coefficient (r) between income and crime is -.2 (r² of .04)?

Would we really want to stick our necks out and confirm the hypotheses? What would be the chance that we were wrong?

Inferential statistics

Inferential statistics are an extension of procedures that we’ve already used
– Provide far more precise assessments of relationships
– Allow us to properly “infer” (project) our results to populations
– Called “test” statistics because they are used to test hypotheses

Examples of inferential statistics
– Categorical variables: Chi-Square (X²)
– Combination of categorical independent and continuous dependent variable: Difference between the means test (t statistic)
– Continuous variables: Regression (r² and R²) and the b statistic, generated through regression analysis
– Combination of nominal and continuous variables: Logistic regression, which generates b and exp(b) (b exponentiated, a.k.a. the odds ratio)

Requirements
– Must use probability sampling techniques (e.g., random sampling)
– “Parametric” inferential statistics, including r², b and t: variables must be continuous and normally distributed in the population
– Non-parametric statistics: variables need not be normally distributed. We’ll cover one, Chi-Square (X²).

Some statistics used to test relationships

Procedure: Regression
  Level of measurement: All variables continuous
  Statistics: r², R², b
  Interpretation:
  – r²: Proportion of change in the dependent variable accounted for by change in the independent variable.
  – R²: Cumulative effect of multiple independent variables.
  – b: Unit change in the dependent variable associated with a one-unit change in the independent variable.

Procedure: Logistic regression
  Level of measurement: DV nominal and dichotomous; IVs nominal or continuous
  Statistics: b, exp(B)
  Interpretation:
  – b: Don’t try to interpret it directly.
  – exp(B): Change in the odds of the DV if the IV changes one unit or, if the IV is dichotomous, if it changes its state. Range 0 to infinity; 1 denotes even odds, or no relationship. Higher than 1 means a positive relationship, lower than 1 a negative relationship. Use percentage to describe likelihood of effect.

Procedure: Chi-Square
  Level of measurement: All variables categorical (nominal or ordinal)
  Statistic: X²
  Interpretation: Reflects the difference between Observed and Expected frequencies. Use a table to determine if the coefficient is sufficiently large to reject the null hypothesis.

Procedure: Difference between means
  Level of measurement: IV dichotomous, DV continuous
  Statistic: t
  Interpretation: Reflects the magnitude of the difference. Use a table to determine if the coefficient is sufficiently large to reject the null hypothesis.
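As a rough illustration of two entries in this table, the sketch below converts a logistic regression b into an odds ratio (and a percentage change in the odds), and computes Chi-Square from the gap between observed and expected frequencies. The function names, the value of b and the 2 x 2 table are hypothetical and chosen only for illustration; none of them come from the slides.

```python
import math


def odds_ratio(b):
    """Exponentiate a logistic regression coefficient b to get exp(b)."""
    return math.exp(b)


def percent_change_in_odds(b):
    """Percentage change in the odds of the DV for a one-unit change in the IV."""
    return (math.exp(b) - 1.0) * 100.0


def chi_square(observed):
    """Chi-Square for a table of observed frequencies given as a list of rows.

    Expected frequency for each cell = row total * column total / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Hypothetical coefficient: a b of 0 gives an odds ratio of 1 (no relationship).
b = 0.405
print("odds ratio:", round(odds_ratio(b), 2))
print("percent change in odds:", round(percent_change_in_odds(b), 1))

# Hypothetical 2 x 2 table (rows = IV categories, columns = DV categories).
table = [[30, 20],
         [20, 30]]
print("chi-square:", chi_square(table))
```

The Chi-Square value still has to be compared with a critical value from a table; for a 2 x 2 table (one degree of freedom) the cutoff at p < .05 is 3.84.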

General procedure

Types of hypotheses
– Working hypothesis: what a regular hypothesis is called
– Null hypothesis: its opposite, the presumption that any apparent relationship between variables is caused by chance

Draw one or more samples and code the independent and dependent variables

Use a test statistic to assess the working hypothesis
– The computer calculates a coefficient for the test statistic (e.g., r² = .20)
– These coefficients are the sum of two components
  – “Systematic” variance: the actual, “real” relationship between variables
  – “Error” variance: an apparent relationship, caused by sampling error. It shrinks as sample size increases.

The big question
Once we remove the error component, is enough “real” relationship left to reject the null hypothesis?
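The claim that error variance shrinks as sample size increases can be checked with a small simulation. The sketch below draws two variables that are unrelated by construction, so any r² it finds is pure sampling error, and averages that chance r² over many samples of a given size. The function names, the number of trials and the seed are arbitrary assumptions, not part of the slides.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_chance_r_squared(sample_size, trials=200, seed=1):
    """Average r-squared between two unrelated variables over many samples.

    Because x and y are generated independently, the null hypothesis is true
    and every nonzero r-squared is produced by sampling error alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        total += pearson_r(xs, ys) ** 2
    return total / trials


# The chance-only r-squared gets smaller as the samples get larger.
for n in (10, 100, 1000):
    print(n, round(mean_chance_r_squared(n), 4))
```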

Test statistics and the null hypothesis

To reject the null hypothesis, the test statistic coefficient (e.g., r² = .20) must be sufficiently large after subtracting sampling error.

How much “room” is required? Enough to yield a probability of less than five in one hundred (< .05) that the relationship between variables was produced by chance.
– If the computer decides that the coefficient is sufficiently large it will award at least one asterisk. The relationship between variables is “statistically significant” and the null hypothesis (no relationship) is rejected.
– If the coefficient is too small, no asterisk (*) is awarded. The association between variables is deemed “non-significant” and the null hypothesis is retained. Working hypotheses that depend on this relationship must be rejected.

For significant relationships, one to three asterisks usually appear next to the test statistic’s coefficient (e.g., .25*, .36**, .41***). More asterisks = greater confidence that a relationship is systematic, not the product of chance.
– * (Good): Probability less than 5 in 100 that a coefficient was produced by chance (p < .05)
– ** (Better): Probability less than 1 in 100 that a coefficient was produced by chance (p < .01)
– *** (Best): Probability less than 1 in 1,000 that a coefficient was produced by chance (p < .001)

Instead of asterisks, sometimes the actual probability that a coefficient was produced by chance is given, usually in a column labeled “p”.
– Again, significant relationships are denoted by p’s less than .05
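The asterisk convention described on this slide amounts to a simple lookup on the p-value. A minimal sketch follows; the function name `significance_stars` is hypothetical, and the cutoffs are the three listed on the slide.

```python
def significance_stars(p):
    """Return the asterisks conventionally printed next to a coefficient."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p < 0.001:
        return "***"   # less than 1 in 1,000
    if p < 0.01:
        return "**"    # less than 1 in 100
    if p < 0.05:
        return "*"     # less than 5 in 100
    return ""          # non-significant: no asterisk


# Hypothetical p-values as they might appear in a column labeled "p".
for p in (0.20, 0.049, 0.008, 0.0004):
    print(p, significance_stars(p) or "(none)")
```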

A caution on hypothesis testing…

Probabilities (that the null hypothesis is true) are the most common way to evaluate relationships.
– The smaller the probability, the more likely that the null hypothesis (meaning, no relationship) is false, and the greater the likelihood that the working hypothesis is true
– But this process has been criticized for suggesting misleading results. (The original slide links to a summary of the arguments.)

We normally use p-values to accept or reject null hypotheses. Their real meaning is subtle:
– Formally, a p < .05 means that, if an association between variables was tested an infinite number of times, a coefficient as large as the one actually obtained (say, an r² of .30) would come up less than five times in a hundred if the null hypothesis of no relationship was actually true.

For our purposes, as long as we keep in mind the inherent sloppiness of social science, and the difficulties of accurately quantifying social science phenomena, it’s sufficient to use p-values to accept or reject null hypotheses.

We should always be skeptical of findings of “significance,” particularly when very large samples are involved.
– When sample size is large (say, a thousand) even weak relationships can show up as statistically significant. (More on this later.)
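To see why large samples call for skepticism, the sketch below estimates a two-sided p-value for the same weak correlation (r = .10) at two sample sizes. It uses the Fisher z normal approximation, which is a substitute chosen for brevity rather than the t-based test a statistics package would normally report, so the p-values are approximate. The function name and the example values are assumptions.

```python
import math


def approx_p_value_for_r(r, n):
    """Approximate two-sided p-value for a correlation r from a sample of size n.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is treated as a
    standard normal score under the null hypothesis of no relationship.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail probability of a standard normal score this extreme.
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same weak relationship (r = .10, r-squared = .01) at two sample sizes.
for n in (30, 1000):
    p = approx_p_value_for_r(0.10, n)
    verdict = "significant" if p < 0.05 else "non-significant"
    print(f"n = {n}: p is about {p:.4f} ({verdict})")
```

With thirty cases the relationship is non-significant; with a thousand it passes the .05 cutoff even though it still accounts for only about one percent of the variance.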

Examples of tables from articles, panels 1-12

1 Hypothesis: Alcohol consumption → Victimization
Method: Logistic regression
Statistics: b and Odds Ratio (Exp b)
Richard B. Felson and Keri B. Burchfield, “Alcohol and the Risk of Physical and Sexual Assault Victimization,” Criminology (42:4, 2004)

2 Hypothesis: Veteran status → less punitive police response to domestic violence
Method: Logistic regression
Statistics: b and Odds Ratio (Exp b)
Fred Markowitz and Amy C. Watson, “Police Response to Domestic Violence Situations Involving Veterans Exhibiting Signs of Mental Illness,” Criminology (53:2, 2015)

3 Hypothesis: Race and class → Satisfaction with police
Method: Logistic regression
Statistics: b and Exp b (odds ratio)
Yuning Wu, Ivan Y. Sun and Ruth A. Triplett, “Race, Class or Neighborhood Context: Which Matters More in Measuring Satisfaction With Police?,” Justice Quarterly (26:1, 2009)

4 Hypothesis: Low self-control → More contact with police
Method: Logistic regression
Statistics: b and Exp b (odds ratio)
Kevin M. Beaver, Matt DeLisi, Daniel P. Mears and Eric Stewart, “Low Self-Control and Contact with the Criminal Justice System in a Nationally Representative Sample of Males,” Justice Quarterly (26:4, 2009)

5 Hypothesis: Gender and race of victim → Imposition of death sentence
Method: Logistic regression
Statistics: b (“coefficient”) and odds ratio (exp b)
Marian R. Williams, Stephen Demuth and Jefferson E. Holcomb, “Understanding the Influence of Victim Gender in Death Penalty Cases: The Importance of Victim Race, Sex-Related Victimization, and Jury Decision Making,” Criminology (45:4, 2007)

6 Hypothesis: Academic performance → Delinquency
Method: “Tobit” regression*
Statistic: b
* Best when the DV for a large proportion of cases has a zero value
Richard B. Felson and Jeremy Staff, “Explaining the Academic Performance-Delinquency Relationship,” Criminology (44:2, 2006)

7 Hypothesis: Strains of imprisonment → Recidivism
Method: Logistic regression
Statistics: B and exp B (odds ratio)
Shelley Johnson Listwan, Christopher J. Sullivan, Robert Agnew, Francis T. Cullen and Mark Colvin, “The Pains of Imprisonment Revisited: The Impact of Strain on Inmate Recidivism,” Justice Quarterly (30:1, 2013)

8 Hypothesis: Father’s incarceration → Son’s delinquency
Method: Tobit regression
Statistic: Random effect coefficient (S.E. in parentheses)
Michael E. Roettger and Raymond R. Swisher, “Associations of Fathers’ History of Incarceration With Sons’ Delinquency and Arrest Among Black, White and Hispanic Males in the United States,” Criminology (49:4, 2011)

8 (continued) Hypothesis: Father’s incarceration → Son’s delinquency
Method: Logistic regression
Statistic: Odds ratio (Standard Error in parentheses)
Michael E. Roettger and Raymond R. Swisher, “Associations of Fathers’ History of Incarceration With Sons’ Delinquency and Arrest Among Black, White and Hispanic Males in the United States,” Criminology (49:4, 2011)

9 Hypothesis: Officer and driver race → Vehicle search
Method: Logistic regression
Statistics: Odds ratio (Standard Error in parentheses)
Jeff Rojek, Richard Rosenfeld and Scott Decker, “Policing Race: The Racial Stratification of Searches in Police Traffic Stops,” Criminology (50:4, 2012)

10 Hypothesis: Offender race & gender → Use of intermediate sanctions
Method: Logistic regression
Statistics: b and Exp b (odds ratio)
Brian D. Johnson and Stephanie M. Dipietro, “The Power of Diversion: Intermediate Sanctions and Sentencing Disparity Under Presumptive Guidelines,” Criminology (50:3, 2012)

11 Hypothesis: Race & ethnicity → Prosecution and sentencing outcomes
Method: Logistic regression
Statistic: Odds ratio (Exp b)
Besiki L. Kutateladze, Nancy R. Andiloro, Brian D. Johnson and Cassia C. Spohn, “Cumulative Disadvantage: Examining Racial and Ethnic Disparity in Prosecution and Sentencing,” Criminology (52:3, 2014)

12 Hypothesis: Marriage → Desistance from crime
Method: HLM (like logistic regression)
Statistics: b (Coeff.) (Can compute log odds)
Bianca E. Bersani and Elaine Eggleston Doherty, “When the Ties That Bind Unwind: Examining the Enduring and Situational Processes of Change Behind the Marriage Effect,” Criminology (51:2, 2013)