PART 1: Models, metrics and the demystification of statistical significance FRSS!!!!



Causal modelling (theory-based metrics)
Key terminology:
– Independent variables (IV) = causal factors (+ve or –ve)
– Dependent variables (DV) = effects/outcomes
– Moderating variables (MV) = modify cause-effect relations
[Diagram: IV, DV, MV]
– Treatment/intervention effect = impact of the IV

Group exercise 1
Alcohol-related violence (AVC) and its reduction is a priority area of social policy.
We need to understand its “epidemiology”, i.e. those factors which influence its prevalence:
– In your groups, produce a causal model for AVC, identifying those socio-demographic factors you think are key.

Evidence-based policy: crime control
Preston street drinking ban
Ambulance incidents (monthly), change from before the ban to after the ban:
– Target zone: -9%
– County demand: +5%
% of total serious violent crime committed within the target zone: reduced from 14.8% to 12.3%
All effects statistically significant, BUT are there any validity concerns or alternative explanations?
Guess what…
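The fall from 14.8% to 12.3% can be expressed in two ways, and it helps to be explicit about which one is being quoted. A minimal sketch using only the two figures on the slide (the variable names are illustrative):

```python
# Share of total serious violent crime committed within the target zone,
# as reported on the slide (in percent).
share_before = 14.8
share_after = 12.3

# Absolute change, in percentage points.
point_change = share_after - share_before

# Relative change, as a percentage of the starting share.
relative_change = 100 * (share_after - share_before) / share_before

print(f"Change: {point_change:.1f} percentage points")
print(f"Relative change: {relative_change:.1f}%")
```

The same reduction is 2.5 percentage points, or roughly 17% of the original share.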

Statistical testing and the null hypothesis (H0)
A street drinking ban is being implemented in an effort to reduce alcohol-related violence (AVC)
– Randomised controlled trial (RCT) used to evaluate: 6 towns chosen, 3 randomly picked for the drinking ban
– Why randomisation?
After three months, the levels of AVC had fallen in the three “treatment” sites, with no change in the “controls”
– Has the intervention been effective? How strong is the evidence?
H0 = the intervention has not changed anything
– How might we explain the results if H0 is true?
– What is the probability of getting the observed evidence on this assumption?
Statistical inference:
– If the probability of getting results as extreme as those obtained, assuming H0 to be true, is less than some threshold value (typically 1 in 20), then reject H0 and conclude that the effects are genuine, i.e. unlikely to have occurred by chance
This is the principle of STATISTICAL SIGNIFICANCE
– NB. Not the same as substantive significance! With enough data, even a very small treatment effect can reach statistical significance.
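One way to make the logic of H0 concrete for the six-town trial is an exact permutation test: if the ban changed nothing, any 3 of the 6 towns could equally well have been labelled “treatment”, so we can count how many of those labellings give a result at least as extreme as the one observed. The sketch below assumes invented before/after changes for six hypothetical towns; none of these numbers come from the slides.

```python
from itertools import combinations

# Hypothetical change in monthly AVC incidents per town (after minus before).
# These figures are invented for illustration only.
changes = {
    "Town A": -12, "Town B": -9, "Town C": -7,  # drinking ban ("treatment")
    "Town D": 1, "Town E": 0, "Town F": -1,     # no ban ("control")
}
treated = ("Town A", "Town B", "Town C")


def mean_difference(treated_towns, data):
    """Mean change in the treated towns minus mean change in the controls."""
    t = [data[name] for name in treated_towns]
    c = [value for name, value in data.items() if name not in treated_towns]
    return sum(t) / len(t) - sum(c) / len(c)


def permutation_p_value(treated_towns, data):
    """One-sided exact p-value: the share of all possible random assignments
    whose mean difference is at least as negative as the observed one."""
    observed = mean_difference(treated_towns, data)
    assignments = list(combinations(data, len(treated_towns)))
    as_extreme = sum(
        1 for group in assignments
        if mean_difference(group, data) <= observed + 1e-12
    )
    return as_extreme / len(assignments), len(assignments)


p_value, n_assignments = permutation_p_value(treated, changes)
print(f"{n_assignments} possible assignments, one-sided p = {p_value:.2f}")
```

With only six towns there are 20 ways to pick three for treatment, so the smallest one-sided p-value this design can produce is 1 in 20. In this made-up data set the observed assignment is the single most extreme one, which only just reaches the conventional threshold.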

PART 2: Exploratory data analysis
The term EDA was coined by John Tukey (1977), who likened EDA to “detective work”.
In EDA, the role of the researcher is to explore the data in as many ways as possible until a plausible "story" of the data emerges.
– A detective does not collect just any information. Instead he collects evidence and clues related to the central question of the case.
Some tools of the trade, using Excel:
– Histograms (stem-and-leaf diagrams)
– Scatter-plots
– Correlation coefficients

Exercise 1: The Humble Histogram
1) Open Workshop Spreadsheet
2) Select Violent Crime Incidents worksheet
3) From Data Analysis on the Tools menu select Histogram
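For readers without the workshop spreadsheet, the idea behind the histogram tool can be sketched in a few lines of Python: count how many values fall into each equal-width bin. The incident counts below are hypothetical, and the binning rule (lower bound included, upper bound excluded) is an assumption of this sketch, not a replica of Excel's output.

```python
def histogram(values, bin_width, start=0):
    """Count values into equal-width bins [start, start + bin_width), ...

    Assumes every value is >= start. Returns one count per bin,
    including empty bins, so gaps in the distribution stay visible.
    """
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for value in values:
        counts[int((value - start) // bin_width)] += 1
    return counts


# Hypothetical violent crime incidents per ward (invented for illustration).
incidents = [3, 7, 12, 5, 48, 9, 15, 2, 22, 6, 11, 4, 95, 8, 13]

bin_width = 10
counts = histogram(incidents, bin_width)
for i, count in enumerate(counts):
    lower = i * bin_width
    print(f"{lower:>3} to {lower + bin_width - 1:<3} {'#' * count}")
```

Printing one `#` per ward gives a quick text histogram. With data like these, most wards sit in the lowest bin and a few sit far out to the right.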

Exercise 2: the not-so-scatty scatterplot!
Select the “Ward Profile” worksheet
– Inspect it carefully!
– What relationships should we look at in terms of the causal modelling exercise?
Draw a scatterplot relating crime incidence to the no. of liquor outlets:
– Highlight “No pubs” and the crimes column
– Click Chart on the Insert menu
– Select XY (scatter) and press Next
– Hey Presto!!

Exercise 3: correlation
Correlation coefficients (r) measure the strength of the linear relationship between two variables
– 1 = perfect positive correlation, 0 = no linear relation
– What would r = -1 mean?
What is the correlation between the no. of pubs per ward and the rate of AVC?
– Select Correlation from Data Analysis tools
– Identify variables of interest
– Click OK and the correlation matrix appears in a new sheet
How can the model be improved?
– Create a new variable (pub density) and repeat the correlational analysis
– Is the correlation higher?
NB: correlation does not mean causality!
[Screenshot callouts: “Highlight the appropriate columns”; “Click here when columns identified”]
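The calculation behind the correlation tool can be sketched directly. The example below computes Pearson's r from its definition, then derives a pub density variable and repeats the analysis, mirroring the steps of the exercise. The ward figures and the choice of area in square kilometres as the denominator for density are assumptions made for illustration; the workshop spreadsheet may define things differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ward data (invented for illustration).
pubs = [2, 5, 1, 8, 3, 12]                    # no. of pubs per ward
area_km2 = [4.0, 2.5, 6.0, 2.0, 3.0, 1.5]     # ward area (assumed unit)
avc_rate = [3.1, 6.0, 2.2, 9.5, 4.0, 14.8]    # AVC incidents per 1,000 residents

# New variable: pubs per square kilometre.
pub_density = [p / a for p, a in zip(pubs, area_km2)]

r_count = pearson_r(pubs, avc_rate)
r_density = pearson_r(pub_density, avc_rate)
print(f"r (pub count vs AVC rate):   {r_count:.3f}")
print(f"r (pub density vs AVC rate): {r_density:.3f}")
```

Whether the density variable correlates more strongly than the raw count depends entirely on the data, so the comparison is worth running rather than assuming.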

Exercise 4: More truffling…
Draw a histogram showing the distribution of crime:
– Use automatic “bin widths”
– Choose a more meaningful set
– Is the distribution “normal”?
Use basic Excel functions (sorting) to explore relationships between crime rates and contextual factors:
– Sort the table by crime rates, highest crime first
– What stands out?
– What percentage of crimes occur in the top ten wards?
If time allows, investigate other possible correlations concerning crime, and also between other variables…
– How strong is the link with deprivation?
– Is this what you’d expect?
– Would the relationship be stronger with other sorts of crime?
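The sort-and-share step can also be sketched outside Excel: rank wards from highest to lowest crime, then work out what percentage of all crime falls in the top few. The ward names and counts below are hypothetical, and the example uses the top two wards only because the made-up table has six rows; the exercise itself asks about the top ten.

```python
def top_share(crimes_by_ward, top_n=10):
    """Rank wards by crime count (highest first) and return the ranking
    together with the percentage of all crime in the top_n wards."""
    ranked = sorted(crimes_by_ward.items(), key=lambda item: item[1], reverse=True)
    total = sum(crimes_by_ward.values())
    top_total = sum(count for _, count in ranked[:top_n])
    return ranked, 100 * top_total / total


# Hypothetical crime counts per ward (invented for illustration).
crimes = {
    "Ward 1": 12, "Ward 2": 95, "Ward 3": 8,
    "Ward 4": 48, "Ward 5": 22, "Ward 6": 15,
}

ranked, share = top_share(crimes, top_n=2)
for ward, count in ranked:
    print(f"{ward}: {count}")
print(f"Top 2 wards account for {share:.1f}% of crimes")
```

A heavily skewed distribution shows up immediately in this kind of summary, which is the sort of pattern the “what stands out?” question is pointing at.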

Any other business
Data-mining example: Govmetric
Statistics packages:
– Mostly rather expensive, e.g. SPSS
Open source:
– R is brilliant, though hard work to learn
– R Commander provides a GUI