Statistical hypothesis testing – Inferential statistics II. Testing for associations.

Three main topics
– Chi-square test of association
– Correlation analysis
– Linear regression analysis

Introduction
Association: a general term for the relationship between two variables. If two variables are associated, the value of one can be guessed, more or less, from the value of the other. In short, they are NOT independent of each other in the statistical sense. E.g.:
– Hair and eye colour: if someone's hair is brown, it is likely that their eyes are brown too.
– Length and weight of fish: the longer the fish, the greater its weight.

Chi-square test of association
We use this test to examine the association between two or more categorical (nominal or factor) variables. Data should be arranged in a contingency table: the categories of one variable form the rows, the categories of the other form the columns, and each cell contains the observed frequency of cases.

Contingency tables test whether the pattern of frequencies in one categorical variable differs between the levels of the other categorical variable: could the variables be independent of one another?
H0: the observations are independent of one another, i.e. the categorical variables are not associated.
Test statistic: χ²
Null distribution: χ² distribution with df = (number of rows − 1) × (number of columns − 1)
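The χ² statistic compares observed cell frequencies with those expected under independence. A minimal pure-Python sketch (the function name, variable names, and example counts are illustrative, not from the lecture):

```python
def chi_square_stat(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency under H0: (row total * column total) / grand total
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: hair colour (rows) vs. eye colour (columns)
table = [[30, 10],
         [10, 30]]
stat = chi_square_stat(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # df = (rows - 1) x (cols - 1)
```

In practice the statistic and its p-value would be obtained in one call, e.g. with `scipy.stats.chi2_contingency`.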

Correlation analysis
Correlation:
– A monotonic type of association: the greater the value of one variable, the greater (positive correlation) or the smaller (negative correlation) the value of the other variable.
The two variables must be measured on at least an ordinal scale. There is no distinction between dependent and independent variables => no attempt is made to interpret the causality of the association.
Two frequently used types of correlation:
– Pearson's product-moment correlation
– Spearman's rank correlation.

Pearson’s product-moment correlation
– Correlation coefficient (r): measures the strength of the relationship between two variables, with −1 ≤ r ≤ 1:
r = −1: perfect negative correlation
r = 1: perfect positive correlation
r = 0: no correlation
– H0: r = 0; H1: r ≠ 0
– Assumptions:
Both variables are measured on a continuous scale.
Both variables are normally distributed.
If the assumptions are not met, Spearman’s rank correlation should be used.
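The coefficient is the covariance of the two variables divided by the product of their standard deviations. A minimal sketch using the fish length/weight example (the data values are invented for illustration):

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

# Hypothetical fish data: longer fish tend to be heavier
length = [10, 20, 30, 40]
weight = [5, 9, 16, 22]
r = pearson_r(length, weight)   # strong positive correlation, close to 1
```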

Spearman’s rank correlation
– The same coefficient as Pearson’s, but computed on the ranks of the data rather than on the raw values.
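Because it works on ranks, Spearman's coefficient equals 1 for any perfectly monotonic relationship, even a non-linear one. A self-contained sketch (function names and data are illustrative):

```python
import math

def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    svals = sorted(values)
    return [svals.index(v) + (svals.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Spearman's rho = Pearson's r computed on the ranks of the data."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]        # monotonic but not linear
rho = spearman_rho(x, y)  # rho = 1: the ranks agree perfectly
```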

Regression analysis
We assume a dependence structure between the variables:
– dependent (response) variable (Y) – effect
– independent (explanatory or predictor) variable (X) – cause.
Aim of the analysis: describe the relationship between Y and X in functional form. This function can then be used for prediction.
Simple linear regression: there is only one X variable in the model: Y = b0 + b1X
Multiple linear regression: there are two or more X variables in the model: Y = b0 + b1X1 + b2X2 + … + bpXp

Simple linear regression model: y = β0 + β1x
Parameters of the model:
β0: the value of y when x = 0 (y-intercept)
β1: the change in y per unit change in x (gradient of the line, i.e. the regression slope)
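The least-squares estimates of the two parameters have closed forms: b1 is the covariance of x and y divided by the variance of x, and b0 follows from the means. A minimal sketch with invented data that lies exactly on y = 1 + 2x:

```python
def fit_simple_linear(x, y):
    """Least-squares estimates of beta0 (intercept) and beta1 (slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx   # the fitted line passes through (mean x, mean y)
    return b0, b1

x = [1, 2, 3, 4]
y = [3, 5, 7, 9]                 # exactly y = 1 + 2x
b0, b1 = fit_simple_linear(x, y)  # recovers intercept 1 and slope 2
```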

Hypothesis tests in simple linear regression:
– F-test: the overall test of the model.
– t-test for zero intercept: H0: β0 = 0; H1: β0 ≠ 0
– t-test for zero slope (in simple linear regression its result is the same as that of the F-test):
H0: β1 = 0 – there is no linear relationship between X and Y.
H1: β1 ≠ 0 – there is a linear relationship between X and Y.
Coefficient of determination (R²):
– Gives the proportion of the variation in Y that is accounted for by X.
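R² and the slope t-statistic can both be computed from the fitted line's sums of squares. A self-contained sketch (data and names are invented; the t-statistic would be compared to a t distribution with n − 2 df to obtain a p-value):

```python
import math

# Hypothetical data, roughly y = 2x with a little noise
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
b1 = sxy / sxx          # least-squares slope
b0 = my - b1 * mx       # least-squares intercept

fitted = [b0 + b1 * xi for xi in x]
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual SS
ss_tot = sum((yi - my) ** 2 for yi in y)                   # total SS
r2 = 1 - ss_res / ss_tot   # proportion of variation in Y explained by X

# t statistic for H0: beta1 = 0
se_b1 = math.sqrt(ss_res / (n - 2) / sxx)  # standard error of the slope
t = b1 / se_b1
```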

Residuals of the model (error):
– The variation in the data left over after the linear regression model has been fitted.
Model validation process:
– After fitting a model to our data, we need to check whether the assumptions of linear regression analysis are met.
– This can be done by examining the residuals of the fitted model.
Assumptions of the linear regression model:
– Independence: observations are independent of one another.
– Normality: the populations of Y-values and the error terms (εi) are normally distributed for each value of the predictor variable xi.
– Homogeneity of variance: the populations of Y-values and the error terms (εi) have the same variance for each xi.
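The validation step starts by computing the residuals themselves. A small sketch with invented data; note that with an intercept in the model the least-squares residuals always sum to zero, so the diagnostics focus on their spread and shape (e.g. in a residuals-vs-fitted plot), not on their mean:

```python
# Hypothetical data for a fitted simple linear regression
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
      / sum((xi - mx) ** 2 for xi in x))
b0 = my - b1 * mx

# Residual = observed y minus fitted y
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
mean_resid = sum(residuals) / n   # zero by construction for least squares
```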