Exercise 1: Gestational age and birthweight

Presentation transcript:

Exercise 1: Gestational age and birthweight
Draw a line of best fit through the data (with roughly half the points above and half below).
Describe the relationship. Is it strong or weak? Positive or negative? Linear?

Exercise 2: Interpretation
Interpret the following correlation coefficients using Cohen’s classification and explain what they mean. Which correlations seem meaningful?
Relationship                                Correlation
Average IQ and chocolate consumption         0.27
Road fatalities and Nobel winners            0.55
Gross Domestic Product and Nobel winners     0.70
Mean temperature and Nobel winners          -0.60

Exercise 3a: Scatterplot
Use Transform > Recode into Different Variables to construct a variable for maternal smoking status (non-smoker / smoker).
Construct a scatterplot for birthweight and gestational age. Use Set Markers by to distinguish between smokers and non-smokers.
Is there evidence of a linear relationship? Interpret the correlation coefficient. What does it mean?
Note: Think about which variable should be on the x axis (horizontal) and which should be on the y axis (vertical). If you double-click on the graph you can open the Graph dialog window and edit the chart, for example to change the colours used for smokers and non-smokers.

Exercise 3b: Scatterplot & Correlation
Construct a scatterplot and calculate Pearson’s correlation coefficient for birthweight and maternal pre-pregnancy weight.
Is there evidence of a linear relationship? Interpret the correlation coefficient. What does it mean?
Note: Think about which variable should be on the x axis (horizontal) and which should be on the y axis (vertical).
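As a rough illustration of what Pearson’s correlation coefficient measures, the sketch below computes r directly from its definition in plain Python. The function name pearson_r and the sample values are hypothetical and made up for illustration; they are not taken from the Birthweight_reduced file, and the exercise itself is meant to be done in SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative values (NOT from the exercise data set)
weights_kg = [50, 55, 60, 65, 70]
birthweights_kg = [3.0, 3.1, 3.3, 3.2, 3.6]
print(round(pearson_r(weights_kg, birthweights_kg), 2))
```

A value near +1 or -1 indicates points lying close to a straight line; a value near 0 indicates little linear association.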

Exercise 4 Investigate whether mother’s pre-pregnancy weight and birth weight are associated using a simple linear regression

Exercise 4: regression
Adjusted R2 =
Does the model result in reliable predictions?
ANOVA p-value =
Is the model an improvement on the null model (where every baby is predicted to be the mean weight)?

Exercise 4: Regression
Pre-pregnancy weight coefficient and p-value:
Regression equation:
Interpretation:

Exercise 5 Re-run the regression model, but this time, produce the residual plots. Do you think that the assumptions of normality of residuals and homogeneity of variance are met?

Exercise 6: correlations
Produce a correlation matrix for the correlations between Birthweight, Gestational age, Maternal height and Maternal pre-pregnancy weight.
Use Analyse > Correlate > Bivariate and add the 4 variables to the Variables box.

Exercise 7
With birthweight as the outcome, run a series of regression models:
Model 1: gestational age
Model 2: gestational age and maternal smoking status. Check the assumptions and interpret the output. Does the model give more reliable predictions than the model with just gestational age?
Model 3: gestational age, maternal smoking status, maternal pre-pregnancy weight
Model 4: gestational age, maternal smoking status, maternal pre-pregnancy weight, maternal height
Note: You will need to create a variable for smoking status based on the number of cigarettes that the mother smokes (assuming that 0 cigarettes indicates someone who does not smoke).
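The recoding rule in the note (0 cigarettes means non-smoker, anything above 0 means smoker) can be sketched outside SPSS as follows. The function name smoking_status and the cigarette counts are hypothetical, chosen only to show the rule.

```python
def smoking_status(cigarettes_per_day):
    """Return 1 for a smoker and 0 for a non-smoker."""
    # Assumption taken from the exercise note: 0 cigarettes = non-smoker
    return 1 if cigarettes_per_day > 0 else 0

# Made-up cigarette counts (NOT from the exercise data set)
cigarettes = [0, 12, 0, 25, 7]
smoker = [smoking_status(c) for c in cigarettes]
print(smoker)  # [0, 1, 0, 1, 1]
```

Coding the variable as 0/1 means its regression coefficient can be read as the average difference in birthweight between smokers and non-smokers.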

Exercise 7: model 1 summary
Variable               Coefficient (β)   P-value   Significant?
Constant
Gestation
Adjusted R2 =
Interpretation:

Exercise 7: model 2 summary
Variable               Coefficient (β)   P-value   Significant?
Constant
Gestation
Smoker
Adjusted R2 =
Interpretation:

Exercise 7: model 3 summary
Variable               Coefficient (β)   P-value   Significant?
Constant
Gestation
Smoker
Pre-pregnancy weight
Adjusted R2 =
Interpretation:

Exercise 7: model 4 summary
Variable               Coefficient (β)   P-value   Significant?
Constant
Gestation
Smoker
Pre-pregnancy weight
Height
Adjusted R2 =
Interpretation:

Exercise 7: Compare p-values
Model                        Gestation   Smoking   Weight   Height
Model 1:                     P < 0.001
Model 2: Model 1 + Smoker                0.028
Model 3: Model 2 + Weight
Model 4: Model 3 + Height

Exercise 7: Compare R2
Model                        R2      Adjusted R2
Model 1: Gestation           0.499   0.486
Model 2: Model 1 + Smoker    0.558   0.535
Model 3: Model 2 + Weight
Model 4: Model 3 + Height

Exercise 1: Gestational age and birthweight
There is a strong, positive, linear relationship.

Exercise 2: Interpretation
Relationship                                Correlation   Interpretation
Average IQ and chocolate consumption         0.27         Weak positive relationship. More chocolate per capita = higher average IQ.
Road fatalities and Nobel winners            0.55         Strong positive. More accidents = more prizes!
Gross Domestic Product and Nobel winners     0.70         Strong positive. Wealthy countries = more prizes.
Mean temperature and Nobel winners          -0.60         Strong negative. Colder countries = more prizes.
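The labels above can be reproduced with a small helper. The slides do not state the cut-offs, so this sketch assumes the commonly quoted Cohen thresholds for |r| (0.1 weak, 0.3 moderate, 0.5 strong); the function name cohen_label and the "negligible" label for values below 0.1 are also assumptions.

```python
def cohen_label(r):
    """Describe a correlation coefficient by strength and direction."""
    size = abs(r)
    # Assumed cut-offs: 0.1 (weak), 0.3 (moderate), 0.5 (strong)
    if size >= 0.5:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    elif size >= 0.1:
        strength = "weak"
    else:
        return "negligible"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

# Correlations quoted in Exercise 2
for r in (0.27, 0.55, 0.70, -0.60):
    print(r, cohen_label(r))
```

The label describes only the size and sign of the association. It says nothing about whether one variable causes the other, which is the point of asking which of these correlations seem meaningful.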

Exercise 3a: Scatterplot

Exercise 3b: scatterplot Is there a linear relationship? Yes!

Exercise 3b: correlation
Pearson’s correlation = 0.40
Describe the relationship using the scatterplot and correlation coefficient: There is a moderate positive relationship between mothers’ pre-pregnancy weight and birth weight (r = 0.40). Generally, birth weight increases as the mother’s weight increases.

Exercise 4: regression
Adjusted R2 = 0.14
Does the model result in reliable predictions? Not really. The adjusted R2 value is only 0.14.
ANOVA p-value = 0.009
Is the model an improvement on the null model (where every baby is predicted to be the mean weight)? Yes, as p < 0.05.

Exercise 4: regression
Pre-pregnancy weight coefficient & p-value: 0.034 (p = 0.009)
Regression equation: y = 1.379 + 0.034x, where y is birth weight (kg) and x is the mother’s pre-pregnancy weight (kg)
Interpretation: There is a significant relationship between a mother’s pre-pregnancy weight and the weight of her baby (p = 0.009). Pre-pregnancy weight has a positive effect on a baby’s weight, with an increase of 0.034 kg for each extra kg a mother weighs.
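To make the interpretation concrete, the sketch below reads the fitted equation as birth weight = 1.379 + 0.034 × pre-pregnancy weight (both in kg), which is what the interpretation describes. The function name predict_birthweight and the 60 kg input are hypothetical.

```python
INTERCEPT = 1.379  # constant from the Exercise 4 output (kg)
SLOPE = 0.034      # kg of birth weight per kg of pre-pregnancy weight

def predict_birthweight(mother_weight_kg):
    """Predicted birth weight (kg) from the simple regression equation."""
    return INTERCEPT + SLOPE * mother_weight_kg

# Hypothetical mother weighing 60 kg
print(round(predict_birthweight(60), 3))  # 3.419
# Each extra kg of maternal weight adds the slope to the prediction
print(round(predict_birthweight(61) - predict_birthweight(60), 3))  # 0.034
```

With an adjusted R2 of only 0.14, individual predictions from this equation would be expected to be imprecise even though the slope is statistically significant.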

Exercise 5: normality of the residuals? Yes – histogram roughly peaks in the middle

Exercise 5: homoscedasticity? Yes – no patterns in residuals

Exercise 6: correlations Which variables are most strongly related to each other?

Exercise 6: correlations
Which variables are most strongly related?
Gestation and birth weight (0.708)
Mother’s height and weight (0.681)
Mother’s height and weight are strongly related. The correlation does not exceed 0.8, but try the model with and without height in case it is a problem.

Exercise 7: model 1 summary
Variable               Coefficient (β)   P-value   Significant?
Constant               -3.029            0.004     Yes
Gestation               0.162            < 0.001   Yes
Adjusted R2: 0.489
Interpretation: As p < 0.05, gestational age is a significant predictor of birth weight. Weight increases by 0.16 kg for each week of gestation.

Exercise 7: model 2 summary
Variable               Coefficient (β)   P-value   Significant?
Constant               -2.661            0.009     Yes
Gestation               0.162            < 0.001   Yes
Smoker                 -0.298            0.024     Yes
Adjusted R2: 0.541
Interpretation: As p < 0.05 for both smoking status and gestational age, both are significant predictors of birth weight. Weight increases by 0.16 kg for each week of gestation. Mothers who smoke have, on average, babies who weigh 0.30 kg less than babies born to mothers who do not smoke.
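The Model 2 coefficients can be combined into a prediction as sketched below, assuming smoking status is coded 1 for smokers and 0 for non-smokers. The function name predict_model2 and the 40-week gestation are hypothetical choices for illustration.

```python
def predict_model2(gestation_weeks, smoker):
    """Predicted birth weight (kg) from the Model 2 coefficients.

    smoker is assumed to be coded 1 = smoker, 0 = non-smoker.
    """
    return -2.661 + 0.162 * gestation_weeks - 0.298 * smoker

# Hypothetical baby born at 40 weeks
non_smoker = predict_model2(40, 0)
smoker = predict_model2(40, 1)
print(round(non_smoker, 3))           # 3.819
print(round(smoker, 3))               # 3.521
print(round(non_smoker - smoker, 3))  # 0.298
```

The gap between the two predictions equals the smoker coefficient at any gestational age, because the model has no interaction term.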

Exercise 7: Model 2 residual assumptions Assumptions are met

Exercise 7: Model 2 ANOVA ANOVA p-value < 0.001 Is the model an improvement on the null model (where every baby is predicted to be the mean weight)? Yes as p < 0.05

Exercise 7: Model 2 Adjusted R2 Does the model result in reliable predictions? Yes – the adjusted R2 is reasonably high

Exercise 7: model 3 summary
Variable               Coefficient (β)   P-value   Significant?
Constant               -3.268            0.001     Yes
Gestation               0.142            < 0.001   Yes
Smoker                 -0.305            0.016     Yes
Pre-pregnancy weight    0.020            0.025     Yes
Adjusted R2: 0.588
Interpretation: As p < 0.05 for all variables, all are significant predictors of birth weight. Weight increases by 0.14 kg for each week of gestation. Mothers who smoke have, on average, babies who weigh 0.30 kg less than babies born to mothers who do not smoke, and for each 1 kg increase in pre-pregnancy weight, the baby’s weight increases by 0.02 kg, or 20 g. It is worth noting that whilst this is significant, it makes very little difference to birthweight in practice.

Exercise 7: model 4 summary
Variable               Coefficient (β)   P-value   Significant?
Constant               -0.4736           0.015     Yes
Gestation               0.141            < 0.001   Yes
Smoker                 -0.306            0.016     Yes
Pre-pregnancy weight    0.013            0.263     No
Height                  0.012            0.366     No
Adjusted R2: 0.586
Interpretation: P < 0.05 for gestational age and smoking status. However, now that maternal height has been added to the model, neither pre-pregnancy weight nor height is significant. They are strongly related and are sharing some of the variation in birth weight when both are in the model.

Exercise 7: Compare p-values
Model                        Gestation   Smoking   Weight   Height
Model 1:                     < 0.001
Model 2: Model 1 + Smoker                0.024
Model 3: Model 2 + Weight                0.016     0.025
Model 4: Model 3 + Height                          0.263    0.366
Smoking gets more significant as variables are added. Mothers’ weight becomes non-significant once height has been added. They are strongly related and are sharing some of the variation in birth weight when both are in the model.

Exercise 7: Compare R2
Model                        R2      Adjusted R2
Model 1: Gestation           0.502   0.486
Model 2: Model 1 + Smoker    0.563   0.541
Model 3: Model 2 + Weight    0.618   0.588
Model 4: Model 3 + Height    0.627   0.586
Adding smoker and weight improves the fit a little. Adding height has not improved the fit of the model at all, as the adjusted R2 decreases.
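Adjusted R2 can fall while R2 rises because it penalises each extra predictor. The sketch below uses the standard formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. The slides do not state n, so the value 42 used here is an assumption for illustration only; the function name adjusted_r2 is also hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors fitted to n cases."""
    if n - p - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

N_ASSUMED = 42  # ASSUMPTION: sample size is not given in the slides

# R2 values quoted in the table for Models 2 and 3
print(round(adjusted_r2(0.563, N_ASSUMED, 2), 3))  # 0.541
print(round(adjusted_r2(0.618, N_ASSUMED, 3), 3))  # 0.588
```

Under this assumed sample size the formula gives the adjusted values listed for Models 2 and 3. The same penalty explains Model 4: height raises R2 by less than 0.01, which is not enough to offset the cost of the extra predictor.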