Regression analysis Linear regression Logistic regression.

Presentation transcript:

Regression analysis Linear regression Logistic regression

Relationship and association

Straight line

Best straight line?

Best straight line!
Least squares estimation
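As a minimal sketch of what least squares estimation does, the Python snippet below computes the intercept b0 and slope b1 that minimise the sum of squared vertical distances between the points and the line. The hip and BMI values are made-up illustration data, not the data behind the slides, and the function names are hypothetical.

```python
def least_squares_line(x, y):
    """Return (b0, b1) of the line y = b0 + b1*x with the smallest sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def sum_squared_residuals(x, y, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data (assumption, for illustration only)
hip = [90.0, 95.0, 100.0, 105.0, 110.0]
bmi = [19.0, 20.5, 22.0, 23.0, 25.5]

b0, b1 = least_squares_line(hip, bmi)
best = sum_squared_residuals(hip, bmi, b0, b1)
print(f"BMI = {b0:.2f} + {b1:.3f} * Hip, sum of squared residuals = {best:.3f}")
```

Any other choice of intercept or slope gives a larger sum of squared residuals, which is what makes this line the "best" one in the least squares sense.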

Simple linear regression
1. Is the association linear?

Simple linear regression
1. Is the association linear?
2. Describe the association: what are b0 and b1?
BMI = -12.6 kg/m² + 0.345 kg/m³ × Hip

Simple linear regression
1. Is the association linear?
2. Describe the association
3. Is the slope significantly different from 0? Help SPSS!!!
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t         Sig.
1  (Constant)    -12.581    2.331                                          -5.396    .000
   Hip           .345       .023               .565                        15.266    .000
a. Dependent Variable: BMI
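The t column in the SPSS output is the coefficient divided by its standard error. The sketch below repeats that division with the rounded values printed in the table; the function name `t_statistic` is hypothetical. Because the printed B and Std. Error are rounded, the result for Hip differs slightly from the 15.266 SPSS reports from the unrounded estimates.

```python
def t_statistic(b, std_error):
    """t value for testing whether a regression coefficient differs from 0."""
    return b / std_error


# (B, Std. Error) as printed in the Coefficients table above
coefficients = {
    "(Constant)": (-12.581, 2.331),
    "Hip": (0.345, 0.023),
}

t_values = {name: t_statistic(b, se) for name, (b, se) in coefficients.items()}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

A slope whose t value is this far from 0 corresponds to the Sig. value of .000 in the table, so the slope is significantly different from 0.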

Simple linear regression
1. Is the association linear?
2. Describe the association
3. Is the slope significantly different from 0?
4. How good is the fit? How far are the data points from the line on average?

The correlation coefficient, r
[Figure: four scatter plots with R = 0, R = 1, R = 0.7 and R = -0.5]

r²: goodness of fit
How much of the variation can be explained by the model?
[Figure: four scatter plots with R² = 0, R² = 1, R² = 0.5 and R² = 0.2]
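To make the link between r and r² concrete, the sketch below computes the Pearson correlation coefficient and, separately, the share of variation explained by the least squares line (1 minus residual variation over total variation). For simple linear regression the two agree. The data are made up for illustration and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Share of the variation in y explained by the least squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_residual / ss_total


# Hypothetical data (assumption, for illustration only)
hip = [90.0, 95.0, 100.0, 105.0, 110.0]
bmi = [19.0, 20.5, 22.0, 23.0, 25.5]

r = pearson_r(hip, bmi)
r2 = r_squared(hip, bmi)
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}, explained variation = {r2:.3f}")
```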

Multiple linear regression
Could waist measure describe some of the variation in BMI?
BMI = 1.3 kg/m² + … kg/m³ × Waist
Or even better:

Multiple linear regression
Adding age: adj R² = …
Adding thigh: adj R² = 0.352?
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients                       95.0% Confidence Interval for B
Model            B          Std. Error         Beta                        t         Sig.     Lower Bound    Upper Bound
1  (Constant)    -9.001     2.449                                          -3.676    .000     -13.813        -4.190
   Waist         .168       .043               .201                        3.923     .000     .084           .252
   Hip           .252       .031               .411                        8.012     .000     .190           .313
   Age           -.064      .018               -.126                       -3.492    .001     -.101          -.028
a. Dependent Variable: BMI
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients                       95.0% Confidence Interval for B
Model            B          Std. Error         Beta                        t         Sig.     Lower Bound    Upper Bound
1  (Constant)    3.581      1.784                                          2.007     .045     .075           7.086
   Waist         .168       .043               .201                        3.923     .000     .084           .252
   Age           -.064      .018               -.126                       -3.492    .001     -.101          -.028
   Thigh         .252       .031               .411                        8.012     .000     .190           .313
a. Dependent Variable: BMI
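Adjusted R² is used here to compare models with different numbers of predictors, because plain R² never decreases when a variable is added. A minimal sketch of the usual formula follows. The sample size and R² values in the example are hypothetical, since the transcript does not give them, and the function name is an assumption.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R squared for a model with p predictors fitted to n observations.

    Uses the standard formula 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical numbers (assumption): the same R squared reached with 2 or 3 predictors
n_observations = 100
for n_predictors in (2, 3):
    adj = adjusted_r_squared(0.36, n_observations, n_predictors)
    print(f"{n_predictors} predictors: adjusted R squared = {adj:.4f}")
```

With the same R², the model with more predictors gets the lower adjusted value, so an extra variable only pays off if it raises R² by enough to offset the penalty.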

Assumptions
1. The dependent variable must be metric continuous
2. The independent variables must be continuous or ordinal
3. Linear relationship between the dependent and all independent variables
4. The residuals must have a constant spread
5. The residuals are normally distributed
6. The independent variables are not perfectly correlated with each other

Non-parametric correlation

Ranked correlation
Kendall's τ
Spearman's r_s
The correlation coefficient lies between -1 and 1, where -1 is perfect inverse correlation, 0 means no correlation, and 1 means perfect correlation.
Pearson is the correlation method for normally distributed data.
Remember the assumptions:
1. The dependent variable must be metric continuous
2. The independent variables must be continuous or ordinal
3. Linear relationship between the dependent and all independent variables
4. The residuals must have a constant spread
5. The residuals are normally distributed

Kendall's τ: an example

Spearman: the same example
[Table residue: columns d and d²]
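The worked example on these slides did not survive extraction, so the sketch below uses made-up data to show how the two rank correlations can be computed by hand. It assumes no tied values: Kendall's τ is taken as (concordant pairs minus discordant pairs) divided by the number of pairs, and Spearman's r_s uses the rank differences d through 1 - 6·Σd² / (n(n² - 1)). The function names are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for data without ties (tau-a, equal to tau-b when there are no ties)."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # the pair is ordered the same way in x and y
        elif direction < 0:
            discordant += 1  # the pair is ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def ranks(values):
    """Ranks starting at 1 for the smallest value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rs(x, y):
    """Spearman's rank correlation from the squared rank differences d^2 (no ties)."""
    n = len(x)
    d_squared = [(rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y))]
    return 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))


# Hypothetical data (assumption, for illustration only)
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]

print(f"Kendall's tau = {kendall_tau(a, b):.2f}")
print(f"Spearman's r_s = {spearman_rs(a, b):.2f}")
```

For data with ties, a statistics package should be preferred, since both formulas need tie corrections that this sketch leaves out.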

Correlation in SPSS

Correlation in SPSS
Correlations
                                a        b
a   Pearson Correlation         1        .685*
    Sig. (2-tailed)                      .029
    N                           10
b   Pearson Correlation         .685*    1
    Sig. (2-tailed)             .029
    N                           10
*. Correlation is significant at the 0.05 level (2-tailed).
Correlations
                                                  a        b
Kendall's tau_b   a   Correlation Coefficient     1.000    .511*
                      Sig. (2-tailed)             .        .040
                      N                           10
                  b   Correlation Coefficient     .511*    1.000
                      Sig. (2-tailed)             .040     .
                      N                           10
Spearman's rho    a   Correlation Coefficient     1.000    .685*
                      Sig. (2-tailed)             .        .029
                      N                           10
                  b   Correlation Coefficient     .685*    1.000
                      Sig. (2-tailed)             .029     .
                      N                           10
*. Correlation is significant at the 0.05 level (2-tailed).

Logistic regression

Logistic regression
What if the dependent variable is categorical, and especially binary?
Use some interpolation method.
Linear regression cannot help us.

The sigmoidal curve
The intercept basically just shifts the curve along the input variable
Large regression coefficient → risk factor strongly influences the probability
Positive regression coefficient → risk factor increases the probability
Logistic regression uses maximum likelihood estimation, not least squares estimation
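The curve on these slides is presumably the standard logistic function p = 1 / (1 + e^-(b0 + b1·x)); that form is an assumption here, since the formula itself is not in the transcript. The sketch below uses hypothetical coefficients to show the three properties listed: where the curve sits, how steep it is, and in which direction it runs.

```python
import math


def logistic(x, b0, b1):
    """Probability from the logistic (sigmoid) function with intercept b0 and slope b1."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


# Hypothetical coefficients (assumption, for illustration only)
# The curve passes through p = 0.5 where b0 + b1*x = 0, i.e. at x = -b0 / b1.
print(logistic(0.0, b0=0.0, b1=1.0))   # midpoint at x = 0
print(logistic(2.0, b0=-2.0, b1=1.0))  # a different intercept moves the midpoint to x = 2

# A larger coefficient makes the probability change faster with x.
print(logistic(1.0, b0=0.0, b1=1.0), logistic(1.0, b0=0.0, b1=5.0))

# A positive coefficient raises the probability as x grows; a negative one lowers it.
print(logistic(1.0, b0=0.0, b1=1.0), logistic(1.0, b0=0.0, b1=-1.0))
```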

Does age influence the diagnosis?
Continuous independent variable
Variables in the Equation
                                                                      95% C.I. for Exp(B)
                        B         S.E.    Wald       df   Sig.   Exp(B)   Lower    Upper
Step 1 (a)   Age        .109      .010    108.745    1    .000   1.115    1.092    1.138
             Constant   -4.213    .423    99.097     1    .000   .015
a. Variable(s) entered on step 1: Age.

Does previous intake of OCP influence the diagnosis?
Categorical independent variable
Variables in the Equation
                                                                      95% C.I. for Exp(B)
                        B         S.E.    Wald       df   Sig.   Exp(B)   Lower    Upper
Step 1 (a)   OCP(1)     -.311     .180    2.979      1    .084   .733     .515     1.043
             Constant   .233      .123    3.583      1    .058   1.263
a. Variable(s) entered on step 1: OCP.

Odds ratio
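In the SPSS output, Exp(B) is the odds ratio: the factor by which the odds change when the variable increases by one unit. The sketch below recomputes it for OCP(1) from the B and S.E. printed in the single-variable OCP table. The confidence interval is assumed to be the usual Wald interval, exp(B ± 1.96·S.E.), which reproduces the printed bounds; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(b, std_error, z=1.96):
    """Odds ratio exp(B) with a Wald confidence interval (z = 1.96 for 95%)."""
    odds_ratio = math.exp(b)
    lower = math.exp(b - z * std_error)
    upper = math.exp(b + z * std_error)
    return odds_ratio, lower, upper


# B and S.E. for OCP(1) as printed in the SPSS table
odds_ratio, lower, upper = odds_ratio_with_ci(-0.311, 0.180)
print(f"OR = {odds_ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

The interval includes 1, which matches the non-significant Sig. value of .084 for OCP in that table.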

Multiple logistic regression
Variables in the Equation
                                                                      95% C.I. for Exp(B)
                        B         S.E.    Wald       df   Sig.   Exp(B)   Lower    Upper
Step 1 (a)   Age        .123      .011    115.343    1    .000   1.131    1.106    1.157
             BMI        .083      .019    18.732     1    .000   1.087    1.046    1.128
             OCP        .528      .219    5.808      1    .016   1.695    1.104    2.603
             Constant   -6.974    .762    83.777     1    .000   .001
a. Variable(s) entered on step 1: Age, BMI, OCP.

Predicting the diagnosis by logistic regression
What is the probability that the tumor of a 50-year-old woman who has been using OCP and has a BMI of 26 is malignant?
z = -6.974 + 0.123 × 50 + 0.083 × 26 + 0.528 × 1 = 1.862
p = 1/(1 + e^(-z)) = 1/(1 + e^(-1.862)) ≈ 0.866
Variables in the Equation
                                                                      95% C.I. for Exp(B)
                        B         S.E.    Wald       df   Sig.   Exp(B)   Lower    Upper
Step 1 (a)   Age        .123      .011    115.343    1    .000   1.131    1.106    1.157
             BMI        .083      .019    18.732     1    .000   1.087    1.046    1.128
             OCP        .528      .219    5.808      1    .016   1.695    1.104    2.603
             Constant   -6.974    .762    83.777     1    .000   .001
a. Variable(s) entered on step 1: Age, BMI, OCP.
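The calculation asked for on this slide can be sketched in a few lines of Python. The coefficients are the B values from the Variables in the Equation table. Coding OCP use as 1 follows the "*1" in the slide's formula, and the assumption that the modelled outcome is "malignant" follows the wording of the question.

```python
import math

# B values from the SPSS table (Age, BMI, OCP, Constant)
coefficients = {"Age": 0.123, "BMI": 0.083, "OCP": 0.528}
constant = -6.974

# The patient from the slide: 50 years old, BMI 26, has used OCP (coded 1)
patient = {"Age": 50, "BMI": 26, "OCP": 1}

# Linear predictor z = constant + sum of coefficient * value
z = constant + sum(coefficients[name] * patient[name] for name in coefficients)

# Logistic function turns z into a probability
p = 1.0 / (1.0 + math.exp(-z))

print(f"z = {z:.3f}, p = {p:.3f}")
```

With these rounded coefficients the predicted probability comes out at roughly 0.87.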