
Assumptions of linear regression. Three typical situations:
1. There is a hypothesis about dependent and independent variables; the relation is supposed to be linear; we have a hypothesis about the distribution of errors around the hypothesized regression line.
2. There is a hypothesis about dependent and independent variables; the relation is non-linear; we have no data about the distribution of errors around the hypothesized regression line.
3. There is no clear hypothesis about dependent and independent variables; the relation is non-linear; we have no data about the distribution of errors around the hypothesized regression line.

Least squares method. Assumptions:
- A linear model applies.
- The x-variable has no error term.
- The distribution of the y errors around the regression line is normal.
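As a sketch under exactly these assumptions, the least squares estimates can be computed directly; the data here are simulated (true slope 3, intercept 2 are made-up values, not from the lecture):

```python
import numpy as np

# Simulated data (hypothetical): x is error-free, y has normal errors,
# matching the least squares assumptions above
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1, size=x.size)

# Least squares estimates of slope b1 and intercept b0
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
```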

The second example is nonlinear. We hypothesize the allometric relation W = aB^z. Two ways to fit it:
- Linearised regression model; assumption: the distribution of errors is lognormal.
- Nonlinear regression model; assumption: the distribution of errors is normal.
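A minimal sketch of the linearised approach with simulated allometric data (a = 0.5 and z = 0.75 are arbitrary illustration values): taking logarithms turns W = aB^z with lognormal errors into an ordinary linear regression:

```python
import numpy as np

# Simulated allometric data W = a * B**z with lognormal (multiplicative) errors
rng = np.random.default_rng(1)
a_true, z_true = 0.5, 0.75          # hypothetical parameters
B = np.linspace(1, 100, 60)
W = a_true * B ** z_true * np.exp(rng.normal(0, 0.1, size=B.size))

# Linearised model: log W = log a + z * log B  (ordinary least squares)
z_hat, log_a_hat = np.polyfit(np.log(B), np.log(W), 1)
```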

Y = e^(0.1X) + norm(0; Y) and Y = X^0.5 · e^norm(0; Y). In both cases we have some sort of autocorrelation. Using logarithms reduces the effect of autocorrelation and makes the distribution of errors more homogeneous. Nonlinear estimation instead puts more weight on the larger y-values. If there is no autocorrelation, the log-transformation puts more weight on smaller values.

Linear regression: European bat species and environmental correlates.

N = 62. Matrix approach to linear regression: X is not a square matrix, hence X⁻¹ does not exist; the parameter vector is instead obtained from the normal equations, b = (XᵀX)⁻¹Xᵀy.
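A small numerical sketch (made-up toy data, not the bat data set): since the design matrix is rectangular, the coefficients come from the normal equations rather than from X⁻¹:

```python
import numpy as np

# Toy data: y is roughly 2x, so the fit should recover a slope near 2
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix: intercept column plus predictor; X is 5x2, not square
X = np.column_stack([np.ones_like(x), x])

# Normal equations: (X'X) b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)   # b = [intercept, slope]
```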

The species–area relationship of European bats. What about the part of variance explained by our model?
- 1.16: average number of species per unit area (species density)
- 0.24: spatial species turnover

How to interpret the coefficient of determination: total variance = explained variance + residual (rest, unexplained) variance; R² is the explained fraction of the total variance. Statistical testing is done by an F- or a t-test.
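The decomposition can be sketched numerically; the observed and fitted values below are illustrative made-up numbers from a toy least squares line:

```python
import numpy as np

# Observed values and fitted values from a least squares line (illustrative)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
y_hat = 0.05 + 1.99 * np.array([1.0, 2.0, 3.0, 4.0, 5.0])

ss_total = np.sum((y - y.mean()) ** 2)   # total sum of squares
ss_resid = np.sum((y - y_hat) ** 2)      # residual (unexplained) sum of squares
r2 = 1 - ss_resid / ss_total             # coefficient of determination
```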

The general linear model. A model that assumes that a dependent variable Y can be expressed as a linear combination of predictor variables X is called a linear model. The vector E contains the error terms of each regression. The aim is to minimize E.

The general linear model. If the errors of the predictor variables are Gaussian, the error term e should also be Gaussian, and means and variances are additive: total variance = explained variance + unexplained (rest) variance.
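The additivity can be verified on simulated data (a sketch; the design and coefficients below are arbitrary): for an ordinary least squares fit with an intercept, total = explained + unexplained holds exactly:

```python
import numpy as np

# Simulated general linear model y = Xb + e with Gaussian errors
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(0, 0.5, size=30)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

ss_total = np.sum((y - y.mean()) ** 2)          # total variance
ss_explained = np.sum((y_hat - y.mean()) ** 2)  # explained variance
ss_resid = np.sum((y - y_hat) ** 2)             # unexplained (rest) variance
# ss_total == ss_explained + ss_resid (up to rounding)
```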

Multiple regression:
1. Model formulation
2. Estimation of model parameters
3. Estimation of statistical significance

Multiple R and R²

Adjusted R². R: correlation matrix; n: number of cases; k: number of independent variables in the model. The adjusted value, R²_adj = 1 − (1 − R²)(n − 1)/(n − k − 1), corrects R² for the number of predictors. A variable with D < 0 is statistically not significant and should be eliminated from the model.
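With n cases and k predictors, the standard correction formula can be sketched as a small helper function:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: penalises R² for the number of predictors.

    r2: coefficient of determination, n: number of cases,
    k: number of independent variables in the model.
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

For example, adjusted_r2(0.8, 30, 3) is about 0.777, slightly below the raw R² of 0.8; adding predictors always raises R² but not necessarily the adjusted value.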

A mixed model

The final model. Is this model realistic?
- Very low species density (log scale!)
- Realistic increase of species richness with area
- Increase of species richness with winter length
- Increase of species richness at higher latitudes
- A peak of species richness at intermediate latitudes
The model makes realistic predictions. Problems might arise from the intercorrelation between the predictor variables (multicollinearity). We solve the problem by a step-wise approach, eliminating the variables that are either not significant or give unreasonable parameter values. The variance explanation of this final model is higher than that of the previous one.

Multiple regression solves systems of intrinsically linear algebraic equations.
- The matrix XᵀX must not be singular; that is, the variables have to be independent. Otherwise we speak of multicollinearity. Collinearity with r < 0.7 is in most cases tolerable.
- To be safely applied, multiple regression needs at least 10 times as many cases as variables in the model.
- Statistical inference assumes that the errors have a normal distribution around the mean.
- The model assumes linear (or algebraic) dependencies. Check first for non-linearities.
- Check the distribution of the residuals Y_exp − Y_obs. This distribution should be random.
- Check whether the parameters have realistic values.
Multiple regression is a hypothesis-testing and not a hypothesis-generating technique!! Related approaches: polynomial regression; general additive models.
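The r < 0.7 rule of thumb can be checked from the predictor correlation matrix; a sketch with simulated predictors, where x2 is deliberately built to be nearly collinear with x1:

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=100)   # nearly collinear with x1
x3 = rng.normal(size=100)                    # independent predictor

# Correlation matrix of the predictors (columns)
R = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)

# Flag predictor pairs with |r| >= 0.7 as potentially problematic
i, j = np.triu_indices_from(R, k=1)
collinear_pairs = [(a, b) for a, b, r in zip(i, j, R[i, j]) if abs(r) >= 0.7]
```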

Standardized coefficients of correlation. Z-transformed distributions have a mean of 0 and a standard deviation of 1. In the case of bivariate regression Y = aX + b, R_XX = 1; hence B = R_XY. Hence the use of Z-transformed values results in standardized correlation coefficients, termed β-values.
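This identity — in the bivariate case the standardized slope equals the correlation coefficient — can be checked on simulated data (a sketch with arbitrary parameters):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(5, 2, size=200)
y = 3 + 0.8 * x + rng.normal(0, 1, size=200)

# Z-transform: mean 0, standard deviation 1
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

# Slope of the regression on z-transformed values is the beta value;
# for bivariate regression it equals the correlation r_xy
beta = np.polyfit(zx, zy, 1)[0]
r_xy = np.corrcoef(x, y)[0, 1]
```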