
Xuhua Xia — Polynomial Regression

A biologist is interested in the relationship between feeding time (Time) and body weight (Wt) in the males of a mammalian species. The data he recorded are shown in a table of Time (hr) against Wt (kg); the table values are not preserved in this transcript. The objectives are:
– Construct an equation relating Time to Wt.
– Understand the model selection criteria.
– Estimate mean Time for a given Wt with 95% confidence limits for the mean (CLM).

The Relationship Is Nonlinear

Candidate models:
– Y = a + b·X ?
– Y = a·e^X ?
– Y = a·X^b ?
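One way to screen these candidates in R is to fit each of them directly — a minimal sketch, assuming the poly1.txt file with columns Time and Wt used on the later slides. The exponential and power forms are fitted on the log scale (which assumes Time > 0), and the exponential form is given a free rate constant b, a slight generalization of the a·e^X on the slide:

    # Fit the three candidate forms to Time (y) vs Wt (x)
    md <- read.table("poly1.txt", header = TRUE)
    fit.lin <- lm(Time ~ Wt, data = md)            # Y = a + b*X
    fit.exp <- lm(log(Time) ~ Wt, data = md)       # log Y = log(a) + b*X      (Y = a*e^(b*X))
    fit.pow <- lm(log(Time) ~ log(Wt), data = md)  # log Y = log(a) + b*log(X) (Y = a*X^b)
    # Compare the three regressions by R^2 (a rough screen, not a formal test,
    # since the response scales differ)
    sapply(list(linear = fit.lin, exponential = fit.exp, power = fit.pow),
           function(f) summary(f)$r.squared)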

Evaluate Linearity

    md <- read.table("poly1.txt", header = TRUE)
    attach(md)
    fit <- lm(Time ~ Wt)
    install.packages("lmtest")   # run once
    library(lmtest)
    dwtest(fit)

Durbin–Watson test output (the DW statistic and p-value are elided in this transcript):

    data:  fit
    DW = …, p-value = …
    alternative hypothesis: true autocorrelation is greater than 0

Caveat: the Durbin–Watson test is insensitive to deviation from linearity when the deviation occurs in the middle section of the data.
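Because of that blind spot, a residual plot is a simple complementary check — a sketch using the same md and fit as above:

    # Residuals vs Wt: a systematic bow or wave suggests the straight line is wrong
    plot(md$Wt, resid(fit), xlab = "Wt (kg)", ylab = "Residual",
         main = "Residuals from the linear fit")
    abline(h = 0, lty = 2)   # reference line at zero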

Polynomial Regression

Polynomial regression is a special type of multiple regression whose independent variables are powers of a single variable X. It is used to approximate a curve with unknown functional form:

    Y_i = α + β_1·X_i + β_2·X_i^2 + … + β_k·X_i^k + ε_i

Model selection is done by successively testing the highest-order term and discarding it if it is insignificant. These tests should use a liberal level of significance (the slide's α value is elided in this transcript). The starting order should usually satisfy k < N/10, where N is the number of observations. Alternatives are lowess regression if the functional form is unknown, or nonlinear regression if the function is known.
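The step-down procedure can be written out directly: test the t statistic of the highest-order coefficient and reduce the order while that term is non-significant — a minimal sketch, with a hypothetical liberal alpha since the slide's value is lost:

    alpha <- 0.25   # hypothetical liberal significance level (slide's value elided)
    k <- 6          # starting order (the slides suggest k < N/10)
    repeat {
      f <- lm(Time ~ poly(Wt, k, raw = TRUE))
      p <- coef(summary(f))[k + 1, "Pr(>|t|)"]   # p-value of the X^k coefficient
      if (p < alpha || k == 1) break             # stop when the top term is significant
      k <- k - 1                                 # otherwise drop the X^k term
    }
    k   # selected polynomial order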

Polynomial Regression (cont.)

The main reason for successively testing and discarding the highest-degree terms is that the higher-order terms are more vulnerable to random error in X: the measurement error is multiplied several times when X is raised to a power. Suppose the true value of X is 2 but, because of measurement error, we record 3. X^2 is then 9, whereas an accurate measurement would give X^2 = 4, so the squared term carries 5 units of error. Similarly, X^3 = 27 instead of 8, i.e., 19 units of error. Thus, if an order-4 regression is not significantly better than an order-3 regression, the X^4 term is dropped. Contrast this with model selection in multiple regression with distinct predictors X_1, X_2, etc., where the terms have no such natural ordering.
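The arithmetic in the example can be checked in a couple of lines — a trivial illustration, not from the slides:

    x.true <- 2; x.obs <- 3          # 1 unit of measurement error in X
    x.obs^2 - x.true^2               # 9 - 4  = 5 units of error in X^2
    x.obs^3 - x.true^3               # 27 - 8 = 19 units of error in X^3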

R Functions

    md <- read.table("poly1.txt", header = TRUE)
    nd <- md[order(md$Wt), ]   # sort by Wt so that lines() draws a proper curve
    attach(nd)

Fit polynomial models of increasing order:

    fit  <- lm(Time ~ Wt)
    fit2 <- lm(Time ~ poly(Wt, 2, raw = TRUE))
    fit3 <- lm(Time ~ poly(Wt, 3, raw = TRUE))
    fit4 <- lm(Time ~ poly(Wt, 4, raw = TRUE))
    fit5 <- lm(Time ~ poly(Wt, 5, raw = TRUE))
    fit6 <- lm(Time ~ poly(Wt, 6, raw = TRUE))

Visualize the fits:

    par(mfrow = c(2, 3))   # 2 x 3 grid of panels
    plot(Wt, Time, main = "linear")
    lines(Wt, fitted(fit), col = "red")
    plot(Wt, Time, main = "y = a + ... + b2*Wt^2")
    lines(Wt, fitted(fit2), col = "red")
    plot(Wt, Time, main = "y = a + ... + b3*Wt^3")
    lines(Wt, fitted(fit3), col = "red")
    plot(Wt, Time, main = "y = a + ... + b4*Wt^4")
    lines(Wt, fitted(fit4), col = "red")
    plot(Wt, Time, main = "y = a + ... + b5*Wt^5")
    lines(Wt, fitted(fit5), col = "red")
    plot(Wt, Time, main = "y = a + ... + b6*Wt^6")
    lines(Wt, fitted(fit6), col = "red")
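A side note on poly(): raw = TRUE gives ordinary powers of Wt, whose coefficients match the equation on the earlier slide but are highly collinear; omitting it gives orthogonal polynomials, which are numerically more stable. A quick illustration of the difference (not from the slides):

    # raw = TRUE: columns are Wt, Wt^2, ... (coefficients match a + b1*Wt + b2*Wt^2 + ...)
    fit2.raw  <- lm(Time ~ poly(Wt, 2, raw = TRUE))
    # default:   columns are orthogonal polynomials (different coefficients, same fit)
    fit2.orth <- lm(Time ~ poly(Wt, 2))
    all.equal(fitted(fit2.raw), fitted(fit2.orth))   # TRUE: identical fitted values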

Visualization

(Figure: the six panels produced by the code above — scatterplots of Time vs Wt with the fitted curve, for polynomial orders 1 through 6.)

Test Models and Predict

1. Examine adjusted R^2 for each model:

    summary(fit)   # likewise for fit2, ..., fit6

2. Compare the models by AIC:

    AIC(fit, fit2, fit3, fit4, fit5, fit6)

3. Find the order that minimizes AIC with optimize():

    polyfit <- function(i) AIC(lm(Time ~ poly(Wt, i, raw = TRUE)))
    as.integer(optimize(polyfit, interval = c(1, 6))$minimum)

4. Compare successive orders with F tests:

    anova(fit, fit2); anova(fit2, fit3)   # and so on up to fit6

5. Estimate mean Time at a given Wt with 95% confidence limits, and plot the confidence band:

    newd <- data.frame(Wt = 51)
    predict(fit4, newd, interval = "confidence")
    ci95 <- predict(fit4, nd, interval = "confidence")
    plot(Wt, Time)
    lines(Wt, ci95[, 1], col = "red")    # fitted curve
    lines(Wt, ci95[, 2], col = "blue")   # lower 95% limit
    lines(Wt, ci95[, 3], col = "blue")   # upper 95% limit

Help with R's optimize():

    myFun <- function(x) 2*x + x^2
    optimize(myFun, interval = c(-5, 5))$minimum
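A note on step 5: interval = "confidence" gives limits for the mean response (the "95% CLM" in the objectives), while interval = "prediction" gives wider limits for a single new observation — a short contrast using the same fit4 and newd as above:

    predict(fit4, newd, interval = "confidence")   # 95% limits for mean Time at Wt = 51
    predict(fit4, newd, interval = "prediction")   # wider limits for one new individual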

The Danger of Polynomial Regression

(Table: two columns, RandX and RandY — independently generated random numbers; the values are not preserved in this transcript.)

Polynomial Regression (order 6)

(Figure: an order-6 polynomial fitted to the random RandX/RandY data — the curve snakes through the noise, illustrating the danger of high-order fits.)
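The slide's warning is easy to reproduce with simulated data — a hypothetical sketch (the RandX/RandY values from the slide are not preserved, so random uniforms stand in for them):

    set.seed(123)                        # hypothetical stand-in data
    d <- data.frame(RandX = runif(10), RandY = runif(10))
    d <- d[order(d$RandX), ]
    fit6r <- lm(RandY ~ poly(RandX, 6, raw = TRUE), data = d)
    summary(fit6r)$r.squared             # large R^2 despite X and Y being unrelated
    plot(d$RandX, d$RandY)
    xs <- data.frame(RandX = seq(min(d$RandX), max(d$RandX), length.out = 200))
    lines(xs$RandX, predict(fit6r, xs), col = "red")   # wiggly order-6 curve through pure noise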