Xuhua Xia Polynomial Regression A biologist is interested in the relationship between feeding time (Time) and body weight (Wt) in the males of a mammalian.


1 Xuhua Xia Polynomial Regression

A biologist is interested in the relationship between feeding time (Time) and body weight (Wt) in the males of a mammalian species. The data he recorded are shown in the table. The objectives are:
– Construct an equation relating Time to Wt.
– Understand the model selection criteria.
– Estimate mean Time for a given Wt with 95% CLM.

Time (hr)   Wt (kg)
1.22        40.9
2.14        44.3
2.39        44.7
3.50        48.6
1.66        43.0
2.97        45.4
3.95        50.0
1.34        41.8
2.51        45.0
3.53        49.0
1.72        43.4
3.17        46.2
4.11        50.8
1.51        42.4
2.78        45.1
3.85        49.7
1.93        43.9
3.32        47.0
4.18        51.1
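The later slides read the data from poly1.txt, which is not part of this transcript. As a stand-in, the 19 observations from the table can be entered directly. The data frame name md follows the later slides; entering the values by hand is an assumption about how to reproduce the file, not the author's procedure.

```r
# Data transcribed from the table on slide 1 (19 observations).
md <- data.frame(
  Time = c(1.22, 2.14, 2.39, 3.50, 1.66, 2.97, 3.95, 1.34, 2.51, 3.53,
           1.72, 3.17, 4.11, 1.51, 2.78, 3.85, 1.93, 3.32, 4.18),
  Wt   = c(40.9, 44.3, 44.7, 48.6, 43.0, 45.4, 50.0, 41.8, 45.0, 49.0,
           43.4, 46.2, 50.8, 42.4, 45.1, 49.7, 43.9, 47.0, 51.1)
)

# Quick look at the ranges of both variables.
print(summary(md))
```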

2 Xuhua Xia The Relationship Is Nonlinear

$Y = a + bX$ ?
$Y = a e^{X}$ ?
$Y = a X^{b}$ ?

3 Evaluate linearity

    md <- read.table("poly1.txt", header=TRUE)
    attach(md)
    fit <- lm(Time~Wt)           # straight-line fit
    install.packages("lmtest")   # needed only once
    library(lmtest)
    dwtest(fit)                  # Durbin-Watson test on the residuals

Output:

    data: fit
    DW = 2.8585, p-value = 0.9766
    alternative hypothesis: true autocorrelation is greater than 0

The Durbin-Watson test is insensitive to deviation from linearity when the deviation occurs in the middle section.
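The Durbin-Watson statistic compares each residual with the next one, so it can only hint at curvature when the rows are ordered by the predictor. The sketch below computes the statistic by hand in base R, once in table order and once sorted by Wt. The helper dw_stat is a hypothetical name introduced here, the data are transcribed from slide 1, and the row order of poly1.txt is unknown, so the first value need not reproduce the DW shown on the slide.

```r
# Data transcribed from slide 1.
md <- data.frame(
  Time = c(1.22, 2.14, 2.39, 3.50, 1.66, 2.97, 3.95, 1.34, 2.51, 3.53,
           1.72, 3.17, 4.11, 1.51, 2.78, 3.85, 1.93, 3.32, 4.18),
  Wt   = c(40.9, 44.3, 44.7, 48.6, 43.0, 45.4, 50.0, 41.8, 45.0, 49.0,
           43.4, 46.2, 50.8, 42.4, 45.1, 49.7, 43.9, 47.0, 51.1)
)

# Durbin-Watson statistic: sum of squared successive differences of the
# residuals divided by the residual sum of squares (hypothetical helper).
dw_stat <- function(fit) {
  e <- residuals(fit)
  sum(diff(e)^2) / sum(e^2)
}

fit_table_order <- lm(Time ~ Wt, data = md)   # rows as typed
nd <- md[order(md$Wt), ]                      # rows sorted by Wt
fit_sorted <- lm(Time ~ Wt, data = nd)

dw <- c(table_order = dw_stat(fit_table_order),
        sorted_by_Wt = dw_stat(fit_sorted))
print(dw)
```

Values near 2 indicate no serial pattern in the residuals; values well below 2 on the sorted data would suggest runs of same-sign residuals, which is what a curved relationship fitted with a straight line tends to produce.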

4 Xuhua Xia Polynomial Regression

Polynomial regression is a special type of multiple regression whose independent variables are powers of a single variable X. It is used to approximate a curve with unknown functional form.

$$Y_i = \alpha + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_k X_i^k + \varepsilon_i$$

Model selection is done by successively testing the highest-order term and discarding it if it is not significant. Tests should use a liberal significance level, such as 0.25. The starting order should usually be k < N/10, where N is the number of observations. Alternatives are lowess regression if the functional form is unknown, or nonlinear regression if the function is known.
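To make the "special type of multiple regression" point concrete, the sketch below fits an order-2 model twice: once with the powers written out as separate predictors, and once with poly(Wt, 2, raw=TRUE) as used on the later slides. The data are transcribed from slide 1; the object names fit_explicit and fit_poly are illustrative only.

```r
# Data transcribed from slide 1.
md <- data.frame(
  Time = c(1.22, 2.14, 2.39, 3.50, 1.66, 2.97, 3.95, 1.34, 2.51, 3.53,
           1.72, 3.17, 4.11, 1.51, 2.78, 3.85, 1.93, 3.32, 4.18),
  Wt   = c(40.9, 44.3, 44.7, 48.6, 43.0, 45.4, 50.0, 41.8, 45.0, 49.0,
           43.4, 46.2, 50.8, 42.4, 45.1, 49.7, 43.9, 47.0, 51.1)
)

# Multiple regression with Wt and Wt^2 as two explicit predictors.
fit_explicit <- lm(Time ~ Wt + I(Wt^2), data = md)

# The same model written with poly(); raw=TRUE means plain powers of Wt.
fit_poly <- lm(Time ~ poly(Wt, 2, raw = TRUE), data = md)

# Both formulations give the same coefficients and fitted values.
print(unname(coef(fit_explicit)))
print(unname(coef(fit_poly)))
same_fit <- isTRUE(all.equal(unname(fitted(fit_explicit)),
                             unname(fitted(fit_poly))))
print(same_fit)
```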

5 Xuhua Xia Polynomial Regression

The main reason for successively testing the highest-order terms and discarding those that are not significant is that higher-order terms are more prone to random error in X, i.e., the random error is multiplied several times in higher-order terms.

Suppose the true value of X is 2 but, because of measurement error, we obtain a value of 3. X^2 is then 9. If we had measured X accurately, X^2 would have been 4, so the value of 9 obtained is 4 + 5 units of error. Likewise, X^3 = 27 = 8 + 19 units of error.

Thus, if an order-4 regression is not significantly better than an order-3 regression, then the X^4 term is dropped. Contrast this with model selection in multiple regression with X1, X2, etc.
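The arithmetic in this example extends to any order. The short sketch below tabulates the error carried by each power when a true value of 2 is recorded as 3; the variable names are illustrative.

```r
x_true <- 2   # true value of X
x_obs  <- 3   # value recorded with one unit of measurement error
k <- 1:4      # polynomial orders to compare

# Error in each power term: observed power minus true power.
err <- data.frame(
  order    = k,
  true     = x_true^k,
  observed = x_obs^k,
  error    = x_obs^k - x_true^k
)
print(err)
```

The error column grows from 1 unit at order 1 to 5, 19, and 65 units at orders 2, 3, and 4.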

6 R functions

    md<-read.table("poly1.txt",header=T)
    nd<-md[order(md$Wt),]
    attach(nd)

Fit polynomial models:

    fit<-lm(Time~Wt)
    fit2<-lm(Time~poly(Wt,2,raw=T))
    fit3<-lm(Time~poly(Wt,3,raw=T))
    fit4<-lm(Time~poly(Wt,4,raw=T))
    fit5<-lm(Time~poly(Wt,5,raw=T))
    fit6<-lm(Time~poly(Wt,6,raw=T))

Visualize fit:

    par(mfrow=c(2,3))
    plot(Wt,Time,main="linear")
    lines(Wt,fitted(fit),col="red")
    plot(Wt,Time,main="y=a+…+b2*Wt^2")
    lines(Wt,fitted(fit2),col="red")
    plot(Wt,Time,main="y=a+…+b3*Wt^3")
    lines(Wt,fitted(fit3),col="red")
    plot(Wt,Time,main="y=a+…+b4*Wt^4")
    lines(Wt,fitted(fit4),col="red")
    plot(Wt,Time,main="y=a+…+b5*Wt^5")
    lines(Wt,fitted(fit5),col="red")
    plot(Wt,Time,main="y=a+…+b6*Wt^6")
    lines(Wt,fitted(fit6),col="red")

7 Visualization

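8 Xuhua Xia Test models and predict

1. Examine adjusted R^2 for each model:

    summary(fit); summary(fit2); …; summary(fit6)

2. Compare AIC for each model:

    AIC(fit); AIC(fit2); …; AIC(fit6)

3. Search for the order with the lowest AIC:

    polyfit <- function(i) AIC(lm(Time~poly(Wt,i,raw=T)))
    # optimize() searches a continuous interval; the result is truncated to an integer order
    as.integer(optimize(polyfit,interval = c(1,6))$minimum)

4. Test successive models against each other:

    anova(fit, fit2); anova(fit2,fit3); …

5. Predict mean Time with 95% confidence limits and plot them:

    newd<-data.frame(Wt=51)
    predict(fit4,newd,interval="confidence")
    ci95<-predict(fit4,nd,interval="confidence")
    plot(Wt,Time)
    lines(Wt,ci95[,1],col="red")    # fitted values
    lines(Wt,ci95[,2],col="blue")   # lower limit
    lines(Wt,ci95[,3],col="blue")   # upper limit

Help with R (how optimize() finds the minimum of a function):

    myFun<-function(x) 2*x+x^2
    optimize(myFun,interval=c(-5,5))$minimum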
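Because the polynomial order is an integer, one alternative to a continuous search is to evaluate every candidate order and take the smallest AIC. The sketch below does that for orders 1 to 6 and reports adjusted R^2 alongside. It is a variation under stated assumptions, not the procedure from the slide: the data are transcribed from slide 1, and orthogonal polynomials (the default of poly()) replace raw=T because raw powers of Wt up to order 6 are nearly collinear. The fitted curves are the same in exact arithmetic.

```r
# Data transcribed from slide 1.
md <- data.frame(
  Time = c(1.22, 2.14, 2.39, 3.50, 1.66, 2.97, 3.95, 1.34, 2.51, 3.53,
           1.72, 3.17, 4.11, 1.51, 2.78, 3.85, 1.93, 3.32, 4.18),
  Wt   = c(40.9, 44.3, 44.7, 48.6, 43.0, 45.4, 50.0, 41.8, 45.0, 49.0,
           43.4, 46.2, 50.8, 42.4, 45.1, 49.7, 43.9, 47.0, 51.1)
)

orders <- 1:6

# One model per order, using orthogonal polynomials for numerical stability.
fits <- lapply(orders, function(k) lm(Time ~ poly(Wt, k), data = md))

# Selection criteria for each order.
aic    <- sapply(fits, AIC)
adj_r2 <- sapply(fits, function(f) summary(f)$adj.r.squared)
crit   <- data.frame(order = orders, AIC = aic, adj_R2 = adj_r2)
print(crit)

# Order with the lowest AIC.
best <- orders[which.min(aic)]
print(best)

# Mean Time at Wt = 51 with 95% confidence limits from the selected model.
pred <- predict(fits[[best]], data.frame(Wt = 51), interval = "confidence")
print(pred)
```

AIC and adjusted R^2 need not agree with the sequential F tests from anova(), so it is worth comparing all three before settling on an order.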

9 Xuhua Xia The Danger of Polynomial Regression

RandX     RandY
0.65232   0.95616
0.10743   0.70663
0.29166   0.01942
0.64533   0.90362
0.95148   0.67739
0.71822   0.90728
0.88513   0.64330
0.02542   0.07266
0.85852   0.85366
0.73669   0.96528
0.22272   0.18555
0.54621   0.52321
0.57460   0.65462
0.33640   0.21208
0.95080   0.04560
0.05365   0.09695
0.06928   0.35087

10 Xuhua Xia Polynomial Regression (order 6)
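One way to see the danger is to fit polynomials of increasing order to the RandX and RandY values from slide 9 and watch R^2. The sketch below assumes, as the column names suggest, that the two columns are unrelated random numbers; the object names are illustrative. Since each model contains the previous one, R^2 cannot decrease as terms are added, whether or not there is any real relationship.

```r
# Values transcribed from slide 9 (17 pairs).
rand <- data.frame(
  RandX = c(0.65232, 0.10743, 0.29166, 0.64533, 0.95148, 0.71822,
            0.88513, 0.02542, 0.85852, 0.73669, 0.22272, 0.54621,
            0.57460, 0.33640, 0.95080, 0.05365, 0.06928),
  RandY = c(0.95616, 0.70663, 0.01942, 0.90362, 0.67739, 0.90728,
            0.64330, 0.07266, 0.85366, 0.96528, 0.18555, 0.52321,
            0.65462, 0.21208, 0.04560, 0.09695, 0.35087)
)

orders <- 1:6

# Fit orders 1..6; poly() defaults to orthogonal polynomials.
rfits <- lapply(orders, function(k) lm(RandY ~ poly(RandX, k), data = rand))

# R^2 and adjusted R^2 for each order.
r2     <- sapply(rfits, function(f) summary(f)$r.squared)
adj_r2 <- sapply(rfits, function(f) summary(f)$adj.r.squared)
res <- data.frame(order = orders, R2 = r2, adj_R2 = adj_r2)
print(res)
```

Adjusted R^2 penalizes the extra terms, which is one reason the earlier slide recommends examining it rather than raw R^2.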

