ENME 392 Regression Theory
Regression Analysis: Regression analysis is an approach for developing a mathematical function from experimental data, using a rigorous procedure that controls prediction errors. It is not just “curve fitting”; it is guided by physics.
Physical modeling: When you do not know the exact relationship, you can use regression analysis to identify the important factors and use your understanding to help identify the appropriate relationship. Recall the possible relationships from the previous lecture: linear regression (single independent variable), nonlinear regression, multiple regression (many independent variables), and polynomial regression.
Regression Physical Example: In extrusion, fill length as a function of throughput is important. Suppose Dr. Bigio conducts some experiments and wants to determine the relationship between the two variables. Why? The model would provide a simple-to-use relationship between the variables, which could be used in a controls algorithm.
Sample data
Data Plot
Method of Least Squares: Principle: minimize $\sum_i [\hat{Y}_i - Y_i]^2$. Assume $\hat{Y}_i = a + bX_i$ (linear relationship), so the quantity to minimize is $\sum_i [Y_i - (a + bX_i)]^2$.
Linear regression
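Linear regression: Minimizing the sum of squared errors w.r.t. “a” gives $\sum_i Y_i = na + b\sum_i X_i$ (1). Minimizing w.r.t. “b” gives $\sum_i X_iY_i = a\sum_i X_i + b\sum_i X_i^2$ (2). Here n is the number of data points.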
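Linear regression: Solving (1) and (2), we get $b = \dfrac{n\sum_i X_iY_i - \sum_i X_i \sum_i Y_i}{n\sum_i X_i^2 - (\sum_i X_i)^2}$ and $a = \bar{Y} - b\bar{X}$, where $\bar{X}$ and $\bar{Y}$ are the sample means.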
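A minimal sketch of the least-squares line fit, assuming the standard closed-form solution of the two normal equations. The function name `fit_line` and the data values are hypothetical; they are not the lecture's measurements.

```python
import numpy as np

def fit_line(x, y):
    """Return intercept a and slope b of the least-squares line Y_hat = a + b*X."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sx, sy = x.sum(), y.sum()
    sxy = (x * y).sum()
    sxx = (x * x).sum()
    # Slope and intercept from solving the two normal equations.
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = (sy - b * sx) / n
    return a, b

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(x, y)
print(f"Y_hat = {a:.3f} + {b:.3f} X")
```

The same coefficients should come out of a library routine such as `np.polyfit(x, y, 1)`, which returns the slope first and the intercept second.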
Linear regression line: This is a linear regression line, with Y linearly dependent on X.
Regression Line: The regression line from the output comes out as:
Interpretations: What do you observe from the results? Consider the R-squared value and the nature of the residuals. Is the result physically possible? No! You cannot have a negative fill length. Can you use the model? Yes! It could be fine for a controls algorithm, as long as the range of the specific throughput is limited.
Standard Error (Single Variable): Estimates the error between the predicted values of Y and the actual values. From the output, the value of the standard error is 7.29.
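The slide's formula did not survive extraction, so the sketch below assumes the usual definition of the standard error of estimate for a single-variable fit: the square root of the sum of squared residuals divided by n - 2. The function name and data are hypothetical.

```python
import numpy as np

def standard_error(x, y):
    """Standard error of estimate for the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, a = np.polyfit(x, y, 1)        # slope first, then intercept
    residuals = y - (a + b * x)       # actual minus predicted
    # Assumption: n - 2 degrees of freedom (two fitted parameters a and b).
    return np.sqrt((residuals**2).sum() / (x.size - 2))

# Hypothetical data, for illustration only.
se = standard_error([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0])
print(f"standard error = {se:.4f}")
```

The result has the same units as Y, which is why a value such as 7.29 has to be judged against the magnitude of the measured fill lengths.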
Correlation Coefficient: The correlation coefficient (r) estimates the strength of the linear relationship between X and Y. It varies between -1 and 1. The closer r is in absolute value to 1, the greater the degree of correlation.
Correlation Coefficient: From the output, r = 0.948.
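As an illustration, the sketch below computes r under the assumption that the Pearson product-moment form is intended. The slide's own expression is not recoverable, and the data are hypothetical.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()                 # deviations from the mean of X
    dy = y - y.mean()                 # deviations from the mean of Y
    return (dx * dy).sum() / np.sqrt((dx**2).sum() * (dy**2).sum())

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r = correlation(x, y)
print(f"r = {r:.3f}")
```

NumPy's `np.corrcoef(x, y)[0, 1]` should give the same number and is the more convenient call in practice.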
Nonlinear regression: Suppose one assumed a nonlinear relationship between Y and X. Would there be a better fit in that case?
Nonlinear regression
Nonlinear regression: Let us assume $Y = ae^{bX}$. Taking the natural log of both sides: $\ln Y = \ln a + bX$. This is now a linear relationship between ln Y and X.
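The log transformation can be sketched as follows, assuming an exponential model of the form Y = a·exp(bX), so that a straight-line fit of ln Y against X recovers ln a and b. The function name and the noise-free data are hypothetical.

```python
import numpy as np

def fit_exponential(x, y):
    """Fit Y ~ a*exp(b*X) by linear regression of ln(Y) on X. Requires Y > 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("log transform needs strictly positive Y values")
    b, ln_a = np.polyfit(x, np.log(y), 1)   # slope = b, intercept = ln(a)
    return np.exp(ln_a), b

# Hypothetical noise-free data generated from Y = 2*exp(0.5*X).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * np.exp(0.5 * x)
a, b = fit_exponential(x, y)
print(f"Y_hat = {a:.3f} * exp({b:.3f} X)")
```

One design consequence worth keeping in mind: this procedure minimizes squared errors in ln Y, not in Y, so the residuals and standard error it reports are in log units.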
Nonlinear regression: In this case,
Nonlinear regression
Nonlinear regression: From the output data, the regression equation comes out to be: This implies
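Nonlinear regression: In this case, r = 0.9899 and the standard error = 0.05. This indicates that, for the given set of values, nonlinear regression achieved a better fit than the linear model (r = 0.948). Note that if these statistics were computed on ln Y rather than Y, the standard error is not in the same units as the 7.29 from the linear fit.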
Multiple regression: What happens when Y is dependent on more than one variable?
Multiple regression: Suppose gasoline mileage (Y) depends on two variables: fuel octane rating (X1) and average speed (X2). How can one establish a regression equation relating Y to X1 and X2, given the values of Y, X1, and X2 from some experiments?
Sample data
Multiple regression: Let $\hat{Y} = a + b_1X_1 + b_2X_2$. Using the method of least squares, the quantity to minimize is $\sum_i [Y_i - (a + b_1X_{1i} + b_2X_{2i})]^2$ (1).
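Multiple regression: Minimizing (1) w.r.t. a, b1, and b2 gives three equations:
$\sum_i Y_i = na + b_1\sum_i X_{1i} + b_2\sum_i X_{2i}$ (a)
$\sum_i X_{1i}Y_i = a\sum_i X_{1i} + b_1\sum_i X_{1i}^2 + b_2\sum_i X_{1i}X_{2i}$ (b)
$\sum_i X_{2i}Y_i = a\sum_i X_{2i} + b_1\sum_i X_{1i}X_{2i} + b_2\sum_i X_{2i}^2$ (c)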
Multiple regression
Multiple regression: Solving the three equations, one can get the values of a, b1, and b2. Multiple linear regression equation: $\hat{Y} = a + b_1X_1 + b_2X_2$.
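A minimal sketch of solving the three least-squares equations in matrix form, assuming a design matrix with a column of ones for the intercept. The function name `fit_plane` and the data are hypothetical and are not the lecture's mileage measurements.

```python
import numpy as np

def fit_plane(x1, x2, y):
    """Return (a, b1, b2) for the least-squares plane Y_hat = a + b1*X1 + b2*X2."""
    x1, x2, y = (np.asarray(v, dtype=float) for v in (x1, x2, y))
    # Design matrix: one column per unknown (intercept, X1, X2).
    A = np.column_stack([np.ones_like(x1), x1, x2])
    # Normal equations: (A^T A) beta = A^T y, i.e. three equations in a, b1, b2.
    beta = np.linalg.solve(A.T @ A, A.T @ y)
    return beta

# Hypothetical noise-free data generated from Y = 1 + 2*X1 - 3*X2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * u - 3 * v for u, v in zip(x1, x2)]
a, b1, b2 = fit_plane(x1, x2, y)
print(f"Y_hat = {a:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```

For ill-conditioned data, `np.linalg.lstsq` is usually a safer choice than forming the normal equations explicitly; the explicit form is used here only because it mirrors the three-equation derivation.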
Multiple regression: In our example, a = -63.535, b1 = 1.1789, and b2 = -0.24743. This implies: $\hat{Y} = -63.535 + 1.1789X_1 - 0.24743X_2$.
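To show how the fitted coefficients are used, the sketch below evaluates the regression plane at one operating point. The coefficients are the ones quoted above. The function name and the input values (octane 90, speed 50) are hypothetical and may lie outside the range covered by the experiments.

```python
def predict_mileage(octane, speed):
    """Evaluate the fitted plane Y_hat = a + b1*X1 + b2*X2 from the example."""
    a, b1, b2 = -63.535, 1.1789, -0.24743   # coefficients quoted in the lecture
    return a + b1 * octane + b2 * speed

# Hypothetical operating point, for illustration only.
y_hat = predict_mileage(90.0, 50.0)
print(f"predicted mileage = {y_hat:.4f}")

# Effect of one extra unit of average speed at fixed octane rating.
delta = predict_mileage(90.0, 51.0) - predict_mileage(90.0, 50.0)
print(f"change per unit speed = {delta:.5f}")
```

The second calculation simply recovers b2: with octane held fixed, each additional unit of average speed lowers the predicted mileage by about 0.247.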
Multiple regression: Standard error: As in single-variable linear regression, one can compute a standard error for a fit with multiple variables.
Multiple regression
Multiple regression: In our example, the standard error summarizes the degree to which points are scattered around the regression plane.
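The scatter around the regression plane can be computed as sketched below. This assumes the common definition with n - k - 1 degrees of freedom (n - 3 for two predictors), since the slide's formula was not recovered. The function name and the data are hypothetical.

```python
import numpy as np

def standard_error_multiple(X, y):
    """Standard error of estimate for a linear fit with k predictors (columns of X)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    residuals = y - A @ beta
    # Assumption: k + 1 fitted parameters, so n - k - 1 degrees of freedom.
    return np.sqrt(residuals @ residuals / (n - k - 1))

# Hypothetical data, for illustration only.
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [0.0, 0.0, 0.0, 4.0]
se = standard_error_multiple(X, y)
print(f"standard error = {se:.3f}")
```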
Multiple regression: Advantages: $r_{YX_1} = 0.74$, $r_{YX_2} = 0.081$, and $r_{X_1X_2} = 0.53$. There is a poor correlation between Y and X2 and, at the same time, a good correlation between X1 and X2. This shows that the effect of X2 on Y is camouflaged by its interaction with X1.
Multiple regression: Advantages (contd.): Thus, it would be a mistake to exclude the effect of X2 on Y merely because its correlation coefficient with Y is low.
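One way to make the camouflage effect concrete is the first-order partial correlation, which measures how Y and X2 are related once the linear effect of X1 is removed from both. This statistic is not part of the lecture; it is brought in here as an illustration, using the three correlation coefficients quoted above.

```python
import math

def partial_correlation(r_y2, r_y1, r_12):
    """Correlation of Y with X2 after removing the linear effect of X1 from both."""
    return (r_y2 - r_y1 * r_12) / math.sqrt((1 - r_y1**2) * (1 - r_12**2))

# Correlation coefficients quoted in the lecture example.
r_partial = partial_correlation(r_y2=0.081, r_y1=0.74, r_12=0.53)
print(f"partial correlation of Y and X2 given X1 = {r_partial:.3f}")
```

With these inputs the partial correlation comes out at roughly -0.55, far larger in magnitude than the raw 0.081 and of the same sign as the fitted coefficient b2.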
Polynomial regression
Polynomial regression: Suppose one had to develop a regression equation for a stress-strain curve. It is difficult to apply any nonlinear transformation. What can one do in this case?
Polynomial regression: Assume $\hat{Y} = a + b_1X + b_2X^2 + \dots + b_kX^k$. In other words, incorporate higher powers of X to obtain a better fit.
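A minimal sketch of a polynomial fit, assuming a second-degree model for illustration; the lecture does not state which degree was used for the stress-strain data. The function name and the data are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_polynomial(x, y, degree):
    """Least-squares polynomial coefficients, lowest power first: a, b1, b2, ..."""
    return P.polyfit(np.asarray(x, dtype=float), np.asarray(y, dtype=float), degree)

# Hypothetical noise-free data generated from Y = 1 + 2*X - 0.3*X^2.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x - 0.3 * x**2
coeffs = fit_polynomial(x, y, degree=2)
print("a, b1, b2 =", np.round(coeffs, 4))

# Evaluate the fitted polynomial at a new point.
y_new = P.polyval(2.5, coeffs)
```

Polynomial regression is still linear in the unknown coefficients, so it is solved with the same least-squares machinery as multiple regression, with X, X^2, ... playing the role of separate variables.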
Polynomial regression: Data for the stress-strain curve
Polynomial regression
Polynomial regression: Regression equation: correlation coefficient (r) = 0.922.
Wrap Up: Regression: linear (single variable), nonlinear, multiple, polynomial. Correlation coefficient. Standard error.