/k 2DS00 Statistics 1 for Chemical Engineering lecture 4.

Presentation transcript:

/k 2DS00 Statistics 1 for Chemical Engineering lecture 4

/k Week schedule
- Week 1: Measurement and statistics
- Week 2: Error propagation
- Week 3: Simple linear regression analysis
- Week 4: Multiple linear regression analysis
- Week 5: Nonlinear regression analysis

/k Detailed contents of week 4
- multiple linear regression
- polynomial regression
- interaction
- multicollinearity
- measures of model adequacy
- selection of regression models

/k Specific heat
- specific heat of vapour at constant pressure as a function of temperature
- data set from Perry’s Chemical Engineers’ Handbook
- thermodynamic theory says that a quadratic relation between temperature and specific heat usually suffices
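The quadratic relation mentioned on this slide can be written as a polynomial regression model. The formula is not preserved in the transcript, so the symbols below ($c_p$ for specific heat at constant pressure, $T$ for temperature, $\beta_j$ for the regression parameters, $\varepsilon$ for the random error) are assumed notation rather than the lecturer's own:

```latex
c_p = \beta_0 + \beta_1 T + \beta_2 T^2 + \varepsilon
```

The model is nonlinear in $T$ but linear in the parameters $\beta_0, \beta_1, \beta_2$, which is why it can be fitted with multiple linear regression by treating $T$ and $T^2$ as two independent variables.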

/k Scatter plot of specific heat data

/k Regression output for specific heat data

/k Issues in regression output
- significance of model
- significance of individual regression parameters
- residual plots:
  – normality (density trace, normal probability plot)
  – constant variance (against predicted values + each independent variable)
  – model adequacy (against predicted values)
  – outliers
  – independence
- influential points
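The quantities behind these checks can be computed directly from a fitted model. The following is a minimal sketch using NumPy on made-up data (not the data set from the lecture). The variable names, the 2p/n leverage threshold and the "larger than about 2" rule for studentized residuals are common rules of thumb assumed here, not taken from the slides.

```python
import numpy as np

# Made-up illustrative data: a quadratic trend plus small noise (NOT the lecture data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 15)
y = 1.0 + 0.5 * x + 0.2 * x**2 + rng.normal(scale=0.3, size=x.size)

# Design matrix with columns 1, x, x^2.
X = np.column_stack([np.ones_like(x), x, x**2])
n, p = X.shape

# Least-squares estimates, fitted values and residuals.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Estimate of the error variance (MSE) with n - p degrees of freedom.
mse = residuals @ residuals / (n - p)

# Leverages: diagonal of the hat matrix H = X (X'X)^-1 X'.
hat = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(hat)

# Internally studentized residuals; values beyond about +/-2 deserve a closer look.
studentized = residuals / np.sqrt(mse * (1.0 - leverage))

# Rule of thumb (assumption): leverage above 2p/n marks a potentially influential point.
high_leverage = np.flatnonzero(leverage > 2 * p / n)
possible_outliers = np.flatnonzero(np.abs(studentized) > 2)

print("high-leverage points:", high_leverage)
print("possible outliers:", possible_outliers)
```

Plotting `residuals` against `fitted` and against each independent variable gives the constant-variance and model-adequacy plots listed above; a normal probability plot of `studentized` addresses normality.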

/k Residual plot for specific heat data
- This behaviour is visible in the plot of the fitted line only after rescaling!

/k Plot of fitted quadratic model for specific heat data

/k Conclusion: regression models for specific heat data
- we need a third-order model (polynomial of degree 3)
- be careful with extrapolation
- original data set contains influential points
- original data set contains potential outliers

/k Yield data
- yield of chemical reaction as a function of both temperature and pressure
- goal of regression analysis is to find optimal settings of temperature and pressure
- start with simplest linear models:
  – no interaction
  – interaction
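The two starting models can be written out as follows. The notation ($Y$ for yield, $T$ for temperature, $P$ for pressure, $\beta_{12}$ for the interaction parameter) is assumed for illustration, since the transcript does not preserve the formulas:

```latex
\text{no interaction:} \quad Y = \beta_0 + \beta_1 T + \beta_2 P + \varepsilon

\text{interaction:} \quad Y = \beta_0 + \beta_1 T + \beta_2 P + \beta_{12}\, T P + \varepsilon
```

Without interaction, the effect of temperature on yield is $\beta_1$ at every pressure, so the lines of yield against temperature for different pressures are parallel. With interaction, the slope with respect to temperature is $\beta_1 + \beta_{12} P$ and therefore depends on the pressure.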

/k No interaction
[Figure: Yield against Temperature, one line per pressure level (Pressure = 5.5 and a second level whose value is missing)]

/k Interaction
[Figure: Yield against Temperature, one line per pressure level (Pressure = 5.5 and a second level whose value is missing)]

/k Interaction plot for yield data

/k First-order interaction model for yield data

/k Comments on first-order interaction model
- model significant, but R-squared relatively low
- residual plots suggest quadratic terms are missing

/k Full quadratic model for yield data

/k Comments on full quadratic model for yield data
- strong improvement in R-squared
- independent variable T is no longer significant
- other independent variables involving T remain significant
- refit model omitting independent variable T while keeping the other independent variables
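The refitting step can be sketched as follows. The data are made up (a hypothetical coded design with T and P at levels -1, 0, 1), and `fit_ols` and `design` are illustrative helper names, so the numbers will not match the regression output of the lecture.

```python
import numpy as np

def fit_ols(X, y):
    """Return estimates, standard errors, t-statistics and R-squared."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = resid @ resid
    mse = sse / (n - p)                      # error variance estimate
    cov = mse * np.linalg.inv(X.T @ X)       # covariance matrix of the estimates
    se = np.sqrt(np.diag(cov))
    sst = np.sum((y - y.mean()) ** 2)
    return beta, se, beta / se, 1.0 - sse / sst

# Hypothetical coded design: T and P each at -1, 0, 1, two runs per setting.
levels = [-1.0, 0.0, 1.0]
grid = np.array([(t, p) for t in levels for p in levels] * 2)
T, P = grid[:, 0], grid[:, 1]

# Made-up response in which the linear T term has no effect.
rng = np.random.default_rng(1)
yield_pct = (60 + 3 * P - 2 * T * P - 4 * T**2 - 3 * P**2
             + rng.normal(scale=0.5, size=T.size))

terms = {"1": np.ones_like(T), "T": T, "P": P,
         "T*P": T * P, "T^2": T**2, "P^2": P**2}

def design(names):
    """Stack the requested model terms into a design matrix."""
    return np.column_stack([terms[name] for name in names])

full_names = list(terms)
reduced_names = [name for name in full_names if name != "T"]

beta_full, se_full, t_full, r2_full = fit_ols(design(full_names), yield_pct)
beta_red, se_red, t_red, r2_red = fit_ols(design(reduced_names), yield_pct)

for name, b, t in zip(full_names, beta_full, t_full):
    print(f"{name:>4}: estimate {b:7.3f}, t = {t:6.2f}")
print(f"R-squared full: {r2_full:.3f}, without T: {r2_red:.3f}")
```

A t-statistic close to zero for T while the terms T*P and T^2 keep large t-statistics corresponds to the situation on the slide. Because the reduced model is nested in the full one, its R-squared can never be higher, so the comparison should look at how little is lost by dropping T.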

/k Incomplete quadratic model for yield data
- all parameters significant
- residual plots OK
- normality OK
- 3 influential points, but standard deviations of parameter estimates are OK, so no action
- 2 possible outliers at predicted yield of 61%
- accept model and use it for finding optimal settings for yield

/k Optimal settings for yield
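One way to obtain optimal settings from a fitted quadratic model without the linear T term is to set both partial derivatives to zero and solve the resulting linear system. The sketch below uses hypothetical coefficients in coded units (not the estimates from the lecture) for the assumed model yield = b0 + bP·P + bTP·T·P + bTT·T² + bPP·P².

```python
import numpy as np

# Hypothetical coefficients (assumption, not the fitted values from the lecture).
b0, bP, bTP, bTT, bPP = 60.0, 3.0, -2.0, -4.0, -3.0

# Setting the gradient to zero gives:
#   d/dT: 2*bTT*T + bTP*P       = 0
#   d/dP: bTP*T   + 2*bPP*P     = -bP
hessian = np.array([[2 * bTT, bTP],
                    [bTP, 2 * bPP]])
rhs = np.array([0.0, -bP])

T_opt, P_opt = np.linalg.solve(hessian, rhs)

# The stationary point is a maximum only if the Hessian is negative definite.
is_maximum = bool(np.all(np.linalg.eigvalsh(hessian) < 0))

def predicted_yield(T, P):
    """Evaluate the assumed incomplete quadratic model."""
    return b0 + bP * P + bTP * T * P + bTT * T**2 + bPP * P**2

print(f"T = {T_opt:.4f}, P = {P_opt:.4f}, maximum: {is_maximum}")
print(f"predicted yield at optimum: {predicted_yield(T_opt, P_opt):.3f}")
```

The computed settings are only meaningful if they lie inside the range of temperatures and pressures covered by the experiments; a stationary point outside that region would be an extrapolation.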

/k Problems with model selection
- variables may be significant in one model but not in another
- number of possible models increases rapidly with number of independent variables
- independent variables may influence each other (multicollinearity)

/k Multicollinearity
- Phenomenon: variables x_i (almost) satisfy a linear relation
- Consequence: large variances of parameter estimates
- Not harmful for predictions
- Unpleasant for finding causal relations
- Ways to check for multicollinearity:
  – wrong signs of parameter estimates
  – significant model, but (many) non-significant parameters
  – large variances of parameter estimates
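The checks on this slide are qualitative. A widely used quantitative complement, not mentioned on the slide, is the variance inflation factor (VIF): regress each independent variable on all the others and compute 1 / (1 − R²). The sketch below uses made-up data in which x2 is almost a copy of x1; the function name `vif` and the threshold of 10 are assumptions for illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Made-up data: x2 nearly equals x1, x3 is unrelated to both.
rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.05, size=50)
x3 = rng.normal(size=50)

factors = vif(np.column_stack([x1, x2, x3]))
for name, value in zip(["x1", "x2", "x3"], factors):
    flag = "  <- possible multicollinearity" if value > 10 else ""
    print(f"VIF({name}) = {value:.1f}{flag}")
```

The VIF measures how much the variance of a parameter estimate is inflated relative to the case of uncorrelated independent variables, which ties in with the "large variances of parameter estimates" item above.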

/k Procedures for model selection
- compute all possible regression models
  – only possible with few independent variables
  – choice of best model through adequacy measures:
      coefficient of determination (adjusted for number of independent variables)
      MSE (directly related to standard error)
      Mallows' C_p (estimates total mean square error)
- sequentially add terms (forward regression)
- sequentially delete terms from full model (backward regression)
- These procedures do not necessarily yield the same result
- Final models should always be checked!
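The "all possible regression models" procedure can be sketched in a few lines. The data below are made up, the helper names `sse_of` and `all_subsets` are illustrative, and ranking by smallest C_p is one common convention rather than a rule from the slides. With p parameters including the intercept, the measures are: adjusted R² = 1 − (SSE/(n − p)) / (SST/(n − 1)), MSE = SSE/(n − p), and C_p = SSE/MSE_full − (n − 2p).

```python
import itertools
import numpy as np

def sse_of(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def all_subsets(X, y, names):
    """Fit every non-empty subset of columns (plus intercept) and rank by C_p."""
    n, k = X.shape
    ones = np.ones((n, 1))
    sst = float(np.sum((y - y.mean()) ** 2))
    mse_full = sse_of(np.hstack([ones, X]), y) / (n - k - 1)
    rows = []
    for size in range(1, k + 1):
        for cols in itertools.combinations(range(k), size):
            p = size + 1                                   # parameters incl. intercept
            sse = sse_of(np.hstack([ones, X[:, list(cols)]]), y)
            mse = sse / (n - p)
            adj_r2 = 1.0 - mse / (sst / (n - 1))
            cp = sse / mse_full - (n - 2 * p)
            rows.append((tuple(names[c] for c in cols), adj_r2, mse, cp))
    return sorted(rows, key=lambda row: row[3])

# Made-up data: y depends on x1 and x2 only; x3 is pure noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=40)

results = all_subsets(X, y, ["x1", "x2", "x3"])
for terms, adj_r2, mse, cp in results:
    print(f"{' + '.join(terms):<14} adj R2 = {adj_r2:6.3f}  MSE = {mse:6.3f}  Cp = {cp:7.2f}")
```

With k independent variables there are 2^k − 1 candidate models, which is why this approach is only feasible for few variables. Whichever model comes out on top, its residual plots and influential points still need to be checked.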