5. Multiway calibration. Quimiometria Teórica e Aplicada, Instituto de Química - UNICAMP.

Presentation transcript:

1 5. Multiway calibration. Quimiometria Teórica e Aplicada, Instituto de Química - UNICAMP

2 Multiway regression problems, e.g. batch reaction monitoring: process measurements X (batch × process variable × time) are related to product quality Y (batch × product quality).

3 Multiway regression problems, e.g. tandem mass spectrometry: MS-MS spectra X (sample × parent ion m/z × daughter ion m/z) are related to compound concentrations Y (sample × compound).

4 Some terminology
- Zero-order: univariate calibration (OLS, ordinary least squares). Cannot handle interferents.
- First-order: multivariate calibration (ridge regression, PCR, PLS, etc.). Can handle interferents if they are present in the training set.
- Second-order: methods with the second-order advantage (PARAFAC, restricted Tucker, GRAM, RBL, etc.). Can handle unknown interferents (although see the work of K. Faber).
N-PLS(?)

5 Multiway calibration methods
- PARAFAC (already discussed on the first day)
- (Unfold-PLS)
- Multiway PCR
- N-PLS
- MCovR (multiway covariates regression; see the work of Smilde & Gurden)
- GRAM, NBRA, RBL (see the work of Kowalski et al.)

6 Unfold-PLS: matricize (or 'unfold') the three-way array X (I × J × K) into an I × JK matrix by placing the slabs X1, ..., XI side by side, then use standard two-way PLS to relate it to Y (I × M). But if a multiway structure exists in the data, multiway methods have some important advantages!
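Matricizing is just a reshape that keeps the sample mode; a minimal numpy sketch (array sizes are made up for illustration):

```python
import numpy as np

# A three-way array X (I samples x J variables x K times) is matricized
# ("unfolded") to an I x JK matrix so that two-way PLS can be applied.
I, J, K = 4, 3, 5
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

# Keep the sample mode and concatenate each sample's J x K slab as one row.
X_unfolded = X.reshape(I, J * K)
print(X_unfolded.shape)  # (4, 15)
```

Each row of `X_unfolded` is one sample's slab flattened in numpy's default C order.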

7 Two-way PCR
Standard PCR for X (I × J) and y (I × 1):
1. Calculate a PCA model of X: X = TP^T + E
2. Use the PCA scores for ordinary regression: y = Tb + E, giving b = (T^T T)^-1 T^T y
3. Make predictions for new samples: T_new = X_new P, y_new = T_new b
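The three steps above can be sketched in a few lines of numpy; this is a minimal illustration on simulated data, using the SVD for the PCA step (all sizes and names are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, n_comp = 20, 8, 3

# Simulated calibration data: X (I x J) and y (I x 1).
X = rng.normal(size=(I, J))
y = X @ rng.normal(size=J) + 0.01 * rng.normal(size=I)

# 1. PCA model of X via the SVD: X = T P^T + E.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T = U[:, :n_comp] * s[:n_comp]   # scores
P = Vt[:n_comp].T                # loadings

# 2. Ordinary regression on the scores: b = (T^T T)^-1 T^T y.
b = np.linalg.solve(T.T @ T, T.T @ y)

# 3. Predictions for new samples: T_new = X_new P, y_new = T_new b.
X_new = rng.normal(size=(5, J))
y_new = (X_new @ P) @ b
print(y_new.shape)  # (5,)
```

Regressing on a few orthogonal scores instead of all J correlated variables is what makes PCR stable when X is (nearly) collinear.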

8 Multiway PCR
Multiway PCR for X (I × J × K) and y (I × 1):
1. Calculate a multiway model: X = A(C|⊗|B)^T + E
2. Use the scores for regression: y = A b_PCR + E, giving b_PCR = (A^T A)^-1 A^T y
3. Make predictions for new samples: A_new = X_new P(P^T P)^-1, where P = (C|⊗|B), then y_new = A_new b_PCR
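As a sketch of the score-recovery step, suppose the trilinear loadings B and C have already been fitted. For data that follow the model exactly, projecting the unfolded array onto the Khatri-Rao product recovers the scores A. This synthetic numpy example is illustrative only (the `khatri_rao` helper and all sizes are assumptions; its argument order matches numpy's C-order unfolding rather than the (C|⊗|B) notation on the slide):

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product, rows ordered to match reshape(I, J*K)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

rng = np.random.default_rng(0)
I, J, K, R = 6, 4, 5, 2

# Synthetic noise-free trilinear data, unfolded to I x JK: X = A P^T.
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))
P = khatri_rao(B, C)
X_unf = A @ P.T

# Score recovery by projection: A_hat = X_unf P (P^T P)^-1.
A_hat = X_unf @ P @ np.linalg.inv(P.T @ P)
print(np.allclose(A_hat, A))  # True
```

The recovered scores would then go into the ordinary regression y = A b_PCR + E exactly as in step 2.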

9 N-PLS
N-PLS is a direct extension of standard two-way PLS to N-way arrays. The advantages of N-PLS are the same as for any multiway analysis:
- a more parsimonious model
- loadings which are easier to plot and interpret

10 N-PLS
The N-PLS algorithm (R. Bro) uses PARAFAC-type loadings, but is otherwise very similar to the standard two-way PLS algorithm (see 'Multivariate Calibration' by Martens and Næs).

11 N-PLS graphic (taken from R.Bro)

12 Other methods
Multiway covariates regression (MCovR):
- different to PLS-type models
- choice of structure on X (PARAFAC, Tucker, unfold, etc.)
- sometimes loadings are easier to interpret
Restricted Tucker, GRAM, RBL, NBRA, etc.:
- for more specialized use
- second-order advantage, i.e. able to handle unknown interferents
[Figure: restricted loadings A, with N standard and M mixture components constrained by a fixed pattern of ones and zeros]

13 Conclusions
There are a number of different calibration methods for multiway data. N-PLS is an extension of two-way PLS for multiway data. All the normal guidelines for multivariate regression still apply!
- watch out for outliers
- don't apply the model outside of the calibration range

14 Outliers (1)
Outliers are objects which are very different from the rest of the data (e.g. a bad experiment). These can have a large effect on the regression model and should be removed.

15 Outliers (2)
Outliers can also be found in the model space (e.g. in a plot of PC 1 scores vs. PC 2 scores) or in the residuals.
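A minimal sketch of spotting an outlier in the model space, on synthetic data (the planted outlier and the one-component PCA are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 6))
X[5] += 10.0  # plant one sample far away from the rest of the data

# One-component PCA of the mean-centred data; the outlier dominates the
# first principal direction and shows up as an extreme score.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U[:, 0] * s[0]
print(int(np.argmax(np.abs(scores))))  # 5
```

In practice one would inspect both the score plot (model space) and the residual statistics before deciding to remove a sample.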

16 Model extrapolation...
Univariate example: mean height vs. age for a group of young children. A strong linear relationship between height and age is seen: for young children, height and age are correlated. (Moore, D.S. and McCabe, G.P., Introduction to the Practice of Statistics, 1989.)

17 ... can be dangerous!
The linear model was valid for this age range... but is not valid for 30-year-olds!
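The danger is easy to demonstrate numerically. The height/age numbers below are hypothetical and only loosely inspired by the slide, not the data from Moore & McCabe:

```python
import numpy as np

# Hypothetical mean heights (cm) of children aged 2-10: nearly linear.
age = np.array([2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
height = np.array([86, 95, 102, 109, 115, 121, 127, 132, 137], dtype=float)

slope, intercept = np.polyfit(age, height, 1)

# Inside the calibration range the line predicts well...
print(round(intercept + slope * 6))   # close to the observed 115

# ...but extrapolating to a 30-year-old predicts an absurd height.
print(intercept + slope * 30 > 200)   # True
```

The same caution applies to any multivariate or multiway model: predictions are only trustworthy inside the region spanned by the calibration samples.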