1 6. Other issues Quimiometria Teórica e Aplicada Instituto de Química - UNICAMP.

Presentation transcript:

1 6. Other issues Quimiometria Teórica e Aplicada Instituto de Química - UNICAMP

2 How many components to use?
Use the 'unfolding trick', i.e. look at the rank of each mode.
  – does not have a strict statistical basis, but generally works well
Use the core-consistency diagnostic (PARAFAC).
  – also seems to work well in practice
Split-half analysis.
Does the algorithm converge without problems?
Use full cross-validation.
  – the N-way Toolbox now has a routine for this (it can be slow)
Look at loadings and residuals.
Use chemical knowledge.
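As a rough illustration of the 'unfolding trick', the sketch below unfolds a three-way array along each mode and inspects the singular values of every unfolding. The function name `mode_singular_values`, the simulated two-component array and the relative threshold `1e-10` are assumptions made for this example only. With real, noisy data the singular values would normally be judged by eye (for instance in a scree plot) rather than with a hard threshold.

```python
import numpy as np

def mode_singular_values(X):
    """Return the singular values of the unfolding of X along each mode."""
    singular_values = []
    for mode in range(X.ndim):
        # Bring the chosen mode to the front and flatten the remaining modes
        unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        singular_values.append(np.linalg.svd(unfolded, compute_uv=False))
    return singular_values

# Simulated noise-free array with two trilinear components (hypothetical data)
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 2))   # object mode
B = rng.normal(size=(5, 2))   # second mode
C = rng.normal(size=(4, 2))   # third mode
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Count the singular values that are not negligible in each mode
ranks = [int(np.sum(s > 1e-10 * s[0])) for s in mode_singular_values(X)]
print(ranks)
```

For this noise-free simulation every unfolding has two non-negligible singular values, which matches the number of components used to build the array.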

3 Preprocessing: centering (1)
We are often interested in the differences between objects, not in their absolute values.
  – building calibration models: differences between samples
Mean-centering removes offsets from the data
  – removes constant background effects
  – can help to linearize data, i.e.

4 Preprocessing: centering (2)
When performing a calibration, it is most common to remove the mean value from each column.
[Figure: two-way array X (object × variable) and three-way array X (object × primary variable × secondary variable), with the column mean x_jk marked]
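A minimal sketch of this column centering, assuming the objects are stored along the first axis of a NumPy array. The function name `center_across_objects` and the simulated data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def center_across_objects(X):
    """Subtract the mean over the object mode (axis 0) from every column.

    Works for a two-way array (object x variable) and for a three-way
    array (object x primary variable x secondary variable).
    """
    X = np.asarray(X, dtype=float)
    column_means = X.mean(axis=0, keepdims=True)
    return X - column_means, column_means

# Hypothetical three-way data with a constant offset of 10
rng = np.random.default_rng(1)
X3 = rng.normal(size=(6, 4, 3)) + 10.0
X3_centered, means = center_across_objects(X3)

# After centering, each (j, k) column has zero mean over the objects
print(np.allclose(X3_centered.mean(axis=0), 0.0))
```

The means are returned as well, because the same values would have to be subtracted from any new samples before a calibration model is applied to them.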

5 Preprocessing: scaling (1)
Sometimes we want to analyse variables measured in different units
  – chemical engineering: temperatures, pressures, flow rates
  – QSAR: ionization constants, Hammett constants, dipole moments
These variables should be scaled so that each has an equal chance of appearing in the model.

6 Preprocessing: scaling (2)
For two-way arrays (object × variables), it is common to divide by the standard deviation after mean-centering the data ('autoscaling').
[Figure: two-way array X (object × variable) and three-way array X (object × primary variable × secondary variable), with column x_jk marked]
Autoscaling can destroy multilinear structure!

7 Preprocessing: scaling (3)
[Figure: array X (object × process variable × time) with slab X_j]
Slab scaling maintains the multilinear structure!
[Figure: array X (object × process variable 1 × process variable 2) with slabs X_j and X_k]
Double slab scaling may also be useful (iterative).
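The difference between the two scalings can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names (`mode_ranks`, `slab_scale`, `autoscale_columns`), the root-mean-square scale factor and the simulated two-component array are all choices made for this example. It compares the rank of every unfolding after slab scaling and after column-wise autoscaling.

```python
import numpy as np

def mode_ranks(X, rel_tol=1e-10):
    """Numerical rank of the unfolding of X along each mode."""
    ranks = []
    for mode in range(X.ndim):
        unfolded = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        s = np.linalg.svd(unfolded, compute_uv=False)
        ranks.append(int(np.sum(s > rel_tol * s[0])))
    return ranks

def slab_scale(X, mode):
    """Divide each slab of the given mode by one scalar (its root mean square)."""
    other_axes = tuple(ax for ax in range(X.ndim) if ax != mode)
    scale = np.sqrt(np.mean(X ** 2, axis=other_axes, keepdims=True))
    return X / scale

def autoscale_columns(X):
    """Divide every (j, k) column by its own standard deviation over the objects."""
    return X / X.std(axis=0, keepdims=True)

# Simulated noise-free array with two trilinear components (hypothetical data)
rng = np.random.default_rng(2)
A = rng.normal(size=(6, 2))
B = rng.normal(size=(5, 2))
C = rng.normal(size=(4, 2))
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Center across the object mode first, then compare the two scalings
Xc = X - X.mean(axis=0, keepdims=True)
ranks_slab = mode_ranks(slab_scale(Xc, mode=1))
ranks_auto = mode_ranks(autoscale_columns(Xc))
print(ranks_slab, ranks_auto)
```

Slab scaling multiplies each slab by a single number, so it only rescales the rows of one loading matrix and the two-component structure survives in every mode. Column-wise autoscaling applies a different factor to each (j, k) column, which in this simulation raises the rank of the second-mode unfolding above two.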

8 Tucker models
Tucker1: $X = AG + E$
  – Tucker1 = PCA
Tucker2: $X = G(B \otimes A)^T + E$
  – $G$ is $(I \times R_2 R_3)$
  – very rarely used
Tucker3: $X = AG(C \otimes B)^T + E$
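The unfolded Tucker3 expression can be verified numerically. The sketch below assumes that the product between C and B is the Kronecker product and that the array is unfolded with the second-mode index running fastest. The dimensions, the random loadings and the helper name `unfold_first_mode` are illustrative assumptions.

```python
import numpy as np

def unfold_first_mode(T):
    """Unfold a three-way array (I, J, K) to (I, J*K), second-mode index fastest."""
    I, J, K = T.shape
    return T.transpose(0, 2, 1).reshape(I, K * J)

rng = np.random.default_rng(3)
I, J, K = 5, 4, 3          # array dimensions (hypothetical)
R1, R2, R3 = 2, 3, 2       # number of components in each mode (hypothetical)
A = rng.normal(size=(I, R1))
B = rng.normal(size=(J, R2))
C = rng.normal(size=(K, R3))
G = rng.normal(size=(R1, R2, R3))   # core array

# Element-wise definition: x_ijk = sum_pqr a_ip * b_jq * c_kr * g_pqr
X = np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)

# Matrix form of the same model: X = A G (C kron B)^T with unfolded X and G
X_unfolded = unfold_first_mode(X)
X_model = A @ unfold_first_mode(G) @ np.kron(C, B).T
tucker3_matches = np.allclose(X_unfolded, X_model)
print(tucker3_matches)
```

The check only holds when the core array is unfolded with the same index ordering as the data, which is why a single helper is used for both.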

9 PARAFAC2
[Figure: three-way array with modes object (I), wavelength (J) and time (K), with time shifts between objects]
In PARAFAC2, only the matrix product $X_i X_i^T$ ($J \times J$) is modelled.
It works if the correlation structures in the objects are the same.
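A small numerical illustration of why modelling only the cross-product helps with time shifts. The example idealises the shift as a circular shift along the time axis, which is an assumption: a circular shift merely reorders the time points, so the cross-product is unchanged exactly, whereas real retention-time or batch-time shifts preserve it only approximately. The data are simulated.

```python
import numpy as np

rng = np.random.default_rng(4)
J, K = 4, 30                       # wavelengths and time points (hypothetical)
X_i = rng.normal(size=(J, K))      # data for one object: wavelength x time

# Idealised time shift: move every profile three time points, wrapping around
X_shifted = np.roll(X_i, shift=3, axis=1)

# The J x J cross-products are identical although the raw data differ
cross_original = X_i @ X_i.T
cross_shifted = X_shifted @ X_shifted.T
same_cross_product = np.allclose(cross_original, cross_shifted)
print(same_cross_product, np.allclose(X_i, X_shifted))
```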

10 Missing data
Expectation-maximization (EM) is a technique for estimating models (PARAFAC, Tucker, PLS, PCA etc.) when some of the data is missing:
X = [X* X#], where X* is the known part and X# is the missing part.
0. Initialize X#
1. Estimate model (maximization)
2. Replace missing values with model values (expectation)
3. Repeat until convergence
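The steps above can be sketched for the simplest case, a two-way PCA-like model fitted by truncated SVD. This is an illustration under stated assumptions rather than the routine of any particular toolbox: the function name `em_impute_low_rank`, the column-mean initialisation, the stopping tolerance and the simulated data are all choices made for the example.

```python
import numpy as np

def em_impute_low_rank(X, missing, n_components, max_iter=2000, tol=1e-12):
    """Fill the entries flagged in `missing` using a truncated-SVD model."""
    observed = ~missing
    # Step 0: initialise the missing entries with the means of the observed columns
    col_means = np.where(missing, 0.0, X).sum(axis=0) / observed.sum(axis=0)
    X_filled = np.where(missing, col_means, X)
    for _ in range(max_iter):
        # Step 1: estimate the model on the completed data
        U, s, Vt = np.linalg.svd(X_filled, full_matrices=False)
        X_model = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
        # Step 2: replace the missing values with the model values
        change = np.linalg.norm(X_model[missing] - X_filled[missing])
        X_filled[missing] = X_model[missing]
        # Step 3: repeat until the imputed values stop changing
        if change < tol:
            break
    return X_filled

# Simulated noise-free rank-2 data with six entries removed (hypothetical)
rng = np.random.default_rng(5)
X_true = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 10))
missing = np.zeros(X_true.shape, dtype=bool)
missing[[0, 3, 7, 11, 15, 18], [1, 4, 2, 8, 6, 9]] = True
X_obs = X_true.copy()
X_obs[missing] = np.nan

X_hat = em_impute_low_rank(X_obs, missing, n_components=2)
max_error = np.max(np.abs(X_hat[missing] - X_true[missing]))
print(max_error)
```

The observed entries are never modified; only the flagged positions are updated in each pass. For PARAFAC or Tucker models the same loop applies, with the SVD step replaced by a fit of the multiway model.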

11 Thank you very much for your attention!