Grey modeling approaches to investigate chemical processes Romà Tauler 1 and Anna de Juan 2 IIQAB-CSIC 1, UB 2 Spain

Grey modeling approaches to investigate chemical processes
- Introduction to chemical modeling: white (hard), black (soft) and grey modeling in chemistry
- Multivariate Curve Resolution as a grey modeling method
- Grey modeling applications using MCR-ALS

Modeling approaches
- Hard modeling (white modeling): models based on physical/chemical laws.
- Soft modeling (black modeling):
  - Empirical models with no knowledge of, or assumptions about, the physical/chemical laws of the system (usually non-linear).
  - Models with no assumptions about the physical/chemical model but with assumptions about the measurement model (usually multivariate and linear).
- Soft + hard modeling? Grey modeling? Mixed models that partially use information about physical/chemical laws.

Chemical multicomponent systems: structure
- Chemical model (variation of compound contribution):
  - Mixture: non-existent.
  - Process: known, too complex, or unknown.
- Measurement model (variation of the instrumental signal): simple additive linear model (factor analysis tools).
  $D = D_1 + D_2 + \dots + D_n = c_1 s_1 + c_2 s_2 + \dots + c_n s_n = C S$

Hard (White) Modeling
Data modeling and data fitting in the chemical sciences have traditionally been done with hard-modeling techniques. These are based on physical/chemical models that are already known (or assumed, proposed, ...). The parameters of these models are not known and are estimated by least-squares curve fitting.
This approach may also be called white modeling. It is valid for well-known phenomena and laboratory data, where the variables of the model are under control during the experiments and only the phenomena under study affect the data.

Hard (White) Modeling: find the optimal parameters of the model.

Hard (White) Modeling. Case 1: kinetic systems
$Y_{ij} = A_{ij}$: measured absorbance of sample/solution i at wavelength j.
- Measurement model assumptions:
- Chemical model assumptions:
- Defining the residuals:
- Finding the best model and its parameters

Hard (White) Modeling. Case 2: solution equilibria
$Y_{ij} = A_{ij}$: measured absorbance of sample/solution i at wavelength j.
- Measurement model assumptions:
- Chemical model assumptions:
- Defining the residuals:
- Finding the best model and its parameters
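The equations of the two case slides above are not preserved in this transcript. As a hedged sketch only, a typical hard-modeling formulation for absorbance data could read as follows. It assumes a Beer-Lambert-type bilinear measurement model and a chemical model that gives the concentrations as a function of the non-linear parameters p (rate constants for kinetics, equilibrium constants for solution equilibria). The symbols S^T, R, p and the pseudoinverse notation are assumptions, not necessarily the authors' notation.

```latex
\begin{aligned}
\text{Measurement model:}\quad & \mathbf{Y} = \mathbf{C}\,\mathbf{S}^{T} + \mathbf{R} \\
\text{Chemical model:}\quad & \mathbf{C} = f(\mathbf{p}), \qquad
  \mathbf{p} = (k_1, k_2, \dots)\ \text{or}\ (K_1, K_2, \dots) \\
\text{Residuals:}\quad & \mathbf{R}(\mathbf{p}) = \mathbf{Y} - \mathbf{C}(\mathbf{p})\,\hat{\mathbf{S}}^{T},
  \qquad \hat{\mathbf{S}}^{T} = \mathbf{C}(\mathbf{p})^{+}\,\mathbf{Y} \\
\text{Best parameters:}\quad & \hat{\mathbf{p}} = \arg\min_{\mathbf{p}} \sum_{i}\sum_{j} r_{ij}(\mathbf{p})^{2}
\end{aligned}
```

In this form only the non-linear parameters are searched iteratively; the spectra are linear parameters and follow from a least-squares step for each trial value of p.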

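Hard (White) Modeling: the Newton-Gauss-Levenberg/Marquardt (NGL/M) algorithm (flowchart)
1. Set the Marquardt parameter mp = 0 and guess the parameters, k_0.
2. Calculate the residuals, r(k_0), and the sum of squares, ssq.
3. Compare ssq with ssq_old:
   - ssq ≈ ssq_old: end; display results.
   - ssq < ssq_old: mp / 3.
   - ssq > ssq_old: mp × 5.
4. Calculate the Jacobian J.
5. Calculate the shift vector δk, set k_0 = k_0 + δk, and return to step 2.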
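The loop above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the kinetic model (first-order A -> B -> C with k1 != k2), the simulated spectra, the noise level, the forward-difference Jacobian and all function names are hypothetical choices made for the example. The spectra are treated as linear parameters and eliminated by least squares, so only the rate constants are iterated.

```python
import numpy as np

def conc_profiles(k, t, a0=1.0):
    """Concentrations of A, B, C for A -> B -> C (first order, k1 != k2)."""
    k1, k2 = k
    a = a0 * np.exp(-k1 * t)
    b = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    return np.column_stack([a, b, a0 - a - b])

def residuals(k, t, Y):
    """r(k): the spectra are linear parameters, eliminated by least squares."""
    C = conc_profiles(k, t)
    S_T = np.linalg.lstsq(C, Y, rcond=None)[0]
    return (Y - C @ S_T).ravel()

def jacobian(k, t, Y, r):
    """Forward-difference approximation of dr/dk."""
    J = np.empty((r.size, k.size))
    for j in range(k.size):
        h = 1e-6 * abs(k[j])
        k_h = k.copy()
        k_h[j] += h
        J[:, j] = (residuals(k_h, t, Y) - r) / h
    return J

def nglm(k0, t, Y, max_iter=100, tol=1e-8):
    k = np.asarray(k0, dtype=float)
    mp = 0.0                                  # Marquardt parameter
    r = residuals(k, t, Y)
    ssq_old = float(r @ r)
    for _ in range(max_iter):
        J = jacobian(k, t, Y, r)
        delta = np.linalg.solve(J.T @ J + mp * np.eye(k.size), -J.T @ r)
        r_new = residuals(k + delta, t, Y)
        ssq = float(r_new @ r_new)
        if ssq < ssq_old:                     # improvement: accept step, mp / 3
            converged = (ssq_old - ssq) <= tol * ssq_old
            k, r, ssq_old = k + delta, r_new, ssq
            mp /= 3.0
            if converged:
                break
        else:                                 # no improvement: reject step, mp x 5
            mp = 1.0 if mp == 0.0 else mp * 5.0
            if mp > 1e12:
                break
    return k, ssq_old

# Simulated data (hypothetical): three Gaussian spectra and small noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 30.0, 61)
wl = np.linspace(0.0, 1.0, 25)
S_true = np.array([np.exp(-((wl - c) / 0.15) ** 2) for c in (0.25, 0.5, 0.75)])
k_true = np.array([0.3, 0.05])
Y = conc_profiles(k_true, t) @ S_true + 1e-3 * rng.standard_normal((t.size, wl.size))

k_fit, ssq = nglm([0.4, 0.03], t, Y)
```

With unknown spectra, a consecutive first-order model has an equivalent solution with k1 and k2 exchanged, so the starting guess decides which of the two is found.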

Soft (Black) Modeling
- In soft (black) modeling no physical model is assumed.
- In some cases a linear measurement model is assumed (factor analysis methods).
- In other cases, dependencies among variables and sources of variation are considered to be non-linear (neural networks, genetic algorithms, ...).
- The goal of these methods is to explain the data variance using minimal, or softer, assumptions about the data.

Example of Soft (Black) Modeling: Factor Analysis / Principal Component Analysis
Bilinear model: $D = U V^T + E$, where D is I × J, U is I × N, V^T is N × J, E is I × J, and N << I or J.
Unique solutions but without physical meaning.
Constraints: U orthogonal; V^T orthonormal; V^T in the direction of maximum variance.
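As an illustration of the bilinear PCA model, the following sketch builds scores and loadings from a truncated singular value decomposition. The simulated data matrix, its size and the noise level are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, N = 30, 40, 3

# Hypothetical data: N non-negative components plus small noise.
D = rng.random((I, N)) @ rng.random((N, J)) + 1e-3 * rng.standard_normal((I, J))

U_svd, s, Vt_svd = np.linalg.svd(D, full_matrices=False)
U = U_svd[:, :N] * s[:N]      # scores: orthogonal columns
Vt = Vt_svd[:N, :]            # loadings: orthonormal rows
E = D - U @ Vt                # residuals not explained by N components

gram_U = U.T @ U              # diagonal if the scores are orthogonal
has_negative_loadings = bool((Vt < 0).any())
```

The loadings necessarily contain negative values here, which is one way to see that the solution, although unique, does not correspond to physically meaningful spectra or concentrations.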

Hard (white) vs. soft (black) modelling
Pros of HM:
- Well-defined behaviour model (useful chemical information).
- Unique solutions.
- Reduced number of parameters to be optimised (e.g., K, k, ...).
Pros of SM:
- No explicit model is required.
- Information on the process or signal may be used (constraints).
- May help to set or to validate a physicochemical model.

Hard (white) vs. soft (black) modelling
Cons of HM:
- The underlying model should be correct and completely known.
- No variations other than those related to the model should be present in the data set.
Cons of SM:
- Ambiguous solutions.
- Does not directly provide physicochemical (kinetic or thermodynamic, ...) information.

Use HM when:
- The variation of the system is completely described by a reliable physicochemical model.
- Examples: clean reaction systems (kinetic or thermodynamic processes).
Use SM when:
- The model describing the variation of the data is too complex, unknown or non-existent.
- Examples: images, chromatographic data, macromolecular processes.

Grey (hard + soft) modeling
Mixed systems with hard-modelable and soft-modelable parts are proposed:
- Hard model: kinetic process, equilibrium reaction, ...
- Soft model: interferent, background, drift, unknown, ...
Introducing a hard-model part decreases the ambiguity related to pure soft-modeling methods and gives additional information (parameters).
Introducing a soft-model part may help to clarify the nature of the physicochemical model and give more reliable results.

Multivariate Curve Resolution (MCR)
A tool to analyse (resolve) changes in composition and response in multicomponent systems.
Goal: knowing the identity and contribution of each pure compound (entity) in the process or in the mixture.
- Process: the composition changes in a continuous, evolutionary manner. E.g. chemical reactions, processes, HPLC-DAD.
- Mixture: the composition changes with a random pattern of variation (e.g. series of independent samples) or with a non-random pattern of variation (e.g. environmental data, spectroscopic images).

Multivariate Curve Resolution
The mixed information in D (e.g. retention times t_R × wavelengths) is resolved into pure component information:
- C: pure concentration profiles (chemical model, process evolution, compound contribution).
- S^T: pure signals (compound identity).

Multivariate Curve Resolution methods: $D = C S^T + E$
- Investigation of chemical reactions (kinetics, equilibria, ...) using multivariate measurements (spectrometric, ...).
- Industrial processes (blending, syntheses, ...).
- Macromolecular processes.
- Biochemical processes (protein folding).
- Spectroscopic images.
- Mixture analysis (in general).
- Hyphenated separation techniques (HPLC-DAD, GC-MS, CE-DAD, ...).
- Environmental data (model of pollution sources).
- ...

Multivariate Curve Resolution
Bilinear model (factor analysis model): $D = C S^T + E$, where D is I × J, C is I × N, S^T is N × J, E is I × J, and N << I or J.
Non-unique solutions but with physical meaning (rotational and intensity ambiguities are present).
Constraints:
- C and S^T non-negative.
- C or S^T scaled (normalization, closure).
- Other constraints (unimodality, local rank, selectivity, previous knowledge, ...).
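The rotational and intensity ambiguities can be made concrete with a small numerical sketch: for any invertible matrix T, the pair (C T, T^-1 S^T) reproduces D exactly as well as (C, S^T). The profile values and the two matrices T below are hypothetical and chosen only to illustrate the point.

```python
import numpy as np

# Hypothetical two-component system.
C = np.array([[1.0, 0.0], [0.6, 0.4], [0.2, 0.8], [0.0, 1.0]])
S_T = np.array([[1.0, 0.5, 0.1], [0.1, 0.6, 1.0]])
D = C @ S_T

def transform(C, S_T, T):
    """Return the equivalent pair (C T, T^-1 S^T)."""
    return C @ T, np.linalg.inv(T) @ S_T

def non_negative(C, S_T, eps=1e-12):
    return bool((C >= -eps).all() and (S_T >= -eps).all())

# Rotational ambiguity: profiles are mixed, the fit is unchanged.
T_rot = np.array([[1.0, 0.2], [0.1, 1.0]])
C_rot, S_rot_T = transform(C, S_T, T_rot)
feasible_rot = non_negative(C_rot, S_rot_T)

# Intensity ambiguity: profiles are only rescaled.
T_scale = np.diag([2.0, 0.5])
C_scale, S_scale_T = transform(C, S_T, T_scale)
feasible_scale = non_negative(C_scale, S_scale_T)
```

In this example the rotated solution violates non-negativity and would be discarded by that constraint, whereas the rescaled solution remains feasible until a scaling constraint (normalization or closure) is applied.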

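Multivariate Curve Resolution Alternating Least Squares (MCR-ALS): extension to multiple data matrices
Column-wise augmented data matrix (NM = 3 matrices):
$\begin{bmatrix} D_1 \\ D_2 \\ D_3 \end{bmatrix} = \begin{bmatrix} C_1 \\ C_2 \\ C_3 \end{bmatrix} S^T$
- D_1, D_2 and D_3 have NR_1, NR_2 and NR_3 rows, respectively, and NC columns each.
- Row-, concentration profiles: C_1, C_2, C_3.
- Column-, spectra profiles: S^T.
- Z: quantitative information.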
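A minimal sketch of column-wise augmentation follows, assuming three simulated matrices that share the same spectra. The sizes, the random profiles and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
NC, N = 30, 2                 # number of columns (wavelengths) and components
NR = [10, 15, 20]             # rows of D1, D2, D3 (NM = 3)

S_T = rng.random((N, NC))                          # spectra common to all matrices
C_true = [rng.random((nr, N)) for nr in NR]        # one concentration block per matrix
D_list = [C_k @ S_T for C_k in C_true]

# Column-wise augmentation: stack the matrices on top of each other.
D_aug = np.vstack(D_list)                          # (NR1 + NR2 + NR3) x NC

# With S^T known, one least-squares step gives the augmented C.
C_aug = D_aug @ np.linalg.pinv(S_T)
C1, C2, C3 = np.split(C_aug, np.cumsum(NR)[:-1])
```

Because all blocks are resolved against the same S^T, the profiles in C1, C2 and C3 are on a common scale, which is what allows relative quantitative comparison between matrices.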

Advantages of matrix augmentation (multiway data)
- Resolution local rank conditions are achieved in many situations for well-designed experiments (unique solutions!).
- Rank deficiency problems can be more easily solved.
- Unique decompositions are easily achieved for trilinear data (trilinear constraints).
- Constraints (local rank/selectivity and natural constraints) can be applied independently to each component and to each individual data matrix.
J. of Chemometrics 1995, 9,
Chemometrics and Intell. Lab. Systems, 1995, 30, 133

Multivariate Curve Resolution – Alternating Least Squares (MCR-ALS)
The aim is the optimal description of the experimental data using chemically meaningful pure profiles.
Data exploration:
1. Determination of the number of components (e.g. by SVD).
2. Building of initial estimates (C or S^T).
Input of external information as constraints:
3. Iterative optimisation of C and/or S^T by Alternating Least Squares (ALS) subject to constraints.
Fit and validation:
4. Check for satisfactory C S^T data reproduction.
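The first step, estimating the number of components by SVD, can be sketched as follows. The largest-gap rule between consecutive singular values is only one simple heuristic and is an assumption here, as are the simulated data and noise level.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: three components plus low-level noise.
C_sim = rng.random((50, 3))
S_sim = rng.random((3, 80))
D = C_sim @ S_sim + 1e-4 * rng.standard_normal((50, 80))

s = np.linalg.svd(D, compute_uv=False)    # singular values, largest first

# Heuristic: the largest ratio between consecutive singular values
# marks the transition from chemical signal to noise.
ratios = s[:-1] / s[1:]
n_components = int(np.argmax(ratios)) + 1
```

On real data the gap is often less clear, so the singular values are usually inspected together with chemical knowledge of the system.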

An algorithm for bilinear Multivariate Curve Resolution models: Alternating Least Squares (MCR-ALS)
- Initial estimates of C or S^T are obtained from EFA or from pure variable detection methods.
- C and S^T are obtained by iteratively solving the two LS problems: $\min_{C} \lVert D - C S^T \rVert$ for fixed S^T, and $\min_{S^T} \lVert D - C S^T \rVert$ for fixed C.
- Optional constraints (local rank, non-negativity, unimodality, closure, ...) are applied at each iteration.
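A minimal sketch of the alternating least-squares loop is given below. Two simplifications are assumptions of this example and not part of the method as presented: non-negativity is imposed by clipping negative values to zero after each unconstrained least-squares step (rather than by a non-negative least-squares solver), and the initial estimate is a perturbed copy of the simulated true profiles instead of an EFA or pure-variable estimate. The function name `mcr_als` and the lack-of-fit definition are also hypothetical.

```python
import numpy as np

def mcr_als(D, C0, n_iter=200, tol=1e-9):
    """Alternate the two least-squares steps, clipping negatives after each."""
    C = C0.copy()
    ssq_old = np.inf
    for _ in range(n_iter):
        # Solve D = C S^T for S^T, then apply non-negativity.
        S_T = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        # Solve D^T = S C^T for C^T, then apply non-negativity.
        C = np.clip(np.linalg.lstsq(S_T.T, D.T, rcond=None)[0].T, 0.0, None)
        ssq = float(((D - C @ S_T) ** 2).sum())
        if abs(ssq_old - ssq) <= tol * max(ssq, 1e-30):
            break
        ssq_old = ssq
    lof = 100.0 * np.sqrt(ssq / float((D ** 2).sum()))   # lack of fit in %
    return C, S_T, lof

def gauss(x, mu, w):
    return np.exp(-((x - mu) / w) ** 2)

# Hypothetical two-component data set.
t = np.linspace(0.0, 1.0, 40)
wl = np.linspace(0.0, 1.0, 60)
C_true = np.column_stack([gauss(t, 0.35, 0.15), gauss(t, 0.60, 0.15)])
S_true = np.vstack([gauss(wl, 0.3, 0.2), gauss(wl, 0.7, 0.2)])
D = C_true @ S_true

rng = np.random.default_rng(3)
C0 = np.abs(C_true + 0.05 * rng.standard_normal(C_true.shape))

C, S_T, lof = mcr_als(D, C0)
```

A low lack of fit only shows that C S^T reproduces D; because of the ambiguities discussed earlier, it does not by itself guarantee that the resolved profiles are the true ones.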

Constraints
Definition: any chemical or mathematical feature obeyed by the profiles of the pure compounds in our data set.
- C and S^T can be constrained differently.
- The profiles within C and S^T can be constrained differently.
Constraints transform resolution algorithms into problem-oriented data analysis tools.

Soft constraints: non-negativity
Applies to concentration profiles and spectra.
[Figure: unconstrained profiles C* and constrained profiles C_c vs. retention time.]

Soft constraints: unimodality
Applies to reaction profiles, chromatographic peaks and voltammograms.
[Figure: unconstrained profiles C* and constrained profiles C_c vs. retention time.]
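The two soft constraints above can be sketched as simple operations on a single profile. The implementations are assumptions chosen for brevity: non-negativity by clipping, and unimodality by cutting any secondary rise on either side of the main maximum. Other variants exist (for instance averaging instead of cutting), and the function names are hypothetical.

```python
import numpy as np

def non_negative(c):
    """Set negative values to zero."""
    return np.clip(c, 0.0, None)

def unimodal(c):
    """Keep a single maximum: moving away from the peak, values may not rise."""
    c = np.array(c, dtype=float)
    peak = int(np.argmax(c))
    for i in range(peak + 1, c.size):          # right of the peak
        if c[i] > c[i - 1]:
            c[i] = c[i - 1]
    for i in range(peak - 1, -1, -1):          # left of the peak
        if c[i] > c[i + 1]:
            c[i] = c[i + 1]
    return c

# Hypothetical unconstrained profile C* with a negative value and two extra bumps.
c_star = np.array([0.0, 1.0, 3.0, 2.0, 2.5, 1.0, 0.5, 0.8, -0.1])
c_c = unimodal(non_negative(c_star))
```

In an ALS loop such functions would be applied to each constrained column of C (or row of S^T) at every iteration.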

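Selectivity/local rank: concentration selectivity/local rank constraint
In the regions where a species is known to be absent, its constrained profile is forced to satisfy C_c < threshold values.
We know that this region is not rank 3, but rank 2!
[Figure: unconstrained profiles C* and constrained profiles C_c vs. retention time.]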
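A minimal sketch of a selectivity/local rank constraint follows, assuming that the absence of a species in a region is encoded as a boolean mask and that the threshold is zero. The mask, the threshold value and the random profiles are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
C_star = rng.random((8, 3)) + 0.1        # hypothetical unconstrained profiles, 3 species

# Assumed prior knowledge: species 3 is absent in the first three rows,
# so that region has rank 2 rather than rank 3.
absent = np.zeros(C_star.shape, dtype=bool)
absent[:3, 2] = True
threshold = 0.0

# Where a species is known to be absent, cap its concentration at the threshold.
C_c = np.where(absent, np.minimum(C_star, threshold), C_star)

local_rank = int(np.linalg.matrix_rank(C_c[:3]))
```

In practice the mask would come from local rank analysis (e.g. EFA) or from knowledge of the experiment.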

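Concentration correlation constraint (multivariate calibration)
[Figure: within the ALS loop on D, the concentration profile c obtained by ALS is selected, a local model (b, b_0) is built, and the updated profile is returned to C before S^T is recalculated.]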
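One possible reading of the correlation constraint is sketched below, under stated assumptions: the ALS concentrations of the calibration samples are regressed against their known reference concentrations with a univariate linear model (slope b, offset b_0), and the model is then used to rescale the ALS concentrations of all samples. The function name, the sample indices and the numerical values are hypothetical.

```python
import numpy as np

def correlation_constraint(c_als, cal_idx, c_ref):
    """Fit c_ref = b * c_als + b0 on calibration samples; update all samples."""
    b, b0 = np.polyfit(c_als[cal_idx], c_ref, 1)   # slope first, then offset
    return b * c_als + b0, b, b0

# Hypothetical analyte profile from ALS, in arbitrary units.
c_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
c_als = 0.5 * c_true + 0.1

cal_idx = np.array([0, 2, 4])          # samples with known reference concentration
c_ref = c_true[cal_idx]

c_updated, b, b0 = correlation_constraint(c_als, cal_idx, c_ref)
```

The updated profile is expressed in real concentration units, which is what makes quantitation of the unknown samples possible.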

Extension of MCR-ALS to multilinear systems: trilinearity constraint (flexible to every species)
For the column-wise augmented matrix $\begin{bmatrix} D_1 \\ D_2 \\ D_3 \end{bmatrix} = C S^T$:
1. Selection of the species profile in C.
2. Folding of the species profile.
3. PCA/SVD of the folded profile: the 1st score gives the common shape; the loadings give the relative amounts!
4. Unfolding of the species profile.
5. Substitution of the species profile in C.
Unique solutions!
R. Tauler, I. Marqués and E. Casassas. Journal of Chemometrics, 1998; 12, 55-75
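The folding, SVD and unfolding steps can be sketched for one species as follows. The sketch assumes that all matrices have the same number of rows and that the augmented profile stacks them one after another; the function name and the simulated profile are hypothetical.

```python
import numpy as np

def trilinearity_constraint(c_aug, n_matrices):
    """Replace one augmented species profile by its rank-one (trilinear) version."""
    # Fold: one column per data matrix, one row per time/elution point.
    folded = c_aug.reshape(n_matrices, -1).T
    U, s, Vt = np.linalg.svd(folded, full_matrices=False)
    # First component only: common shape (score) times relative amounts (loadings).
    rank_one = s[0] * np.outer(U[:, 0], Vt[0])
    # Unfold back to the augmented layout.
    return rank_one.T.reshape(-1)

# Hypothetical species: same shape in three matrices, different amounts.
x = np.linspace(0.0, 1.0, 20)
shape = np.exp(-((x - 0.5) / 0.15) ** 2)
amounts = [1.0, 2.0, 0.5]
c_exact = np.concatenate([a * shape for a in amounts])

rng = np.random.default_rng(6)
c_noisy = c_exact + 0.02 * rng.standard_normal(c_exact.size)

c_same = trilinearity_constraint(c_exact, 3)     # already trilinear: unchanged
c_tri = trilinearity_constraint(c_noisy, 3)      # forced to a common shape
```

Because the function acts on one column of C at a time, it can be applied to some species and not to others, which matches the "flexible to every species" remark on the slide.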

Hard modeling constraints: mass balance or closure constraint
Mass balance in closed reaction systems: $\sum_i c_i = c_{total}$.
[Figure: unconstrained profiles C* and constrained profiles C_c vs. pH, with c_total.]
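A minimal sketch of the closure constraint: each row of C is rescaled so that the species concentrations add up to the total concentration. Rescaling by the row sum is one simple way to impose closure and is an assumption here, as are the function name and the example values; row sums are assumed to be positive.

```python
import numpy as np

def closure(C, c_total):
    """Rescale each row of C so that its sum equals c_total (scalar or per row)."""
    c_total = np.broadcast_to(np.asarray(c_total, dtype=float), (C.shape[0],))
    return C * (c_total / C.sum(axis=1))[:, None]

rng = np.random.default_rng(7)
C_star = rng.random((5, 3)) + 0.1               # hypothetical unconstrained profiles

C_c = closure(C_star, 2.0)                      # same total at every pH
C_c_var = closure(C_star, [1.0, 1.0, 2.0, 2.0, 3.0])   # total changes per row
```

The per-row form covers experiments in which the total concentration changes, for example through dilution during a titration.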

Hard modeling constraints: mass action law and rate laws
The profiles C are constrained (C_cons) to obey a physicochemical model: kinetic processes, equilibrium processes.
[Figure: C and C_cons vs. pH.]

Grey modeling using MCR-ALS: soft + hard modeling constraints
The hard model is introduced as a new and essential constraint in the soft-modelling resolution process. It is applied in a flexible manner, like the soft-modelling constraints:
- To some or to all process profiles.
- To some or to all matrices in a three-way data set.
- Different hard models can be applied to different matrices in a three-way data set.

Grey modeling using MCR-ALS
- Soft model (SM): non-negativity, giving C_SM.
- Hard model (HM): physicochemical model (mass action law, rate law) for kinetic or equilibrium processes, giving C_HM.
[Figure: C and C_cons vs. pH.]

Grey modeling using MCR-ALS
1. Select the soft-modelled profiles to be constrained (C_SM).
2. Non-linear fit of the selected profiles according to the selected hard model: min(ssq(C_SM - C_HM)), where ssq = f(C_SM, model, parameters).
3. Update the soft-modelled profiles C_SM with the fitted C_HM.
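The three steps above can be sketched as a single constraint function. Everything specific in this example is an assumption: the hard model (first-order A -> B -> C with k1 != k2 and unit initial concentration), the plain Gauss-Newton fit with step halving, the simulated soft profiles and the extra drift column that is left soft-modelled.

```python
import numpy as np

def kinetic_model(k, t, a0=1.0):
    """C_HM for A -> B -> C (first order, k1 != k2)."""
    k1, k2 = k
    a = a0 * np.exp(-k1 * t)
    b = a0 * k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    return np.column_stack([a, b, a0 - a - b])

def fit_hard_model(C_sm, t, k0, n_iter=50, h=1e-7):
    """Minimise ssq(C_SM - C_HM(k)) by Gauss-Newton with step halving."""
    k = np.asarray(k0, dtype=float)

    def res(k_trial):
        return (C_sm - kinetic_model(k_trial, t)).ravel()

    r = res(k)
    for _ in range(n_iter):
        J = np.column_stack([(res(k + h * e) - r) / h for e in np.eye(k.size)])
        delta = np.linalg.lstsq(J, -r, rcond=None)[0]
        step = 1.0
        while step > 1e-6:
            r_new = res(k + step * delta)
            if r_new @ r_new < r @ r:
                break
            step /= 2.0
        else:
            break                       # no further improvement: stop
        k, r = k + step * delta, r_new
    return k, kinetic_model(k, t)

# Hypothetical ALS output: three noisy kinetic profiles plus a drift column.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 20.0, 41)
k_true = (0.6, 0.15)
C_als = np.column_stack([
    kinetic_model(k_true, t) + 0.01 * rng.standard_normal((t.size, 3)),
    0.02 * t,
])

hm_cols = [0, 1, 2]                                   # step 1: select C_SM
k_fit, C_hm = fit_hard_model(C_als[:, hm_cols], t, k0=(0.4, 0.1))   # step 2: fit
C_grey = C_als.copy()
C_grey[:, hm_cols] = C_hm                             # step 3: update with C_HM
```

Inside MCR-ALS this update would be repeated at every iteration, so the rate constants are refined together with the soft-modelled profiles and the spectra.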

Grey modeling approaches to investigate chemical processes
Examples:
1. Getting kinetic and analytical information from mixed systems (drift and interferents)
2. Using a physicochemical model to decrease resolution ambiguity and getting analytical information
3. pH-induced transitions in hemoglobin

Grey modeling applications using MCR-ALS. Example 1: getting kinetic information from mixed systems (drift and interferents)
Model: consecutive irreversible reaction (A, B, C) with k_1 = k_2 = 1.
Data sets: kinetic process + drift (D_d) and kinetic process + interferent (D_i).
[Figure: concentration vs. time and absorbance vs. wavelength for A, B, C, the drift (d) and the interferent (i).]
Anna de Juan, Marcel Maeder, Manuel Martínez, Romà Tauler. Analytica Chimica Acta 442 (2001) 337–350.

Kinetic process + drift/interferent
- A, B, C: hard-modeled (HM) with the kinetic model, C_HM = f(k_1, k_2).
- Drift, interferent: soft-modeled (SM).

Grey modeling applications using MCR-ALS. Example 1: getting kinetic information from mixed systems (drift and interferents)
[Figure: concentration (a.u.) vs. time and absorptivities (a.u.) vs. wavelength channel obtained by HM and HSM for (a) kinetic process + drift and (b) kinetic process + interferent.]
[Table: columns System, Algorithm, k_1 = 1, k_2 = 1; rows A, B, C (drift) with HM and HSM, and A, B, C (interferent) with HM and HSM.]

Anna de Juan, Marcel Maeder, Manuel Martínez, Romà Tauler. Chemometrics and Intelligent Laboratory Systems –141

Grey modeling applications using MCR-ALS. Example 2: using a physicochemical model to decrease resolution ambiguity and getting analytical information
- Chemical problem: multiequilibria systems. Quantitation of an analyte (H2A, malic acid) in the presence of an interferent (H2B, tartaric acid).
- Measurements: FT-IR monitored pH titrations.
- Highly overlapped concentration profiles.

Grey modeling applications using MCR-ALS. Example 2: using a physicochemical model to decrease resolution ambiguity and getting analytical information
- Data set: standard (H2A) and sample (H2A/H2B), each followed as a function of pH.
- Concentration profiles are too correlated and spectra are too overlapped: the SM solutions are too ambiguous and quantitation fails.

Grey modeling applications using MCR-ALS. Example 3: time effect on pH-induced transitions in hemoglobin
Time effect on pH transitions (UV), soft modeling (SM). Resolved species:
- 1, 2: heme group unbound.
- 3: native.
- 4: heme bound (change in coordination).
[Figure: profiles vs. pH.]

Grey modeling applications using MCR-ALS. Example 3: time effect on pH-induced transitions in hemoglobin
- Time-dependent acidic conformations evolve very similarly with pH (rank deficiency).
- The kinetic matrix helps to resolve the acidic conformations in the pH-dependent process.
- A hard-modelling constraint applied to the kinetic process leads to a less ambiguous recovery of the acidic conformations in the pH-dependent process.
[Figure: global description of the process, D = C S^T, combining the time-dependent data (HM) and the pH-dependent data after 48 hours (SM).]

Grey modeling applications using MCR-ALS. Example 3: time effect on pH-induced transitions in hemoglobin
- All the pH-dependent conformations can be resolved, even the time-dependent ones.
- Additional kinetic information is obtained: k 1 = 1.424e e -8
[Figure: complete description (SM + HM); spectra vs. wavelength (nm).]

Some references: soft + hard (grey) modelling
- A. de Juan, M. Maeder, M. Martínez, R. Tauler. Chemom. Intell. Lab. Sys. 54 (2000) 123.
- A. de Juan, M. Maeder, M. Martínez, R. Tauler. Anal. Chim. Acta 442 (2001) 337.
- J. Diewok, A. de Juan, R. Tauler, B. Lendl. Applied Spectroscopy, 2002, 56,
- J. Diewok, A. de Juan, M. Marcel, R. Tauler, B. Lendl. Analytical Chemistry, 2003, 76, 641-7.

Acknowledgements
Chemometrics Group (UB and IIQAB-CSIC):
- Staff: Romà Tauler, Javier Saurina, Anna de Juan, Raimundo Gargallo
- Post-doc: Montse Vives, Mónica Felipe
- PhD: Susana Navea, Joaquim Jaumot, Emma Peré-Trepat, Elisabeth Teixido
- Master: Silvia Termes, Silvia Mas, Gloria Muñoz, Marta Terrado, Xavier Puig
Manel Martínez (University of Barcelona, Spain)
Marcel Maeder (University of Newcastle, Australia)
Josef Diewok (University of Vienna, Austria)