1
Multivariate Resolution in Chemistry
Lecture 1 Roma Tauler IIQAB-CSIC, Spain
2
Lecture 1 Introduction to data structures and soft-modelling methods.
Factor Analysis of two-way data: Bilinear models. Rotation and intensity ambiguities. Pseudo-rank, local rank and rank deficiency. Evolving Factor Analysis.
3
Chemical sensors and analytical data structures
- one variable x1, e.g. pH
- two variables x1, x2, e.g. pH and T
- three variables x1, x2 and x3, e.g. pH, T and P
- n variables: ?
[Figure: data points plotted on one axis (pH), two axes (pH, T) and three axes (pH, T, P)]
4
Data Structures: Zero order Zero-way data
x; one sample gives one scalar (tensor of order 0)
Examples:
- selective electrodes, pH
- absorption at one wavelength
- height/area of a chromatographic peak
Assumptions:
- total selectivity
- known linear response
Tools:
- univariate algebra and statistics
Advantages:
- simple and easy to understand
Disadvantages:
- information on only one compound
- total selectivity is required
- one sensor for every analyte
- low information content
[Figure: signal height h_i read against time; calibration plot of h_i against concentration c_i]
5
Data Structures: First order One-way data
x1, x2, ..., xn; one sample gives one vector (tensor of order 1)
Examples:
- array of sensors
- absorption at many wavelengths (spectra)
- chromatograms at a single wavelength
- current intensities at many potentials E
- readings with time (kinetics)
Assumptions:
- known linear responses
- different and independent responses
Tools:
- linear algebra
- multivariate statistics
- spectral analysis
- chemometrics (PCA, MLR, PCR, PLS, ...)
Advantages:
- calibration in the presence of interferences is possible
- multicomponent analysis is possible
Disadvantages:
- interferences must be present in the calibration samples
[Figure: spectrum (absorbance versus wavelength, nm); chromatogram (signal versus time, min)]
6
Data Structures: Second order / Two-way data
x_ij; each sample gives a data table/matrix (tensor of order 2)
$X = \sum_k x_k y_k^T$
Examples:
- LC-DAD, LC-FTIR, GC-MS, LC-MS, FIA-DAD, CE-MS, ... (hyphenated techniques)
- excitation/emission spectra (fluorescence)
- MS/MS, 2D NMR, GCxGC-MS, ...
- spectroscopic/voltammetric monitoring of chemical reactions/processes with pH, time, T, etc.
Assumptions:
- linear responses
- sufficient rank (of the data matrices)
Tools:
- linear algebra
- chemometrics
Advantages:
- calibration for the analyte is possible in the presence of interferences not modelled in the calibration samples
- full characterization of the analyte and interferents may be possible
- few calibration samples are needed (even one-sample calibration)
7
Multi-way data analysis multivariate resolution
Data Structures: Third order / Three-way data
x_ijk; each sample gives a data cube (tensor of order 3)
$\underline{X} = \sum_k x_k \otimes y_k \otimes z_k$
Examples:
- several spectroscopic matrices
- several hyphenated chromatographic runs
- hyphenated multidimensional chromatography (GC x GC / MS)
- excitation/emission/time
Assumptions:
- bilinear/trilinear responses
- sufficient rank (of the data matrices)
Tools:
- multilinear and tensor algebra
- chemometrics: multi-way data analysis (PARAFAC, GRAM), extended multivariate resolution
Advantages:
- unique solutions (no ambiguities)
- calibration for the analyte is possible in the presence of interferences not modelled in the calibration samples
- full characterization of the analyte and interferents is possible
- few calibration samples are needed (even one-sample calibration)
[Figure: data matrices D_i stacked along time and run number to form a data cube]
8
- 0th order data: ISE, pH, ...
- 1st order data: spectra
- 2nd order data: LC/DAD, GC/MS, fluorescence
- 3rd order data: time/excitation/emission
9
Examples: Monitoring chemical reactions
Chemical reaction systems monitored using spectroscopic measurements (even at the femtosecond scale) to follow the evolution of a reaction with time, pH, temperature, etc., and to detect the formation and disappearance of intermediate and transient species.
[Figure: experimental set-up with peristaltic pump, pH meter, spectrophotometer, thermostatic bath (T = 37 °C) and computer; data matrix D(NR,NC), pH versus wavelength]
10
Examples: Process analysis
Quality control and optimisation of industrial batch reactions and processes, where on-line measurements are applied to monitor the process.
[Figure: spectrometer probe connected to a computer; data matrix D(NR,NC), time versus wavelength]
11
Examples: Chromatographic hyphenated techniques
Analytical characterisation of complex environmental, industrial and food mixtures using hyphenated techniques (chromatography or continuous flow methods with spectroscopic detection): LC-DAD, GC-MS, LC-MS, LC-MS/MS, ...
[Figure: data matrix D(NR,NC), time versus wavelength]
12
Examples: FIA-DAD-UV with a pH gradient for the analysis of a mixture of drugs.
[Figure: data matrix D(NR,NC), pH versus wavelength]
13
Examples: Excitation-emission (fluorescence) EEM techniques
Analytical characterisation of complex sea-water samples by means of excitation-emission spectra, e.g. for an unknown sample containing triphenyltin (in the reaction with flavonol).
[Figure: data matrix D(NR,NC), excitation versus emission]
14
Examples: Conformation changes
Protein folding and dynamic protein-nucleic acid interaction processes. In the post-genomic era, understanding these complex, evolving biochemical processes is one of the main challenges of current proteomics research.
- Primary structure: amino acids (Val, Leu, Ser, Ala, Asp, Trp, Gly, His)
- Secondary structure: helix and sheet formation (α-helix, β-sheet, turn, random coil)
- Tertiary structure: globule formation
- Quaternary structure: assembled subunits
[Figure: data matrix D(NR,NC), temperature versus wavelength]
15
Examples: Spectroscopic image analysis
Image analysis of spatially distributed chemicals on 2D surfaces, measured using coupled microscopy-spectroscopy techniques in geological samples, biological tissues or food samples.
[Figure: image of x by y pixels; total number of pixels = x × y]
16
Data Structures in Chemistry: two orders/ways/modes of measurement
Experimental data D(NR,NC):
- row order (way, mode): usually a change in chemical composition (concentration order)
- column order (way, mode): usually a change in system properties, as in spectroscopy, voltammetry, ... (spectral order)
17
Chemical data tables (two-way data)
Data table or matrix D of instrumental measurements (spectra, voltammograms, ...):
- columns: J variables (wavelengths)
- rows: I spectra (times), recorded as concentration changes with time, temperature, pH, ...
[Figure: plot of the spectra (rows of D); plot of the elution profiles (columns of D)]
18
Chemical data modelling
Chemical data modelling methods may be divided into:
- hard-modelling methods (deterministic)
- soft-modelling methods (data driven)
- hybrid hard-soft modelling methods
[Figure: hard modelling goes from a physical model through the data to analytical information; soft modelling goes from the data through a data-driven soft model to analytical information and, possibly, a physical model]
19
Hard-modelling
- Hard-modelling approaches for chemical (stationary, dynamic, evolving, ...) systems are based on an accurate physical description of the system and on the solution of complex systems of (differential) equations that are fitted to the experimental measurements describing the evolution and dynamics of these systems. They are deterministic models.
- Hard-modelling methods usually use non-linear least-squares regression (Marquardt algorithm) and optimisation methods to find the best values for the parameters of the model.
- Hard-modelling usually deals with univariate data. It was the prevailing approach until modern instrumentation and computers began to deliver large amounts of data.
- Hard-modelling is often successful for laboratory experiments, where all the variables are under control and the physicochemical nature of the dynamic model is known and can be fully described by a known mathematical model.
20
Hard-modelling
- However, even at the laboratory level, there are examples where hard-modelling requirements and constraints are not totally fulfilled, or where no physicochemical model is known to describe the process (e.g. chromatographic separations or protein folding experiments).
- Data sets obtained from the study of natural and industrial evolving processes are often too complex to analyse using hard-modelling methods. In these cases, no physical model is known or it is too complex to be set up in a general way.
- Advanced hard-modelling in industrial applications has attempted to model experimental difficulties such as changes in temperature, pH, ionic strength and activity coefficients. This is a very difficult task!
P. Gans, Data Fitting in the Chemical Sciences, John Wiley and Sons, New York, 1992
21
Hard modelling: $D = C S^T$
Non-linear model fitting: $C = f(k_1, k_2)$, $\min_{k_1,k_2} \lVert (I - C C^{+}) D \rVert$
Linear step, LS(D, C) → S^T: $S^T = C^{+} D$
Output: C, S and model parameters.
The model should describe all the variation in the experimental measurements.
[Figure: data matrix D (absorbance versus wavelength), concentration profiles C (concentration versus time) and pure spectra S^T (absorptivities versus wavelength)]
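The fitting criterion on this slide can be sketched in a few lines. The example below is only an illustration under stated assumptions: the kinetic model (consecutive first-order reactions A → B → C), the rate constants, the simulated spectra and the function names `conc_profiles` and `residual_ssq` are all hypothetical and not taken from the lecture. It evaluates the residual sum of squares of (I − C C⁺) D for a trial pair (k1, k2); a non-linear optimiser such as Marquardt's would minimise this value.

```python
import numpy as np

def conc_profiles(k1, k2, t):
    """Hypothetical hard model: A -> B -> C, first order, [A]0 = 1 (k1 != k2)."""
    a = np.exp(-k1 * t)
    b = k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    c = 1.0 - a - b
    return np.column_stack([a, b, c])

def residual_ssq(k, t, D):
    """Sum of squares of D - C C+ D for trial rate constants k = (k1, k2)."""
    C = conc_profiles(k[0], k[1], t)
    St = np.linalg.pinv(C) @ D          # linear step: S^T = C+ D
    R = D - C @ St                      # residuals (I - C C+) D
    return float(np.sum(R ** 2))

# Simulated, noise-free data (assumed values, for illustration only)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
S_true = rng.random((3, 30))
D = conc_profiles(0.8, 0.3, t) @ S_true

ssq_true = residual_ssq((0.8, 0.3), t, D)    # essentially zero
ssq_wrong = residual_ssq((0.2, 0.1), t, D)   # clearly larger
```

Only the non-linear parameters are searched; the spectra are eliminated by the linear least-squares step at each trial.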
22
Soft-modelling
- Soft-modelling, instead, attempts to describe these systems without postulating an a priori physical or (bio)chemical model.
- The goal of these methods is to explain the variation observed in the systems using minimal and softer assumptions about the data. They are data-driven models.
- Soft models usually give an improved analytical description of the analysed process.
- Soft modelling needs more data than hard-modelling.
- Soft-modelling methods deal with multivariate data. Their use has increased in recent years because modern analytical instrumentation and computers provide large amounts of data.
- The disadvantage of soft models is their poorer extrapolating capability (compared with hard-modelling).
23
Soft-modelling
- A soft model can hardly predict the behaviour of the system under conditions very different from those under which it was derived.
- Complex multivariate soft-modelling data analysis methods, such as those derived from Factor Analysis, have been introduced for the study of chemical processes/systems.
- Factor Analysis is a multivariate technique for reducing matrices of data to their lowest dimensionality by the use of orthogonal factor space and transformations that yield predictions and/or recognizable factors.
E.R. Malinowski, Factor Analysis in Chemistry, 3rd Edition, Wiley, New York, 2002
24
Constrained ALS optimisation
Soft modelling: $D = C S^T$
Constrained ALS optimisation:
1. LS(D, C) → S*
2. LS(D, S*) → C*
3. repeat until $\min \lVert D - C^{*} S^{*T} \rVert$ is reached
Output: C and S.
All absorbing contributions in and out of the process are modelled.
[Figure: data matrix D (absorbance versus wavelength), concentration profiles C (concentration versus time) and pure spectra S^T (absorptivities versus wavelength)]
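The two alternating least-squares steps on this slide can be sketched as follows. This is a minimal illustration, not the MCR-ALS implementation referred to in the lecture: the function name `mcr_als`, the simulated data and the way non-negativity is imposed (clipping negative values to zero after each unconstrained least-squares step, rather than a proper non-negative least-squares solver) are assumptions made for brevity.

```python
import numpy as np

def mcr_als(D, C0, n_iter=200):
    """Alternate LS(D, C) -> S* and LS(D, S*) -> C*, clipping negatives to zero."""
    C = C0.copy()
    for _ in range(n_iter):
        # LS(D, C) -> S*: solve C @ St = D for St
        St = np.linalg.lstsq(C, D, rcond=None)[0]
        St = np.clip(St, 0.0, None)        # crude non-negativity constraint
        # LS(D, S*) -> C*: solve St.T @ C.T = D.T for C
        C = np.linalg.lstsq(St.T, D.T, rcond=None)[0].T
        C = np.clip(C, 0.0, None)
    return C, St

# Simulated two-component system (assumed profiles, for illustration only)
t = np.linspace(0.0, 10.0, 50)
C_true = np.column_stack([np.exp(-0.5 * t), 1.0 - np.exp(-0.5 * t)])
w = np.linspace(0.0, 1.0, 40)
S_true = np.vstack([np.exp(-((w - 0.3) / 0.1) ** 2),
                    np.exp(-((w - 0.7) / 0.1) ** 2)])
D = C_true @ S_true

rng = np.random.default_rng(0)
C0 = C_true + 0.1 * rng.random(C_true.shape)   # perturbed initial estimate
C, St = mcr_als(D, C0)
lack_of_fit = np.linalg.norm(D - C @ St) / np.linalg.norm(D)
```

In practice the initial estimate would come from a method such as EFA or SIMPLISMA, and further constraints (unimodality, closure, selectivity) would be applied inside the loop.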
25
Lecture 1 Introduction to data structures and soft-modelling methods.
Factor Analysis of two-way data: Bilinear models. Rotation and intensity ambiguities. Pseudo-rank, local rank and rank deficiency. Evolving Factor Analysis.
26
Soft-modelling: Factor Analysis (Bilinear Model)
The experimental data are modelled as a linear sum of weighted (scores) factors (loadings): $d_{ij} = \sum_{n=1}^{N} c_{in} s_{nj} + e_{ij}$
In matrix form: $D = C S^T + E$ (data = scores × loadings + residuals)
27
Soft-modelling BILINEARITY
Assumption: bilinearity (the contributions of the components in the two orders of measurement are additive)
[Figure: the data matrix drawn as the sum of the contributions of components A, B, C, ... plus the error E]
28
Soft-modelling: GOALS OF BILINEAR MODEL
Recovery of the responses of every component (chemical species) in the different modes of measurement.
[Figure: data matrix shown as the product of the resolved profiles in the two modes]
29
Soft-modelling: Factor Analysis
[Diagram labels: Experimental Data Matrix; Principal Components; Cluster Analysis; Factor Identification; Target testing; Real Factor Models; Predictions]
30
Soft-modelling: Factor Analysis (traditional approach)
[Diagram labels: Data matrix; matrix multiplication; Covariance matrix; decomposition; Abstract Factors; abstract reproduction; abstract rotation; New Abstract Factors; target transformation; Real Factors; combination]
31
Soft-modelling methods (I)
Factor Analysis methods based on the use of latent variables or eigenvalue/singular value decompositions of the data matrix.
Examples:
- PCA, SVD, rotation FA methods
- Evolving Factor Analysis methods
- Rank Annihilation methods
- Window Factor Analysis methods
- Heuristic Evolving Latent Projections methods
- Subwindow Factor Analysis methods
- ...
32
Soft-modelling methods (II)
Multivariate Resolution methods decompose the data matrix into its 'pure' components without explicitly using latent variable analysis techniques.
Examples:
- SIMPLISMA
- Orthogonal Projection Approach (OPA)
- Positive Matrix Factorization methods (and Multilinear Engine extensions)
- Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS)
- Gentle
- ...
33
Soft-modelling methods (III)
Three-way and multiway methods, which decompose three-way or multiway data structures.
Examples:
- Multiway and multiset extensions of PCA
- Generalized Rank Annihilation (GRAM); Direct Trilinear Decomposition (DTD, TLD)
- Multiway and multiset extensions of MCR-ALS methods
- PARAFAC-ALS
- Tucker3-ALS
34
Soft-modelling: references
- E.R. Malinowski, Factor Analysis in Chemistry, 3rd Ed., John Wiley & Sons, New York, 2002
- I.T. Jolliffe, Principal Component Analysis, 2nd Ed., Springer, Berlin, 2002
- A. Smilde, R. Bro and P. Geladi, Multiway Analysis, Applications in the Chemical Sciences, John Wiley & Sons, New York, 2004
- P. Geladi, Multivariate Image Analysis, John Wiley and Sons, 1996
- A. de Juan, E. Casassas and R. Tauler, Soft modeling of Analytical Data, in Encyclopedia of Analytical Chemistry: Instrumentation and Applications, edited by R.A. Meyers, John Wiley & Sons, 2000, Vol. 11
35
Soft-modelling: data structures and types of models
- One-way data (vectors; d_i, i = 1,...,I samples): linear and non-linear models, $d_i = b_0 + b\,c_i$; $d_i = f_{\text{non-linear}}(c_i)$
- Two-way data (matrices; d_ij, I samples × J variables): bilinear and non-bilinear models. Non-bilinear data can still be linear in one of the two modes.
- Three-way data (cubes; d_ijk, I samples × J variables × K conditions): trilinear and non-trilinear models. Non-trilinear data can still be bilinear in two modes.
36
Soft-modelling: Bilinear models for two-way data
$d_{ij} = \sum_{n=1}^{N} c_{in} s_{nj} + e_{ij}$, with D of size I × J
- d_ij is the data measurement (response) of variable j in sample i
- n = 1,...,N indexes the components (species, sources, ...)
- c_in is the concentration of component n in sample i
- s_nj is the response of component n at variable j
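As a small check of the definitions above, the element-wise sum over components is the same thing as the matrix product C S^T. The sizes and random values below are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, N = 5, 4, 2                 # samples, variables, components (assumed sizes)
C = rng.random((I, N))            # c_in: concentration of component n in sample i
St = rng.random((N, J))           # s_nj: response of component n at variable j

D = C @ St                        # matrix form of the noise-free bilinear model

# Element-wise form: d_ij = sum over n of c_in * s_nj, here for i = 2, j = 3
d_23 = sum(C[2, n] * St[n, 3] for n in range(N))
```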
37
Soft-modelling: Bilinear models for two-way data
$D = U V^T + E$ or $D = C S^T + E$, with D and E of size I × J, U or C of size I × N, V^T or S^T of size N × J, and N << I or J
PCA: D = U V^T + E
- U orthogonal, V^T orthonormal
- V^T in the direction of maximum variance
- unique solutions, but without physical meaning
- useful for interpretation but not for resolution!
MCR: D = C S^T + E
- other constraints (non-negativity, unimodality, local rank, ...)
- C (in place of U) and S^T (in place of V^T) non-negative, ...
- C or S^T normalization
- non-unique solutions, but with physical meaning
- useful for resolution (and obviously for interpretation)!
38
PCA Model (Principal Component Analysis)
$D = U V^T + E$
- U: 'scores' matrix (orthogonal)
- V^T: 'loadings' matrix (orthonormal)
SVD Model (Singular Value Decomposition)
$D = U^{*} S V^T + E$
- U*: 'scores' matrix (orthonormal)
- S: diagonal matrix of the singular values s, with $s = \lambda^{1/2}$, where λ are the eigenvalues of the covariance matrix $D D^T$
- V^T: 'loadings' matrix (orthonormal)
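The relation between the two models on this slide can be verified numerically. The sketch below uses `numpy.linalg.svd` on an arbitrary random matrix (an assumption for illustration) and keeps all components, so E is zero; the matrix D Dᵀ is formed without mean-centring, as written on the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.random((8, 5))                         # arbitrary data matrix

# SVD model: D = U* S V^T with orthonormal U* and V^T
U_star, s, Vt = np.linalg.svd(D, full_matrices=False)

# PCA scores: U = U* S (orthogonal columns, no longer of unit length)
U = U_star * s

# Eigenvalues of D D^T, sorted in decreasing order; the largest ones equal s**2
eigvals = np.linalg.eigvalsh(D @ D.T)[::-1][: s.size]
```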
39
PCA Model: $D = U V^T + E$
- U: scores
- V^T: loadings (projections)
- E: unexplained variance
$D = u_1 v_1^T + u_2 v_2^T + \dots + u_n v_n^T + E$
Each term $u_k v_k^T$ is a rank-1 matrix; n is the number of components (<< number of variables in D).
40
PCA Model: $X = U V^T + E$ (X = structure + noise)
The model is an approximation to the experimental data matrix X.
- Loadings, projections (V^T): relationships between the original variables and the principal components (eigenvectors of the covariance matrix). The vectors in V^T (loadings) are orthonormal (orthogonal and normalized).
- Scores, targets (U): relationships between the samples (coordinates of the samples or objects in the space defined by the principal components). The vectors in U (scores) are orthogonal.
- Noise (E): experimental error, non-explained variance.
41
Summary of Principal Component Analysis PCA
1. Formulation of the problem to solve.
2. Plot of the original data.
3. Data pretreatment (data centering, autoscaling, logarithmic transformation, ...).
4. Build the PCA model. Determination of the number of components (graphical inspection of explained/residual variance plots).
5. Study of the PCA model. Multivariate data exploration:
   - 'loadings' plot ==> map of the variables
   - 'scores' plot ==> map of the samples
6. Interpretation of the PCA model. Identification of the main sources of data variance.
7. Analysis of the residuals matrix E = D − U V^T.
42
PCA example: D = U V^T (scores × loadings)
Data set D: pollutant concentrations [org]1 ... [org]96 (columns) at sampling sites 1 ... 22 (rows).
[Figure: scores plot of the 22 sites (PC1, 41% versus PC2, 27%); loadings plot of the 96 pollutants; biplot of scores and loadings]
43
Multivariate Curve Resolution (MCR)
D (mixed information; retention times tR × wavelengths) → C S^T (pure component information)
- C = [c_1 ... c_n], pure concentration profiles: chemical model, process evolution, compound contribution, relative quantitation
- S^T = [s_1 ... s_n]^T, pure signals: compound identity, source identification and interpretation
44
Lecture 1 Introduction to data structures and soft-modelling methods.
Factor Analysis of two-way data: Bilinear models. Rotation and intensity ambiguities. Pseudo-rank, local rank and rank deficiency. Evolving Factor Analysis.
45
Factor Analysis Ambiguities in the analysis of a data matrix (two-way data)
Rotation and scale/intensity ambiguities
Rotation ambiguities:
- Factor Analysis (PCA) data matrix decomposition: D = U V^T + E
- 'True' data matrix decomposition: D = C S^T + E
46
How to find the rotation matrix T?
Rotation ambiguities:
$D = U\,T\,T^{-1}\,V^T + E = C S^T + E$
$C = U T$; $S^T = T^{-1} V^T$
47
Matrix decomposition is not unique!
Rotation and scale/intensity ambiguities
$D = C S^T + E = D^{*} + E$
$C_{new} = C\,T$, with sizes (NR,N) = (NR,N)(N,N)
$S^T_{new} = T^{-1} S^T$, with sizes (N,NC) = (N,N)(N,NC)
$D^{*} = C S^T = C_{new} S^T_{new}$
T(N,N) is any non-singular matrix: there is rotational freedom for any T.
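A short numerical illustration of this non-uniqueness follows. The profiles and the matrix T are arbitrary assumptions; any non-singular T would behave the same way.

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.random((6, 2))                 # (NR, N) concentration profiles
St = rng.random((2, 4))                # (N, NC) spectra
T = np.array([[1.0, 0.3],
              [0.2, 1.0]])             # any non-singular (N, N) matrix

C_new = C @ T                          # C_new = C T
St_new = np.linalg.inv(T) @ St         # S^T_new = T^-1 S^T

D_star = C @ St
D_star_new = C_new @ St_new            # same D*, different factors
```

Both factor pairs reproduce D* equally well, so the fit alone cannot tell them apart. When T is diagonal, only the scale of each profile changes.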
48
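Rotation and scale/intensity ambiguities
Rotation ambiguities and rotation matrix T(N,N), for N = 2:
$c_{new,1} = t_{1,1}\,c_{old,1} + t_{2,1}\,c_{old,2}$
$c_{new,2} = t_{1,2}\,c_{old,1} + t_{2,2}\,c_{old,2}$
$s^T_{new,1} = t^{-1}_{1,1}\,s^T_{old,1} + t^{-1}_{1,2}\,s^T_{old,2}$
$s^T_{new,2} = t^{-1}_{2,1}\,s^T_{old,1} + t^{-1}_{2,2}\,s^T_{old,2}$
where $t_{i,j}$ are the elements of T and $t^{-1}_{i,j}$ the elements of T^{-1}.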
49
Intensity (scale) ambiguities:
Rotation and scale/intensity ambiguities
For any non-zero scalar k: $d_{ij} = \sum_{n} c_{in} s_{nj} = \sum_{n} (k\,c_{in})\left(\tfrac{1}{k}\,s_{nj}\right)$
Intensity/scale ambiguities make it difficult to obtain quantitative information. Once they are solved, quantitative information can also be obtained.
50
Rotation and scale/intensity ambiguities: intensity (scale) ambiguities
$c_{new} = k\,c_{old}$; $s_{new} = \tfrac{1}{k}\,s_{old}$
$c_{old}\,s_{old} = (k\,c_{old})\left(\tfrac{1}{k}\,s_{old}\right) = c_{new}\,s_{new}$
51
Rotation and scale/intensity ambiguities
Questions to answer:
- Is it possible to have unique solutions?
- What are the conditions for unique solutions?
- If totally unique solutions are not possible:
  - Is it still possible to find at least some of the possible solutions?
  - Is it possible to estimate the band or range of possible/feasible solutions?
  - How can this range of feasible solutions be reduced?
52
Lecture 1 Introduction to data structures and soft-modelling methods.
Factor Analysis of two-way data: Bilinear models. Rotation and intensity ambiguities. Pseudo rank, local rank and rank deficiency. Evolving Factor Analysis.
53
Definitions
- Mathematical rank of a data matrix: the minimum number of linearly independent rows or columns describing the variance of the whole data set, i.e. the minimum number of basis vectors spanning the row and column vector spaces. It may be obtained by SVD or PCA.
- Pseudo-rank or chemical rank: the mathematical rank in the absence of experimental error/noise. It is usually equal to the number of chemical/physical components contributing to the observed data variance apart from experimental noise/error. It is obtained from the number of larger components from PCA, SVD or other FA methods.
- Local rank: the chemical rank of data submatrices. It is obtained from EFA, EFF, SIMPLISMA, OPA or other FA submatrix analysis methods.
- Rank deficiency: the chemical rank is lower than the known number of contributions. Rank deficiency may be broken/solved by data matrix augmentation and perturbation strategies.
- Rank overlap: rank deficiency caused by equal vector profiles of different chemical/physical components in one or more modes.
54
Pseudo Rank: Number of contributions (factors, components)
Principal Component Analysis gives an abstract (orthogonal) bilinear model that optimally describes the variation in our data set.
$D = U V^T$, with $U = [u_1 \dots u_n]$ and $V = [v_1 \dots v_n]$
Useful chemical information: the size of the model (chemical rank) gives the number of chemical contributions.
55
Pseudo Rank: Number of contributions (factors, components)
Principal Component Analysis (SVD algorithm): $D = T P^T = U S V^T$
S is the diagonal matrix of singular values. The magnitude of a singular value reflects the importance of the contribution.
56
Pseudo Rank: Number of contributions (factors, components)
Principal Component Analysis (SVD algorithm): $D = T P^T = U S V^T$
S is the diagonal matrix of singular values; eigenvalue = (singular value)^2.
Plot the singular values or log(eigenvalues): large values are contributions, small values are noise.
[Figure: log(eigenvalues) versus number of components, showing 4 contributions]
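The reading of such a plot can be imitated numerically. In the sketch below, the simulated four-component data set, the noise level and the rule used to locate the break (the largest ratio between consecutive singular values) are assumptions for illustration; the slides themselves rely on visual inspection of the plot.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data: 4 Gaussian elution profiles, random spectra, small noise
t = np.arange(30)
centers = np.array([5.0, 12.0, 18.0, 25.0])
C = np.exp(-0.5 * ((t[:, None] - centers) / 3.0) ** 2)    # (30, 4)
St = rng.random((4, 20))                                  # (4, 20)
D = C @ St + 1e-4 * rng.standard_normal((30, 20))

s = np.linalg.svd(D, compute_uv=False)        # singular values, decreasing
log_eig = np.log10(s ** 2)                    # eigenvalue = (singular value)^2

# Pseudo-rank estimate: position of the largest drop between consecutive values
n_components = int(np.argmax(s[:-1] / s[1:])) + 1
```

With real data the break is often less sharp, and the estimate should be checked against chemical knowledge of the system.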
57
Pseudo Rank: Number of contributions (factors, components)
Overestimations of rank (overfitting):
- Large overestimation: the measurements may not follow a bilinear model.
- Small overestimation: presence of structured noise or high noise levels.
Underestimations of rank (rank deficiency):
- Linear dependencies.
- Contributions with very similar signals or concentration profiles.
- Compounds with non-measurable signals.
- Minor compounds.
58
Rank deficiency
Are all the signals distinguishable and independent? Are all the concentration profiles distinguishable and independent?
If not: rank-deficient systems, where detectable rank < number of process contributions.
Examples:
1) 2nd order reaction A = B + C with [B] = [C]: 3 chemical species/contributions, but rank = 2.
2) Enantiomer conversion monitored by UV, where the spectrum of D = the spectrum of L: two chemical species/components, but rank = 1 (rank overlap).
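Example 1 can be reproduced numerically. The decay profile used for A and the random spectra are assumptions for illustration; the only feature that matters for the rank is that [B] and [C] are identical at every time.

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 5.0, 30)

a = 1.0 / (1.0 + t)            # assumed decay profile for A
b = 1.0 - a                    # B is formed as A is consumed
c = b.copy()                   # [C] = [B] at every time
C = np.column_stack([a, b, c]) # three species

St = rng.random((3, 15))       # three different (random) spectra
D = C @ St

rank_C = np.linalg.matrix_rank(C)
rank_D = np.linalg.matrix_rank(D)
```

Although three species with three distinct spectra contribute, the data matrix only has rank 2 because two concentration profiles coincide.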
59
Rank deficiency: closed reaction systems
Some concentration profiles are described as linear combinations of others.
System HA / A-, HB / B-:
- C_A = [HA] + [A-]
- C_B = [HB] + [B-]
- C_B = k C_A
- [A-] = C_A − [HA]
- [B-] = C_B − [HB] = k C_A − [HB]
[HA], [HB], [A-], [B-] = f([HA], [HB], C_A): rank 3
60
Breaking rank-deficiency by matrix augmentation
Matrix augmentation in the rank-deficient direction: the data set of HA alone is appended to the data set of the HA / HB mixture along the pH direction.
In the augmented data set C_B ≠ k C_A, so [B-] = C_B − [HB] ≠ k C_A − [HB].
[HA], [HB], [A-], [B-] ≠ f([HA], [HB], C_A): rank 4
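The effect of augmentation can be checked with a simulated titration. Everything specific in this sketch is an assumption for illustration: the pKa values, the total concentrations, the random spectra and the helper name `species`. The mixture alone has rank 3; stacking the HA-only experiment on top of it restores rank 4.

```python
import numpy as np

rng = np.random.default_rng(5)
pH = np.linspace(2.0, 10.0, 40)
h = 10.0 ** (-pH)
Ka, Kb = 10.0 ** -4, 10.0 ** -7          # assumed acidity constants

def species(CA, CB):
    """Columns [HA], [A-], [HB], [B-] for total concentrations CA and CB."""
    HA = CA * h / (h + Ka)
    A = CA * Ka / (h + Ka)
    HB = CB * h / (h + Kb)
    B = CB * Kb / (h + Kb)
    return np.column_stack([HA, A, HB, B])

St = rng.random((4, 20))                 # four different (random) spectra

D_mix = species(1.0, 2.0) @ St           # mixture, C_B = k C_A with k = 2
D_HA = species(1.0, 0.0) @ St            # HA alone (no HB / B-)
D_aug = np.vstack([D_HA, D_mix])         # augmentation along the pH direction

rank_mix = np.linalg.matrix_rank(D_mix)
rank_aug = np.linalg.matrix_rank(D_aug)
```

The closure relation that links the four profiles in the mixture does not hold in the HA-only rows, which is what breaks the linear dependency.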
61
Lecture 1 Introduction to data structures and soft-modelling methods.
Factor Analysis of two-way data: Bilinear models. Rotation and intensity ambiguities. Pseudo-rank and rank deficiency. Local Rank and Evolving Factor Analysis.
62
Local exploratory analysis
Study of the variation of the number of contributions in the process or system, i.e. of the rank variation during the process.
- Evolving Factor Analysis (EFA)
- Fixed Size Moving Window - Evolving Factor Analysis (FSMW-EFA)
63
Evolving Factor Analysis
Stepwise chemometric monitoring of a process.
- Forward Evolving FA (from beginning to end)
- Backward Evolving FA (from end to beginning)
Working procedure: display of successive PCA analyses along gradually increasing data set windows.
64
Evolving Factor Analysis
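HPLC-DAD example
[Figure: data matrix D with retention times along the rows and wavelengths along the columns; each row is a spectrum and each column is a chromatogram]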
65
Evolving Factor Analysis
Forward Evolving Factor Analysis
[Figure: log(eigenvalues) from successive PCA analyses versus retention time]
66
Evolving Factor Analysis
Forward Evolving Factor Analysis
- Location of the emergence of compounds
- Total number of compounds (PCA in the last window)
[Figure: log(eigenvalues) versus retention time, with the selective zone and the noise level marked]
67
Evolving Factor Analysis
Backward Evolving Factor Analysis
[Figure: log(eigenvalues) from successive PCA analyses versus retention time]
68
Evolving Factor Analysis
Backward Evolving Factor Analysis
- Location of the disappearance of compounds
- Total number of compounds (PCA in the last, total window)
[Figure: log(eigenvalues) versus retention time, with the selective zone and the noise level marked]
69
Evolving Factor Analysis
Combined EFA plot (forward and backward EFA)
- Total number of components (PCA of the extreme windows)
- Detection of selective zones (extremes)
- Location of the emergence and decay of compounds
[Figure: forward and backward log(eigenvalues) versus retention time]
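The forward and backward analyses described on these slides can be sketched as follows. The function name `efa`, the number of eigenvalues kept and the simulated two-peak data set are assumptions for illustration; this is not the original EFA code.

```python
import numpy as np

def efa(D, n_keep=3):
    """log10(eigenvalues) of growing windows, forward and backward along the rows."""
    n_rows = D.shape[0]
    fwd = np.full((n_rows, n_keep), np.nan)
    bwd = np.full((n_rows, n_keep), np.nan)
    for i in range(n_rows):
        s_f = np.linalg.svd(D[: i + 1], compute_uv=False)   # rows 0..i
        s_b = np.linalg.svd(D[i:], compute_uv=False)        # rows i..end
        m_f = min(n_keep, s_f.size)
        m_b = min(n_keep, s_b.size)
        fwd[i, :m_f] = np.log10(s_f[:m_f] ** 2)   # eigenvalue = (sing. value)^2
        bwd[i, :m_b] = np.log10(s_b[:m_b] ** 2)
    return fwd, bwd

# Simulated HPLC-DAD-like data: two overlapping peaks plus noise (assumed values)
rng = np.random.default_rng(6)
t = np.arange(50)
C = np.exp(-0.5 * ((t[:, None] - np.array([18.0, 30.0])) / 4.0) ** 2)
St = rng.random((2, 25))
D = C @ St + 1e-4 * rng.standard_normal((50, 25))

fwd, bwd = efa(D)
```

Plotting the columns of `fwd` and `bwd` against retention time gives the combined EFA plot: a forward line rising above the noise level marks the emergence of a compound, and a backward line marks its decay.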
70
Evolving Factor Analysis
Approximate concentration profiles
Assumptions:
- Consecutive emergence-decay profiles
- No embedded compounds
- Sequential processes
[Figure: log(eigenvalues) versus retention time, with the noise level, a concentration window and the zero-component windows marked]
71
Evolving Factor Analysis
Approximate concentration profiles
[Figure: EFA-derived concentration profiles compared with the real concentration profiles]
72
Fixed Size Moving Window-Evolving FA (FSMW-EFA)
Local rank map along the process direction or the signal direction.
Working procedure: successive PCA analyses in fixed-size windows moving stepwise along the data set.
Minimum window size: number of components + 1
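A minimal sketch of the moving-window procedure is given below. The function name `fsmw_efa`, the simulated data and the noise threshold used to turn eigenvalues into a local rank are assumptions for illustration; in practice the threshold is read from the plot.

```python
import numpy as np

def fsmw_efa(D, window=5, n_keep=3):
    """log10(eigenvalues) of PCA in fixed-size windows moved row by row."""
    n_win = D.shape[0] - window + 1
    out = np.empty((n_win, n_keep))
    for i in range(n_win):
        s = np.linalg.svd(D[i : i + window], compute_uv=False)
        out[i] = np.log10(s[:n_keep] ** 2)
    return out

# Simulated data: two overlapping peaks plus noise (assumed values)
rng = np.random.default_rng(7)
t = np.arange(50)
C = np.exp(-0.5 * ((t[:, None] - np.array([20.0, 30.0])) / 3.0) ** 2)
St = rng.random((2, 25))
D = C @ St + 1e-4 * rng.standard_normal((50, 25))

log_eig = fsmw_efa(D, window=5)

# Local rank: number of eigenvalues above an assumed noise threshold
threshold = -5.0
local_rank = (log_eig > threshold).sum(axis=1)
```

Here `n_keep` must not exceed the window size. Larger windows average more rows and lower the noise eigenvalues, but they blur the position at which the local rank changes.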
73
FSMW-EFA
[Figure: PCA in a fixed-size window moving along the retention times; log(eigenvalues) versus retention time]
74
Detection of selective zones along the whole process
FSMW-EFA
- Detection of selective zones along the whole process
- Variation of the local rank along the process direction (complexity, degree of overlap among compounds)
[Figure: log(eigenvalues) versus retention time, with regions of local rank 1 and 2 and the noise level marked]
75
Local rank detection
[Figure: EFA plots and FSMW-EFA plots (window sizes 5 and 8) for two example data sets]
76
FSMW-EFA vs. EFA
EFA:
- Displays the evolution of the process.
- The compounds are well identified (concentration windows).
- Local rank information is not easily interpreted.
FSMW-EFA:
- Clear definition of local rank.
- Sensitive to the detection of minor compounds.
- The idea of process evolution is not preserved.
77
Getting Local rank information from Evolving Factor Analysis methods
- Detection of the selective windows or regions where only one species exists (total selectivity)
- Detection of zero-concentration windows or regions (no species is present)
- Detection of windows or regions where a particular species is not present
- Detection of the concentration windows or regions where one species is present (other species can coexist)
78
References
EFA:
- H. Gampp, M. Maeder, C.J. Meyer and A.D. Zuberbühler. Talanta, 32, (1985).
- M. Maeder. Anal. Chem. 59, (1987).
FSMW-EFA:
- H.R. Keller and D.L. Massart. Anal. Chim. Acta, 246, (1991).
SIMPLISMA:
- W. Windig and J. Guilment. Anal. Chem., 63, (1991).