Chimiometrie 2009 Proposed model for Challenge2009 Patrícia Valderrama


1 Chimiometrie 2009 Proposed model for Challenge2009 Patrícia Valderrama pativalderrama@gmail.com patricia.valderrama@agroparistech.fr

2 1st step) Model development
Model                                             RMSEC    R²
Mean center, 20 VL                                2.7019   0.8325
Mean center, 20 VL + 1st derivative               1.2991   0.9613
Mean center, 20 VL + baseline                     2.8022   0.8198
Mean center, 20 VL + smoothing                    2.7779   0.8229
Mean center, 20 VL + smoothing + 1st derivative   1.7005   0.9336
(VL = number of latent variables.)
Variable selection: genetic algorithm (GA) and iPLS.
Note: the RMSEP cannot be estimated because it requires reference values for X_TST.
Symbols in the RMSEC definition: y_ref is the reference value, y_est is the value estimated by the model, and I is the number of samples.
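The two columns of the table can be computed as in the minimal sketch below. It assumes the usual root-mean-square form, RMSEC = sqrt(sum((y_ref - y_est)^2) / I), which matches the symbols listed on the slide; the slide's own equation is not preserved in this transcript. It also assumes R² is the coefficient of determination (the slide does not say whether it is that or a squared correlation coefficient). The function names and the toy data are hypothetical.

```python
import math


def rmsec(y_ref, y_est):
    """Root mean square error over I samples (assumed form, no
    degrees-of-freedom correction)."""
    if len(y_ref) != len(y_est) or not y_ref:
        raise ValueError("y_ref and y_est must be non-empty and equal in length")
    squared = sum((r - e) ** 2 for r, e in zip(y_ref, y_est))
    return math.sqrt(squared / len(y_ref))


def r_squared(y_ref, y_est):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((r - e) ** 2 for r, e in zip(y_ref, y_est))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


# Toy data, not taken from the challenge data set.
y_ref = [10.0, 12.0, 14.0, 16.0]
y_est = [11.0, 11.0, 15.0, 15.0]
print(rmsec(y_ref, y_est), r_squared(y_ref, y_est))
```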

3 2nd step) Models with variable selection
Model                         RMSEC    R²
GA, mean center, 20 VL        1.2940   0.9616
iPLS, 10 intervals, 15 VL     3.7139   0.6674
iPLS, 5 intervals, 14 VL      2.0182   0.9008
iPLS, 3 intervals, 19 VL      1.6438   0.9374
iPLS, 2 intervals, 16 VL      1.2526   0.9625   (best model)

4 3rd step) Outlier detection for the best model
Model                                               RMSEC    R²
iPLS, 2 intervals, 16 VL                            1.2526   0.9625   (best model)
iPLS, 2 intervals, 16 VL, after outlier detection   0.8320   0.9834   (best model, optimized)
Outlier detection in the calibration matrix was based on:
- Extreme leverages (0 outliers)
- Unmodeled residuals in spectra (0 outliers)
- Unmodeled residuals in the dependent variable (7 outliers)
Total outliers in calibration = 7
Outlier detection in the validation matrix was based on:
- Extreme leverages (129 outliers)
- Unmodeled residuals in spectra (106 outliers)
Total outliers in validation = 153 (samples flagged by both criteria are counted once)

5 3rd step) Outlier detection in the calibration and validation matrices
Based on extreme leverages: the leverage measures how far a sample lies from the center of the data. In the leverage expression, T is the matrix of scores of all calibration samples, t_i is the score vector of a particular sample, A is the number of latent variables, and n is the number of samples. According to ASTM E1655-00, samples with a leverage (h_i) higher than a limit value should be removed from the calibration set.
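The leverage equation itself is not preserved in this transcript. A minimal sketch follows, assuming the standard form h_i = t_i^T (T^T T)^-1 t_i, which is consistent with the symbols the slide lists. It further assumes the score columns are mutually orthogonal (as with NIPALS-type scores), so that T^T T is diagonal and no matrix inversion is needed. The limit value is left as a caller-supplied number because the slide's limit expression is also missing; the function name, the toy scores, and the limit 0.7 are hypothetical.

```python
def leverages(T):
    """Leverage of each sample, h_i = t_i^T (T^T T)^-1 t_i.

    ASSUMPTION: the columns of T are mutually orthogonal, so T^T T is
    diagonal and h_i reduces to sum_a t_ia^2 / sum_k t_ka^2.
    """
    n_lv = len(T[0])
    column_ss = [sum(row[a] ** 2 for row in T) for a in range(n_lv)]
    if any(ss == 0.0 for ss in column_ss):
        raise ValueError("a score column is identically zero")
    return [sum(row[a] ** 2 / column_ss[a] for a in range(n_lv)) for row in T]


# Toy score matrix: 4 samples, A = 2 latent variables, orthogonal columns.
T = [[3.0, 0.0], [-1.0, 1.0], [-1.0, -1.0], [-1.0, 0.0]]
h = leverages(T)

# Hypothetical limit chosen for the toy data only.
limit = 0.7
flagged = [i for i, h_i in enumerate(h) if h_i > limit]
print(h, flagged)
```

A useful sanity check is that the leverages sum to the number of latent variables A, which holds for the toy scores above.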

6 3rd step) Outlier detection in the calibration and validation matrices
Based on unmodeled residuals in spectra: outliers are identified by comparing the standard deviation of the total residuals, s(ê), with the standard deviation of the residuals of a particular sample, s(ê_i). Here n is the number of samples, J is the number of variables, A is the number of latent variables, x_{i,j} is the absorbance of sample i at wavelength j, and x̂_{i,j} is its value estimated with A latent variables. If a sample presents s(ê_i) > 2 s(ê), it should be removed from the calibration set.
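The decision rule s(ê_i) > 2 s(ê) can be sketched as follows. The slide's exact expressions for s(ê) and s(ê_i), including any degrees-of-freedom correction involving n, J and A, are not preserved here, so this sketch simply uses root-mean-square residuals; that simplification, the function names, and the toy spectra are all assumptions.

```python
import math


def residual_std_per_sample(X, X_hat):
    """s(e_i) for each sample: RMS of its spectral residuals.
    ASSUMPTION: plain division by J, no degrees-of-freedom correction."""
    return [
        math.sqrt(sum((x - xh) ** 2 for x, xh in zip(row, row_hat)) / len(row))
        for row, row_hat in zip(X, X_hat)
    ]


def residual_std_total(X, X_hat):
    """s(e): RMS of all residuals (ASSUMPTION: plain division by n * J)."""
    total = sum(
        (x - xh) ** 2
        for row, row_hat in zip(X, X_hat)
        for x, xh in zip(row, row_hat)
    )
    count = sum(len(row) for row in X)
    return math.sqrt(total / count)


def spectral_outliers(X, X_hat, factor=2.0):
    """Indices of samples with s(e_i) > factor * s(e)."""
    s_total = residual_std_total(X, X_hat)
    s_each = residual_std_per_sample(X, X_hat)
    return [i for i, s_i in enumerate(s_each) if s_i > factor * s_total]


# Toy data: 6 samples, 3 variables; the last sample is poorly reconstructed.
X_hat = [[1.0, 2.0, 3.0] for _ in range(6)]
residuals = [[0.1, -0.1, 0.1]] * 5 + [[1.0, -1.0, 1.0]]
X = [[xh + r for xh, r in zip(row_hat, res)] for row_hat, res in zip(X_hat, residuals)]
print(spectral_outliers(X, X_hat))
```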

7 3rd step) Outlier detection in the calibration matrix
Based on unmodeled residuals in the dependent variable: outliers are identified by comparing the root mean square error of calibration (RMSEC) with the absolute error of each sample. Here n is the number of samples, J is the number of variables, A is the number of latent variables, y_i is the reference value for sample i, and ŷ_i is its estimated value. If the difference between a sample's reference value (y_i) and its estimate (ŷ_i) is larger than 2 times the RMSEC, the sample is identified as an outlier.
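A minimal sketch of this rule: a sample is flagged when |y_i - ŷ_i| > 2 × RMSEC. The RMSEC is passed in by the caller; in the example it is computed with a plain division by n, which is an assumption because the slide's RMSEC equation (and any degrees-of-freedom term) is not preserved. The function name and the toy values are hypothetical.

```python
import math


def y_outliers(y_ref, y_est, rmsec, factor=2.0):
    """Indices of samples whose absolute error exceeds factor * RMSEC."""
    return [
        i
        for i, (r, e) in enumerate(zip(y_ref, y_est))
        if abs(r - e) > factor * rmsec
    ]


# Toy data: the last sample is badly estimated.
y_ref = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
y_est = [10.1, 11.9, 14.1, 15.9, 18.1, 23.0]

# ASSUMPTION: RMSEC as a plain root mean square error over n samples.
rmsec_value = math.sqrt(
    sum((r - e) ** 2 for r, e in zip(y_ref, y_est)) / len(y_ref)
)
print(y_outliers(y_ref, y_est, rmsec_value))
```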

8 4th step) Figures of merit for the optimized best model
- Accuracy
- Fit
- Precision (cannot be estimated because it requires replicates of the validation samples)
- Sensitivity
- Analytical sensitivity
- Selectivity
- Linearity
- Limit of detection (LOD)
- Limit of quantification (LOQ)
- Signal-to-noise ratio

9 4th step) Figures of merit for the optimized best model
Accuracy: this parameter reports the closeness of agreement between the reference value and the value found by the calibration model. In chemometrics, it is generally expressed as the root mean square error of calibration (RMSEC) or of prediction (RMSEP). However, RMSEP is a global parameter that incorporates both systematic and random errors. Hence, an F-test on the RMSEC/RMSEP of two methods is not appropriate for comparing accuracy. A better indicator is the regression of found versus nominal concentration values, with estimation of the slope and intercept of the linear regression and consideration of the elliptical joint confidence region. The ellipses contain the ideal point (1, 0), for slope and intercept respectively, showing that the reference calibration values and the PLS results do not differ significantly at the 99% confidence level.
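The check that the ideal point lies inside the elliptical joint confidence region can be sketched as below. This uses the textbook joint region for an ordinary least squares line, n·d0² + 2·Σx·d0·d1 + Σx²·d1² ≤ 2·s²·F(2, n-2), where d0 and d1 are the distances of the tested intercept and slope from the fitted ones; whether the presentation used exactly this form is an assumption. The critical F value must be supplied by the caller (the standard library has no F quantile); 8.65 is the tabulated 99% value for (2, 8) degrees of freedom. Function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = b0 + b1 * x.
    Returns (b0, b1, s2) with s2 the residual variance on n - 2 dof."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return b0, b1, ss_res / (n - 2)


def ejcr_contains(x, y, f_crit, intercept=0.0, slope=1.0):
    """True if (intercept, slope) lies inside the joint confidence ellipse."""
    b0, b1, s2 = fit_line(x, y)
    d0 = intercept - b0
    d1 = slope - b1
    n = len(x)
    lhs = n * d0 ** 2 + 2 * sum(x) * d0 * d1 + sum(xi ** 2 for xi in x) * d1 ** 2
    return lhs <= 2 * s2 * f_crit


# Toy data: nominal values and two sets of "found" values.
nominal = [float(v) for v in range(1, 11)]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
found = [v + e for v, e in zip(nominal, noise)]               # unbiased
biased = [1.2 * v + 0.5 + e for v, e in zip(nominal, noise)]  # biased

F_CRIT_99_2_8 = 8.65  # tabulated F(0.99; 2, 8)
print(ejcr_contains(nominal, found, F_CRIT_99_2_8))
print(ejcr_contains(nominal, biased, F_CRIT_99_2_8))
```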

10 4th step) Figures of merit for the optimized best model
Fit: net analyte signal versus reference values, a pseudo-univariate representation of the multivariate calibration model.

11 4th step) Figures of merit for the optimized best model
Sensitivity: the fraction of the analytical signal due to an increase of one concentration unit of a particular analyte. Sensitivity = 2.3932 × 10⁻⁵
Analytical sensitivity (γ): the inverse of this parameter gives the minimum concentration difference between two samples that the model can distinguish, assuming that spectral noise is the largest source of error. γ = 0.5737, so the minimum concentration difference between two samples that the model can distinguish is γ⁻¹ = 1.7431.
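The slide's equations are not preserved in this transcript. A minimal sketch follows, assuming the definitions commonly used with net-analyte-signal figures of merit: sensitivity = 1 / ||b||, where b is the regression vector, and analytical sensitivity = sensitivity / noise, where the noise is an estimate of the instrumental noise. These formulas, the function names, and the toy regression vector are assumptions; only the numerical values 0.5737 and 1.7431 come from the slide.

```python
import math


def sensitivity(b):
    """Sensitivity as 1 / ||b|| (ASSUMED definition; b is the regression vector)."""
    return 1.0 / math.sqrt(sum(c * c for c in b))


def analytical_sensitivity(sen, noise):
    """Analytical sensitivity as sensitivity / noise (ASSUMED definition)."""
    return sen / noise


# Toy regression vector with norm 5, so the sensitivity is 0.2.
print(sensitivity([3.0, 4.0]))

# Value reported on the slide; its inverse is the minimum concentration
# difference the model can distinguish.
gamma = 0.5737
min_difference = 1.0 / gamma
print(round(min_difference, 4))
```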

12 4th step) Figures of merit for the optimized best model
Selectivity: the fraction of the signal used in the quantification. Selectivity = 0.21
Linearity: in multivariate calibration, a linear model should present errors with random behavior.

13 4th step) Figures of merit for the optimized best model
Limit of detection: following IUPAC recommendations, the LOD can be defined as the minimum detectable value of the net signal (or concentration). LOD = 5.7518
Limit of quantification: the ability to quantify is generally expressed in terms of the signal or analyte concentration value that will produce estimates having a specified relative standard deviation, usually 10%. LOQ = 17.4296
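The reported values are consistent with the common conventions LOD = 3.3 × noise / sensitivity and LOQ = 10 × noise / sensitivity, that is, 3.3 and 10 times the inverse of the analytical sensitivity (1.7431 on slide 11). The slide does not show its formulas, so the factors 3.3 and 10 are an assumption, supported only by the fact that the reported LOQ/LOD ratio is close to 10/3.3. The function names are hypothetical.

```python
def limit_of_detection(inverse_analytical_sensitivity, factor=3.3):
    """LOD as factor * (noise / sensitivity); factor 3.3 is an ASSUMPTION."""
    return factor * inverse_analytical_sensitivity


def limit_of_quantification(inverse_analytical_sensitivity, factor=10.0):
    """LOQ as factor * (noise / sensitivity); factor 10 is an ASSUMPTION."""
    return factor * inverse_analytical_sensitivity


# Inverse analytical sensitivity reported on slide 11.
gamma_inverse = 1.7431

lod = limit_of_detection(gamma_inverse)
loq = limit_of_quantification(gamma_inverse)

# The small differences from the slide's 5.7518 and 17.4296 come from
# rounding of the inputs to four decimal places.
print(round(lod, 3), round(loq, 3))
```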

14 4th step) Figures of merit for the optimized best model
Signal-to-noise ratio: how much the net analyte signal exceeds the instrumental noise. Max = 26.1264, Min = 9.5815
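A per-sample signal-to-noise ratio, and its maximum and minimum over a set of samples, can be sketched as below. It assumes S/N_i = (scalar net analyte signal of sample i) / (instrumental noise estimate); the slide does not show its formula, and the function name and toy values are hypothetical.

```python
def signal_to_noise(net_signals, noise):
    """S/N of each sample as net analyte signal / noise (ASSUMED definition)."""
    if noise <= 0:
        raise ValueError("noise must be positive")
    return [s / noise for s in net_signals]


# Toy scalar net analyte signals and noise estimate, not from the slides.
net_signals = [2.0, 4.0, 3.0]
ratios = signal_to_noise(net_signals, noise=0.5)
print(max(ratios), min(ratios))
```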

