Addressing THE Problem of NIR
SYMPTOMS
  The apparent need for, and use of, more variables in calibrations than can reasonably be justified
    Algebra dictates that no more equations should be needed than variables
  Difficulty in reproducing calibrations for the same constituents in the same type of samples
  Inability to reproduce wavelength selections (MLR models)
  Difficulty and/or inability in relating wavelengths chosen (for MLR) or prominent wavelength bands (for PCR/PLS) to spectral features
  Unexpected “Outliers”
  SEC and SEP should drop precipitously when the number of wavelengths/factors equals the number of variations in the samples
  Spectroscopic measurements should be accurate over the entire range of concentrations, not only dilute solutions or a small range of values
  Calibrations should be extrapolatable, with a calculated reduced accuracy at the extremes
  Calibration transfer should be as easily and readily performed as comparing two mid-IR spectra
Explanations Given
  OPTICAL SCATTER
  Lab error
  Instrument variations
  “Wrong” transforms of spectral data
  “Wrong” calibration algorithm
Published in Appl. Spect., 64(9), p.995-1006 (2010)
Pittcon 2011 Presentation
Simplified Experimental Design: Three Miscible Liquids, 15 Samples
Simplified Experimental Design:
  Liquids = no scatter, Beer’s Law holds
  Symmetric design = balanced
  Ranges = 0 – 100% for all components
Data Analysis
  No data transform applied
  CLS algorithm
    Simplest algorithm
    “Absolute” method
    Origin: comes directly from Beer’s Law
    No conventional calibration needed
    No computation of factors: no overfitting
    Requirements:
      Spectra of pure materials known
      No interactions between constituents
      No Partial Molal Volume changes
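The CLS step listed above can be sketched as follows. If Beer's Law holds and the constituents do not interact, a mixture spectrum is the sum of each pure-material spectrum weighted by that constituent's fraction, so the fractions can be recovered by ordinary least squares against the known pure spectra. This is a minimal sketch, not the authors' code: the Gaussian bands standing in for pure spectra, the fractions, and the function name `cls_fractions` are all hypothetical.

```python
import numpy as np


def cls_fractions(pure_spectra, mixture_spectrum):
    """Least-squares fractions c such that c @ pure_spectra ~= mixture_spectrum.

    pure_spectra: array (n_components, n_wavelengths), one pure spectrum per row.
    mixture_spectrum: array (n_wavelengths,).
    """
    coeffs, *_ = np.linalg.lstsq(pure_spectra.T, mixture_spectrum, rcond=None)
    return coeffs


# Hypothetical pure spectra: one Gaussian band per component.
axis = np.linspace(0.0, 1.0, 200)


def band(center, width=0.05):
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


pure = np.vstack([band(0.2), band(0.5), band(0.8)])

# Hypothetical mixture obeying Beer's Law exactly (no noise, no interactions).
true_fractions = np.array([0.2, 0.3, 0.5])
mixture = true_fractions @ pure

estimated = cls_fractions(pure, mixture)
print(np.round(estimated, 6))
```

No calibration samples and no factor selection are involved: the only inputs are the pure spectra and the mixture spectrum, which is why the slide describes the method as "absolute".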
Spectral Results vs “Concentrations”
  [Plots: concentrations as weight fractions; concentrations as volume fractions]
Underlying Cause: Weight Fraction vs Volume Fraction
  [Plots: toluene, dichloromethane, n-heptane]
Underlying Cause: Weight Fraction vs Volume Fraction
  Concentrations expressed in different units are NOT linearly related.
  Spectroscopy is sensitive to VOLUME fraction.
  The nature of the curvature depends on the constituent. On average, toluene shows little or no nonlinearity; the other two components curve in opposite senses.
  The amount of curvature depends on the exact mixture.
  [Plot: toluene]
Underlying Cause: Weight Fraction vs Volume Fraction
  When samples are selected at random, the underlying curvature is not visible. Because of the random sample selection, the error values seem randomly scattered around the average curve, just as though the error were random instead of systematic.
  A calibration, however, will show curvature in the residuals.
  [Plots: toluene, dichloromethane, n-heptane]
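The nonlinearity between the two concentration units can be illustrated with a short calculation. Assuming ideal mixing (no partial molal volume changes), the volume fraction of constituent i is (w_i/ρ_i) / Σ_j (w_j/ρ_j), where w is weight fraction and ρ is density. The sketch below applies this to a binary dichloromethane/n-heptane series. The densities are approximate handbook values in g/mL and are an assumption, since the slides do not list them for these liquids; the function name is hypothetical.

```python
import numpy as np


def weight_to_volume_fraction(weight_fractions, densities):
    """Convert weight fractions to volume fractions, assuming ideal mixing."""
    volumes = np.asarray(weight_fractions, dtype=float) / np.asarray(densities, dtype=float)
    return volumes / volumes.sum(axis=-1, keepdims=True)


# Approximate densities in g/mL (assumed values, not taken from the slides).
densities = np.array([1.33, 0.684])  # dichloromethane, n-heptane

# Binary series: dichloromethane weight fraction from 0 to 1 in equal steps.
w_dcm = np.linspace(0.0, 1.0, 5)
weights = np.column_stack([w_dcm, 1.0 - w_dcm])
volumes = weight_to_volume_fraction(weights, densities)

for w, v in zip(w_dcm, volumes[:, 0]):
    print(f"weight fraction {w:.2f} -> volume fraction {v:.3f}")
```

Equal steps in weight fraction give unequal steps in volume fraction, and the denser liquid always occupies a smaller volume fraction than its weight fraction, so a plot of one unit against the other is curved rather than straight.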
Expanding Beyond the Small Dataset
  The foregoing makes a compelling case that, in spectroscopic analysis, EM radiation “sees” a mixture according to the volume fractions of its components (given some assumptions to ensure ideality).
  The CLS algorithm responds according to the theory.
  This does not necessarily ensure that conventional calibration algorithms will respond the same way.
  The fifteen samples we used are not a sufficient number to reliably test conventional calibrations.
Expanding Beyond the Small Dataset
  A search for a larger suitable sample set turned up a pair of reports by Willem Windig et al. based on data suitable for our purposes:
    Chemom. & Intell. Lab. Syst., 36, p.3-16 (1997)
    Anal. Chem., 64, p.2735-2742 (1992)
  Data supplied by Tony Davies and Tormod Naes
Expanding Beyond the Small Dataset
  The set contained mixtures of 5 ingredients:
    2-Butanol
    Methylene chloride
    Methanol
    1,2-Dichloropropane
    Acetone
  An orthogonal mixture design provided 70 different mixtures, each measured twice.
  Each ingredient had one of the following concentrations: 10, 22.5, 35, 47.5, 60 weight percent.
  The properties of the mixture design are described on the following slide:
Orthogonal Mixture Design Used
  Each of the ten pairwise plots of ingredients looks like this:
Pure Material Spectra
Pure Material Spectra
  [Plots: butanol, methanol, dichloromethane, dichloropropane, acetone]
Plot of Data Spectra (1st & 2nd readings)
Plot of Mixture Spectra (Maximum Acetone) (1st & 2nd readings)
Plot of Mixture Spectra (Maximum Butanol) (1st & 2nd readings)
Plot of Mixture Spectra (Maximum Dichloropropane) (1st & 2nd readings)
Plot of Mixture Spectra (Maximum Methanol) (1st & 2nd readings)
Plot of Mixture Spectra (Maximum Methylene Chloride) (1st & 2nd readings)
Densities
  Component              Density
  2-Butanol              0.808
  Methylene Chloride     1.336
  Methanol               0.7928
  1,2-Dichloropropane    1.1593
  Acetone                0.792
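The densities listed above are what is needed to re-express the weight-percent design in volume percent. The sketch below does this for two hypothetical compositions that both contain 35 wt% methanol but differ in what makes up the remainder; the compositions are illustrative only and are not claimed to be points of the actual design. Ideal mixing is assumed, and the densities are taken to be in g/mL (the slide gives no units).

```python
import numpy as np

names = ["2-butanol", "methylene chloride", "methanol", "1,2-dichloropropane", "acetone"]
# Densities as listed on the slide (assumed to be g/mL).
densities = np.array([0.808, 1.336, 0.7928, 1.1593, 0.792])


def weight_to_volume_percent(weight_percent, densities):
    """Convert weight percent to volume percent, assuming ideal mixing."""
    volumes = np.asarray(weight_percent, dtype=float) / densities
    return 100.0 * volumes / volumes.sum(axis=-1, keepdims=True)


# Two hypothetical mixtures, both with 35 wt% methanol (third column).
weight_percent = np.array([
    [10.0, 35.0, 35.0, 10.0, 10.0],  # remainder rich in methylene chloride
    [35.0, 10.0, 35.0, 10.0, 10.0],  # remainder rich in 2-butanol
])
volume_percent = weight_to_volume_percent(weight_percent, densities)

for row in volume_percent:
    print(", ".join(f"{name}: {value:.1f}" for name, value in zip(names, row)))
```

The same methanol weight percent maps to different methanol volume percents in the two mixtures, which is one way to see why the amount of curvature depends on the exact mixture and not only on the constituent of interest.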
CLS Results (Reconstruction of max Acetone mixture)
CLS Results (Reconstruction of max Butanol mixture)
CLS Results (Reconstruction of max Dichloropropane mixture)
CLS Results (Reconstruction of max Methanol mixture)
CLS Results (Reconstruction of max Methylene Chloride mixture)
CLS Results (Spectral vs Reference)
  [Plots: methanol spectral results vs weight fraction; acetone spectral results vs weight fraction; methanol spectral results vs volume fraction; acetone spectral results vs volume fraction]
CLS Results (Spectral vs Reference)
  [Plots, with butanol and methylene chloride held constant: acetone spectral results vs weight fraction; methanol spectral results vs weight fraction; methanol spectral results vs volume fraction; acetone spectral results vs volume fraction]
CLS Results (Spectral vs Reference)
  [Plots: butanol spectral results vs weight fraction; methylene chloride spectral results vs weight fraction; butanol spectral results vs volume fraction; methylene chloride spectral results vs volume fraction]
CLS Results (Spectral vs Reference)
  [Plots: dichloropropane spectral results vs weight fraction; dichloropropane spectral results vs volume fraction]
Expanding Beyond CLS Apply PCA, PCR, PLS
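To illustrate why the choice of concentration units could matter for a conventional calibration, here is a minimal principal component regression (PCR) sketch on simulated data. It is not the analysis behind the slides: the weight fractions are random draws rather than the orthogonal design, the pure spectra are random stand-ins, the spectra are built on the assumption that absorbance is exactly linear in volume fraction, and only calibration (fit) residuals are computed, with no validation set. PLS is not shown.

```python
import numpy as np

# Densities from the Densities slide, in the order 2-butanol, methylene chloride,
# methanol, 1,2-dichloropropane, acetone (assumed to be g/mL).
densities = np.array([0.808, 1.336, 0.7928, 1.1593, 0.792])

rng = np.random.default_rng(1)
weight = rng.dirichlet(np.ones(5), size=70)   # hypothetical weight fractions, rows sum to 1
volume = weight / densities
volume /= volume.sum(axis=1, keepdims=True)   # ideal-mixing volume fractions

pure = rng.random((5, 120))                   # synthetic stand-ins for pure spectra
spectra = volume @ pure                       # assumption: absorbance linear in volume fraction


def pcr_residual_rms(spectra, reference, n_factors):
    """RMS calibration residual of a PCR fit using the first n_factors scores."""
    x = spectra - spectra.mean(axis=0)
    y = reference - reference.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_factors] * s[:n_factors]
    coeffs, *_ = np.linalg.lstsq(scores, y, rcond=None)
    residuals = y - scores @ coeffs
    return np.sqrt(np.mean(residuals ** 2))


rms_weight = pcr_residual_rms(spectra, weight, n_factors=4)
rms_volume = pcr_residual_rms(spectra, volume, n_factors=4)
print(f"RMS residual vs weight fraction: {rms_weight:.2e}")
print(f"RMS residual vs volume fraction: {rms_volume:.2e}")
```

Under these assumptions, four factors reproduce the volume fractions essentially exactly, while the weight fractions keep a systematic residual, because they are a nonlinear function of the quantity the simulated spectra respond to.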
Explained PCA Variance (SVD): Each Half
  PCA Variance Fraction
  PC #   Variance
  1      30%
  2      29%
  3      22%
  4      19%
  5      0%
  6      0%
  Legend: Blue = Calibration, Red = Validation
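A variance table of this shape, with four non-zero components followed by zeros, is what one would expect from noise-free spectra of five-ingredient mixtures whose fractions sum to 100%, since that closure constraint leaves only four independent variations after mean-centering. The sketch below is one possible illustration on simulated data, not a reproduction of the slide: the fractions and pure spectra are random and hypothetical, so the individual percentages will differ from those in the table.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_components, n_wavelengths = 70, 5, 100

# Hypothetical mixture fractions; each row sums to 1 (closure constraint).
fractions = rng.dirichlet(np.ones(n_components), size=n_samples)
# Random stand-ins for the five pure-material spectra.
pure = rng.random((n_components, n_wavelengths))
spectra = fractions @ pure

# PCA via SVD of the mean-centered spectra.
centered = spectra - spectra.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)
explained = singular_values ** 2 / np.sum(singular_values ** 2)

print(np.round(100 * explained[:6], 2))
```

With real spectra, noise and any nonlinearity would put a small but non-zero variance into the fifth and later components.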
PCA Residual Variance of Calibration Spectra
PCA Residual Variance of Components (Wt %)
PCA Residual Variance of Components (Vol %)
PLS Wt % (Q) Residual Variance of Spectra
PLS Vol % (Q) Residual Variance of Spectra
PLS Residual Variance of Components (Wt %)
PLS Residual Variance of Components (Vol %)