1
Chemometric Methods for GC x GC
LCDR Gregory J. Hall, Glenn S. Frysinger
Department of Science, U.S. Coast Guard Academy, New London, Connecticut
gregory.hall@uscga.edu
2
LCDR Gregory J. Hall
1995 – B.S. Marine Science, U.S. Coast Guard Academy
1995 – 1997 Operations Officer, USCGC SPAR
1997 – 1998 M.S. Chemistry, Tufts University
1998 – 2000 Rotating Military Faculty, USCGA
2000 – Appointed to the PCTS
2002 – 2004 Ph.D. sabbatical, Tufts University
2006 – Ph.D. Chemistry, Tufts University: “Chemometric Characterization and Classification of Estuarine Water through Multidimensional Fluorescence”
3
Permanent Commissioned Teaching Staff (PCTS)
About 23 officers ranked from LT to CAPT
Provide the “interpreters” between the military and civilian faculty and leadership for the college
Teaching, Service, and Scholarship expected
Ph.D. required
4
LCDR Gregory J. Hall
5
What IS Chemometrics?
Chemometrics is the chemical discipline that uses mathematical, statistical and other methods employing formal logic to design or select optimal measurement procedures and experiments, and to provide maximum relevant chemical information by analyzing chemical data. (D. L. Massart, Chemometrics, Elsevier, NY, 1988)
6
Chemometrics already covered and to come
1. Difference Chromatograms
2. Property Modeling
3. Clustering
4. Chromatograph Prediction
5. Mass Spec searching
6. Template Construction
7. XICs
8. Retention Indices
You are all already chemometricians!
7
Today
1. Data Structures – How I view GC x GC data
2. Variance – PCA
3. Classification – SIMCA, PCR-DA
4. Regression – PLS
5. Peak Resolution – PARAFAC
6. Preprocessing – Alignment
7. The way forward, humble opinions
8
Data – GC x GC - FID
[Figure: a single chromatogram is a “two way” table of intensity values (First Dimension x Second Dimension), 3 dimensions counting intensity. A stack of chromatograms, one per sample, forms the “three way” dataset X with axes I (sample), J and K, 4 dimensions counting intensity.]
9
Data – GC x GC - TOF
[Figure: dataset X with axes Sample (Date?), First Dimension, Second Dimension and m/z: “four way”, 5 dimensions counting intensity!]
10
Principal Components Analysis (PCA)
[Figure: samples i and j plotted in the space of variable 1, variable 2 and variable 3, with the PC 1 and PC 2 axes and the T² and Q distances marked.]
11
PCA model: $X = TP + E$
X: data (one row per sample); T P: the “model”, built from the “components”; E: residuals
Goal - Variance capture
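As an illustration of the decomposition on this slide, here is a minimal NumPy sketch of PCA via the singular value decomposition. It is not the PLS Toolbox implementation used in the talk. The function name `pca_fit`, the toy data and the choice of two components are assumptions made for the example. The loadings are stored as columns of `P`, so the model term appears as `T @ P.T` in the code.

```python
import numpy as np

def pca_fit(X, n_components):
    """Mean-center X (samples x variables) and split it into model and residuals."""
    mean = X.mean(axis=0)
    Xc = X - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T          # loadings, variables x components
    T = Xc @ P                       # scores, samples x components
    E = Xc - T @ P.T                 # residuals left out of the model
    explained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
    return mean, T, P, E, explained

# Toy data (assumption): 15 samples, 200 variables, two underlying components plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(15, 2)) @ rng.normal(size=(2, 200)) + 0.01 * rng.normal(size=(15, 200))

mean, T, P, E, explained = pca_fit(X, n_components=2)
q = (E ** 2).sum(axis=1)                              # Q: squared residual per sample
t2 = (T ** 2 / T.var(axis=0, ddof=1)).sum(axis=1)     # Hotelling T^2 per sample
```

Q measures how far a sample lies from the model plane, and T² measures how far its projection lies from the model center, which is how the two statistics are used on the later classification slides.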
12
Multi-way Principal Components Analysis (MPCA)
Our data 15 x 410,000
Wise, B. M.; Gallagher, N. B.; Bro, R.; Shaver, J. M.; Windig, W.; Koch, R. S. PLS Toolbox 4.0; Eigenvector Research, Inc.: Wenatchee, WA, 2006.
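MPCA is commonly described as ordinary PCA applied after unfolding each sample's chromatogram into one long row. The sketch below shows that unfolding step and how a loading vector can be folded back into chromatogram shape for plotting. The array sizes are toy values, not the 15 x 410,000 data of the talk, and treating axis J as the first separation dimension is an assumption.

```python
import numpy as np

# Toy sizes (assumption): I samples, J first-dimension points, K second-dimension points
I, J, K = 15, 40, 25
rng = np.random.default_rng(1)
stack = rng.random((I, J, K))            # "three way" chromatogram stack

unfolded = stack.reshape(I, J * K)       # one row per sample, one column per pixel
centered = unfolded - unfolded.mean(axis=0)

# Ordinary PCA on the unfolded matrix
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
scores_pc1 = centered @ Vt[0]            # one PC 1 score per sample
loading_pc1 = Vt[0].reshape(J, K)        # refolded: can be plotted like a chromatogram
```

Refolding the loadings is what allows the PC loading plots to be read on the same retention-time axes as the original chromatograms.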
13
GC × GC/MS TIC of Fire Debris
[Figure: total ion chromatogram; axes Time (min), 0 to 40, and Time (s), 0.0 to 4.0]
6 clean carpet samples
5 gasoline samples
6 “doped” carpet samples
14
PCA Model Specifics
1. Only two carpet classes included
2. 4 PCs = 98% variance
3. Two random samples per class left out, all gasoline samples left out of “training set”
4. Left out samples “projected” onto the model later.
15
PC 1 - Loadings
[Figure: loadings shown as a GC x GC plot; axes Time (min), 0 to 50, and Time (s), 0 to 2.0]
Red = positive loadings, correlated
Blue = negative loadings, anti-correlated
16
PC 2 - Loadings
[Figure: loadings shown as a GC x GC plot; axes Time (min), 0 to 50, and Time (s), 0 to 2.0]
Chemically interpretable results!
Next step - classification
17
Principal Components Regression Discriminant Analysis (PCR-DA)
[Figure: the chromatograms X (GC x GC plots; axes Time (min), 0 to 50, and Time (s), 0 to 2.0) are related through the PCA model (PC 1, PC 2, T², Q) to a two-column class matrix Y, whose rows are coded 0 1 or 1 0 to mark w/ accelerant and wo/ accelerant, giving a Regression Vector.]
18
[Figure: GC x GC plot; axes Time (min), 0 to 50, and Time (s), 0 to 2.0]
Red = positive loadings
Blue = negative loadings
19
Regression Vector Zoom
[Figure: zoomed region of the regression vector]
20
Principal Components Regression Predictions
[Figure: Scores on the Regression Vector (about -0.2 to 1.8) plotted against Sample, for Unaltered Carpet, Arson Debris and Gasoline]
Discriminant Analysis: 1 = Member of Arson Class
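The PCR-DA idea can be sketched in a few lines: regress a 0/1 class label on the PCA scores, map the result back to a regression vector in the original variables, and read predictions near 1 as class membership. This is a minimal illustration under stated assumptions, not the analysis behind the slides. The function names, the simulated carpet and accelerant patterns, the single component and the 0.5 decision threshold are all hypothetical.

```python
import numpy as np

def pcr_da_fit(X, y, n_components):
    """Principal components regression of a 0/1 class label y on X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                       # PCA loadings
    T = Xc @ P                                    # PCA scores
    coef, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    b = P @ coef                                  # regression vector, one weight per variable
    return x_mean, y_mean, b

def pcr_da_predict(X, x_mean, y_mean, b):
    return (X - x_mean) @ b + y_mean

# Simulated data (assumption): a carpet background with or without an accelerant pattern
rng = np.random.default_rng(2)
carpet = rng.random(100)
accelerant = rng.random(100)

def simulate(n, doped):
    return carpet + doped * accelerant + 0.01 * rng.normal(size=(n, 100))

X_train = np.vstack([simulate(4, 0), simulate(4, 1)])
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)   # 1 = arson class
model = pcr_da_fit(X_train, y_train, n_components=1)

X_test = np.vstack([simulate(2, 0), simulate(2, 1)])        # held-out samples
y_pred = pcr_da_predict(X_test, *model)
is_arson = y_pred > 0.5                                      # hypothetical threshold
```

Because `b` has one weight per variable, it can be refolded and plotted like a chromatogram, which is what makes a regression vector plot chemically readable.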
21
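Classification – Soft Independent Model of Class Analogy (SIMCA)
[Figure: two panels in the space of variable 1, variable 2 and variable 3. One shows groups of samples (x, y, z, k); the other shows samples i and j against a PCA model with PC 1, PC 2, T² and Q marked.]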
22
SIMCA Model Specifics
1. PCA modeled for 2 classes – Arson, not Arson
2. Each model had 2 PCs with 99% variance captured
3. One random sample per class left out, all gasoline samples left out of “training set”
4. Left out samples “projected” onto each model later.
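The steps above can be sketched as follows: fit one PCA model per class, project each left-out sample onto every class model, and compare how well it fits. This simplified sketch ranks classes by the Q residual only; a full SIMCA implementation also uses T² and statistical limits on both statistics. The function names and the simulated carpet, doped and gasoline samples are assumptions for illustration.

```python
import numpy as np

def fit_class_model(X, n_components):
    """PCA model of one class: class mean plus loadings."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def q_residual(x, model):
    """Squared distance from sample x to the class model plane."""
    mean, P = model
    xc = x - mean
    e = xc - (xc @ P) @ P.T
    return float(e @ e)

# Simulated data (assumption): carpet background and accelerant pattern
rng = np.random.default_rng(3)
carpet = rng.random(100)
accelerant = rng.random(100)

def noise(n):
    return 0.05 * rng.normal(size=(n, 100))

train = {"carpet": carpet + noise(5), "doped": carpet + accelerant + noise(5)}
models = {name: fit_class_model(X, 2) for name, X in train.items()}

def nearest_class(x):
    return min(models, key=lambda name: q_residual(x, models[name]))

carpet_test = carpet + noise(1)[0]
doped_test = carpet + accelerant + noise(1)[0]
gasoline_test = accelerant + noise(1)[0]          # belongs to neither class

# Largest Q among each class's own training samples, as a crude reference level
train_q_max = {name: max(q_residual(x, models[name]) for x in X)
               for name, X in train.items()}
gasoline_q = {name: q_residual(gasoline_test, models[name]) for name in models}
```

A sample such as `gasoline_test` always has a nearest class, so a "not in any class" decision needs a limit on the fit statistics; here the training-set maximum of Q stands in for such a limit.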
23
Arson “Case” SIMCA Results
[Figure: four panels, each plotted across the Carpet, Doped and Gasoline samples: In Carpet Class (0 or 1), In Doped Class (0 or 1), Nearest Class (1 or 2), Not in any Class (0 or 1). Legend: Carpet Samples, Carpet Test, Doped Samples, Doped Test, Gasoline Test.]
24
Arson “Case” SIMCA Fit Statistics
[Figure: Fit Statistics for Doped Carpet Class and Fit Statistics for Carpet Class, each shown at two zoom levels with axes labeled Q Residuals and T^2 Residuals. Legend: Carpet Samples, Carpet Test, Doped Samples, Doped Test, Gasoline Test.]
25
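Parallel Factor Analysis (PARAFAC)
[Figure: the data array X (I x J x K) is decomposed using loading matrices A (I x R), B (J x R) and C (K x R) with core G, plus a residual array E (I x J x K); equivalently, X is a sum of three trilinear components (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) plus E.]
$x_{ijk} = \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} + e_{ijk}$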
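To make the trilinear model concrete, here is a minimal alternating least squares (ALS) sketch in NumPy. ALS is a common way to fit PARAFAC, but nothing in the slides says which algorithm or software was used, so treat this as an illustration only. The function name, the two simulated overlapping peaks, the rank and the iteration count are assumptions.

```python
import numpy as np

def parafac_als(X, rank, n_iter=500, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = rng.random((I, rank)), rng.random((J, rank)), rng.random((K, rank))
    for _ in range(n_iter):
        # Update each loading matrix in turn while holding the other two fixed
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def gaussian(n, center, width):
    x = np.arange(n)
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Simulated data (assumption): two overlapping peaks present in four samples
A_true = np.array([[1.0, 0.2], [0.5, 1.0], [0.8, 0.6], [0.1, 0.9]])       # sample amounts
B_true = np.column_stack([gaussian(30, 12, 3), gaussian(30, 17, 3)])      # first-dimension profiles
C_true = np.column_stack([gaussian(20, 8, 2), gaussian(20, 11, 2)])       # second-dimension profiles
X = np.einsum('ir,jr,kr->ijk', A_true, B_true, C_true)

A, B, C = parafac_als(X, rank=2)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Each recovered factor pairs one column of `A` (sample amounts) with one column each of `B` and `C` (peak profiles), which is why PARAFAC is attractive for resolving overlapped peaks. The factors come back in arbitrary order and scale.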
26
PARAFAC
GC x GC - FID Chromatogram Stack
[Figure: X (I x J x K) decomposed into Factor 1 (a1, b1, c1) and Factor 2 (a2, b2, c2). Each factor consists of a Sample Score, a First Dimension Loading and a Second Dimension Loading.]
27
Parallel Factor Analysis (PARAFAC) GC x GC - TOF Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.
28
Parallel Factor Analysis (PARAFAC)
GC x GC - TOF Sample
[Figure: X (I x J x K) decomposed into Factor 1 (a1, b1, c1) and Factor 2 (a2, b2, c2). Each factor consists of an m/z Score, a First Dimension Loading and a Second Dimension Loading.]
29
Parallel Factor Analysis (PARAFAC) GC x GC - TOF “Complex Environmental Sample” Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.
30
PARAFAC Results Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.
31
PARAFAC Results Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.
32
GC × GC/MS Peak Deconvolution: PARAFAC?
[Figure: GCImage screen capture, NIJ0221, 100 µg 75% Wx gasoline / nylon carpet matrix]
33
Partial Least Squares (PLS)
$X = TP + E$
$Y = TQ + F$
X: data (samples x variables); Y: properties (samples x properties); T P and T Q: the “model”, built from the “latent variables” T; E, F: residuals
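A small NIPALS-style PLS1 sketch (a single property y) may help show how the shared latent variables T tie X to Y. It is a textbook-style illustration rather than the PLS Toolbox routine. The function names, the simulated analyte and interferent profiles and the choice of two latent variables are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """PLS regression for one property y using NIPALS-style deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f
        w = w / np.linalg.norm(w)        # weight: direction in X most covariant with y
        t = E @ w                        # latent variable scores
        tt = t @ t
        p = E.T @ t / tt                 # X loading
        qa = f @ t / tt                  # y loading
        E = E - np.outer(t, p)           # deflate X
        f = f - qa * t                   # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)  # regression vector in the original variables
    return x_mean, y_mean, b

def pls1_predict(X, x_mean, y_mean, b):
    return (X - x_mean) @ b + y_mean

# Simulated data (assumption): an analyte signal plus one interferent
rng = np.random.default_rng(4)
analyte, interferent = rng.random(50), rng.random(50)

def simulate(conc, other):
    signal = np.outer(conc, analyte) + np.outer(other, interferent)
    return signal + 0.001 * rng.normal(size=(len(conc), 50))

c_train, c_test = rng.random(20), rng.random(5)
model = pls1_fit(simulate(c_train, rng.random(20)), c_train, n_lv=2)
c_pred = pls1_predict(simulate(c_test, rng.random(5)), *model)
```

Unlike PCA followed by regression, the weight vector `w` is chosen using y at every step, so the latent variables are directed toward the property being predicted.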
34
PLS Results Naphthalenes in Jet Fuel Johnson, K. J.; Prazen, B. J.; Young, D. C.; Synovec, R. E. Journal of Separation Science 2004, 27, 410-416.
35
Alignment Strategy 1: Experimental Design
Alignment Strategy 2: Templates / Peak Tables
Alignment Strategy 3: Retention Index
36
Alignment Strategy 4 Piecewise Correlation Maximization Pierce, K. M.; Wood, L. F.; Wright, B. W.; Synovec, R. E. Analytical Chemistry 2005, 77, 7735-7743.
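The general idea of piecewise alignment can be sketched on a one-dimensional signal: split the retention axis into windows, find for each window the shift that maximizes correlation with a target, then interpolate those shifts along the axis and resample. This is a simplified illustration of the concept, not the published algorithm of Pierce et al. The function names, window length, shift limit and simulated peaks are assumptions.

```python
import numpy as np

def window_shifts(sample, target, window, max_shift):
    """For each window, find the integer lag where sample[i + lag] best matches target[i]."""
    n = len(target)
    centers, shifts = [], []
    for start in range(0, n - window + 1, window):
        stop = start + window
        ref = target[start:stop]
        best_lag, best_r = 0, -np.inf
        for lag in range(-max_shift, max_shift + 1):
            lo, hi = start + lag, stop + lag
            if lo < 0 or hi > n:
                continue                              # shifted window falls off the signal
            seg = sample[lo:hi]
            if seg.std() == 0 or ref.std() == 0:
                continue                              # correlation undefined for flat segments
            r = np.corrcoef(ref, seg)[0, 1]
            if r > best_r:
                best_lag, best_r = lag, r
        centers.append((start + stop - 1) / 2)
        shifts.append(best_lag)
    return np.array(centers), np.array(shifts)

def align(sample, target, window=80, max_shift=10):
    centers, shifts = window_shifts(sample, target, window, max_shift)
    idx = np.arange(len(sample))
    lag = np.interp(idx, centers, shifts)             # smooth shift along the axis
    return np.interp(idx + lag, idx, sample)          # resample at the shifted positions

# Simulated data (assumption): four peaks, with the sample delayed by 4 points
x = np.arange(340)

def peaks(offset):
    return sum(np.exp(-0.5 * ((x - c - offset) / 5.0) ** 2) for c in (40, 120, 200, 280))

target, sample = peaks(0), peaks(4)
centers, shifts = window_shifts(sample, target, window=80, max_shift=10)
aligned = align(sample, target)
```

Interpolating the shifts between window centers avoids discontinuities at window edges, at the cost of assuming that retention drift changes smoothly along the run.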
37
Alignment Strategy 5 “Warping” Kaczmarek, K.; Walczak, B.; de Jong, S.; Vandeginste, B. G. M. Journal of Chemical Information and Computer Sciences 2003, 43, 978-986.
38
Alignment Strategy Proposal # 1 Anchor Warping
39
Alignment Strategy Proposal # 1 Anchor Warping
40
Alignment Strategy Proposal #2: DTW – Piecewise Hybrid
1st Dimension: DTW (Alkanes?)
2nd Dimension: Piecewise
41
Humble Opinions
1. GC x GC is tremendously interesting data
2. Tremendous amounts of work possible, even with data that presently exists. Good alignment will open up even more possibilities
3. Include the Chemist in the analysis
4. Include the Chemometrician in the experimental design
42
Future?
1. More PCA, PCR, PLS, PARAFAC
2. Regression certainty calculations
3. NPLS, NPLS-DA
4. Holistic, automatic alignment strategies: 2D COW or DTW? PARAFAC2?
5. User driven alignment strategies: Anchor warping
6. Inclusion of m/z axis: Purity, CODA?
43
Acknowledgements
U.S. Coast Guard Academy
Alexander Trust
You all!