10th Winter Symposium on Chemometrics
29 February 2016, Samara, Russia
Hybrid chemometric approaches to increase efficiency of classification and data fusion techniques
Yulia B. Monakhova, Monika Hohmann, Svetlana P. Mushtakova, Norbert Christoph, Helmut Wachter, Bernd Diehl, Ulrike Holzgrabe, Douglas N. Rutledge
  Institute of Chemistry, Saratov State University, Saratov, Russia
  Spectral Service AG, Cologne, Germany
  Institute of Pharmacy and Food Chemistry, University of Würzburg, Würzburg, Germany
  Bavarian Health and Food Safety Authority, Würzburg, Germany
  UMR Ingénierie Procédés Aliments, AgroParisTech, Inra, Université Paris-Saclay, France
Introduction
  How can we cope with cluster overlap in linear classification models?
  By using a synergetic combination of existing multivariate approaches.
ICA + DA Problem statement
  Variable reduction is necessary for DA.
  PCA is a routine tool for dimension reduction.
  Is ICA an alternative for DA preprocessing?
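The question on this slide can be tried out on toy data. The sketch below is only an illustration, not the authors' workflow: it assumes scikit-learn, a synthetic data matrix in place of the NMR spectra, LDA as the discriminant step, and an arbitrary number of components (`n_comp = 8`). It compares PCA and ICA as the reduction step in front of the same classifier.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a spectral matrix (assumption): 200 samples,
# 300 variables, 4 classes. Real spectra would replace X and y.
X, y = make_classification(
    n_samples=200, n_features=300, n_informative=10, n_redundant=20,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

def cv_accuracy(reducer):
    """Mean 5-fold cross-validated accuracy of scaler -> reducer -> LDA."""
    model = make_pipeline(StandardScaler(), reducer, LinearDiscriminantAnalysis())
    return cross_val_score(model, X, y, cv=5).mean()

n_comp = 8  # arbitrary choice for the illustration
scores = {
    "PCA + LDA": cv_accuracy(PCA(n_components=n_comp)),
    "ICA + LDA": cv_accuracy(FastICA(n_components=n_comp, random_state=0, max_iter=1000)),
}
for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")
```

Putting the reducer inside the pipeline means it is refitted on each training fold, so the validation samples never influence the PCs or ICs.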
ICA + DA NMR wine data set
  No.  Data set           Classified parameter   Groups
  1    Riesling (n=334)   Year                   5 groups: 2005, 2006, 2007, 2009, 2010
  2    Riesling (n=217)   Origin                 5 groups: MSR, PFL, RHH, Sac, Wue
  3    2009 (n=111)                              4 groups: PFL, NAH, MSR, RHH
  4    Red wine (n=303)   Grape variety          6 groups: Pinot noir, Dornfelder, Lemberger, Portugieser, Trollinger, Regent
ICA + DA Number of PCs/ICs
ICA + DA Number of PCs/ICs
  No.  Data matrix        Classified parameter   FDA / LDA (PCA, ICA)
  1    Riesling (n=334)   Year                   10  7  8  6
  2    Riesling (n=217)   Origin                 12
  3    2009 (n=111)                              9
  4    Red wine (n=303)   Grape variety          15  13
ICA + DA Classification results
  Data matrix        Classified parameter   Number of samples for validation   FDA / LDA (PCA, ICA)
  Riesling (n=334)   Year                   56                                 70  75  84  90
  Riesling (n=217)   Origin                 36                                 69  89  95
  2009 (n=111)                              22                                 76  85  87  92
  Red wine (n=303)   Grape variety          61                                 43  46  79
ICA + DA Sum of ranking differences
  Ref.: K. Héberger, Sum of ranking differences compares methods or models fairly. Trends Anal. Chem. 29 (2010) 101-109.
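As a reminder of how the sum of ranking differences (SRD) is computed, here is a minimal sketch. It assumes the row average as the reference ranking, which is a common choice but not necessarily the one used in this talk, and the numbers are invented toy values, not the results shown on the slides. The randomization test that normally accompanies SRD is left out.

```python
def ranks(values):
    """Return 1-based ascending ranks; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result

def srd(method_values, reference_values):
    """Sum of absolute differences between method ranks and reference ranks."""
    r_method = ranks(method_values)
    r_ref = ranks(reference_values)
    return sum(abs(a - b) for a, b in zip(r_method, r_ref))

# Hypothetical toy table: one list per method, one entry per object (row).
data = {
    "method_A": [1.0, 2.0, 3.0, 4.0],
    "method_B": [1.1, 3.2, 2.9, 4.5],
    "method_C": [4.0, 3.0, 2.0, 1.0],
}
n_rows = 4
# Reference ranking (assumption): the average of each row over all methods.
reference = [sum(col[i] for col in data.values()) / len(data) for i in range(n_rows)]

srd_values = {name: srd(col, reference) for name, col in data.items()}
for name, value in srd_values.items():
    print(name, value)
```

The smaller the SRD, the closer a method ranks the objects to the reference; a value of zero means identical ranking.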
ICA + DA Classification results
  Panels: PCA-FDA, ICA-FDA
CCSWA + PLS-DA Problem statement
  PLS-DA + common components and specific weights analysis (CCSWA): better classification?
CCSWA + PLS-DA Algorithm
  Global scores calculation
  PLS-DA on global scores
  1-latent-variable PLS regression between the WG matrix and the binary-coded groups matrix
  PCA decomposition of the WG matrix
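The two ways of obtaining a global score named on this slide can be sketched side by side. This is a rough illustration under stated assumptions, not the authors' implementation: the block scaling, the salience update, the stopping rule, and all function names (`prepare_block`, `pca_score`, `pls1lv_score`, `global_score`) are hypothetical and follow the usual textbook description of CCSWA, where WG is the salience-weighted sum of the block cross-product matrices. Only the first common component is computed and the data are random toy blocks.

```python
import numpy as np

def prepare_block(X):
    """Column-center a block and scale it to unit Frobenius norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)

def pca_score(WG):
    """Global score as the first normalized eigenvector of the symmetric WG."""
    _, eigvecs = np.linalg.eigh(WG)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def pls1lv_score(WG, Y):
    """Global score as the normalized score of a one-LV PLS between WG and Y."""
    Yc = Y - Y.mean(axis=0)
    # First PLS weight vector: dominant left singular vector of WG^T Yc.
    U, _, _ = np.linalg.svd(WG.T @ Yc, full_matrices=False)
    t = WG @ U[:, 0]
    return t / np.linalg.norm(t)

def global_score(blocks, score_fn, n_iter=100, tol=1e-10):
    """Alternate between the global score q and the block saliences."""
    W = [Xk @ Xk.T for Xk in blocks]          # one n x n matrix per block
    saliences = np.ones(len(W))
    for _ in range(n_iter):
        WG = sum(lam * Wk for lam, Wk in zip(saliences, W))
        q = score_fn(WG)
        new_saliences = np.array([q @ Wk @ q for Wk in W])
        converged = np.max(np.abs(new_saliences - saliences)) < tol
        saliences = new_saliences
        if converged:
            break
    return q, saliences

# Toy data (assumption): two blocks, 30 samples, 3 groups of 10.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 10)
Y = np.eye(3)[labels]                          # binary-coded groups matrix
blocks = [
    prepare_block(rng.normal(size=(30, 8)) + 0.5 * labels[:, None]),
    prepare_block(rng.normal(size=(30, 5)) + 0.2 * labels[:, None]),
]

q_pca, lam_pca = global_score(blocks, pca_score)
q_pls, lam_pls = global_score(blocks, lambda WG: pls1lv_score(WG, Y))
print("saliences (PCA step):", lam_pca)
print("saliences (PLS step):", lam_pls)
```

The only difference between the two runs is the `score_fn` argument, so the supervised variant reuses the rest of the loop unchanged. Further components would require deflating the blocks, which is omitted here.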
CCSWA + PLS-DA Normally fruited tomatoes
  CCSWA: Wilks' lambda = 0.14
  PLSDA-CCSWA: Wilks' lambda = 0.05
CCSWA + PLS-DA Small fruited tomatoes
  MB hierarchical PLS CCSWA: Wilks' lambda = 0.13
  CCSWA: Wilks' lambda = 0.11
  PLSDA-CCSWA: Wilks' lambda = 0.05
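The tomato slides compare models by Wilks' lambda, the ratio of the determinant of the within-group scatter matrix to that of the total scatter matrix; values near 0 indicate well-separated groups and values near 1 indicate overlap. The sketch below shows the calculation on invented 2-D scores; the function name `wilks_lambda` and the toy data are assumptions, and how the scores were chosen in the talk is not stated on the slides.

```python
import numpy as np

def wilks_lambda(scores, labels):
    """Wilks' lambda = det(W) / det(T) for samples in rows of `scores`."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    centered = scores - scores.mean(axis=0)
    T = centered.T @ centered                  # total scatter matrix
    W = np.zeros_like(T)                       # pooled within-group scatter
    for g in np.unique(labels):
        group = scores[labels == g]
        dev = group - group.mean(axis=0)
        W += dev.T @ dev
    return np.linalg.det(W) / np.linalg.det(T)

# Toy scores (assumption): 3 groups of 20 samples in a 2-D score space.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 20)
overlapping = rng.normal(size=(60, 2))
# Shift group 0 along axis 1 and group 1 along axis 2 to separate the groups.
separated = overlapping + 5.0 * np.eye(3)[labels][:, :2]

lam_overlap = wilks_lambda(overlapping, labels)
lam_separated = wilks_lambda(separated, labels)
print(f"overlapping groups: {lam_overlap:.2f}")
print(f"separated groups:   {lam_separated:.2f}")
```

Because the shift is constant within each group, the within-group scatter is identical in both cases; only the total scatter grows, which is what drives lambda down.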
CCSWA + PLS-DA Preprocessing
CCSWA + PLS-DA Number of LVs and CCs
  CCSWA: CCs = 11 / LVs = 2
  PLSDA-CCSWA: CCs = 5-12 / LVs = 1-2
CCSWA + PLS-DA Prediction
Thank you for your attention!