A NEW USE OF TARGET FACTOR ANALYSIS (TFA) John H. Kalivas, Kevin Higgins Department of Chemistry Idaho State University Pocatello, Idaho 83209 USA Erik.


1 A NEW USE OF TARGET FACTOR ANALYSIS (TFA)
John H. Kalivas, Kevin Higgins
Department of Chemistry, Idaho State University, Pocatello, Idaho 83209 USA
Erik Andries
Department of Mathematics, Central New Mexico Community College, Albuquerque, New Mexico 87106 USA

2 Classification Situation
Numerous classification approaches
  –KNN, LDA, MD, ANN, SVM, …
Classification can become more difficult as the number of classes in a problem increases
Target factor analysis (TFA) and net analyte signal (NAS)
  –TFA and NAS both calculate analogous angles between a test sample vector and the respective spaces spanned by the library classes
  –Useful for binary or multiclass situations

3 Requirements
X_i = m × n library information matrix for the ith class
  –m = number of samples
  –n = number of measurements (wavelengths for spectra, or other physical or chemical variables)
  –Samples making up a library class must span the sources of variance within the class (instrument profile, temperature effects, measurement process, others)
y = n × 1 test sample measurement vector

4 Orthogonal Projection Spatial Angle (OPSA)
Identical to TFA and NAS
  –Use same orthogonal projection
[Figure: test sample vector y]
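The slide does not give the OPSA formula. As a minimal sketch, assume the angle is the one between the test vector y and its orthogonal projection onto the space spanned by the first d right singular vectors of a class library X (rows are samples). The function name `opsa_angle` and the choice of degrees are illustrative assumptions, not the authors' code.

```python
import numpy as np

def opsa_angle(X, y, d):
    """Angle (degrees) between y and the span of the first d right
    singular vectors of X. Assumed definition, see the note above."""
    # Rows of X are samples, so the rows of Vt live in measurement space.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T                       # n x d orthonormal basis
    y_hat = V @ (V.T @ y)              # orthogonal projection of y
    cos_theta = np.linalg.norm(y_hat) / np.linalg.norm(y)
    # Clip guards against round-off pushing the cosine past 1.
    return np.degrees(np.arccos(np.clip(cos_theta, 0.0, 1.0)))

# Toy library whose rows span the x-y plane of a 3-variable space.
X_demo = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
angle_in_plane = opsa_angle(X_demo, np.array([1.0, 1.0, 0.0]), d=2)
angle_tilted = opsa_angle(X_demo, np.array([1.0, 0.0, 1.0]), d=2)
angle_normal = opsa_angle(X_demo, np.array([0.0, 0.0, 1.0]), d=2)
```

Under this definition the angle depends only on the direction of y, so rescaling y leaves it unchanged.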

5 Process
No data preprocessing
Perform SVD of each library class
Retain d eigenvectors (class-wise), where 1 ≤ d ≤ k and k = rank(X) ≤ min(m, n)
Compute OPSA, MD, and KNN for the test sample relative to each library class
  –Use leave-one-out cross-validation (LOOCV)
The library class with the smallest angle or MD is the test sample classification
KNN classification trends evaluated
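The process above can be sketched for the angle-based classifier as follows. This is a hedged illustration, not the authors' implementation: the helper names (`class_basis`, `angle_to_basis`, `loocv_predict`) are hypothetical, a single d is used for every class for brevity (the slide retains d class-wise), and MD and KNN are omitted.

```python
import numpy as np

def class_basis(X, d):
    """First d right singular vectors of a class library (rows = samples)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:d].T

def angle_to_basis(V, y):
    """Angle (radians) between y and the space spanned by the columns of V."""
    # For orthonormal V, ||V.T @ y|| equals the norm of the projection of y.
    cos_theta = np.linalg.norm(V.T @ y) / np.linalg.norm(y)
    return np.arccos(np.clip(cos_theta, 0.0, 1.0))

def loocv_predict(libraries, d):
    """libraries: dict mapping class label -> (m_i x n) array.
    Returns a list of (true_label, predicted_label) pairs."""
    results = []
    for true_label, X in libraries.items():
        for i in range(X.shape[0]):
            y = X[i]
            angles = {}
            for label, Xc in libraries.items():
                # Hold the test sample out of its own class library only.
                train = np.delete(Xc, i, axis=0) if label == true_label else Xc
                angles[label] = angle_to_basis(class_basis(train, d), y)
            # Smallest angle decides the class.
            results.append((true_label, min(angles, key=angles.get)))
    return results

# Synthetic two-class example (made-up data, for illustration only).
rng = np.random.default_rng(0)
a = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
A = np.outer(rng.uniform(1, 2, 6), a) + 0.01 * rng.standard_normal((6, 5))
B = np.outer(rng.uniform(1, 2, 6), b) + 0.01 * rng.standard_normal((6, 5))
demo_results = loocv_predict({"A": A, "B": B}, d=1)
```

Recomputing the SVD for every held-out sample is wasteful but keeps the sketch short; only the held-out sample's own class basis actually changes between iterations.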

6 Assessment
Accuracy = (TP + TN)/(TP + TN + FP + FN)
  –TP = true positives
  –TN = true negatives
  –FP = false positives
  –FN = false negatives
Receiver operating characteristic (ROC)
  –True positive rate = sensitivity = TP/(TP + FN)
  –False positive rate = 1 - specificity = 1 - TN/(TN + FP)
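The formulas above translate directly into code. The sketch below assumes a one-vs-rest treatment for the multiclass case (the slides do not say how counts were pooled), and the function name `one_vs_rest_metrics` is hypothetical.

```python
def one_vs_rest_metrics(true_labels, predicted_labels, positive):
    """Accuracy, sensitivity, specificity, and false positive rate,
    treating `positive` as the positive class and all others as negative."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Raises ZeroDivisionError if a class has no positives or no negatives.
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
    }

# Made-up labels: for class "A", TP = 1, FN = 1, FP = 1, TN = 2.
metrics_a = one_vs_rest_metrics(
    ["A", "A", "B", "B", "C"],
    ["A", "B", "B", "B", "A"],
    positive="A",
)
```

Each (false positive rate, sensitivity) pair gives one point on an ROC plot.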

7 Determining Eigenvectors
Numerous approaches exist to determine the minimum number of eigenvectors needed to span X
Determination of rank by augmentation (DRAUG)
  –Malinowski ER. J. Chemom. 2011; 25: 323-328
  –Distinguishes primary eigenvectors (chemical, instrumental, etc.) from secondary eigenvectors (experimental error), independent of the distribution of the experimental uncertainties

8 Plastic Data
Six classes (types 1-6 of the seven commercial plastic types)
  –Allen V, Kalivas JH, Rodriguez RG. Appl. Spectrosc. 1999; 53: 672-681
Raman spectroscopy (850 – 1800 cm⁻¹, 1093 wavenumbers per spectrum)
  –Type 1 = polyethylene terephthalate (PET); 30 samples
  –Type 2 = high-density polyethylene (HDPE); 29 samples
  –Type 3 = polyvinyl chloride (PVC); 13 samples
  –Type 4 = low-density polyethylene (LDPE); 22 samples
  –Type 5 = polypropylene (PP); 23 samples
  –Type 6 = polystyrene (PS); 29 samples

9 Plastic Score and Scree Plots
[Figures: Score Plot (Types 1-6) and Scree Plot]
Unique clusters are not formed
Most of the spectral variance is captured with the first eigenvector

10 Plastic Classification Results
[Figures; panel and axis labels: OPSA, MD, KNN, Accuracy, Sensitivity, Specificity, Total Accuracy Across All Classes, ROC Plot]
Numbers indicate number of eigenvectors
a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

11 Archeological Data
Four classes (four archeological sources of obsidian)
  –Kowalski BR, Schatzki TF, Stross FH. Anal. Chem. 1972; 44: 2176-2180
10 trace metal concentrations from X-ray fluorescence spectroscopy (Fe, Ti, Ba, Ca, K, Mn, Rb, Sr, Y, and Zr)
  –Source 1 = 10 samples
  –Source 2 = 9 samples
  –Source 3 = 23 samples
  –Source 4 = 21 samples

12 Archeological Classification Results
[Figures; panel and axis labels: Score Plot (Sources 1-4), Scree Plot, OPSA, MD, KNN, Accuracy, Sensitivity, Specificity, Total Accuracy Across All Classes]
a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

13 Gasoil Data
Three classes (three commercial sources of gasoil)
  –Wentzell P, Andrews D, Walsh J, Cooley J, Spencer P. Can. J. Chem. 1999; 77: 391-400
Ultraviolet spectroscopy (200 – 400 nm, 572 wavelengths per spectrum)
  –Source 1 = 59 samples
  –Source 2 = 25 samples
  –Source 3 = 30 samples

14 Gasoil Classification Results
[Figures; panel and axis labels: Score Plot (Sources 1-3), Scree Plot, OPSA, MD, KNN, Accuracy, Sensitivity, Specificity, Total Accuracy Across All Classes]
a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

15 Extra Virgin Olive Oil (EVOO) Data
Six classes (six adulterant oils)
  –Poulli KI, Mousdis GA, Georgiou CA. Food Chem. 2007; 105: 369-375
Synchronous fluorescence spectroscopy (250 – 400 nm at Δλ = 20 nm, 151 wavelengths per spectrum)
  –Adulterant 1 = corn
  –Adulterant 2 = olive-pomace
  –Adulterant 3 = soybean
  –Adulterant 4 = sunflower
  –Adulterant 5 = rapeseed
  –Adulterant 6 = walnut
31 samples per adulterant, at 0.5 to 95 % adulterant

16 EVOO Classification Results
[Figures; panel and axis labels: Score Plot (Corn, Olive-pomace, Rapeseed, Soybean, Sunflower, Walnut), Scree Plot, OPSA, MD, KNN, Accuracy, Sensitivity, Specificity, Total Accuracy Across All Classes]

17 EVOO Concentrations
[Figures: Score Plot (Corn, Olive-pomace, Rapeseed, Soybean, Sunflower, Walnut) and Concentration Coded Score Plot (% Sunflower)]
a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

18 Summary
The TFA or NAS angular measure OPSA outperforms MD and KNN over a variety of data sets
  –If y is normalized to unit length, the same results are obtained with TFA
Score plots need not show obvious class separation
Need to determine the number of eigenvectors (basis vectors) to characterize each library class
Samples making up a library class need to span the sources of variance within that library class
  –Instrument profile
  –Temperature effects
  –Others

