10th Winter Symposium on Chemometrics


10th Winter Symposium on Chemometrics
29 February 2016, Samara, Russia

Hybrid chemometric approaches to increase efficiency of classification and data fusion techniques

Yulia B. Monakhova, Monika Hohmann, Svetlana P. Mushtakova, Norbert Christoph, Helmut Wachter, Bernd Diehl, Ulrike Holzgrabe, Douglas N. Rutledge

Institute of Chemistry, Saratov State University, Saratov, Russia
Spectral Service AG, Cologne, Germany
Institute of Pharmacy and Food Chemistry, University of Würzburg, Würzburg, Germany
Bavarian Health and Food Safety Authority, Würzburg, Germany
UMR Ingénierie Procédés Aliments, AgroParisTech, Inra, Université Paris-Saclay, France

Introduction
How can we cope with cluster overlap in linear classification models?
By using a synergetic combination of existing multivariate approaches.

ICA + DA Problem statement
Variable reduction is necessary for DA.
PCA is a routine tool for dimension reduction.
Is ICA an alternative for DA preprocessing?
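The question posed on this slide can be tried out with a minimal sketch like the one below. It is not the authors' workflow: it assumes scikit-learn, uses a synthetic data set from `make_classification` in place of the NMR wine spectra, and uses `LinearDiscriminantAnalysis` as the DA step. The component counts (10 PCs, 7 ICs) are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a spectral data set: many variables, five groups.
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=15, n_redundant=20,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Two alternative variable-reduction steps placed in front of the same DA model.
# The numbers of components are illustrative assumptions, not tuned values.
reducers = {
    "PCA": PCA(n_components=10, random_state=0),
    "ICA": FastICA(n_components=7, max_iter=1000, random_state=0),
}

accuracy = {}
for name, reducer in reducers.items():
    model = make_pipeline(StandardScaler(), reducer, LinearDiscriminantAnalysis())
    # Mean 5-fold cross-validated accuracy of reduction followed by LDA.
    accuracy[name] = cross_val_score(model, X, y, cv=5).mean()

for name, value in accuracy.items():
    print(f"{name} + LDA: mean CV accuracy = {value:.3f}")
```

Putting the reduction step inside the pipeline means it is refitted on each training fold, so the validation samples do not influence the PCs or ICs.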

ICA + DA NMR wine data set
No. | Data set | Classified parameter | Groups
1 | Riesling (n=334) | Year | 5 groups: 2005, 2006, 2007, 2009, 2010
2 | Riesling (n=217) | Origin | 5 groups: MSR, PFL, RHH, Sac, Wue
3 | 2009 (n=111) | Origin | 4 groups: PFL, NAH, MSR, RHH
4 | Red wine (n=303) | Grape variety | 6 groups: Pinot noir, Dornfelder, Lemberger, Portugieser, Trollinger, Regent

ICA + DA Number of PCs/ICs


ICA + DA Number of PCs/ICs
No. | Data matrix | Classified parameter | FDA and LDA, each with PCA and ICA
1 | Riesling (n=334) | Year | 10, 7, 8, 6
2 | Riesling (n=217) | Origin | 12
3 | 2009 (n=111) | Origin | 9
4 | Red wine (n=303) | Grape variety | 15, 13

ICA + DA Classification results
Data matrix | Classified parameter | Number of samples for validation | FDA and LDA, each with PCA and ICA
Riesling (n=334) | Year | 56 | 70, 75, 84, 90
Riesling (n=217) | Origin | 36 | 69, 89, 95
2009 (n=111) | Origin | 22 | 76, 85, 87, 92
Red wine (n=303) | Grape variety | 61 | 43, 46, 79

ICA + DA Sum of ranking differences
Ref.: K. Heberger, Sum of ranking differences compares methods or models fairly. Trends Anal. Chem. 29 (2010) 101-109.
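The core of the sum of ranking differences (SRD) can be sketched in a few lines. This is a simplified illustration: each method's values are ranked across the objects, the ranks are compared with those of a reference column (here the row average, a common choice), and the absolute rank differences are summed. The validation steps of the published procedure are omitted, and the table values and the names `method_a`, `method_b`, `method_c` are hypothetical.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def sum_of_ranking_differences(table, reference=None):
    """Return {method: SRD}; table maps each method name to its values per object."""
    n_objects = len(next(iter(table.values())))
    if reference is None:
        # Default reference: average of all methods for each object (row mean).
        reference = [
            sum(column[i] for column in table.values()) / len(table)
            for i in range(n_objects)
        ]
    reference_ranks = average_ranks(reference)
    return {
        name: sum(abs(r - ref) for r, ref in zip(average_ranks(column), reference_ranks))
        for name, column in table.items()
    }


# Hypothetical values for three methods evaluated on four objects.
table = {
    "method_a": [1.0, 2.0, 3.0, 4.0],
    "method_b": [4.0, 3.0, 2.0, 1.0],
    "method_c": [1.0, 3.0, 2.0, 4.0],
}
srd = sum_of_ranking_differences(table)
print(srd)
```

A smaller SRD means the method orders the objects more like the reference does.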

ICA + DA Classification results
Figure panels: PCA-FDA; ICA-FDA

CCSWA + PLS-DA Problem statement
PLS-DA
Common components and specific weights analysis (CCSWA)
Better classification?

CCSWA + PLS-DA Algorithm
Global scores calculation
PLS-DA on global scores
1-Latent Variable PLS regression between WG matrix and binary-coded groups matrix
PCA decomposition of the WG matrix
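One way to read this slide is that the global score of each common component comes either from a PCA decomposition of the WG matrix (plain CCSWA) or from a one-latent-variable PLS step between WG and the binary-coded groups matrix (the hybrid). The NumPy sketch below follows that reading and should be taken as an assumption, not as the authors' implementation. In particular, the PLS step (dominant singular vector of WG^T Y), the block scaling, and the names `global_score` and `ccswa` are illustrative choices, and the data are random.

```python
import numpy as np


def global_score(WG, Y=None):
    """Unit-norm global score vector computed from the weighted sum matrix WG."""
    if Y is None:
        # CCSWA: first principal direction of the symmetric matrix WG.
        _, eigvecs = np.linalg.eigh(WG)
        t = eigvecs[:, -1]
    else:
        # Assumed 1-LV PLS step: the weight vector is the dominant left
        # singular vector of WG^T Y, and the score is WG times that vector.
        Yc = Y - Y.mean(axis=0)
        u, _, _ = np.linalg.svd(WG.T @ Yc, full_matrices=False)
        t = WG @ u[:, 0]
    return t / np.linalg.norm(t)


def ccswa(blocks, n_components, Y=None, n_iter=100, tol=1e-10):
    """Return (global scores, saliences) for a list of data blocks."""
    # Centre each block and scale it to unit Frobenius norm.
    X = []
    for block in blocks:
        centred = block - block.mean(axis=0)
        X.append(centred / np.linalg.norm(centred))
    n_samples = X[0].shape[0]
    scores = np.zeros((n_samples, n_components))
    saliences = np.zeros((len(X), n_components))
    for c in range(n_components):
        W = [Xk @ Xk.T for Xk in X]          # sample cross-product matrices
        lam = np.ones(len(X))                # specific weights (saliences)
        for _ in range(n_iter):
            WG = sum(l * Wk for l, Wk in zip(lam, W))
            q = global_score(WG, Y)
            new_lam = np.array([q @ Wk @ q for Wk in W])
            converged = np.abs(new_lam - lam).max() < tol
            lam = new_lam
            if converged:
                break
        scores[:, c] = q
        saliences[:, c] = lam
        # Deflate every block with respect to the global score just found.
        X = [Xk - np.outer(q, q @ Xk) for Xk in X]
    return scores, saliences


# Random stand-in data: two blocks on the same 12 samples, three groups.
rng = np.random.default_rng(0)
blocks = [rng.normal(size=(12, 5)), rng.normal(size=(12, 8))]
labels = np.repeat([0, 1, 2], 4)
Y = np.eye(3)[labels]                        # binary-coded groups matrix

scores_pca, sal_pca = ccswa(blocks, n_components=2)
scores_pls, sal_pls = ccswa(blocks, n_components=2, Y=Y)
```

In both variants the global scores are orthonormal because each block is deflated before the next component is extracted. A discriminant model can then be fitted on the returned scores.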

CCSWA + PLS-DA Normally fruited tomatoes
CCSWA: Wilks' lambda = 0.14
PLSDA-CCSWA: Wilks' lambda = 0.05

CCSWA + PLS-DA Small fruited tomatoes
MB hierarchical PLS
CCSWA: Wilks' lambda = 0.13
CCSWA: Wilks' lambda = 0.11
PLSDA-CCSWA: Wilks' lambda = 0.05
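Wilks' lambda is the ratio of the determinant of the within-group scatter matrix to that of the total scatter matrix. It lies between 0 and 1, and smaller values indicate better-separated groups. The sketch below shows one way to compute it from a score matrix with NumPy. The function name `wilks_lambda` and the toy scores are hypothetical, and the slides do not state how the reported values were obtained.

```python
import numpy as np


def wilks_lambda(scores, labels):
    """Wilks' lambda = det(within-group scatter) / det(total scatter)."""
    scores = np.asarray(scores, dtype=float)
    if scores.ndim == 1:
        scores = scores[:, None]             # treat a vector as one score column
    labels = np.asarray(labels)
    centred = scores - scores.mean(axis=0)
    total = centred.T @ centred              # total scatter matrix T
    within = np.zeros_like(total)            # within-group scatter matrix W
    for group in np.unique(labels):
        members = scores[labels == group]
        deviations = members - members.mean(axis=0)
        within += deviations.T @ deviations
    return np.linalg.det(within) / np.linalg.det(total)


# Toy example: one score column, two well-separated groups.
toy_scores = [[0.0], [1.0], [10.0], [11.0]]
toy_labels = [0, 0, 1, 1]
toy_lambda = wilks_lambda(toy_scores, toy_labels)
print(toy_lambda)
```

For the toy data the within-group scatter is 1 and the total scatter is 101, so the ratio is 1/101.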

CCSWA + PLS-DA Preprocessing

CCSWA + PLS-DA Number of LVs and CCs
CCSWA: CCs=11 / LVs=2
PLSDA-CCSWA: CCs=5-12 / LVs=1-2

CCSWA + PLS-DA Prediction

Thank you for your attention!