Model-based Classification in Food Authenticity Studies D. Toher 1,2, G. Downey 1 and T.B. Murphy 2 Presented by: Deirdre Toher 1 Ashtown Food Research.


1 Model-based Classification in Food Authenticity Studies D. Toher 1,2, G. Downey 1 and T.B. Murphy 2 Presented by: Deirdre Toher 1 Ashtown Food Research Centre, Teagasc, (formerly The National Food Centre), Dublin 15 2 Dept of Statistics, School of Computer Science and Statistics, Trinity College Dublin, Dublin 2

2 Outline Food authenticity Spectroscopic data Current mathematical methods Proposed alternative –Dimension reduction –Model-based clustering –Updating Example near-infrared data with results

3 Food Authenticity – what and why? Detecting when foods are not what they are claimed to be Tampering/adulteration, mislabelling Economic fraud worth millions of US dollars globally Promote quality products Build consumer trust

4 Food Authenticity – how? Near infrared spectroscopy –Non-invasive –Relatively inexpensive Multivariate Mathematics –Partial Least Squares Regression –Factorial Discriminant Analysis –Model-based Clustering Other methods available (sp..)

5 Spectroscopic Data Near infrared transflectance spectroscopy –High dimensional data –Range 1100-2498 nm, reading every 2 nm –700 values for each sample
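As a quick check of the figures on this slide, the wavelength grid can be rebuilt from the stated range and step. This is only an arithmetic sketch; the variable names are illustrative and not taken from the presentation.

```python
# Wavelength grid as described on the slide: 1100 nm to 2498 nm, one reading every 2 nm.
start_nm, stop_nm, step_nm = 1100, 2498, 2

# range() excludes its end point, so extend by one step to include 2498 nm.
wavelengths = list(range(start_nm, stop_nm + step_nm, step_nm))

# Number of absorbance values recorded per sample.
n_values = len(wavelengths)
print(n_values)
```

The count agrees with the 700 values per sample quoted above.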

6 Current Mathematical Methods Discriminant Partial Least Squares Regression Factorial Discriminant Analysis Problem? –Limited to “two-group” classification problems –No quantification of certainty

7 Proposed Alternative Model-based clustering –Expansion of discriminant analysis –Allows clusters to vary in shape and size –Gives probability of a sample being in each cluster/group –Can classify situations with more than two groupings
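The idea of reporting a probability of membership for each group can be sketched with Bayes' rule applied to Gaussian group densities. The example below is a one-dimensional toy, assuming three hypothetical groups with made-up proportions, means and variances; it is not the authors' fitted model, and real spectra would need multivariate densities.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posterior(x, groups):
    """Return P(group | x) for each group via Bayes' rule.

    groups maps a group name to (mixing proportion, mean, variance).
    """
    joint = {name: w * gaussian_pdf(x, m, v) for name, (w, m, v) in groups.items()}
    total = sum(joint.values())
    return {name: j / total for name, j in joint.items()}

# Hypothetical parameters: each group has its own variance, so the
# clusters differ in size, and there are more than two groups.
groups = {
    "pure": (0.4, 0.0, 1.0),
    "adulterant_a": (0.3, 3.0, 0.5),
    "adulterant_b": (0.3, 6.0, 2.0),
}

probs = posterior(0.2, groups)
best_group = max(probs, key=probs.get)
```

The returned dictionary sums to one, so each sample receives a quantified degree of certainty rather than a bare label.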

8 Possible Cluster Shapes

9 The Dimensionality Problem Model-based clustering requires dimension reduction –for efficient computation –to prevent singular covariance matrices Use wavelet analysis with thresholding
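A minimal sketch of wavelet analysis with thresholding follows. The slides do not say which wavelet family or thresholding rule was used, so the Haar wavelet and hard thresholding here are assumptions chosen for brevity. The toy signal has length 8 because this simple transform needs a power-of-two length; a 700-point spectrum would first need padding or a different transform.

```python
import math

def haar_transform(signal):
    """Full orthonormal Haar decomposition (length must be a power of two)."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    coeffs = list(signal)
    length = n
    while length > 1:
        half = length // 2
        # Scaled pairwise sums (approximation) and differences (detail).
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:length] = approx + detail
        length = half
    return coeffs

def hard_threshold(coeffs, threshold):
    """Set coefficients whose magnitude does not exceed the threshold to zero."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]

# Toy "spectrum": two flat regions with a small wiggle at the end.
signal = [4.0, 4.0, 4.0, 4.0, 10.0, 10.0, 10.1, 9.9]
coeffs = haar_transform(signal)
kept = hard_threshold(coeffs, 0.5)   # threshold value is arbitrary here
n_nonzero = sum(1 for c in kept if c != 0.0)
```

Only the coefficients that survive thresholding would be passed on to the clustering model, which is how the dimension is reduced.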

10 EM Algorithm & Updating EM algorithm –E-step: computes the expected value of the (log-)likelihood function –M-step: maximises that expected value –commonly used in statistics for estimating missing values Updating –uses previous estimates of labels as a starting point for iteration
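One possible reading of EM with updating is sketched below: the labelled training samples keep their known labels, the unlabelled test samples are treated as having missing labels, and each iteration starts from the previous estimates of those labels. This is a one-dimensional toy with invented data and hypothetical function names, not the authors' implementation.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def m_step(xs, resp):
    """Weighted estimates of (proportion, mean, variance) for each group."""
    params = []
    for g in range(len(resp[0])):
        w = [r[g] for r in resp]
        total = sum(w)
        mean = sum(wi * x for wi, x in zip(w, xs)) / total
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs)) / total
        params.append((total / len(xs), mean, max(var, 1e-6)))  # floor avoids zero variance
    return params

def e_step(xs, params):
    """Expected group memberships (posterior probabilities) given the parameters."""
    resp = []
    for x in xs:
        joint = [p * normal_pdf(x, m, v) for p, m, v in params]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp

def em_with_updating(train_x, train_labels, test_x, n_groups=2, n_iter=50):
    # Known labels become fixed 0/1 memberships.
    train_resp = [[1.0 if lab == g else 0.0 for g in range(n_groups)]
                  for lab in train_labels]
    params = m_step(train_x, train_resp)      # initial fit from labelled data only
    test_resp = e_step(test_x, params)        # first estimate of the missing labels
    for _ in range(n_iter):
        # Update: refit using the previous label estimates for the test samples.
        params = m_step(train_x + test_x, train_resp + test_resp)
        test_resp = e_step(test_x, params)
    return params, test_resp

# Invented data: group 0 near 0, group 1 near 5.
train_x = [0.0, 1.0, -1.0, 5.0, 6.0, 4.0]
train_labels = [0, 0, 0, 1, 1, 1]
test_x = [0.5, -0.5, 0.2, 5.5, 4.5, 5.2]

params, test_resp = em_with_updating(train_x, train_labels, test_x)
predicted = [max(range(2), key=lambda g: r[g]) for r in test_resp]
```

Because the test samples contribute to the parameter estimates, a scheme of this kind can help most when the labelled training set is small.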

11 Example: Honey Adulteration Irish honey extended with –fructose:glucose mixtures –fully inverted beet syrup –high fructose corn syrup Total of 478 spectra: –157 pure and 321 adulterated 225 with fructose:glucose mixtures 56 with fully inverted beet syrup 40 with high fructose corn syrup

12 Classification Achieved Classification rates on test set data achieved with correct proportions of each type of adulterant in the training set for “pure or adulterated” question.

Training / Test    EM               EM & Updating
50% / 50%          94.72% (1.12)    94.43% (1.10)
25% / 75%          93.22% (1.08)    93.05% (1.03)
10% / 90%          90.82% (1.76)    92.22% (1.11)
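Keeping the "correct proportions" of each sample type in the training set corresponds to a stratified split. The sketch below applies one to the honey data set's stated counts of 157 pure, 225 fructose:glucose, 56 beet syrup and 40 corn syrup spectra. The function name, the seed and the rounding rule are assumptions; the slides do not describe how the splits were actually drawn.

```python
import random

def stratified_split(labels, train_fraction, seed=0):
    """Split sample indices so each group keeps roughly the same share in training."""
    rng = random.Random(seed)
    by_group = {}
    for idx, lab in enumerate(labels):
        by_group.setdefault(lab, []).append(idx)
    train, test = [], []
    for idxs in by_group.values():
        rng.shuffle(idxs)
        n_train = round(train_fraction * len(idxs))  # rounding rule is an assumption
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return sorted(train), sorted(test)

# Group sizes quoted in the honey adulteration example.
counts = {"pure": 157, "fructose_glucose": 225, "beet_syrup": 56, "corn_syrup": 40}
labels = [name for name, n in counts.items() for _ in range(n)]

train_idx, test_idx = stratified_split(labels, 0.25)
train_counts = {name: sum(1 for i in train_idx if labels[i] == name) for name in counts}
```

Repeating such a split with different seeds and averaging the test-set accuracy is one common way to obtain summary figures like those tabulated above.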

13 Classification Achieved Classification rates on test set data achieved with correct proportions of pure / adulterated in the training set for “pure or adulterated” question.

Training / Test    EM               EM & Updating
50% / 50%          94.38% (1.16)    94.11% (0.89)
25% / 75%          93.50% (1.08)    93.03% (1.02)
10% / 90%          90.54% (1.80)    92.05% (1.09)

14 Classification Achieved Classification rates on test set data achieved using 50% training, 50% test data with correct proportion of pure / adulterated in the training data set for “type of adulteration” question.

Question                EM               EM & Updating
Pure or adulterated?    91.09% (1.40)    90.64% (1.36)
Type of adulteration    86.23% (1.20)    84.12% (1.67)

15 Classification Achieved Classification rates on test set data achieved using 50% training, 50% test data with correct proportions of each type of adulterant in the training set for “type of adulteration” question.

Question                EM               EM & Updating
Pure or adulterated?    89.41% (1.76)    88.61% (1.82)
Type of adulteration    85.70% (1.96)    83.57% (2.23)

16 Probability v Accurate Classification Probability of group membership - by colour (black being pure, red being adulterated)

17 Conclusions EM algorithm gives a method of predicting group membership Updating procedures effective with small training sets Quantifying certainty Allows cost of misclassification to be easily incorporated into modelling
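Because the model returns group probabilities, a misclassification cost can be folded in by choosing the decision with the lowest expected cost instead of the most probable group. The sketch below uses invented probabilities and costs purely for illustration; the function name is hypothetical.

```python
def min_expected_cost_decision(posterior, cost):
    """Pick the decision with the smallest expected cost.

    posterior maps true group -> probability;
    cost[decision][true_group] is the cost of making that decision
    when the sample really belongs to true_group.
    """
    expected = {
        decision: sum(cost[decision][group] * p for group, p in posterior.items())
        for decision in cost
    }
    return min(expected, key=expected.get), expected

# Invented example: the sample is more likely pure than adulterated...
posterior = {"pure": 0.6, "adulterated": 0.4}

# ...but passing an adulterated honey as pure is assumed to cost five times
# as much as wrongly flagging a pure one.
cost = {
    "pure": {"pure": 0.0, "adulterated": 5.0},
    "adulterated": {"pure": 1.0, "adulterated": 0.0},
}

decision, expected = min_expected_cost_decision(posterior, cost)
print(decision, expected)
```

With these numbers the sample is flagged as adulterated even though "pure" is the more probable group, because the expected cost of accepting it is higher.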

18 Questions? Funded by: Teagasc under the Walsh Fellowship Scheme Irish Department of Agriculture & Food (FIRM programme) Science Foundation of Ireland Basic Research Grant scheme (Grant 04/BR/M0057)

