How to solve authentication problems

Presentation transcript:

How to solve authentication problems. Oxana Rodionova, Alexey Pomerantsev. Semenov Institute of Chemical Physics, RAS, Moscow; Russian Chemometric Society. WSC 10

What is authentication? It is the process of determining whether an object is, in fact, what it is declared to be!
Two ways to do it:
- Direct chemical analysis
- Quick, relatively cheap, often non-destructive measurements + chemometrics

Typical authentication problems
- Counterfeit drugs
- Illegal additives in fuels
- Food adulteration
- Confirmation of geographical origin

Discriminant analysis. Figure: Fisher's Iris data (1936), classes setosa, versicolor and virginica.

Discrimination vs. class modeling. Figure: the target class.

Main differences between class modeling problems and discriminant problems
The goal
  Class modeling: determining whether an object is, in fact, what it is declared to be
  Discriminant: determining which of the predefined classes an object belongs to
Data sets
  Class modeling: objects that represent a target class
  Discriminant: several sets of objects that represent predefined classes
Statistical/chemometric methods
  Class modeling: UNEQ, SIMCA, SVDD, etc.
  Discriminant: LDA, QDA, PLS-DA, SVM, etc.
Result of data modeling / decision rule development
  Class modeling: decision rule for a given α value
  Discriminant: boundaries/delineators between classes
Figures of merit
  Class modeling: sensitivity is given a priori; specificity can be found theoretically when an alternative class is given
  Discriminant: sensitivity and specificity are found empirically, post factum

Main steps of class modeling
1. Definition of the target class: objects that undoubtedly belong to the target class.
2. The data are divided into training and validation sets.
3. Data processing.
4. Establishing a decision rule: the acceptance area and/or the threshold values.
5. Validation. Even carefully trained decision rules have to be validated skeptically against new genuine objects.
6. Figures of merit: type I error, sensitivity, type II error, specificity.

Figures of merit
'Pure' one-class classifier: the type I error α is the rate of wrong rejections of target class samples. Sensitivity = (1 − α)·100%.
When an alien class (or classes) is available: the type II error β is the rate of wrong acceptances of aliens. Specificity = (1 − β)·100%.
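
The two definitions above can be turned into a few lines of Python. This is only a sketch: the function names and the accept/reject decisions are hypothetical and are not taken from the presentation.

```python
def sensitivity(target_accepted):
    """Sensitivity in percent, (1 - alpha) * 100, from decisions on genuine target objects."""
    alpha = target_accepted.count(False) / len(target_accepted)  # type I error rate
    return (1 - alpha) * 100


def specificity(alien_accepted):
    """Specificity in percent, (1 - beta) * 100, from decisions on alien objects."""
    beta = alien_accepted.count(True) / len(alien_accepted)  # type II error rate
    return (1 - beta) * 100


# Hypothetical decisions: True means "accepted as a member of the target class".
target_decisions = [True] * 19 + [False]  # 20 genuine objects, 1 wrongly rejected
alien_decisions = [False] * 29 + [True]   # 30 alien objects, 1 wrongly accepted

print(round(sensitivity(target_decisions), 1))  # 95.0
print(round(specificity(alien_decisions), 1))   # 96.7
```

Sensitivity needs only target-class objects, while specificity cannot be computed at all until some alien objects are available.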

PLS-DA
Training: the fingerprints (matrix X with elements x11 ... xnk, one row per object) and the class membership (one 0/1 column for each of CLASS 1, CLASS 2, CLASS 3) are used to build the PLS model.
Validation/Prediction: the fingerprints of new objects are passed through the PLS model, which returns predicted class membership values close to 1 or 0 (for example 1.01, 0.02, -0.05).
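
The bookkeeping around the PLS model can be sketched as follows. The PLS regression itself is omitted. The helper names, the example labels and the predicted values are hypothetical, and the "largest response wins" rule is an assumption: it is one common hard decision rule, not necessarily the one used in the presentation.

```python
def dummy_code(labels, classes):
    """Build the class-membership matrix: one column per class, 1 for the object's class, else 0."""
    return [[1 if label == cls else 0 for cls in classes] for label in labels]


def assign_class(predicted_row, classes):
    """Hard rule (assumed): assign the object to the class with the largest predicted response."""
    best = max(range(len(classes)), key=lambda j: predicted_row[j])
    return classes[best]


classes = ["CLASS 1", "CLASS 2", "CLASS 3"]

# Training side: class labels become the 0/1 response matrix Y.
Y_train = dummy_code(["CLASS 1", "CLASS 2", "CLASS 3", "CLASS 2"], classes)
print(Y_train)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]

# Prediction side: made-up responses for three new objects.
Y_pred = [[1.01, 0.02, -0.05], [0.04, 0.95, 0.06], [-0.02, 0.08, 1.05]]
print([assign_class(row, classes) for row in Y_pred])  # ['CLASS 1', 'CLASS 2', 'CLASS 3']
```

With this rule every object receives one of the predefined labels, whatever its predicted responses look like.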

DD-SIMCA. A PCA model of the target class. Figure: the acceptance area in the plane of score distance h_i and orthogonal distance v_i.
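
The slide shows the acceptance area only as a picture. The sketch below assumes the rule usually given for data-driven SIMCA: the full distance c = N_h·h/h0 + N_v·v/v0 is compared with the (1 − α) quantile of the chi-squared distribution with N_h + N_v degrees of freedom. The scaling factors h0 and v0 and the degrees of freedom N_h and N_v are estimated from the training set in practice. Here they are plain inputs, and every number is hypothetical.

```python
import math


def chi2_cdf(x, dof):
    """Chi-squared CDF via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))


def chi2_quantile(p, dof):
    """Invert the CDF by bisection (no external libraries needed)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def dd_simca_accepts(h, v, h0, v0, n_h, n_v, alpha):
    """Accept the object if its full distance lies inside the acceptance area (assumed rule)."""
    c = n_h * h / h0 + n_v * v / v0                 # full distance
    c_crit = chi2_quantile(1.0 - alpha, n_h + n_v)  # border of the acceptance area
    return c <= c_crit


print(round(chi2_quantile(0.95, 4), 3))                       # 9.488
print(dd_simca_accepts(0.8, 1.2, 1.0, 1.0, 2, 2, alpha=0.05))  # True
print(dd_simca_accepts(3.0, 4.0, 1.0, 1.0, 2, 2, alpha=0.05))  # False
```

Lowering α widens the acceptance area, which raises the expected sensitivity and lets more aliens in.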

Example: raw spectra. Measurements in the diffuse reflection mode through a PVC blister. Working range: 7482–4056 cm⁻¹ (889 wavenumbers).

Data description. Calcium channel blocker, uncoated tablets (API 10 mg).
Columns: name, marker, number of training objects, number of validation objects, tablet mass (mg)
A3  ▲  50  20  200
A4  ■  30  180
A7  ●  80
O.Ye. Rodionova, K.S. Balyklova, A.V. Titova, A.L. Pomerantsev, "Quantitative risk assessment in classification of drugs with identical API content", J. Pharm. Biomed. Anal. 2014, 98, 186-192.

Discriminant analysis. Figure: plots in the plane of PLS 1 and PLS 2, for target class A4 and for target class A7.
Sensitivity/specificity, PLS2-DA (3 PLS components): A3 100%, A4 97%, A7 96%.

Class modeling approach. Target class A4. Figure: training and validation/prediction.

DD-SIMCA. Sensitivity and specificity
α = 0.05, expected/observed (%), for A3, A4, A7: 95/96, 100/100, 95/95, 95/99
α = 0.01, expected/observed (%), for A3, A4, A7: 99/97, 100/100, 99/100, 99/99

Inactive substance. Batch 1 and Batch 2, FT-NIR DR spectra. Training set: 15 samples. Test set: 10 samples.

Difference in modeling: PLS-DA vs. PCA. Figure: prediction and W loading.

Conclusions
1. Class-modeling methods develop the acceptance area around the target class and thus delimit the target objects from any other objects and classes.
2. When using a one-class classifier, we should always account for the risk of misclassifying alien objects. It is important both to validate the model using an independent set of target objects and to verify the model against a wide variety of alien objects.
3. A well-constructed discrimination method will classify a new sample perfectly only if this sample is a member of one of the predefined classes. If the new sample does not belong to any of these classes, discriminant analysis is unable to properly define the membership of the sample. Discriminant methods are therefore inappropriate for solving authentication problems.
4. Every task at hand requires the chemometric method best suited to answering the question posed.
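
The third conclusion can be illustrated with a toy example. This is a sketch under loud assumptions: a nearest-centroid rule stands in for a discriminant method (it is not LDA or PLS-DA), a circle of fixed radius stands in for an acceptance area, and all names and coordinates are hypothetical.

```python
import math


def centroid(points):
    """Mean point of a list of 2-D points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def nearest_class(x, centroids):
    """Discriminant-style rule: always returns one of the predefined classes."""
    return min(centroids, key=lambda name: math.dist(x, centroids[name]))


def in_target_class(x, center, radius):
    """Class-modeling-style rule: accept only objects inside the acceptance area."""
    return math.dist(x, center) <= radius


training = {
    "A": [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2)],
    "B": [(3.0, 3.0), (3.2, 2.9), (2.8, 3.1), (3.1, 3.2)],
}
centroids = {name: centroid(pts) for name, pts in training.items()}

# Acceptance area for target class A: 1.5 times the largest training distance (arbitrary choice).
radius_a = 1.5 * max(math.dist(p, centroids["A"]) for p in training["A"])

alien = (10.0, -8.0)    # belongs to neither class
genuine = (0.1, 0.1)    # a new object of class A

print(nearest_class(alien, centroids))                    # A (a label is assigned anyway)
print(in_target_class(alien, centroids["A"], radius_a))   # False (the alien is rejected)
print(in_target_class(genuine, centroids["A"], radius_a)) # True
```

The discriminant-style rule has no way to answer "none of the above", while the class-modeling rule can reject an object it has never seen before.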

Thank you for your attention!