Published by Domenic Barker. Modified over 9 years ago.
1
Definition and overview of chemometrics
2
Paul Geladi; Head of Research, NIRCE; Chairperson, NIR Nord; Unit of Biomass Technology and Chemistry, Swedish University of Agricultural Sciences, Umeå; Technobothnia, Vasa; paul.geladi @ btk.slu.se; paul.geladi @ syh.fi
5
Project geography
6
Chemometrics: mathematics, statistics and computer science in chemistry
7
Similar fields: Biometrics (ca. 1900); Psychometrics (ca. 1930); Econometrics (ca. 1950); Technometrics (ca. 1960)
8
Chemometrics: Design of Experiments (DOE); Exploratory Data Analysis; Classification; Regression and Calibration
9
Design of Experiments: most important where possible. Uses: ANOVA, F-test, t-test, plots, response surfaces
10
Design of Experiments
$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_K x_K + b_{11} x_1^2 + b_{22} x_2^2 + \dots + b_{KK} x_K^2 + b_{12} x_1 x_2 + \dots$$
Factors $x_1, x_2, \dots, x_K$ are changed systematically. Response $y$ is measured and modeled.
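The quadratic model on this slide can be illustrated with a small sketch that expands the factor settings into the model terms (constant, linear, squared, two-factor interactions) and evaluates the response. The function names and the coefficient values are hypothetical and chosen only for illustration; the slide does not say how the coefficients are estimated, so no fitting is shown.

```python
from itertools import combinations


def quadratic_terms(x):
    """Expand factor settings x_1..x_K into the terms of the quadratic model.

    Order: constant, linear terms, squared terms, two-factor interactions.
    """
    linear = list(x)
    squares = [xi * xi for xi in x]
    interactions = [xi * xj for xi, xj in combinations(x, 2)]
    return [1.0] + linear + squares + interactions


def predict(b, x):
    """Evaluate y = sum of (coefficient * term) for one set of factor settings."""
    terms = quadratic_terms(x)
    if len(b) != len(terms):
        raise ValueError("need exactly one coefficient per model term")
    return sum(bi * ti for bi, ti in zip(b, terms))


# Hypothetical coefficients for K = 2, in the order b0, b1, b2, b11, b22, b12.
b = [1.0, 2.0, -1.0, 0.5, 0.25, 3.0]
y = predict(b, [2.0, 4.0])
print(y)
```

For K factors the expansion has 1 + 2K + K(K-1)/2 terms, which is one reason a designed experiment needs enough runs to estimate all of them.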
11
Exploratory Data Analysis: design not possible; sampling situations. Find structure; find groupings; find outliers
12
Classification: check for groupings = UNSUPERVISED; existing groupings = SUPERVISED. Visualize groupings; classify; test
13
Regression / Calibration: two types of variables, X / y; relationship linear / nonlinear; model; diagnostics; residual
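As a minimal sketch of the linear case, the code below fits a straight line y = b0 + b1 x by ordinary least squares and computes the residuals used for diagnostics. The single-x straight line, the function names and the data are assumptions made for illustration; the slide itself covers X with several variables and nonlinear relationships as well.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x for one x variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Measured response minus modeled response, one value per object."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up calibration data, roughly y = 1 + 2x with small deviations.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.1, 8.9]

b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)
print(b0, b1, res)
```

Large or patterned residuals would suggest that the linear model is inadequate or that an object is an outlier.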
14
[Figure: plot with axes x and y]
15
Multivariate Data Analysis
16
Sampled data and design with too many responses: mining, hospitals, agriculture, food industry, and more
17
Nomenclature: Samples are objects. What is measured on the object is a variable.
18
Vectors [Figure: a single number (34.92); a spectrum as a vector indexed 1 to K; samples as a vector indexed 1 to I]
19
A vector is a collection of numbers. It is always a column vector. Example (column entries, top to bottom): 12, 3.6, 11.1, 5.9, 34, 0.5, 1.4, 17
20
The transpose of a vector is a row vector. Symbols for transpose are ' and T: $a'$ or $a^T$. Example: $a^T = [\,12 \;\; 3.6 \;\; 11.1 \;\; 5.9 \;\; 34 \;\; 0.5 \;\; 1.4 \;\; 17\,]$
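The column and row forms can be made concrete with nested lists, where a column vector is stored as 8 rows of one number each and its transpose as one row of 8 numbers. This representation and the helper name `transpose` are choices made for this sketch, not something the slides prescribe.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


# The vector from the slide, stored as a column: 8 rows, 1 column.
a = [[12], [3.6], [11.1], [5.9], [34], [0.5], [1.4], [17]]

# Its transpose is a row vector: 1 row, 8 columns.
a_t = transpose(a)

print(len(a), len(a[0]))      # shape of the column vector
print(len(a_t), len(a_t[0]))  # shape of the row vector
```

Transposing twice returns the original column vector.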
21
Particle size, 1 sample
22
Small particles, 35 samples
23
The Data Matrix: a data matrix is a vector of vectors, of size I x K
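A small sketch of the data matrix, assuming the layout suggested by the nomenclature slides: I rows, one per object, and K columns, one per variable. The numbers and helper names are made up for illustration.

```python
# Made-up data matrix: I = 3 objects (rows), K = 4 variables (columns).
X = [
    [0.10, 0.20, 0.30, 0.40],
    [0.15, 0.25, 0.35, 0.45],
    [0.90, 0.80, 0.70, 0.60],
]

I = len(X)     # number of objects
K = len(X[0])  # number of variables


def object_vector(X, i):
    """All K measurements made on object i (one row)."""
    return list(X[i])


def variable_vector(X, k):
    """Variable k measured on all I objects (one column)."""
    return [row[k] for row in X]


print(I, K)
print(object_vector(X, 2))
print(variable_vector(X, 1))
```

Reading the matrix by rows gives one vector per object, and reading it by columns gives one vector per variable, which is the sense in which it is a vector of vectors.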
24
Size histograms, all samples [axis label: particle area]
25
NIR wavelengths; times in batch reaction
26
Geometry of multivariate space
27
Problem: I and K can be large; correlation; univariate statistics does not apply
28
I patients, 3 variables: blood oxygen, iron, hemoglobin
29
[Figures, slides 29-37: plots with axes O2, Fe and Hb]
38
Properties of multivariate space: Rotation: vector lengths unchanged / distances unchanged. Translation: vectors changed / distances unchanged. Rescaling / change of units: everything changes.
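These properties can be checked numerically on two points in the plane. The sketch below reads "vectors unchanged" under rotation as "vector lengths unchanged", which is an interpretation of the slide; the points, the rotation angle, the shift and the scale factor are arbitrary illustrative values.

```python
import math


def norm(p):
    """Length of the vector from the origin to point p."""
    return math.sqrt(sum(pi ** 2 for pi in p))


def distance(p, q):
    """Euclidean distance between points p and q."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def rotate(p, angle):
    """Rotate a 2D point about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * p[0] - s * p[1], s * p[0] + c * p[1]]


def translate(p, shift):
    """Move a point by adding a shift to every coordinate."""
    return [pi + si for pi, si in zip(p, shift)]


def rescale(p, factors):
    """Multiply each coordinate by its own factor (e.g. a change of units)."""
    return [pi * fi for pi, fi in zip(p, factors)]


p, q = [3.0, 4.0], [6.0, 8.0]

p_rot, q_rot = rotate(p, 0.7), rotate(q, 0.7)
p_shift, q_shift = translate(p, [10.0, -2.0]), translate(q, [10.0, -2.0])
p_scaled, q_scaled = rescale(p, [1000.0, 1.0]), rescale(q, [1000.0, 1.0])

print("original   ", norm(p), distance(p, q))
print("rotation   ", norm(p_rot), distance(p_rot, q_rot))
print("translation", norm(p_shift), distance(p_shift, q_shift))
print("rescaling  ", norm(p_scaled), distance(p_scaled, q_scaled))
```

Only the rescaling step changes the distance between the two points, which is why the choice of units and scaling matters for everything that follows.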
39
Consequences: We can move the coordinate system around. The relative distances between objects do not change. We can rotate the coordinate system. Scale changes are important. Move the coordinate system to the center of the data. Scale properly.
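One common way to carry out the last two points is to subtract each variable's mean (centering) and divide by its standard deviation (often called autoscaling). The slide does not say which scaling is "proper", so the choice of unit standard deviation here is an assumption, and the function name and data are made up for illustration.

```python
import math


def autoscale(X):
    """Mean-center each column and scale it to unit sample standard deviation."""
    n_rows = len(X)
    columns = list(zip(*X))
    means = [sum(col) / n_rows for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / (n_rows - 1))
        for col, m in zip(columns, means)
    ]
    if any(s == 0 for s in stds):
        raise ValueError("a constant variable cannot be scaled")
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in X
    ]


# Made-up data: two variables recorded in very different units.
X = [
    [1.0, 1000.0],
    [2.0, 3000.0],
    [3.0, 2000.0],
]

Z = autoscale(X)
print(Z)
```

After this step every variable has mean zero and the same spread, so no variable dominates distances merely because of its units.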
40
Vectors (physics): $x = [\,x_1, x_2, x_3\,]$, $\|x\| = (x_1^2 + x_2^2 + x_3^2)^{1/2}$
41
Geometry: right triangle with sides $a$, $b$ and hypotenuse $c$: $c^2 = a^2 + b^2$
42
Vectors (K dimensions): $x = [\,x_1, x_2, \dots, x_K\,]$, $\|x\| = (x_1^2 + x_2^2 + \dots + x_K^2)^{1/2}$
43
Problem: We cannot see in more than 3 dimensions. Paper, computer screen: 2-2.5 dimensions
44
[Figures, slides 44-45: plots with axes O2, Fe and Hb]
46
Projection: 2D plane (screen, paper). Many projections possible. Find a good one. Find a few good ones. What is good?
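The mechanics of projecting onto a 2D plane can be sketched as follows: pick two directions that span the plane, make them orthonormal, and take the dot product of every object with each direction. The plane used here is arbitrary and the three-variable data are made up; the sketch shows how a projection is computed, not how a good one is chosen.

```python
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


def orthonormal_pair(u, v):
    """Turn two spanning directions into an orthonormal basis (Gram-Schmidt)."""
    norm_u = math.sqrt(dot(u, u))
    if norm_u == 0:
        raise ValueError("first direction must be non-zero")
    e1 = [ui / norm_u for ui in u]
    overlap = dot(v, e1)
    w = [vi - overlap * ei for vi, ei in zip(v, e1)]
    norm_w = math.sqrt(dot(w, w))
    if norm_w == 0:
        raise ValueError("directions must not be parallel")
    e2 = [wi / norm_w for wi in w]
    return e1, e2


def project(points, e1, e2):
    """2D coordinates of each K-dimensional point in the plane (e1, e2)."""
    return [(dot(p, e1), dot(p, e2)) for p in points]


# Made-up objects with three variables each.
points = [
    [1.0, 2.0, 2.0],
    [0.0, 1.0, -1.0],
    [3.0, 0.0, 4.0],
]

# An arbitrary plane spanned by the first axis and the diagonal of the other two.
e1, e2 = orthonormal_pair([1.0, 0.0, 0.0], [0.0, 1.0, 1.0])
coords = project(points, e1, e2)
print(coords)
```

Changing the two spanning directions gives a different 2D view of the same objects, which is the sense in which many projections are possible.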