Published by Oswald McDaniel. Modified over 9 years ago.
1
BIOSYST-MeBioS www.biw.kuleuven.be
The Potential of Functional Data Analysis for Chemometrics
Dirk De Becker, Wouter Saeys, Bart De Ketelaere and Paul Darius
2
The Potential of FDA for Chemometrics
- Introduction to FDA
- Introduction to Chemometrics
- Using FDA in chemometrics
  - For prediction
  - For Analysis of Variance
- Conclusions
3
What is Functional Data Analysis?
- Developed by Ramsay & Silverman (1997)
- Analyse data by approximating it using some kind of functional basis
- Mainly for longitudinal data
  - High correlation between neighbouring data points
4
Why use FDA?
- Treat the data as a single entity rather than as individual observations
- Make a function of your data
  - Derivatives become available
- Reduce the amount of data
- Noise -> smoothing
- Impose some known properties on the data
  - Monotonicity, non-negativity, smoothness, ...
5
Basis Functions?
- Polynomials: 1, t, t², t³, ...
- Fourier: 1, sin(ωt), cos(ωt), sin(2ωt), cos(2ωt), ...
- Splines
- Wavelets
- The choice depends on your data
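To make the idea of a functional basis concrete, here is a minimal sketch that evaluates the polynomial and Fourier bases listed above on a grid and recovers the coefficients of a curve by least squares. The function names `polynomial_basis` and `fourier_basis`, the grid, and the test curve are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def polynomial_basis(t, degree):
    """Columns 1, t, t^2, ..., t^degree evaluated at the points t."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

def fourier_basis(t, n_harmonics, period):
    """Columns 1, sin(wt), cos(wt), sin(2wt), cos(2wt), ... with w = 2*pi/period."""
    t = np.asarray(t, dtype=float)
    omega = 2.0 * np.pi / period
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(k * omega * t))
        cols.append(np.cos(k * omega * t))
    return np.column_stack(cols)

t = np.linspace(0.0, 1.0, 101)          # hypothetical observation grid
P = polynomial_basis(t, 3)              # shape (101, 4)
F = fourier_basis(t, 2, period=1.0)     # shape (101, 5)

# A curve x(t) is represented by the coefficient vector c in x(t) ~ F(t) @ c.
x = np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)
c, *_ = np.linalg.lstsq(F, x, rcond=None)
```

Because the test curve lies exactly in the span of the Fourier basis, five coefficients replace 101 sampled values, which is the data reduction the slides refer to.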
6
Chemometrics
- Measure optical properties of a material
  - Transmission or reflection of light
  - At a large number of wavelengths
- Use these properties to predict something else
7
Why Chemometrics?
- Fast
- Cheap
- Non-destructive
- Environmentally friendly
8
Classical methods
- Ignore the correlation between neighbouring wavelengths
9
FDA in chemometrics
- NIR spectra
  - Absorption peaks, characterised by width and height
- Basis: B-splines, which resemble the shape of absorption peaks
- Preserve the vicinity constraint
10
Spline Functions
- Piecewise polynomials of order m, joined at knots
- Fast evaluation
- Continuity of derivatives
  - Up to order m-2
  - At the L interior knots
- Degrees of freedom: L + m
- Flexible
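The count of L + m degrees of freedom can be checked with a short calculation, and the B-spline basis that spans such splines can be written with the standard Cox-de Boor recursion. The notation below (knots τ_j, basis functions B_{j,k}, coefficients c_j) is a common textbook convention assumed here for illustration; the slides themselves do not show it.

```latex
% L interior knots split the interval into L + 1 polynomial pieces with m coefficients each.
% At every interior knot the spline and its derivatives up to order m-2 are continuous,
% which gives m - 1 constraints per knot:
\[
  (L + 1)\,m \;-\; L\,(m - 1) \;=\; L + m .
\]

% Cox-de Boor recursion on the knot sequence \tau_1 \le \dots \le \tau_{L+2m}
% (each boundary knot repeated m times; terms with a zero denominator are taken as zero):
\[
  B_{j,1}(t) =
  \begin{cases}
    1 & \tau_j \le t < \tau_{j+1} \\
    0 & \text{otherwise}
  \end{cases}
\]
\[
  B_{j,k}(t) = \frac{t - \tau_j}{\tau_{j+k-1} - \tau_j}\, B_{j,k-1}(t)
             + \frac{\tau_{j+k} - t}{\tau_{j+k} - \tau_{j+1}}\, B_{j+1,k-1}(t),
  \qquad k = 2, \dots, m
\]

% A spline of order m is a linear combination of the L + m basis functions:
\[
  s(t) = \sum_{j=1}^{L+m} c_j \, B_{j,m}(t)
\]
```

For cubic splines (m = 4) with, say, 20 interior knots this gives 24 coefficients per curve.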
12
Constructing a spline basis
- Order
  - Depends on what the model is used for
  - Mostly cubic splines (order 4)
- Number and position of knots
  - Use enough
  - Look at the data
  - Beware of overfitting
13
Position of knots
- More variation -> more knots
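One possible way to turn "more variation -> more knots" into a rule is to place interior knots at equal quantiles of the cumulative absolute curvature of a reference spectrum. This heuristic, the function name `knots_by_variation`, and the synthetic single-peak spectrum are assumptions made for illustration; the slides do not say how the knots were actually chosen.

```python
import numpy as np

def knots_by_variation(x, y, n_interior):
    """Place interior knots at equal quantiles of cumulative |second derivative|.

    Heuristic sketch only: regions where the curve bends more receive more knots.
    On noisy data the curvature should be estimated from a smoothed curve.
    """
    curvature = np.abs(np.gradient(np.gradient(y, x), x))
    # The tiny floor keeps the cumulative sum strictly increasing in flat regions.
    cum = np.cumsum(curvature + 1e-12)
    cum = (cum - cum[0]) / (cum[-1] - cum[0])
    targets = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]
    return np.interp(targets, cum, x)

# Synthetic, noise-free reference spectrum with one absorption peak (hypothetical values).
wl = np.linspace(1000.0, 2500.0, 301)
spectrum = 0.8 * np.exp(-0.5 * ((wl - 1450.0) / 60.0) ** 2)

interior_knots = knots_by_variation(wl, spectrum, 10)
```

With this input the knots cluster around the peak near 1450 and leave the flat part of the spectrum almost empty.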
14
B-spline approximation
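A B-spline approximation of a single spectrum can be sketched as an ordinary least-squares fit onto a cubic B-spline basis. The helper `bspline_basis` is a hypothetical NumPy implementation of the Cox-de Boor recursion, and the wavelength range, peak positions and noise level are synthetic assumptions rather than data from the presentation.

```python
import numpy as np

def bspline_basis(x, t, m):
    """Order-m B-spline basis on the full knot vector t, evaluated at x (Cox-de Boor)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    # Order 1: indicator functions of the half-open knot intervals.
    B = ((t[:-1] <= x[:, None]) & (x[:, None] < t[1:])).astype(float)
    # Close the last non-empty interval so the right boundary is covered.
    B[x == t[-1], np.nonzero(t[:-1] < t[1:])[0][-1]] = 1.0
    for k in range(2, m + 1):
        n = len(t) - k
        left_den = t[k - 1:k - 1 + n] - t[:n]
        right_den = t[k:k + n] - t[1:n + 1]
        with np.errstate(divide="ignore", invalid="ignore"):
            left = np.where(left_den > 0, (x[:, None] - t[:n]) / left_den, 0.0)
            right = np.where(right_den > 0, (t[k:k + n] - x[:, None]) / right_den, 0.0)
        B = left * B[:, :n] + right * B[:, 1:n + 1]
    return B

rng = np.random.default_rng(0)
wl = np.linspace(1000.0, 2500.0, 301)                 # hypothetical wavelength grid
spectrum = (0.8 * np.exp(-0.5 * ((wl - 1450.0) / 60.0) ** 2)
            + 0.5 * np.exp(-0.5 * ((wl - 1940.0) / 80.0) ** 2)
            + 0.01 * rng.standard_normal(wl.size))    # two peaks plus noise

m = 4                                                 # cubic splines (order 4)
interior = np.linspace(wl[0], wl[-1], 22)[1:-1]       # L = 20 equally spaced interior knots
knots = np.concatenate([np.full(m, wl[0]), interior, np.full(m, wl[-1])])

B = bspline_basis(wl, knots, m)                       # shape (301, L + m) = (301, 24)
coef, *_ = np.linalg.lstsq(B, spectrum, rcond=None)   # spline coefficients
smooth = B @ coef                                     # fitted (smoothed) spectrum
rmse = np.sqrt(np.mean((smooth - spectrum) ** 2))
```

The 301 sampled values are summarised by 24 coefficients, and the fit doubles as a smoother because the basis cannot follow point-to-point noise.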
15
FDA for prediction
- Functional regression models
- P-Spline Regression (Marx and Eilers)
- Non-Parametric Functional Data Analysis (Ferraty and Vieu)
16
Functional Regression Models
- Project the spectra onto a spline basis
- Apply multivariate linear regression to the spline coefficients
- Great reduction in system complexity
- The natural shape of absorption peaks is used
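The two steps above (projection, then linear regression on the coefficients) can be sketched as follows. Everything here is an assumption for illustration: the helper `bspline_basis`, the synthetic spectra whose peak height scales with the property y, the knot layout and the train/test split. It is not the authors' implementation and does not use their manure data.

```python
import numpy as np

def bspline_basis(x, t, m):
    """Order-m B-spline basis on the full knot vector t, evaluated at x (Cox-de Boor)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    B = ((t[:-1] <= x[:, None]) & (x[:, None] < t[1:])).astype(float)
    B[x == t[-1], np.nonzero(t[:-1] < t[1:])[0][-1]] = 1.0   # cover the right boundary
    for k in range(2, m + 1):
        n = len(t) - k
        left_den = t[k - 1:k - 1 + n] - t[:n]
        right_den = t[k:k + n] - t[1:n + 1]
        with np.errstate(divide="ignore", invalid="ignore"):
            left = np.where(left_den > 0, (x[:, None] - t[:n]) / left_den, 0.0)
            right = np.where(right_den > 0, (t[k:k + n] - x[:, None]) / right_den, 0.0)
        B = left * B[:, :n] + right * B[:, 1:n + 1]
    return B

# Synthetic spectra: peak height proportional to y, plus a random baseline shift and noise.
rng = np.random.default_rng(1)
wl = np.linspace(0.0, 1.0, 200)                       # rescaled wavelength axis
n = 120
y = rng.uniform(1.0, 5.0, n)
peak = np.exp(-0.5 * ((wl - 0.4) / 0.05) ** 2)
offset = rng.normal(0.0, 0.2, (n, 1))
X = y[:, None] * peak + offset + 0.02 * rng.standard_normal((n, wl.size))

m = 4
knots = np.concatenate([np.zeros(m), np.linspace(0.0, 1.0, 14)[1:-1], np.ones(m)])
B = bspline_basis(wl, knots, m)                       # (200, 16): L = 12, m = 4

# Step 1: project every spectrum onto the spline basis (200 values -> 16 coefficients).
C = np.linalg.lstsq(B, X.T, rcond=None)[0].T          # shape (n, 16)

# Step 2: ordinary linear regression of y on the spline coefficients.
train, test = np.arange(80), np.arange(80, n)
A_train = np.column_stack([np.ones(train.size), C[train]])
w = np.linalg.lstsq(A_train, y[train], rcond=None)[0]
A_test = np.column_stack([np.ones(test.size), C[test]])
rmsep = np.sqrt(np.mean((A_test @ w - y[test]) ** 2))  # root mean squared error of prediction
```

The regression sees 16 predictors instead of 200 wavelengths, which is the reduction in system complexity mentioned on the slide.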
17
Functional Regression Models: case study
- 420 samples of hog manure
- Reflectance spectra
- Total nitrogen (TN) and dry matter (DM) content
- PLS and Functional Regression applied
18
Functional Regression: case study (cont'd)
19
Functional Regression: case study: results
20
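P-Spline Regression (PSR)
- By Marx and Eilers
- Construct the regression coefficient vector with B-splines
- Use a roughness penalty, with parameter λ, on the B-spline coefficients
- Minimize the penalized least-squares criterion
- Full spectra are used for regression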
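The slide's formulas did not survive extraction, so the following is a sketch of the usual penalized signal regression formulation, offered as an assumption: the coefficient curve is written as β = Bα, and α minimizes ||y - XBα||² + λ||Dα||², where D is a difference matrix. The names `psr_fit` and `bspline_basis`, the second-order difference penalty, and the white-noise "spectra" (unrealistic, but they keep the example easy to check) are all illustrative choices.

```python
import numpy as np

def bspline_basis(x, t, m):
    """Order-m B-spline basis on the full knot vector t, evaluated at x (Cox-de Boor)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    B = ((t[:-1] <= x[:, None]) & (x[:, None] < t[1:])).astype(float)
    B[x == t[-1], np.nonzero(t[:-1] < t[1:])[0][-1]] = 1.0   # cover the right boundary
    for k in range(2, m + 1):
        n = len(t) - k
        left_den = t[k - 1:k - 1 + n] - t[:n]
        right_den = t[k:k + n] - t[1:n + 1]
        with np.errstate(divide="ignore", invalid="ignore"):
            left = np.where(left_den > 0, (x[:, None] - t[:n]) / left_den, 0.0)
            right = np.where(right_den > 0, (t[k:k + n] - x[:, None]) / right_den, 0.0)
        B = left * B[:, :n] + right * B[:, 1:n + 1]
    return B

def psr_fit(X, y, B, lam, diff_order=2):
    """Penalized signal regression: beta = B @ alpha with a difference penalty on alpha."""
    U = X @ B                                   # full spectra enter through X @ B
    u_mean, y_mean = U.mean(axis=0), y.mean()
    Uc, yc = U - u_mean, y - y_mean             # centring handles the intercept
    D = np.diff(np.eye(B.shape[1]), n=diff_order, axis=0)
    alpha = np.linalg.solve(Uc.T @ Uc + lam * D.T @ D, Uc.T @ yc)
    return y_mean - u_mean @ alpha, B @ alpha   # intercept, coefficient curve beta

rng = np.random.default_rng(2)
p, n = 150, 100
grid = np.linspace(0.0, 1.0, p)
beta_true = np.exp(-0.5 * ((grid - 0.5) / 0.15) ** 2)    # smooth "true" coefficient curve
X = rng.standard_normal((n, p))                          # synthetic predictors
y = 3.0 + X @ beta_true + 0.1 * rng.standard_normal(n)

m = 4
knots = np.concatenate([np.zeros(m), np.linspace(0.0, 1.0, 12)[1:-1], np.ones(m)])
B = bspline_basis(grid, knots, m)                        # (150, 14)

train, test = np.arange(70), np.arange(70, n)
b0, beta_hat = psr_fit(X[train], y[train], B, lam=0.001)
rmsep = np.sqrt(np.mean((b0 + X[test] @ beta_hat - y[test]) ** 2))
```

Unlike the functional regression sketch, the spectra themselves are not projected; only the coefficient curve is forced to be smooth, so the full spectra are used in the fit.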
21
P-Spline Regression: case study
- 121 samples of seed pills
- y is % humidity
- PLS: RMSEP = 1.19
- PSR: RMSEP = 1.115
- Number of B-spline coefficients = 7
- λ = 0.001
22
Non-Parametric Functional Data Analysis
- By F. Ferraty and P. Vieu
- No regression model is involved
- Prediction by applying local kernel functions in function space
- No good results so far
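Prediction with local kernels in function space can be illustrated by a Nadaraya-Watson type estimator in which the distance between two curves replaces the distance between two points. The sketch below assumes a root-mean-square L2 distance on a common grid, a Gaussian kernel and a hand-picked bandwidth h; the function name `functional_kernel_predict` and the synthetic curves are hypothetical.

```python
import numpy as np

def functional_kernel_predict(X_train, y_train, X_new, h):
    """Kernel-weighted average of y_train, weighted by closeness between curves."""
    # Distance between discretized curves (assumes a common, equally spaced grid).
    diff = X_new[:, None, :] - X_train[None, :, :]
    d = np.sqrt(np.mean(diff ** 2, axis=2))        # shape (n_new, n_train)
    K = np.exp(-0.5 * (d / h) ** 2)                # Gaussian kernel (an assumption)
    return (K @ y_train) / K.sum(axis=1)

# Synthetic example: curves a * sin(2*pi*t) with response y = a^2 (nonlinear in the curve).
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 100)
a = rng.uniform(0.0, 2.0, 150)
X_train = a[:, None] * np.sin(2 * np.pi * t)
y_train = a ** 2

X_new = 1.0 * np.sin(2 * np.pi * t)[None, :]       # new curve with a = 1, true response 1
pred = functional_kernel_predict(X_train, y_train, X_new, h=0.05)
```

No regression coefficients are estimated; the prediction depends only on which training curves lie close to the new one, so the choice of distance and bandwidth carries all the modelling.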
23
FDA in ANOVA setting: FANOVA
- ANOVA: “Study the relation between a response variable and one or more explanatory variables”
- Model: $y_{ig} = \mu + \alpha_g + \varepsilon_{ig}$
  - $\mu$ is the overall mean
  - $\alpha_g$ are the effects of belonging to a group g
  - $\varepsilon_{ig}$ are the residuals
24
FANOVA: theory
- Constraint:
- Introduce so that
- Introduce functional aspect:
- Constraint: introduce
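The equations on this slide were lost in extraction. As a hedged guide to what the bullets most likely refer to, the usual textbook formulation of one-way functional ANOVA is sketched below; the symbols (Z for the design matrix, β for the stacked parameters, G for the number of groups) are assumptions and may differ from the original slide.

```latex
% Scalar one-way ANOVA with the usual identifiability constraint
\[
  y_{ig} = \mu + \alpha_g + \varepsilon_{ig},
  \qquad \sum_{g=1}^{G} \alpha_g = 0
\]

% Matrix form: a design matrix Z with 0/1 entries codes group membership
\[
  \mathbf{y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
  \qquad \boldsymbol{\beta} = (\mu, \alpha_1, \dots, \alpha_G)^{\top}
\]

% Functional aspect: response, mean, effects and residuals become functions of t
\[
  y_{ig}(t) = \mu(t) + \alpha_g(t) + \varepsilon_{ig}(t),
  \qquad \sum_{g=1}^{G} \alpha_g(t) = 0 \quad \text{for all } t
\]
\[
  \mathbf{y}(t) = \mathbf{Z}\boldsymbol{\beta}(t) + \boldsymbol{\varepsilon}(t),
  \qquad \boldsymbol{\beta}(t) = (\mu(t), \alpha_1(t), \dots, \alpha_G(t))^{\top}
\]
```

In this form the design matrix is the same as in the scalar case; only the parameters and the response depend on t (for spectra, on wavelength).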
25
FANOVA: goal and solution
- Goal: estimate from
- Solution:
26
FANOVA: significance testing
- Locally:
- Globally:
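Since the test statistics on this slide are missing, the sketch below shows one common way to do local and global testing, stated as an assumption: a pointwise F-ratio F(t) computed at every grid point for the local test, and a permutation test on max F(t) for the global test. The function name `pointwise_f`, the synthetic curves and the number of permutations are illustrative.

```python
import numpy as np

def pointwise_f(Y, groups):
    """One-way ANOVA F-ratio computed separately at every grid point.

    Y has shape (n_curves, n_points); groups holds one label per curve.
    """
    labels = np.unique(groups)
    n_total, n_groups = Y.shape[0], labels.size
    grand_mean = Y.mean(axis=0)
    ss_between = np.zeros(Y.shape[1])
    ss_within = np.zeros(Y.shape[1])
    for g in labels:
        Yg = Y[groups == g]
        group_mean = Yg.mean(axis=0)
        ss_between += Yg.shape[0] * (group_mean - grand_mean) ** 2
        ss_within += ((Yg - group_mean) ** 2).sum(axis=0)
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))

# Synthetic curves: 4 groups of 10, with a group effect only around t = 0.5.
rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 100)
groups = np.repeat(np.arange(4), 10)
amplitude = np.array([-1.0, -0.3, 0.3, 1.0])          # effects sum to zero
bump = np.exp(-0.5 * ((t - 0.5) / 0.05) ** 2)
Y = amplitude[groups][:, None] * bump + 0.3 * rng.standard_normal((40, t.size))

# Local test: F(t) shows where along the curve the groups differ.
F = pointwise_f(Y, groups)

# Global test: compare max F(t) with its distribution under shuffled group labels.
f_max = F.max()
perm_max = np.array([pointwise_f(Y, rng.permutation(groups)).max() for _ in range(200)])
p_global = (1 + np.sum(perm_max >= f_max)) / (1 + perm_max.size)
```

The permutation step avoids having to correct the pointwise F-tests for the many correlated grid points, at the cost of refitting the model for every shuffle.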
27
FANOVA: case study
- Spectra of manure
- 4 types of animals: dairy, beef, calf, hog
- 3 ambient temperatures: 4°C, 12°C, 20°C
- 3 sample temperatures: 4°C, 12°C, 20°C
- 9 replicates => 4 x 3 x 3 x 9 = 324 samples
- Model:
28
FANOVA: case study (cont'd)
29
FANOVA: case study (cont'd)
30
Conclusions
- Splines are a good basis for fitting spectral data
- Using FDA, it is possible to include the vicinity constraint in prediction models in chemometrics
- FANOVA is a good tool to explore the variance in spectral data