BIOSYST-MeBioS, www.biw.kuleuven.be. The Potential of Functional Data Analysis for Chemometrics. Dirk De Becker, Wouter Saeys, Bart De Ketelaere and Paul Darius.

1 BIOSYST-MeBioS, www.biw.kuleuven.be. The Potential of Functional Data Analysis for Chemometrics. Dirk De Becker, Wouter Saeys, Bart De Ketelaere and Paul Darius.

2 The Potential of FDA for Chemometrics: introduction to FDA; introduction to chemometrics; using FDA in chemometrics (for prediction; for analysis of variance); conclusions.

3 What is Functional Data Analysis? Developed by Ramsay & Silverman (1997). Analyse data by approximating it using some kind of functional basis. Mainly for longitudinal data, with high correlation between neighbouring data points.

4 Why use FDA? Treat the data as a single entity rather than as individual observations. Make a function of your data, which gives access to derivatives. Reduce the amount of data. Noise -> smoothing. Impose some known properties on the data: monotonicity, non-negativeness, smoothness, ...

5 Basis functions? Polynomials: 1, t, t², t³, ... Fourier: 1, sin(ωt), cos(ωt), sin(2ωt), cos(2ωt), ... Splines. Wavelets. The choice depends on your data.
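As a rough illustration of the basis idea on this slide, the following sketch builds the polynomial and Fourier bases listed above and fits one of them by least squares. The function names, the synthetic signal and the choice of ω = 2π are assumptions made for the example; they do not come from the presentation.

```python
import numpy as np

def polynomial_basis(t, n_terms):
    """Columns 1, t, t^2, ..., t^(n_terms - 1)."""
    return np.vander(t, n_terms, increasing=True)

def fourier_basis(t, n_harmonics, omega):
    """Columns 1, sin(wt), cos(wt), ..., sin(K wt), cos(K wt)."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(k * omega * t))
        cols.append(np.cos(k * omega * t))
    return np.column_stack(cols)

# Synthetic periodic signal (illustrative data, not from the slides)
t = np.linspace(0.0, 1.0, 200, endpoint=False)
omega = 2.0 * np.pi
x = 1.5 + 2.0 * np.sin(omega * t) - 0.5 * np.cos(2 * omega * t)

# Represent the 200 samples by 5 basis coefficients
Phi = fourier_basis(t, n_harmonics=2, omega=omega)
coef, *_ = np.linalg.lstsq(Phi, x, rcond=None)
x_hat = Phi @ coef  # smooth reconstruction from the coefficients
```

Here the 200 sampled values are summarised by five coefficients, which is the data reduction the previous slide refers to.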

6 Chemometrics. Measure optical properties of a material: transmission or reflection of light at a large number of wavelengths. Use these properties to predict something else.

7 Why chemometrics? Fast, cheap, non-destructive, environment-friendly.

8 Classical methods: ignore correlation between neighbouring wavelengths.

9 FDA in chemometrics. NIR spectra consist of absorption peaks, characterised by width and height. Basis: B-splines, which resemble the shape of absorption peaks and preserve the vicinity constraint.

10 Spline functions. Piecewise joining polynomials of order m: fast evaluation; continuity of derivatives up to order m-2 at the L interior knots; degrees of freedom: L + m; flexible.
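The count of L + m degrees of freedom can be checked numerically. The sketch below evaluates a B-spline basis with the Cox-de Boor recursion on a clamped knot vector; the function name `bspline_basis`, the wavelength range and the number of knots are assumptions chosen for illustration, not values from the presentation.

```python
import numpy as np

def bspline_basis(t, interior_knots, order, lo, hi):
    """Evaluate all B-spline basis functions of the given order at points t.

    Returns an array of shape (len(t), L + order), L = number of interior knots.
    """
    t = np.asarray(t, dtype=float)
    # Clamped knot vector: each boundary knot is repeated `order` times
    knots = np.concatenate([[lo] * order, interior_knots, [hi] * order])
    # Order-1 (piecewise constant) functions, one per knot interval
    n0 = len(knots) - 1
    B = np.zeros((len(t), n0))
    for j in range(n0):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    # Assign t == hi to the last non-empty interval
    B[t == hi, order + len(interior_knots) - 1] = 1.0
    # Cox-de Boor recursion, raising the order one step at a time
    for k in range(2, order + 1):
        Bk = np.zeros((len(t), len(knots) - k))
        for j in range(len(knots) - k):
            d1 = knots[j + k - 1] - knots[j]
            d2 = knots[j + k] - knots[j + 1]
            if d1 > 0:
                Bk[:, j] += (t - knots[j]) / d1 * B[:, j]
            if d2 > 0:
                Bk[:, j] += (knots[j + k] - t) / d2 * B[:, j + 1]
        B = Bk
    return B

# Hypothetical NIR-like wavelength axis with L = 10 interior knots
wavelengths = np.linspace(1000.0, 2500.0, 301)
interior = np.linspace(1000.0, 2500.0, 12)[1:-1]
B = bspline_basis(wavelengths, interior, order=4, lo=1000.0, hi=2500.0)
print(B.shape)  # (301, 14): L + m = 10 + 4 basis functions
```

Each row of `B` sums to one and has at most m non-zero entries, which is what makes evaluation fast and keeps each coefficient tied to a local wavelength region.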

12 Constructing a spline basis. Order: depends on what the model is used for; mostly cubic splines (order 4). Number and position of knots: use enough; look at the data. Beware of overfitting!

13 Position of knots: more variation -> more knots.

14 B-spline approximation

15 FDA for prediction. Functional regression models; P-spline regression (Marx and Eilers); non-parametric functional data analysis (Ferraty and Vieu).

16 Functional regression models. Project the spectra onto a spline basis, then apply multivariate linear regression to the spline coefficients. Great reduction in system complexity. The natural shape of absorption peaks is used.
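The two steps on this slide (projection onto a spline basis, then linear regression on the coefficients) can be sketched as follows. This is a minimal illustration on synthetic spectra, assuming a cubic B-spline basis with 10 interior knots and an ordinary least-squares fit; the helper `bspline_basis`, the simulated peak and all numeric settings are hypothetical and are not the authors' implementation.

```python
import numpy as np

def bspline_basis(t, knots, order):
    """Cox-de Boor recursion on a full (clamped) knot vector."""
    t = np.asarray(t, dtype=float)
    B = np.array([(knots[j] <= t) & (t < knots[j + 1])
                  for j in range(len(knots) - 1)], dtype=float).T
    # Close the last non-empty interval on the right
    B[t == knots[-1], len(knots) - order - 1] = 1.0
    for k in range(2, order + 1):
        nxt = np.zeros((len(t), len(knots) - k))
        for j in range(len(knots) - k):
            d1 = knots[j + k - 1] - knots[j]
            d2 = knots[j + k] - knots[j + 1]
            if d1 > 0:
                nxt[:, j] += (t - knots[j]) / d1 * B[:, j]
            if d2 > 0:
                nxt[:, j] += (knots[j + k] - t) / d2 * B[:, j + 1]
        B = nxt
    return B

# Synthetic spectra: one absorption peak whose height follows y (assumed data)
rng = np.random.default_rng(0)
wl = np.linspace(0.0, 1.0, 200)
n = 80
y = rng.uniform(1.0, 5.0, n)
peak = np.exp(-0.5 * ((wl - 0.4) / 0.05) ** 2)
X = y[:, None] * peak[None, :] + 0.02 * rng.standard_normal((n, wl.size))

order = 4
interior = np.linspace(0.0, 1.0, 12)[1:-1]
knots = np.concatenate([[0.0] * order, interior, [1.0] * order])
B = bspline_basis(wl, knots, order)                # (200, 14)

# Step 1: project every spectrum onto the spline basis
C = np.linalg.lstsq(B, X.T, rcond=None)[0].T       # (80, 14) coefficients

# Step 2: linear regression of y on the coefficients (with intercept)
n_train = 60
A_train = np.column_stack([np.ones(n_train), C[:n_train]])
beta = np.linalg.lstsq(A_train, y[:n_train], rcond=None)[0]
A_test = np.column_stack([np.ones(n - n_train), C[n_train:]])
y_pred = A_test @ beta
rmsep = np.sqrt(np.mean((y[n_train:] - y_pred) ** 2))
```

The regression works on 14 coefficients per spectrum instead of 200 wavelengths, which is the reduction in system complexity the slide mentions.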

17 Functional regression models: case study. 420 samples of hog manure; reflectance spectra; total nitrogen (TN) and dry matter (DM) content; PLS and functional regression applied.

18 Functional regression: case study (cont'd)

19 Functional regression: case study: results

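20 P-spline regression (PSR), by Marx and Eilers. Construct [symbol not preserved] with B-splines: [equation not preserved]. Use roughness parameter [symbol not preserved] on [symbol not preserved]. Minimize [equation not preserved]. Full spectra are used for regression.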
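The formulas on this slide did not survive in the transcript. For orientation only, P-spline signal regression as commonly described in the literature attributed to Marx and Eilers can be written as below. All symbols are assumptions introduced here and may differ from the notation on the original slide.

```latex
% Assumed notation: y (n x 1) responses, X (n x p) spectra,
% B (p x k) B-spline basis over the wavelength axis, a (k x 1) coefficients,
% D_d the d-th order difference matrix, \lambda the roughness parameter.
\begin{aligned}
  y &= \alpha_0 \mathbf{1} + X\beta + \varepsilon, \qquad \beta = B a, \\
  S(\alpha_0, a) &= \lVert y - \alpha_0 \mathbf{1} - X B a \rVert^2
                   + \lambda \lVert D_d\, a \rVert^2 .
\end{aligned}
```

In this form the regression coefficient vector over all p wavelengths is forced to be a smooth curve built from k B-splines, and the penalty on differences of adjacent coefficients controls its roughness.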

21 P-spline regression: case study. 121 samples of seed pills; y is % humidity. PLS: RMSEP = 1.19. PSR: RMSEP = 1.115. Number of B-spline coefficients = 7; λ = 0.001.
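RMSEP here is the root mean squared error of prediction. A minimal sketch of the usual definition follows; the function name and the example numbers are illustrative and unrelated to the case study data.

```python
import math

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction over a validation set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up reference values and predictions
value = rmsep([10.0, 12.0, 14.0], [11.0, 12.0, 13.0])  # sqrt(2/3)
```

RMSEP is expressed in the units of y, so the two values on the slide are in percentage points of humidity.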

22 Non-parametric functional data analysis, by F. Ferraty and P. Vieu. No regression model is involved: prediction is done by applying local kernel functions in function space. No good results so far.
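The idea of local kernel prediction in function space can be sketched as a kernel-weighted average of training responses, with weights that decay with the distance between whole curves. The example below assumes an L2 distance on an equally spaced grid and a Gaussian kernel with a hand-picked bandwidth; these choices, the function names and the synthetic curves are assumptions for illustration and not necessarily what Ferraty and Vieu or the authors used.

```python
import numpy as np

def l2_distance(curves_a, curves_b, step):
    """Pairwise L2 distances between curves on a common, equally spaced grid."""
    diff = curves_a[:, None, :] - curves_b[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=2) * step)

def functional_kernel_predict(X_train, y_train, X_new, step, bandwidth):
    """Kernel-weighted average of y_train; weights depend on curve distance."""
    d = l2_distance(X_new, X_train, step)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)   # Gaussian kernel (assumed choice)
    # Note: if every weight underflows to zero the ratio is undefined
    return (w @ y_train) / w.sum(axis=1)

# Synthetic curves a * sin(2*pi*t) with response a^2 (illustrative data)
t = np.linspace(0.0, 1.0, 100, endpoint=False)
step = t[1] - t[0]
shape = np.sin(2.0 * np.pi * t)
a_train = np.linspace(0.0, 2.0, 101)
X_train = a_train[:, None] * shape[None, :]
y_train = a_train ** 2

X_new = 1.0 * shape[None, :]
pred = functional_kernel_predict(X_train, y_train, X_new, step, bandwidth=0.1)
```

No parametric model links the curve to the response; the prediction depends only on which training curves are close to the new one, so the choice of distance and bandwidth matters a great deal.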

23 FDA in ANOVA setting: FANOVA. ANOVA: "Study the relation between a response variable and one or more explanatory variables". Model: y_ig = μ + α_g + ε_ig, where μ is the overall mean, α_g are the effects of belonging to a group g, and ε_ig are residuals.

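24 FANOVA: theory. Constraint: [equation not preserved]. Introduce [symbol not preserved] so that [equation not preserved]. Introduce the functional aspect: [equation not preserved]. Constraint: introduce [symbol not preserved].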

25 FANOVA: goal and solution. Goal: estimate [symbol not preserved] from [symbol not preserved]. Solution: [equation not preserved].
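The equations of the two FANOVA slides above are missing from the transcript. As a guide, the one-way functional ANOVA model in the form usually given in the FDA literature (for example by Ramsay and Silverman) is sketched below. The symbols are assumptions and may not match the original slides.

```latex
% Assumed notation: y_{ig}(t) is the spectrum of replicate i in group g
% at wavelength t; G is the number of groups; Z is the 0/1 design matrix.
\begin{aligned}
  y_{ig}(t) &= \mu(t) + \alpha_g(t) + \varepsilon_{ig}(t),
  \qquad \sum_{g=1}^{G} \alpha_g(t) = 0 \ \text{for all } t, \\
  y(t) &= Z\,\beta(t) + \varepsilon(t),
  \qquad \beta(t) = \bigl(\mu(t), \alpha_1(t), \dots, \alpha_G(t)\bigr)^{\top}, \\
  \hat\beta &= \arg\min_{\beta} \int \lVert y(t) - Z\,\beta(t) \rVert^2 \, dt
  \quad \text{subject to the constraint above.}
\end{aligned}
```

The sum-to-zero constraint makes the mean function and the group effect functions identifiable, exactly as in scalar ANOVA, but now at every wavelength t.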

26 FANOVA: significance testing. Locally: [equation not preserved]. Globally: [equation not preserved].
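The test statistics on this slide were lost. One common local choice is a pointwise F-ratio computed separately at every wavelength, and the sketch below shows that choice for a one-way design on synthetic curves. It is an assumption about what "locally" may refer to, it does not cover the global test, and the function name and data are hypothetical.

```python
import numpy as np

def fanova_one_way(Y, groups):
    """Pointwise one-way functional ANOVA.

    Y: (N, T) array of curves sampled on a common grid.
    groups: length-N array of group labels.
    Returns mu(t), alpha_g(t) with sum_g alpha_g(t) = 0, and F(t).
    """
    labels, idx = np.unique(groups, return_inverse=True)
    counts = np.bincount(idx)
    group_means = np.vstack([Y[idx == g].mean(axis=0)
                             for g in range(len(labels))])
    mu = group_means.mean(axis=0)        # enforces the sum-to-zero constraint
    alpha = group_means - mu
    grand = Y.mean(axis=0)
    ss_between = (counts[:, None] * (group_means - grand) ** 2).sum(axis=0)
    ss_within = ((Y - group_means[idx]) ** 2).sum(axis=0)
    G, N = len(labels), len(idx)
    F = (ss_between / (G - 1)) / (ss_within / (N - G))
    return mu, alpha, F

# Synthetic example: 3 groups x 9 replicates, effect only for t > 0.5
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
groups = np.repeat([0, 1, 2], 9)
offsets = np.array([-1.0, 0.0, 1.0])
effect = (t > 0.5).astype(float)
Y = offsets[groups][:, None] * effect[None, :]
Y = Y + 0.1 * rng.standard_normal(Y.shape)

mu, alpha, F = fanova_one_way(Y, groups)
```

Plotting `F` against `t` would show where along the wavelength axis the groups differ; comparing it with an F(G-1, N-G) critical value gives a pointwise test at each wavelength, without any correction for testing many wavelengths at once.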

27 FANOVA: case study. Spectra of manure. 4 types of animals: dairy, beef, calf, hog. 3 ambient temperatures: 4°C, 12°C, 20°C. 3 sample temperatures: 4°C, 12°C, 20°C. 9 replicates => 324 samples. Model: [equation not preserved].

28 FANOVA: case study (cont'd)

29 FANOVA: case study (cont'd)

30 Conclusions. Splines are a good basis for fitting spectral data. Using FDA, it is possible to include the vicinity constraint in chemometric prediction models. FANOVA is a good tool to explore the variance in spectral data.

