1
A Novel Bayesian Approach for Uncovering Potential Spectroscopic Counterparts for Clinical Variables in ¹H NMR Metabonomic Applications
Aki Vehtari¹*, Ville-Petteri Mäkinen¹,², Pasi Soininen³, Petri Ingman⁴, Sanna Mäkelä⁵, Markku Savolainen⁵, Minna Hannuksela⁵, Kimmo Kaski¹, and Mika Ala-Korpela¹*
¹ Laboratory of Computational Engineering, Systems Biology and Bioinformation Technology, Helsinki University of Technology, P.O. Box 9203, FI-02015 HUT, Finland
² Folkhälsan Research Center, University of Helsinki, Finland
³ Department of Chemistry, University of Kuopio, Finland
⁴ Department of Chemistry, University of Turku, Finland
⁵ Department of Internal Medicine, University of Oulu, Finland
{*Aki.Vehtari, *Mika.Ala-Korpela}@hut.fi
2
The ‘omics’ Revolution and Systems Biology
Lipoproteins – the lipid transporters in human circulation
Protein lipid aggregates
Metabo*omics
3
ATHEROSCLEROSIS
Underlies the clinical conditions leading to the death of approximately half of the people in Western countries.
A systemic disease characterised by the local build-up of lipid-rich plaques within the walls of large arteries.
4
Systems Biology & the ‘omics’ Revolution
The Trade-Off between Metabolic Coverage and the Quality of Metabolic Analysis. Fernie, Trethewey, Krotzky and Willmitzer, Nat Rev Molec Cell Biol 5, 1 (2004).
[Figure labels: Feasible, done; To be explored…; … metabo*omics…]
5
Lipoprotein subclasses are a key issue in atherothrombosis
[Figure: lipoprotein classes (chylomicrons, chylomicron remnants, VLDL, IDL, LDL, Lp(a), HDL2, HDL3) plotted by diameter (nm) against density (g/ml).]
-CH3 signal: 15 subclasses, 1 spectrum at a time
http://www.liposcience.com/ > million NMR LipoProfile® tests. J. D. Otvos et al., LipoScience Inc.
6
Quantification of Biomedical NMR Data using Artificial Neural Network Analysis: Lipoprotein Lipid Profiles from ¹H NMR Data of Human Plasma
Ala-Korpela, Hiltunen and Bell. NMR in Biomedicine 8, 235 (1995)
¹H NMR biochemistry versus clinical biochemistry
7
Metabolic information by ¹H NMR spectroscopy of serum
8
Principal Component Analysis
9
Lipoprotein Subclass Profiles via ¹H NMR Spectra: Self-Organising Maps
SOM – rather easy and rather fast
The SOM clearly organised according to the lipoprotein subclass profiles, i.e., according to the spectral information in the lipoplasma spectra.
[Figure: Lipoprotein Subclass Profiling by ¹H NMR; -N(CH3)3 region / SOM U-matrix; map labels: METABOLIC SYNDROME, METABOLIC PATHWAY, NORMAL. Suna, et al., NMR in Biomedicine, submitted.]
10
[Diagram labels: Metabolic and other individual characteristics; Metabolite profiles of pre-dose biofluids; Inter-subject variation in effects of drugs; Influence; Predictable…?!]
11
Individual Risk Assessment and Diagnostics (of Atherothrombosis)
THERE IS A CALL FOR METABONOMIC APPROACHES … particularly since:
The ¹H NMR Profile of Serum –in principle– Contains ALL the Relevant Information for the CHD Risk Assessment = Lipoprotein Subclasses + many other metabolites…
12
¹H NMR Spectra of Human Serum at 500 MHz: Molecular windows
13
A Novel Bayesian Approach for Uncovering Potential Spectroscopic Counterparts for Clinical Variables in ¹H NMR Metabonomic Applications
{*Aki.Vehtari, *Mika.Ala-Korpela}@hut.fi; vmakine2@lce.hut.fi
14
Objectives and requirements
● Quantitative target: Estimating the value of a clinical variable from a ¹H NMR spectrum.
  - Accuracy must be maximized.
● Explanatory target: What are the spectral features that best explain the clinical variable?
  - Results must be easy to interpret.
● The two requirements can conflict.
15
Dataset
● 100 serum samples from an ongoing clinical study of the effects of alcoholism (Dept Internal Medicine, Univ Oulu).
● Two ¹H NMR molecular windows (LIPO & LMWM) were measured from each sample.
● A 500 MHz NMR spectrometer with a double-tube system that enables absolute metabolite quantification.
● Automatic sample changer (24 samples in 16 h).
16
Lipoprotein spectra
[Figure: annotated lipoprotein (LIPO) window spectrum. Labels: VLDL / LDL / HDL overlapping resonances; phospholipid choline headgroup; triglycerides; HDL particles; lipid signals; lactate doublet; cholesterol.]
17
Low-molecular weight metabolites
[Figure: annotated low-molecular-weight metabolite (LMWM) window spectrum. Labels: glucose peaks; creatinine; acetate; lactate; alanine; valine; creatinine.]
18
Correlation analysis
[Figure labels: HDL particles; Triglycerides]
Abnormal triglycerides and HDL cholesterol are associated with cardiovascular diseases and are components of the metabolic syndrome, a clinically established condition with increased risk for atherosclerosis.
19
Bayesian inference*
● Regression model
  - robust against outliers
  - linearity preferred
● Feature extraction
  - biologically motivated
  - easy to interpret
  - relevant with respect to the target
● Inferred quantities: feature selection, feature weights, spectral parameterisation
*This is only a schematic view and not a precise description of the posterior density.
20
Regression model
● Heteroscedastic linear regression with a scale-mixture Gaussian noise model (asymptotically Student-t).
● μ is the α²U-scaled predictor, which effectively reduces the effect of outliers.
● The purpose of α is to improve the convergence of Gibbs sampling (Gelman et al. 2004).
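The scale-mixture idea behind the noise model can be illustrated with a minimal sketch. It assumes the standard construction in which each observation gets its own variance U drawn from a scaled inverse chi-square distribution, so that the marginal noise is Student-t. The function name, the values of `nu` and `tau`, and the sample size are hypothetical. The expansion parameter α and the Gibbs sampler itself are left out.

```python
import numpy as np

def sample_scale_mixture_noise(n, nu, tau, rng):
    """Draw n noise values as a scale mixture of Gaussians.

    U_i ~ scaled-Inv-chi2(nu, tau^2), then e_i ~ N(0, U_i).
    Marginally, e_i is Student-t with nu degrees of freedom and scale tau.
    """
    # A scaled inverse chi-square draw equals nu * tau^2 / chi2(nu).
    u = nu * tau**2 / rng.chisquare(nu, size=n)
    noise = rng.normal(0.0, np.sqrt(u))
    return noise, u

rng = np.random.default_rng(0)
nu, tau = 5.0, 1.0  # hypothetical values, for illustration only
noise, u = sample_scale_mixture_noise(200_000, nu, tau, rng)

# For nu > 2 the marginal variance of the Student-t is tau^2 * nu / (nu - 2).
empirical_var = noise.var()
theoretical_var = tau**2 * nu / (nu - 2.0)
```

Samples with a large U are effectively down-weighted in the regression, which is how this kind of model limits the influence of outliers.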
21
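Kernel-based feature extraction
[Figure: a Gaussian kernel over the spectrum, annotated with its location, its width, and truncation at 3σ on either side.]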
22
Spectral parameterisation
● Gaussian kernels truncated at 3σ.
● Each kernel is represented by width and location.
● Posterior widths and locations were obtained by slice sampling from [0.008, 0.8] ppm.
● The number of kernels was obtained by reversible jump MCMC (max 20 kernels).
● MCMCStuff software for Matlab was written by Aki Vehtari et al.
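A truncated Gaussian kernel of this kind can be sketched as follows. This is an illustration under assumptions, not the MCMCStuff implementation. It treats the kernel "width" as σ and forms a feature as the kernel-weighted average of the spectrum. The function names, the synthetic spectrum, and the frequency axis are all hypothetical.

```python
import numpy as np

def truncated_gaussian_kernel(ppm, location, sigma):
    """Gaussian bump centred at `location`, set to zero beyond 3*sigma."""
    z = (ppm - location) / sigma
    weights = np.exp(-0.5 * z**2)
    weights[np.abs(z) > 3.0] = 0.0
    return weights

def kernel_feature(ppm, spectrum, location, sigma):
    """Kernel-weighted average of the spectrum (one hypothetical feature definition)."""
    w = truncated_gaussian_kernel(ppm, location, sigma)
    return float(np.sum(w * spectrum) / np.sum(w))

# Hypothetical frequency axis and a synthetic spectrum with a single peak at 1.25 ppm.
ppm = np.linspace(0.0, 2.0, 2001)
spectrum = np.exp(-0.5 * ((ppm - 1.25) / 0.02) ** 2)

on_peak = kernel_feature(ppm, spectrum, location=1.25, sigma=0.02)
off_peak = kernel_feature(ppm, spectrum, location=0.50, sigma=0.02)
```

A kernel placed on the peak gives a much larger feature value than one placed on empty baseline. In the approach described on the slide, the locations and widths are sampled from the posterior rather than fixed by hand as they are here.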
23
Additional details
● The degrees of freedom for the residual model were obtained by slice sampling within [2, 40].
● Prior for the number of kernels was a decaying exponential to eliminate the “tail saturation” effect.
● Ten independent chains of 10 000 samples.
● Predictive replicates were found to closely match 10-fold cross-validation results.
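A decaying exponential prior over the number of kernels might look like the sketch below. The slides do not give the decay rate or say whether the count starts at one, so `rate` and the support 1..20 are assumptions chosen for illustration.

```python
import numpy as np

def kernel_count_prior(max_kernels=20, rate=0.5):
    """Normalised prior p(k) proportional to exp(-rate * k) for k = 1..max_kernels."""
    k = np.arange(1, max_kernels + 1)
    unnormalised = np.exp(-rate * k)
    return k, unnormalised / unnormalised.sum()

# `rate` is a hypothetical value; the maximum of 20 kernels is taken from the slides.
k, prior = kernel_count_prior()
```

Because the prior mass shrinks with every extra kernel, the sampler is discouraged from drifting toward the maximum kernel count unless the data support it.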
24
Quantification results
[Figure: ¹H NMR estimates plotted against non-spectroscopic measurements.]
VLDL-TG: R² = 0.97; HDL-C: R² = 0.87 (sample sizes shown on the slide: n = 75, n = 67)
25
Relevant spectral features for VLDL-TG
[Figure: relevant features along the frequency axis (ppm), with the -CH3, (-CH2-)n and -N(CH3)3 regions marked.]
The best observable VLDL-TG signal is located at the biochemically expected frequencies.
26
Relevant spectral features for HDL-C
[Figure: relevant features along the frequency axis (ppm), with the -CH3, (-CH2-)n and -N(CH3)3 regions marked.]
The best observable HDL cholesterol signals are in the phospholipid choline headgroup region.
27
Conclusions
● Kernel-based parameterisation is able to describe the data effectively.
● A Bayesian treatment with linear regression gives both accuracy in quantification and relative ease of interpretation.
● The results are biochemically fully coherent.
28
Future work
● Continued assessment of the usefulness of different approaches within a clinical research environment (interpretation).
● Systematic testing of kernel-based regression with Bayesian and other approaches (accuracy).
● Analysis across the full frequency range in both molecular windows.
● Application of Bayesian kernel parameterisation to classification problems.
29
Good life is antiCHD…!
It is most probable that the Bayesian methodology will have a crucial role in paving the way for metabonomics in the clinical arena.
Life is about probabilities (and lipoproteins)…
THANK YOU