Metabolomic Profiling in Drug Discovery: Understanding the Factors that Influence a Metabolomics Study and Strategies to Reduce Biochemical and Chemical Noise
  Mark Sanders1; Serhiy Hnatyshyn2; Don Robertson2; Michael Reily2; Thomas McClure1; Michael Athanas3; Jessica Wang1; Pengxiang Yang1; David Peake1
  1Thermo Fisher Scientific, San Jose, CA; 2Bristol Myers Squibb, Princeton, NJ; 3Vast Scientific, Boston, MA
  Draft notes: Talk is too long and will need to be edited. Edits will depend on how the data look after processing. Slides also need consistent font sizes, bullet formats, etc. Some graphics need to be cleaned up.
Metabolomics in Drug Discovery
  Identification and quantitation of endogenous “markers”
    Compound selection
      Target effects: efficacy markers
      Off-target effects: toxicity/liability markers
  Identification of markers provides mechanistic insights
    Target validation
    Mechanism of toxicity
  Early evaluation of potential clinical markers
  Ensure homogeneity within an animal study
    Prescreen for outliers and remove them from the study
Metabolomics in Drug Discovery
  Targeted Analysis
    Metabolite target analysis
      Analysis restricted to metabolites of an enzyme system that are known to be affected by a certain perturbation
    Metabolite profiling
      Analysis focused on a class of compounds associated with a particular pathway (e.g. nucleoside triphosphates, lipids, steroids)
    Only find what you are looking for
  Metabonomics
    A comprehensive analysis of all metabolites
    A measure of the fingerprint of biochemical perturbations: pattern recognition
    Useful when you don’t know what to expect
    Hypothesis generation
Metabolomics Analysis
  Goals
    Quantitative assessment of the biochemical makeup of the samples
    Differential analysis between sample groups
    Identify compounds responsible for changes
  Challenges
    Complexity of a biological sample
    Diversity of small molecule metabolites
    Wide range of metabolite concentrations
    Incomplete information: the majority of components seen by LC/MS are unknowns
    Structure elucidation of unknowns is expensive
    Need sophisticated data reduction tools and strategies to minimize “noise”
Sources of Noise in a Metabolomics Experiment
  Instrumental
    Mass stability
    Retention time stability
    Sufficient resolution to resolve isobaric components
  Chemical
    Background from column/solvents
    Multiple signals per compound
  Biological
    Different response rates to a stimulus
    Stress status
    Feeding status
    Other health factors
  Study Design
    Proper controls and randomized sampling/analysis to minimize systematic errors
  Statistical Analysis
    Limited sampling
    Overfitting data
  This talk concentrates on the instrumental, chemical, and biological sources
Instrumentation: Q Exactive
  Precursor ion selection for SIM and MS/MS functionality
  Quadrupole mass filter
    4 mm hyperbolic rods
    Isolation down to 0.4 amu
  HCD collision cell
    Analogous to LTQ Orbitrap Velos
Instrumentation: Q Exactive
  More sensitive ion source
    S-lens (stacked ring ion guide)
    Analogous to Velos Pro
    Shorter inject times for MS/MS and SIM
Instrumentation: Q Exactive
  Higher scan speed
    Spectrum multiplexing
      MX SIM: detect multiple C-trap fillings in one FT scan
      MX MS/MS: detect multiple HCD cell fillings in one FT scan
    Enhanced FT
      Improved resolution or faster acquisition speed
Instrumentation: HCD MS/MS to Confirm Identity
  [Figure: base peak chromatogram of extracted rat serum (time in min, intensity scale 1.7e7) with the HCD MS/MS spectrum of m/z 209.1, assigned as kynurenine; labeled fragment ions include m/z 74.0240, 84.9598, 94.0652, 118.0652, 120.0445, 136.0758, 146.0602, 150.0551, 174.0551 and 192.0656]
Instrumentation: Cycle Time for TOP10 HCD Method
  [Figure: chromatogram (time in min) with an expanded view showing scan times of 32.2464 and 32.2645 min, i.e. a cycle time of 1.086 sec; inset: apex-triggered MS/MS between 2.8 and 3.0 min]
Instrumentation: Q Exactive - Mass Stability
  [Figure: XIC of m/z 185.0969 ± 5 ppm (FWHM = 1.86 sec) and the corresponding mass spectra at resolution = 82,000 for four injections run 52.85 to 65.13 hrs after external calibration]
  Injection time                   RT (min)   m/z (error)            m/z (error)
  7:51pm (Ext. Cal + 52.85 hrs)    4.28       185.0968 (-0.45 ppm)   186.1003 (-0.04 ppm)
  11:14pm                          4.27       185.0968 (-0.72 ppm)   186.1002 (-0.31 ppm)
  3:32am                           4.28       185.0969 (-0.24 ppm)   186.1003 (0.39 ppm)
  8:08am (Ext. Cal + 65.13 hrs)    4.27       185.0968 (-0.72 ppm)   186.1002 (-0.47 ppm)
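The ppm figures on this slide follow from the usual relative mass error definition. A minimal sketch of that arithmetic is below; the function names are illustrative and are not taken from any vendor software.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def ppm_window(center_mz: float, ppm: float) -> tuple[float, float]:
    """Lower and upper m/z bounds of an extraction window of +/- ppm."""
    half_width = center_mz * ppm * 1e-6
    return center_mz - half_width, center_mz + half_width


if __name__ == "__main__":
    low, high = ppm_window(185.0969, 5.0)  # the XIC window named on the slide
    print(f"XIC window: {low:.6f} to {high:.6f}")
    # One unit in the fourth decimal place at m/z 185, expressed in ppm
    print(f"{ppm_error(185.0968, 185.0969):.2f} ppm")
```

Because the m/z labels are rounded to four decimals, a single unit in the last digit is already about 0.5 ppm at m/z 185, so the sub-ppm errors shown on the slide cannot be recomputed from the rounded labels alone.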
Instrumentation: Q Exactive - Resolution
  [Figure: calculated vs. measured isotope pattern, m/z 311.0 to 313.5, with labeled peaks at 311.1689 and 312.1715; the expanded A+2 region (m/z 313.12 to 313.22) shows the 34S peak at 313.1641 and the 13C2 peak at 313.1741; the Exactive at 25,000 resolution is labeled 313.1648]
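The A+2 labels on this slide (34S at 313.1641 and 13C2 at 313.1741) allow a rough estimate of the resolving power needed to separate the two isotope peaks. The sketch below uses the simple m/Δm rule of thumb; that criterion is an assumption here, and clean separation in practice generally needs a comfortable margin above the value it gives.

```python
def required_resolving_power(mz_a: float, mz_b: float) -> float:
    """Rule-of-thumb resolving power (m / delta m) to distinguish two peaks."""
    mean_mz = (mz_a + mz_b) / 2.0
    return mean_mz / abs(mz_a - mz_b)


if __name__ == "__main__":
    needed = required_resolving_power(313.1641, 313.1741)
    print(f"Needed: about {needed:,.0f}")
    print("25,000 sufficient:", 25_000 >= needed)
```

Under this rule the pair needs a resolving power of roughly 31,000, which is consistent with the slide showing a single merged peak at 25,000 resolution.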
Instrumentation: Q Exactive - Speed Example of a 3 sec wide peak run at 35K and 70K on Q Exactive
Instrumentation: Quantitation Example of Full Scan and SIM for Quantitation Does not have to be an endogenous compound Markus should have this data
Chemical: Anatomy of a U-HPLC/Orbitrap Data Set
  [Figure: full-scan spectrum, m/z 100 to 1000, with ions labeled by charge state (+1 to +4); an expanded m/z window of ± 1 Da shows isotope peaks at 852.9720, 853.4727, 853.9745 and 854.4817, indicating z = +2]
  >1,000,000 data points
  ~100,000 extracted ion peaks
  Peak areas range over ~7 orders of magnitude
  Much irrelevant data
  Much redundant data
  High quality data from the Orbitrap allows for more precise automated data processing
  Removing chemical noise: need to be able to reduce the data to chemical entities
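The z = +2 assignment on this slide follows from the spacing of the isotope peaks, which is about 1.00335/z for carbon-containing ions. A minimal sketch of that inference is below, assuming the listed peaks are consecutive isotopes and the ion is protonated; both are assumptions, and the function names are illustrative.

```python
C13_SPACING = 1.003355  # mass difference between 13C and 12C, in u
PROTON = 1.007276       # mass of a proton, in u


def infer_charge(isotope_mzs: list[float], max_charge: int = 6) -> int:
    """Estimate the charge state from the mean spacing of consecutive isotope peaks."""
    mzs = sorted(isotope_mzs)
    if len(mzs) < 2:
        raise ValueError("need at least two isotope peaks")
    mean_spacing = (mzs[-1] - mzs[0]) / (len(mzs) - 1)
    z = round(C13_SPACING / mean_spacing)
    if not 1 <= z <= max_charge:
        raise ValueError(f"implausible charge state: {z}")
    return z


def neutral_mass(monoisotopic_mz: float, z: int) -> float:
    """Neutral monoisotopic mass, assuming an [M+zH]z+ ion."""
    return (monoisotopic_mz - PROTON) * z


if __name__ == "__main__":
    peaks = [852.9720, 853.4727, 853.9745, 854.4817]  # values from the slide
    z = infer_charge(peaks)
    print("charge:", z)
    print(f"neutral mass: {neutral_mass(peaks[0], z):.4f}")
```

Collapsing each isotope cluster to one neutral mass in this way is one step toward reducing the data to chemical entities.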
Chemical: Background Subtraction
  Removing chemical noise
  Sample - Solvent blank = Analyte signals
  ~98% of lower intensity signals are eliminated
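The "sample minus solvent blank" idea can be sketched on peak lists as follows. This is an illustration only and not the algorithm used by any particular software package; the tolerances and the `min_fold` threshold are hypothetical parameters.

```python
Peak = tuple[float, float, float]  # (m/z, retention time in min, peak area)


def subtract_blank(sample: list[Peak], blank: list[Peak],
                   ppm: float = 5.0, rt_tol: float = 0.1,
                   min_fold: float = 3.0) -> list[Peak]:
    """Keep sample peaks with no matching blank peak, or whose area
    exceeds the largest matching blank peak by more than min_fold."""
    kept = []
    for mz, rt, area in sample:
        # Largest blank peak within the m/z and retention time tolerances
        background = max(
            (b_area for b_mz, b_rt, b_area in blank
             if abs(mz - b_mz) / mz * 1e6 <= ppm and abs(rt - b_rt) <= rt_tol),
            default=0.0,
        )
        if area > min_fold * background:
            kept.append((mz, rt, area))
    return kept


if __name__ == "__main__":
    sample = [(180.0652, 3.37, 4e7), (149.0233, 6.10, 2e5), (279.1591, 8.00, 9e5)]
    blank = [(149.0233, 6.12, 1.8e5), (279.1590, 8.02, 1e5)]
    for peak in subtract_blank(sample, blank):
        print(peak)
```

In this made-up example the peak at m/z 149.0233 is dropped because it is no larger than the blank, while the peak at m/z 279.1591 is kept because it is nine times the blank signal.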
Chemical: Spectral Interpretation
  [Figure: extracted ion chromatograms (2 to 5 min, peak at 3.37 min, intensity scales from 3e5 to 4e7) and mass spectrum (m/z 100 to 600) of hippuric acid]
  Hippuric acid: 12 related ions in ESI+, 24 related ions in ESI-
  Labeled ions:
    [M+H]+        m/z 180.0652
    [M+Na]+       m/z 202.05
    [2M+H]+       m/z 359.12
    [2M+Fe-H]+    m/z 413.0427
    [3M+Ca-H]+    m/z 576.1277
    [3M+Fe-2H]+   m/z 591.0930
Chemical: Spectral Interpretation
  Fe isotope pattern detected for m/z 591.0930, [3M+Fe-2H]+, C27H25N3O9Fe
  [Figure: same hippuric acid chromatograms and spectrum, with the m/z 589 to 594 region expanded]
    Measured     Theoretical
    589.0975     589.0981
    590.1034     590.1013
    591.0929     591.0935
    592.0975     592.0965
    593.1014     593.0987
    594.0981     594.1010
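The hippuric acid ion assignments on these slides can be checked by computing each adduct m/z from the elemental composition. The sketch below assumes the formula C9H9NO3 for hippuric acid and uses standard monoisotopic masses of the most abundant isotopes; the helper names are illustrative.

```python
# Monoisotopic masses in u (most abundant isotope of each element)
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
        "Na": 22.9897693, "Ca": 39.9625910, "Fe": 55.9349363}
ELECTRON = 0.00054858

HIPPURIC_ACID = {"C": 9, "H": 9, "N": 1, "O": 3}  # assumed formula


def composition_mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition (counts may be negative)."""
    return sum(MASS[element] * count for element, count in formula.items())


def adduct_mz(neutral: float, n_molecules: int, change: dict[str, int]) -> float:
    """m/z of a singly charged positive ion built from n molecules plus/minus atoms."""
    return n_molecules * neutral + composition_mass(change) - ELECTRON


# Ion label -> (number of molecules, atoms gained or lost, m/z shown on the slide)
IONS = {
    "[M+H]+":      (1, {"H": 1},            180.0652),
    "[2M+Fe-H]+":  (2, {"Fe": 1, "H": -1},  413.0427),
    "[3M+Ca-H]+":  (3, {"Ca": 1, "H": -1},  576.1277),
    "[3M+Fe-2H]+": (3, {"Fe": 1, "H": -2},  591.0930),
}

if __name__ == "__main__":
    m = composition_mass(HIPPURIC_ACID)
    for label, (n, change, observed) in IONS.items():
        calc = adduct_mz(m, n, change)
        ppm = (observed - calc) / calc * 1e6
        print(f"{label:12s} calc {calc:.4f}  slide {observed:.4f}  {ppm:+.1f} ppm")
```

The same function gives about 202.05 for [M+Na]+ and 359.12 for [2M+H]+, matching the two-decimal labels in the spectrum.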
Chemical: Varying Response with Different Ion Species
Chemical: Removing Noise from Statistics
  [Figure: fed vs. fasted and female vs. male groups, analyzed by m/z peaks and by components]
  m/z peaks: loss of group separation, increased intra-group variability
  Components: clear distinction between groups once noise is removed
Software: Approach to Data Reduction
  Sample: a complex mixture of compounds
  Component ion: an ion that can be used to represent a detected compound
  Both dimensions of LC/MS analysis have challenges
    Reproducibility of chromatographic peak position from injection to injection
    Electrospray ionization generates complex, redundant MS data
  Automation: software should automatically analyze the data as an analyst would manually
    Interpret the spectra and identify component ions
      Charge state
      Isotopes
      Adducts
    Set the component parameters based on the run with the most intense signal
  Slight shifts of the chromatographic peak can be observed even for different adducts of the same component in the same sample
  Irregular retention time shifts between injections require a special peak alignment procedure to compare different samples
  The same chemical entity may be represented by several related peaks: isotopic peaks and/or adducts, dimers, fragments, etc.
  The monoisotopic molecular weight of the component is calculated from the m/z values of the peaks in the component’s group
  The quantitative representation of the component can be the sum of the areas of all contributing peaks, or any individual signal that shows a linear response to the component’s concentration
  Want a 1:1 correlation of component ions to compounds
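The grouping of related peaks into one component can be sketched as below. This is a simplified illustration and not the Sieve algorithm: it assumes the most intense unassigned peak is [M+H]+, uses a small hypothetical adduct table, and matches peaks by fixed ppm and retention time tolerances.

```python
from dataclasses import dataclass, field

PROTON = 1.007276
# Hypothetical adduct table: label -> (molecules per ion, mass added to n * M)
ADDUCTS = {"[M+H]+": (1, 1.007276), "[M+Na]+": (1, 22.989218), "[2M+H]+": (2, 1.007276)}


@dataclass
class Peak:
    mz: float
    rt: float    # retention time in minutes
    area: float


@dataclass
class Component:
    neutral_mass: float
    rt: float
    peaks: list = field(default_factory=list)  # (adduct label, Peak) pairs

    @property
    def total_area(self) -> float:
        """Sum of the areas of all contributing peaks."""
        return sum(peak.area for _, peak in self.peaks)


def group_components(peaks, ppm=5.0, rt_tol=0.05):
    """Greedily group co-eluting adduct peaks around the most intense peak."""
    unassigned = sorted(peaks, key=lambda p: p.area, reverse=True)
    components = []
    while unassigned:
        seed = unassigned.pop(0)
        mass = seed.mz - PROTON  # assumption: the seed peak is [M+H]+
        comp = Component(mass, seed.rt, [("[M+H]+", seed)])
        remaining = []
        for peak in unassigned:
            label = None
            if abs(peak.rt - seed.rt) <= rt_tol:
                for name, (n, shift) in ADDUCTS.items():
                    expected = n * mass + shift
                    if abs(peak.mz - expected) / expected * 1e6 <= ppm:
                        label = name
                        break
            if label is None:
                remaining.append(peak)
            else:
                comp.peaks.append((label, peak))
        unassigned = remaining
        components.append(comp)
    return components


if __name__ == "__main__":
    demo = [Peak(180.0652, 3.37, 4e7), Peak(202.0475, 3.37, 1e6),
            Peak(359.1238, 3.38, 3e5), Peak(300.0000, 5.00, 2e6)]
    for c in group_components(demo):
        print(f"{c.neutral_mass:.4f} at {c.rt} min: {len(c.peaks)} peaks, area {c.total_area:.3g}")
```

Summing the areas follows the first quantitation option on the slide; choosing a single ion with a linear response instead would only change the `total_area` property.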
Software: Sieve 2.0 Workflow Provide screen shots when available
Biological: Study Description/Purpose Study designed to monitor the effect of fasting on metabolic profiles Don’t want to confuse fasting effects with drug or disease induced effects Waiting on BMS for clearance on how detailed the study description can be.
Biological: Results in Sieve 2.0 Michael to provide plots once software is ready
Biological: Findings
  Fasting has a profound impact on metabolomic profiles
  Most metabolic changes are modest in extent
  Fasting status may exacerbate or obscure drug-induced metabolic effects
  Fasting data help contextualize drug-induced changes in many metabolites
  Fasting is neither “right” nor “wrong”, but it is a significant variable in model design
  Waiting on the BMS toxicologist's interpretation of the data and its biological significance; the findings should look similar to this
Summary
  Comprehensive or untargeted metabolomics is very challenging: it is fraught with numerous sources of noise, and the cost of going down the wrong path is high
  With the right software and hardware, many of the sources of noise can be minimized
  Biological noise needs to be understood through systematic studies such as the one described here
  A more detailed summary will follow once the data have been processed and interpreted