Canadian Bioinformatics Workshops www.bioinformatics.ca.



Module 3 Metabolite Identification and Annotation – Part II

Learning Objectives
–Become familiar with other NMR spectral deconvolution tools
–Learn how spectral deconvolution is applied to GC-MS data
–Become aware of NIST and AMDIS
–Learn about various MS databases and MS database searches
–Learn about molecular formula generation
–Learn about other techniques for unknown compound ID by MS

Goal of Metabolite Annotation (figure; axis in ppm)

Metabolite ID by Spectral Deconvolution (NMR) Mixture Compound A Compound B Compound C

Alternatives to Chenomx
–AMIX (Bruker)
–AutoFit (automated fitting)
–MetaboMiner (2D NMR)
–HMDB (NMR spectral match)
–PRIMe Spin Assign (NMR spectral matching server)
–rNMR and BMRB Peaks Server
–BATMAN

AutoFit - Automated NMR Profiling

Performance of AutoFit (figure panels: Synthetic, Real)
P. Mercier et al. J Biomol NMR Apr;49(3-4):307-23

NMR Compound ID from Mixtures - MetaboMiner Raw TOCSY Spectrum ID’d Compounds

MetaboMiner Software Design
Standard reference libraries
–225 TOCSY spectra
–488 HSQC spectra
–Specialized sub-libraries for CSF, plasma and urine
Algorithms for automatic processing & compound identification
–“Minimal signature peaks”
–1D 1H peak list as sanity check
–Extra dimensional information for identification
Support for direct spectral annotation

MetaboMiner Performance

NMR Compound ID - HMDB (figure: NMR spectrum of mixture; peak list to HMDB; high scoring matches: Phenyllactate, Phenylpyruvate, Phenylacetic acid, Tropic acid, Benzyl alcohol …)

PRIMe Spin Assign

rNMR

BMRB Peaks Server

BATMAN

Metabolite ID by GC-MS (figure: GC-MS total ion chromatogram)

EI Breaks up Molecules in Predictable Ways Molecular ion Recall EI MS Generates Multiple Peaks

GC-MS Spectrum

Recall GC-MS Analytes are Derivatized Methoxime

Metabolite ID by GC-MS
–GC-MS is often best for identification of amino acids, organic acids, sugars, fatty acids and molecules with MW < 500 Da
–GC has higher resolution and reproducibility than LC
–EI-MS is more standardized than soft ionization methods, so EI spectra are more comparable
–Most common route is to use AMDIS + NIST database

NIST 11 MS Database
–243,893 EI spectra of 212,961 compounds
–9934 ion trap MS spectra for 4649 compounds
–91,557 QTOF & QqQ spectra for 3774 compounds
–224,038 RI values for 21,847 compounds

NIST MS Search Software

AMDIS (Automated Mass Spectral Deconvolution and Identification System)
Noise analysis
–Determines background noise level
Component perception
–Identifies peaks by comparing to noise
Spectral deconvolution
–Generates a “clean” or model spectrum
Compound identification
–Identifies compounds via a library search using a match factor

Match Factor (MF)
–Measures the similarity of the MS spectrum of the query to the MS spectrum in the reference database
–Defined as the normalized dot product of the query and the reference spectra
–I_ref corresponds to the intensities of the reference spectrum, I_qry corresponds to the intensities of the query spectrum, M corresponds to the masses (m/z), and w is a weighting term to penalize uncertain peaks
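A normalized dot product of this kind can be sketched in a few lines of Python. This is a minimal illustration and not AMDIS's exact formula: the function name `match_factor`, the peak weighting (m/z times the square root of intensity) and the 0 to 100 scale are assumptions chosen for the example, and the per-peak uncertainty weight w mentioned on the slide is not modeled.

```python
import math

def match_factor(query, reference, mass_power=1.0, intensity_power=0.5):
    """Weighted, normalized dot product (cosine) between two spectra.

    query, reference: dicts mapping nominal m/z -> intensity.
    Each peak is weighted as (m/z ** mass_power) * (intensity ** intensity_power);
    these exponents are illustrative assumptions, not AMDIS settings.
    Returns a score between 0 and 100.
    """
    def weighted(spectrum):
        return {mz: (mz ** mass_power) * (inten ** intensity_power)
                for mz, inten in spectrum.items() if inten > 0}

    q, r = weighted(query), weighted(reference)
    # Only peaks present in both spectra contribute to the dot product.
    dot = sum(q[mz] * r[mz] for mz in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    if norm == 0:
        return 0.0
    return 100.0 * dot / norm

# Hypothetical spectra (m/z -> relative intensity), for illustration only.
query_spectrum = {73: 100.0, 144: 80.0, 218: 30.0}
reference_spectrum = {73: 100.0, 144: 60.0}
print(round(match_factor(query_spectrum, reference_spectrum), 1))
```

Identical spectra score 100 and spectra with no shared peaks score 0, so a cutoff on this score behaves like the match factor threshold used when searching a library.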

GC-MS Protocol
–Prepare a set of n-alkane standards (8-9 n-alkanes spanning octane to hexadecane) and run them as an external calibration standard
–Run a “blank sample” containing just the solvent and derivatization agents
–Run the sample of interest (under the same conditions as the blank)

GC-MS Protocol External n-alkane standard used for RI calculation

GC-MS Protocol
–Create a calibration file using the n-alkane mixture (sets retention indices [RIs] to the standard values)
–Analyze the sample data file against the CAL (calibration) file for the alkane mixture (sets and recalculates RIs using the n-alkanes)
–Search the NIST database for matches and display the results of the search
–Get rid of “false” positives by comparing the “blank” against the sample spectrum
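The RI recalculation step can be illustrated with a short Python sketch. It assumes the linear (temperature-programmed) retention index, in which an analyte's RI is interpolated between the two n-alkanes that bracket its retention time; the slides do not say which formula AMDIS applies, and the function name and retention times below are hypothetical.

```python
from bisect import bisect_right

def retention_index(rt, alkanes):
    """Linear retention index by interpolation between bracketing n-alkanes.

    rt: analyte retention time (same units as the alkane times).
    alkanes: list of (carbon_number, retention_time), sorted by retention time.
    An n-alkane with n carbons has RI = 100 * n by definition.
    """
    times = [t for _, t in alkanes]
    if rt < times[0] or rt > times[-1]:
        raise ValueError("retention time lies outside the n-alkane calibration range")
    # Index of the alkane eluting at or just before the analyte.
    i = min(bisect_right(times, rt) - 1, len(alkanes) - 2)
    (n_lo, t_lo), (n_hi, t_hi) = alkanes[i], alkanes[i + 1]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))

# Hypothetical calibration run: octane, nonane and decane retention times in minutes.
calibration = [(8, 4.0), (9, 6.0), (10, 8.0)]
print(retention_index(5.0, calibration))  # halfway between C8 and C9
```

Because the index is anchored to the alkanes rather than to absolute time, RI values stay comparable between runs whose retention times have drifted.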

Step 1- Create Calibration File AMDIS

Step 2 – Calibrate Sample Spectrum Using CAL-file

Step 3 – Search NIST Database for Matches (figure labels: AMDIS, GC Peak List, EI-MS Spectrum)

Step 3 – Search NIST Database for Matches (Zero in)
–Match factor ≥ 60% (if in doubt, compare the “blank” and your signal)
–Example: MF = 84% match to valine; m/z 73 and 144 are the 2 most abundant peaks
–Figure labels: Reference Spectrum, Peak Spectrum

Other GC-MS Options
Alternatives to AMDIS
–AnalyzerPro (SpectralWorks)
–ChromaTOF (Leco)
–Evaluated in TrAC Trends in Analytical Chemistry, Volume 27, Issue 3, March 2008, Pages
Alternatives to NIST08 or NIST11
–Golm Database (open access)
–FiehnLib (Leco, Agilent)
–HMDB???

The Golm Database
–GC-MS (quadrupole and TOF) database
–Contains MSRI (MS + retention index) or MST data for 1450 identified metabolites
–Includes 10,336 spectra linked to analytes
–Downloadable libraries compatible with NIST08 and AMDIS software
–Primary focus on plant metabolites
–Supports compound name and MS queries
–MS submissions via NIST08 or AMDIS format

Golm Database

Golm Database

The FiehnLib GC-MS Database
–2212 EI MS spectra and RI data for quadrupole & TOF GC-MS
–Over 1000 primary metabolites below 550 Da
–Covers lipids, amino acids, fatty acids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, sterols, phosphates, hydroxyl acids, purines

Metabolite ID by LC-MS (figure: LC-MS total ion chromatogram)

Levels of Metabolite Identification in MS
4 levels of metabolite identification:
1. Positively identified compounds
–Confirmed by match to known standard
2. Putatively identified compounds
–Match to MS + RT or MS/MS + RT
3. Compounds putatively identified in a compound class
4. Unknown compounds

Metabolite ID by LC-MS
–LC-MS is often best for identification of lipids, bases, amino acids, organic acids, fatty acids and other somewhat hydrophobic molecules
–Metabolite ID typically requires both MS and MS/MS data (along with retention time information) and internal standards
–Compound ID can be done by high accuracy mass matching and/or by MS/MS matching to spectral databases

Simple MW Search DBs
–ChEBI
–PubChem
–ChemSpider
–HMDB

PubChem MW Search Available Under “Advanced Search”

PubChem Results

ChEBI MW Search

Advanced MS Search DBs
–NIST/AMDIS
–Metlin
–HMDB
–MassBank

Advanced MS Search DBs
–These databases support not only MW or MW range searches but also parent ion searches (positive, negative, neutral), peak list searches (from MS or MS/MS data) and MS/MS spectral matching
–These DBs are intended more for MS-based metabolomics and compound ID than the simple MW search tools

MS Compound ID - HMDB (figure: LC-MS spectrum; peak list to HMDB; high scoring matches: Phenyllactate, Phenylpyruvate, Atrolactic acid, Homovanillin, Coumaric acid)

MS Compound ID - HMDB
–Database of ~400,000 predicted masses from ~40,000 known metabolites
–Includes adduct mass calculations for 30+ possible or expected metabolite adducts
–Allows selection of different databases (DrugBank, HMDB, FooDB, T3DB), mass tolerance and ionization mode
–Designed for mixture deconvolution (i.e. identification of multiple compounds at a time)
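An adduct-aware mass search with a ppm tolerance might look roughly like the Python sketch below. It is not HMDB's implementation: the three-compound table is a hypothetical stand-in for a database, only five singly charged adducts are covered, and the function names are invented for the example.

```python
# Monoisotopic mass shifts (Da) for a few common singly charged adducts.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M-H]-": -1.007276,
}

# Hypothetical mini-database: name -> neutral monoisotopic mass (Da).
COMPOUNDS = {
    "valine": 117.078979,
    "phenylalanine": 165.078979,
    "glucose": 180.063388,
}

def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def search_by_mass(observed_mz, compounds=COMPOUNDS, adducts=ADDUCT_SHIFTS, tol_ppm=10.0):
    """Return (compound, adduct, ppm error) for every predicted ion within tolerance."""
    hits = []
    for name, neutral_mass in compounds.items():
        for adduct, shift in adducts.items():
            theoretical_mz = neutral_mass + shift
            error = ppm_error(observed_mz, theoretical_mz)
            if abs(error) <= tol_ppm:
                hits.append((name, adduct, round(error, 2)))
    # Best (smallest absolute error) matches first.
    return sorted(hits, key=lambda hit: abs(hit[2]))

print(search_by_mass(203.0526))  # the sodium adduct of glucose is the only hit here
```

Running the search once per observed peak gives the mixture-style output described above; in practice the ionization mode would restrict which adducts are tried.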

MS/MS Compound ID - HMDB
–Database of 1000 experimental MS/MS spectra (low, medium and high collision energies) collected on QqQ instruments, but largely valid for ion trap instruments as well
–Allows selection of different instruments (QqQ, ion trap, FT-MS, qTOF), collision energies, ionization modes, parent ion mass tolerance and fragment ion mass tolerance
–Designed for identification of a single compound at a time

Metlin MS Search
–Step 1: Enter Mass
–Step 2: Select Charge
–Step 3: Select “all”
–Step 4: “Find Metabolites”

Metlin Results

Metlin MS/MS Search (file formats shown: mzXML, mzML, mzData)

Metabolite ID - Complications
–LC-ESI-MS often leads to the production of salt adducts, neutral loss species and multiply charged species
–Up to 50% of LC-MS signals arise from these “noise” sources
–Key challenge is to distinguish adducts or multiply charged species from parent ions, or to group adducts or multiply charged species with parent ions
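One simple way to group adducts with a parent ion is to look for characteristic m/z differences between peaks. The Python sketch below assumes all peaks come from the same retention-time window and that the lighter peak of each pair is [M+H]+; the names and the 3 mDa tolerance are illustrative choices, not taken from any of the tools named in these slides.

```python
# m/z difference (Da) between a given adduct and the [M+H]+ ion of the same compound.
DELTA_FROM_PROTONATED = {
    "[M+NH4]+": 17.026547,
    "[M+Na]+": 21.981942,
    "[M+K]+": 37.955882,
}

def group_adducts(mz_values, tol_da=0.003):
    """Pair peaks whose spacing matches a known adduct difference.

    Returns (presumed [M+H]+ m/z, adduct m/z, adduct label) tuples.
    Assumes the peaks co-elute; spacing alone is not proof of a shared parent.
    """
    peaks = sorted(mz_values)
    pairs = []
    for i, base in enumerate(peaks):
        for other in peaks[i + 1:]:
            for label, delta in DELTA_FROM_PROTONATED.items():
                if abs((other - base) - delta) <= tol_da:
                    pairs.append((base, other, label))
    return pairs

# Hypothetical co-eluting peaks: glucose [M+H]+, [M+Na]+, [M+K]+ plus an unrelated ion.
print(group_adducts([181.0707, 203.0526, 219.0265, 150.0]))
```

Grouped peaks can then be collapsed to a single neutral mass before database searching, which reduces the number of spurious hits.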

Adduct Formation Effect on ESI Mass Spectrum Sample Na Adducts

Common Adducts in DI-MS

Fiehn Lab Adduct Table

MZedDB – Adduct Calculator

MZedDB – Results for C6H12O6

Neutral Loss Fragments

Handling MS Complications
–MZedDB, Metlin and HMDB are able to handle or predict adducts
–Metlin and MZedDB are able to handle or predict ion pairs or multiply charged species
–Metlin can potentially handle or predict neutral loss species
–Searching by MS or MS ranges can lead to lots of hits (high false-positive rate)

Exploiting High Mass Accuracy to ID Compounds
Table: mass accuracy (ppm) by instrument type; the ppm values were lost in extraction
–Linear ion trap (10 ppm in Ultra-Zoom)
–Triple quad
–Q-TOF
–TOF-MS
–Magnetic sector
–Orbitrap
–FT-ICR-MS

Molecular Formula Generators
–Formula generators are used to create molecular formulae from very accurate masses obtained by FT-MS or Orbitrap
–Assist in compound ID by LC-MS (formula is more restrictive than MW)
–Input typically requires:
–Accurate isotopic mass (with or without adduct)
–Error in ppm or mDa (milliDaltons)
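The idea behind a formula generator can be shown with a brute-force Python sketch restricted to C, H, N and O. It is not how MWTWIN, HighChem or MZedDB work internally; the element limits, function names and 5 ppm default are assumptions for the example, and no chemical validity check is applied, so implausible formulas can appear among the candidates.

```python
# Monoisotopic masses (Da) of the elements considered in this sketch.
MONO = {"C": 12.0, "H": 1.0078250319, "N": 14.0030740052, "O": 15.9949146221}

def format_formula(c, h, n, o):
    """Build a formula string such as 'C6H12O6', skipping absent elements."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count == 1:
            parts.append(symbol)
        elif count > 1:
            parts.append(f"{symbol}{count}")
    return "".join(parts)

def generate_formulas(target_mass, tol_ppm=5.0, max_n=10, max_o=20):
    """Enumerate CHNO formulas whose monoisotopic mass is within tol_ppm of target_mass."""
    tol_da = target_mass * tol_ppm / 1e6
    found = []
    for c in range(int(target_mass // MONO["C"]) + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = target_mass - c * MONO["C"] - n * MONO["N"] - o * MONO["O"]
                if rest < -tol_da:
                    break  # adding more oxygen only overshoots further
                h = round(rest / MONO["H"])  # hydrogens fill the remaining mass
                mass = c * MONO["C"] + h * MONO["H"] + n * MONO["N"] + o * MONO["O"]
                error_ppm = (mass - target_mass) / target_mass * 1e6
                if abs(error_ppm) <= tol_ppm:
                    found.append((format_formula(c, h, n, o), round(error_ppm, 2)))
    return sorted(found, key=lambda item: abs(item[1]))

# Neutral monoisotopic mass close to that of glucose (C6H12O6).
for formula, error in generate_formulas(180.0634):
    print(formula, error)
```

Tightening `tol_ppm` shrinks the candidate list, which is why instrument mass accuracy matters so much for this approach.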

Molecular Formula Generators (MWTWIN) Accurate mass Mass error

Molecular Formula Generators (HighChem)

Molecular Formula Generator Server (MZedDB)

Finding Compounds By Molecular Formula - PubChem

Finding Compounds By Molecular Formula - ChEBI

Formula Filters Use additional MS information (isotopic abundance) as well as chemical bonding restrictions (Lewis & Senior rules), known or presumed atomic compositional data and matches to known or hypothesized structures to reduce the possible # of structures/formulas that are generated
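As one concrete example of a bonding restriction, the Python sketch below rejects formulas whose ring and double bond equivalent (RDBE) is negative or non-integer. This is a simplified valence check assuming standard valences for neutral molecules; it is not the full Lewis and Senior rule set or the 7GR software, and the function names are hypothetical.

```python
import re

def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def rdbe(counts):
    """Ring and double bond equivalents: C - (H + X)/2 + (N + P)/2 + 1.

    Halogens (X) count like hydrogen; N and P are treated as trivalent.
    Divalent O and S do not change the value.
    """
    carbons = counts.get("C", 0)
    monovalent = sum(counts.get(e, 0) for e in ("H", "F", "Cl", "Br", "I"))
    trivalent = counts.get("N", 0) + counts.get("P", 0)
    return carbons - monovalent / 2 + trivalent / 2 + 1

def passes_rdbe_filter(formula):
    """Keep neutral formulas with a non-negative, integer RDBE."""
    value = rdbe(parse_formula(formula))
    return value >= 0 and value == int(value)

for candidate in ("C6H12O6", "C6H6", "C6H15O6", "CH6"):
    print(candidate, rdbe(parse_formula(candidate)), passes_rdbe_filter(candidate))
```

A filter like this is cheap to run on every candidate from a formula generator before the more informative isotopic abundance comparison is applied.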

Fiehn’s 7 Golden Rules (7GR) Formula Filter

7GR Software

Molecular Formula Space of Small Molecules

Frequency Distribution of Molecular Formulas

Impact of Mass Accuracy on Formula Numbers

Mass + Isotope Abundance
Example: ESI-MS (+) of solanine on an LTQ
–Resolving power: 1700
–Mass accuracy: 46 ppm
–Isotopic abundance error: ±1.46%
–C45H73NO15 MW = [M+H]+

Mass Isomers Are Hard To Distinguish by MS Alone
Use Retention Time or Isomer Generators to Distinguish

Molecular Isomer Generators
–Example: MOLGEN DEMO (Bayreuth)
–Creates all possible structural isomers from a given molecular formula

Size of Molecular Isomer Space is Unknown (table: accurate mass, formula, number of isomers and number in Beilstein DB for small formulas from CH2O upward; the numeric values were garbled in extraction)

Some Points of Caution
–Many databases (PubChem, ChEBI, Metlin, FiehnLib, NIST) mix non-metabolites with metabolites, plant metabolites with animal and/or microbial metabolites, or drugs/buffer reagents with metabolites
–This leads to many “silly” hits
–If you know the source organism, use this information to limit the search or use organism-specific metabolome databases (HMDB, FooDB, DrugBank, KnapSack, etc.)

Alternatives to Mass Filtering and Mass Matching
–Use chemoselective labeling (similar to proteomics) to simplify the identification of “true” metabolites, reduce the number of signals and eliminate false positives
–Use MS-based kits (Biocrates)
–Use concepts in Computer-Aided Structure Elucidation (CASE) to assist in compound ID

Quantitative MS Metabolomics With Chemoselective Labeling (figure labels: LC-MS Analysis, Mix, Pooled Analysis, Individual Analysis)

Quantitative MS Metabolomics With Chemoselective Labeling

Quantitative MS Metabolomics in Human Urine
–Amino labeling (2.51 mM, 30 nM): 672 peaks, 120 standards spiked, 92 peaks identified/quantified
–Carboxy labeling (30 nM, mM): 820 peaks, still assessing
–Guo K. & Li L. Anal Chem May 15;81(10):

Advantages of Derivatization
–Tags can convert non-UV active compounds into UV or fluorescently detectable compounds
–Tags improve ionization efficiency and lower the limit of detection
–Tags permit affinity purification and concentration
–Tags make polar molecules hydrophobic, leading to better LC separations
–Tags permit isotope-based quantification
–Tags greatly increase the number of compounds detected
–Tags allow independent confirmation of “real” peaks
–Best route to automated ID & quantification by LC-MS

BioCrates IDQ Kit 40 acylcarnitines, 13 amino acids, 15 LysoPCs, 77 PCs, 15 SMs = 160

Multiple Reaction Monitoring Q1 Q3 CH 3 CD 3

Sample Urine Metabolite List
Concentration range from 10 nM to 7.2 mM (~1,000,000-fold concentration range)
Arginine 38.7 uM | Tyrosine uM | C14:2 Carn 0.03 uM | C4:1 Carn uM | C8 Carnitine 1.05 uM | PC(36:5) aa uM | LysoPC-20: uM | SM(22:3) uM
Glutamine uM | Valine 37.0 uM | C14:2-OH 0.02 uM | C5 Carnitine 4.39 uM | C9 Carnitine 1.37 uM | PC(38:5) aa uM | LysoPC-6: uM | SM(24:0) uM
Glycine uM | Leu/Ile uM | C16 Carn uM | C6-OH Carn uM | PC(28:1) aa uM | PC(42:4) aa uM | SM(OH)16: uM | SM(24:1) uM
Histidine uM | Carnitine 73.2 uM | C16-OH Carn uM | C5-M-DC uM | PC(30:2) aa uM | PC(38:3) ae uM | SM(OH)22: uM | SM(26:0) uM
Methionine 15.6 uM | C10 Carn uM | C16:1-OH uM | C5-OH Carn 1.46 uM | PC(34:1) aa uM | PC(38:4) ae uM | SM(OH)22: uM | SM(26:1) uM
Phenylalanine 52.7 uM | C10:1 Carn 1.83 uM | C2 Carnitine 45.2 uM | C5:1 Carn 1.84 uM | PC(34:2) aa uM | PC(38:5) ae uM | SM(OH)24: uM | Glucose 2264 uM
Proline 42.9 uM | C10:2 Carn uM | C3 Carnitine 2.12 uM | C5:1-OH uM | PC(34:4) aa uM | PC(38:6) ae uM | SM(16:0) uM | Creatinine 7222 uM
Serine uM | C12 Carn uM | C3-OH Carn uM | C6 Carnitine uM | PC(36:1) aa uM | PC(40:5) ae uM | SM(16:1) uM
Threonine uM | C14 Carn uM | C4 Carnitine 11.0 uM | C6:1 Carn uM | PC(36:3) aa uM | PC(42:3) ae uM | SM(18:1) uM
Tryptophan 15.0 uM | C14:1-OH uM | C4-OH Carn uM | C8-OH Carn uM | PC(36:4) aa uM | PC(44:3) ae uM | SM(20:2) uM

CASE – Computer-Aided Structure Elucidation
–Two approaches: Bottom-Up and Top-Down
–Top-Down uses known metabolites and generates variants (via metabolic transformation or other bio-informed methods), predicts their properties/spectra/MW and compares them to the observed spectra/properties of the unknown
–Bottom-Up uses known fragments of molecules, assembles the fragments into logical structures, predicts the properties/spectra and compares them to the observed spectra/properties of the unknown

Top-Down CASE Methods
–Known metabolites (20,000)
–Predicted biotransformations (20,000 --> 200,000)
–Predicted MS, MS/MS, NMR, GC-MS spectra
–Match observed spectra to predicted spectra to ID
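The mass-matching part of the top-down idea can be sketched in Python as follows. Only the neutral mass is predicted here, whereas the approach above also predicts spectra; the reaction list is a small illustrative subset, and the single-entry metabolite table and function names are hypothetical.

```python
# Monoisotopic mass changes (Da) for a few common biotransformations.
BIOTRANSFORMATIONS = {
    "hydroxylation (+O)": 15.994915,
    "methylation (+CH2)": 14.015650,
    "acetylation (+C2H2O)": 42.010565,
    "sulfation (+SO3)": 79.956815,
    "glucuronidation (+C6H8O6)": 176.032088,
}

def predict_variants(known_metabolites):
    """Yield (parent, reaction, predicted neutral mass) for one-step transformations."""
    for parent, mass in known_metabolites.items():
        for reaction, shift in BIOTRANSFORMATIONS.items():
            yield parent, reaction, mass + shift

def match_unknown(observed_mass, known_metabolites, tol_ppm=10.0):
    """List predicted variants whose mass agrees with the observed neutral mass."""
    matches = []
    for parent, reaction, predicted in predict_variants(known_metabolites):
        error_ppm = (observed_mass - predicted) / predicted * 1e6
        if abs(error_ppm) <= tol_ppm:
            matches.append((parent, reaction))
    return matches

# Hypothetical library with one known metabolite (neutral monoisotopic mass in Da).
library = {"phenylalanine": 165.078979}
print(match_unknown(181.0739, library))  # consistent with hydroxylated phenylalanine
```

A mass match of this kind only proposes a candidate; predicted MS/MS or NMR spectra would still be needed to confirm the structure.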

MyCompoundID Computationally metabolizes 8000 compounds in HMDB to create a dataset of 400,000 possible theoretical metabolites

MyCompoundID

Bottom-Up (Traditional) CASE
–Known metabolite substructures or metabolite EI or CID fragments
–Neural network or GA driven fragment assembly
–Predicted (or known) MS, MS/MS, NMR, GC-MS fragment spectra
–Match observed spectra to predicted spectra to ID