Metabolomics Part 1 PCB 5530 Fall 2017
Metabolomics
Day 1: Introduction to Metabolomics
• Basics & limitations of metabolomics
• Sample preparation
• Chromatography
Day 2: Mass spectrometry in metabolomics
Day 3: Metabolomics data analysis
Definitions and Background
Metabolome = the total metabolite pool
• All low molecular weight (< 2000 Da) organic molecules in a sample such as a single cell, leaf, fruit, seedling, etc.
• Examples: sugars, nucleosides, organic acids, ketones, aldehydes, amines, amino acids, small peptides, lipids, sterols, terpenes, alkaloids
Definitions and Background
Metabolomics = high-throughput analysis of metabolites
• Metabolomics is the simultaneous measurement of the levels of a large number of metabolites (typically several hundred).
• However, due to the complexity, many of these are not identified (i.e. they are just peaks in a profile).
Definitions and Background
Approaches, ordered from widest scope to highest accuracy:
• Untargeted metabolomics: measures many compounds (ratios)
• Metabolic profiling: measures a set of related compounds (e.g. phosphate esters)
• Targeted analysis: measures specific compounds (quantitation)
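The difference between a ratio readout and quantitation can be sketched in a few lines. This is a minimal illustration only: the function names and all peak areas and standard concentrations below are made up for the example and do not come from the lecture. Relative levels are expressed as a ratio of peak areas, while absolute quantitation reads a concentration off a calibration line fitted to standards of known concentration.

```python
def fold_change(treated_area, control_area):
    """Relative level: ratio of peak areas between two samples."""
    return treated_area / control_area


def fit_calibration(concs, areas):
    """Least-squares line area = slope * conc + intercept from standards."""
    n = len(concs)
    mean_x = sum(concs) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(area, slope, intercept):
    """Absolute concentration read back from the calibration line."""
    return (area - intercept) / slope


# Hypothetical standards (concentration in uM, peak area in arbitrary units).
standards_uM = [1.0, 2.0, 5.0, 10.0]
standard_areas = [210.0, 410.0, 1010.0, 2010.0]

slope, intercept = fit_calibration(standards_uM, standard_areas)
print("fold change:", fold_change(900.0, 300.0))
print("concentration (uM):", quantify(810.0, slope, intercept))
```

A ratio needs no standards, which is why it scales to many unidentified peaks; the calibration line needs an authentic standard for each compound, which is why it is usually reserved for targeted work.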
How old is the field of metabolomics?
• Profiling of blood and urine for clinical detection of human disease has been carried out for centuries.
• Ulrich Pinder, 1506: Epiphanie Medicorum. The urine wheel describes possible colors, smells and tastes of urine.
Nicholson, J. K. & Lindon, J. C. Nature 455, 1054–1056 (2008).
How old is the field of metabolomics?
• Advanced chromatographic separation techniques were developed in the late 1960s.
• Linus Pauling published “Quantitative Analysis of Urine Vapor and Breath by Gas-Liquid Partition Chromatography” in 1971.
• Chuck Sweeley at MSU helped pioneer metabolic profiling using gas chromatography/mass spectrometry (GC-MS).
• Plant metabolic biochemists (e.g. Lothar Willmitzer) were among other early leaders in the field.
[Figure: number of metabolomics publications]
• Metabolomics is expanding to catch up with other multiparallel analytical techniques (transcriptomics, proteomics) but remains far less developed and less accessible.
The Metabolome: Size and Concentrations
[Figure: The Pyramid of Life. Chemicals: 8,000 in all mammals, 20,000 in all microbes, 200,000 in all plants]
• All plant species combined contain on the order of 10^5 compounds.
• Individual plant species are estimated to contain 5,000 – 30,000 compounds.
• Ratio of metabolites/genes much smaller than in microbes
• Metabolic profiling much harder than in other organisms
Power of Metabolomics
Silent knockout mutations
• ~90% of Arabidopsis knockout mutations are silent (no visible phenotype)
• ~85% of yeast genes are not needed for survival
Metabolic Control Analysis
• Growth rate (sum of all metabolic fluxes) is unchanged in silent knockout mutations
• Pool sizes of metabolites can change to compensate for the effect of a mutation while metabolic fluxes stay unchanged; these pool-size changes can be measured!
Power of Metabolomics
Silent knockout mutations: example
• Chloroplast 2010 project (phenotype analysis of knockouts of Arabidopsis genes encoding predicted chloroplast proteins)
• Various knockouts showed essentially normal growth and color but highly abnormal free amino acid profiles, e.g. At1g50770 (‘Aminotransferase-like’)
Why Metabolomics is Difficult
Chemical diversity
[Figure: The Pyramid of Life]
• Genomics: 5 bases
• Proteomics: 20 amino acids
• Metabolomics: 10^5 molecules
Why Metabolomics is Difficult
• Concentrations of cellular metabolites vary over several orders of magnitude (mM to pM)
• Differences in molecular weight (20-2000 Da)
• High turnover rates
• Some metabolites are labile
[Figure: panels with axes Concentration and Response Time; labels include Proteomics and Genomics]
Why Metabolomics is Difficult
• Dihydrozeatin riboside monophosphate: < 1 fmol mg⁻¹ leaf fresh weight
• Sucrose: cytosolic concentration as high as 50 mM in spinach leaves
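To put these numbers on one scale, the dynamic range can be expressed in orders of magnitude. The sketch below is a rough illustration, not a measurement: the helper names are hypothetical, and converting fmol per mg fresh weight to a molar concentration assumes a tissue density of 1 g/mL (an assumption not stated in the slides). It also compares a cytosolic concentration with a whole-tissue average, so the result is only an order-of-magnitude estimate.

```python
import math


def orders_of_magnitude(high_molar, low_molar):
    """Span between two concentrations as a base-10 logarithm."""
    return math.log10(high_molar / low_molar)


def fmol_per_mg_to_molar(fmol_per_mg, density_g_per_ml=1.0):
    """Convert fmol per mg fresh weight to mol/L.

    ASSUMPTION: tissue density of 1 g/mL, so 1 mg of tissue is about 1 uL.
    """
    mol = fmol_per_mg * 1e-15
    litres_per_mg = (1e-3 / density_g_per_ml) * 1e-3  # mg -> g -> mL -> L
    return mol / litres_per_mg


# mM to pM, the general range quoted for cellular metabolites.
print(orders_of_magnitude(1e-3, 1e-12))

# Sucrose at 50 mM versus a cytokinin at 1 fmol per mg fresh weight.
cytokinin_molar = fmol_per_mg_to_molar(1.0)
print(cytokinin_molar)
print(orders_of_magnitude(50e-3, cytokinin_molar))
```

Under these assumptions the two example compounds differ by more than seven orders of magnitude, which is wider than the linear range of a single detector run.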
Metabolomics
Steps in metabolomics:
1) Sample preparation
2) Sample extraction
3) Chromatography
4) Detection
5) Data analysis
Sample Preparation
Growth/Sample size
• Grow organisms (e.g. plants or bacteria) under identical conditions
• Randomize the treatment groups (make sure the effects you measure reflect differences between samples, not the experimental setup)
• Number of replicates depends on what you want to find:
 - Large differences = small replication needed
 - Small differences = large replication needed
• In general, a minimum of six replicates for each treatment is needed (due to high biological variability)
• Grow more than you think you’ll need!
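The link between effect size and replication, and the randomization step, can be sketched as follows. This is a minimal illustration under stated assumptions: `replicates_needed` uses the standard normal-approximation formula for comparing two group means (n per group ≈ 2(z₁₋α/₂ + z_power)² / d², where d is the difference divided by the standard deviation), which slightly underestimates what a t-test needs at small n. The function names, the default alpha and power, and the treatment labels are choices made for this example, not values from the lecture.

```python
import math
import random
from statistics import NormalDist


def replicates_needed(effect_size, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-group comparison.

    effect_size is the expected difference in means divided by the
    standard deviation. Normal approximation, so treat it as a lower bound.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def randomize_positions(treatments, replicates, seed=None):
    """Return treatment labels in random order, one per growth position."""
    labels = [t for t in treatments for _ in range(replicates)]
    random.Random(seed).shuffle(labels)
    return labels


for d in (2.0, 1.0, 0.5):
    print(f"effect size {d}: {replicates_needed(d)} replicates per group")

# Hypothetical layout: 2 treatments x 6 replicates across 12 tray positions.
print(randomize_positions(["control", "mutant"], replicates=6, seed=1))
```

Halving the effect size roughly quadruples the replication needed, which is the quantitative form of "small differences = large replication".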
Sample Preparation
• Biological replicates: high variability, but “this is life”
• Technical replicates: “Is your sampling/extraction method robust?”
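One common way to compare the two kinds of variability is the coefficient of variation (CV, standard deviation as a percentage of the mean). The sketch below is an illustration only; the peak areas are invented numbers chosen to show the typical pattern, not data from the course.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical peak areas for one metabolite (arbitrary units).
technical = [98.0, 102.0, 100.0, 101.0, 99.0]          # same extract, repeated
biological = [80.0, 120.0, 95.0, 110.0, 70.0, 125.0]   # different plants

print(f"technical CV:  {cv_percent(technical):.1f}%")
print(f"biological CV: {cv_percent(biological):.1f}%")
```

If the technical CV approaches the biological CV, the sampling or extraction method is adding noise comparable to the biology and is worth revisiting before scaling up.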
Sample Preparation
Sample collection
• Uniform sample sizes (e.g. hole punches in leaves)
• Be consistent:
 - similar tissue
 - time of day
• Quickly freeze samples in liquid nitrogen and store them at -80°C
• Fast-harvesting method for bacteria (~30 sec)
Sample Extraction
Choosing an extraction method
• No universal extraction method exists
• Some solvents may degrade certain compounds
• It’s good to have some idea of which metabolites you want to extract: untargeted metabolomics / metabolic profiling / targeted analysis
Sample Extraction
Choosing an extraction method
• Physical disruption: grind by hand? Mechanical?
• Extraction time
• Extraction efficiency
• How do you check for sufficient extraction?
Sample Extraction
• The method should be consistent and reproducible
• Further workup may be required, especially for targeted analysis (e.g. solid phase extraction)
Sample Extraction
Why is it particularly difficult in plants? Plants are tough!
• Plants contain tissues made up of complex polymers that are difficult to homogenize (e.g. lignin, cellulose, starch). Homogenizing roots is much more difficult than homogenizing E. coli.
• These structural polymers also contain metabolites which are difficult to extract.
• Plant leaves are made up of ~30 different cell types, so incomplete homogenization can lead to high variability.
• Plants contain a wider variety of metabolites (both in number and chemical composition) than most organisms.
Chromatography introduction
• Invented in 1900 by Mikhail Tsvet (used to separate plant pigments)
• Types include:
 - TLC (thin-layer chromatography)
 - GC (gas chromatography)
 - LC (liquid chromatography)
• GC and LC are routinely used in metabolomics
Principle of chromatography
Principle: separation of compounds based on differential partitioning between a stationary phase and a mobile phase.
https://www.khanacademy.org/test-prep/mcat/chemical-processes/separations-purifications/a/principles-of-chromatography
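Differential partitioning shows up in a chromatogram as different retention times, and a few standard textbook quantities describe it. These definitions are not given in the slides; they are added here as a minimal sketch: the retention factor k = (t_R - t_0) / t_0, the selectivity α = k_late / k_early, and the resolution R_s = 2(t_R2 - t_R1) / (w_1 + w_2), where t_0 is the hold-up time of an unretained compound and w is the peak width at base. All numeric values below are hypothetical.

```python
def retention_factor(t_r, t_0):
    """k: time spent in the stationary phase relative to the mobile phase."""
    return (t_r - t_0) / t_0


def selectivity(k_early, k_late):
    """alpha: ratio of retention factors of two neighbouring peaks (>= 1)."""
    return k_late / k_early


def resolution(t_r1, t_r2, w1, w2):
    """R_s from retention times and baseline peak widths (same time units)."""
    return 2 * (t_r2 - t_r1) / (w1 + w2)


# Hypothetical run: hold-up time 1.0 min, two peaks at 4.0 and 5.0 min,
# each 0.4 min wide at the base.
k1 = retention_factor(4.0, 1.0)
k2 = retention_factor(5.0, 1.0)
print("k1, k2:", k1, k2)
print("alpha:", selectivity(k1, k2))
print("Rs:", resolution(4.0, 5.0, 0.4, 0.4))
```

A compound that partitions more strongly into the stationary phase has a larger k and elutes later; a resolution of about 1.5 or more is conventionally taken as baseline separation.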
Chromatography
Gas Chromatography
• GC = ‘good chromatography’
• Separation according to differences in volatility & structure
• For compounds that are sufficiently volatile and thermostable
Chromatography
Gas Chromatography
• Mostly used for untargeted metabolomics
• Optimized over several decades
• High reproducibility
• Easy to use
• Good software
• ‘Standardized’ GC method with very good databases for compound identification
• Very universal for compounds < 600 Da
• Great coverage for polar compounds (for the same coverage, 3+ LC-MS methods are needed)
• Only tool for volatiles
Limitations:
 - high temperatures can destroy labile compounds
 - polar compounds cannot ‘fly’ on GC columns and must first be derivatized
 - difficult to collect fractions
Chromatography
[Figure: GC profile; x-axis: retention time (min)]
Chromatography
Sample derivatization
Gas chromatography requires volatile compounds (two-step derivatization in the vial):
• Step 1) Methoximation of aldehyde and keto groups (primarily for opening reducing ring sugars)
• Step 2) Silylation of polar hydroxy, thiol, carboxy and amino groups with the silylation agent MSTFA
• Z/E isomers have the same mass spectrum but differ by 2 seconds in retention time
• A single compound with multiple active groups will result in multiple peaks (1TMS, 2TMS, 3TMS)
• GC-MS can distinguish between stereoisomers
Kind T, Wohlgemuth G, Lee do Y, Lu Y, Palazoglu M, Shahbaz S, Fiehn O. FiehnLib: mass spectral and retention index libraries for metabolomics based on quadrupole and time-of-flight gas chromatography/mass spectrometry. Anal Chem. 2009 Dec 15;81(24):10038-48. doi: 10.1021/ac9019522.
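The effect of the two derivatization steps on molecular mass can be worked out from elemental composition. The sketch below is illustrative and rests on standard chemistry rather than on anything stated in the slides: each trimethylsilyl (TMS) group replaces one active hydrogen with Si(CH3)3, a net gain of C3H8Si, and each methoximation converts C=O to C=N-OCH3, a net gain of CH3N. The function names and the choice of glycine and glucose as examples are assumptions made for this example.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
        "O": 15.99491462, "Si": 27.97692653}


def formula_mass(formula):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


# Net mass added per derivatization event.
TMS_GAIN = formula_mass({"C": 3, "H": 8, "Si": 1})   # H replaced by Si(CH3)3
MEOX_GAIN = formula_mass({"C": 1, "H": 3, "N": 1})   # C=O becomes C=N-OCH3


def derivative_mass(base_mass, n_tms, n_meox=0):
    """Mass of a compound carrying n_tms TMS groups and n_meox methoximes."""
    return base_mass + n_tms * TMS_GAIN + n_meox * MEOX_GAIN


# Glycine has three active hydrogens, so up to three TMS derivatives can
# appear as separate peaks.
glycine = formula_mass({"C": 2, "H": 5, "N": 1, "O": 2})
for n in (1, 2, 3):
    print(f"glycine {n}TMS: {derivative_mass(glycine, n):.4f} Da")

# Glucose after methoximation (1x) and full silylation (5x).
glucose = formula_mass({"C": 6, "H": 12, "O": 6})
print(f"glucose MeOX 5TMS: {derivative_mass(glucose, 5, 1):.4f} Da")
```

Each additional TMS group shifts the mass by about 72 Da, which is one reason partially and fully silylated forms of the same metabolite have to be tracked as separate entries in a GC-MS library.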
Liquid Chromatography
LC = ‘Lousy chromatography’
• Mobile phase: liquid
• Analyte separation based on differences in interaction with the column & mobile phase
• For small molecules, macromolecules and ionic compounds (not volatiles)
• Thermostable & thermolabile compounds
Liquid Chromatography
LC = ‘Lousy chromatography’
• Relatively new, with recent advances
• Thousands of columns available: normal phase, reversed phase, ion exchange, HILIC
• New columns are constantly being developed to improve resolution, sensitivity and run time
• Infinite solvent systems possible
• Selection of the chromatographic configuration depends on physicochemical properties: solubility, polarity, weight
• Separation is based on complex interactions of analytes with the column and mobile phase
Advantages:
 - compounds can be collected after separation
 - derivatization is not necessary
 - a separation protocol can be optimized for nearly any compound
BUT:
 - low reproducibility
 - no massive databases
Chromatography
[Figure: LC profile; x-axis: retention time (min)]
Liquid Chromatography
LC = ‘Lousy chromatography’
Normal phase chromatography
• Stationary phase: polar (silica)
• Mobile phase: non-polar (hexane, ethyl acetate, chloroform)
• Analytes: non-polar and water-insoluble (lipids)
• Polar compounds are retained
• Not MS compatible
HILIC
• Stationary phase: polar (silica, modified silica)
• Mobile phase: polar (water, acetonitrile)
• Analytes: very polar metabolites
• Polar compounds are retained
• MS compatible
Reversed phase chromatography
• Stationary phase: non-polar modified silica (C18, C8, phenyl)
• Mobile phase: polar (water, acetonitrile, methanol)
• Analytes: polar metabolites
• Polar compounds elute first
• MS compatible
Chromatography
What to use for what compounds?
[Figure: chart assigning compound classes to separation techniques along a polarity scale (very polar, polar, medium-polar, non-polar)]
• Techniques: RP-HPLC (reversed phase high performance liquid chromatography), GC (gas chromatography), HILIC (hydrophilic interaction chromatography)
• Compound classes shown: non-polar vitamins, amino/organic acids, volatile organic compounds, sterols, fatty acids, sugars, triglycerides