BIOM 209/CHEM 210/PHARM 209 Interrogating Gene, Protein and Lipid Databases: A Bioinformatics Perspective Dr. Eoin Fahy, University Of California San Diego.


1 BIOM 209/CHEM 210/PHARM 209 Interrogating Gene, Protein and Lipid Databases: A Bioinformatics Perspective Dr. Eoin Fahy, University Of California San Diego Professor Edward A. Dennis Department of Chemistry and Biochemistry Department of Pharmacology, School of Medicine University of California, San Diego Copyright/attribution notice: You are free to copy, distribute, adapt and transmit this tutorial or individual slides (without alteration) for academic, non-profit and non-commercial purposes. Attribution: Edward A. Dennis (2010) “LIPID MAPS Lipid Metabolomics Tutorial” www.lipidmaps.org E.A. DENNIS 2016 © ®

2 Definition of a lipid*: Lipids may be broadly defined as hydrophobic or amphiphilic small molecules that originate entirely or in part from two distinct types of biochemical subunits or "building blocks": ketoacyl and isoprene groups. Using this approach, lipids may be divided into eight categories: fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, saccharolipids and polyketides (derived from condensation of ketoacyl subunits); and sterol lipids and prenol lipids (derived from condensation of isoprene subunits). * Fahy, E. et al., Journal of Lipid Research, Vol. 46, 839-862, May 2005

3 Fundamental biosynthetic units of lipids

4 Lipid classification: biosynthetic routes

5 LIPID MAPS Classification System: Categories and Examples
Category (Abbreviation): Example
Fatty acyls (FA): Dodecanoic acid
Glycerolipids (GL): 1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycerol
Glycerophospholipids (GP): 1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycero-3-phosphocholine
Sphingolipids (SP): N-(tetradecanoyl)-sphing-4-enine
Sterol lipids (ST): Cholest-5-en-3β-ol
Prenol lipids (PR): 2E,6E-farnesol
Saccharolipids (SL): UDP-3-O-(3R-hydroxy-tetradecanoyl)-αD-N-acetylglucosamine
Polyketides (PK): Aflatoxin B1

6 J. Lipid Res. Classification publications: Journal of Lipid Research, Vol. 46, 839-862, May 2005; Journal of Lipid Research, 50th anniversary edition, May 2009

7 LIPID MAPS Lipid classification system
Category (Abbrev): Example
Fatty Acyls (FA): Arachidonic acid
Glycerolipids (GL): 1-hexadecanoyl-sn-glycerol
Glycerophospholipids (GP): 1-hexadecanoyl-2-(9Z-octadecenoyl)-sn-glycero-3-phosphocholine
Sphingolipids (SP): Sphingosine
Sterol Lipids (ST): Cholesterol
Prenol Lipids (PR): Retinol
Saccharolipids (SL): Kdo2-lipid A
Polyketides (PK): epothilone D
Name: PGE2
LM_ID: LMFA03010003
LM_ID description:
Database: LM (LIPID MAPS)
Category: FA (Fatty Acyls)
Main Class: 03 (Eicosanoids)
Sub Class: 01 (Prostaglandins)
Unique identifier within a sub class: 0003
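The LM_ID breakdown on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the fixed 12-character layout shown for LMFA03010003; the names `LMID` and `parse_lm_id` are invented for this example and are not part of any LIPID MAPS software.

```python
from typing import NamedTuple


class LMID(NamedTuple):
    """Fields of an LM_ID, following the slide's description."""
    database: str    # "LM" for LIPID MAPS
    category: str    # two-letter category abbreviation, e.g. "FA"
    main_class: str  # two-digit main class, e.g. "03" (Eicosanoids)
    sub_class: str   # two-digit sub class, e.g. "01" (Prostaglandins)
    unique_id: str   # four-character identifier within the sub class


def parse_lm_id(lm_id: str) -> LMID:
    """Split an LM_ID by position (assumes the 12-character layout)."""
    if len(lm_id) != 12:
        raise ValueError(f"expected 12 characters, got {len(lm_id)}")
    return LMID(lm_id[0:2], lm_id[2:4], lm_id[4:6], lm_id[6:8], lm_id[8:12])


pge2 = parse_lm_id("LMFA03010003")
print(pge2.category, pge2.main_class, pge2.sub_class, pge2.unique_id)
```

Splitting by position rather than by pattern keeps the sketch short; identifiers that do not follow the 12-character layout are rejected instead of being guessed at.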

8 LIPID MAPS: Recommendations for drawing structures. Consistent structure representation across classes: Fatty Acyls (FA), Sterol Lipids (ST), Glycerophospholipids (GP), Sphingolipids (SP), Prenol Lipids (PR), Glycerolipids (GL)

9 Structural comparison of SM and PC

10 Online drawing tools for various lipid categories (FA,GL,GP,SP,ST) Structures may be saved as Molfiles Online lipid structure-drawing tools http://www.lipidmaps.org/tools/index.html

11 LIPID MAPS Lipidomics gateway http://www.lipidmaps.org

12

13

14 Populating the LIPID MAPS structure database. Sources: structures from core labs and partners; new structures identified by LIPID MAPS experiments; websites and publications; public databases; computationally generated structures

15 Search LMSD by browsing classification hierarchy

16 Search LMSD by structure, text, mass, formula, ontology

17 Search LMSD with ontology terms e.g. find all lipids with 20 carbons, 3 double bonds, at least 3 hydroxyl groups and 1 epoxy group
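The ontology query in this example amounts to a filter over per-structure counts. The sketch below is only an illustration of that logic: the `LipidOntology` record, its field names and the four toy records are invented here and do not reflect the LMSD schema or real database entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LipidOntology:
    """Hypothetical per-structure ontology counts (not the LMSD schema)."""
    name: str
    carbons: int
    double_bonds: int
    hydroxyl: int
    epoxy: int


def matches_query(lipid: LipidOntology) -> bool:
    """20 carbons, 3 double bonds, at least 3 hydroxyl groups, 1 epoxy group."""
    return (
        lipid.carbons == 20
        and lipid.double_bonds == 3
        and lipid.hydroxyl >= 3
        and lipid.epoxy == 1
    )


# Toy records, invented purely to exercise the filter.
records = [
    LipidOntology("toy_lipid_1", 20, 3, 3, 1),
    LipidOntology("toy_lipid_2", 20, 4, 3, 1),  # too many double bonds
    LipidOntology("toy_lipid_3", 20, 3, 2, 1),  # too few hydroxyl groups
    LipidOntology("toy_lipid_4", 20, 3, 4, 1),
]

hits = [r.name for r in records if matches_query(r)]
print(hits)
```

Note that "at least" maps to `>=` while the other criteria are exact matches, mirroring the wording of the example query.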

18 LMSD Detail view for a lipid structure: structure; lipid classification; LM_ID; names, synonyms; InChIKey identifier; physicochemical properties; m/z calculation tool; MS/MS spectrum; database cross-references; other structure formats

19

20 Alternative lipid subclasses/functionality Take advantage of built-in ontology feature for all lipid structures in LMSD

21 Use InChIKey to find structures differing only in stereochemistry, double-bond geometry or isotopic labeling
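This search works because the first 14-character block of an InChIKey encodes the molecular skeleton (connectivity), while stereochemistry and isotopic layers are hashed into the second block. A minimal sketch of grouping keys by that first block follows; the function names are invented for this example. The first key is the one quoted for PGE2 elsewhere in this tutorial, the second assumes the same skeleton with no stereo layer (`UHFFFAOYSA` is the standard block for that case), and the third is a placeholder, not a real compound.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def connectivity_block(inchikey: str) -> str:
    """Return the first (14-character) block of an InChIKey."""
    return inchikey.split("-")[0]


def group_by_skeleton(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group InChIKeys that share the same connectivity block."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        groups[connectivity_block(key)].append(key)
    return dict(groups)


keys = [
    "XEYBRNLFEZDVAW-ARSRFYASSA-N",  # key shown for PGE2 in this tutorial
    "XEYBRNLFEZDVAW-UHFFFAOYSA-N",  # assumed: same skeleton, stereo undefined
    "AAAAAAAAAAAAAA-UHFFFAOYSA-N",  # placeholder, not a real compound
]
groups = group_by_skeleton(keys)
print(groups["XEYBRNLFEZDVAW"])
```

Searching on the first block alone is what the "partial InChIKey" search on the next slide relies on.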

22 Use InChIKey (full or partial) to perform a Google structure search LIPID MAPS European Bioinformatics Inst. PubChem

23 Querying Lipidomics Gateway website as well as LIPID MAPS databases via “Quick search”
• Multi-purpose
• Small “footprint”
• High visibility (on home page)
Search the Lipidomics Gateway html pages by keyword, or the databases by lipid class, common name, systematic name or synonym, mass, formula, InChIKey, LIPID MAPS ID, gene or protein term.

24 Quick search query types and examples
Query types: LIPID MAPS LM_ID; lipid standard (name or LMID); gene/protein name/synonym; lipid common/systematic name or synonym; lipid molecular formula; lipid classification term; InChIKey; keywords on Lipidomics Gateway website pages (personnel, publications, news, updates, etc.)
Examples: LMFA03010003; XEYBRNLFEZDVAW-ARSRFYASSA-N; C12H24O2; “Linoleic”, “HETE”, “PAF”, “PGE”, “5Z,8Z,14Z-eicosatrienoic”, “PC(16:0/18:1(9Z))”, “MGDG”; “docosa”, “phytosphingosine”; sterol; FABP; “Choline”, “prostaglandin”, “diterpene”; “Atherosclerosis”, “Dennis”, “homeostasis”

25 Lipid Proteome Database (LMPD)
Species: Genes, Proteins
Human (Homo sapiens): 1116 genes, 2273 proteins
Mouse (Mus musculus): 1082 genes, 1504 proteins
Rat (Rattus norvegicus): 1258 genes, 1315 proteins
Rhesus monkey (Macaca mulatta): 891 genes, 1634 proteins
Yeast (Saccharomyces cerevisiae (s288c)): 720
E. coli (Escherichia coli (K12)): 245
C. elegans (Caenorhabditis elegans): 595 genes, 868 proteins
Drosophila (Drosophila melanogaster): 404 genes, 1064 proteins
Arabidopsis (Arabidopsis thaliana): 1829 genes, 2447 proteins
Zebrafish (Danio rerio): 638 genes, 647 proteins

26 LMPD: Data collection strategy. Entrez Gene ID list (lipid-related keywords in gene names, metabolic pathways and ontology terms; manual curation) -> Python program (NCBI Entrez, UniProt) -> gene, mRNA, protein data, PTM variants, motifs, homologs, cross-references, related proteins, ontologies, annotations, etc. -> LMPD database

27 LMPD organization: Gene -> mRNA -> (apo)protein -> mature protein. Entrez Gene ID (DNA/genomic links); RefSeq mRNA IDs (both coding and UTR variants); RefSeq protein IDs and sequences (unique isoforms); post-translationally modified variants (e.g. apo-, mature forms, leader sequences, etc.)
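The gene -> mRNA -> protein -> mature protein hierarchy can be pictured as nested records. The sketch below is one possible in-memory model, not the actual LMPD schema: all class names, field names and identifiers are placeholders invented for illustration. It also shows why "unique isoforms" matters: transcripts that differ only in their UTRs can encode the same protein sequence.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MatureProtein:
    description: str  # e.g. "mature form" or "leader sequence removed"


@dataclass
class Protein:
    refseq_protein_id: str
    sequence: str
    mature_forms: List[MatureProtein] = field(default_factory=list)


@dataclass
class Transcript:
    refseq_mrna_id: str
    proteins: List[Protein] = field(default_factory=list)


@dataclass
class Gene:
    entrez_gene_id: str
    transcripts: List[Transcript] = field(default_factory=list)


def unique_isoforms(gene: Gene) -> List[Protein]:
    """Keep the first protein seen for each distinct sequence."""
    seen: Dict[str, Protein] = {}
    for transcript in gene.transcripts:
        for protein in transcript.proteins:
            seen.setdefault(protein.sequence, protein)
    return list(seen.values())


# Placeholder identifiers and sequences, for illustration only.
gene = Gene("GENE_ID_PLACEHOLDER", [
    Transcript("MRNA_1", [Protein("PROT_1", "MKTAY")]),
    Transcript("MRNA_2", [Protein("PROT_2", "MKTAY")]),  # UTR variant, same protein
    Transcript("MRNA_3", [Protein("PROT_3", "MKTAYQ")]),
])
print([p.refseq_protein_id for p in unique_isoforms(gene)])
```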

28 LMPD query page

29 LMPD overview page: listing of annotations and isoforms

30 LMPD gene orthologs, alignments, links

31 LMPD UniProt, domain/motif, related protein annotations

32 LMPD gene ontology/pathway annotations

33 LIPID MAPS REST interface

34 Different input contexts: compounds, genes, proteins. Output formats: JSON, text, molfile, image
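A REST request of this kind is typically just a structured URL. The sketch below only builds such a URL and makes no network call. The base URL, the path order (context / input item / input value / output item / output format) and the item names `lm_id` and `all` are assumptions made for illustration; consult the LIPID MAPS REST documentation for the actual syntax before relying on them.

```python
from urllib.parse import quote

# Assumed base URL and path layout; verify against the official documentation.
BASE_URL = "https://www.lipidmaps.org/rest"
CONTEXTS = {"compound", "gene", "protein"}  # the three contexts named on the slide


def build_rest_url(context: str, input_item: str, input_value: str,
                   output_item: str = "all", output_format: str = "json") -> str:
    """Assemble a request URL from its path components."""
    if context not in CONTEXTS:
        raise ValueError(f"unknown context: {context!r}")
    parts = [context, input_item, input_value, output_item, output_format]
    # Percent-encode each component so values such as names with slashes are safe.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


url = build_rest_url("compound", "lm_id", "LMFA03010003")
print(url)
```

Keeping URL construction separate from the HTTP call makes the request easy to inspect and test before any data is fetched.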

35 LIPID MAPS lipidomic pathways

36 Cholesterol Biosynthesis

37 TLR4 signaling pathway

38 Overview of Quantitative Lipid Analysis by Mass Spectrometry as performed by the LIPID MAPS consortium on bone marrow derived macrophages (BMDM)
www.lipidmaps.org. LIPID MAPS funded by Glue Grant from: www.nigms.nih.gov
LIPID MAPS Bioinformatics Core (a), UCSD, 9500 Gilman Dr, La Jolla, CA 92093; Department of Bioengineering (b), UCSD, 9500 Gilman Dr, La Jolla, CA 92093
Sample preparation: extract bone marrow cells; transfer to plates; perform timecourse experiment on plated cells; repeat 3x (replicates); aliquot samples for shipping to core research labs.
Lipid classes analyzed: fatty acids, glycerophospholipids, glycerolipids, sterols, sphingolipids, prenols, cardiolipins, eicosanoids, cholesteryl esters.
Analytical protocols (in the order they appear on the slide):
• EtOAc/isooctane extraction; DFPI derivatization of DAGs; LC/MS analysis (normal phase); ESI-QTRAP [M+NH4]+ detection mode; deuterated standards
• Saponification; methanol/CHCl3 extraction; SPE extraction; combination of GC/MS, LC/MS (reverse phase) on ESI-QTRAP and APCI-MS analysis; deuterated standards
• EtOAc/isooctane extraction; LC/MS analysis (normal phase); ESI-QTRAP [M+NH4]+/neutral loss detection mode; deuterated standards
• Methanol/CHCl3 extraction; LC/MS analysis (normal phase); ESI-QSTAR-XL using MS/MS methods; odd-chain standards
• Methanolic HCl/CHCl3 extraction; LC/MS analysis (normal phase); ESI-QTRAP 2-stage quantitation; odd-chain standards
• Separate media from cells; SPE extraction of media; LC/MS analysis (reverse phase); ESI-QTRAP via MRM methods; deuterated standards
• Methanolic HCl/isooctane extraction; GC/MS analysis; deuterated standards
• Methanol/CHCl3 + methanolic KOH extraction; combination of LC-C18, LC-Si and LC-NH2 separation; ESI-QTRAP and API-3000 Triple Quad detection with MRM methods; C12 analog standards
• Methanol/CHCl3 extraction; LC/MS analysis (reverse phase); QSTAR-XL via MRM methods; nor-dolichol/CoQ6 standards
BIOINFORMATICS: data consolidation, normalization, statistical analysis and databasing; presentation in tabular and graphical formats.
For details of extraction, purification and quantitation by MS, see:
• Lipidomics reveals a remarkable diversity of lipids in human plasma. Quehenberger O et al., J Lipid Res 51, 3299-3305 (2010).
• A mouse macrophage lipidome. Dennis EA et al., J Biol Chem 285, 39976-39985 (2010).
• Methods Enzymol. (Brown AH, ed.) 2007; Vol. 432 (multiple chapters).

39 Data presentation formats: tabular; graphical; heatmap; integrated pathway/heatmap (lipids, genes)

40 Dennis et al. (2010) J. Biol. Chem. 285, 39976-85

41 E. Fahy 2010 ©

42 Online drawing tools for various lipid categories (FA,GL,GP,SP,ST) Structures viewable in Marvin, JMol and Chemdraw format. May be saved as Molfiles Online lipid structure-drawing tools http://www.lipidmaps.org/tools/index.html E. Fahy 2010 ©

43 Online generation of glycan structures in full chair conformation. Sugars: Glc, Gal, GlcNAc, GalNAc, Xyl, Fuc, Man, NeuAc, NeuGc, KDN. Anomeric carbon: α or β linkages may be specified. http://www.lipidmaps.org/tools/index.html E. Fahy 2010 ©

44 Mass spectrometry prediction tools
• Using virtual databases of structures based on commonly occurring core structures and chains
• Using known lipids in the LIPID MAPS structure database (LMSD)

45 Creation of a virtual lipid database Choice of range of acyl/alkyl chains These are used to create “bulk” species e.g. PC(38:4), PE(O-36:0), Cer(d32:1), HexCer(d40:2), TG(54:2), DG(32:0), FA(20:3(OH)), CE(18:1) Conservative approach: stereochemistry, sn (glycerol) position, double bond/functional group regiochemistry, double bond geometry not defined. Links to: On-demand expansion of all possible chain combinations (within defined limits) Links to: Matches of bulk species to discrete structures in LMSD database (examples)

46 Enumeration of “bulk” lipid species from selected lists of acyl/alkyl chains Glycerolipids Phospholipids Sphingolipids Fatty acids Chol. esters Acyl CoA’s Acyl carnitines Cardiolipins Suite of combinatorial expansion tools Database of lipid “bulk” species, exact masses, formulae, annotations Wax esters
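The enumeration step can be sketched as a combinatorial sum over a chosen chain list: every combination of chains collapses to a "bulk" species defined only by total carbons and total double bonds. This is a minimal illustration, assuming simple acyl chains given as (carbons, double bonds) pairs; the function name and the three-chain list are invented for the example and do not reproduce the LIPID MAPS tools.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence, Tuple

Chain = Tuple[int, int]  # (carbons, double bonds)


def enumerate_bulk_species(head_group: str, chains: Sequence[Chain],
                           n_chains: int = 2) -> List[str]:
    """List the distinct bulk species reachable from the given chains."""
    totals = set()
    # Order of chains does not matter at the bulk level, so use combinations.
    for combo in combinations_with_replacement(chains, n_chains):
        total_c = sum(c for c, _ in combo)
        total_db = sum(d for _, d in combo)
        totals.add((total_c, total_db))
    return [f"{head_group}({c}:{d})" for c, d in sorted(totals)]


# Example chain list, chosen only for illustration.
species = enumerate_bulk_species("PC", [(16, 0), (18, 1), (20, 4)])
print(species)
```

Because different chain pairs can sum to the same totals, collecting the totals in a set is what keeps each bulk species to a single entry.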

47 Virtual database of bulk lipids: number of entries per class
Monoradylglycerols: 84
Diradylglycerols: 615
Triradylglycerols: 1844
Digalactosyl DG's: 553
Monogalactosyl DG's: 553
Sulfoquinovosyl DG's: 553
PA: 696
PC: 696
PE: 696
PG: 696
PI: 696
PIP: 696
PS: 696
Cardiolipins: 375
Fatty acids: 13590
Acyl carnitines: 78
Chol. esters: 78
Acyl CoA's: 78
Wax esters: 403
Ceramides: 258
Ceramide phosphates: 258
PE-Ceramides: 230
PI-Ceramides: 230
Mannosyl-di-IP-ceramides: 258
Mannosyl-IP-ceramides: 258
Hexosyl ceramides: 258
Lactosyl ceramides: 258
Sphingomyelins: 258
Sulfatides: 258

48 Precursor ion search interface to virtual database Input: Either copy/paste a list of precursor ions or upload a peaklist file Input parameters: Mass tolerance, ion type, all chains or even chains, sort results Optionally restrict search to one or multiple lipid species
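The core of a precursor ion search is a mass comparison within a tolerance. The sketch below computes a monoisotopic mass from a molecular formula, converts it to an m/z for a chosen ion type, and reports candidates within the tolerance. It is an illustration of the idea only: the function names, the two-entry candidate table and the two supported ion types are assumptions made here, not the search tool's implementation.

```python
import re
from typing import Dict, List

# Monoisotopic masses of the most abundant isotopes (in u).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
                "O": 15.99491461956, "P": 30.97376163}
PROTON = 1.007276466
ION_SHIFT = {"[M+H]+": PROTON, "[M-H]-": -PROTON}  # only two ion types sketched


def exact_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as C12H24O2."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def search(peaks: List[float], candidates: Dict[str, str],
           ion_type: str, tolerance: float) -> Dict[float, List[str]]:
    """Return, for each input m/z, the candidates within +/- tolerance."""
    shift = ION_SHIFT[ion_type]
    return {
        peak: [name for name, formula in candidates.items()
               if abs(exact_mass(formula) + shift - peak) <= tolerance]
        for peak in peaks
    }


# Small illustrative table: bulk abbreviation -> molecular formula.
candidates = {"FA(12:0)": "C12H24O2", "PC(34:1)": "C42H82NO8P"}
result = search([199.1704, 500.0], candidates, "[M-H]-", tolerance=0.01)
print(result)
```

Each input ion gets its own result list, which corresponds to the per-ion sub-tables described for the results page.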

49 Results page for precursor ion search Output: view in online format (below) or as tab-delimited text file Output features: Sub-table for each input ion. Links: On-demand expansion of all possible chain combinations (abbreviation) Links: Matches of bulk species to discrete structures in LMSD database (examples)

50 Expansion of species level to display all possible chain combinations within defined chain and chain/double-bond ratio limits
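The expansion runs in the opposite direction from enumeration: given a bulk species, list every chain combination from an allowed set that adds up to it. A minimal sketch follows, assuming two simple acyl chains and an explicit list of allowed chains standing in for the "defined limits"; the function name and the underscore-joined output notation are choices made for this example.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence, Tuple

Chain = Tuple[int, int]  # (carbons, double bonds)


def expand_bulk(total_carbons: int, total_double_bonds: int,
                chains: Sequence[Chain], n_chains: int = 2) -> List[str]:
    """List chain combinations whose totals match the bulk species."""
    results = []
    for combo in combinations_with_replacement(sorted(chains), n_chains):
        if (sum(c for c, _ in combo) == total_carbons
                and sum(d for _, d in combo) == total_double_bonds):
            # Underscore separator: sn position is left undefined.
            results.append("_".join(f"{c}:{d}" for c, d in combo))
    return results


# Allowed chains, chosen only for illustration.
allowed = [(16, 0), (16, 1), (18, 0), (18, 1), (18, 2)]
combos = expand_bulk(34, 1, allowed)
print(combos)
```

For PC(34:1) and this chain list, two combinations survive; a wider chain list would yield more, which is why the limits matter in practice.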

51 Links to examples of discrete structures in LMSD database with the identical bulk structure *This feature was implemented by computing the “bulk” abbreviation (where possible) for every structure in the LMSD database
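Computing a bulk abbreviation from a discrete structure name means summing carbons and double bonds over the chains and dropping positional detail. The sketch below handles only names of the form `HEAD(chain/chain/...)` with plain acyl chains, returning `None` otherwise, in the spirit of "where possible". The function name and parsing rules are assumptions for illustration, not the method used to annotate LMSD.

```python
import re
from typing import Optional


def bulk_abbreviation(name: str) -> Optional[str]:
    """Collapse e.g. PC(16:0/18:1(9Z)) to PC(34:1); None if not parseable."""
    outer = re.fullmatch(r"(\w+)\((.+)\)", name)
    if outer is None:
        return None
    head, body = outer.groups()
    total_c = total_db = 0
    for chain in body.split("/"):
        # Each chain must start with carbons:double_bonds; any trailing
        # double-bond position or geometry such as (9Z) is ignored.
        m = re.match(r"(\d+):(\d+)", chain)
        if m is None:
            return None  # ether, sphingoid or other chain types not handled
        total_c += int(m.group(1))
        total_db += int(m.group(2))
    return f"{head}({total_c}:{total_db})"


print(bulk_abbreviation("PC(16:0/18:1(9Z))"))
```

Returning `None` for unparseable names, rather than guessing, keeps the link between bulk species and discrete structures limited to cases where the totals are unambiguous.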

52 Educating the public about lipids

53 Educating the public about lipids: LIPID MAPS tutorials http://www.lipidmaps.org

54 LIPID MAPS ®

