Data Exchange & Public Reference Data

Computational Metabolomics, Schloss Dagstuhl, Germany, Nov. 29-Dec. 4, 2015
David Wishart, University of Alberta

Public Metabolomics Repositories

- Currently focused on capturing experimental data from metabolomics experiments
- Well-defined file formats, good capture of meta-data, sustainable funding, aiming for permanence
- Relatively little “reference” data on pure compounds (no pure-compound spectral data): “missing the reference layer”
- Struggling with data exchange and common formats

MS Spectral Databases

- Currently focused on capturing reference data from pure compounds
- Fill the gap left by the public repositories
- Variable file formats, limited capture of meta-data, non-sustainable funding
- Largely managed by single labs or individuals
- Limited exchange between centres except through targeted data harvesting
- Is there an opportunity to link these resources with the public repositories?

What Are We Missing?
- Updated and current data exchange standards (MSI is 10+ years old)
- Easy-to-use formats for entering meta-data
- Methods for keeping meta-data current with changing technologies
- Coordinated efforts in reference data collection and meta-data descriptions
- Databases on metabolite origins (linking metabolites to species or sources)
- Common methods and formats for describing or classifying chemicals/metabolites (an ontology)
- Common formats and methods for exchanging pathway data (putting biology into metabolomics)

Databases on Metabolite Origins – Why?
- All other omics fields track which genes, proteins and transcripts come from which species; unfortunately, we don’t
- Many metabolites are gene-specific
- Many compounds come from specific sources, and source information is especially important for xenobiotics
- Avoids problems of false discovery and mis-identification
- Helps link metabolites to biology, helps with multi-omics integration
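
One way to picture how origin data could reduce mis-identification is a filter that drops candidate identifications whose recorded sources do not fit the sample. The sketch below is only an illustration under stated assumptions: the `OriginRecord` type, the `ORIGINS` table and the source labels are hypothetical and hand-written here, not taken from any of the databases named in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OriginRecord:
    """One metabolite and the sources it has been reported in (illustrative fields)."""
    name: str
    sources: frozenset


# Hypothetical, hand-written origin table (assumption: not real database content).
ORIGINS = {
    "caffeine": OriginRecord("caffeine", frozenset({"food", "drug"})),
    "creatinine": OriginRecord("creatinine", frozenset({"human"})),
    "atrazine": OriginRecord("atrazine", frozenset({"herbicide"})),
}


def plausible_candidates(candidates, allowed_sources):
    """Keep candidates whose recorded sources overlap the sources expected for the sample.

    Candidates with no origin record are dropped as well, since nothing supports them.
    """
    allowed = frozenset(allowed_sources)
    kept = []
    for name in candidates:
        record = ORIGINS.get(name)
        if record is not None and record.sources & allowed:
            kept.append(name)
    return kept
```

For a human sample where dietary compounds are expected, calling `plausible_candidates(["caffeine", "creatinine", "atrazine"], {"human", "food"})` keeps the first two names and drops the third. Whether unknown compounds should be dropped or merely flagged is a design choice; dropping them is simply the stricter option.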

Databases on Metabolite Origins
- http://www.hmdb.ca
- http://www.t3db.ca
- http://www.foodb.ca
- http://www.drugbank.ca

The Human Metabolome Database (HMDB)
- A web-accessible resource containing detailed information on 41,993 “quantified”, “detected” and “expected” metabolites
- Includes many food components & metabolites
- 100’s of drug metabolites
- 1000’s of xenobiotics
- >10,000 reference spectra
- Supports sequence, spectral, structure and text searches as well as compound browsing
- Full data downloads
- http://www.hmdb.ca

The Drug Database (DrugBank v. 4.3)
- 1602 small molecule drugs
- >5000 experimental drugs
- Detailed ADMET, MOA and pharmacokinetic data
- >1000 drugs with metabolizing enzyme data
- >1200 drug metabolites
- >600 MS+NMR spectra
- >4200 unique drug targets
- 208 data fields/drug
- Supports sequence, spectral, structure and text searches as well as compound browsing
- Full data downloads
- http://www.drugbank.ca

The Toxic Exposome Database (T3DB)
- Comprehensive data on toxic compounds (drugs, pesticides, herbicides, endocrine disruptors, solvents, carcinogens, etc.)
- Detailed mechanisms, binding constants, target info, lots of ToxCast data
- >3600 toxic compounds
- >1900 reference spectra
- ~2100 toxic targets
- Supports sequence, spectral, structure and text searches as well as compound browsing
- Full data downloads
- http://www.t3db.ca

The Food Database (FooDB)
- 26,619 compounds, 25,579 structures with 24,843 descriptions
- 171,359 synonyms
- ~700,000 concentration values
- 31,791 references
- 1376 compounds with health effects
- 2692 compounds with flavour data
- Content data on 907 raw or processed foods
- Supports structure & text searches
- >100 data fields/compound
- Full data downloads
- http://www.foodb.ca

Chemical Ontologies – Why?
- Every other field of omics has ontologies to describe genes & proteins; we don’t
- Helps link metabolites and chemistry to biology, helps with multi-omics integration
- Gives metabolomics researchers a common vocabulary
- Need to work on a shared standard

ClassyFire Server
- A webserver (and database) designed to facilitate chemical classification and chemical description via structure alone
- 4800 chemical and chemical class definitions
- Accepts InChI or SMILES strings and generates a classification in <1 s
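
To make the idea of a shared chemical vocabulary concrete, a classification can be treated as a path through a class hierarchy, so that two compounds can be compared at any level. The sketch below is a minimal illustration, not the server's implementation or API: the `PARENT` table is hand-written here, and the class names are examples typed in by hand rather than retrieved from the server.

```python
# Child -> parent links for a tiny, hand-written class hierarchy (illustrative only).
PARENT = {
    "Organic compounds": None,
    "Organoheterocyclic compounds": "Organic compounds",
    "Imidazopyrimidines": "Organoheterocyclic compounds",
    "Purines and purine derivatives": "Imidazopyrimidines",
    "Benzenoids": "Organic compounds",
}


def lineage(term):
    """Return the path from the root class down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


def shares_ancestor(term_a, term_b, depth):
    """True if both terms agree on their first `depth` levels, counted from the root."""
    return lineage(term_a)[:depth] == lineage(term_b)[:depth]
```

With this table, `shares_ancestor("Benzenoids", "Imidazopyrimidines", 1)` holds because both sit under the same root, while the same call with `depth=2` does not. Comparing at a chosen depth is what lets results from different labs be grouped under one common term.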

Pathway Integration – Why?
- Pathways are a unique strength of metabolomics (the field is already beyond network “hairball” diagrams)
- Visualization is important in omics, helps improve integration possibilities
- Helps link metabolites and chemistry to biology, helps with multi-omics integration
- Need to work on a shared standard

PathWhiz
- Webserver designed to permit creation of colourful, biologically accurate pathway diagrams that are machine readable and interactive
- Supports BioPAX, SBML and SBGN conversion as well as SVG and PNG image generation
- Google Maps-style viewer
- http://smpdb.ca/pathwhiz
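
What “machine readable” means for a pathway can be sketched by writing a reaction list out as XML. The example below is only a rough illustration and not PathWhiz output: the function name `pathway_to_sbml_like` is hypothetical, and although the element names follow the `listOfSpecies` / `listOfReactions` layout used by SBML, the namespace and several required attributes are left out, so the result is SBML-style rather than valid SBML.

```python
import xml.etree.ElementTree as ET


def pathway_to_sbml_like(model_id, reactions):
    """Serialize reactions given as (id, [substrates], [products]) into SBML-style XML."""
    root = ET.Element("sbml", {"level": "3", "version": "1"})
    model = ET.SubElement(root, "model", {"id": model_id})
    species_list = ET.SubElement(model, "listOfSpecies")
    reaction_list = ET.SubElement(model, "listOfReactions")
    seen = []  # species already written, in first-seen order
    for rid, substrates, products in reactions:
        for name in list(substrates) + list(products):
            if name not in seen:
                seen.append(name)
                ET.SubElement(species_list, "species", {"id": name})
        reaction = ET.SubElement(reaction_list, "reaction", {"id": rid})
        reactants_el = ET.SubElement(reaction, "listOfReactants")
        products_el = ET.SubElement(reaction, "listOfProducts")
        for name in substrates:
            ET.SubElement(reactants_el, "speciesReference", {"species": name})
        for name in products:
            ET.SubElement(products_el, "speciesReference", {"species": name})
    return ET.tostring(root, encoding="unicode")
```

A single step such as `pathway_to_sbml_like("example", [("R1", ["glucose", "ATP"], ["glucose_6_phosphate", "ADP"])])` yields one `reaction` element and four `species` elements. Once a diagram is backed by a structure like this, other tools can parse it instead of reading a picture.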

The Small Molecule Pathway Database (SMPDB)
- All pathways in SMPDB are generated via PathWhiz
- >700 small molecule pathways linked to HMDB, MetaboAnalyst
- Depicts cell compartments, organelles, protein locations, quaternary (4°) structures
- SMPDB is able to map gene chip & metabolomic data
- Converts gene, protein or chemical lists to pathways or disease diagnoses
- http://www.smpdb.ca
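
Converting a chemical list to pathways can be pictured, in its simplest form, as counting how many listed compounds belong to each pathway. The sketch below makes that assumption explicit: it uses a plain overlap count, which is not a statistical enrichment test and is not claimed to be how SMPDB works, and the `PATHWAY_MEMBERS` table is a hypothetical, hand-written stand-in for real pathway content.

```python
from collections import Counter

# Hypothetical pathway membership table (assumption: hand-written, not SMPDB content).
PATHWAY_MEMBERS = {
    "Glycolysis": {"glucose", "fructose 6-phosphate", "pyruvate"},
    "TCA cycle": {"citrate", "succinate", "fumarate", "malate"},
    "Urea cycle": {"ornithine", "citrulline", "arginine", "urea"},
}


def rank_pathways(compounds):
    """Count how many of the supplied compounds fall in each pathway, highest first."""
    query = set(compounds)
    hits = Counter()
    for pathway, members in PATHWAY_MEMBERS.items():
        overlap = len(query & members)
        if overlap:
            hits[pathway] = overlap
    return hits.most_common()
```

Given `["citrate", "malate", "pyruvate", "caffeine"]`, the function ranks the TCA cycle first with two hits and glycolysis second with one, and the compound with no pathway entry is ignored. A real tool would also need to reconcile compound names or identifiers before matching.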