PubChem BioAssay: Link chemical research to GenBank and beyond


PubChem BioAssay: Link chemical research to GenBank and beyond
Yanli Wang
251st American Chemical Society National Meeting
San Diego, California, March 13th-17th, 2016

PubChem BioAssay …
Public Data Repository at NCBI
Open Research Database
Small Molecule Bioactivity Data
RNAi Screening
Connect to PubChem Substance
Integrated with Other Biomedical Resources
*at PubChem we have several goals
*public depo system
*various sources

PubChem BioAssay … supports multiple research areas
Chemical Biology
Chemogenomics
Medicinal Chemistry
Drug Discovery
Functional Genomics

PubChem BioAssay … data standard

Meta data:
Protocol
Target
Cell line
Comment / Categorized comment
Grant number
Embargo date
Cross reference: publication, taxonomy, related assays, gene, nucleotide etc.

Test results:
Sample ID / SID
Bioactivity outcome / score / potency / dose response
Phenotype annotation
Bioactivity readout
Cross reference
Target
Replicate
Attributes
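The two halves of the data standard (assay-level meta data and per-sample test results) can be pictured as a small record type. The sketch below is only an illustration: the class names, field names, and sample values are hypothetical and do not reflect the actual PubChem deposition schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ResultRow:
    """One tested sample (the "Test results" side of the slide)."""
    sid: int                                   # Sample ID / SID
    outcome: str                               # bioactivity outcome, e.g. "active"
    score: Optional[int] = None                # bioactivity score
    potency_um: Optional[float] = None         # potency in uM, if reported
    readouts: Dict[str, float] = field(default_factory=dict)  # named readouts


@dataclass
class AssayRecord:
    """Assay-level description (the "Meta data" side of the slide)."""
    aid: int
    protocol: str
    targets: List[str] = field(default_factory=list)
    cell_line: Optional[str] = None
    comments: List[str] = field(default_factory=list)
    grant_number: Optional[str] = None
    embargo_date: Optional[str] = None
    # Cross references keyed by kind: publication, taxonomy, gene, ...
    cross_references: Dict[str, List[str]] = field(default_factory=dict)
    results: List[ResultRow] = field(default_factory=list)

    def actives(self) -> List[ResultRow]:
        """Return the rows whose outcome is 'active'."""
        return [row for row in self.results if row.outcome == "active"]


# Hypothetical example values, for illustration only.
example = AssayRecord(
    aid=1,
    protocol="Dose-response screen (placeholder text)",
    targets=["example protein target"],
    cross_references={"taxonomy": ["Homo sapiens"]},
    results=[
        ResultRow(sid=101, outcome="active", score=80, potency_um=0.5),
        ResultRow(sid=102, outcome="inactive", score=0),
    ],
)
```

Keeping the results as a list inside the assay record mirrors the slide's split between one description per assay and many rows of tested samples.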

PubChem BioAssay … data content

Statistics:
1,000,000 records
3,000,000 tested substances
220,000,000 bioactivity outcomes
1,000,000,000 data points
200 chemical probes

Data type:
HTS experiment
Literature curation
Bioactivity
Toxicity
Selectivity
Profiling

Links to many other databases
PubMed: 50,000, research data
OMIM
Protein: 10,000, assay target
BioSystems: a pathway db
Gene: 50,000, assay target
drug annotation
MeSH: literature classification
Nucleotide: assay target (Nucleotide: AID 1637)
Depositor website
GEO
Taxonomy: 3000
Structure: a mirror of Protein Data Bank (PDB)
CDD: conserved protein family domain
*to this end
*hosted by NCBI
*providing additional annotation

Link Research Data to Molecular Target …
BioAssay targets
all test results
specific to test reagent
specific readout

Chemical Probe … 200 more
F2RL3 antagonists, IC50: 0.139 uM (CID: 2333)
mGlu5 positive allosteric potentiator, EC50: 2.411 uM (CID: 1318633)
EGFR inhibitor, IC50: 0.7079 uM (CID: 2303746)
Thyroid Hormone Receptor / Steroid Receptor Coregulator 2 interaction inhibitor, Potency: 1.4 uM (CID: 5184800)
CHRM5 antagonists, IC50: 0.44 uM (CID: 42519285)
STAR inhibitor, IC50: 2.12 uM (CID: 45100448)
mGluR3 modulator, IC50: 2.611 uM (CID: 60210836)
MRGPRX1 allosteric activator, EC50: 0.19 uM (CID: 71598556)
*here we show some examples of aspirin in …
*generic name, brand name
*same/different structure representation
*but user just want one aspirin

Protein BioAssay Target …

Biological Pathways for Protein Target …

BioSystems name                                    KEGG id (conserved pathway)   Count of genes
Neuroactive ligand-receptor interaction            ko04080                       623
Calcium signaling pathway                          ko04020                       329
cAMP signaling pathway                             ko04024                       299
PI3K-Akt signaling pathway                         ko04151                       279
MAPK signaling pathway                             ko04010                       252
Ribosome                                           ko03010                       220
Proteoglycans in cancer                            ko05205                       194
cGMP-PKG signaling pathway                         ko04022                       190
Focal adhesion                                     ko04510                       189
Rap1 signaling pathway                             ko04015                       186
Oxytocin signaling pathway                         ko04921                       181
Retrograde endocannabinoid signaling               ko04723                       168
Inflammatory mediator regulation of TRP channels   ko04750
HTLV-I infection                                   ko05166                       165
Vascular smooth muscle contraction                 ko04270
Chemokine signaling pathway                        ko04062                       163
Alzheimer's disease                                ko05010                       161
Epstein-Barr virus infection                       ko05169                       155
Adrenergic signaling in cardiomyocytes             ko04261
Dopaminergic synapse                               ko04728                       153

Organisms …

Organism                        Assay Count
Rattus norvegicus               391714
Homo sapiens                    260507
Mus musculus                    118398
Staphylococcus aureus           17767
Canis lupus familiaris          16884
Escherichia coli                13513
Cavia porcellus                 12341
Human immunodeficiency virus 1  9075
Pseudomonas aeruginosa          6654
Oryctolagus cuniculus           6574
Candida albicans                6206
Bos taurus                      5277
Plasmodium falciparum           5037
Macaca mulatta                  4412
Streptococcus pneumoniae        3604
Mycobacterium tuberculosis      3562
Macaca fascicularis             3244
Klebsiella pneumoniae           3031
Saccharomyces cerevisiae        2900
Cricetulus griseus              2889

    -- assay count by Kingdom
    -- z walks each taxid up its lineage; rn counts the steps taken
    ;WITH z AS (
        SELECT a.taxid, a.pTaxid AS pTaxid, 0 AS rn
        FROM BaTaxonomyLineage a
        UNION ALL
        SELECT z.taxid, x.pTaxid, z.rn+1
        FROM z
        INNER JOIN BaTaxonomyLineage x ON z.pTaxid=x.taxid
        WHERE x.pTaxid>0
    ),
    -- z2 ranks the ancestors of each taxid, highest ancestor first
    z2 AS (
        SELECT z.taxid, z.pTaxid,
               ROW_NUMBER() OVER (PARTITION BY z.taxid ORDER BY z.rn DESC) AS rn
        FROM z
    ),
    -- z3 keeps only the highest ancestor and looks up its name
    z3 AS (
        SELECT z2.taxid, y.sciName
        FROM z2
        INNER JOIN BaTaxonomyLineage y ON y.taxid=z2.pTaxid
        WHERE z2.rn=1
    )
    SELECT z3.sciName, COUNT(DISTINCT a.aid) AS cnt
    FROM z3
    INNER JOIN BaXrefTaxon a ON a.taxid=z3.taxid
    GROUP BY z3.sciName;

    -- top 20 organisms by assay count
    -- (the transcript references z without defining it for this query;
    --  wrapping the TOP 20 query as a CTE named z is an assumption)
    ;WITH z AS (
        SELECT TOP 20 a.taxid, COUNT(DISTINCT a.aid) cnt
        FROM BaXrefTaxon a
        GROUP BY a.taxid
        ORDER BY cnt DESC
    )
    SELECT b.sciName, z.cnt
    FROM z
    INNER JOIN BaTaxonomyLineage b ON b.taxid=z.taxid
    ORDER BY z.cnt DESC;
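The lineage walk in the "assay count by Kingdom" query can be tried on a toy dataset. The sketch below is an assumption-laden illustration: it reuses the table and column names from the slide (BaTaxonomyLineage, BaXrefTaxon, taxid, pTaxid, sciName, aid), but the column types, the rows, and the IDs are made up, and the SQL is adapted to SQLite (WITH RECURSIVE, and a MAX(depth) filter in place of ROW_NUMBER()) rather than the SQL Server dialect used in the slide.

```python
import sqlite3


def build_toy_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical lineage and assay rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE BaTaxonomyLineage (taxid INTEGER, pTaxid INTEGER, sciName TEXT);
        CREATE TABLE BaXrefTaxon (aid INTEGER, taxid INTEGER);
        """
    )
    # Toy IDs, not real NCBI Taxonomy IDs; pTaxid = 0 marks a top-level node.
    conn.executemany(
        "INSERT INTO BaTaxonomyLineage VALUES (?, ?, ?)",
        [
            (1, 0, "Eukaryota"),
            (2, 0, "Bacteria"),
            (10, 1, "Mammalia"),
            (11, 10, "Homo sapiens"),
            (12, 10, "Rattus norvegicus"),
            (20, 2, "Escherichia coli"),
        ],
    )
    conn.executemany(
        "INSERT INTO BaXrefTaxon VALUES (?, ?)",
        [(100, 11), (101, 11), (101, 12), (102, 20)],
    )
    return conn


def assay_counts_by_top_ancestor(conn: sqlite3.Connection):
    """Count distinct assays per highest named ancestor of each organism."""
    sql = """
    WITH RECURSIVE walk(taxid, ancestor, depth) AS (
        SELECT taxid, pTaxid, 0 FROM BaTaxonomyLineage
        UNION ALL
        SELECT w.taxid, x.pTaxid, w.depth + 1
        FROM walk AS w
        JOIN BaTaxonomyLineage AS x ON x.taxid = w.ancestor
        WHERE x.pTaxid > 0
    ),
    highest AS (
        -- keep the last step of each walk, i.e. the highest ancestor reached
        SELECT w.taxid, w.ancestor
        FROM walk AS w
        WHERE w.depth = (SELECT MAX(depth) FROM walk WHERE taxid = w.taxid)
    )
    SELECT y.sciName, COUNT(DISTINCT a.aid) AS cnt
    FROM highest AS h
    JOIN BaTaxonomyLineage AS y ON y.taxid = h.ancestor
    JOIN BaXrefTaxon AS a ON a.taxid = h.taxid
    GROUP BY y.sciName
    ORDER BY cnt DESC, y.sciName
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for name, count in assay_counts_by_top_ancestor(build_toy_db()):
        print(name, count)
```

In the toy data, assay 101 is linked to two mammals but is counted once under their shared ancestor, which is the reason the query uses COUNT(DISTINCT aid).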

Gene Target and its relevance to disease …

BioAssay Descriptions & Data …
https://pubchem.ncbi.nlm.nih

An RNAi BioAssay Record …
http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=720703
Gene target
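A record such as AID 720703 can also be requested programmatically. The slides only show the web page; the sketch below assumes PubChem's PUG REST service, which the talk does not mention, and uses its assay "description" and "summary" operations. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Base URL of PubChem PUG REST (an assumption; not named in the slides).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_url(aid: int, operation: str = "description", fmt: str = "JSON") -> str:
    """Build a PUG REST URL for one assay record, identified by its AID."""
    return f"{PUG_REST}/assay/aid/{int(aid)}/{operation}/{fmt}"


def fetch_assay_description(aid: int) -> dict:
    """Download and parse the assay description (requires network access)."""
    with urlopen(assay_url(aid), timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URLs; call fetch_assay_description(720703) to download.
    print(assay_url(720703))
    print(assay_url(720703, operation="summary"))
```

Separating URL construction from the download keeps the example usable offline and makes it easy to check the request before sending it.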

Kinase selectivity profiling assay…

BioAssay Search …
- classification tool for research data
https://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi

BioAssay Target Search …

Link BioAssay data to Entrez Gene …
Verify gene functions with RNAi data
Identify drugs & chemical modulators
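One way to follow the BioAssay-to-Gene links outside the web interface is the Entrez ELink utility. This is a sketch under assumptions: the slides do not name E-utilities, the helper name is hypothetical, and the code only builds the request URL, using "pcassay" as the Entrez database name for PubChem BioAssay.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities (an assumption; not named in the slides).
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(aid: int, target_db: str = "gene") -> str:
    """Build an ELink URL from one BioAssay record to a linked Entrez database."""
    params = {"dbfrom": "pcassay", "db": target_db, "id": str(int(aid))}
    return f"{EUTILS}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(elink_url(720703))             # links to Entrez Gene
    print(elink_url(720703, "pubmed"))   # links to PubMed
```

Changing target_db points the same request at another linked resource, which matches the cross-database linking the talk describes.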

Summary
Repository of chemistry & functional genomics research data
Cross link chemical biology data to genomic resources
providing access to chemical tools
Identify gene functions
Predict target and off-targets
Evaluate selectivity, promiscuity, toxicity
Construct drug target network
Drug repositioning

PubChem … Open & Public Resource
http://pubchem.ncbi.nlm.nih.gov
Send questions to:
pubchem-help@ncbi.nlm.nih.gov
pubchem-deposit-help@ncbi.nlm.nih.gov
ywang@ncbi.nlm.nih.gov

Acknowledgement
Steve Bryant
Ben Shoemaker
Paul Thiessen
Jiyao Wang
Evan Bolton
Jie Chen
Tiejun Cheng
Gang Fu
Volker Haehnke
Lewis Geer
Renata Geer
Asta Gindulyte
Lianyi Han
Jane He
Siqian He
Sunghwan Kim
Bo Yu
Jian Zhang