Open PHACTS Easy API Community Workshop, June 25, 2014 Christine Chichester Swiss Institute of Bioinformatics.


Concepts Data Sources Use Cases API Approach for analysis

Use Cases
- “Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
- “What is the interaction profile of known p38 inhibitors?”
- “Let me compare MW, logP and PSA for known oxidoreductase inhibitors”


Concepts Chemical compounds Biological targets Pathways Diseases


Data Sources: ChEMBL, ChEMBL Target Class, DrugBank, Gene Ontology, WikiPathways, UniProt, ChemSpider, ConceptWiki, ChEBI, DisGeNET, neXtProt, ENZYME, FDA adverse events, ClinicalTrials.gov


Open PHACTS API Basics
- Registering for API keys
- API Overview
- Entry Points: URLs
- API Results
- Getting Started with Compounds
- Going Further

Get my API keys!

API Overview: Documentation

The Linked Data API: a simple RESTful API
- Keeps the advantages of the linked data approach while providing a more familiar API
- Lowers the barrier to data access
- Based on community standards: REST, with results as JSON, XML or TSV
- Not just data access, but complex queries allowing filtering, pagination, export, etc.
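
Because every call is a plain GET request, a request URL can be assembled with nothing more than the standard library. The sketch below is a minimal illustration, not the official client: the base URL and the `_format` parameter name are assumptions to be checked against the API documentation, and `build_url` is a hypothetical helper. Only `app_id`, `app_key` and `uri` are named in the slides.

```python
from urllib.parse import urlencode

# Assumed base URL for the 1.3 API; verify against the API documentation.
BASE_URL = "https://beta.openphacts.org/1.3"


def build_url(path, app_id, app_key, fmt="json", **params):
    """Return a GET URL for one API call (hypothetical helper).

    Every call carries the two API keys; `_format` (assumed name)
    selects the result format, e.g. json, xml or tsv.
    """
    query = {"app_id": app_id, "app_key": app_key, "_format": fmt}
    query.update(params)
    return f"{BASE_URL}/{path.lstrip('/')}?{urlencode(query)}"


# Placeholder keys and a placeholder concept URL, for illustration only.
url = build_url("compound", "my_id", "my_key", fmt="tsv",
                uri="http://example.org/compound/1")
print(url)
```

The concept URL is passed as an ordinary query parameter, so it has to be percent-encoded; `urlencode` takes care of that.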

Response template legend for API calls results

1.3 API calls: Concept types

Multiple result formats

Many filtering options per call

Calls that provide filtering parameters

API Entry Points: URLs

What’s needed to get started? The API is URL centric
a810-7dfd6eb05168
Why?
- Ensures precise identification of the concept
- Allows for dereferenceability
- Supports many URLs from different domains
Next: Getting a URL

Finding an initial URL

Compound example: Naphthalene
- Textual name: Naphthalene
- SMILES: C1=CC=C2C=CC=CC2=C1
- InChI: InChI=1S/C10H8/c (10)5-1/h1-8H

"_about": "

Either a ChemSpider or a ConceptWiki URL (or others) can be used as the entry into other API calls "_about": "

API Results: Example with Compound APIs

Compound APIs: Results by Dataset (“inDataset”)
- ChEMBL
- DrugBank
- OPS: Open PHACTS Chemical Registry

Compound Pharmacology API: Retrieve a target URL Target URL

Target Information API: Using target URL from previous call Target URL

Target Information API: results (continued)

Compound -> Target in 3 URLs
1. Free text to retrieve a compound URL
2. Pharmacology for the compound: results include target URLs
3. More information about a target
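
The three steps can be sketched as a chain in which each call feeds its result URL into the next. This is only an illustration under stated assumptions: `fetch` is a caller-supplied function that performs the request and returns parsed JSON, and the call paths and result keys used here are hypothetical stand-ins, not the documented response layout. A canned `fake_fetch` replaces the network so the example runs offline.

```python
def compound_to_target(name, fetch):
    """Chain three calls: free text -> compound URL -> target URL -> target info.

    `fetch(path, **params)` is supplied by the caller; the paths and the
    keys read from its results are illustrative assumptions.
    """
    # 1. Free text to retrieve a compound URL.
    compound_url = fetch("search/freetext", q=name)["_about"]
    # 2. Pharmacology for the compound: the result includes a target URL.
    pharmacology = fetch("compound/pharmacology", uri=compound_url)
    target_url = pharmacology["target"]["_about"]
    # 3. More information about the target.
    return fetch("target", uri=target_url)


calls = []


def fake_fetch(path, **params):
    """Stand-in for real HTTP requests, returning canned data."""
    calls.append((path, params))
    canned = {
        "search/freetext": {"_about": "http://example.org/compound/naphthalene"},
        "compound/pharmacology": {"target": {"_about": "http://example.org/target/1"}},
        "target": {"_about": "http://example.org/target/1",
                   "prefLabel": "Example target"},
    }
    return canned[path]


info = compound_to_target("Naphthalene", fake_fetch)
print(info["prefLabel"], len(calls))
```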

Thank you

Solving more use cases

1.) Provide all activities for a given compound X, with targets annotated by gene (Compound -> Target)
Needed API calls:
- Compound Pharmacology Paginated
- Map URL

Compound Pharmacology Paginated: parameters
- uri: needs a compound URI. Possible sources:
  - Map free text to a concept URL methods
  - SMILES, InChI or InChIKey to URL methods
  - Chemical Structure search methods
  - known identifier from other sources: e.g.
- app_id and app_key
- target_type: single_protein (if wanted)
- _pageSize (default 10; use with a loop on _page, or set to all)
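
The loop on `_page` can be sketched as a generator that keeps requesting pages until one comes back short. This is a minimal sketch, assuming a caller-supplied `fetch_page` function that accepts `_page` and `_pageSize` and returns the list of items on that page; the stopping rule (a page with fewer items than `_pageSize`) is a design choice of this example, and `fake_fetch_page` is a stand-in for the real request.

```python
def iter_pages(fetch_page, page_size=10):
    """Yield items page by page until a short or empty page is returned."""
    page = 1
    while True:
        items = fetch_page(_page=page, _pageSize=page_size)
        yield from items
        if len(items) < page_size:
            break  # last page reached
        page += 1


# Stand-in for the real call: 25 records served in slices.
records = list(range(25))


def fake_fetch_page(_page, _pageSize):
    start = (_page - 1) * _pageSize
    return records[start:start + _pageSize]


all_items = list(iter_pages(fake_fetch_page, page_size=10))
print(len(all_items))
```

If the total happens to be an exact multiple of the page size, the loop makes one extra request that returns an empty page and then stops.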

Compound Pharmacology - results

Retrieving a protein/gene ID
- Use the Map URL API call
- Uri: any input URI, e.g. (restrict to the wanted targetUriPattern)
- Example results: – –
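
A small sketch of how the Map URL parameters might be assembled, and how a list of mapped URLs could also be narrowed on the client side. The parameter names `Uri` and `targetUriPattern` are taken from the slide; both helper functions and all `example.org` URLs are hypothetical.

```python
def map_url_params(uri, target_uri_pattern=None):
    """Build the query parameters for a Map URL call (hypothetical helper)."""
    params = {"Uri": uri}
    if target_uri_pattern:
        # Restrict the mappings to one namespace, e.g. a gene database.
        params["targetUriPattern"] = target_uri_pattern
    return params


def keep_matching(mapped_urls, prefix):
    """Client-side alternative: keep only URLs that start with `prefix`."""
    return [u for u in mapped_urls if u.startswith(prefix)]


params = map_url_params("http://example.org/target/1",
                        target_uri_pattern="http://example.org/gene/")
mapped = ["http://example.org/gene/42", "http://example.org/protein/P1"]
genes = keep_matching(mapped, "http://example.org/gene/")
print(params, genes)
```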

2.) Provide all compounds assayed for target Y, with the target indicated by a gene (Target -> Compound)
Needed API call:
- Target Pharmacology Paginated
Input parameter:
- uri: protein or gene URI, e.g.

3.) For target X, provide the target family (Gene -> Gene family)
Needed API call:
- Target Classifications
Parameters for classification:
- tree: chembl, enzyme or go

Retrieving Pharmacology for all proteins with a given classification
- Target Class Pharmacology Paginated
- uri: Classification URI, e.g.

4.) Filter results by activity cut-offs
Available filters:
- Activity type / cutoff value / units combinations
- pChembl cutoff values
- Activity relation filters (>, >=, =, <, <=)
- Organism filters (target and assay organism)
- Target type (e.g. single_protein, protein_family)

Quantitative Data Challenges
STANDARD_TYPE   UNIT_COUNT
AC50            7
Activity        421
EC50            39
IC50            46
ID50            42
Ki              23
Log IC50        4
Log Ki          7
Potency         11
log IC50        0
STANDARD_TYPE   STANDARD_UNITS    COUNT(*)
IC50            nM
IC50            ug.mL
IC50            ug/ml             2038
IC50            ug ml
IC50            mg kg
IC50            molar ratio       178
IC50            ug                117
IC50            %                 113
IC50            uM well-1         52
IC50            p.p.m.            51
IC50            ppm               36
IC50            uM-1              25
IC50            nM kg-1           25
IC50            milliequivalent   22
IC50            kJ m-2            20
~100 units, >5000 types
Implemented using the Quantities, Dimension, Units, Types Ontology

Activity type / cutoff value / units combinations
Possible values:
- Activity types: e.g. Potency, GI50, IC50, Ki, …
- Cutoff values: a number passed with the parameter for the wanted relation:
  =    activity_value
  >=   min-activity_value
  >    minEx-activity_value
  <=   max-activity_value
  <    maxEx-activity_value
- Units for activity type: e.g. nanomolar, microgram_per_milliliter
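
The relation-to-parameter mapping above lends itself to a small lookup table. The sketch below builds the filter parameters for the first use case (potency below 1 μM, expressed as 1000 nanomolar). The five value parameter names come from the slide; `activity_type` and `activity_unit` are assumed names for the type and unit parameters, and `activity_filter` is a hypothetical helper.

```python
# Relation -> value parameter, as listed on the slide.
ACTIVITY_VALUE_PARAMS = {
    "=": "activity_value",
    ">=": "min-activity_value",
    ">": "minEx-activity_value",
    "<=": "max-activity_value",
    "<": "maxEx-activity_value",
}


def activity_filter(activity_type, relation, value, unit):
    """Return query parameters for one activity cut-off (hypothetical helper).

    `activity_type` and `activity_unit` are assumed parameter names.
    """
    return {
        "activity_type": activity_type,
        ACTIVITY_VALUE_PARAMS[relation]: value,
        "activity_unit": unit,
    }


# Potency < 1 uM, written in nanomolar.
potency_filter = activity_filter("Potency", "<", 1000, "nanomolar")
print(potency_filter)
```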

pChembl cutoff values
Definition: -Log(molar IC50, XC50, EC50, AC50, Ki, Kd or Potency)
- e.g. IC50 = 10 µM -> pChembl = 5
Filters:
  =    pChembl
  >=   min-pChembl
  >    minEx-pChembl
  <=   max-pChembl
  <    maxEx-pChembl
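
The definition can be checked with a few lines of arithmetic: convert the measured value to molar, then take the negative base-10 logarithm. This is a minimal sketch; the `pchembl` function and its unit table are illustrative names, not part of the API.

```python
import math

# Conversion factors to molar for a few common concentration units.
TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pchembl(value, unit="nM"):
    """Return -log10 of a concentration given in `unit` (illustrative helper)."""
    molar = value * TO_MOLAR[unit]
    return -math.log10(molar)


# The slide's example: IC50 = 10 uM gives a pChembl of about 5.
example = pchembl(10, "uM")
print(round(example, 3))
```

Higher pChembl values mean more potent compounds, so a cut-off such as "at least 6" corresponds to values of 1 µM or lower.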

5.) Substructure and Similarity search in Open PHACTS
- Uses ChemSpider search tools (Bingo from GGA)
- Chemical Structure Exact/Similarity/Substructure Search

Chemical Structure Similarity Search: selected parameters
- searchOptions.Molecule: a SMILES string
- searchOptions.SimilarityType: 0: Tanimoto; 1: Tversky; 2: Euclidean
- searchOptions.Threshold: value between 0 and 1
- resultOptions.Count: number of results
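
These parameters can be collected in a small helper that also guards the threshold range. The parameter names and the numeric similarity codes come from the slide; the function itself and its default values are assumptions made for this sketch.

```python
# Numeric codes for searchOptions.SimilarityType, as listed on the slide.
SIMILARITY_TYPES = {"tanimoto": 0, "tversky": 1, "euclidean": 2}


def similarity_params(smiles, similarity="tanimoto", threshold=0.9, count=None):
    """Build query parameters for a similarity search (hypothetical helper)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    params = {
        "searchOptions.Molecule": smiles,
        "searchOptions.SimilarityType": SIMILARITY_TYPES[similarity],
        "searchOptions.Threshold": threshold,
    }
    if count is not None:
        params["resultOptions.Count"] = count
    return params


# Naphthalene, using the SMILES string from the compound example.
sim = similarity_params("C1=CC=C2C=CC=CC2=C1", threshold=0.8, count=20)
print(sim)
```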

6.a) For a target Y, find pathway Z
Needs API call:
- Pathways by Target
Find an identifier for the target – 4c f http:// 4c f
Additional: Map URL, possibly with the restriction or to retrieve complementary identifiers
Additional: Count Pathways by Target

"_about": " "identifier": " "title": "DNA damage response (only ATM dependent)", "description": "This is the second pathway out of two pathways which deals with DNA damage response", "hasPart": { "_about": " "type": " "exactMatch": { "_about": " f ", "prefLabel": "E3 ubiquitin-protein ligase Mdm2 (Homo sapiens)" } }, "inDataset": " "pathway_organism": { "_about": " "label": "Homo sapiens" }, Pathways by Target

"_about": " f ", "exactMatch": " " Map URL with restriction on namespace

6.b) For a pathway Z, find target Y
Needs API call:
- Get Targets for Pathway
Textual names for pathways can be searched (Map free text to concept URL), or use the WikiPathways URI directly: b357-3eed6f837ca8http:// b357-3eed6f837ca8
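
The pathway results in this walkthrough carry `hasPart` either as a single object (Pathways by Target) or as a list of URLs (Get Targets for Pathway), so client code has to cope with both shapes. The sketch below assumes the result has already been parsed into a Python dict; `targets_in_pathway` is a hypothetical helper and the `example.org` URLs are placeholders.

```python
def targets_in_pathway(result):
    """Return the gene/protein URLs listed under `hasPart`.

    Accepts a single URL string, a single object with `_about`,
    or a list mixing both, and always returns a list of URLs.
    """
    parts = result.get("hasPart", [])
    if isinstance(parts, (str, dict)):
        parts = [parts]  # normalise a single value to a list
    return [p["_about"] if isinstance(p, dict) else p for p in parts]


as_list = {"title": "DNA damage response (only ATM dependent)",
           "hasPart": ["http://example.org/gene/1", "http://example.org/gene/2"]}
as_object = {"hasPart": {"_about": "http://example.org/gene/1"}}

print(targets_in_pathway(as_list))
print(targets_in_pathway(as_object))
```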

"_about": " "title_en": "DNA damage response (only ATM dependent)", "title": "DNA damage response (only ATM dependent)", "hasPart": [ " " " List of genes/proteins in pathway Get Targets for Pathway