1
Open PHACTS 1.3 Release (2 701 602 484 triples)
Blue font indicates new data and new data sources available in 1.3
2
1.3 Release Summary
Integration of WikiPathways to provide a series of pathway-based API calls
A refresh of ChEMBL to ChEMBL_16
Extension of, and better support for, units and filtering of pharmacology data
Support for pChEMBL filtering
Ability to use hierarchy data in queries (GO, ChEBI, ChEMBL target, ENZYME)
All-new chemistry processing using the Open PHACTS Chemistry registry, and new chemistry "lenses" for flexible mapping
Further development of KNIME and Pipeline Pilot nodes
Further development of our support portal at support.openphacts.org
3
Open PHACTS 1.3 Data Content
4
Open PHACTS 1.3 Supported Identifiers
We now support an increased range of public database identifiers for proteins and compounds.
Please see: openphacts.cs.man.ac.uk:9093/QueryExpander/mappingSet?lensUri=All
Home page: openphacts.cs.man.ac.uk
Supported identifiers include:
HGNC Symbols
GeneOntology
ChEMBL ID
Ensembl
EntrezGene
and many more.
5
Details on new methods & data
6
Compound API Methods
7
Compound Info Data
SMILES, InChI, InChIKey
logP, hydrogen bond acceptor or donor, Rule of 5 violations, polar surface area
Rotatable bonds, molecular weight, molecular formula, freebase molecular weight
Biotransformation, description, protein binding, toxicity, melting point, drug name, drug type (approved, experimental, etc.)
Compound name
8
Compound Info Data Sources
OPS chemical registration system (OCRS)
ChEMBL (ChEMBL_16) molecule dataset triples
DrugBank
ConceptWiki (Sept 9, 2013 ChemSpider)
9
Compound Class Data Sources
ChEBI: Count, List, Compound Classifications
GeneOntology: Compound Classifications, Target Classification
ChEMBL Targets: Target Classification ( triples)
ENZYME: Target Classification
10
Compound Pharmacology Data
Compound data: Activity (type, value, units, comment), pChEMBL, Assay (type, organism, description), URIs (OCRS, ChEMBL, ConceptWiki, DrugBank), Drug type, Generic name, SMILES, InChI, InChIKey, molecular weight, Rule of 5, Literature (DOI, PMID)
Target data: Name, organism, target component, target type
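As a rough illustration of how a pChEMBL value could be used for filtering, the sketch below assumes the usual definition of pChEMBL as the negative base-10 logarithm of a molar activity value. The record layout, the compound names, and the helper names are hypothetical and are not part of the Open PHACTS API.

```python
import math


def pchembl(value_nm: float) -> float:
    """Return -log10 of an activity value given in nM, after converting to mol/L."""
    if value_nm <= 0:
        raise ValueError("activity value must be positive")
    return -math.log10(value_nm * 1e-9)


def filter_by_pchembl(records, minimum):
    """Keep records whose pChEMBL value is present and at least `minimum`."""
    return [
        r for r in records
        if r.get("pchembl") is not None and r["pchembl"] >= minimum
    ]


# Hypothetical activity records; field names are assumptions for illustration.
records = [
    {"compound": "cpd_a", "activity_type": "IC50", "value_nm": 10.0},
    {"compound": "cpd_b", "activity_type": "IC50", "value_nm": 2500.0},
    {"compound": "cpd_c", "activity_type": "Inhibition", "value_nm": None},
]

# Attach a pChEMBL value only where a concentration is available.
for r in records:
    r["pchembl"] = pchembl(r["value_nm"]) if r["value_nm"] is not None else None

potent = filter_by_pchembl(records, minimum=6.0)
print([r["compound"] for r in potent])
```

A single threshold on the log scale makes activities measured in different concentration units comparable once they are normalised to molar values.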
11
Compound Pharmacology Data Sources
OPS chemical registration system (OCRS)
ChEMBL: activity dataset triples, assay dataset triples
DrugBank ( triples)
12
Compound Class Pharmacology
Activity values for compounds from a given ChEBI compound class
13
Chemical Structure Mapping Methods
14
Chemical Structure Conversion and Search Data
Conversion of InChI, InChIKey, and SMILES to OCRS URIs
Chemical similarity search: type of search (Tanimoto, Tversky, Euclidean), search threshold (alpha, beta)
Chemical substructure search
Chemical structure exact search: options for tautomers, same skeleton including or excluding H, all isomers
Relevance score for each structure search result returned
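To make the similarity search options more concrete, here is a minimal sketch of the Tanimoto and Tversky coefficients computed on fingerprints represented as sets of "on" bit positions. This is a generic illustration, not the Open PHACTS implementation; the fingerprints, molecule names, and threshold are made up.

```python
def tanimoto(a: set, b: set) -> float:
    """Shared bits divided by the bits set in either fingerprint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tversky(a: set, b: set, alpha: float, beta: float) -> float:
    """Asymmetric similarity; alpha weights bits unique to a, beta bits unique to b."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    return common / denominator if denominator else 1.0


# Hypothetical fingerprints: each set holds the positions of bits that are set.
query = {1, 4, 7, 9}
candidates = {
    "mol_x": {1, 4, 7, 9},
    "mol_y": {1, 4, 8},
    "mol_z": {2, 3},
}

threshold = 0.3  # assumed similarity cutoff
hits = sorted(
    (
        (name, tanimoto(query, fp))
        for name, fp in candidates.items()
        if tanimoto(query, fp) >= threshold
    ),
    key=lambda pair: -pair[1],
)
for name, score in hits:
    print(name, round(score, 2))
```

With alpha = beta = 1 the Tversky coefficient reduces to Tanimoto; unequal weights make the measure asymmetric, which is useful when the query should be treated as a fragment of the candidates.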
15
Target API Methods
16
Target Info Data
URIs (ConceptWiki, UniProt, ChEMBL, DrugBank)
Name, synonyms
Sequence, protein existence, mass, functional annotation, GO terms, UniProt curated protein-protein interactions, links to PDB and IntAct, number of residues, theoretical pI, cellular location
ChEMBL target component description
17
Target Info Data Sources
ConceptWiki (including WikiPathways, Pathway Ontology concepts)
UniProt (including UniParc)
GOA (update Sept 9, 2013)
ChEMBL: target dataset triples, target component triples
DrugBank
IntAct
18
Target Class Data Sources
GeneOntology Annotations (GOA): Classification of Targets
ChEMBL Targets: Classification of Targets
ENZYME: Classification of Targets
19
Target Pharmacology Data
Target data: Name, organism, target component, target type (an API method determines the 16 different target types, which can be used for filtering results)
Compound data: Activity (type, value, units, comment), pChEMBL, Assay (type, organism, description), URIs (OCRS, ChEMBL, ConceptWiki, DrugBank), Drug type, Generic name, SMILES, InChI, InChIKey, molecular weight, Rule of 5, Literature (DOI, PMID)
20
Target Class Pharmacology
Activity values for targets found in a given class in the supported hierarchies:
ENZYME Classification
ChEMBL Target Classification
GeneOntology
21
Pathway API Methods
22
Pathway Info Data
URIs (WikiPathways, Pathway Ontology, ConceptWiki)
Title, Description, Annotations, Organism
Pathway participants: Compounds, Proteins
Literature
23
Pathway Info Data Sources
WikiPathways contains:
Curated pathways, converted KEGG pathways
Metabolite identifiers: HMDB, ConceptWiki
Target identifiers: GeneID, UniProt, ConceptWiki, Ensembl
Publication identifiers: DOI, PMID
NCBI taxonomy URIs and textual names
24
Open PHACTS API Methods for Hierarchies
1.3 release methods for hierarchies and classifications: ENZYME, GeneOntology, ChEBI, and ChEMBL targets
25
Hierarchy Methods
Visualization of hierarchy data by hierarchy structure: ENZYME, ChEBI, ChEMBL target classification, GeneOntology
26
Basic hierarchy API methods
ENZYME: EC classes
ChEMBL Target Hierarchy (Root)
GeneOntology: Biological process, Molecular function, Cellular component
ChEBI: Subatomic particle, Chemical entity, Has role
27
Basic hierarchy API methods
DNA methylation
DNA methylation or demethylation
regulation of gene expression, epigenetic
DNA alkylation
macromolecule methylation
cellular response to hypoxia
regulation of transcription from RNA polymerase II promoter in response to hypoxia
hypoxia-inducible factor-1alpha signaling pathway
28
Query of ChEMBL Target Classification
ChEMBL Target Hierarchy
Query for Protein Kinase class: CHEMBL_PC_320
646 results
Traversal of all nodes below Protein Kinase to retrieve all leaves (targets), e.g. all 3 proteins in the Histk node will be returned
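The traversal described here can be sketched as a walk over a parent-to-children map that collects every node without children. The tree below is a toy stand-in: only the class names Protein Kinase, Atypical, Histk, and Agc come from the slides, while the arrangement of the Agc branch and all target names are hypothetical placeholders.

```python
def collect_leaves(children: dict, root: str) -> list:
    """Return all leaf nodes reachable from `root`, in sorted order."""
    leaves = []
    stack = [root]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against nodes reachable by more than one route
        seen.add(node)
        kids = children.get(node, [])
        if kids:
            stack.extend(kids)
        else:
            leaves.append(node)
    return sorted(leaves)


# Toy hierarchy (assumed shape): class -> child classes or targets.
children = {
    "Protein Kinase": ["Atypical", "Agc"],
    "Atypical": ["Histk"],
    "Histk": ["target_1", "target_2", "target_3"],
    "Agc": ["target_4"],
}

print(collect_leaves(children, "Protein Kinase"))
print(collect_leaves(children, "Histk"))
```

Because Histk is the only child of Atypical in this toy tree, querying either class yields the same three targets. The sketch treats any childless node as a target, which is a simplification.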
29
Closer look: HistK class members
Query with classes: Atypical (CHEMBL_PC_1451) or Histk (CHEMBL_PC_267)
Same 3 results for both classes: 2 different types of targets
30
“Protein family” target: PKC Alpha
Target Class Members List: Query for Alpha class CHEMBL_PC_317
PKC alpha (P17252) is part of a protein family target, Chembl , which comprises members from several other PKC classes
31
Target Classification: PKC Alpha
Target Classification: Query for protein target PKC alpha (P17252)
32
Classification API for ChEMBL targets
A “protein family” target can be represented in several classes
P17252 is a “single protein” target as well as part of a “protein family” target
To retrieve only “single protein” targets, filter results by using the target_type restriction with the “single_protein” parameter
The Target Classification API method returns all classes that have been annotated to contain the target
P17252 as part of the “protein family” target is found in classes: Ser_Thr, Agc, Pkc, Alpha, as well as Camk and Pkd
P17252 as part of the “protein family” target is found in the ENZYME classes: EC and EC
33
Target Classification: GO terms
Target Classification method returns all GO classes that have been annotated to contain the target. Similar to ChEMBL, super classes are returned.
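Returning superclasses along with the directly annotated class amounts to walking upward through the hierarchy. Since a GO term can have more than one parent, the sketch below assumes a child-to-parents map and gathers every ancestor once. The term names and edges are placeholders, not real GO relationships.

```python
def all_superclasses(parents: dict, start: str) -> set:
    """Collect every ancestor of `start` in a hierarchy that allows multiple parents."""
    found = set()
    stack = list(parents.get(start, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents.get(node, []))
    return found


# Placeholder hierarchy: term_d has two parents that share one grandparent.
parents = {
    "term_d": ["term_b", "term_c"],
    "term_b": ["term_a"],
    "term_c": ["term_a"],
}

print(sorted(all_superclasses(parents, "term_d")))
```

Using a set means a shared ancestor such as term_a is reported only once, even though it is reachable through both parents.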
34
Query for protein target PKC alpha (P17252)
Classes of compounds that have pharmacology with a target: ChEBI results
Target query: PKC Alpha
Compound class result: monohydroxyquinoline
35
Members of ChEBI Compound Class: monohydroxyquinoline
Compound Class Members List: Query for monohydroxyquinoline (ChEBI_38775)
The members of the monohydroxyquinoline compound class (indacaterol, chloroxine, etc.) can be used in the Compound Classifications methods
36
Compound Classification: Indacaterol
Compound Classification: Query for compound Indacaterol (OPS )
Concepts from the 3 branches of the ontology can be returned.
Has role concepts: beta-adrenergic agonist and bronchodilator agent
Chemical entity (type) concepts: quinoline
37
Hierarchical Activity Clustering Use Case
Sorafenib: 2755 assay points in total
Activity-only cutoff returned 707 results
Human-only cutoff returned 634 results
Single-protein-only cutoff returned 482 results
120 distinct human targets
But what does the distribution look like?
When you run pharmacology on a promiscuous compound, you are going to get hits to a lot of different targets. You may want to cluster those hits by family to see whether they are concentrated in particular sets of protein types. This example starts with the compound Sorafenib and filters for human, activity-type pharmacology, and single proteins only, which reduces the pool to 120 distinct protein-compound links.
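The successive cutoffs can be pictured as a chain of filters applied to the pharmacology records, with the remaining count noted after each step. The sketch below uses a handful of invented records. The field names are hypothetical, and it assumes that the "activity only" cutoff means keeping records that carry a numeric activity value.

```python
def apply_filters(records, filters):
    """Apply each filter in turn and record how many rows remain after each step."""
    counts = [len(records)]
    for keep in filters:
        records = [r for r in records if keep(r)]
        counts.append(len(records))
    return records, counts


# Invented records for illustration only; not real Sorafenib data.
records = [
    {"target": "T1", "organism": "Homo sapiens", "target_type": "single_protein", "activity_value": 12.0},
    {"target": "T1", "organism": "Homo sapiens", "target_type": "single_protein", "activity_value": 30.0},
    {"target": "T2", "organism": "Mus musculus", "target_type": "single_protein", "activity_value": 5.0},
    {"target": "T3", "organism": "Homo sapiens", "target_type": "protein_family", "activity_value": 8.0},
    {"target": "T4", "organism": "Homo sapiens", "target_type": "single_protein", "activity_value": None},
]

filters = [
    lambda r: r["activity_value"] is not None,        # assumed meaning of "activity only"
    lambda r: r["organism"] == "Homo sapiens",        # human only
    lambda r: r["target_type"] == "single_protein",   # single protein only
]

remaining, counts = apply_filters(records, filters)
distinct_targets = {r["target"] for r in remaining}
print(counts, len(distinct_targets))
```

Counting distinct targets at the end, rather than rows, mirrors the step from 482 results to 120 distinct targets on the slide: several assay points can refer to the same protein.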
38
We developed a small script: for each target, get the hierarchy [A => B => C] and set a counter for each level [A=1, B=1, C=1]. Then get the next protein hit and its full hierarchy [A => B => D], and update the counter for each level [A=2, B=2, C=1, D=1].
This results in a map from each tree node to its count of distinct target proteins.
Use the basic API functionality (Hierarchies: Parent nodes) for each node to get the full path from the top of the tree, and concatenate it. The path->count pairs were put into Excel and sorted.
This shows that the target proteins group on the two main kinase branches and on other specific nodes, such as Tie, Src, and Eph, where there are many hits.
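The counting step can be sketched in a few lines. The original script is not shown in the slides, so this is only one possible reading of it: each target's hierarchy is assumed to be already available as a list from the top of the tree down, and the protein names are placeholders.

```python
from collections import defaultdict


def count_targets_per_node(target_paths: dict) -> dict:
    """Map each concatenated tree path to the number of distinct targets beneath it."""
    members = defaultdict(set)
    for target, path in target_paths.items():
        # Every prefix of the path is a node on the way down to this target.
        for depth in range(1, len(path) + 1):
            members[" => ".join(path[:depth])].add(target)
    return {node: len(targets) for node, targets in members.items()}


# The two example hierarchies from the slide, keyed by placeholder protein names.
target_paths = {
    "protein_1": ["A", "B", "C"],
    "protein_2": ["A", "B", "D"],
}

counts = count_targets_per_node(target_paths)

# Sort by descending count, then by path, as one would in a spreadsheet.
for node, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{node}\t{n}")
```

Storing a set of targets per node, instead of a bare integer, keeps the counts distinct even if the same protein appears in several pharmacology records.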
39
Extra Methods
Map URL: visualization of mappings between URLs
Data sources: visualization of VoID for integrated data sources