1
Networks and Pathways I
CBW Bioinformatics Workshop, February 24th 2005, Vancouver
Christopher Hogue, The Blueprint Initiative
Lecture 10.3 (c) 2003 CGDN
2
About this talk
The Problem of Choice: Too Many Databases
Data Exchange Formats
Pathway Resources
  KEGG
  EcoCyc
Small Molecule Resources
  PubChem
  SMID-BLAST
3
Molecular Assembly Data
Interaction pair: “A binds B”
Database of Interactions
Molecule = Vertex
Interaction = Edge
Tools/Computations
  Graph Theory
  Pathway Finding
  Simulations
  Cellular CAD
Goodsell
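The vertex/edge view on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk or from any interaction database: the molecule names and the `build_graph` and `find_path` helpers are hypothetical, and pathway finding is reduced here to a plain breadth-first search over an undirected graph.

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Build an undirected adjacency map from (A, B) 'A binds B' pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)  # molecule = vertex
        graph[b].add(a)  # interaction = edge (stored in both directions)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return one shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical interaction records, for illustration only.
interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
graph = build_graph(interactions)
print(find_path(graph, "A", "D"))
```

Real interaction records carry much more than a pair of names (evidence, experimental method, directionality), so a set-based adjacency map is only the simplest structure that supports graph computations.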
4
Molecular Assembly: What Databases to use?
DNA
RNA
Proteins
Small molecules
Complexes
5
The Problem
So many assembly databases, all with their own data models, formats, and data access methods.
Include the number 138 in larger font
7
User Behavior: The problem of too much choice.
(M. and S.
Two tables in a supermarket: 24 jars of jam vs 6 jars of jam.
3% vs 30%
Choice frustration.
Leads to incrementalism.
Essential user criticism is withdrawn.
Can’t debug: is this jam a little bitter compared to the other 6? The other 24?
A whole lot of bad jam that nobody wants to buy…
8
Standards Fatigue
Data standards are not an effective goal for achieving results in a timely way.
Interactions/Pathways since NIH meeting in Nov
Efforts are still not integrated (PSI/IMEX and BioPAX).
Information systems are better goals.
Wet lab scientists are busy people who are (excuse me) trying to write papers.
Ongoing wishful thinking about the latest new technology.
If only we had the semantic web, it would fix everything!
9
“Community” Standards
IMEX (BIND/DIP/INTACT/MINT/MIPS)
BioPAX (pathway databases)
SBML (>70 software systems collaborating)
Cytoscape (collaborating interface developers)
NCBI/Blueprint (architecture)
Model Organism Databases (GMOD architecture)
Journals and Editors
Scientific Societies (FASEB)
Member and Non-member Scientists
10
Interaction Standards - PSI
11
BioPAX – Pathways/Reactions
12
Exchange Formats in the Pathway Data Space
[Diagram labels: Database Exchange Formats (BioPAX, PSI-MI 2); Simulation Model Exchange Formats (SBML, CellML); Genetic Interactions; Regulatory Pathways (Low Detail to High Detail); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Rate Formulas; Molecular Interactions (Pro:Pro, All:All); Biochemical Reactions; Metabolic Pathways (Low Detail to High Detail); Small Molecules (Low Detail to High Detail)]
Notes: This slide shows the difference between exchange formats in the pathway data space. Pathway databases and tools are not considered here, although each one is important and addresses specific use cases. I would use the other slide, which shows the scope of BioPAX level 1. You can always point out that BioPAX intends to cover the whole space. It is important not to overstate the current state of things. BioPAX level 1 is broad and shallow, allowing us to quickly build a format that can represent most of the existing data types in databases today. Details will be added in subsequent levels. This is a practical approach.
13
Two Views on Biomolecular Assembly Data Integration
Separate Models
  Pathways
  Interactions
  Separate Databases
  Multiple DB ontologies
  Ad-hoc curation standards
  Ontology Consortia: PSI, BioPAX
  APIs: Exchange Only
  Publish or perish
Unified Model
  Networks with Interactions and Reactions
  GenBank-Like Data Archive
  One Ontology archiving all
  Professional Curation
  Single Curation Standard
  FTP Services
  APIs: Atomistic Objects
  Service or perish
14
Where to define data objects? API or Exchange or Archive?
Software Systems Components (OSI Layers…)
  Human Interfaces
  Application Programming Interfaces
  Communications Protocols (Exchange)
  Content Structure (Archive)
  Database (ODBC/JDBC compliant MySQL)
  Document Structures (XML)
Architectures (Compatible orchestration of the above)
Platforms (Runs the above: Windows, Linux, Unix)
Atomistic
All-or-none
15
BioPAX Motivation
Common format will make data more accessible, promoting data sharing and distributed curation efforts.
[Diagram labels: Before BioPAX; With BioPAX; Database; Application; User; >150 DBs and tools; Data loss…]
Designed by the databases for themselves (DB-DB exchange) and for users. Umm… not really for users, yet. Unless you integrate into SBML.
16
Pathways, Interactions and Signaling
Metabolic Pathways
Molecular Interaction Networks
Signaling Pathways
Also include gene regulation
18
Summary: Working with a spectrum of communities.
Identify the communities.
Recognize that communities are disjoint.
Success will arise from broad collaboration across the spectrum of identified communities.
Service all communities effectively with a whole system.
Drive innovation more through applications development and use.
Gain and effectively incorporate user critique.
Understand user needs and behaviors.
19
Pathway Databases: KEGG and EcoCyc
34
PubChem – Small Molecules
35
PubChem: Substance, Compound
Substance: descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results that are available in PubChem BioAssay. If the contents of a chemical sample are known, the description includes links to PubChem Compound.
Compound
Includes mixtures
39
Similarity Search
Similar Compounds
Links: PubChem BioAssay
45
Small Molecule Interaction DB.
SMID-BLAST: for finding small molecule binding sites based on 3D structures.
47
What’s in SMID? SMID-BLAST?
SMID is a derived relational database
  3D structures that have small molecule binding sites
  CDD domain regions: families of conserved domains
  Small molecule binding residues are mapped onto CDDs.
SMID-BLAST enhances domain searching with small molecule binding context.
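The idea of mapping binding residues onto domain regions can be illustrated with a small sketch. This is not SMID's actual schema or algorithm: the `DomainRegion` type, the `map_residues_to_domains` helper, the domain identifiers, and the residue positions below are all hypothetical, and the mapping is assumed to be a simple interval-containment check on sequence positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRegion:
    domain_id: str  # hypothetical conserved-domain identifier
    start: int      # first residue of the region, 1-based inclusive
    end: int        # last residue of the region, 1-based inclusive


def map_residues_to_domains(binding_residues, regions):
    """Group binding-residue positions by the domain region containing them."""
    mapped = {}
    for region in regions:
        hits = sorted(
            r for r in binding_residues if region.start <= r <= region.end
        )
        if hits:
            mapped[region.domain_id] = hits
    return mapped


# Hypothetical domain regions and binding residues, for illustration only.
regions = [
    DomainRegion("domain_1", 10, 120),
    DomainRegion("domain_2", 150, 300),
]
binding_residues = {45, 47, 88, 130, 210}
print(map_residues_to_domains(binding_residues, regions))
```

Residues that fall outside every region (position 130 in this made-up input) are simply left unmapped, which mirrors the point that the small molecule context is attached to domain hits.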
48
Proteomics – HUPO Poster
Phenol-upregulated protein in H. salinarium.
Spots identified by 2D gels of +/- 1 mM phenol in 4.5 M NaCl.
Han, Han, Kim, Joo and Chan-Wha Kim, Korea University
H. salinarium is not sequenced: mass spec peptide hits to Halobacterium sp. NRC-1 GI (Vng2406c) and GI (Vng2339c).
Poster authors presented no conclusions other than that these were completely unknown proteins.
50
Little information from CDD…
53
Completely relaxed search settings…
55
Aromatic ligand binding site – phenol…
56
Oxygen Reactive site
57
SMID-BLAST
Offers small molecule context in addition to CDD domain hits.
With SMID-BLAST we can speculate on how two proteins work to utilize phenol as a carbon source.
Reactive species and “loose” specificity hydrophobic binding sites.
58
SMID-BLAST Standalone
Scoring System
  Distinguishes site specificity
  Weights substrate/binding site size
Generates GenPept Annotation
Suitable for use in sequence analysis pipelines