Download presentation
Presentation is loading. Please wait.
1
Protein association networks with STRING
Lars Juhl Jensen
2
interaction networks
3
association networks
4
guilt by association
6
protein networks
7
STRING
8
9.6 million proteins
9
common foundation
10
Exercise 1 Go to http://string-db.org/
Query for human insulin receptor (INSR) using the search by name functionality Make sure you are in evidence view (check the buttons below the network) Why are there multiple lines connecting the same to two proteins?
11
curated knowledge
12
(what we know)
13
protein complexes
14
3D structures
16
pathways
17
metabolic pathways
18
Letunic & Bork, Trends in Biochemical Sciences, 2008
19
signaling pathways
20
very incomplete
21
experimental data
22
(what we measured)
23
physical interactions
24
Jensen & Bork, Science, 2008
25
genetic interactions
26
Beyer et al., Nature Reviews Genetics, 2007
27
gene coexpression
28
microarrays
29
RNAseq
30
Exercise 2 (Continue from where exercise 1 ended)
Which types of evidence support the interaction between INSR and IRS1? Click on the interaction to view the popup, which has buttons linking to full details Which types of experimental assays support the INSR–IRS1 interaction?
31
predictions
32
(what we infer)
33
genomic context
34
evolution
35
gene fusion
36
Korbel et al., Nature Biotechnology, 2004
37
gene neighborhood
38
Korbel et al., Nature Biotechnology, 2004
39
phylogenetic profiles
40
Korbel et al., Nature Biotechnology, 2004
41
a real example
45
Cell Cellulosomes Cellulose
46
complications
47
many databases
48
different formats
49
different identifiers
50
variable quality
51
not comparable
52
not same species
53
hard work
54
parsers
55
mapping files
56
quality scores
57
affinity purification
58
von Mering et al., Nucleic Acids Research, 2005
59
phylogenetic profiles
61
score calibration
62
gold standard
63
von Mering et al., Nucleic Acids Research, 2005
64
implicit weighting by quality
65
common scale
66
homology-based transfer
67
orthologous groups
68
Franceschini et al., Nucleic Acids Research, 2013
69
missing most of the data
70
Exercise 3 (Continue from where exercise 2 ended)
Change the network to the confidence view Change the confidence cutoff to 0.9; any changes in proteins or interactions shown? Turn off all but experiments; what changes? Increase the number of interactors shown to 50; how many proteins do you get? Why?
71
text mining
72
>10 km
73
too much to read
74
exponential growth
75
~40 seconds per paper
76
named entity recognition
77
comprehensive lexicon
78
cyclin dependent kinase 1
79
CDC2
80
orthographic variation
81
prefixes and suffixes
82
CDC2
83
hCdc2
84
“black list”
85
SDS
86
information extraction
87
co-mentioning
88
counting
89
within documents
90
within paragraphs
91
within sentences
92
scoring scheme
95
score calibration
96
Natural Language Processing
NLP Natural Language Processing
97
part-of-speech tagging
98
what you learned in school
pronoun pronoun verb preposition noun
99
semantic tagging
100
grammatical analysis
101
Cue words for entity recognition Verbs for relation extraction
Gene and protein names Cue words for entity recognition Verbs for relation extraction [nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1] Saric et al., Proceedings of ACL, 2004
102
type and direction
103
complex sentences
104
anaphoric references
105
it
106
related resources
107
association networks
108
heterogeneous data
109
common identifiers
110
quality scores
111
protein networks
112
string-db.org Szklarczyk et al., Nucleic Acids Research, 2015
113
STITCH
114
chemical networks
115
stitch-db.org Kuhn et al., Nucleic Acids Research, 2014
116
COMPARTMENTS
117
subcellular localization
118
compartments.jensenlab.org Binder et al., Database, 2014
119
TISSUES
120
tissue expression
121
tissues.jensenlab.org Santos et al., PeerJ, 2015
122
DISEASES
123
disease associations
124
diseases.jensenlab.org Frankild et al., Methods, 2015
125
Exercise 4 Open http://tissues.jensenlab.org
Look up tissue associations for insulin (INS) Open Search for insulin receptor (INSR) What is the strongest associated disease? Inspect the underlying text-mining evidence
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.