Protein association networks with STRING


1 Protein association networks with STRING
Lars Juhl Jensen

2 interaction networks

3 association networks

4 guilt by association

5

6 protein networks

7 STRING

8 9.6 million proteins

9 common foundation

10 Exercise 1
Go to http://string-db.org/
Query for human insulin receptor (INSR) using the search-by-name functionality.
Make sure you are in evidence view (check the buttons below the network).
Why are there multiple lines connecting the same two proteins?

11 curated knowledge

12 (what we know)

13 protein complexes

14 3D structures

15

16 pathways

17 metabolic pathways

18 Letunic & Bork, Trends in Biochemical Sciences, 2008

19 signaling pathways

20 very incomplete

21 experimental data

22 (what we measured)

23 physical interactions

24 Jensen & Bork, Science, 2008

25 genetic interactions

26 Beyer et al., Nature Reviews Genetics, 2007

27 gene coexpression

28 microarrays

29 RNAseq

30 Exercise 2
(Continue from where exercise 1 ended)
Which types of evidence support the interaction between INSR and IRS1?
Click on the interaction to view the popup, which has buttons linking to full details.
Which types of experimental assays support the INSR–IRS1 interaction?

31 predictions

32 (what we infer)

33 genomic context

34 evolution

35 gene fusion

36 Korbel et al., Nature Biotechnology, 2004

37 gene neighborhood

38 Korbel et al., Nature Biotechnology, 2004

39 phylogenetic profiles

40 Korbel et al., Nature Biotechnology, 2004
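
The idea behind phylogenetic profiles can be sketched in a few lines: genes that are present and absent in the same set of genomes are candidates for a functional association. The gene names, the six genomes, and the choice of Jaccard similarity are all assumptions made for illustration; this is not the scoring STRING itself uses.

```python
# Toy presence/absence profiles: 1 means the gene has an ortholog in that genome.
# Genes and genomes are invented for illustration.
profiles = {
    "geneA": (1, 1, 0, 1, 0, 1),
    "geneB": (1, 1, 0, 1, 0, 1),
    "geneC": (0, 1, 1, 0, 1, 0),
}


def jaccard(p, q):
    """Fraction of genomes containing either gene that contain both."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def profile_similarities(profiles):
    """Score every unordered gene pair by the similarity of their profiles."""
    names = sorted(profiles)
    return {
        (g1, g2): jaccard(profiles[g1], profiles[g2])
        for i, g1 in enumerate(names)
        for g2 in names[i + 1:]
    }


similarities = profile_similarities(profiles)
```

Here geneA and geneB have identical profiles and get the top score, while geneC shares only one genome with either of them.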

41 a real example

42

43

44

45 Cell Cellulosomes Cellulose

46 complications

47 many databases

48 different formats

49 different identifiers

50 variable quality

51 not comparable

52 not same species

53 hard work

54 parsers

55 mapping files
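
As a rough illustration of what a mapping file buys you, the sketch below translates interactions from one source database onto a common identifier space and drops pairs that cannot be mapped. The file layout, database names, and identifiers are invented for the example.

```python
import csv
import io

# Invented mapping file: source database, source identifier, common identifier.
MAPPING_TSV = (
    "source\tsource_id\tcommon_id\n"
    "dbA\tA0001\tPROT1\n"
    "dbB\tB0042\tPROT1\n"
    "dbA\tA0002\tPROT2\n"
)


def load_mapping(handle):
    """Read the tab-separated mapping into a {(source, id): common_id} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {(row["source"], row["source_id"]): row["common_id"] for row in reader}


def translate(interactions, source, mapping):
    """Map both partners of each pair; skip pairs with an unknown identifier."""
    translated = []
    for a, b in interactions:
        key_a, key_b = (source, a), (source, b)
        if key_a in mapping and key_b in mapping:
            translated.append((mapping[key_a], mapping[key_b]))
    return translated


mapping = load_mapping(io.StringIO(MAPPING_TSV))
translated = translate([("A0001", "A0002"), ("A0001", "A9999")], "dbA", mapping)
```

The second input pair is lost because A9999 has no entry in the mapping, which is one reason identifier mapping is hard work in practice.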

56 quality scores

57 affinity purification

58 von Mering et al., Nucleic Acids Research, 2005

59 phylogenetic profiles

60

61 score calibration

62 gold standard

63 von Mering et al., Nucleic Acids Research, 2005

64 implicit weighting by quality

65 common scale
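
One simple way to picture score calibration is to bin the raw scores from a method and ask what fraction of the pairs in each bin appear in the gold standard. That fraction can then replace the raw score, which puts different methods on a common scale. The sketch below assumes this binning approach and uses invented pairs and scores; the calibration actually used for STRING may fit smooth curves instead.

```python
def calibrate(raw_scores, gold_standard, bin_edges):
    """Return, per bin, the fraction of scored pairs found in the gold standard.

    raw_scores: dict mapping a protein pair to a raw, method-specific score.
    gold_standard: set of pairs regarded as true associations.
    bin_edges: ascending lower bounds of the score bins.
    """
    hits = [0] * len(bin_edges)
    totals = [0] * len(bin_edges)
    for pair, score in raw_scores.items():
        # Pick the highest bin whose lower bound does not exceed the score.
        candidates = [i for i, edge in enumerate(bin_edges) if score >= edge]
        if not candidates:
            continue  # score lies below the lowest bin
        index = candidates[-1]
        totals[index] += 1
        if pair in gold_standard:
            hits[index] += 1
    return [h / t if t else None for h, t in zip(hits, totals)]


# Invented raw scores and gold standard.
raw = {
    ("A", "B"): 0.5, ("A", "D"): 0.7,
    ("A", "C"): 1.2, ("B", "D"): 1.8,
    ("B", "C"): 3.4, ("C", "D"): 3.9,
}
gold = {("A", "C"), ("B", "C"), ("C", "D")}
calibrated = calibrate(raw, gold, bin_edges=[0.0, 1.0, 3.0])
```

In this toy data, higher raw scores land in bins with a larger fraction of gold-standard pairs, so the calibrated value rises with the raw score.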

66 homology-based transfer

67 orthologous groups

68 Franceschini et al., Nucleic Acids Research, 2013

69 missing most of the data

70 Exercise 3
(Continue from where exercise 2 ended)
Change the network to the confidence view.
Change the confidence cutoff to 0.9. Are there any changes in the proteins or interactions shown?
Turn off all evidence types except experiments. What changes?
Increase the number of interactors shown to 50. How many proteins do you get? Why?
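
The effect of a confidence cutoff and of switching evidence types off can be mimicked on a toy table. The per-channel scores and the partner names PROT_X and PROT_Y are invented, and the combination rule, 1 - prod(1 - s), simply treats the channels as independent. That is a simplifying assumption; the scheme STRING actually applies may differ in detail.

```python
from math import prod

# Invented per-channel confidence scores for three interactions.
edges = {
    ("INSR", "IRS1"): {"experiments": 0.95, "databases": 0.80, "textmining": 0.95},
    ("INSR", "PROT_X"): {"textmining": 0.92},
    ("INSR", "PROT_Y"): {"experiments": 0.40, "textmining": 0.50},
}


def combined_score(channel_scores, enabled=None):
    """Combine channel scores as 1 - prod(1 - s), assuming independence."""
    used = [s for c, s in channel_scores.items() if enabled is None or c in enabled]
    return 1.0 - prod(1.0 - s for s in used)


def filter_network(edges, cutoff, enabled=None):
    """Keep only interactions whose combined score reaches the cutoff."""
    kept = {}
    for pair, scores in edges.items():
        score = combined_score(scores, enabled)
        if score >= cutoff:
            kept[pair] = score
    return kept


all_channels = filter_network(edges, cutoff=0.9)
experiments_only = filter_network(edges, cutoff=0.9, enabled={"experiments"})
```

With all channels on, two interactions pass the 0.9 cutoff; with only experiments enabled, the text-mining-only interaction disappears and just INSR and IRS1 remain connected.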

71 text mining

72 >10 km

73 too much to read

74 exponential growth

75 ~40 seconds per paper

76 named entity recognition

77 comprehensive lexicon

78 cyclin dependent kinase 1

79 CDC2

80 orthographic variation

81 prefixes and suffixes

82 CDC2

83 hCdc2

84 “black list”

85 SDS
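
The ingredients above (a lexicon of synonyms, tolerance for orthographic variation, species prefixes, and a black list) can be combined in a minimal dictionary lookup. The tiny lexicon, the single "h" prefix, and the normalization rules are assumptions for illustration; a real tagger needs a far larger lexicon and more careful matching.

```python
import re

# Tiny lexicon: normalized synonym -> canonical name.
# CDC2 is an older name for cyclin dependent kinase 1 (CDK1).
LEXICON = {
    "cdc2": "CDK1",
    "cdk1": "CDK1",
    "cyclindependentkinase1": "CDK1",
    "sds": "SDS",
}
# Names that are valid gene symbols but usually mean something else in text.
BLACKLIST = {"sds"}
# Species prefixes to strip, e.g. "h" for human as in hCdc2.
SPECIES_PREFIXES = ("h",)


def normalize(name):
    """Ignore case, whitespace, and hyphens."""
    return re.sub(r"[\s\-]+", "", name).lower()


def lookup(name):
    """Return the canonical name for a mention, or None if it is not accepted."""
    key = normalize(name)
    if key in BLACKLIST:
        return None
    if key in LEXICON:
        return LEXICON[key]
    for prefix in SPECIES_PREFIXES:
        stripped = key[len(prefix):]
        if key.startswith(prefix) and stripped in LEXICON and stripped not in BLACKLIST:
            return LEXICON[stripped]
    return None
```

With this sketch, lookup("CDC2"), lookup("hCdc2"), and lookup("cyclin-dependent kinase 1") all resolve to the same protein, while lookup("SDS") is rejected by the black list.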

86 information extraction

87 co-mentioning

88 counting

89 within documents

90 within paragraphs

91 within sentences

92 scoring scheme
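
A co-mentioning score of this kind can be sketched as a weighted count: each pair of names earns credit once per document for appearing in the same document, the same paragraph, and the same sentence. The weights, the naive sentence splitter, and the example text are assumptions for illustration, not the values or the exact formula used by STRING.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical weights for co-occurrence at each level.
WEIGHTS = {"document": 1.0, "paragraph": 2.0, "sentence": 4.0}


def names_in(text, names):
    """Return the known names that occur as whole tokens in the text."""
    tokens = set(re.findall(r"\w+", text))
    return {name for name in names if name in tokens}


def comention_scores(documents, names):
    """Sum weighted co-occurrences; each level counts at most once per document."""
    scores = Counter()
    for doc in documents:
        paragraphs = doc.split("\n\n")
        sentences = [s for p in paragraphs for s in re.split(r"(?<=[.!?])\s+", p)]
        levels = (("document", [doc]), ("paragraph", paragraphs), ("sentence", sentences))
        for level, units in levels:
            pairs = set()
            for unit in units:
                pairs.update(combinations(sorted(names_in(unit, names)), 2))
            for pair in pairs:
                scores[pair] += WEIGHTS[level]
    return scores


docs = ["INSR binds IRS1. INS is a hormone.\n\nGRB2 is an adaptor."]
scores = comention_scores(docs, {"INSR", "IRS1", "INS", "GRB2"})
```

In the example, INSR and IRS1 share a sentence and score highest, INS and INSR share only a paragraph, and GRB2 co-occurs with the others only at the document level.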

93

94

95 score calibration

96 Natural Language Processing (NLP)

97 part-of-speech tagging

98 what you learned in school
pronoun pronoun verb preposition noun

99 semantic tagging

100 grammatical analysis

101 Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
Saric et al., Proceedings of ACL, 2004

102 type and direction
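
To make "type and direction" concrete, the sketch below pulls a typed, directed relation out of a sentence using cue phrases. It is a plain keyword pattern with a three-name lexicon, both assumed for illustration; it is not the grammar-based approach of Saric et al., and it would fail on the complex sentences and anaphoric references that a real system has to handle.

```python
import re

# Stand-in for a full lexicon of gene and protein names.
GENE_NAMES = {"CYC1", "CYC7", "HAP1"}

# Hypothetical cue phrases: (pattern, relation type, passive voice?).
# In the passive form the target comes before the cue and the agent after it.
CUES = [
    (r"\bis controlled by\b", "regulation", True),
    (r"\bcontrols\b", "regulation", False),
]


def genes_in(text):
    """Return known gene names in order of appearance."""
    return [token for token in re.findall(r"\w+", text) if token in GENE_NAMES]


def extract_relations(sentence):
    """Return (agent, relation type, target) triples found via cue phrases."""
    relations = []
    for pattern, rel_type, passive in CUES:
        match = re.search(pattern, sentence)
        if not match:
            continue
        left = genes_in(sentence[:match.start()])
        right = genes_in(sentence[match.end():])
        agents, targets = (right, left) if passive else (left, right)
        relations += [(a, rel_type, t) for a in agents for t in targets]
    return relations


sentence = "The expression of the cytochrome genes CYC1 and CYC7 is controlled by HAP1"
relations = extract_relations(sentence)
```

For the example sentence this yields HAP1 as the regulator of both CYC1 and CYC7, so the passive construction is turned into the correct direction.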

103 complex sentences

104 anaphoric references

105 it

106 related resources

107 association networks

108 heterogeneous data

109 common identifiers

110 quality scores

111 protein networks

112 string-db.org Szklarczyk et al., Nucleic Acids Research, 2015

113 STITCH

114 chemical networks

115 stitch-db.org Kuhn et al., Nucleic Acids Research, 2014

116 COMPARTMENTS

117 subcellular localization

118 compartments.jensenlab.org Binder et al., Database, 2014

119 TISSUES

120 tissue expression

121 tissues.jensenlab.org Santos et al., PeerJ, 2015

122 DISEASES

123 disease associations

124 diseases.jensenlab.org Frankild et al., Methods, 2015

125 Exercise 4
Open http://tissues.jensenlab.org
Look up tissue associations for insulin (INS).
Open
Search for insulin receptor (INSR).
What is the strongest associated disease?
Inspect the underlying text-mining evidence.

