Open access – making the most of biomedical literature mining. Lars Juhl Jensen, EMBL Heidelberg.

Presentation transcript:

1 Open access – making the most of biomedical literature mining. Lars Juhl Jensen, EMBL Heidelberg

2 why open access?

3 why biomedicine?

4 why literature mining?

5 MEDLINE

6

7 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

8 information retrieval

9 finding the papers

10 if you can’t find them …

11 … they don’t exist!

12 ad hoc retrieval

13 user-specified query

14 “yeast AND cell cycle”

15 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

16

17

18 MEDLINE

19 abstracts

20 complete papers

21

22

23 tricks

24 stemming

25 yeast / yeasts

26 synonyms

27 yeast / S. cerevisiae

28 dynamic query expansion
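The three tricks above (stemming, synonyms, query expansion) can be sketched in a few lines. This is only a minimal illustration, not the method used in the talk: the `SYNONYMS` table and the function names are hypothetical, and the singular/plural rule is a crude stand-in for a real stemmer.

```python
# Hypothetical synonym table; a real system would use a curated resource.
SYNONYMS = {
    "yeast": ["S. cerevisiae", "Saccharomyces cerevisiae"],
}


def plural_variants(term):
    """Return the term plus a naive singular/plural variant (stand-in for stemming)."""
    if term.endswith("s"):
        return [term, term[:-1]]
    return [term, term + "s"]


def expand_term(term):
    """Expand one query term into an OR-group of variants and synonyms."""
    variants = plural_variants(term)
    variants += SYNONYMS.get(term.lower(), [])
    quoted = ['"%s"' % v for v in variants]
    return "(" + " OR ".join(quoted) + ")"


def expand_query(query):
    """Expand every term of a simple 'a AND b' query."""
    terms = [t.strip() for t in query.split(" AND ")]
    return " AND ".join(expand_term(t) for t in terms)


print(expand_query("yeast AND cell cycle"))
```

With this sketch, the query "yeast AND cell cycle" also matches papers that only say "yeasts" or "S. cerevisiae", which a literal keyword search would miss.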

29 next logical step

30 ontologies

31 annotation

32 Cdc28 → yeast gene

33 Cdc28 → cell cycle

34 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

35 “yeast AND cell cycle”

36 entity recognition

37 identifying the substance(s)

38 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

39 if you can’t find them …

40 … they don’t exist!

41

42

43

44 abstracts

45 MEDLINE

46 tricks

47 good synonyms list

48 manual curation

49 orthographic variation

50 CDC28

51 Cdc28p
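One way to handle orthographic variants such as Cdc28, CDC28 and Cdc28p is to map every name to a normalized key before looking it up in the synonyms list. The sketch below assumes the yeast convention of three letters plus digits with an optional trailing "p" for the protein; the function names and the lexicon are hypothetical.

```python
import re


def normalize_name(name):
    """Map orthographic variants of a yeast gene/protein name to one key."""
    key = name.lower()
    # Assumed convention: protein names append 'p' to the gene name (Cdc28p).
    match = re.fullmatch(r"([a-z]{3}\d+)p", key)
    if match:
        key = match.group(1)
    return key


def find_mentions(text, lexicon):
    """Return (matched_text, canonical_name) pairs for lexicon names found in text."""
    index = {normalize_name(n): n for n in lexicon}
    mentions = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        canonical = index.get(normalize_name(token))
        if canonical:
            mentions.append((token, canonical))
    return mentions


lexicon = ["CDC28", "SWE1", "CDC5", "CLB2"]
print(find_mentions("Cdc28p and CDC28 refer to the same yeast gene", lexicon))
```

Normalization alone does not resolve ambiguous names; that still needs a separate disambiguation step.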

52 disambiguation

53 hairy

54 SDS

55 Cdc2

56 information extraction

57 formalizing the facts

58 co-mentioning

59 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
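Co-mentioning can be illustrated by counting how often two names appear in the same sentence. The following is a minimal sketch under simple assumptions (exact, case-sensitive name matching and pre-split sentences); the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def comention_counts(sentences, names):
    """Count how often each pair of names occurs in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        found = sorted(n for n in names if n in tokens)
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts


sentence = (
    "Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated "
    "Swe1 and this modification served as a priming step to promote subsequent "
    "Cdc5-dependent Swe1 hyperphosphorylation and degradation"
)
counts = comention_counts([sentence], ["Clb2", "Cdc28", "Swe1", "Cdc5"])
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Co-mentioning only records that two names occur together. It does not say which protein acts on which, or how.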

60

61 NLP Natural Language Processing

62 Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

63 Legend: gene and protein names; cue words for entity recognition; verbs for relation extraction. Example: [nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

64

65 new discoveries

66 text mining

67

68

69 temporal trends

70

71 buzzwords

72

73 grant applications

74 global correlations

75 327983 3592 Regulates / Regulated, P < 9 × 10^-9

76 transcriptional networks

77 112744 3704 Phosphorylates / Phosphorylated, P < 2 × 10^-7

78 signal cascades

79 810747 3625 Expression / Phosphorylation, P < 5 × 10^-4

80

81 integration of text and data

82

83

84

85

86 network mining

87 linking genes to diseases

88

89

90

91

92 multifactorial diseases

93 genotype to phenotype

94

95

96

97 where are we now?

98

99 abstracts

100 complete papers

101 restricted access

102 open access

103 the tools are there

104 now we need the text!

105 Acknowledgments: Jasmin Saric, Rossitza Ouzounova, Michael Kuhn, Isabel Rojas, Miguel Andrade, Peer Bork

