Published by Karin Wood. Modified over 9 years ago.
1
Copyright OpenHelix. No use or reproduction without express written consent.
2
Textpresso: A Text-Mining System for Biological Literature
Materials prepared by: Mary E. Mangan, Ph.D.
www.openhelix.com
Updated: Q1 2011
Version 1.0
3
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
4
Textpresso Introduction
Textpresso developed at WormBase
open-source text-mining software
ontology-based information retrieval and extraction system
initially used by WormBase biocurators
now applicable to other bodies of text
5
Text-Mining: Separating Wheat from Chaff
Diagram labels: Scientific Literature corpus; "but I just want papers for Hoxd11"; Text-Mining with keyword = Hoxd11
6
Results Example
Keyword: lin-12
7
Textpresso: Beyond the Nematode...
FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous
http://ilex.caltech.edu:80/trac/alere/
8
Textpresso Homepage
About
9
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
10
Information Retrieval and Extraction
Information Retrieval: a query for Hoxd11 against the corpus retrieves documents
Information Extraction: a query for Hoxd11 against the corpus extracts facts:
Hoxd11 is expressed in the limb bud.
Pax6 regulates Hoxd11 transcription.
The nucleus contains Hoxd11 protein.
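The difference between retrieving documents and extracting facts can be illustrated with a small sketch. This is not Textpresso's code: the document IDs, the grouping of sentences into documents, the filler sentences, and the function names `retrieve` and `extract` are all assumptions made for illustration. Only the three Hoxd11 sentences come from the slide.

```python
import re

# Toy corpus. The IDs and the grouping into documents are made up;
# the three Hoxd11 sentences are the ones shown on the slide.
corpus = {
    "doc1": "Hoxd11 is expressed in the limb bud. Pax6 regulates Hoxd11 transcription.",
    "doc2": "The nucleus contains Hoxd11 protein. The cell cycle was also examined.",
    "doc3": "This paper does not mention the gene.",
}

def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def retrieve(corpus, keyword):
    """Information retrieval: IDs of documents that mention the keyword."""
    kw = keyword.lower()
    return [doc_id for doc_id, text in corpus.items() if kw in text.lower()]

def extract(corpus, keyword):
    """Information extraction: (doc_id, sentence) pairs that mention the keyword."""
    kw = keyword.lower()
    return [
        (doc_id, sentence)
        for doc_id, text in corpus.items()
        for sentence in split_sentences(text)
        if kw in sentence.lower()
    ]

if __name__ == "__main__":
    print(retrieve(corpus, "Hoxd11"))
    for doc_id, sentence in extract(corpus, "Hoxd11"):
        print(doc_id, "|", sentence)
```

Retrieval answers "which papers?", while extraction answers "which statements?"; the second is what lets a reader see the fact without opening every paper.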
11
Two Major Elements of Textpresso
1. Information Retrieval: searches titles, abstracts, and full-text rather than titles and abstracts only
2. Information Extraction: sentences are marked up with ontology-based categories
1 + 2: Improves Results
12
How Textpresso Works
TEXTPRESSO = Text Processing System
paper: Sentence 1, Sentence 2, Sentence 3, Sentence 4, Sentence 5, Sentence 6
Word: ontology category tags = association, molecular function, phenotype, anatomy, etc...
Query = phenotype + anatomy
Extract = Sentence 1, Sentence 4
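The idea on this slide (tag words with ontology categories, then query by category) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Textpresso's implementation: the mini-lexicon, the six example sentences, and the names `tag_sentence` and `query` are hypothetical, and plain substring matching stands in for real term recognition.

```python
# Hypothetical mini-lexicon: category -> terms. A real lexicon is far larger.
LEXICON = {
    "anatomy": {"cuticle", "dorsal cord", "tail"},
    "phenotype": {"uncoordinated", "sterile"},
    "regulation": {"regulates", "represses"},
}

# Six made-up sentences standing in for "Sentence 1" to "Sentence 6".
SENTENCES = [
    "Mutant animals are uncoordinated and have a thin cuticle.",
    "The gene is expressed in the tail.",
    "lin-12 regulates cell fate.",
    "Sterile animals show defects in the dorsal cord.",
    "Animals were grown at 20 degrees.",
    "The protein represses transcription.",
]

def tag_sentence(sentence):
    """Return the set of categories whose terms occur in the sentence."""
    lowered = sentence.lower()
    return {
        category
        for category, terms in LEXICON.items()
        if any(term in lowered for term in terms)
    }

def query(sentences, required_categories):
    """Keep only sentences tagged with every required category."""
    required = set(required_categories)
    return [s for s in sentences if required <= tag_sentence(s)]

if __name__ == "__main__":
    # Query = phenotype + anatomy
    for hit in query(SENTENCES, ["phenotype", "anatomy"]):
        print(hit)
```

With these made-up sentences, the query for phenotype + anatomy keeps the first and fourth sentences, mirroring the "Sentence 1, Sentence 4" result in the diagram.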
13
Textpresso Ontology Categories
Categories grouped by Topic
Biological Concepts: allele, anatomy, developmental stage, phenotype, molecular function (GO), biological process (GO), etc...
Relationships: association, comparison, consort, involvement, regulation, spatial relation, etc...
Description: characterization, effect, localization, method, pathway, purpose, etc...
anatomy = cuticle, dorsal cord, M1 neuron, etc...
14
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
15
Similar Textpresso Sites
hosted on Textpresso server at Caltech
hosted by the groups using them
And more!
16
Textpresso for C. elegans
corpus
17
Document Finder
WBPaper00000646: WormBase paper ID
document is in the corpus
18
Ontology Browser
tail
19
Category Lists
pick-list
all the terms for “association” in lexicon
20
Query Language
21
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
22
Simple Search
enter search term: let-23
23
let-23 Results
Deselect “Matching Sentences” for list overview
24
Document Details
click
keyword highlighted
25
Matching Sentences
26
Supplemental Document Links
27
Adding Categories to Your Search
let-23
28
Detailed Query
29
Detailed Query Results
anatomy, molecular function, regulation, keyword, other genes
30
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
31
Filtering the Results
+Clark[author]
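The filter string +Clark[author] reads as a sign, a term, and a field name in brackets. The sketch below shows one way such a string could be parsed and applied to a list of records. It is an illustration only: the slide shows just the "+" form, so treating a leading "-" as "exclude" is an assumption, and the record layout and the names `parse_filter` and `apply_filter` are hypothetical rather than part of Textpresso.

```python
import re

# Sign, term, then a field name in square brackets, e.g. +Clark[author].
FILTER_RE = re.compile(r"^([+-])(.+)\[(\w+)\]$")

def parse_filter(expr):
    """Split a filter string into its sign, term, and field."""
    match = FILTER_RE.match(expr.strip())
    if match is None:
        raise ValueError(f"unrecognized filter: {expr!r}")
    sign, term, field = match.groups()
    # Assumption: '+' keeps matching records, '-' drops them.
    return {"include": sign == "+", "term": term, "field": field}

def apply_filter(records, expr):
    """Keep (or drop) records whose field contains the term, ignoring case."""
    parsed = parse_filter(expr)
    term = parsed["term"].lower()
    field = parsed["field"]

    def matches(record):
        return term in str(record.get(field, "")).lower()

    return [r for r in records if matches(r) == parsed["include"]]

if __name__ == "__main__":
    # Made-up records for illustration.
    records = [
        {"title": "Paper A", "author": "Clark SG"},
        {"title": "Paper B", "author": "Sternberg PW"},
    ]
    print(apply_filter(records, "+Clark[author]"))
```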
32
Filtered Results
33
Turning on Advanced Search Options
34
Advanced Search Options
field of document to search
what scope of document to search
how to sort documents
exclude documents
optional filters
search mode
35
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
36
Textpresso: Beyond the Nematode...
FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous
37
Textpresso: Beyond the Model Organisms
38
Textpresso for Neuroscience
category list
39
Neuroscience Categories
category list
adds neurobiology categories
40
Neuroscience Query Example
cocaine, synapse, binding
41
Pharmspresso
42
Pharmspresso Query
43
Pharmspresso Results
44
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org
45
Textpresso: Text Processing System
paper: Sentence 1, Sentence 2, Sentence 3
Word: ontology category tags
Information Retrieval: searches full-text
+
Information Extraction: marks up sentences and words
46
Textpresso: Mine the Biological Literature
Search using keywords + categories
Retrieval = pertinent documents
Extract = matching sentences
anatomy, molecular function, regulation, keyword, other genes
47
Textpresso: Different Site, Same Interface
Learn it once. Use it everywhere.
48
Textpresso: Beyond Organisms
neurobiology
pharmacogenetics
49
Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso: http://www.textpresso.org