Copyright OpenHelix. No use or reproduction without express written consent1
Textpresso: A Text-Mining System for Biological Literature
Materials prepared by: Mary E. Mangan, Ph.D.
Updated: Q
Version 1.0

Textpresso Agenda
Introduction and Credits; Text-Mining; Layout; Basic Search; Advanced Search; Tours; Summary; Exercises
Textpresso:

Textpresso Introduction
Textpresso was developed at WormBase
open-source text-mining software
ontology-based information retrieval and extraction system
initially used by WormBase biocurators
now applicable to other bodies of text

Text-Mining: Separating Wheat from Chaff
Question: "but I just want papers for Hoxd11"
[Diagram: Scientific Literature corpus, Text-Mining with keyword = Hoxd11, result: Hoxd11 papers]

Results Example
Keyword: lin-12

Textpresso: Beyond the Nematode...
FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous

Textpresso Homepage
About

Information Retrieval and Extraction
Information Retrieval: the query Hoxd11 retrieves documents from the corpus.
Information Extraction: the query Hoxd11 extracts facts from the corpus:
Hoxd11 is expressed in the limb bud.
Pax6 regulates Hoxd11 transcription.
The nucleus contains Hoxd11 protein.
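
The difference between retrieving documents and extracting facts can be sketched in a few lines of Python. This is only an illustration, not how Textpresso is implemented: the document IDs, the grouping of sentences into documents, and the Pax6-only sentence are invented for the example, and the function names `retrieve` and `extract` are hypothetical. The three Hoxd11 sentences are the ones shown on the slide.

```python
import re

# Toy corpus: document ID -> list of sentences.
# The grouping into doc1..doc3 and the doc3 sentence are made up for illustration.
corpus = {
    "doc1": [
        "Hoxd11 is expressed in the limb bud.",
        "Pax6 regulates Hoxd11 transcription.",
    ],
    "doc2": ["The nucleus contains Hoxd11 protein."],
    "doc3": ["Pax6 is required for eye development."],
}


def mentions(keyword, sentence):
    """True if the keyword occurs as a whole token (case-sensitive)."""
    pattern = r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])"
    return re.search(pattern, sentence) is not None


def retrieve(keyword):
    """Information retrieval: IDs of documents that mention the keyword."""
    return [
        doc_id
        for doc_id, sentences in corpus.items()
        if any(mentions(keyword, s) for s in sentences)
    ]


def extract(keyword):
    """Information extraction (simplified): the matching sentences themselves."""
    return [
        s
        for sentences in corpus.values()
        for s in sentences
        if mentions(keyword, s)
    ]


if __name__ == "__main__":
    print(retrieve("Hoxd11"))  # document IDs
    print(extract("Hoxd11"))   # sentences stating something about Hoxd11
```

Retrieval answers "which papers should I read?", while extraction hands back the individual statements, which is the distinction the slide draws.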

Two Major Elements of Textpresso
1. Information Retrieval: titles, abstracts, and full-text (not titles and abstracts only)
+
2. Information Extraction: sentences are marked up with ontology-based categories
Improves Results

How Textpresso Works
TEXTPRESSO = Text Processing System
A paper is split into sentences (Sentence 1 through Sentence 6), and sentences into words.
Words are marked with ontology category tags: association, molecular function, phenotype, anatomy, etc...
Query = phenotype + anatomy matches Sentence 1 and Sentence 4, which form the extract.
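
The slide's idea of tagging sentences with ontology categories and then querying by category can be sketched as follows. This is a minimal sketch under stated assumptions, not Textpresso's actual code: `LEXICON`, `tag_sentence`, and `query` are hypothetical names, the phenotype terms and all five sentences are invented for illustration, and only the anatomy terms (cuticle, dorsal cord, M1 neuron) come from the slides.

```python
import re

# Hypothetical mini-lexicon: category -> terms (lowercase).
# Anatomy terms are taken from the slides; phenotype terms are made up.
LEXICON = {
    "anatomy": {"cuticle", "dorsal cord", "m1 neuron"},
    "phenotype": {"uncoordinated", "sterile"},
}


def tag_sentence(sentence):
    """Return the set of categories that have at least one term in the sentence."""
    text = sentence.lower()
    return {
        category
        for category, terms in LEXICON.items()
        if any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)
    }


def query(sentences, required_categories):
    """Keep only sentences tagged with every required category."""
    required = set(required_categories)
    return [s for s in sentences if required <= tag_sentence(s)]


# Invented example "paper", one string per sentence.
paper = [
    "Animals with a damaged cuticle are often uncoordinated.",  # Sentence 1
    "The gene encodes a protein kinase.",                       # Sentence 2
    "Expression was observed in the dorsal cord.",              # Sentence 3
    "Sterile adults showed defects in the M1 neuron.",          # Sentence 4
    "Sterile animals were recovered at low frequency.",         # Sentence 5
]

if __name__ == "__main__":
    for sentence in query(paper, ["phenotype", "anatomy"]):
        print(sentence)
```

With this toy data the phenotype + anatomy query keeps the first and fourth sentences, mirroring the "Sentence 1, Sentence 4" result on the slide.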

Textpresso Ontology Categories grouped by Topic
Biological Concepts: allele, anatomy, developmental stage, phenotype, molecular function (GO), biological process (GO), etc...
Description: characterization, effect, localization, method, pathway, purpose, etc...
Relationships: association, comparison, consort, involvement, regulation, spatial relation, etc...
Example: anatomy = cuticle, dorsal cord, M1 neuron, etc...

Similar Textpresso Sites
hosted on Textpresso server at Caltech
hosted by the groups using them
And more!

Textpresso for C. elegans
corpus

Document Finder
WBPaper: WormBase paper ID
document is in the corpus

Ontology Browser
tail

Category Lists
pick-list
all the terms for “association” in lexicon

Query Language

Simple Search
enter search term: let-23

let-23 Results
Deselect “Matching Sentences” for list overview

Document Details
click
keyword highlighted

Matching Sentences

Supplemental Document Links

Adding Categories to Your Search
let-23

Detailed Query

Detailed Query Results
keyword; anatomy; molecular function; regulation; other genes

Filtering the Results
+Clark[author]
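
To make the filter string `+Clark[author]` concrete, here is a small sketch of how such a token could be parsed and applied to a result list. The slide shows only the `+term[field]` form; treating a leading `-` as "exclude" is an assumption, as are the function names `parse_filter` and `apply_filter` and the made-up result records. It is not a description of Textpresso's own filter code.

```python
import re

# sign, term, field as in "+Clark[author]"; the "-" (exclude) form is assumed.
FILTER_RE = re.compile(r"^([+-])(.+)\[(\w+)\]$")


def parse_filter(token):
    """Split a filter token into (include, term, field)."""
    match = FILTER_RE.match(token.strip())
    if match is None:
        raise ValueError(f"not a filter token: {token!r}")
    sign, term, field = match.groups()
    return sign == "+", term, field


def apply_filter(records, token):
    """Keep records whose field contains the term ('+') or lacks it ('-')."""
    include, term, field = parse_filter(token)

    def has_term(record):
        # Simple case-insensitive substring test on the chosen field.
        return term.lower() in record.get(field, "").lower()

    return [r for r in records if has_term(r) == include]


# Invented search results for illustration only.
results = [
    {"title": "Paper A", "author": "Clark; Smith"},
    {"title": "Paper B", "author": "Jones"},
]

if __name__ == "__main__":
    print(apply_filter(results, "+Clark[author]"))
```

Because the match is a plain substring test, a term such as Clark would also match a longer name that contains it; a real filter would likely be stricter.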

Filtered Results

Turning on Advanced Search Options

Advanced Search Options
field of document to search
what scope of document to search
how to sort documents
exclude documents
optional filters
search mode
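
Two of the options listed above, search scope and sort order, can be illustrated with a short sketch. Everything here is an assumption made for the example: the `SearchOptions` class, the option values ("sentence" or "document", "hits" or "year"), and the two invented documents. The actual option names and values are whatever the Textpresso interface offers.

```python
from dataclasses import dataclass


@dataclass
class SearchOptions:
    scope: str = "sentence"  # hypothetical values: "sentence" or "document"
    sort_by: str = "hits"    # hypothetical values: "hits" or "year"


def search(docs, terms, opts):
    """Return IDs of documents where all terms co-occur within the chosen scope."""
    wanted = [t.lower() for t in terms]
    scored = []
    for doc in docs:
        if opts.scope == "sentence":
            units = doc["sentences"]              # terms must share a sentence
        else:
            units = [" ".join(doc["sentences"])]  # terms may be anywhere in the document
        hits = sum(all(t in unit.lower() for t in wanted) for unit in units)
        if hits:
            scored.append((doc, hits))
    if opts.sort_by == "year":
        scored.sort(key=lambda pair: pair[0]["year"], reverse=True)
    else:
        scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc["id"] for doc, _ in scored]


# Invented documents for illustration only.
docs = [
    {"id": "A", "year": 2001,
     "sentences": ["let-23 acts in the vulva.", "Signaling was reduced."]},
    {"id": "B", "year": 2005,
     "sentences": ["let-23 is a receptor.", "The vulva forms late."]},
]

if __name__ == "__main__":
    print(search(docs, ["let-23", "vulva"], SearchOptions(scope="sentence")))
    print(search(docs, ["let-23", "vulva"], SearchOptions(scope="document", sort_by="year")))
```

In this toy data, sentence scope is the stricter setting: document B mentions both terms but never in the same sentence, so it only appears when the scope is widened to the whole document.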

Textpresso: Beyond the Nematode...
FUNGI: N. crassa, A. nidulans, M. grisea, Cryptococcus, Filamentous

Textpresso: Beyond the Model Organisms

Textpresso for Neuroscience
category list

Neuroscience Categories
category list
adds neurobiology categories

Neuroscience Query Example
cocaine; synapse; binding

Pharmspresso

Pharmspresso Query

Pharmspresso Results

Textpresso: Text Processing System
Information Retrieval + Information Extraction
searches full-text
marks up sentences and words with ontology category tags
[Diagram: paper split into Sentence 1, Sentence 2, Sentence 3; sentences split into words]

Textpresso: Mine the Biological Literature
Search using keywords + categories
Retrieval = pertinent documents
Extract = matching sentences
keyword; anatomy; molecular function; regulation; other genes

Textpresso: Different Site, Same Interface
Learn it once. Use it everywhere.

Textpresso: Beyond Organisms
neurobiology
pharmacogenetics