Literature Mapping with PubAtlas -- extending PubMed with a `BLASTing interface’ D Stott Parker 1, WW Chu 1, FW Sabb 3, AW Toga 2, RM Bilder 3 1 UCLA Computer.

Presentation transcript:

Literature Mapping with PubAtlas -- extending PubMed with a ‘BLASTing interface’
D Stott Parker (1), WW Chu (1), FW Sabb (3), AW Toga (2), RM Bilder (3)
(1) UCLA Computer Science Dept, (2) Laboratory of Neuroimaging, (3) Dept of Psychiatry & Biobehavioral Sciences
Hypothesis Web Project, NIH RL1LM009833

PubAtlas is a “PubMed BLAST-query” service for two term sets (lexica).
Result: a contingency table for all queries (X AND Y), where X and Y are terms in the two lexica.
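To make the (X AND Y) table concrete, here is a minimal sketch of how the cell queries could be composed from two lexica. The function names `conjoin` and `association_queries` are hypothetical and are not part of PubAtlas; the sketch only builds query strings and does not contact PubMed.

```python
from itertools import product
from typing import Dict, Tuple


def conjoin(query_x: str, query_y: str) -> str:
    """Combine two PubMed query strings into a single (X AND Y) query."""
    return f"({query_x}) AND ({query_y})"


def association_queries(
    lexicon_x: Dict[str, str], lexicon_y: Dict[str, str]
) -> Dict[Tuple[str, str], str]:
    """Return one conjoined query per (X term, Y term) cell of the table."""
    return {
        (name_x, name_y): conjoin(query_x, query_y)
        for (name_x, query_x), (name_y, query_y) in product(
            lexicon_x.items(), lexicon_y.items()
        )
    }


# Example lexica: term name -> PubMed query (entries taken from the slides).
tools = {"PubMatrix": '"PubMatrix" [TIAB]', "HubMed": '"HubMed" [TIAB]'}
features = {"friendly": '"PubMed" [TIAB] AND ("friendly" [TIAB] OR "flexible" [TIAB])'}

cells = association_queries(tools, features)
```

Each key of `cells` names one cell of the contingency table, and each value is the query whose hit count would fill that cell.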

PubAtlas Lexica: term: definition pairs
Term Name : PubMed Query
optional hierarchical structure
Lexicon as:
• concept base
• ontology
• user-defined term hierarchy (personalized MeSH hierarchy)
• domain-specific query language
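A lexicon of "Term Name : PubMed Query" pairs maps naturally onto a dictionary. The sketch below assumes a plain-text layout with one pair per line and `#` for comments; that file layout and the name `parse_lexicon` are illustrative assumptions, not the format PubAtlas itself uses, and the optional hierarchy is not modeled.

```python
from typing import Dict


def parse_lexicon(text: str) -> Dict[str, str]:
    """Parse 'Term Name: PubMed Query' lines into a name -> query mapping.

    Assumed layout (hypothetical): one pair per line, blank lines and
    lines starting with '#' ignored, split at the first colon only.
    """
    lexicon: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, separator, query = line.partition(":")
        name, query = name.strip(), query.strip()
        if not separator or not name or not query:
            raise ValueError(f"not a 'term: query' pair: {raw_line!r}")
        lexicon[name] = query
    return lexicon


sample = """
# previous PubMed extensions
GoPubMed: "GoPubMed" [TIAB]
VisualNet: "VisualNet" [TIAB] OR "Visual Net" [TIAB]
"""
lexicon = parse_lexicon(sample)
```

Splitting at the first colon only keeps any later colons inside the query text intact.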

[Figure: Lexicon 1 = X hierarchy; Lexicon 2 = Y hierarchy; Literature Map: (X AND Y) association table]
MEDLINE / PubMed as a bioscience association base
PubAtlas ‘Concept BLASTing’ seeks useful associations, much like microarray analysis

Previous PubMed extensions:
AliBaba: "AliBaba" [TIAB] AND "PubMed" [TIAB]
Anne O'Tate: "Anne O'Tate" [TIAB]
BioIE: "BioIE" [TIAB]
ClusterMed: "ClusterMed" [TIAB]
ConceptLink: "ConceptLink" [TIAB]
GoPubMed: "GoPubMed" [TIAB]
HubMed: "HubMed" [TIAB]
PubFocus: "PubFocus" [TIAB]
PubGene: "PubGene" [TIAB]
PubMatrix: "PubMatrix" [TIAB]
PubMed Assistant: "PubMed Assistant" [TIAB]
PubNet: "PubNet" [TIAB]
PubReMiner: "PubReMiner" [TIAB]
Relemed: "Relemed" [TIAB]
SLIM: "Muin M" [au] AND "SLIM" [TIAB]
VisualNet: "VisualNet" [TIAB] OR "Visual Net" [TIAB]
XplorMed: "XplorMed" [TIAB]

Desirable extension features:
graph: "PubMed" [TIAB] AND ("graph" [TIAB] OR "network" [TIAB] OR "diagram" [TIAB])
visual: "PubMed" [TIAB] AND ("visual" [TIAB] OR "visualizing" [TIAB] OR "visualization" [TIAB] …)
friendly: "PubMed" [TIAB] AND ("friendly" [TIAB] OR "flexible" [TIAB])
better interface: "PubMed" [TIAB] AND ("interface" [TIAB] OR "interaction" [TIAB] OR "query" [TIAB]) …)
exploration: "PubMed" [TIAB] AND ("exploration" [TIAB] OR "explore" [TIAB] OR "discovery" [TIAB] …)
summarization: "PubMed" [TIAB] AND (summariz* [TIAB] OR digest* [TIAB])
map: "PubMed" [TIAB] AND ("mapping" [TIAB] OR "map" [TIAB] OR "mapped" [TIAB])
extraction: "PubMed" [TIAB] AND (extract* [TIAB] OR identif* [TIAB])
relevance: "PubMed" [TIAB] AND ("relevance" [TIAB] OR "ranking" [TIAB] OR "ordering" [TIAB])
powerful: "PubMed" [TIAB] AND ("powerful" [TIAB] OR "extended" [TIAB] OR "advanced" [TIAB])

Semi-automated generation of a review paper -- but thorough and remaining up-to-date

• PubAtlas as a tool for concept “BLASTing”
• Moving towards shared, user-defined query/concept languages
• Visual literature search with concept maps / literature maps
• Building on familiar association mining metaphor
• Extending PubMed with temporal indexing / concept evolution
• Real uses: semi-automated reviews, knowledge mgmt,...
• Applications in Phenomics
• Phenotypes are often naturally represented as queries
• Promising applications in interdisciplinary collaboration

Who at UCLA works on Dopamine Receptors?
Many possibilities for interdisciplinary collaboration

Lori Altshuler: Altshuler Lori [FAU] OR Altshuler LL [AU]
Stephen Marder: Marder Stephen [FAU] OR Marder SR [AU]
Carrie Bearden: Bearden Carrie [FAU] OR Bearden CE [AU]
Ty Cannon: Cannon Tyrone [FAU] OR Cannon TD [AU]
Michael Phelps: Phelps Michael [FAU] OR Phelps ME [AU]
John Mazziotta: Mazziotta John [FAU] OR Mazziotta J [AU]
Paul Thompson: Thompson Paul M [FAU] OR Thompson PM [AU]
Arthur Toga: Toga Arthur [FAU] OR Toga A [AU]
Roger Woods: Woods Roger [FAU] OR Woods RP [AU]
Bob Bilder: Bilder Robert [FAU] OR Bilder RM [AU]
Nelson Freimer: Freimer Nelson [FAU] OR Freimer N [AU]

Map of publications in which people X, Y both occur as authors

Historical map of interdisciplinary collaboration at UCLA over 10 yrs

Visualization and interaction along with standard mining of association data

For term sets of sizes M and N, PubAtlas submits M+N PubMed queries.
This can scale to hundreds or thousands of terms.
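The slide does not say how an M x N table is filled from only M+N queries. One plausible reading, assumed here, is that each term's query is run once to obtain a set of article IDs, and every cell is then computed locally as a set intersection. In the sketch below, `fetch_ids` is a hypothetical callable standing in for whatever actually runs a PubMed query; the fake version exists only so the example runs offline.

```python
from typing import Callable, Dict, Set, Tuple


def build_table(
    lexicon_x: Dict[str, str],
    lexicon_y: Dict[str, str],
    fetch_ids: Callable[[str], Set[int]],
) -> Dict[Tuple[str, str], int]:
    """Fill the (X AND Y) count table using one query per term.

    fetch_ids is a hypothetical function mapping a query string to the
    set of matching article IDs; it is called M + N times in total.
    """
    ids_x = {name: fetch_ids(query) for name, query in lexicon_x.items()}  # M calls
    ids_y = {name: fetch_ids(query) for name, query in lexicon_y.items()}  # N calls
    # Every cell is a local intersection, so no further queries are needed.
    return {(x, y): len(ids_x[x] & ids_y[y]) for x in ids_x for y in ids_y}


# Offline stand-in for a real search backend, with made-up article IDs.
FAKE_INDEX = {"qa": {1, 2, 3}, "qb": {3, 4}, "qc": {2, 3}, "qd": {9}, "qe": set()}
calls = []


def fake_fetch(query: str) -> Set[int]:
    calls.append(query)
    return set(FAKE_INDEX[query])


table = build_table({"A": "qa", "B": "qb"}, {"C": "qc", "D": "qd", "E": "qe"}, fake_fetch)
```

With M = 2 and N = 3 the stand-in backend is called five times while six cells are filled; the gap between M+N and M x N widens quickly as the lexica grow.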

Diverse, complex phenotypes can be represented as queries (predicates) -- denoting the set of all relevant documents
PubMed / MEDLINE = central phenomics database

Query Expansion -- for Phenotypes
• Queries (like “n-back test”) can be expanded with terms related to their target concept (like working memory), using statistical models to identify better expansions.
• Expansion can improve precision and recall of queries that are being used as models of concepts/phenotypes.

Related tasks: N-back, Wisconsin card sorting, Sternberg, Stroop, choice reaction time, …, paced auditory serial addition

Original query: "nback"
Expanded query: ("nback" OR "n-back" OR "wisconsin card sorting" OR "sternberg" OR "working memory capacity" OR "stroop" OR "choice reaction time" OR "paced auditory serial addition" OR "pasat" OR "digit span" OR "delayed match to sample")
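As a rough illustration of the final assembly step only, the sketch below joins a seed term and a list of related terms into one OR query of quoted phrases. The name `expand_query` is hypothetical, and the statistical model that would choose the related terms is not shown; the terms here are simply supplied by hand.

```python
from typing import Iterable, List


def expand_query(seed: str, related_terms: Iterable[str]) -> str:
    """Build an OR query from a seed term plus related terms.

    Terms are quoted as phrases; duplicates (ignoring case) are dropped
    while the original order is preserved.
    """
    unique: List[str] = []
    seen = set()
    for term in [seed, *related_terms]:
        key = term.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(term.strip())
    return "(" + " OR ".join(f'"{term}"' for term in unique) + ")"


expanded = expand_query("nback", ["n-back", "stroop", "Stroop", "digit span"])
```

Here `expanded` keeps the seed first, so the original query remains visible inside its expansion.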

• PubAtlas as a tool for concept “BLASTing”
• Lexica are concept bases / user-defined query languages
• PubAtlas constructs concept maps / literature maps
• Extends PubMed with temporal indexing
• Multiple features for exploration, visualization
• Real uses: semi-automated reviews, who is doing what,...
• Many interesting directions for further work
• Applications in Phenomics
• Phenotypes are often naturally represented as queries
• Promising applications in interdisciplinary collaboration