UIC at TREC 2007: Genomics Track Wei Zhou, Clement Yu University of Illinois at Chicago Nov. 8, 2007.

Overview of the system: Question analyzer → IR query → Document retriever → Documents → Passage retriever → Passages

Interpretation of a query: Q: “What [ANTIBODIES] have been used to detect protein TLR4?” A TARGET is any instance of the entity type (here, ANTIBODIES). The QUALIFICATION is the condition that a target must satisfy in order to qualify as an answer to the query.

Two types of queries: Type I: candidate targets can be found in existing resources, e.g. UMLS and Entrez Gene. Type II: no resources are available for finding candidate targets, e.g. “ANTIBODIES”, “PATHWAYS”, etc.

IR models for Type I queries Observation: different passages may have different qualified targets. Strategy I: The more important qualified targets a passage has, the higher it is ranked (the importance is measured by idf). Strategy II: Passages having at least one qualified target are ranked equally.
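The two strategies can be contrasted with a minimal Python sketch. It is an illustration, not the authors' implementation: the function names, the toy document frequencies, and the choice to sum idf values in Strategy I are assumptions (the slide says only that importance is measured by idf).

```python
import math

def idf(term, doc_freq, num_docs):
    """Inverse document frequency: rarer targets get a higher weight."""
    return math.log(num_docs / doc_freq[term])

def score_strategy_1(qualified_targets, doc_freq, num_docs):
    """Strategy I (assumed form): sum the idf of each distinct qualified target."""
    return sum(idf(t, doc_freq, num_docs) for t in set(qualified_targets))

def score_strategy_2(qualified_targets):
    """Strategy II: every passage with at least one qualified target scores the same."""
    return 1 if qualified_targets else 0

# Hypothetical toy statistics, for illustration only.
doc_freq = {"TLR4": 10, "IL6": 100}
num_docs = 1000
passage_a = ["TLR4", "IL6"]  # two qualified targets
passage_b = ["IL6"]          # one qualified target

s1_a = score_strategy_1(passage_a, doc_freq, num_docs)
s1_b = score_strategy_1(passage_b, doc_freq, num_docs)
s2_a = score_strategy_2(passage_a)
s2_b = score_strategy_2(passage_b)
```

Under Strategy I, passage_a outranks passage_b; under Strategy II the two tie and would be ordered by whatever other evidence the ranker uses.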

IR models for Type I queries: Strategy I: [formula not preserved in the transcript] Strategy II: [formula not preserved in the transcript]

A conceptual IR model sim(q,d2)>sim(q,d1) if either 1 or 2: (Liu 2004)
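The two conditions did not survive in this transcript. One plausible reading, offered purely as an assumption and not confirmed by the slide, is a two-level comparison in which each document's similarity is a pair (concept similarity, word similarity) and the pairs are compared lexicographically. The sketch below illustrates that assumed reading; the pair layout and the function name are hypothetical.

```python
def ranks_higher(sim_d2, sim_d1):
    """Return True if d2 outranks d1 under the ASSUMED two-level comparison.

    Each argument is a (concept_sim, word_sim) pair. Assumed conditions:
      1. d2 has the higher concept similarity, or
      2. the concept similarities are equal and d2 has the higher word similarity.
    """
    concept2, word2 = sim_d2
    concept1, word1 = sim_d1
    if concept2 > concept1:
        return True
    return concept2 == concept1 and word2 > word1
```

With this reading, word-level evidence only breaks ties between documents that match the query concepts equally well.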

Result: Strategy I vs. Strategy II

IR models for Type II queries: Query expansion with representative words of the entity type. Representative words: words that co-occur with the entity type in the same sentences of PubMed abstracts much more often than would be expected if the two were independent. The top 15 representative words were added to the query. For example, ANTIBODIES: “antigen”, “humanized”, “neutralizing”, “vaccine”, “cell”, “immune”, “protein”, “assay”, “immunogenicity”, “vaccination”, etc. Similarity measure:
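A minimal sketch of how representative words might be selected is given below. The slide does not state the exact statistic, so the ratio of observed to expected sentence-level co-occurrence used here is an assumption, as are the function name, the tokenized toy sentences, and the tie-breaking rule.

```python
from collections import Counter

def representative_words(sentences, entity_type, top_k=15):
    """Rank words by observed / expected co-occurrence with entity_type.

    sentences: list of token lists. The expected count assumes the word and
    the entity type occur independently across sentences (an assumed statistic).
    """
    n = len(sentences)
    word_count = Counter()   # sentences containing each word
    joint_count = Counter()  # sentences containing both the word and the entity type
    entity_count = 0         # sentences containing the entity type
    for tokens in sentences:
        unique = set(tokens)
        has_entity = entity_type in unique
        if has_entity:
            entity_count += 1
        for w in unique:
            if w == entity_type:
                continue
            word_count[w] += 1
            if has_entity:
                joint_count[w] += 1
    if entity_count == 0:
        return []
    scores = {}
    for w, joint in joint_count.items():
        expected = entity_count * word_count[w] / n
        scores[w] = joint / expected
    # Highest ratio first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]

# Hypothetical toy corpus, for illustration only.
sentences = [
    ["antibodies", "neutralizing", "antigen"],
    ["antibodies", "antigen", "cell"],
    ["cell", "membrane"],
    ["cell", "growth"],
]
top_words = representative_words(sentences, "antibodies", top_k=2)
```

In the toy corpus, "cell" co-occurs with "antibodies" less often than independence would predict, so it ranks below "antigen" and "neutralizing".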

Result: candidate targets vs. representative words. The representative words of an entity type are useful: they capture the context in which a target of that entity type appears.

Passage extraction: A two-step rationale: Step 1: Given windows of different sizes, choose the ones that contain the maximum number of query concepts in the smallest number of sentences. Step 2: Merge two candidate windows if they are exactly adjacent to each other.
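The two steps can be sketched in Python as follows. This is an illustrative reading rather than the authors' code: windows are assumed to be contiguous runs of sentences, "exactly adjacent" is taken to mean that one window ends where the next begins, and the names and the max_size parameter are hypothetical.

```python
def candidate_windows(sentence_concepts, max_size=3):
    """Step 1: keep the windows with the most distinct query concepts,
    and among those, the ones spanning the fewest sentences.

    sentence_concepts: list of sets, the query concepts found in each sentence.
    Returns (start, end) sentence index pairs, end exclusive.
    """
    n = len(sentence_concepts)
    scored = []
    for start in range(n):
        for end in range(start + 1, min(start + max_size, n) + 1):
            concepts = set().union(*sentence_concepts[start:end])
            scored.append((len(concepts), end - start, start, end))
    if not scored:
        return []
    best_concepts = max(s[0] for s in scored)
    if best_concepts == 0:
        return []
    best_len = min(s[1] for s in scored if s[0] == best_concepts)
    return [(s[2], s[3]) for s in scored
            if s[0] == best_concepts and s[1] == best_len]

def merge_adjacent(windows):
    """Step 2: merge windows where one ends exactly where another begins."""
    merged = []
    for start, end in sorted(set(windows)):
        for i, (m_start, m_end) in enumerate(merged):
            if m_end == start:
                merged[i] = (m_start, end)
                break
        else:
            merged.append((start, end))
    return sorted(merged)

# Hypothetical example: concepts matched in each of five sentences.
sentence_concepts = [{"TLR4"}, set(), {"antibody", "TLR4"}, {"antibody", "TLR4"}, set()]
candidates = candidate_windows(sentence_concepts)
passages = merge_adjacent(candidates)
```

Here sentences 2 and 3 each cover both concepts on their own, so they are the candidates, and because they are adjacent they are merged into a single two-sentence passage.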

Experience of using UMLS: 1. Some candidate targets retrieved from UMLS are very common terms and are not “real” instances of the entity type.

Experience of using UMLS: 2. Some candidate targets are variants of the entities in the documents. A sample relevant passage: “In patients with cad, several small clinical trials suggest that cognitive-behavioural therapy successfully reduced anxiety and depression, and thus facilitated the modification of cardiac risk factors.” Sign or Symptom (UMLS): grade 1 anxiety, grade 2 anxiety, grade 3 anxiety, grade 4 anxiety, grade 5 anxiety.

Summary: 1. Ranking all passages that have at least one qualified target equally performed better than ranking them by the importance of the targets they contain. 2. A filtering step might be necessary when retrieving candidate targets from UMLS.

THANK YOU