Georg Buscher German Research Center for Artificial Intelligence (DFKI) Knowledge Management Department Kaiserslautern, Germany SIGIR 07 Doctoral Consortium.

Presentation transcript:

Georg Buscher German Research Center for Artificial Intelligence (DFKI) Knowledge Management Department Kaiserslautern, Germany SIGIR 07 Doctoral Consortium Attention-Based Information Retrieval

Motivation
• Magnetic Resonance Imaging uses magnetic fields and radio waves to produce high-quality two- or three-dimensional images of brain structures. Sensors read the frequencies of the radio waves, and a computer uses the information to construct an image of the brain (see 2).
• Positron Emission Tomography measures emissions from radioactively labeled, metabolically active chemicals that have been injected into the bloodstream. The emission data are computer-processed to produce 2- or 3-dimensional images of the distribution of the chemicals throughout the brain. Especially useful is the wide array of chemicals used to map different aspects of neurotransmitter activity (see 3).
• Homer's personality is one of frequent stupidity, laziness, and explosive anger. He also suffers from a short attention span, which complements his intense but short-lived passion for hobbies, enterprises and various causes. Furthermore, he is prone to emotional outbursts.
[Panel labels: 1, 2, 3]

Outline
• Acquiring attention evidence
  – Attention evidence through eye tracking
  – Attention annotation and derivation with Dempster-Shafer
• Applications in Information Retrieval
  – Attention-based TfIdf
  – Context elicitation
  – Context-based Index
  – Query Expansion / result re-ranking

Sources of Attention Data
• There are many indications of attention from the user.
[Figure labels: read; skimmed; longer viewed; Annotations (explicit); Reading evidence (implicit)]

Reading Detection – An Example

Attention Annotations Imply Different Levels of Attention
• Attention evidence values
• Range from 0 to 1
• Width of an interval expresses uncertainty
[Figure: example annotations with evidence intervals [0.7; 1.0], [0.5; 1.0], [0.2; 0.7], [1.0; 1.0], …]

Dempster-Shafer Combination of Attention Evidence
Example sentence, split into segments, with the annotation marks and the combined evidence interval of each segment (figure legend entry: read):
  [The demo … provide]        R        [0.5; 1]
  [different]                 R H      [0.85; 1]
  [visualizations]            R H U    [0.96; 1]
  [and interfaces]            R U      [0.85; 1]
  [according … situation.]    R        [0.5; 1]
Calculate one value of attention: att(t) = bel(t) - 0.2*bel(t) + 0.2*pl(t)
In that way, the function att provides an attention value for every term of the document:
  att_{different, d} = 0.88
  att_{according, d} = 0.6
  att_{somethingElse, d} = 0
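The combination step and the att formula above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes each piece of evidence is a [belief; plausibility] interval over the two hypotheses "attended" and "not attended", and it applies the standard Dempster rule of combination. The function names `combine` and `att` and the choice of [0.7; 1.0] for the second piece of evidence are assumptions made for the example.

```python
def combine(e1, e2):
    """Dempster's rule for two (belief, plausibility) intervals on the
    two-element frame {attended, not attended}."""
    bel1, pl1 = e1
    bel2, pl2 = e2
    # Masses: for "attended", for "not attended", and uncommitted.
    a1, n1, u1 = bel1, 1.0 - pl1, pl1 - bel1
    a2, n2, u2 = bel2, 1.0 - pl2, pl2 - bel2
    conflict = a1 * n2 + n1 * a2
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    norm = 1.0 - conflict
    attended = (a1 * a2 + a1 * u2 + u1 * a2) / norm
    not_attended = (n1 * n2 + n1 * u2 + u1 * n2) / norm
    return attended, 1.0 - not_attended


def att(interval, weight=0.2):
    """Single attention value from the slide: bel - 0.2*bel + 0.2*pl."""
    bel, pl = interval
    return bel - weight * bel + weight * pl


read = (0.5, 1.0)          # interval shown for a segment that was only read
annotation = (0.7, 1.0)    # assumed interval for one additional annotation
combined = combine(read, annotation)
print(combined, att(combined), att(read))
```

Under these assumptions, combining the two intervals gives a belief of about 0.85, and att maps [0.85; 1] and [0.5; 1] to about 0.88 and 0.6, which agrees with the values on the slide.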

Outline
• Acquiring attention evidence
  – Attention evidence through eye tracking
  – Attention annotation and derivation with Dempster-Shafer
• Applications in Information Retrieval
  – Attention-based TfIdf Desktop Index
  – Context elicitation
  – Context-based Index
  – Query Expansion

Attention-Based Desktop Index
• A desktop index is used especially for re-finding known documents.
• You can better remember those parts of a document that you paid attention to.
• Therefore, attended terms should be weighted higher.
• TfIdf-based modification
  – Attention is a local factor (like tf).
  – The higher the maximal intensity of an attended document part, the more weight should be assigned to the attention value.
  – The lower the maximal intensity of an attended document part, the more weight should be assigned to tf.
[Formula not preserved in the transcript; its two labeled components are an "attention part" and a "term frequency part".]
  tf_{t,d}: term frequency of term t in document d
  att_{t,d}: attention value of term t in document d
  α in [0; 1] is a balancing factor that defines the influence of attention relative to term frequency.
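The slide's formula did not survive in the transcript, so the following Python sketch is only one hypothetical weighting that satisfies the stated properties. It is not the author's formula. It assumes a convex blend of the attention value and a normalized term frequency, where the share given to attention is α times the maximal attention intensity in the document, and the blend is scaled by a standard logarithmic idf. The function name and all parameter names are made up for the example.

```python
import math


def attention_tfidf(tf, att, max_tf, max_att, df, n_docs, alpha=0.5):
    """Hypothetical attention-weighted tf-idf (not the formula from the slide).

    tf, att : term frequency and attention value of the term in the document
    max_tf  : highest term frequency in the document (for normalization)
    max_att : maximal attention intensity observed in the document
    df      : number of documents containing the term
    n_docs  : number of documents in the index
    alpha   : balancing factor in [0, 1]
    """
    if df <= 0 or n_docs <= 0:
        raise ValueError("df and n_docs must be positive")
    # Assumption: attention gets more influence when the document
    # contains strongly attended parts, and tf gets the remainder.
    share = alpha * max_att
    tf_norm = tf / max_tf if max_tf else 0.0
    idf = math.log(n_docs / df)
    return (share * att + (1.0 - share) * tf_norm) * idf


# Two terms with equal frequency; only the first one was attended.
attended = attention_tfidf(tf=2, att=0.88, max_tf=4, max_att=0.96, df=3, n_docs=100)
ignored = attention_tfidf(tf=2, att=0.0, max_tf=4, max_att=0.96, df=3, n_docs=100)
print(attended > ignored)
```

With `max_att = 0` the sketch reduces to a plain normalized tf-idf weight, which mirrors the statement that tf should dominate when no part of the document was attended strongly.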

Why Context? The Search for the Mental Model
• If a knowledge worker tries to recall something concerning a topic, does he primarily think
  – on the basis of documents and document structures or
  – on the basis of former thematic contexts?
• Rather the latter…
• While re-finding some information, one does not search primarily for the document, but for the former mental model. Documents mediate.

Elicitation and Representation of the Thematic Context
• Some read sub-documents
• Combination of the viewed sub-documents to one virtual context document (only those attended parts that have a thematic overlapping)
[Figure labels: Document 1 (Brain imaging); Document 2 (Brain imaging); Document 3 (The Simpsons); Document 4 (Brain imaging); thematic context: Brain imaging]

Determination of Thematical Overlapping
• Determine buzzwords for each viewed document by using
  – Attention value
  – Idf of desktop index
• Compare buzzword vector with previous context vectors
  – If there is a similarity, then merge with context vector
  – Else buzzword vector is a new context
[Figure labels: Previous contexts; Currently viewed document (part)]
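The matching procedure above can be sketched as follows. This is an illustration under stated assumptions, not the author's method: buzzwords are scored as attention value times idf, similarity is measured with cosine similarity, and a fixed threshold decides between merging and opening a new context. The slides specify none of these choices, and the names `buzzwords`, `cosine`, `assign_context` and the threshold value are hypothetical.

```python
import math


def buzzwords(att, idf, k=5):
    """Top-k terms of a viewed document (part), scored by attention * idf."""
    scores = {t: a * idf.get(t, 0.0) for t, a in att.items()}
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return {t: s for t, s in top if s > 0}


def cosine(u, v):
    """Cosine similarity of two sparse term vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_context(vector, contexts, threshold=0.3):
    """Merge the buzzword vector into the most similar previous context,
    or append it as a new context. Returns the index of the context used."""
    best, best_sim = None, 0.0
    for i, ctx in enumerate(contexts):
        sim = cosine(vector, ctx)
        if sim > best_sim:
            best, best_sim = i, sim
    if best is not None and best_sim >= threshold:
        for term, weight in vector.items():
            contexts[best][term] = contexts[best].get(term, 0.0) + weight
        return best
    contexts.append(dict(vector))
    return len(contexts) - 1


contexts = []
print(assign_context({"brain": 1.0, "imaging": 1.0}, contexts))
print(assign_context({"brain": 1.0, "mri": 1.0}, contexts))
print(assign_context({"homer": 1.0, "simpson": 1.0}, contexts))
```

In this run the second vector overlaps with the first and is merged into it, while the third shares no terms with it and becomes a separate context.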

Georg Buscher  Idea: two indexes 1. Term – Context2. Context – Document  A context is represented by a virtual context document  The value for each term–context relation is influenced by the degree of attention Context-Based Vector-Space Index  Common index structure Doc1 Doc2 Doc3 Term1 Term2 Term C1 C2 C3 Term1 Term2 Term3 Term Doc1 Doc2 C1 C2 C3 xxxx xxxx

New Kinds of Search Tasks Possible
• Local search: For the current task, find (parts of) documents that I formerly used for a similar task.
• Enterprise-wide search: For the current task, find (parts of) documents that I do not know yet, but that have been used by some colleague for a similar task.

Evaluation of the Context-Based Index
• The main advantage is expected to show up only after several weeks of use.
• It is not possible to run real-world eye tracking studies for such a long time.
• Artificial experiment:
  – Several different exploration tasks within some hours
  – Then some re-finding tasks about previously viewed content
  – Measuring the time or user satisfaction during the search process?
[Figure labels: Context-based search; Normal search]

Contextual Attention-Based Relevance Feedback
• Problem with context-based index: it doesn't scale for web search; therefore query expansion
• Current elicited context (i.e. term vector) expresses current interest of the user
• Topmost characteristic keywords will be used for query expansion
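The expansion step can be sketched as follows, assuming the current context is available as a term-to-weight mapping and that the topmost keywords are simply appended to the query string. The function name `expand_query`, the parameter `k`, and the example weights are hypothetical; how many keywords to add and how to weight them are not stated on the slide.

```python
def expand_query(query, context_vector, k=3):
    """Append the k highest-weighted context terms not already in the query."""
    present = {term.lower() for term in query.split()}
    ranked = sorted(context_vector.items(), key=lambda kv: kv[1], reverse=True)
    extra = [term for term, _ in ranked if term.lower() not in present][:k]
    return " ".join([query] + extra) if extra else query


context = {"brain": 0.9, "mri": 0.8, "tomography": 0.6, "neuron": 0.2}
print(expand_query("brain scan", context, k=2))  # brain scan mri tomography
```

Because only a short list of keywords leaves the desktop, this approach needs no context index on the search engine side, which is what makes it usable for web search.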

The Global Picture
[Diagram labels: Attention data generation module; Eye Tracker; Text Mark Recognition; Attention-annotated document; Attention-based desktop index; Context-based index; Context document; Query expansion for web search]
Thank you for your attention!