
User-Friendly Systems Instead of User-Friendly Front-Ends
Present user interfaces are not accepted because the underlying systems are too difficult to use. Natural language queries, matched using statistical probability formulas, are a possible solution. Four prototype systems were examined: PRISE, CITE, Muscat, and the News Retrieval Tool.

User-Friendly Systems Instead of User-Friendly Front-Ends
There are two major problems with Boolean systems:
– They require the user to enter AND and OR statements.
– The results are returned as an unordered list.
Statistical retrieval engines accept a natural language query and return a list ranked by relevance.

User-Friendly Systems Instead of User-Friendly Front-Ends
The basic theory is to compare each record in an information file to the user's query and estimate the likelihood of relevance. One of the first approaches tried was quorum (or coordinate) matching, which performed poorly. One of the most efficient ways of weighting terms is IDF (inverse document frequency), which gives more weight to terms that occur in fewer records.
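
The difference between coordinate matching and IDF weighting can be sketched in a few lines of Python. This is only an illustration: the function names, the toy documents, and the particular IDF variant (log2(N/n) + 1) are assumptions, not details taken from the presentation or from any of the four systems.

```python
import math
from collections import Counter


def idf_weights(documents):
    """Return an IDF weight for every term in a list of tokenized documents."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    # log2(N / n) + 1 is one common IDF variant, assumed here for illustration.
    return {term: math.log2(n_docs / df) + 1.0 for term, df in doc_freq.items()}


def coordinate_match(query, document):
    """Count the distinct query terms found in the document (all terms count equally)."""
    return len(set(query) & set(document))


# Hypothetical toy collection, already split into terms.
documents = [
    ["boolean", "retrieval", "system"],
    ["statistical", "retrieval", "system"],
    ["ranked", "retrieval", "results"],
    ["polar", "library", "system"],
]
weights = idf_weights(documents)
query = ["retrieval", "polar"]
```

With this data, coordinate matching gives the first and the last document the same score (one matching term each), while the IDF weights separate the rare term "polar" from the common term "retrieval".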

PRISE
The similarity of a document and a query is found by summing the weights of all matching terms. The only goal was to show that such a system could be implemented efficiently. Queries were entered in natural language. Results were returned in a ranked list.
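
Scoring by summed term weights might look like the following sketch. It is not PRISE's actual code: the function name, the document ids, and the term weights are invented for illustration, and the weights are assumed to have been computed beforehand (for example by IDF).

```python
def rank_documents(query_terms, documents, term_weights):
    """Rank documents by the summed weights of the query terms they contain."""
    scores = {}
    for doc_id, doc_terms in documents.items():
        matching = set(query_terms) & set(doc_terms)
        score = sum(term_weights.get(term, 0.0) for term in matching)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical precomputed weights and documents.
term_weights = {"polar": 3.0, "library": 3.0, "retrieval": 1.5, "system": 1.5}
documents = {
    "d1": ["boolean", "retrieval", "system"],
    "d2": ["polar", "library", "system"],
    "d3": ["ranked", "results"],
}
ranking = rank_documents(["polar", "retrieval", "system"], documents, term_weights)
```

Documents that share no term with the query are left out of the ranked list, so "d3" does not appear in `ranking`.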

PRISE
The test covered 5 different data sets. Over 40 users took part, two-thirds of whom had never or seldom used online retrieval systems. 9 of the test subjects used Boolean systems regularly, and 5 used them occasionally.

PRISE
53 of the 68 queries retrieved at least 1 relevant document in the top of these relevant documents were the first document retrieved. 10 of the users who had at one time used a Boolean search engine gave the system a favorable review.

CITE
Developed specifically to access Medline. Uses "free-text" queries against the titles and abstracts in Medline. Users marked the relevant documents among those returned, and the controlled vocabulary terms from those documents were then applied to a new search. No longer in use.
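
The feedback step described above could be sketched as follows. This is an assumption-laden illustration rather than CITE's implementation: the function name, the record ids, the sample terms, and the choice to keep the most frequent terms are all hypothetical.

```python
from collections import Counter


def feedback_query(relevant_ids, controlled_terms, max_terms=5):
    """Build a follow-up query from the controlled terms of records marked relevant."""
    counts = Counter()
    for doc_id in relevant_ids:
        # Count each term once per record, however often it is listed.
        counts.update(set(controlled_terms.get(doc_id, [])))
    # Most frequent terms first; ties are broken alphabetically (an assumed policy).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:max_terms]]


# Hypothetical records with their controlled vocabulary terms.
controlled_terms = {
    "rec1": ["Asthma", "Child", "Steroids"],
    "rec2": ["Asthma", "Steroids", "Adult"],
    "rec3": ["Diabetes", "Insulin"],
}
# The user marked rec1 and rec2 as relevant.
new_query = feedback_query(["rec1", "rec2"], controlled_terms, max_terms=3)
```

Terms shared by several marked records rise to the top of the new query, which is one plausible way to let the user's judgments steer the next search.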

Muscat
It is based on the probabilistic model by Sparck Jones, although its terms are weighted differently. It accepts various kinds of query input, from natural language to UDC for Polar Libraries. It can use a Boolean search in concert with the probabilistic model.
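
The presentation does not say how Muscat combines the two models. One possible arrangement, shown purely as an assumption, is to let a Boolean AND condition select the candidate documents and then rank the survivors by summed term weights. All names, documents, and weights below are hypothetical.

```python
def boolean_then_rank(required_terms, query_terms, documents, term_weights):
    """Keep documents containing every required term, then rank them by summed weights."""
    required = set(required_terms)
    results = []
    for doc_id, doc_terms in documents.items():
        terms = set(doc_terms)
        # Boolean AND filter: skip documents missing any required term.
        if not required <= terms:
            continue
        score = sum(term_weights.get(t, 0.0) for t in set(query_terms) & terms)
        results.append((doc_id, score))
    # Highest score first; ties are broken by document id.
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Hypothetical documents and weights.
documents = {
    "d1": ["arctic", "ice", "survey"],
    "d2": ["arctic", "wildlife"],
    "d3": ["ice", "core", "survey"],
}
term_weights = {"ice": 2.0, "survey": 1.0, "wildlife": 1.5}
results = boolean_then_rank(["arctic"], ["ice", "survey", "wildlife"], documents, term_weights)
```

Here "d3" matches two query terms but is dropped because it lacks the required term "arctic"; the remaining documents come back in ranked order.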

News Retrieval Tool
Uses a Macintosh-based interface in which users can add weight to their terms. Users can select relevant documents and then search again based on those documents. Users can also select key words from a document to become part of the search set.

Conclusion
These systems meet the requirement of providing high-quality results at real-time speeds. They allow novice users to quickly become successful searchers. By incorporating successful search engines that cater to the end user, these prototypes became user-friendly systems.