Interfaces for Querying Collections

Information Retrieval Activities
Selecting a collection
  – Lists, overviews, wizards, automatic selection
Submitting a request
  – Queries & expressiveness
  – Graphical interfaces
  – Natural language
Examining the response
  – Next class

Simple Query Interface

Complex Query Interface

Primary HCI Styles
Command language
Form filling
Menu selection
Direct manipulation
Natural language
Others?

Boolean Queries
Most commercial full-text retrieval systems (until recently) supported only Boolean queries.
Many studies show users have difficulty with Boolean expressions
  – AND and OR are not used as they are in English: “cats and dogs”, “tea or coffee”
  – Syntax specifying nesting is often cryptic
The Boolean model does not include ranking
  – Earlier systems used reverse chronological order
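A minimal sketch of unranked Boolean retrieval may make the mismatch with English concrete. The collection, dates, and function names below are hypothetical and chosen only for illustration; results are ordered newest first because the Boolean model itself provides no ranking.

```python
from datetime import date

# Hypothetical toy collection: doc id -> (publication date, text).
DOCS = {
    1: (date(2001, 3, 1), "cats and dogs make good pets"),
    2: (date(2003, 7, 9), "dogs chase cats"),
    3: (date(2002, 1, 5), "tea or coffee with breakfast"),
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, (_, text) in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index, *terms):
    """Documents containing every term (set intersection)."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(index, *terms):
    """Documents containing at least one term (set union)."""
    return set().union(*(index.get(term, set()) for term in terms))


def newest_first(docs, ids):
    """No relevance ranking exists, so fall back to reverse chronological order."""
    return sorted(ids, key=lambda doc_id: docs[doc_id][0], reverse=True)


INDEX = build_index(DOCS)
```

Here `boolean_and(INDEX, "cats", "tea")` is empty even though a person asking for “cats and tea” in English would probably expect documents about either topic, which is what `boolean_or` returns.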

Web-based Boolean Queries
Search engines based on Boolean or extended Boolean engines needed to make their systems usable by the Web audience
Reduce expressiveness for ease of use
  – Use “all the words” and “any of the words”
  – Boolean-based search engines added the + prefix
Ranking performed using statistical algorithms and Web-specific heuristics
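The simplified Web options can be sketched as a thin layer over Boolean matching. This is an illustration under stated assumptions, not the behavior of any particular search engine: the function names are hypothetical, and the rule that plain terms become optional once all + terms are satisfied is one plausible reading.

```python
def parse(query):
    """Split a query into +required terms and plain terms."""
    required, optional = set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token != "+":
            optional.add(token)
    return required, optional


def matches(text, query, mode="any"):
    """mode="all" behaves like AND over every term; mode="any" like OR.

    Assumption: in "any" mode, every +term must appear, and plain terms
    are only needed when the query has no +terms at all.
    """
    words = set(text.lower().split())
    required, optional = parse(query)
    if mode == "all":
        return (required | optional) <= words
    if not required <= words:
        return False
    return bool(required) or bool(optional & words)
```

For example, `matches("green tea and coffee", "tea milk")` succeeds under “any of the words” but fails with `mode="all"`, and `"+milk tea"` fails because the required term is missing.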

Command Line Search
Command line interfaces for search
Example queries from Melvyl:
  – FIND PA darwin and TW species or TW descent
  – FIND TW Mt St. Helens AND DATE 1981

Command Line Search Still in use …

Form and Menus Melvyl

Faceted Queries
Boolean queries often return too many or too few results
  – Conjunctions reduce sets too quickly
  – Disjunctions grow sets too quickly
Solution:
  – Try out smaller queries to see if they have an appropriately sized set of results
  – Combine the smaller queries that are successful into a larger query
Example:
  1. (osteoporosis OR “bone loss”)
  2. (drugs OR pharmaceuticals)
  3. (prevention OR cure)
  4. 1 AND 2 AND 3
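The try-small-then-combine procedure can be sketched as follows. The documents are invented for illustration, the facet terms are adapted from the example above, and plain substring matching stands in for a real index (an assumption made to keep the sketch short).

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    1: "osteoporosis drugs for prevention",
    2: "bone loss and pharmaceuticals",
    3: "a cure for bone loss using new drugs",
    4: "osteoporosis risk factors",
}

# Each facet is a disjunction (OR) of alternative ways to name one concept.
FACETS = [
    ["osteoporosis", "bone loss"],
    ["drugs", "pharmaceuticals"],
    ["prevention", "cure"],
]


def facet_hits(docs, facet):
    """Documents matching at least one term of the facet (substring match)."""
    return {
        doc_id
        for doc_id, text in docs.items()
        if any(term in text.lower() for term in facet)
    }


def combine(docs, facets):
    """AND the facets together: keep documents that match every facet."""
    result = set(docs)
    for facet in facets:
        result &= facet_hits(docs, facet)
    return result


# Step 1: check the result size of each small query on its own.
SIZES = [len(facet_hits(DOCS, facet)) for facet in FACETS]
# Step 2: combine the facets that returned a sensible number of results.
COMBINED = combine(DOCS, FACETS)
```

Inspecting `SIZES` before combining shows which facet is responsible when the final conjunction comes back empty or too large.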

Post-Coordinate or Quorum Ranking
Results are first ranked based on how many facets of the query they match.
Faceted search with quorum ranking allows specifying each concept in multiple ways yet ranking based on the number of concepts included in the document.
A further extension is to allow users to weight each facet.
  – Found on the web to help balance different goals of search (e.g. selecting a car or house)
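Quorum ranking with optional facet weights can be sketched in a few lines. The documents, facet terms, weights, and the `quorum_rank` name are all hypothetical; substring matching and tie-breaking by document id are simplifying assumptions.

```python
# Hypothetical documents, invented for illustration.
DOCS = {
    1: "osteoporosis drugs",
    2: "bone loss cure",
    3: "osteoporosis drugs prevention",
    4: "pharmaceuticals",
}

FACETS = [
    ["osteoporosis", "bone loss"],
    ["drugs", "pharmaceuticals"],
    ["prevention", "cure"],
]


def quorum_rank(docs, facets, weights=None):
    """Rank documents by the (weighted) number of facets they match.

    A facet counts once no matter how many of its alternative terms
    appear. Documents matching no facet are dropped; ties are broken
    by document id.
    """
    if weights is None:
        weights = [1.0] * len(facets)
    scores = {}
    for doc_id, text in docs.items():
        lowered = text.lower()
        score = sum(
            weight
            for facet, weight in zip(facets, weights)
            if any(term in lowered for term in facet)
        )
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Unweighted, document 1 ranks above document 2 on the tie-break; with `weights=[1, 1, 3]` the third facet dominates and document 2 moves ahead of document 1.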

Result Size Problem Occurs with Web Search Too

Graphical Query Specification
Graphical interfaces can be static, direct manipulation, or combine the two.
Direct manipulation
  – Continuous representation of objects
  – Physical actions replace complex syntax
  – Rapid incremental reversible operations on objects
  – Immediate feedback on actions

Graphical Boolean Queries
Graphical queries are more accurate and faster than command-line queries in some studies
Venn diagrams are a common graphical approach
  – Limit to three elements in conjunction
VQuery
  – Lets users draw ellipses to create their own queries

VQuery

Process-Based Graphs
Can graphically represent the query as a process of selection.
The filter-flow model presents a set of filters.
  – One attribute and set of potential values per filter; multiple values are treated as a disjunction
  – Branches in the flow indicate disjunctions
  – Serialized filters indicate conjunctions
Fewer errors made with filter-flow than with SQL
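The filter-flow semantics described above can be sketched as plain data transformations. The records, attribute names, and function names are hypothetical; a flow is modeled as a list of (attribute, accepted values) pairs, and branches as a list of flows.

```python
# Hypothetical records, invented for illustration.
RECORDS = [
    {"name": "Ann", "dept": "sales", "city": "Austin"},
    {"name": "Bob", "dept": "engineering", "city": "Boston"},
    {"name": "Cy", "dept": "sales", "city": "Boston"},
    {"name": "Di", "dept": "legal", "city": "Austin"},
]


def apply_filter(records, attribute, accepted):
    """One filter: one attribute, several accepted values (a disjunction)."""
    return [record for record in records if record[attribute] in accepted]


def run_flow(records, filters):
    """Filters placed in series: each narrows the previous output (a conjunction)."""
    for attribute, accepted in filters:
        records = apply_filter(records, attribute, accepted)
    return records


def run_branches(records, branches):
    """Parallel branches: a record passes if any branch lets it through."""
    return [
        record
        for record in records
        if any(run_flow([record], flow) for flow in branches)
    ]


def names(records):
    return [record["name"] for record in records]
```

For instance, `run_flow(RECORDS, [("dept", {"sales", "engineering"}), ("city", {"Boston"})])` reads as (dept is sales OR engineering) AND (city is Boston).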

Filter-Flow

Block-diagram Visualization
Users arrange blocks to specify a query.
STARS
  – Users initially type in a natural language query
  – Query terms are turned into blocks
  – Blocks are then arranged into a query
  – Blocks in the same row represent conjunction
  – Blocks in the same column represent disjunction
  – Allows for previewing the query results by simple rearrangement of blocks

STARS

Magic Lenses
Lenses act as filters on an overview visualization.
  – Disjunction is represented by independent lenses
  – Conjunction is expressed by placing multiple lenses over one another
  – Lenses can include additional information
      Where the term must appear
      Term frequency requirements
      Switches to use stemming
      …

Magic Lenses

Phrases and Proximity
Specifying phrases and proximity constraints can be used to vastly improve precision.
Phrase search is often used in the context of the Web.
  – But the phrase must be literal
  – “President Lincoln” does not match “President Abraham Lincoln”
Proximity constraints allow for more general queries
  – Example: in LEXIS-NEXIS, “white w/3 house” means “white within three words of house”
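The difference between literal phrase matching and a proximity constraint can be sketched with token positions. The function names are hypothetical, tokenization is a plain whitespace split, and "within k words" is interpreted here as a position difference of at most k, which is an assumption rather than the documented LEXIS-NEXIS rule.

```python
def positions(tokens, term):
    """All positions at which a term occurs in the token list."""
    return [i for i, token in enumerate(tokens) if token == term]


def phrase_match(text, phrase):
    """True only if the phrase words occur consecutively and in order."""
    tokens = text.lower().split()
    words = phrase.lower().split()
    if not words:
        return False
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def within(text, first, second, k):
    """True if the two terms occur at most k positions apart, in either order."""
    tokens = text.lower().split()
    return any(
        i != j and abs(i - j) <= k
        for i in positions(tokens, first.lower())
        for j in positions(tokens, second.lower())
    )
```

With these definitions, `phrase_match("President Abraham Lincoln spoke", "President Lincoln")` fails because the phrase is not literal, while `within("President Abraham Lincoln spoke", "president", "lincoln", 3)` succeeds.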

Natural Language and Free Text Queries
Many systems treat the question as a bag of words
Natural language processing can be used to try to better determine the information need.
  – Extract noun (and verb) phrases
  – Find noun (and verb) phrases in the same sentence
Ask.com uses sites preselected to answer particular question forms.
  – Need to recognize the type of question
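A rough sketch of both ideas follows: reducing a question to a bag of words, and guessing its form from the leading word. The stop-word list, the question-form table, and the function names are hypothetical and far simpler than real natural language processing; this is not a description of how Ask.com works.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {
    "a", "an", "the", "is", "of", "in", "did", "do", "does",
    "what", "who", "when", "where", "how",
}

# Hypothetical mapping from a leading question word to a question form.
QUESTION_FORMS = {
    "who": "person",
    "when": "date",
    "where": "place",
    "what": "definition",
    "how": "procedure",
}


def bag_of_words(question):
    """Discard word order and stop words; keep only term counts."""
    tokens = [token.strip("?.,!").lower() for token in question.split()]
    return Counter(token for token in tokens if token and token not in STOP_WORDS)


def question_type(question):
    """Guess the question form from its first word."""
    tokens = question.lower().split()
    first = tokens[0].strip("?.,!") if tokens else ""
    return QUESTION_FORMS.get(first, "unknown")
```

Because a bag of words ignores order, “dog bites man” and “man bites dog” produce the same bag, which is one reason phrase extraction can help pin down the information need.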

Ask.com