WikiQuery.org -- An interactive collaboration interface for creating, storing and sharing effective CNF queries Le Zhao*, Xiaozhong Liu #, Jamie Callan*

Presentation transcript:

WikiQuery.org -- An interactive collaboration interface for creating, storing and sharing effective CNF queries
Le Zhao*, Xiaozhong Liu#, Jamie Callan*
*: Language Tech Institute, SCS, Carnegie Mellon University
#: School of Information, University of Indiana
2012, Portland, OR

Status Quo
Current open source search engines
– Good at: attracting software apps/service providers (Lucene, Lemur, Terrier, …)
– Lacking: end users (needed to study users' search behavior)
Two necessary conditions for success:
– Attract users with a unique feature/functionality (not offered by current Web search engines)
– Retain users with a feature that is not easily copied by the current search engines

This Work
Unique opportunity
– The term mismatch problem
– Significant problem w/ huge potential [Zhao10,12]
– Web search engines: automatic expansion
– Lots of room for manual expansion
Solution
– Conjunctive normal form expansion
– Commonly & effectively used by expert searchers [Lancaster68, Harter86, Hearst96, Baron07]

Term Mismatch Problem
Average term mismatch rate: 30-40% [Zhao10]
A common cause of search failure [Harman03, Zhao10]
Frequent user frustration [Feild10]
More than % gain in retrieval accuracy [Zhao12]
[Figure: relevant docs not returned. Web, short queries, stemmed, inlinks included]
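To make the mismatch rate concrete, here is a minimal sketch. It assumes the rate of a query term is the fraction of relevant documents that do not contain that term; the toy documents and the function name `mismatch_rate` are hypothetical and not taken from the slides.

```python
def mismatch_rate(term, relevant_docs):
    """Fraction of relevant documents that do NOT contain the term."""
    if not relevant_docs:
        raise ValueError("need at least one relevant document")
    missing = sum(1 for doc in relevant_docs if term not in doc)
    return missing / len(relevant_docs)


# Hypothetical toy data: each relevant document is a set of tokens.
relevant_docs = [
    {"logo", "tv", "child", "guideline"},
    {"brand", "television", "kid"},
    {"logo", "television", "children"},
    {"mascot", "cable", "teen"},
]

# Mismatch rate for each term of a hypothetical keyword query.
rates = {t: mismatch_rate(t, relevant_docs)
         for t in ("logo", "television", "children")}
```

In this toy collection "children" misses three of the four relevant documents, which is the kind of gap that synonyms such as "kid" or "teen" are meant to close.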

Term Mismatch & Boolean Conjunctive Normal Form (CNF) Expansion
Keyword query: approval of logos on television watched by children
Manual CNF (TREC Legal track 2006):
    (approval OR guideline OR strategy)
    AND (logos OR promotion OR signage OR brand OR mascot OR marque OR mark)
    AND (television OR TV OR cable OR network)
    AND (watched OR view OR viewer)
    AND (children OR child OR teen OR juvenile OR kid OR adolescent)
– Expressive & compact (1 CNF == 100s alternatives)
– Highly effective (50-300% over base keyword [Zhao12])
– Widely used by experts (library, legal, medical …)
– But, tedious to create
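The compactness claim can be checked with a small sketch. It assumes a CNF query is stored as a list of OR-groups that are joined with AND; the representation and the helper names (`render_cnf`, `count_alternatives`, `alternatives`) are illustrative choices, not WikiQuery's actual data model.

```python
from itertools import product
from math import prod

# The manual CNF query from the slide, one inner list per OR-group.
cnf = [
    ["approval", "guideline", "strategy"],
    ["logos", "promotion", "signage", "brand", "mascot", "marque", "mark"],
    ["television", "TV", "cable", "network"],
    ["watched", "view", "viewer"],
    ["children", "child", "teen", "juvenile", "kid", "adolescent"],
]


def render_cnf(cnf):
    """Render the groups as a Boolean query string."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in cnf)


def count_alternatives(cnf):
    """Number of plain keyword queries covered: one term picked per group."""
    return prod(len(group) for group in cnf)


def alternatives(cnf):
    """Lazily enumerate every covered keyword query."""
    for combo in product(*cnf):
        yield " ".join(combo)


query_string = render_cnf(cnf)
n_alternatives = count_alternatives(cnf)   # 3 * 7 * 4 * 3 * 6
first_alternative = next(alternatives(cnf))
```

For this query the product of the group sizes is 1,512, so a single CNF stands in for that many conjunctive keyword queries.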

Goal of WikiQuery
To facilitate the
– creation (a proper tool is lacking for practitioners)
– storing (for refinding & sharing)
– and sharing (for collaboration on query parts)
of high quality CNF queries

One Wiki Page == One Search Need

Displaying CNF Query

Links to Search Result Pages
Relying on existing search engines for results
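One way such links could be generated is sketched below. The URL pattern in `ENGINES`, the helper name `search_link`, and the sample query are assumptions for illustration; the slides do not say how WikiQuery builds its links, and how a given engine interprets AND, OR and parentheses is up to that engine.

```python
from urllib.parse import urlencode

# Assumed (base URL, query parameter name) per engine; not taken from WikiQuery.
ENGINES = {
    "google": ("https://www.google.com/search", "q"),
}


def search_link(engine, cnf):
    """Build a result-page link for a CNF query given as a list of OR-groups."""
    base, param = ENGINES[engine]
    query = " AND ".join("(" + " OR ".join(group) + ")" for group in cnf)
    # urlencode percent-encodes parentheses and turns spaces into '+'.
    return base + "?" + urlencode({param: query})


link = search_link("google", [["tv", "cable"], ["kid", "child"]])
```

Keeping the engine table separate from the query rendering means a wiki page could offer one link per engine for the same stored query.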

Creating/Editing CNF Queries

Search Result Access When Editing
Interactions with search results are usually necessary to ensure quality of the created query.

Looks Similar, but Different from
Library advanced search: simple Boolean
– For example, Library of Congress Advanced Search:
– Term1 AND/OR Term2 AND/OR Term3
LexisNexis: free form Boolean
ERIC: more flexible, joining with AND or OR

Evaluating the WikiQuery Interface

Experiment Setup
Hypothesis
– Users (with limited knowledge about Boolean queries) can create effective Boolean queries using WikiQuery
Preliminary user studies
– Classroom users
    6 students, limited prior knowledge of Boolean queries or IR
    a 10 minute session of editing Wiki pages
– TREC topics => user Boolean CNF queries
    Total 12 topics, 12 final Boolean queries
    Interacted with Google and Yahoo (Spring, 2010) (40 minutes per topic)
Evaluating on the Web & TREC collections

Results
:-)
– on average 30-50% gain over keyword queries
:-(
– But, not consistent (Web eval > TREC eval)
– not statistically significant over strong kw baseline
:-O
– 75% contain more restrictive formulations, unstable
– < 50% of queries are CNF expanded, bit more stable
Need better tools to check quality of expansion terms

Conclusions
Conjunctive Normal Form expansion has great potential
Users need guidance and learning to create effective CNF queries
– need to be warned against more restrictive queries
Uses of WikiQuery:
– Expert searchers
– Classroom: becoming experts
– Whenever you face a hard query
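A warning against more restrictive queries could start from a check like the one sketched here. It rests on an assumed heuristic that is not described in the slides: an AND-group containing none of the original keywords is treated as an added restriction, and a keyword whose group has a single term counts as unexpanded. The function name `check_expansion` and the sample data are hypothetical.

```python
def check_expansion(keywords, cnf):
    """Compare a user's CNF query against the original keyword query."""
    keywords = [k.lower() for k in keywords]
    report = {"expanded": [], "unexpanded": [], "dropped": [], "extra_conjuncts": []}
    covered = set()
    for group in cnf:
        terms = [t.lower() for t in group]
        hits = [k for k in keywords if k in terms]
        if not hits:
            # No original keyword in this group: it only narrows the result set.
            report["extra_conjuncts"].append(group)
            continue
        covered.update(hits)
        for k in hits:
            target = "expanded" if len(terms) > 1 else "unexpanded"
            report[target].append(k)
    # Keywords that appear in no group were removed from the query.
    report["dropped"] = [k for k in keywords if k not in covered]
    return report


# Hypothetical user query: two keywords expanded, one left alone, one new restriction.
report = check_expansion(
    ["logos", "television", "children"],
    [["logos", "brand"], ["television"], ["children", "kid"], ["cartoon"]],
)
```

An editing interface could surface `extra_conjuncts` as a warning and `unexpanded` as a prompt to add synonyms.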

Source Code
Based on MediaWiki version 1.17
Available at
Questions: now OR catch me at lunch!