GOOSE: A Goal-Oriented Search Engine with Commonsense


GOOSE: A Goal-Oriented Search Engine with Commonsense
Hugo Liu, Henry Lieberman, Ted Selker
Software Agents Group, MIT Media Laboratory
AH2002 talk, May 31, 2002, Malaga, Spain

In a Nutshell
Motivation: Novice search engine users have trouble forming good queries. They more naturally express non-specific search goals (or intentions) rather than the particular keywords needed for an effective query to a search engine.
Response: GOOSE (GOal-Oriented Search Engine) is an adaptive UI that combines natural language understanding and commonsense reasoning to transform a user's search goal statement into an effective query.

Agenda
- What's wrong with web search UIs?
- What UI is intuitive for novices?
- How can commonsense help?
- How does GOOSE work?
- Preliminary evaluation
- Other solutions
- Conclusions and Future Direction

What's wrong with web search UIs?
- A simple search text box is easy to use, BUT often not focused enough.
- The only way to improve focus is to use advanced syntax:
  - Boolean operators (AND, OR)
  - inclusion/exclusion (+, -)
  - words vs. phrases (e.g. james bond vs. "james bond")
- Such syntax must be learned; it is not intuitive to novice users.

What's wrong with web search UIs?
- The user needs a priori knowledge of search hits: you must anticipate the structure of the pages you expect to find, and exploit that structure when formulating the query.
- e.g. to find lyrics for a song, the user should know that lyrics web pages generally include: the lyrics, the song name, the songwriter's name, the album name, and the keyword "lyrics".
  Example: +"I dreamed a dream" +"les miserables" +"lyrics"
- Novice search engine users don't have this search expertise!

What's wrong with web search UIs?
What about hierarchical search directories like Yahoo!?
- Easier syntax, BUT...
- Can only search pages that are categorized, and some pages are hard to categorize.
- Too many clicks in the task model.
- Assumes users know what web pages they are looking for, and how they are categorized.
  Home > Arts > Performing Arts > Theater > Musicals > Shows > Les Misérables

Can we do better? Yes!

What UI is intuitive for novices?
We performed an experiment to see how novice users form queries (vs. advanced users), and used novice users' natural querying behavior as the basis for our notion of "intuitive".

Experiment Design
Participants:
- four novice users (never used a search engine before)
- four advanced users (2+ years of routine use)
Medium: Yahoo! queries (not the directory)
Tasks: perform common search tasks, such as:
- find someone's homepage
- find a product
- research a topic
- resolve a household problem (e.g. get a VCR fixed)

Example from Experiment
Instructions: Find someone online who likes movies.
Novice user: someone online who likes movies
→ Poor results: movie databases, no personal homepages
Advanced user: +'movies' +'my homepage' +'my interests'
→ Relevant results: no movie databases

Experiment Observations
Novice users:
- revert to natural language
- can't explicitly identify topic keywords, e.g. "movies"
- don't use context keywords, e.g. "my homepage"
- state non-specific "goals", e.g. "I want to find someone online who likes movies", versus: find a page that is a personal homepage AND talks about the owner's interests AND has "movies" as an interest
Experienced users:
- use topic keywords, e.g. "movies"
- use context keywords, e.g. "my homepage", "my interests"
- perform inference from "goals" to query; much of this inference is "common sense", and some of it is "search expertise"

Inference Chain Example
Goal: I want to find someone online who likes movies
- Movies are a type of interest that a person might have.
- People might talk about their interests on their homepage.
- People's homepages might contain the string "my homepage".
Query: +'movies' +'my interests' +'my homepage'

The experiment suggests that a UI intuitive for novices should:
- allow natural language queries
- let the user express the query as a search goal
- infer more specific search terms from non-specific search goals (both commonsense and search expertise are involved in this inference)
- identify topic keywords
- deduce appropriate context keywords

Use commonsense to reason from the user's non-specific search goals.

What is commonsense?
Commonsense is:
- Knowledge about the everyday world, e.g. books are often found in libraries; people may take medicine when they are sick.
- Obvious to most people, so often not explicitly stated.
- Culturally specific, e.g. "a bride has bridesmaids" and "weddings may take place in churches" are obvious to middle-class people in the USA, but not necessarily elsewhere.
People have a lot of commonsense:
- split into different representations (a large ontology) of knowledge
- on the order of 20 million facts, according to Minsky (2002)

How can commonsense help?
Novice users prefer to express a non-specific or implicit search goal, e.g. the user types "my cat is sick" rather than +veterinarian +"boston, MA".
Use commonsense reasoning (inference) to reformulate the search goal, chaining inference over simple English sentences:
- My cat is sick
- Cats are pets
- If a pet is sick, take it to a veterinarian
- So, search for "veterinarian"

What is our source of commonsense knowledge?
Open Mind Common Sense (OMCS) (Singh, 2002), http://commonsense.media.mit.edu
OMCS is:
- publicly acquired through a web community of collaborators
- a generic database of commonsense (not hand-crafted for any specific domain)
- currently about 420,000+ commonsense facts
- commonsense represented as semi-structured English sentences

OMCS knowledge entry UI

OMCS entries are organized into an ontology of social commonsense, including (but not limited to):
- Classification: A cat is a pet.
- Spatial: San Francisco is part of California.
- Scene: Things often found together are: restaurant, food, waiters, tables, seats.
- Purpose: A vacation is for relaxation; cough medicine is to help a cough.
- Causality: After the wedding ceremony comes the wedding reception.
- Emotion: Pet owners love their pets; rollercoasters make you feel excited and scared.
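
A knowledge base of such entries might be stored as tagged relation triples. This is a minimal illustrative sketch: the category and relation names are taken from the slide, but the storage layout is an assumption, not GOOSE's actual implementation.

```python
# Commonsense facts as (relation, arg1, arg2) triples, grouped by
# the ontology categories listed on the slide.
KB = {
    "classification": [("isKindOf", "cat", "pet")],
    "spatial":        [("isPartOf", "San Francisco", "California")],
    "purpose":        [("isUsedFor", "vacation", "relaxation")],
    "emotion":        [("feels", "pet owner", "love for pets")],
}

def facts_about(term):
    """Return every (category, fact) pair whose fact mentions the given term."""
    return [(cat, f) for cat, facts in KB.items() for f in facts if term in f]

print(facts_about("cat"))  # -> [('classification', ('isKindOf', 'cat', 'pet'))]
```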

More on Open Mind
Comparison to Cyc (Lenat, 2002): Cyc has 3 million hand-crafted assertions represented as logical formulas.
OMCS advantages:
- publicly and freely available
- less granular (i.e. more knowledge about social-level interactions)
- easy to add knowledge (using simple English), and to integrate with personal commonsense

Open Mind Caveats
- More ambiguity than Cyc: word senses are not disambiguated.
- Coverage is uneven, and spotty at times: the acquisition process is responsible for this, and it causes inference to be brittle at times.
- Free-form English is difficult to parse robustly: most sentences can only be parsed into first-order predicate-argument structures (binary relations), due to loosely constrained templates in OMCS. Therefore, inference is currently limited to first order.

The GOOSE mechanism

Limitations of semantic understanding
Semantic understanding of the search goal statement needs a constrained domain; specification of the goal type by the user provides this constraint.
Each search goal type has its own set of semantic frame templates. For example, "I want help solving this problem" maps to a frame like (problem_object, problem_attributes, action).
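
A semantic frame template like the one above can be modeled as a plain record. The sketch below is illustrative only: the slot names follow the slide's example, while the class name and the filled-in values ("cat", "sick") are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticFrame:
    """A slot-filler representation of one search goal type."""
    goal_type: str
    slots: dict = field(default_factory=dict)

# Hypothetical instantiation for the "I want help solving this problem"
# goal type, using the slots named on the slide.
frame = SemanticFrame(
    goal_type="solve a problem",
    slots={"problem_object": "cat", "problem_attributes": ["sick"], "action": None},
)
print(frame.slots["problem_object"])  # -> cat
```

Constraining understanding to a handful of such frames is what makes the later inference tractable: the chainer starts from filled slots rather than from arbitrary English.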

Preparing the commonsense
- OMCS English sentences are first compiled into predicate-argument structures: pattern-matching rules turn sentences into pred-arg structures like isKindOf(cat, pet).
- First-order commonsense inference chains pred-arg structures through transitivity (mostly): (A relation1 B) and (B relation2 C) → (A relation3 C), where (relation1, relation2 → relation3) must be a valid inference pattern.
- Application-level commonsense (search expertise) is also parsed into pred-arg structures, e.g. "lyrics pages are indicated by the keyword 'lyrics'" becomes pageHasSalientKeyword('lyrics page', 'lyrics').
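
The sentence-to-pred-arg compilation step can be sketched with a couple of regex templates. The patterns and relation names below are illustrative stand-ins; OMCS's real extraction uses a much larger set of templates.

```python
import re

# Illustrative pattern-matching rules: each maps a sentence template
# to a predicate name.
PATTERNS = [
    (re.compile(r"^(?:a|an)?\s*(\w+) is a (?:kind of )?(\w+)$", re.I), "isKindOf"),
    (re.compile(r"^(\w+) is part of (\w+)$", re.I), "isPartOf"),
]

def compile_sentence(sentence):
    """Return a (predicate, arg1, arg2) triple, or None if no template matches."""
    for pattern, predicate in PATTERNS:
        m = pattern.match(sentence.strip().rstrip("."))
        if m:
            return (predicate, m.group(1).lower(), m.group(2).lower())
    return None

print(compile_sentence("A cat is a pet"))  # -> ('isKindOf', 'cat', 'pet')
```

Sentences that fit no template simply fail to compile, which is one source of the uneven coverage noted in the caveats slide.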

Inference with commonsense
- Inference patterns for pairwise constraints (Singh, 2002) describe the allowable inferences between pairs of pred-arg structures.
- Inference begins with the semantic frame representation of the user's search goal.
- Ordinary commonsense rules (e.g. "cats are pets") and application-level commonsense rules (e.g. "veterinarian is a type of local business") fire.
- A path ends when no more rules fire (failed inference) or when an application-level rule has fired (successful inference).
- The context attacher uses search-expertise metarules to decide which keywords from the path to include in the query.
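
The rule-firing loop described above can be sketched as a tiny forward chainer. The facts and rules here are hand-written for illustration; the real system derives allowable steps from the pairwise inference patterns rather than hard-coding them.

```python
# Minimal forward-chaining sketch of the inference loop.
# Facts are (predicate, arg1, arg2) triples; firing a rule adds a new
# fact. An application-level fact terminates the chain successfully.

def rules(fact):
    """Yield new facts derivable from one existing fact (illustrative rules)."""
    pred, a, b = fact
    if fact == ("hasProperty", "cat", "sick"):
        yield ("isKindOf", "cat", "pet")                   # cats are pets
    if pred == "isKindOf" and b == "pet":
        yield ("takeTo", a, "veterinarian")                # sick pets go to the vet
    if pred == "takeTo" and b == "veterinarian":
        yield ("searchKeyword", "query", "veterinarian")   # application-level rule

def chain(facts):
    """Fire rules until an application-level fact appears or nothing new fires."""
    agenda = list(facts)
    while agenda:
        fact = agenda.pop()
        for new in rules(fact):
            if new not in facts:
                facts.add(new)
                agenda.append(new)
                if new[0] == "searchKeyword":  # successful path
                    return new[2]
    return None  # failed inference: fall back to the original query

print(chain({("hasProperty", "cat", "sick")}))  # -> veterinarian
```

Note the fail-soft behavior: when no application-level rule ever fires, `chain` returns None and the original query can be passed through unchanged.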

Limitations of Inference
- OMCS can currently only be parsed into binary and ternary predicate-argument relations, e.g. isKindOf(cat, pet).
- Inference is (mostly) monotonic and first-order, so reasoning capabilities are limited.
- To manage the exponential explosion of the inference search space, commonsense is classified into subdomains; reasoning within an individual subdomain is faster.
- Queries are classified into subdomains by vector similarity to a bag of keywords for each subdomain.
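
The subdomain classification step can be sketched as bag-of-words cosine similarity. The subdomain names and keyword lists below are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical subdomains, each described by a bag of keywords.
SUBDOMAINS = {
    "pets":     Counter(["cat", "dog", "pet", "sick", "veterinarian"]),
    "shopping": Counter(["buy", "price", "product", "store", "cheap"]),
}

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classify(query):
    """Pick the subdomain whose keyword bag is most similar to the query."""
    words = Counter(query.lower().split())
    return max(SUBDOMAINS, key=lambda d: cosine(words, SUBDOMAINS[d]))

print(classify("my cat is sick"))  # -> pets
```

Restricting chaining to the winning subdomain trades a little recall for a large reduction in the rule-firing search space.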

How was GOOSE evaluated?

Preliminary evaluation
- Four users were asked to perform common information-seeking tasks using the GOOSE interface.
- Tasks focused on the commonsense subdomains the system knows about.
- The same query was also passed to Google as a baseline.
- Users were asked to rank the relevance of the first page of search results from 1 to 10 (most relevant).

Interpreting Results
- Inference is brittle for semantically under-constrained goals (e.g. "research a product", "learn more about").
- When inference worked, relevance improved over the baseline.
- When inference failed, relevance was still comparable to the baseline: GOOSE "fails soft".

Related work

Other solutions to improving search
Three main types of query improvement:
- Thesaurus expansion: doesn't necessarily focus the search.
- Relevance feedback: too many steps.
- Hand-crafted question templates, as in Ask Jeeves ("I want to go from A to B"): harder to scale.
GOOSE is different because:
- Unlike Ask Jeeves, search goals only have to imply the query (e.g. "sick cat", "people online who like movies").
- GOOSE performs intent inference using commonsense and search expertise (via semantic frame templates and context-attacher metarules).

Concluding Remarks

Conclusions
- Demonstrated how a corpus of generic (not hard-coded) commonsense knowledge can be used to create an intuitive web search UI that novices can use.
- Commonsense inference can adapt a user's search goal into a more effective query; GOOSE's commonsense-based interface assumes some of the burden of reasoning to arrive at good search terms.
- Though we cannot be assured of the coverage of the commonsense knowledge in OMCS, and thus of the robustness of the inference, GOOSE in its current state still finds opportunities to improve the query.
- It is "fail-soft": if inference chaining is successful, the query is improved; if not, the original query is still passed to Google.

Future Work
- Automate disambiguation of goals (simplifies the UI).
- Personalize commonsense. For example, for the search goal "broken vcr", personal commonsense (e.g. "the user is handy with electronics") can help decide whether to show do-it-yourself repair pages, electronics repair shop information, or both.
- Commonsense can be thought of as a generic user model (using the term liberally) of stereotyped ways that most people think. This user model might be the foundation for all users, and is customizable through the gathering of personal commonsense.

An invitation
- There is now a substantial amount of publicly available commonsense knowledge (OMCS, OpenCyc, ThoughtTreasure).
- GOOSE and ARIA demonstrate "fail-soft" ways to incorporate commonsense into user interfaces.
- Personalization integrates well with commonsense (the same reasoning architecture serves both).
- This personalized commonsense is well suited to adaptive agents operating in realistic domains (e.g. a travel recommender).

Pointers
Commonsense-based interfaces:
- GOOSE for web search
- ARIA for annotated photo retrieval (AH2002)
- MAKEBELIEVE for interactive storytelling (AAAI-2002)
Access to papers and demos: google for "hugo liu"
Publicly available commonsense corpora:
- http://openmind.org/commonsense
- http://opencyc.org
- http://www.signiform.com