Supporting Knowledge Discovery: Next Generation of Search Engines
Qiaozhu Mei
04/21/2005
What Traditional Search Engines Can Do:
- Knowledge is explicit: terms in documents and queries
- The user’s information need can be represented by independent documents
- Results are ranked relevant documents
- Users know exactly and specifically what they want
- Semantics of queries are explicit and specific: “Xuehua Shen + UIUC + homepage”
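To make the term-matching model above concrete, here is a minimal sketch of ranking independent documents by explicit query terms. It uses a simple smoothed TF-IDF score, which is an assumption chosen for illustration and not a method named in the slides; the function names `tokenize` and `rank` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization on lowercased text (illustrative only).
    return text.lower().split()


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        # Sum term frequency times smoothed inverse document frequency
        # over the query terms that occur in this document.
        score = sum(
            tf[t] * math.log((n + 1) / (df[t] + 1))
            for t in tokenize(query)
            if t in tf
        )
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "xuehua shen uiuc homepage",
    "asia tsunami impact report",
    "uiuc campus news",
]
ranking = rank("Xuehua Shen UIUC homepage", docs)
```

Each document is scored on its own, using only the terms that literally appear in the query, which is why this model works well for explicit, specific queries.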
What Search Engines Cannot Do:
- What if the knowledge the user wants is implicit or latent on the web?
  “Find information about the stocks that have kept rising over the last two months”
- What if independent documents cannot satisfy the user’s information need?
  “What was President Bush’s route during the election?”
- What if users can only provide an abstract information need?
  “I want to find documents that hold a positive opinion of personalized search systems”
  “The impact of the Asia tsunami”
What the hell does this mean?
What Is Expected?
- Current generation of search engines: based on information retrieval
- Next generation of search engines: support knowledge discovery on top of information retrieval?
- Find a specific answer: Question Answering
  What’s Prof. Zhai’s newest publication?
- Find latent patterns: Text/Web Mining
  What’s in common and what’s different in Xuehua’s and Jing’s research?
- Describe and demonstrate knowledge: Summarization
  What’s new on NLP in Bioinformatics?
Current Search Architecture
[Diagram; component labels: User, query, Query Rep, Web Pages, INDEXING, Doc Rep, Ranking, results, judgments, Feedback]
A Possible New Search Architecture
[Diagram; component labels: User, Query analyzer, Interpreted query, New Query Rep, Web Pages, INDEXING, Doc Rep, Ranking, KDD Module, User interests, results, judgments, Feedback]
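One way to read the proposed architecture is as a pipeline: a query analyzer produces an interpreted query, the retrieval layer fetches candidate documents, and a KDD module post-processes them according to the task. The sketch below is only an illustration under that reading. Every name in it (`InterpretedQuery`, `analyze_query`, `retrieve`, `kdd_module`, `search`) and the keyword rules used to guess the task are hypothetical, and the user-interest and feedback components are left out.

```python
from dataclasses import dataclass


@dataclass
class InterpretedQuery:
    task: str    # "retrieve", "answer", "mine", or "summarize"
    terms: list  # normalized query terms


def analyze_query(query):
    """Toy query analyzer: guess the task from surface cues (illustrative rules)."""
    q = query.lower()
    if "in common" in q or "different" in q:
        task = "mine"        # comparison questions suggest text/web mining
    elif q.startswith("what's new") or "impact of" in q:
        task = "summarize"   # broad topical requests suggest summarization
    elif q.endswith("?"):
        task = "answer"      # other questions suggest question answering
    else:
        task = "retrieve"    # plain keyword queries stay ordinary retrieval
    terms = [w.strip("?.,\"'") for w in q.split()]
    return InterpretedQuery(task, [t for t in terms if t])


def retrieve(iq, index):
    """Stand-in for the IR layer: documents sharing at least one query term."""
    return [doc for doc in index if set(iq.terms) & set(doc.lower().split())]


def kdd_module(iq, docs):
    """Stand-in for the KDD module: route retrieved documents by task."""
    if iq.task == "retrieve":
        return docs
    return {"task": iq.task, "evidence": docs}


def search(query, index):
    iq = analyze_query(query)
    return kdd_module(iq, retrieve(iq, index))


index = ["Xuehua Shen homepage at UIUC", "Asia tsunami report"]
hits = search("Xuehua Shen UIUC homepage", index)
```

The point of the design is that retrieval stays the foundation, and the knowledge discovery step only changes what is done with the retrieved documents.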
Interesting questions:
- Which mining tasks can be done offline, and which must be done online?
- How can knowledge discovery be supported efficiently and precisely?
- How can we understand what the user wants and formulate it as specific retrieval and mining tasks?
- When will this happen?