Sequence Clustering and Labeling for Unsupervised Query Intent Discovery Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: WSDM’12 Date: 1 November, 2012

Similar presentations
Date: 2014/05/06 Author: Michael Schuhmacher, Simon Paolo Ponzetto Source: WSDM’14 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang Knowledge-based Graph Document.

A Phrase Mining Framework for Recursive Construction of a Topical Hierarchy Date : 2014/04/15 Source : KDD’13 Authors : Chi Wang, Marina Danilevsky, Nihit.
Improved TF-IDF Ranker
Patch to the Future: Unsupervised Visual Prediction
Machine Learning and Data Mining Clustering
1 A scheme for racquet sports video analysis with the combination of audio-visual information Visual Communication and Image Processing 2005 Liyuan Xing,
Mining Query Subtopics from Search Log Data Date : 2012/12/06 Resource : SIGIR’12 Advisor : Dr. Jia-Ling Koh Speaker : I-Chih Chiu.
Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.
Clustering II.
Using Structure Indices for Efficient Approximation of Network Properties Matthew J. Rattigan, Marc Maier, and David Jensen University of Massachusetts.
Query Operations: Automatic Local Analysis. Introduction Difficulty of formulating user queries –Insufficient knowledge of the collection –Insufficient.
Clustering… in General In vector space, clusters are vectors found within  of a cluster vector, with different techniques for determining the cluster.
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
Creating Concept Hierarchies in a Customer Self-Help System Bob Wall CS /29/05.
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Semi-Supervised Clustering Jieping Ye Department of Computer Science and Engineering Arizona State University
Ranking by Odds Ratio A Probability Model Approach let be a Boolean random variable: document d is relevant to query q otherwise Consider document d as.
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
1/16 Final project: Web Page Classification By: Xiaodong Wang Yanhua Wang Haitang Wang University of Cincinnati.
CHAMELEON : A Hierarchical Clustering Algorithm Using Dynamic Modeling
The Problem Finding information about people in huge text collections or on-line repositories on the Web is a common activity Person names, however, are.
Copyright R. Weber Machine Learning, Data Mining ISYS370 Dr. R. Weber.
Tag Clouds Revisited Date : 2011/12/12 Source : CIKM’11 Speaker : I- Chih Chiu Advisor : Dr. Koh. Jia-ling 1.
1 Context-Aware Search Personalization with Concept Preference CIKM’11 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
7-Speech Recognition Speech Recognition Concepts
COMMON EVALUATION FINAL PROJECT Vira Oleksyuk ECE 8110: Introduction to machine Learning and Pattern Recognition.
A Survey for Interspeech Xavier Anguera Information Retrieval-based Dynamic TimeWarping.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Incident Threading for News Passages (CIKM 09) Speaker: Yi-lin,Hsu Advisor: Dr. Koh, Jia-ling. Date:2010/06/14.
Clustering Supervised vs. Unsupervised Learning Examples of clustering in Web IR Characteristics of clustering Clustering algorithms Cluster Labeling 1.
Basic Machine Learning: Clustering CS 315 – Web Search and Data Mining 1.
1 Motivation Web query is usually two or three words long. –Prone to ambiguity –Example “keyboard” –Input device of computer –Musical instruments How can.
Adding Semantics to Clustering Hua Li, Dou Shen, Benyu Zhang, Zheng Chen, Qiang Yang Microsoft Research Asia, Beijing, P.R.China Department of Computer.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results Celina SantamariaJulio GonzaloJavier Artiles nlp.uned.es UNED,c/Juan del Rosal,
A Scalable Machine Learning Approach for Semi-Structured Named Entity Recognition Utku Irmak(Yahoo! Labs) Reiner Kraft(Yahoo! Inc.) WWW 2010(Information.
Information Retrieval Lecture 6 Introduction to Information Retrieval (Manning et al. 2007) Chapter 16 For the MSc Computer Science Programme Dell Zhang.
Date : 2013/03/18 Author : Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell Source : CIKM’12 Speaker : Er-Gang Liu Advisor : Prof. Jia-Ling.
Clustering Gene Expression Data BMI/CS 576 Colin Dewey Fall 2010.
1 Modeling Long Distance Dependence in Language: Topic Mixtures Versus Dynamic Cache Models Rukmini.M Iyer, Mari Ostendorf.
LOGO 1 Corroborate and Learn Facts from the Web Advisor : Dr. Koh Jia-Ling Speaker : Tu Yi-Lang Date : Shubin Zhao, Jonathan Betz (KDD '07 )
Clustering.
Extracting Keyphrases to Represent Relations in Social Networks from Web Junichiro Mori and Mitsuru Ishizuka Universiry of Tokyo Yutaka Matsuo National.
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
2015/12/121 Extracting Key Terms From Noisy and Multi-theme Documents Maria Grineva, Maxim Grinev and Dmitry Lizorkin Proceeding of the 18th International.
Radial Basis Function ANN, an alternative to back propagation, uses clustering of examples in the training set.
Clustering Instructor: Max Welling ICS 178 Machine Learning & Data Mining.
V. Clustering Artificial Intelligence Lab, 이승희 Text: Text mining Page: 82-93.
Compiled By: Raj Gaurang Tiwari Assistant Professor SRMGPC, Lucknow Unsupervised Learning.
Date: 2013/6/10 Author: Shiwen Cheng, Arash Termehchy, Vagelis Hristidis Source: CIKM’12 Advisor: Jia-ling Koh Speaker: Chen-Yu Huang Predicting the Effectiveness.
Basic Machine Learning: Clustering CS 315 – Web Search and Data Mining 1.
Finding document topics for improving topic segmentation Source: ACL2007 Authors: Olivier Ferret (18 route du Panorama, BP6) Reporter:Yong-Xiang Chen.
Discovering Relations among Named Entities from Large Corpora Takaaki Hasegawa *, Satoshi Sekine 1, Ralph Grishman 1 ACL 2004 * Cyberspace Laboratories.
Data Mining and Decision Support
1 Machine Learning Lecture 9: Clustering Moshe Koppel Slides adapted from Raymond J. Mooney.
Effective Anomaly Detection with Scarce Training Data Presenter: 葉倚任 Author: W. Robertson, F. Maggi, C. Kruegel and G. Vigna NDSS
CONTEXTUAL SEARCH AND NAME DISAMBIGUATION IN EMAIL USING GRAPHS EINAT MINKOV, WILLIAM W. COHEN, ANDREW Y. NG SIGIR’06 Date: 2008/7/17 Advisor: Dr. Koh,
Statistical Machine Translation Part II: Word Alignments and EM Alex Fraser Institute for Natural Language Processing University of Stuttgart
Learning Analogies and Semantic Relations Nov William Cohen.
Learning Kernel Classifiers 1. Introduction Summarized by In-Hee Lee.
Example Apply hierarchical clustering with d min to below data where c=3. Nearest neighbor clustering d min d max will form elongated clusters!
Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,
Clustering (Search Engine Results) CSE 454. © Etzioni & Weld To Do Lecture is short Add k-means Details of ST construction.
Finding similar items by leveraging social tag clouds Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: SAC 2012’ Date: October 4, 2012.
Short Text Similarity with Word Embedding Date: 2016/03/28 Author: Tom Kenter, Maarten de Rijke Source: CIKM’15 Advisor: Jia-Ling Koh Speaker: Chih-Hsuan.
Ontology Engineering and Feature Construction for Predicting Friendship Links in the Live Journal Social Network Author:Vikas Bahirwani 、 Doina Caragea.
Data Mining and Text Mining. The Standard Data Mining process.
Introduction Task: extracting relational facts from text
Intent-Aware Semantic Query Annotation
Text Categorization Berlin Chen 2003 Reference:
Presentation transcript:

Sequence Clustering and Labeling for Unsupervised Query Intent Discovery Speaker: Po-Hsien Shih Advisor: Jia-Ling Koh Source: WSDM’12 Date: 1 November, 2012

Outline Introduction Feature Representation Clustering Intent Summarization Instance Annotation Experiment Conclusion

Introduction In the past, a search gave us a list like this: one standard, ranked page of search results.

Introduction The modern search engine not only improves the ranked results but also allows domain-dependent types of semantically enriched search results. For example: [city] weather, [movie] showtimes [location]. What can we get from such queries?

Introduction Key issue: connecting the surface-level query to a structured data source according to the intent of the user. An intent understanding system needs: domain classification, intent detection, and slot filling.

Introduction Goal: discover popular user intents and their associated slots, and annotate query instances accordingly, in a completely unsupervised manner. An intent is defined as a pattern consisting of a sequence of semantic concepts or lexical items. The paper presents a sequence clustering method for finding clusters of queries with similar intent, and an intent summarization method for labeling each cluster with an intent pattern that describes its queries. It discovers the patterns and slots in a domain and allows classification of new queries, not used in clustering, into the discovered patterns.

Introduction Q = Harry Potter showtimes Boston. Domain: Movie. Intent: FindShowtimes. Slots: Title = "Harry Potter", Location = "Boston". Q => [Movie.title] showtimes [Location], e.g. [Madagascar2] showtimes [London], [Batman] showtimes [Sydney].

Feature Representation Each query q = (w_1, w_2, ..., w_M) has M word tokens; each token w_i is represented by a feature vector x_i = (x_{i,1}, x_{i,2}, ..., x_{i,N}), i = 1, 2, ..., M. Ex. q = (2010 buick regal review), M = 4. Knowledge Base: Freebase is a large knowledge base with structured data from many sources. We created a list of concepts for each surface form by merging the concepts of all entities for that surface form.

Feature Representation q = (2010, buick, regal, review), x_1 = [0,1,0,1,1,...,0,0]^T. X^q = (x_1^q, x_2^q, x_3^q, ..., x_M^q) is an N x M matrix, with N = N_S + N_L. N_S: semantic features based on the knowledge base data (Freebase), e.g. [car], [episode], ... N_L: lexical features extracted from the Wall Street Journal corpus, excluding proper nouns and cardinal numbers, e.g. review, test. Why assign the value 1? To produce a more useful intent summarization.

Feature Representation REF(y) = 1/size(y), similar to IDF, where y is a Freebase concept or lexical item, and size(y) is 1 if y is a lexical item, or the number of surface forms that concept y contains.
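As a concrete illustration, here is a minimal Python sketch of this representation, assuming a toy concept dictionary in place of Freebase and a tiny lexical vocabulary; all names here (CONCEPTS, LEXICAL, token_vector) are hypothetical stand-ins, not the paper's code.

    import numpy as np

    # Toy stand-ins for the paper's resources: a Freebase-like map from
    # surface forms to semantic concepts (N_S) and a small lexical
    # vocabulary (N_L). The real feature space N is far larger.
    CONCEPTS = {"2010": ["[year]"], "buick regal": ["[car]"], "boston": ["[city]"]}
    LEXICAL = ["review", "test", "showtimes", "weather"]

    ALL_CONCEPTS = {c for cs in CONCEPTS.values() for c in cs}
    FEATURES = sorted(ALL_CONCEPTS) + LEXICAL
    INDEX = {f: i for i, f in enumerate(FEATURES)}
    # size(y): number of surface forms per concept; lexical items have size 1
    SIZE = {c: sum(c in cs for cs in CONCEPTS.values()) for c in ALL_CONCEPTS}
    REF = np.array([1.0 / SIZE.get(f, 1) for f in FEATURES])  # REF(y) = 1/size(y)

    def token_vector(token, covering_forms=()):
        """Binary feature vector x_i for one word token (N = N_S + N_L):
        1 for each concept of a surface form that covers the token, plus
        1 for the token itself if it is a lexical item."""
        x = np.zeros(len(FEATURES))
        for form in covering_forms:
            for concept in CONCEPTS.get(form, []):
                x[INDEX[concept]] = 1.0
        if token in INDEX:
            x[INDEX[token]] = 1.0
        return x

For "2010 buick regal review", token_vector("buick", covering_forms=["buick regal"]) turns on [car], while token_vector("review") turns on the lexical feature review.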

Clustering Goal: discover groups of queries with similar intents. We adopt a bottom-up approach based on a similarity metric: agglomerative clustering using a distance metric based on dynamic time warping (DTW) between a pair of sequences.

Distance Metric r_i = (REF(1)*x_{i,1}, ..., REF(N)*x_{i,N})^T, the REF-weighted vector for x_i, on which a distance between a pair of static feature vectors is defined.
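The exact distance formula on this slide did not survive the transcript; the sketch below assumes cosine distance over the REF-weighted vectors r_i, which fits the description but is a guess at the precise form (continuing the sketch above):

    def weighted(x, ref=REF):
        """r_i = (REF(1)*x_{i,1}, ..., REF(N)*x_{i,N})^T"""
        return x * ref

    def static_distance(x_a, x_b, ref=REF):
        """Assumed static distance: cosine distance between REF-weighted vectors."""
        r_a, r_b = weighted(x_a, ref), weighted(x_b, ref)
        denom = np.linalg.norm(r_a) * np.linalg.norm(r_b)
        if denom == 0.0:
            return 1.0  # no shared active features: maximally distant
        return 1.0 - float(r_a @ r_b) / denom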

Distance Metric Alignment: a set of edges E connecting the two sequences of nodes, such that every node has at least one edge and there are no crossing edges. The optimal alignment is the one that produces the smallest total distance, computed by summing over all edges.

Distance Metric The DTW distance is the cost of the optimal alignment: d_DTW(X^a, X^b) = min over alignments E of sum_{(i,j) in E} d(x_i^a, x_j^b).
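The textbook dynamic-programming recurrence computes this minimum; a sketch (standard DTW, not necessarily the authors' exact implementation):

    def dtw_distance(X_a, X_b, dist=static_distance):
        """DTW between two sequences of token vectors: find the
        non-crossing alignment with the smallest total edge cost."""
        m, n = len(X_a), len(X_b)
        D = np.full((m + 1, n + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = dist(X_a[i - 1], X_b[j - 1])
                # extend the alignment by advancing one or both sequences
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[m, n]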

Agglomerative Clustering Single-link clustering vs. complete-link clustering. Stop criterion: either set a threshold on the minimum inter-cluster distance of the remaining clusters, or set a fixed number of merging iterations.
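SciPy's hierarchical clustering supports both linkage criteria and a distance-threshold stop; a sketch, under the assumption that the full pairwise DTW matrix fits in memory:

    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    def cluster_queries(query_matrices, threshold, method="single"):
        """Agglomerative clustering of queries under the DTW distance;
        method is "single" or "complete", and merging stops once the
        inter-cluster distance exceeds the threshold."""
        n = len(query_matrices)
        D = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                D[i, j] = D[j, i] = dtw_distance(query_matrices[i], query_matrices[j])
        Z = linkage(squareform(D), method=method)
        return fcluster(Z, t=threshold, criterion="distance")  # cluster id per query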

Intent Summarization Goal: produce a pattern that describes the intent of a cluster. Pattern: a label sequence of slots and lexical items, p = (y_1, y_2, ..., y_L), where L is the length of the pattern and L <= the length of the queries being summarized. Advantages: the discovered intents can inform the development of structured search, giving a human-readable way to connect the intents to a structured database, and the patterns produced by intent summarization allow generalization of the intents to new query instances.

Intent Summarization Step I: the queries are segmented and aligned. Words that belong to the same entity in Freebase have very similar feature vector representations (e.g. San Francisco), so adjacent words are merged into a segment if the cosine distance between their vectors is below a low threshold.
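Step I can be sketched as a greedy left-to-right pass that merges adjacent tokens with nearly identical vectors; the threshold value here is an assumed placeholder, and a non-empty query is assumed:

    def segment(tokens, vectors, eps=0.05):
        """Merge adjacent words into one segment when the cosine distance
        between their feature vectors is below eps, e.g. "san francisco"."""
        segments, current = [], [tokens[0]]
        for k in range(1, len(tokens)):
            if static_distance(vectors[k - 1], vectors[k]) < eps:
                current.append(tokens[k])  # same entity: extend the segment
            else:
                segments.append(" ".join(current))
                current = [tokens[k]]
        segments.append(" ".join(current))
        return segments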

Intent Summarization Step II: for each position in the pattern, we select a Freebase concept or lexical item to represent the segments at that position. Suppose we have a cluster of K queries; then y_i = generalize(S_i = s_i^1, ..., s_i^K) is the Freebase concept or lexical item that summarizes the input segments. One natural way to do generalization is to pick the concept that contains the most input segments according to Freebase, but this turns out not to be robust.
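The naive generalization is a simple coverage vote, sketched below with a hypothetical members map from each concept to its set of surface forms:

    def generalize_naive(segments, members):
        """Pick the label covering the most input segments; a lexical
        item is treated as a concept containing only itself. Ties are
        broken arbitrarily."""
        candidates = set(segments) | set(members)
        def coverage(y):
            return sum(s in members.get(y, {y}) for s in segments)
        return max(candidates, key=coverage)

On the year example below, generalize_naive(["2004", "2010", "09"], {"[year]": {"2004", "2010"}}) happens to pick [year] (coverage 2 of 3), but a single concept omission can easily flip the vote, which is the robustness problem the next slide describes.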

Intent Summarization Freebase concept omissions and ambiguity of the segments can lead to incorrect generalizations. Ex. "2004 audi s4 review", "2010 buick regal review", and "09 acura rl review": the Freebase concept [year] contains "2004" and "2010", but not "09", which acts as noise.

Intent Summarization The following model gives the joint probability of the segments and the concept, where each of the segments in the cluster is modeled independently of the others: P(y, s^1, ..., s^K) = P(y) * prod_{k=1..K} P(s^k | y). To solve the noise problem above, the REF score is used to initialize the parameters, where F represents the total number of surface forms in Freebase.

Intent Summarization This initialization prefers concepts with higher REF scores and lexical items, but it is not ideal because it is susceptible to overly-specific concepts, e.g. [breed origin] or [olympic participating country] instead of the [country] concept. This can be corrected by the EM algorithm.

Intent Summarization E-step: the maximized expression is simple to calculate for each concept and lexical item, so each cluster position is classified to its most probable label. M-step: we re-estimate the model parameters as P(y) = count(y) / total, where count(y) is the number of times y appeared in the E-step's classification and the denominator is the total number of classifications made.
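A hard-EM sketch of this loop, run over a list of cluster positions (one segment list per position); it assumes P(s|y) is uniform over a label's surface forms, a REF-style initialization of the prior, and a small smoothing floor, all simplifications of the paper's model rather than its exact form:

    def em_generalize(positions, members, iters=10, floor=1e-9):
        """positions: list of segment lists, one per cluster position.
        E-step: label each position with argmax_y P(y) * prod_k P(s^k|y);
        M-step: P(y) = count(y) / number of classifications made."""
        candidates = set(members) | {s for segs in positions for s in segs}
        size = {y: len(members.get(y, {y})) for y in candidates}
        prior = {y: 1.0 / size[y] for y in candidates}  # REF-score initialization

        def joint(y, segs):  # P(y) * prod_k P(s^k | y), with smoothing
            p = prior[y]
            for s in segs:
                p *= (1.0 / size[y]) if s in members.get(y, {y}) else floor
            return p

        labels = []
        for _ in range(iters):
            # E-step: best label per position under the current parameters
            labels = [max(candidates, key=lambda y, segs=segs: joint(y, segs))
                      for segs in positions]
            # M-step: re-estimate the prior from the classification counts
            prior = {y: (labels.count(y) + floor) / len(labels) for y in candidates}
        return labels  # labels[i] is the y_i chosen for position i

On the year example, the joint over all three segments lets [year] beat any single lexical item, since a lexical label explains only one of the segments.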

Intent Summarization Any clusters that have the same pattern after intent summarization are merged, which gives the final output of intent summarization on the example query cluster. We could further merge clusters based on semantic intent, since an intent may be expressed by several different but related patterns, e.g. "[city] map" and "map of [city]".

Instance Annotation For each pattern p = (y_1, y_2, ..., y_L), we define z_i to be a feature vector in the same space as the word tokens that represents the i-th element of the pattern. Given this representation of a pattern, a new instance can be classified into the closest existing pattern z*. If the minimum distance is above a threshold τ, the query is judged not similar enough to any of the existing patterns and is not classified into one.
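Annotation then reduces to a nearest-pattern search with rejection; a sketch reusing dtw_distance from above, where each pattern is stored as its sequence of z_i vectors (names hypothetical):

    def annotate(query_vectors, patterns, tau):
        """Assign a new query to the closest pattern z*; if the minimum
        DTW distance exceeds tau, leave the query unclassified."""
        best, best_d = None, float("inf")
        for name, Z in patterns.items():   # Z: the pattern's z_i sequence
            d = dtw_distance(query_vectors, Z)
            if d < best_d:
                best, best_d = name, d
        return best if best_d <= tau else None  # None: no pattern fits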

Experiment Three evaluations: Cluster Pattern Precision, Query Labeling Precision, Domain Coverage.

Evaluation 1 Cluster Pattern Precision The single-link clusters were more precise, while the complete-link algorithm discovered more clusters.

Evaluation 2 Query Labeling Precision

Evaluation 3 Domain Coverage

Conclusion We have presented an unsupervised method to find clusters of queries with similar intent, induce patterns to describe these clusters, and classify new queries into the discovered clusters, all using only domain-independent resources. Our work can facilitate structured search development by reducing the amount of manual labour needed to extend structured search into a new domain.