Computational User Intent Modeling

Hongning Wang March 6, 2013

Research Summary
Joint relevance and freshness learning [WWW’12]
Content-Aware Click Modeling [WWW’13]
Cross-Session Search Task Extraction [WWW’13]

Understanding User Intent is Important
“Apple”
Oct. 4, 2011: release of iPhone 4S
Take the events surrounding Apple Inc. in October 2011 as an example. When a user submitted the query “Apple Company” to a news search engine on Oct. 4, 2011, she expected to find a list of news reports covering the new release of Apple’s iPhone 4S.

Understanding User Intent is Important
“Apple”
Oct. 4, 2011: release of iPhone 4S
Oct. 5, 2011: Steve Jobs passed away
However, the news about the release of the iPhone 4S became less relevant to the query just one day later, when Apple Inc.’s former CEO Steve Jobs passed away on Oct. 5. Such cases are quite common in news search, and they urge us to seriously consider and model freshness, and the trade-off between relevance and freshness, in the news search scenario.

Relevance vs. Freshness
Relevance
  Topical relatedness
  Metric: tf*idf, BM25, Language Model
Freshness
  Temporal closeness
  Metric: age, elapsed time
Trade-off
  Query specific
  To meet user’s information need
In traditional web search, relevance is narrowly defined as topical relatedness, and various ranking features have been proposed to capture it, including tf*idf, BM25 and language modeling approaches. In news search, by contrast, the increased importance of freshness is a distinctive characteristic. The emphasis on freshness is crucial because even a seemingly current event can quickly become outdated, dwarfed by the importance of new developments. In addition, the relative emphasis, or trade-off, between these two aspects is highly specific to the query and to the time when the query was issued. This trade-off reflects the end user’s underlying information need.

Joint Relevance and Freshness Learning
Our Contribution
JRFL: (Relevance, Freshness) -> Click
Query => trade-off
URL => relevance/freshness
Click => overall impression
Inspired by this example, we propose to model users' click behavior in news search as a direct consequence of examining the relevance and freshness aspects of the returned documents. We name our method Joint Relevance and Freshness Learning, or JRFL for short. The user’s decision process discussed on the last slide can be abstracted by this figure. In addition to giving a good estimate of a candidate URL’s relevance and freshness, a good ranking function should also infer the trade-off and return the optimally combined ranking for each individual query, because the relative emphasis users put on these two aspects can vary substantially, reflecting the search intent for the specific news event. However, we cannot explicitly obtain users' relevance/freshness judgments or their preferences over these two aspects, since their evaluation process is not directly observable by the search engine. Fortunately, users' click patterns are recorded, and from them we can assume that the clicked documents are more meaningful to the user than the non-clicked ones. Therefore, we model relevance and freshness as two latent factors and assume that a linear combination of the two, which is also latent, generates the observed click preferences.
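The idea of learning from click preferences can be sketched in a few lines of Python. This is a minimal illustration and not the JRFL implementation: the function names, the convex form of the combination, the margin value, and the toy scores are all assumptions made for the example. It turns a click pattern into "clicked over non-clicked" pairs and measures how well a combined score respects them.

```python
from typing import List, Tuple


def click_preferences(clicked: List[bool]) -> List[Tuple[int, int]]:
    """Return (preferred, less_preferred) index pairs: every clicked
    document is preferred over every non-clicked one in the same list."""
    return [(i, j)
            for i, ci in enumerate(clicked) if ci
            for j, cj in enumerate(clicked) if not cj]


def combined_score(relevance: float, freshness: float, alpha: float) -> float:
    """Assumed convex combination; alpha is a hypothetical
    query-specific freshness weight in [0, 1]."""
    return alpha * freshness + (1.0 - alpha) * relevance


def pairwise_hinge_loss(scores: List[float],
                        prefs: List[Tuple[int, int]],
                        margin: float = 1.0) -> float:
    """Penalize each preference pair whose score gap is below the margin."""
    return sum(max(0.0, margin - (scores[i] - scores[j])) for i, j in prefs)


# Toy result list: only the second document was clicked.
clicked = [False, True, False]
relevance = [2.0, 1.0, 0.5]   # hypothetical latent relevance scores
freshness = [0.5, 3.0, 1.0]   # hypothetical latent freshness scores
prefs = click_preferences(clicked)

# A freshness-heavy weight explains the click; a relevance-only weight does not.
loss_fresh = pairwise_hinge_loss(
    [combined_score(r, f, 0.8) for r, f in zip(relevance, freshness)], prefs)
loss_relevance = pairwise_hinge_loss(
    [combined_score(r, f, 0.0) for r, f in zip(relevance, freshness)], prefs)
```

In this toy case the freshness-heavy weight incurs no loss while the relevance-only weight does, which is the kind of signal a learner could use to infer the trade-off for a query.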

Quantitative Comparison
Ranking performance
Random bucket clicks
To validate the effectiveness of the proposed JRFL model in real news search tasks, we quantitatively compare it with all our baseline methods on random bucket clicks, normal clicks, and editorial judgments. All the click-based learning algorithms are trained on all 150K click preferences. Since all these models have several parameters to be tuned, we report their best performance on the corresponding test set according to the metric in the following results, and we perform a t-test to validate the significance of the improvement (against the second-best performance in each case). On the unbiased random bucket clicks, the proposed JRFL model achieves an encouraging improvement over the second-best GBRank model; in particular, the relative improvement is over 5.88%. This improvement confirms that properly integrating relevance and freshness can indeed improve users' search satisfaction.

Content-Aware Click Modeling
Study the underlying mechanism of user clicks
Freshness weight=0.8
R=0.39 F=2.34 Y=1.95
R=1.72 F=2.18 Y=2.01
R=2.41 F=1.76 Y=2.09

Modeling User Clicks
Match my query?
Redundant doc?
Shall I move on?

Content-Aware Click Modeling
Our Contribution
Encode rich dependency within user browsing behaviors via descriptive features
  Chance to further examine the result documents: e.g., position, # clicks, distance to last click
  Chance to click on an examined and relevant document: e.g., clicked/skipped content similarity
  Relevance quality of a document: e.g., ranking features
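One way to picture the three feature groups above is as three logistic components whose outputs are multiplied. This is a deliberately simplified sketch, not the model from the talk: the factorization, the function names, the feature values, and the weights are all hypothetical.

```python
import math
from typing import Sequence


def logistic(weights: Sequence[float], features: Sequence[float]) -> float:
    """Logistic function of a weighted feature sum."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def click_probability(exam_feats: Sequence[float], w_exam: Sequence[float],
                      click_feats: Sequence[float], w_click: Sequence[float],
                      rel_feats: Sequence[float], w_rel: Sequence[float]) -> float:
    """Assumed factorization: the user examines the document, the document
    is relevant, and the user then decides to click on it."""
    p_examine = logistic(w_exam, exam_feats)    # e.g. position, # clicks so far
    p_relevant = logistic(w_rel, rel_feats)     # e.g. ranking features
    p_click = logistic(w_click, click_feats)    # e.g. similarity to clicked/skipped docs
    return p_examine * p_relevant * p_click


# Hypothetical features: [position] for examination, [similarity to an
# already-clicked document] for clicking, [ranking score] for relevance.
p_top = click_probability([1.0], [-0.5], [0.2], [-1.0], [1.5], [1.0])
p_low = click_probability([5.0], [-0.5], [0.2], [-1.0], [1.5], [1.0])
p_neutral = click_probability([1.0], [0.0], [0.2], [0.0], [1.5], [0.0])
```

With a negative weight on position, the same document is less likely to be clicked when it is shown lower in the list, which is the kind of dependency the descriptive features are meant to capture.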

Experimental Results
Take advantage of both counting-based and feature-based methods

Learning to Extract Search Tasks
An atomic information need that may result in one or more queries
Day        Session  Timestamp       Query
5/29/2012  S1       5/29/2012 5:26  bank of america
5/29/2012  S2       5/29/ :11       macy's sale
5/29/2012  S2       5/29/ :12       sas shoes
5/30/2012  S1       5/30/ :19       credit union
5/30/2012  S2       5/30/ :25       6pm.com
5/30/2012  S2       5/30/ :49       coupon for 6pm shoes

Our Contribution: Semi-supervised Structural Learning
Solution
Heuristic constraints
  Identical queries
  Sub-queries
  Identical clicked URLs
Structural knowledge
  Same task => tasks sharing related queries
  Latent
Semi-supervised
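The three heuristic constraints can be illustrated as must-link rules that merge queries into tasks. The sketch below is an assumption-laden illustration rather than the method from the talk: "sub-query" is read as "proper subset of terms", the clicked URLs are invented placeholders, and a simple union-find does the merging.

```python
from typing import Dict, List


def must_link(a: Dict, b: Dict) -> bool:
    """Heuristic constraints: identical queries, sub-queries (assumed here
    to mean a proper subset of terms), or a shared clicked URL."""
    terms_a, terms_b = set(a["query"].lower().split()), set(b["query"].lower().split())
    if terms_a == terms_b:
        return True
    if terms_a < terms_b or terms_b < terms_a:
        return True
    return bool(a["clicks"] & b["clicks"])


def group_tasks(records: List[Dict]) -> List[List[str]]:
    """Merge records connected by must-link rules using union-find."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if must_link(records[i], records[j]):
                parent[find(j)] = find(i)

    groups: Dict[int, List[str]] = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec["query"])
    return list(groups.values())


# Queries borrowed from the task-structure example; the URLs are hypothetical.
records = [
    {"query": "amazon.com international sites", "clicks": {"https://example.com/a"}},
    {"query": "pottery barn warehouse clearance sale", "clicks": set()},
    {"query": "amazon.com international", "clicks": set()},
    {"query": "amazon.com phone number", "clicks": {"https://example.com/b"}},
    {"query": "amazon customer service phone number", "clicks": {"https://example.com/b"}},
]
tasks = group_tasks(records)
```

Here the first and third queries are merged by the sub-query rule, and the last two by the shared click, while the remaining query stays in a task of its own.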

Semi-supervised Structural Learning
Our Contribution
Structural inference
Hierarchical clustering on best links
Flexibility
Exact inference exists
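Clustering on best links can be sketched as follows: each query, taken in time order, links to the single earlier query it matches best, or starts a new task if nothing matches well enough. This is only an illustration. The learned scoring function is replaced here by a plain Jaccard term overlap, and the threshold of 0.1 is an arbitrary assumption. Because each query picks its link independently, the best structure under this scoring can be found exactly.

```python
from typing import List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Term-overlap similarity, standing in for a learned link score."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def best_link_tasks(queries: List[str],
                    threshold: float = 0.1) -> Tuple[List[Optional[int]], List[int]]:
    """For each query, keep only its best link to an earlier query
    (None starts a new task), then follow links back to the task root."""
    links: List[Optional[int]] = []
    for i, query in enumerate(queries):
        best_j, best_score = None, threshold
        for j in range(i):
            score = jaccard(query, queries[j])
            if score > best_score:
                best_j, best_score = j, score
        links.append(best_j)

    task_of: List[int] = []
    for i, j in enumerate(links):
        task_of.append(i if j is None else task_of[j])
    return links, task_of


queries = [
    "piero barone's 19th birthday plans",
    "amazon.com phone number",
    "piero barone family",
    "amazon customer service phone number",
    "piero barone singing piove",
]
links, task_of = best_link_tasks(queries)
```

The resulting links form a small forest: the last query hangs off "piero barone family", which in turn hangs off the first query, giving the kind of nested task structure the approach aims to expose.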

Experimental Results

Plausible explanation of task structure
1 il volo singing tous les visages de l'amour
  1.1 french version of album by il volo
    1.1.1 french version of album by il volo
      french version of album by il volo
2 amazon.com international sites
  2.1 amazon.com international
3 pottery barn warehouse clearance sale
4 amazon.com phone number
  4.1 amazon.com phone number
    4.1.1 amazon customer service phone number
      amazon customer service phone number
5 condo rentals in salter path, n.c.
6 piero barone's 19th birthday plans
  6.1 piero barone family
    6.1.1 piero barone family
  6.2 piero barone's 19th birthday plans
      piero barone's 19th birthday plans
    6.2.2 piero barone's 19th birthday plans
      piero barone singing piove
      piero barone singing piove

Publications
Hongning Wang, Anlei Dong, Lihong Li, Yi Chang and Evgeniy Gabrilovich. Joint Relevance and Freshness Learning From Clickthroughs for News Search. The 2012 World Wide Web Conference (WWW'2012)
Hongning Wang, ChengXiang Zhai, Anlei Dong and Yi Chang. Content-Aware Click Modeling. The 22nd International World-Wide Web Conference (WWW'2013) (To Appear)
Hongning Wang, Yang Song, Ming-Wei Chang, Xiaodong He, Ryen White and Wei Chu. Learning to Extract Cross-Session Search Tasks. The 22nd International World-Wide Web Conference (WWW'2013) (To Appear)
Yang Song, Hao Ma, Hongning Wang and Kuansan Wang. Exploring and Exploiting User Search Behaviors on Mobile and Tablet Devices to Improve Search Relevance. The 22nd International World-Wide Web Conference (WWW'2013) (To Appear)
Ryen White, Wei Chu, Ahmed Hassan, Xiaodong He, Yang Song and Hongning Wang. Enhancing Personalized Search by Mining and Modeling Task Behavior. The 22nd International World-Wide Web Conference (WWW'2013) (To Appear)
Chi Wang, Hongning Wang, Jialu Liu, Ming Ji, Lu Su, Yuguo Chen, Jiawei Han. On the Detectability of Node Grouping in Networks. SIAM International Conference on Data Mining (SDM'2013) (To Appear)
Hongbo Deng, Jiawei Han, Hao Li, Heng Ji, Hongning Wang and Yue Lu. Exploring and Inferring User-User Pseudo-Friendship for Sentiment Analysis with Heterogeneous Networks. SIAM International Conference on Data Mining (SDM'2013) (To Appear)
Mianwei Zhou, Hongning Wang and Kevin Chen-Chuan Chang. Learning to Rank from Distant Supervision: Exploiting Noisy Redundancy for Relational Entity Search. The 29th IEEE International Conference on Data Engineering (ICDE'2013)
Yue Lu, Hongning Wang, ChengXiang Zhai and Dan Roth. Unsupervised Discovery of Opposing Opinion Networks From Forum Discussions. The 21st ACM International Conference on Information and Knowledge Management (CIKM'2012)

Thank you! Q&A

User’s Judgment on Relevance and Freshness
Relevance vs. Freshness
User’s searching behavior
Freshness weight=0.8
R=1.72 F=2.18 Y=2.01
R=2.41 F=1.76 Y=2.09
R=0.39 F=2.34 Y=1.95
Suppose that when a user submits a query to a news search engine and gets a corresponding list of ranked news documents, she has a clear idea of her emphasis on relevance versus freshness in the expected results. She then sequentially judges the usefulness of each document by her underlying sense of relevance and freshness, and gives it an overall impression grade according to her preference over relevance and freshness at that particular time. Once she has these impressions in mind, she deliberately clicks the documents most interesting to her and skips all the others.
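The judgment process described above can be mimicked with a small worked example. The sketch assumes the overall impression Y is a convex combination of R and F with the freshness weight shown on the slide; the document labels A, B, C and the function names are hypothetical. Under that assumption the result with R=0.39, F=2.34 gets Y=1.95, as printed on the slide, while the values computed for the other two results differ from the slide, so the slide's exact scoring form is not recovered here.

```python
from typing import Dict, List, Tuple


def overall_impression(relevance: float, freshness: float,
                       freshness_weight: float) -> float:
    """Assumed form: Y = w * F + (1 - w) * R."""
    return freshness_weight * freshness + (1.0 - freshness_weight) * relevance


def rank_by_impression(docs: Dict[str, Tuple[float, float]],
                       freshness_weight: float) -> List[str]:
    """Order documents by decreasing overall impression."""
    return sorted(
        docs,
        key=lambda d: overall_impression(docs[d][0], docs[d][1], freshness_weight),
        reverse=True,
    )


# (R, F) pairs taken from the slide; the labels are made up for the example.
docs = {"A": (0.39, 2.34), "B": (1.72, 2.18), "C": (2.41, 1.76)}

y_a = overall_impression(0.39, 2.34, 0.8)
fresh_first = rank_by_impression(docs, 0.8)      # user cares mostly about freshness
relevant_first = rank_by_impression(docs, 0.2)   # user cares mostly about relevance
```

Changing only the weight reorders the same three documents, which is why the trade-off has to be inferred per query rather than fixed globally.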

User Clicks Are Biased
Position-bias
Higher position, more clicks
Not necessarily relevant
Modeling Clicks => Decompose relevance-driven clicks from position-driven clicks
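A common way to separate the two effects is the examination hypothesis: a click requires that the position was examined and that the document is relevant. The sketch below uses that standard assumption, which is not necessarily the decomposition used in the talk; the per-position examination probabilities and the click log are invented for the example.

```python
from typing import Dict, List, Tuple


def debiased_relevance(impressions: List[Tuple[str, int, bool]],
                       exam_prob: Dict[int, float]) -> Dict[str, float]:
    """Estimate relevance as clicks divided by expected examinations,
    assuming P(click) = P(examined | position) * P(relevant | document)."""
    clicks: Dict[str, int] = {}
    exposure: Dict[str, float] = {}
    for doc, position, clicked in impressions:
        clicks[doc] = clicks.get(doc, 0) + (1 if clicked else 0)
        exposure[doc] = exposure.get(doc, 0.0) + exam_prob[position]
    return {doc: clicks[doc] / exposure[doc] for doc in clicks}


# Hypothetical examination probabilities: position 2 is seen half as often.
exam_prob = {1: 1.0, 2: 0.5}

# Document A is always shown first, document B always second.
impressions = (
    [("A", 1, True)] * 2 + [("A", 1, False)] * 2
    + [("B", 2, True)] * 1 + [("B", 2, False)] * 3
)
estimates = debiased_relevance(impressions, exam_prob)
```

Document A receives twice as many clicks as B, yet both come out equally relevant once the lower examination chance at position 2 is taken into account.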

Learning to Extract Search Tasks
An atomic information need that may result in one or more queries
An impression
tѱ = 30 minutes
