Hongbo Deng, Michael R. Lyu and Irwin King

Slides:



Advertisements
Similar presentations
A Comparison of Implicit and Explicit Links for Web Page Classification Dou Shen 1 Jian-Tao Sun 2 Qiang Yang 1 Zheng Chen 2 1 Department of Computer Science.
Advertisements

Incorporating Participant Reputation in Community-driven Question Answering Systems Liangjie Hong, Zaihan Yang and Brian D. Davison Computer Science and.
TI: An Efficient Indexing Mechanism for Real-Time Search on Tweets Chun Chen 1, Feng Li 2, Beng Chin Ooi 2, and Sai Wu 2 1 Zhejiang University, 2 National.
Site Level Noise Removal for Search Engines André Luiz da Costa Carvalho Federal University of Amazonas, Brazil Paul-Alexandru Chirita L3S and University.
WSCD INTRODUCTION  Query suggestion has often been described as the process of making a user query resemble more closely the documents it is expected.
A New Suffix Tree Similarity Measure for Document Clustering Hung Chim, Xiaotie Deng City University of Hong Kong WWW 2007 Session: Similarity Search April.
Context-aware Query Suggestion by Mining Click-through and Session Data Authors: H. Cao et.al KDD 08 Presented by Shize Su 1.
Simrank++: Query Rewriting through link analysis of the click graph Ioannis Antonellis Hector Garcia-Molina
Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1
Ph.D. Thesis Defense 1 Web Mining Techniques for Query Log Analysis and Expertise Retrieval Hongbo Deng Department of Computer Science and Engineering.
Context-Aware Query Classification Huanhuan Cao 1, Derek Hao Hu 2, Dou Shen 3, Daxin Jiang 4, Jian-Tao Sun 4, Enhong Chen 1 and Qiang Yang 2 1 University.
ACM Multimedia th Annual Conference, October , 2004
Author: Jason Weston et., al PANS Presented by Tie Wang Protein Ranking: From Local to global structure in protein similarity network.
VIPAS: Virtual Link Powered Authority Search in the Web Chi-Chun Lin and Ming-Syan Chen Network Database Laboratory National Taiwan University.
Efficient Convex Relaxation for Transductive Support Vector Machine Zenglin Xu 1, Rong Jin 2, Jianke Zhu 1, Irwin King 1, and Michael R. Lyu 1 4. Experimental.
1 Integrating User Feedback Log into Relevance Feedback by Coupled SVM for Content-Based Image Retrieval 9-April, 2005 Steven C. H. Hoi *, Michael R. Lyu.
Investigation of Web Query Refinement via Topic Analysis and Learning with Personalization Department of Systems Engineering & Engineering Management The.
1 Extending Link-based Algorithms for Similar Web Pages with Neighborhood Structure Allen, Zhenjiang LIN CSE, CUHK 13 Dec 2006.
1 PageSim: A Link-based Similarity Measure for the World Wide Web Zhenjiang Lin, Irwin King, and Michael, R., Lyu Computer Science & Engineering, The Chinese.
1 COMP4332 Web Data Thanks for Raymond Wong’s slides.
Optimizing Learning with SVM Constraint for Content-based Image Retrieval* Steven C.H. Hoi 1th March, 2004 *Note: The copyright of the presentation material.
SIGIR’09 Boston 1 Entropy-biased Models for Query Representation on the Click Graph Hongbo Deng, Irwin King and Michael R. Lyu Department of Computer Science.
Jinhui Tang †, Shuicheng Yan †, Richang Hong †, Guo-Jun Qi ‡, Tat-Seng Chua † † National University of Singapore ‡ University of Illinois at Urbana-Champaign.
Liang Xiang, Quan Yuan, Shiwan Zhao, Li Chen, Xiatian Zhang, Qing Yang and Jimeng Sun Institute of Automation Chinese Academy of Sciences, IBM Research.
Table 3:Yale Result Table 2:ORL Result Introduction System Architecture The Approach and Experimental Results A Face Processing System Based on Committee.
1 Formal Models for Expert Finding on DBLP Bibliography Data Presented by: Hongbo Deng Co-worked with: Irwin King and Michael R. Lyu Department of Computer.
Using Hyperlink structure information for web search.
When Experts Agree: Using Non-Affiliated Experts To Rank Popular Topics Meital Aizen.
ICML2004, Banff, Alberta, Canada Learning Larger Margin Machine Locally and Globally Kaizhu Huang Haiqin Yang, Irwin King, Michael.
CS 533 Information Retrieval Systems.  Introduction  Connectivity Analysis  Kleinberg’s Algorithm  Problems Encountered  Improved Connectivity Analysis.
Contextual Ranking of Keywords Using Click Data Utku Irmak, Vadim von Brzeski, Reiner Kraft Yahoo! Inc ICDE 09’ Datamining session Summarized.
Zibin Zheng DR 2 : Dynamic Request Routing for Tolerating Latency Variability in Cloud Applications CLOUD 2013 Jieming Zhu, Zibin.
Exploit of Online Social Networks with Community-Based Graph Semi-Supervised Learning Mingzhen Mo and Irwin King Department of Computer Science and Engineering.
An Efficient Linear Time Triple Patterning Solver Haitong Tian Hongbo Zhang Zigang Xiao Martin D.F. Wong ASP-DAC’15.
Jiafeng Guo(ICT) Xueqi Cheng(ICT) Hua-Wei Shen(ICT) Gu Xu (MSRA) Speaker: Rui-Rui Li Supervisor: Prof. Ben Kao.
1 Heat Diffusion Classifier on a Graph Haixuan Yang, Irwin King, Michael R. Lyu The Chinese University of Hong Kong Group Meeting 2006.
Institute of Computing Technology, Chinese Academy of Sciences 1 A Unified Framework of Recommending Diverse and Relevant Queries Speaker: Xiaofei Zhu.
Mutual-reinforcement document summarization using embedded graph based sentence clustering for storytelling Zhengchen Zhang , Shuzhi Sam Ge , Hongsheng.
Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006.
1 Evaluating High Accuracy Retrieval Techniques Chirag Shah,W. Bruce Croft Center for Intelligent Information Retrieval Department of Computer Science.
Post-Ranking query suggestion by diversifying search Chao Wang.
Recommender Systems with Social Regularization Hao Ma, Dengyong Zhou, Chao Liu Microsoft Research Michael R. Lyu The Chinese University of Hong Kong Irwin.
26/01/20161Gianluca Demartini Ranking Categories for Faceted Search Gianluca Demartini L3S Research Seminars Hannover, 09 June 2006.
11 A Classification-based Approach to Question Routing in Community Question Answering Tom Chao Zhou 1, Michael R. Lyu 1, Irwin King 1,2 1 The Chinese.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Online Evolutionary Collaborative Filtering RECSYS 2010 Intelligent Database Systems Lab. School of Computer Science & Engineering Seoul National University.
Analyzing and Predicting Question Quality in Community Question Answering Services Baichuan Li, Tan Jin, Michael R. Lyu, Irwin King, and Barley Mak CQA2012,
1 Random Walks on the Click Graph Nick Craswell and Martin Szummer Microsoft Research Cambridge SIGIR 2007.
MMM2005The Chinese University of Hong Kong MMM2005 The Chinese University of Hong Kong 1 Video Summarization Using Mutual Reinforcement Principle and Shot.
Toward Entity Retrieval over Structured and Text Data Mayssam Sayyadian, Azadeh Shakery, AnHai Doan, ChengXiang Zhai Department of Computer Science University.
Autumn Web Information retrieval (Web IR) Handout #14: Ranking Based on Click Through data Ali Mohammad Zareh Bidoki ECE Department, Yazd University.
Experience Report: System Log Analysis for Anomaly Detection
A Collaborative Quality Ranking Framework for Cloud Components
Xiang Li,1 Lili Mou,1 Rui Yan,2 Ming Zhang1
Deeply learned face representations are sparse, selective, and robust
HITS Hypertext-Induced Topic Selection
Methods and Apparatus for Ranking Web Page Search Results
Video Summarization by Spatial-Temporal Graph Optimization
Compact Query Term Selection Using Topically Related Text
Location Recommendation — for Out-of-Town Users in Location-Based Social Network Yina Meng.
Wikitology Wikipedia as an Ontology
Postdoc, School of Information, University of Arizona
Cross-library API Recommendation Using Web Search Engines
Zhenjiang Lin, Michael R. Lyu and Irwin King
Web Information retrieval (Web IR)
WorkShop on Community Question Answering on the Web
Personalized Celebrity Video Search Based on Cross-space Mining
Junghoo “John” Cho UCLA
Web Information retrieval (Web IR)
--WWW 2010, Hongji Bao, Edward Y. Chang
Presentation transcript:

A Generalized Co-HITS Algorithm and Its Application to Bipartite Graphs Hongbo Deng, Michael R. Lyu and Irwin King Department of Computer Science and Engineering The Chinese University of Hong Kong July 1st, 2009

Incorporate Content with Graph Introduction Many data can be modeled as bipartite graphs IR Models for Link Analysis for Content Graph - VSM - Language Model - etc. - HITS - PageRank - etc. Relevance Semantic relations Incorporate Content with Graph - Personalized PageRank (PPR) - Linear Combination - etc.

An Illustration HITS PPR More reasonable Noisy link data Query suggestion for query “map”: Noisy link data Lack of relevance constraints HITS PPR More reasonable mapquest united states map map of florida us map world map google mapquest google.com map quest weather mapquest map quest google united states map mapquest.com

Outline Introduction Generalized Co-HITS Experiments Conclusion Preliminaries Iterative Framework Regularization Framework Experiments Conclusion

Preliminaries Content Graph X Y Explicit links: Hidden links:

Generalized Co-HITS Basic idea Incorporate the bipartite graph with the content information from both sides Initialize the vertices with the relevance scores x0, y0 Propagate the scores (mutual reinforcement) Initial scores Score propagation

Generalized Co-HITS Iterative framework

Iterative  Regularization Framework Consider the vertices on one side Smoothness Fit initial scores

Generalized Co-HITS Regularization Framework R1 R3 R2 Wuu Wvv Intuition: the highly connected vertices are most likely to have similar relevance scores.

Generalized Co-HITS Regularization Framework The cost function: Optimization problem: Solution:

Application to Query-URL Bipartite Graphs Bipartite graph construction Edge weighted by the click frequency Normalize to obtain the transition matrix Overall Algorithm In this paper, we apply the proposed method to the query-URL bipartite graph for query suggestion. Please refer to our paper for more details.

Outline Introduction Preliminaries Generalized Co-HITS Experiments Iterative Framework Regularization Framework Experiments Conclusion

Experimental Evaluation Data collection AOL query log data Cleaning the data Removing the queries that appear less than 2 times Combining the near-duplicated queries 883,913 queries and 967,174 URLs 4,900,387 edges 250,127 unique terms

Evaluation: ODP Similarity A simple measure of similarity among queries using ODP categories (query  category) Definition: Example: Q1: “United States”  “Regional > North America > United States” Q2: “National Parks”  “Regional > North America > United States > Travel and Tourism > National Parks and Monuments” Precision at rank n (P@n): 300 distinct queries 3/5

Experimental Results Comparison of Iterative Framework Result 1: personalized PageRank one-step propagation general Co-HITS Result 1: The improvements of OSP and CoIter over the baseline (the dashed line) are promising when compared to the PPR. The initial relevance scores from both sides provide valuable information.

Experimental Results Comparison of Regularization Framework Result 2: single-sided regularization double-sided regularization Result 2: SiRegu can improve the performance over the baseline. CoRegu performs better than SiRegu, which owes to the newly developed cost function R3. Moreover, CoRegu is relatively robust.

Experimental Results Detailed Results Result 3: The CoRegu-0.5 achieves the best performance. It is very essential and promising to consider the double-sided regularization framework for the bipartite graph.

Conclusions Propose the Co-HITS algorithm to incorporate the bipartite graph with the content information from both sides. The Co-HITS algorithm is more general, which includes HITS and personalized PageRank as special cases. The CoRegu is more robust with the newly developed cost function, which achieves the best performance with consistent and promising improvements.

Q&A Thanks!