2010 © University of Michigan 1 DivRank: Interplay of Prestige and Diversity in Information Networks Qiaozhu Mei 1,2, Jian Guo 3, Dragomir Radev 1,2 1.

Slides:



Advertisements
Similar presentations
Markov Models.
Advertisements

Analysis and Modeling of Social Networks Foudalis Ilias.
1 The PageRank Citation Ranking: Bring Order to the web Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd Presented by Fei Li.
Trust and Profit Sensitive Ranking for Web Databases and On-line Advertisements Raju Balakrishnan (Arizona State University)
Ranking Web Sites with Real User Traffic Mark Meiss Filippo Menczer Santo Fortunato Alessandro Flammini Alessandro Vespignani Web Search and Data Mining.
Generative Models for the Web Graph José Rolim. Aim Reproduce emergent properties: –Distribution site size –Connectivity of the Web –Power law distriubutions.
Information Retrieval Lecture 8 Introduction to Information Retrieval (Manning et al. 2007) Chapter 19 For the MSc Computer Science Programme Dell Zhang.
Graph-based Text Summarization
Absorbing Random walks Coverage
The influence of search engines on preferential attachment Dan Li CS3150 Spring 2006.
Hierarchy in networks Peter Náther, Mária Markošová, Boris Rudolf Vyjde : Physica A, dec
Topology Generation Suat Mercan. 2 Outline Motivation Topology Characterization Levels of Topology Modeling Techniques Types of Topology Generators.
DATA MINING LECTURE 12 Link Analysis Ranking Random walks.
The Barabási-Albert [BA] model (1999) ER Model Look at the distribution of degrees ER ModelWS Model actorspower grid www The probability of finding a highly.
TSF: Trajectory-based Statistical Forwarding for Infrastructure-to-Vehicle Data Delivery in Vehicular Networks Jaehoon Jeong, Shuo Guo, Yu Gu, Tian He,
Evaluating Visual and Statistical Exploration of Scientific Literature Networks Robert Gove 1,3, Cody Dunne 1,3, Ben Shneiderman 1,3, Judith Klavans 2,
Maggie Zhou COMP 790 Data Mining Seminar, Spring 2011
Using Structure Indices for Efficient Approximation of Network Properties Matthew J. Rattigan, Marc Maier, and David Jensen University of Massachusetts.
CSE 522 – Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian.
1 Collaborative Filtering and Pagerank in a Network Qiang Yang HKUST Thanks: Sonny Chee.
Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, Slides for Chapter 1:
Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking vector is pre-computed.
2010 © University of Michigan 1 Text Retrieval and Data Mining in SI - An Introduction Qiaozhu Mei School of Information Computer Science and Engineering.
PageRank Identifying key users in social networks Student : Ivan Todorović, 3231/2014 Mentor : Prof. Dr Veljko Milutinović.
HCC class lecture 22 comments John Canny 4/13/05.
Using Social Networking Techniques in Text Mining Document Summarization.
Motivation When searching for information on the WWW, user perform a query to a search engine. The engine return, as the query’s result, a list of Web.
Generating Impact-Based Summaries for Scientific Literature Qiaozhu Mei, ChengXiang Zhai University of Illinois at Urbana-Champaign 1.
Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα
1 Applications of Relative Importance  Why is relative importance interesting? Web Social Networks Citation Graphs Biological Data  Graphs become too.
DATA MINING LECTURE 13 Absorbing Random walks Coverage.
Tag Ranking Present by Jie Xiao Dept. of Computer Science Univ. of Texas at San Antonio.
Diversity in Ranking via Resistive Graph Centers Avinava Dubey IBM Research India Soumen Chakrabarti IIT Bombay Chiranjib Bhattacharyya IISc Bangalore.
LexRank: Graph-based Centrality as Salience in Text Summarization
DATA MINING LECTURE 13 Pagerank, Absorbing Random Walks Coverage Problems.
The PageRank Citation Ranking: Bringing Order to the Web Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd Presented by Anca Leuca, Antonis Makropoulos.
LexPageRank: Prestige in Multi- Document Text Summarization Gunes Erkan and Dragomir R. Radev Department of EECS, School of Information University of Michigan.
Snowballing Effects in Preferential Attachment: The Impact of The Initial Links Huanyang Zheng and Jie Wu Computer and Information Sciences Temple University.
Ch 14. Link Analysis Padmini Srinivasan Computer Science Department
Link Analysis Rong Jin. Web Structure  Web is a graph Each web site correspond to a node A link from one site to another site forms a directed edge 
Ranking Link-based Ranking (2° generation) Reading 21.
RESEARCH EVALUATION - THE METRICS UNITED KINGDOM OCTOBER 2010.
1 1 COMP5331: Knowledge Discovery and Data Mining Acknowledgement: Slides modified based on the slides provided by Lawrence Page, Sergey Brin, Rajeev Motwani.
The Structure of Broad Topics on the Web Soumen Chakrabarti Mukul M. Joshi Kunal Punera (IIT Bombay) David M. Pennock (NEC Research Institute)
A Novel Relational Learning-to- Rank Approach for Topic-focused Multi-Document Summarization Yadong Zhu, Yanyan Lan, Jiafeng Guo, Pan Du, Xueqi Cheng Institute.
CSE 450 – Web Mining Seminar Professor Brian D. Davison Fall 2005 A PRESENTATION on What is this Page Known for? Computing Web Page Reputations D. Rafiei.
LexPageRank: Prestige in Multi-Document Text Summarization Gunes Erkan, Dragomir R. Radev (EMNLP 2004)
CS 540 Database Management Systems Web Data Management some slides are due to Kevin Chang 1.
Extrapolation to Speed-up Query- dependent Link Analysis Ranking Algorithms Muhammad Ali Norozi Department of Computer Science Norwegian University of.
1 CS 430 / INFO 430: Information Retrieval Lecture 20 Web Search 2.
GRAPH BASED MULTI-DOCUMENT SUMMARIZATION Canan BATUR
Competition under Cumulative Advantage
Topics In Social Computing (67810)
Link-Based Ranking Seminar Social Media Mining University UC3M
Text Retrieval and Data Mining in SI - An Introduction
Generative Model To Construct Blog and Post Networks In Blogosphere
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
Author: Kazunari Sugiyama, etc. (WWW2004)
The likelihood of linking to a popular website is higher
CS 440 Database Management Systems
Bring Order to The Web Ruey-Lung, Hsiao May 4 , 2000.
Department of Computer Science University of York
Prof. Paolo Ferragina, Algoritmi per "Information Retrieval"
Peer-to-Peer and Social Networks
Jiawei Han Department of Computer Science
Competition under Cumulative Advantage
Graph and Link Mining.
Diffusion in Networks
Sequences II Prof. Noah Snavely CS1114
Presented by Nick Janus
Presentation transcript:

2010 © University of Michigan 1 DivRank: Interplay of Prestige and Diversity in Information Networks Qiaozhu Mei 1,2, Jian Guo 3, Dragomir Radev 1,2 1. School of Information 2. Computer Science and Engineering 3. Department of Statistics University of Michigan

2010 © University of Michigan Diversity in Ranking 2 Ranking papers, people, web pages, movies, restaurants… Web search; ads; recommender systems … Network based ranking – centrality/prestige

2010 © University of Michigan Ranking by Random Walks 3 a d c b ? Ranking using stationary distribution E.g., PageRank

2010 © University of Michigan Reinforcements in Random Walks Random walks are not random - rich gets richer; –e.g., civilization/immigration – big cities attract larger population; –Tourism – busy restaurants attract more visitors; 4 Source - Conformity!

2010 © University of Michigan Vertex-Reinforced Random Walk (Pemantle 92) 5 a d c b Reinforced random walk: transition probability is reinforced by the weight (number of visits) of the target state transition probabilities change over time

2010 © University of Michigan DivRank A smoothed version of Vertex-reinforced Random Walk Adding self-links; Efficient approximations: use to approximate 6 Random jump, could be personalized “organic” transition probability a c b Cumulative DivRank: Pointwise DivRank:

2010 © University of Michigan Experiments Three applications –Ranking movie actors (in co-star network) –Ranking authors/papers (in author/paper-citation network) –Text summarization (ranking sentences) Evaluation metrics: –diversity: density of subgraph; country coverage (actors) –quality: h-index (authors) ; # citation (papers); –quality + diversity: movie coverage (actors) ; impact coverage (papers) ; ROUGE (text summarization) 7

2010 © University of Michigan Results Divrank >> Grasshopper/MMR >> Pagerank 8 Divrank Grasshopper Pagerank Density Impact coverage Paper citation: Text Summarization:

2010 © University of Michigan Why Does it Work? Rich gets richer –Related to Polya’s urn and preferential attachment Compete for resource in neighborhood –Prestigious node absorbs weights of its neighbors An optimization explanation 9 a b Stay here or go to neighbors? c b

2010 © University of Michigan Summary DivRank – Prestige/Centrality + Diversity Mathematical foundation: vertex-reinforced random walk Connections: –Polya’s Urn –Preferential Attachments –Word burstiness Why it works? –Rich-gets-richer –Local resource competition Future work: Query dependent DivRank; 10

2010 © University of Michigan Thanks! 11

2010 © University of Michigan Optimization Interpretation Every time, the process distributes the weight towards the ratio of the reinforced degree 12