Link Building. Martin Olsen, Department of Computer Science, Aarhus University

Presentation transcript:

Outline
Motivation and Introduction
Contribution: Link Building, Communities in Networks, Hedonic Games, Simple Games

What is Search Engine Optimization (SEO)?
"... in 2012, companies will spend almost $9 billion on search engine optimization …" (The New York Times, January 2009)
Objective of SEO: a link to your page appears on page 1 of the search results.

The WWW as a Graph

PageRank. Random Surfer Perspective
Random surfers follow links from page to page. At each tick, a random surfer zaps with probability 0.15.
Snapshots shown: the initial population of random surfers, the distribution after one tick, and the stationary distribution after 50 ticks.
PageRank ranking: 1, 2, 4, 3, 6.
PageRank is an important ingredient of the ranking mechanism. Relevance counts as well!
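
The random surfer description above can be simulated in a few lines. The following is a minimal sketch, not the presenter's implementation: the function name `pagerank`, the dictionary graph format, and the treatment of pages without out-links (they behave as if the surfer always zaps) are assumptions made for illustration. The zap probability 0.15 and the 50 ticks come from the slides.

```python
def pagerank(out_links, zap=0.15, ticks=50):
    """Approximate PageRank by moving a population of random surfers.

    out_links: dict mapping each node to the list of nodes it links to
               (hypothetical input format).
    Returns a dict mapping each node to its share of the surfers.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # surfers start uniformly spread
    for _ in range(ticks):
        # Surfers that zap land on a page chosen uniformly at random.
        nxt = {v: zap / n for v in nodes}
        for u in nodes:
            # Assumption: a page without out-links sends surfers everywhere.
            targets = out_links[u] or nodes
            share = (1.0 - zap) * rank[u] / len(targets)
            for v in targets:
                nxt[v] += share
        rank = nxt
    return rank


if __name__ == "__main__":
    graph = {1: [2], 2: [1], 3: [1], 4: [1]}  # toy graph, not from the slides
    ranks = pagerank(graph)
    print(sorted(ranks, key=ranks.get, reverse=True))
```

Sorting the nodes by the returned values gives a PageRank ranking in the sense of the slide. A real search engine would combine it with relevance.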

Link Building is an Important Aspect of SEO

Contribution/Link Building
The Computational Complexity of Link Building (COCOON ´08). Olsen
Maximizing PageRank with new Backlinks (submitted). Olsen
MILP for Link Building (in preparation). Olsen, Viglas

The Link Building Problem. Formal Definition
LINK BUILDING
Instance: $G(V, E)$, $t \in V$, $k \in \mathbb{Z}^+$
Solution: $S \subseteq V \setminus \{t\}$ with $|S| = k$ maximizing $\pi_t$ after adding $S \times \{t\}$ to $E$
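
To make the definition concrete, here is a small brute-force sketch that tries every set of k source nodes and keeps the one giving the target the highest PageRank. It is only meant to illustrate the problem statement on toy inputs; it is exponential in k and is not an algorithm from the talk. The names `pagerank` and `best_backlinks` and the dictionary graph format are assumptions, and every node is assumed to have at least one out-link.

```python
from itertools import combinations


def pagerank(out_links, zap=0.15, ticks=50):
    """Power iteration; assumes every node has at least one out-link."""
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(ticks):
        nxt = {v: zap / n for v in nodes}
        for u in nodes:
            share = (1.0 - zap) * rank[u] / len(out_links[u])
            for v in out_links[u]:
                nxt[v] += share
        rank = nxt
    return rank


def best_backlinks(out_links, t, k):
    """Try every S of size k in V minus {t}; return (S, PageRank of t)."""
    candidates = [v for v in sorted(out_links) if v != t]
    best_set, best_rank = None, -1.0
    for S in combinations(candidates, k):
        # Copy the graph and add the links S x {t} (E is a set of links,
        # so a link that already exists is not added twice).
        g = {u: list(vs) for u, vs in out_links.items()}
        for s in S:
            if t not in g[s]:
                g[s].append(t)
        r = pagerank(g)[t]
        if r > best_rank:
            best_set, best_rank = S, r
    return best_set, best_rank


if __name__ == "__main__":
    graph = {0: [1], 1: [0], 2: [3], 3: [2]}  # toy graph, not from the slides
    print(best_backlinks(graph, t=0, k=1))
```

The loop performs one PageRank computation per candidate set, which is why this approach only works for very small k and n.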

Link Building is not Trivial

PageRank Topology
Theorem *), stated in terms of $z_{up}$: the expected number of visits to $p$ for a random surfer starting at $u$ prior to the first zapping event.
(Figure: nodes $i$ and $j$; increase in PageRank.)
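
The expected number of visits before the first zap (written z on later slides) can be computed directly from the random surfer model. The sketch below is an illustration under stated assumptions, not the method of the talk: it sums the probability mass of surfers that have not yet zapped over a fixed number of steps, it counts the start page as a visit at time 0, and it assumes every node has at least one out-link. The function names are hypothetical. Under this model the PageRank of p equals (zap / n) times the sum of z over all start pages u, which the second function uses.

```python
def visits_before_zap(out_links, zap=0.15, steps=200):
    """z[u][p]: expected visits to p, starting at u, before the first zap.

    Truncates the infinite sum after `steps` terms (assumption: the tail
    is negligible, since a surfer survives t steps with probability
    (1 - zap) ** t).
    """
    nodes = sorted(out_links)
    z = {u: {p: 0.0 for p in nodes} for u in nodes}
    for u in nodes:
        alive = {p: 0.0 for p in nodes}
        alive[u] = 1.0  # probability of being at p without having zapped
        for _ in range(steps):
            for p in nodes:
                z[u][p] += alive[p]
            nxt = {p: 0.0 for p in nodes}
            for v in nodes:
                share = (1.0 - zap) * alive[v] / len(out_links[v])
                for w in out_links[v]:
                    nxt[w] += share
            alive = nxt
    return z


def pagerank_from_visits(out_links, zap=0.15):
    """PageRank of p as (zap / n) * sum over u of z[u][p]."""
    z = visits_before_zap(out_links, zap)
    n = len(out_links)
    return {p: zap / n * sum(z[u][p] for u in out_links) for p in out_links}


if __name__ == "__main__":
    graph = {1: [2], 2: [1]}  # toy graph, not from the slides
    print(visits_before_zap(graph)[1])
    print(pagerank_from_visits(graph))
```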

$k$-REGULAR INDEPENDENT SET $\le_{\mathrm{FPT}}$ LINK BUILDING
Does the graph contain an independent set of size $k$? Can we turn this question into a Link Building problem?

$k$-REGULAR INDEPENDENT SET $\le_{\mathrm{FPT}}$ LINK BUILDING
Basic idea: make $z_{ij}$ relatively big.

$k$-REGULAR INDEPENDENT SET $\le_{\mathrm{FPT}}$ LINK BUILDING
Basic idea: make $z_{ij}$ relatively big.
LINK BUILDING is W[1]-hard *): LINK BUILDING solvable in time $f(k) \cdot n^c$ $\Rightarrow$ $k$-REGULAR INDEPENDENT SET solvable in time $f(k) \cdot n^c$ $\Rightarrow$ W[1] = FPT
Another result: FPTAS for LINK BUILDING $\Rightarrow$ NP = P

Upper Bound: k = 1 fixed
The dashed link can be found in time corresponding to O(1) PageRank computations with a randomized scheme *)

Upper Bound: Mixed Integer Linear Programming Approach *)
Compute the cheapest set of new incoming links that would make node 5 rank highest.
(Figure label: price for a link from $i$.)

A Quiz: Which of the two situations would be optimal for Martin?

Contribution/Communities in Networks
Communities in Large Networks: Identification and Ranking (WAW ´06). Olsen

Communities in Networks
Dolphins in Doubtful Sound [Newman, Girvan ´04]:

What is a Community?
Informally: A community $C$ is a set of nodes with relatively many links between them.
Assumption/Observation: A CS site has relatively many CS links!
Formal definition based on assumption *): $\forall v \notin C, \forall u \in C: w_{vC} \le w_{uC}$

A Greedy Approach for Detecting Members of a Community *)
Repeat until $C$ is a community:
  Find $v \notin C$ with maximum attention to $C$
  $C \leftarrow C \cup \{v\}$
  Update attentions
Use two priority queues holding the elements in $C$ and $V \setminus C$.
(Figure: 1) old $C$, 2) new $C$.)
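
The greedy loop above can be sketched as follows. This is a simplified illustration, not the implementation from the paper: it recomputes attentions from scratch instead of maintaining two priority queues, and it assumes that the attention a node pays to C is the fraction of its out-links that point into C, which the slides do not spell out. The function names and the toy graph are hypothetical.

```python
def attention(out_links, v, C):
    """Assumed definition: fraction of v's out-links that point into C."""
    targets = out_links[v]
    if not targets:
        return 0.0
    return sum(1 for x in targets if x in C) / len(targets)


def is_community(out_links, C):
    """Check that no outsider pays more attention to C than some member.

    Assumes C is non-empty.
    """
    outside = [v for v in out_links if v not in C]
    if not outside:
        return True  # the whole node set trivially satisfies the definition
    max_outside = max(attention(out_links, v, C) for v in outside)
    min_inside = min(attention(out_links, u, C) for u in C)
    return max_outside <= min_inside


def grow_community(out_links, representatives):
    """Add the outsider with maximum attention until C is a community."""
    C = set(representatives)
    while not is_community(out_links, C):
        outside = [v for v in out_links if v not in C]
        v = max(outside, key=lambda x: attention(out_links, x, C))
        C.add(v)
    return C


if __name__ == "__main__":
    graph = {  # toy graph, not from the slides
        "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
        "d": ["e"], "e": ["d"],
    }
    print(grow_community(graph, ["a"]))
```

The loop always terminates because the full node set satisfies the definition. Replacing the linear scans with the two priority queues mentioned on the slide would avoid recomputing every attention value in each round.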

An Experiment. A Danish CS Community
Crawl of the dk-domain with [...] sites in total
Representatives = 4 CS sites
CS-Community with 556 sites
Minimum attention (members of C): 15.8%
Maximum attention (non-members): 15.4%
Ranking:
1) (CS U Aarhus)
2) (CS U Copenhagen)
3) (ITU Copenhagen)
4) (CS U Aalborg)
5) (CS PhD School)
6) (Informatics/Mathematical modeling DTU Copenhagen)
…
17) (CS/Mathematics U Southern Denmark)

Other Results
Computing nontrivial communities by the definition given is NP-hard.
A simple model for the evolution of communities is presented. These communities are probably obeying the definition for large n if the out-degree of the nodes is (log n).

Contribution/Hedonic Games
Nash Stability in Additively Separable Hedonic Games Is NP-Hard (CiE ´07). Olsen
Extended version: Nash Stability in Additively Separable Hedonic Games and Community Structures (Theory of Computing Systems ´09). Olsen

An Additively Separable Hedonic Game
Five waterholes w1, …, w5 with capacities 1, 2, 3, 4 and 8 l/h respectively.
Two buffaloes b1 and b2 that hate each other. They are only thirsty if they have a parasite on their back, in which case they have to drink 9 l/h.
Two gigantic parasites p1 and p2. They only want to sit on b1 and b2 respectively.

An Additively Separable Hedonic Game
One Nash equilibrium for the game:
PARTITION $\le$ $\exists$NE in ASHG $\in$ NPC *)
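
Checking whether a given partition is a Nash equilibrium is easy, even though deciding whether one exists is hard. The following minimal sketch uses the standard notion of Nash stability in an additively separable hedonic game: a player's payoff is the sum of the values it assigns to the other members of its coalition, and no player may gain by moving to another coalition or by going alone. The function names and the numeric payoffs are hypothetical. In particular, the value -1 for the two buffaloes is an illustrative stand-in and not a number from the slides.

```python
def payoff(values, player, coalition):
    """Sum of the values `player` assigns to the other coalition members."""
    return sum(values[player].get(j, 0) for j in coalition if j != player)


def is_nash_stable(values, partition):
    """True if no player prefers another coalition or being alone."""
    for coalition in partition:
        for p in coalition:
            current = payoff(values, p, coalition)
            # Alternatives: any other existing coalition, or a new empty one.
            options = [c for c in partition if c is not coalition] + [set()]
            if any(payoff(values, p, c) > current for c in options):
                return False
    return True


if __name__ == "__main__":
    # Two players that dislike each other (hypothetical payoffs).
    values = {"b1": {"b2": -1}, "b2": {"b1": -1}}
    print(is_nash_stable(values, [{"b1", "b2"}]))
    print(is_nash_stable(values, [{"b1"}, {"b2"}]))
```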

Community Structures in Networks
Put a 1 on each connection between two dolphins. The community structure is a NE!
NE $\Rightarrow$ community structure?
NEs are NP-hard to compute even with symmetric and positive payoffs *)

Contribution/Simple Games
On the Complexity of Problems on Simple Games (submitted). Freixas, Molinero, Olsen, Serna

Open Problems/Future Work
In the thesis we show LINK BUILDING $\in$ APX. Is there a PTAS for LINK BUILDING?
Surgical Link Building: Isolate the community $C$. Model all pages in $V \setminus C$ as one page. Use MILP. Use information on the distribution of PageRank.
Does the stuff presented really work?
Thank You!

Link Building. A Real World Example
"Dear X. We are trying to get more links to our website to help improve its rating on the search engines. We were wondering if you could put a link to our site … on your webpage or blog. If you have a website or a blog and put a link to our page on it, then to say thank you, for each month it is up, I will give you …"
Source: An email to a colleague X

Link Building is not Trivial. 2nd Example
Assumption: Obtaining a link from one green node is slightly better for node 1 compared to obtaining a link from one blue node. Now node 1 can pick three incoming links for free. What should node 1 choose?

No FPTAS for LINK BUILDING if NP ≠ P *)

Power Law

Fixed Parameter Tractability: FPT and W[1]
FPT (solvable in time $f(k) \cdot n^c$): $k$-VERTEX COVER
Complete for W[1]: $k$-REGULAR INDEPENDENT SET, $k$-INDEPENDENT SET
LINK BUILDING is W[1]-hard *)
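
As an illustration of what "solvable in time f(k) · n^c" means, here is a minimal sketch of the classical bounded search tree algorithm for k-VERTEX COVER. It is a textbook technique, not material from the talk, and the function name and edge-list format are assumptions. Every edge must have an endpoint in the cover, so the algorithm branches on the two endpoints of an uncovered edge. The recursion depth is at most k, giving roughly 2^k times the number of edges as running time.

```python
def vertex_cover(edges, k):
    """Return a vertex cover with at most k vertices, or None if none exists.

    edges: list of (u, v) pairs (hypothetical input format).
    """
    if not edges:
        return set()  # nothing left to cover
    if k == 0:
        return None  # edges remain but the budget is used up
    u, v = edges[0]
    for pick in (u, v):
        # Choosing `pick` covers every edge that touches it.
        rest = [e for e in edges if pick not in e]
        sub = vertex_cover(rest, k - 1)
        if sub is not None:
            return sub | {pick}
    return None


if __name__ == "__main__":
    triangle = [(1, 2), (2, 3), (1, 3)]  # toy graph, not from the slides
    print(vertex_cover(triangle, 1))
    print(vertex_cover(triangle, 2))
```

No algorithm of this form is known for k-INDEPENDENT SET, which is what the W[1]-completeness on the slide expresses.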

Upper Bound: Mixed Integer Linear Programming Approach *)
The dashed links show the cheapest modification that will bring node 5 to the top of the ranking. Computed using a MILP approach. Alternatively, we could go for the maximum improvement in the ranking for a given budget.