Online Social Networks and Media: Absorbing Random Walks, Label Propagation, Opinion Formation

Random walk with absorbing nodes What happens if we do a random walk on this graph? What is the stationary distribution? All the probability mass ends up on the red sink node: – The red node is an absorbing node

Random walk with absorbing nodes What happens if we do a random walk on this graph? What is the stationary distribution? There are two absorbing nodes: the red and the blue. The probability mass will be divided between the two

Absorption probability If there is more than one absorbing node in the graph, a random walk that starts from a non-absorbing node will be absorbed in one of them with some probability – The probability of absorption gives an estimate of how close the node is to red or blue

Absorption probability Computing the probability of being absorbed: – The absorbing nodes have probability 1 of being absorbed in themselves and zero of being absorbed in another node. – For the non-absorbing nodes, take the (weighted) average of the absorption probabilities of your neighbors; if one of the neighbors is the absorbing node, it has probability 1 – Repeat until convergence (i.e., a very small change in the probabilities)
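As a concrete sketch, the repeated averaging above can be written in a few lines of Python. The graph, node names, and weights below are illustrative, not from the slides:

```python
# Iterative ("repeated averaging") computation of absorption probabilities.
# Graph, node names, and weights are illustrative, not from the slides.

def absorption_probs(neighbors, absorbing, target, iters=1000, tol=1e-9):
    """Probability that a walk started at each node is absorbed at `target`.

    neighbors: dict node -> list of (neighbor, weight) pairs
    absorbing: set of absorbing nodes
    """
    # Absorbing nodes are fixed: 1 for the target, 0 for the others.
    p = {u: (1.0 if u == target else 0.0) for u in neighbors}
    for _ in range(iters):
        delta = 0.0
        for u, nbrs in neighbors.items():
            if u in absorbing:
                continue
            total = sum(w for _, w in nbrs)
            # Weighted average of the neighbors' absorption probabilities.
            new = sum(w * p[v] for v, w in nbrs) / total
            delta = max(delta, abs(new - p[u]))
            p[u] = new
        if delta < tol:  # converged: very small change in the probabilities
            break
    return p

# Path graph red - a - b - blue, with red and blue made absorbing.
g = {
    "red":  [("a", 1)],
    "a":    [("red", 1), ("b", 1)],
    "b":    [("a", 1), ("blue", 1)],
    "blue": [("b", 1)],
}
p_red = absorption_probs(g, {"red", "blue"}, "red")
# p_red["a"] = 2/3 and p_red["b"] = 1/3: a is closer to red than b is.
```

On this four-node path the iteration recovers the classic gambler's-ruin answer: the node one hop from red is absorbed at red with probability 2/3, the node two hops away with probability 1/3.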

Why do we care? Why do we care to compute the absorption probability to sink nodes? Given a graph (directed or undirected) we can choose to make some nodes absorbing. – Simply direct all edges incident on the chosen nodes towards them. The absorbing random walk provides a measure of proximity of non-absorbing nodes to the chosen nodes. – Useful for understanding proximity in graphs – Useful for propagation in the graph E.g., some nodes have positive opinions on an issue, some have negative; to which opinion is a non-absorbing node closer?

Example In this undirected graph we want to learn the proximity of nodes to the red and blue nodes

Example Make the nodes absorbing

Absorption probability Compute the absorption probabilities for red and blue

Penalizing long paths The orange node has the same probability of reaching red and blue as the yellow one. Intuitively, though, it is further away.

Penalizing long paths Add a universal absorbing node to which each node gets absorbed with probability α. With probability α the random walk dies; with probability (1-α) the random walk continues as before. The longer the path from a node to an absorbing node, the more likely the random walk dies along the way, and the lower the absorption probability.
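One way to implement the universal absorbing node is to fold it into the update: damp each averaging step by (1-α). A sketch on the same illustrative path graph as before:

```python
# Absorption probabilities when the walk also dies with probability alpha at
# every step (the universal absorbing node). Graph and names are illustrative.

def absorption_probs_alpha(neighbors, absorbing, target, alpha=0.1,
                           iters=1000, tol=1e-9):
    p = {u: (1.0 if u == target else 0.0) for u in neighbors}
    for _ in range(iters):
        delta = 0.0
        for u, nbrs in neighbors.items():
            if u in absorbing:
                continue
            total = sum(w for _, w in nbrs)
            # With probability alpha the walk dies here; with probability
            # (1 - alpha) it steps to a random neighbor as before.
            new = (1 - alpha) * sum(w * p[v] for v, w in nbrs) / total
            delta = max(delta, abs(new - p[u]))
            p[u] = new
        if delta < tol:
            break
    return p

g = {
    "red":  [("a", 1)],
    "a":    [("red", 1), ("b", 1)],
    "b":    [("a", 1), ("blue", 1)],
    "blue": [("b", 1)],
}
p_red = absorption_probs_alpha(g, {"red", "blue"}, "red")
p_blue = absorption_probs_alpha(g, {"red", "blue"}, "blue")
# Now p_red["a"] + p_blue["a"] < 1: some probability mass dies on the way,
# and p_red["b"] < p_red["a"]: the farther node is penalized more.
```

With α = 0 this reduces to the plain absorption probabilities; larger α penalizes long paths more aggressively.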

Propagating values Assume that Red has a positive value and Blue a negative value – Positive/Negative class, Positive/Negative opinion We can compute a value for all the other nodes in the same way – This is the expected value for the node: the values of the absorbing nodes weighted by the absorption probabilities
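Instead of computing absorption probabilities and then averaging, we can fix the values at the absorbing nodes (+1 for Red, -1 for Blue) and run the same repeated averaging on the values directly. A minimal sketch on the illustrative path graph:

```python
# Propagating values: fix +1 at Red and -1 at Blue, repeatedly average.
# Graph and node names are illustrative, not from the slides.

def propagate_values(neighbors, fixed, iters=1000, tol=1e-9):
    """fixed: dict absorbing node -> its fixed value (e.g. +1 or -1)."""
    v = {u: fixed.get(u, 0.0) for u in neighbors}
    for _ in range(iters):
        delta = 0.0
        for u, nbrs in neighbors.items():
            if u in fixed:
                continue  # absorbing nodes keep their fixed value
            total = sum(w for _, w in nbrs)
            new = sum(w * v[x] for x, w in nbrs) / total
            delta = max(delta, abs(new - v[u]))
            v[u] = new
        if delta < tol:
            break
    return v

g = {
    "red":  [("a", 1)],
    "a":    [("red", 1), ("b", 1)],
    "b":    [("a", 1), ("blue", 1)],
    "blue": [("b", 1)],
}
v = propagate_values(g, {"red": +1.0, "blue": -1.0})
# v["a"] = +1/3 and v["b"] = -1/3: a leans toward Red, b toward Blue.
```

The result at each node equals p_red(u)*(+1) + p_blue(u)*(-1), i.e., the expected value of the absorbing node that a walk from u ends up in.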

Electrical networks and random walks Our graph corresponds to an electrical network. There is a positive voltage of +1 at the Red node, and a negative voltage of -1 at the Blue node. There are resistances on the edges inversely proportional to the weights (equivalently, conductances proportional to the weights). The computed values are the voltages at the nodes.

Opinion formation

Example Social network with internal opinions [Figure: five users with internal opinions s = +0.5, s = -0.3, s = -0.1, s = +0.2, s = +0.8]

Example The expressed opinion for each node is computed using the value propagation we described before (repeated averaging). Intuitive model: my opinion is a combination of what I believe and what my social network believes. One absorbing node per user, with value the internal opinion of the user; one non-absorbing node per user that links to the corresponding absorbing node. [Figure: the same network with the computed expressed opinions z; the legible values are z = 0.04 and z = -0.01]
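Under unit weights this repeated averaging can be sketched as follows; the three-user network and all opinion values are made up for illustration, not taken from the slides:

```python
# Repeated averaging for opinion formation: each user's expressed opinion z_i
# is the average of their internal opinion s_i and their neighbors' expressed
# opinions. Users and opinion values are made up for illustration.

def expressed_opinions(s, edges, iters=1000, tol=1e-9):
    """s: dict user -> internal opinion; edges: undirected friendship pairs."""
    nbrs = {u: [] for u in s}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    z = dict(s)  # start the expressed opinions at the internal ones
    for _ in range(iters):
        delta = 0.0
        for u in s:
            # The internal opinion acts as the absorbing node attached to u.
            new = (s[u] + sum(z[v] for v in nbrs[u])) / (1 + len(nbrs[u]))
            delta = max(delta, abs(new - z[u]))
            z[u] = new
        if delta < tol:
            break
    return z

s = {"u1": +0.5, "u2": -0.3, "u3": +0.8}
z = expressed_opinions(s, [("u1", "u2"), ("u2", "u3")])
# u2's expressed opinion is pulled upward by two positive neighbors:
# z converges to {"u1": 0.3375, "u2": 0.175, "u3": 0.4875}
```

Note how u2's expressed opinion (0.175) sits between the internal opinion (-0.3) and what the neighborhood believes, exactly the "what I believe plus what my network believes" intuition.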

Transductive learning If we have a graph of relationships and some labels on some nodes, we can propagate them to the remaining nodes – Make the labeled nodes absorbing and compute the absorption probabilities for the rest of the graph – E.g., a social network where some people are tagged as spammers – E.g., the movie-actor graph where some movies are tagged as action or comedy. This is a form of semi-supervised learning – We make use of the unlabeled data and the relationships It is also called transductive learning because it does not produce a model; it just labels the unlabeled data at hand. – Contrast with inductive learning, which learns a model and can label any new example
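A sketch of this idea for the spammer example: make the labeled nodes absorbing, propagate, and threshold the absorption probability toward the spammer nodes. The graph, the user names, and the 0.5 threshold are illustrative assumptions:

```python
# Transductive label propagation via absorbing random walks: labeled nodes
# become absorbing; unlabeled nodes get the label they are "closest" to.
# Graph, names, and the 0.5 threshold are illustrative assumptions.

def classify_by_absorption(neighbors, labels, iters=1000, tol=1e-9):
    """labels: dict labeled node -> 1.0 ("spammer") or 0.0 ("legitimate")."""
    p = {u: labels.get(u, 0.0) for u in neighbors}
    for _ in range(iters):
        delta = 0.0
        for u, nbrs in neighbors.items():
            if u in labels:
                continue  # labeled nodes are absorbing
            total = sum(w for _, w in nbrs)
            new = sum(w * p[v] for v, w in nbrs) / total
            delta = max(delta, abs(new - p[u]))
            p[u] = new
        if delta < tol:
            break
    return {u: ("spammer" if p[u] > 0.5 else "legitimate") for u in neighbors}

# x interacts twice as heavily with a known spammer as with a known
# legitimate user, so it gets absorbed at the spammer side (p = 2/3).
g = {
    "spam_user": [("x", 2)],
    "good_user": [("x", 1)],
    "x":         [("spam_user", 2), ("good_user", 1)],
}
pred = classify_by_absorption(g, {"spam_user": 1.0, "good_user": 0.0})
```

For more than two labels (e.g., action vs. comedy vs. drama), one would compute one absorption probability per label and assign each node the label with the highest probability.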

Implementation details