Using Friendship Ties and Family Circles for Link Prediction


Using Friendship Ties and Family Circles for Link Prediction Elena Zheleva, Lise Getoor, Jennifer Golbeck, Ugur Kuter (SNAKDD 2008)

OUTLINE Introduction; Social Network Model; Predicting Links in Social Networks; A Feature Taxonomy; Experimental Evaluation; Conclusions and Future Work

INTRODUCTION There is a growing interest in social media and in data mining methods which can be used to analyze, support and enhance the effectiveness and utility of social media sites. Social network analysis has focused on actors and relationships between them, such as friendships and family. There has also been much work in community finding, where densely connected groups of actors are clustered together into communities.

INTRODUCTION This paper investigates the power of combining friendship and affiliation networks. The approach attempts to bridge methods based on structural equivalence and methods based on community detection. Structural equivalence: two actors are similar when they participate in equivalent relationships. Two nodes are structurally equivalent if they have the same links to all other actors.

SOCIAL NETWORK MODEL Social networks describe actors and their relationships. This paper considers friendship relationships and family group memberships. Both kinds of relationship are undirected and unweighted.

SOCIAL NETWORK MODEL The networks consist of entities and relationships. The entities: actors: a set of actors A = {a1, . . . , an}; groups: each group is a set of individuals connected through a common affiliation, and the affiliations group the actors into sets G = {G1, . . . , Gm}. The relationships: friends: F{ai, aj} denotes that ai is friends with aj; family: M{ai, Gk} denotes that ai is a part of family Gk. Notation: ai.b is attribute b of actor ai; ai.F is the set of friends of actor ai; ai.M is the set of family members of actor ai.
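
The notation above maps naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `SocialNetwork` and its methods are hypothetical, and it assumes each actor belongs to at most one family and that ai.M includes ai itself, neither of which the slides state.

```python
from dataclasses import dataclass, field


@dataclass
class SocialNetwork:
    """Hypothetical container for undirected, unweighted friendship and family links."""
    friends: dict = field(default_factory=dict)    # actor -> set of friends (ai.F)
    family_of: dict = field(default_factory=dict)  # actor -> family id (assumed: one family per actor)
    families: dict = field(default_factory=dict)   # family id -> set of member actors (Gk)

    def add_friendship(self, a, b):
        # Friendship is undirected, so store the link in both directions.
        self.friends.setdefault(a, set()).add(b)
        self.friends.setdefault(b, set()).add(a)

    def add_to_family(self, actor, family_id):
        self.family_of[actor] = family_id
        self.families.setdefault(family_id, set()).add(actor)

    def F(self, actor):
        """ai.F: the set of friends of an actor."""
        return self.friends.get(actor, set())

    def M(self, actor):
        """ai.M: the members of the actor's family (assumed to include the actor)."""
        family_id = self.family_of.get(actor)
        if family_id is None:
            return set()
        return self.families[family_id]


# Illustrative data only; the names are made up.
net = SocialNetwork()
net.add_friendship("rex", "milo")
net.add_to_family("milo", "G1")
net.add_to_family("luna", "G1")
```

Storing friendships as adjacency sets keeps the set operations used by the later features (intersections and unions of friend sets) cheap and direct.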

PREDICTING LINKS IN SOCIAL NETWORKS Link prediction is useful for a variety of tasks. It is a core component of any system for dynamic network modeling — the dynamic model can predict which actors are likely to gain popularity, and which are likely to become central according to various social network metrics.

PREDICTING LINKS IN SOCIAL NETWORKS Link prediction is challenging for a number of reasons. When it is posed as a pair-wise classification problem, one of the fundamental challenges is dealing with the large outcome space: if there are n actors, there are on the order of n² possible relations. In addition, because most social networks are sparsely connected, the prior probability of any given link is extremely small.
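
To make the sparsity argument concrete, here is a rough sketch. For undirected links the number of unordered actor pairs is n(n-1)/2, which grows as n². The helper name `link_prior` is hypothetical; the example figures (10,000 actors, about 17,000 links) are the Dogster sample sizes quoted in the data description of this presentation.

```python
from math import comb


def link_prior(num_actors, num_links):
    """Fraction of all unordered actor pairs that are actually linked."""
    possible_pairs = comb(num_actors, 2)  # n * (n - 1) / 2 candidate links
    return num_links / possible_pairs


pairs = comb(10_000, 2)            # number of candidate pairs among 10,000 actors
prior = link_prior(10_000, 17_000) # share of those pairs that are observed links
print(pairs, prior)
```

With almost fifty million candidate pairs and only about seventeen thousand observed links, a classifier that always predicts "no link" would be right nearly all the time, which is why the class imbalance has to be handled explicitly.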

A FEATURE TAXONOMY The paper identifies three classes of features that describe characteristics of potential links in the social network: descriptive attributes, structural attributes, and group attributes.

A FEATURE TAXONOMY Descriptive attributes: The descriptive attributes are attributes of nodes in the social network that do not consider the link structure of the network. Actor features: breed, breed category, single breed, purebred. Actor-pair features: same breed.

A FEATURE TAXONOMY Structural features: These features describe the structure of the network. Actor features: number of friends, |ai.F|. Actor-pair features: number of common friends, |ai.F ∩ aj.F|; Jaccard coefficient of the friend sets; density of common friends.

A FEATURE TAXONOMY Structural features. Jaccard coefficient of the friend sets: The Jaccard coefficient is a standard metric for measuring the similarity of two sets; for friend sets it is |ai.F ∩ aj.F| / |ai.F ∪ aj.F|. Density of common friends: The number of friendship links between the common friends divided by the number of all possible friendship links among them. The density of common friends of two nodes indicates how tightly knit the community of their common friends is.
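
The structural actor-pair features can be sketched as follows. This is an illustrative reading of the slide text, not the authors' code: the function names are hypothetical, `friends` is assumed to be a dict mapping each actor to its set of friends, and returning 0.0 for empty unions or fewer than two common friends is a convention chosen here.

```python
from itertools import combinations


def common_friends(friends, a, b):
    """ai.F ∩ aj.F"""
    return friends.get(a, set()) & friends.get(b, set())


def jaccard(friends, a, b):
    """|ai.F ∩ aj.F| / |ai.F ∪ aj.F|; 0.0 when both friend sets are empty (assumed convention)."""
    union = friends.get(a, set()) | friends.get(b, set())
    if not union:
        return 0.0
    return len(common_friends(friends, a, b)) / len(union)


def common_friend_density(friends, a, b):
    """Links among the common friends over all possible links among them."""
    common = common_friends(friends, a, b)
    k = len(common)
    if k < 2:
        return 0.0  # no pair of common friends exists (assumed convention)
    links = sum(1 for u, v in combinations(common, 2) if v in friends.get(u, set()))
    return links / (k * (k - 1) / 2)


# Made-up undirected friendship graph: a and b share friends c and d, who are also friends.
friends = {
    "a": {"c", "d", "e"},
    "b": {"c", "d", "f"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
    "f": {"b"},
}
```

In this toy graph the candidate pair (a, b) has two common friends out of four distinct friends overall, and the single possible link between those common friends is present.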

A FEATURE TAXONOMY Group features. Actor features: family size, |ai.M|. Actor-pair features: number of friends in the family, that is, the number of friends ai has in the family of aj, |ai.F ∩ aj.M|; portion of friends in the family, that is, the ratio between the number of friends that ai has in aj's family and the size of aj's family, |ai.F ∩ aj.M| / |aj.M|.
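
A minimal sketch of the two group actor-pair features follows. The function names are hypothetical; `friends` maps an actor to its friend set (ai.F) and `family_members` maps an actor to the member set of its family (aj.M). Whether aj.M includes aj itself is not stated in the slides, so the example simply includes it as an assumption.

```python
def friends_in_family(friends, family_members, a, b):
    """|ai.F ∩ aj.M|: how many of a's friends belong to b's family."""
    return len(friends.get(a, set()) & family_members.get(b, set()))


def portion_friends_in_family(friends, family_members, a, b):
    """|ai.F ∩ aj.M| / |aj.M|; 0.0 when b has no recorded family (assumed convention)."""
    family = family_members.get(b, set())
    if not family:
        return 0.0
    return friends_in_family(friends, family_members, a, b) / len(family)


# Made-up data: rex is friends with two of the four members of bella's family.
friends = {"rex": {"milo", "luna"}}
family_members = {"bella": {"bella", "milo", "luna", "coco"}}
```

Note that both features are asymmetric: the value for (ai, aj) generally differs from the value for (aj, ai), so a pair-wise classifier may want to see both directions.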

EXPERIMENTAL EVALUATION Data description. Data: a random sample of 10,000 pets each from Dogster and Catster, and all 2,059 pets registered with Hamsterster. For Dogster, the sample of 10,000 dogs had around 17,000 links among themselves; non-existing links were sampled at a 1:10 ratio. Classification used the decision tree classifier from Weka. Accuracy was measured by computing the F1 score.
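
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall on the positive (link) class. The sketch below is a plain-Python illustration of that standard definition with made-up labels; it is not the Weka evaluation used in the paper, and the function name `f1_score` is chosen here for convenience.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class, with labels given as 1 (link) and 0 (no link)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up example: 3 true links and 5 non-links.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
```

Because links are rare, a classifier that predicts "no link" everywhere would have high raw accuracy but an F1 of zero, which makes F1 a more informative measure for this task.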

EXPERIMENTAL EVALUATION Link-prediction results

EXPERIMENTAL EVALUATION Alternative network overlays. The paper used alternative network overlays to test whether there was an advantage to keeping the different link types and the affiliation groups distinct. The overlays compared: different-link and affiliation overlay; same-link and no affiliation overlay; same-link and affiliation overlay.

EXPERIMENTAL EVALUATION Link-prediction results

CONCLUSIONS AND FUTURE WORK This research found that overlaying friendship and affiliation networks was very effective for link prediction. The experiments show that using affiliation information can achieve significantly higher prediction accuracy. Future work: investigate the use of edge weights and thresholds to define strongly connected clusters, and see whether such clusters work as well for link prediction as the family groups did here.